Facing issue while starting the Training #1
Hey! I suspect this is a problem with your CUDA installation, not something specific to my code. Are you able to run other PyTorch code successfully in your environment?
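A quick way to answer that question is a small sanity check like the one below. This is only a minimal sketch and not part of this project: the helper name `cuda_report` is made up for illustration, and it uses nothing beyond standard PyTorch calls (`torch.cuda.is_available`, `torch.cuda.get_device_name`).

```python
import importlib.util


def cuda_report():
    """Return a small dict describing whether PyTorch can see a CUDA device.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    report = {"torch_installed": False, "cuda_available": False, "device_name": None}
    if importlib.util.find_spec("torch") is None:
        return report  # PyTorch is not installed in this environment
    import torch

    report["torch_installed"] = True
    report["cuda_available"] = bool(torch.cuda.is_available())
    if report["cuda_available"]:
        # Run a tiny computation on the GPU to confirm it actually works
        x = torch.ones(2, 2, device="cuda")
        assert float((x + x).sum()) == 8.0
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    print(cuda_report())
```

If `cuda_available` comes back `False` while `--cuda` is passed to the training script, the CUDA setup is the more likely culprit than the training code.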
Yes, I have been able to run other code. When I ran this on my local system, I got the error below:
2020-01-25 11:20:08.541504: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
So I ran this code in the Google Colab environment instead. Here is the link to my Colab notebook: https://colab.research.google.com/drive/1GdEO9zgHLPnXTZ6a00lXyj0B2pHfyOVa Please check it and let me know.
It's a bit confusing that an error is occurring in TensorFlow, since this is a PyTorch project; the only TensorFlow code might be used by TensorBoard. Did you try running the code in a pip virtualenv where you install only packages listed in Also see these posts where very similar issues are reported, maybe this helps:
Also, it might be that the code is already running normally and just doesn't print anything during training! Check the output logs via TensorBoard! Also pull the latest version of my code; I put in a training progress bar, so you should now see text output at each training step! And definitely use
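One way to see whether a silent run is actually writing logs is to look for TensorBoard event files in the log directory. The snippet below is a rough sketch under assumptions: the helper `find_event_files` is hypothetical, and it relies only on the standard `events.out.tfevents.*` file naming used by TensorBoard writers. The path is the one printed in the "Writing logs to ..." line of the training output.

```python
from pathlib import Path


def find_event_files(log_dir):
    """List TensorBoard event files under log_dir with their sizes in bytes.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    root = Path(log_dir)
    if not root.is_dir():
        return []  # nothing has been written yet, or the path is wrong
    files = sorted(root.rglob("events.out.tfevents.*"))
    return [(str(p), p.stat().st_size) for p in files]


if __name__ == "__main__":
    # Log directory taken from the "Writing logs to ..." line of the training output
    log_dir = "out/Image2Image_cityscapes/25_samples_factorGAN/logs"
    for path, size in find_event_files(log_dir):
        print(f"{size:>10d}  {path}")
```

Running it twice a few minutes apart and comparing sizes gives a hint whether training is progressing, although writers flush to disk only periodically, so a file that has not grown yet is not proof of a hang.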
@f90 Thanks :)... It works now. I can see the epoch progress bar now. I have a few more doubts.
!python Image2Image.py --cuda --batchSize=10 --loadSize 256 --dataset "diff" --num_joint_samples 300 --factorGAN 1 --experiment_name "diff"
Glad that it works now! Since you are raising a bunch of new points, I am going to create separate issues for those so we can handle them individually. I am closing this issue now; please post in the other issues from now on.
Hi,
I want to use your model for my research. I tried to train the model as described in the README file, but the training does not seem to start at all.
Namespace(L2=0.0, batchSize=25, beta1=0.5, cuda=True, dataset='cityscapes', disc_iter=2, epoch_iter=5000, epochs=40, eval=False, experiment_name='25_samples_factorGAN', factorGAN=1, generator_channels=32, lipschitz_p=1, lipschitz_q=1, loadSize=128, lr=0.0001, num_joint_samples=25, nz=50, objective='JSD', out_path='out', seed=1337, use_real_dep_disc=1, workers=1)
Random Seed: 1337
dataset [AlignedDataset] was created
START TRAINING! Writing logs to out/Image2Image_cityscapes/25_samples_factorGAN/logs
2020-04-09 12:26:30.619713: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
................................
Nothing more appears in the console. Please help me out.