Training a model fails #4
Comments
Did you generate the hdf5 file first?
Yes. Initially I thought that something had gone wrong with the generation, since this script never completed:
So I decided to try the sample command that you put in the README, which I assumed would use a sample hdf5 file from the repo. Unfortunately, it made no difference. Is it possible that both fail due to a bad hdf5 setup?
There's no sample hdf5 file, since it is too large. You should let the generation script run until it finishes.
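For anyone unsure whether an interrupted generation run left a usable file behind, here is a minimal sanity check in Python. It is not part of this repo, and `looks_like_hdf5` is a hypothetical helper name. It only tests whether the file begins with the standard 8-byte HDF5 signature at offset 0 (the usual case when no user block is present), so passing it does not prove the file is complete; a run that was stopped early can still produce a file that fails later.

```python
import os

# Standard 8-byte signature found at the start of an HDF5 file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file at `path` starts with the HDF5 signature.

    This is only a first sanity check: it does not validate the contents,
    and it ignores files whose signature sits after a user block.
    """
    if not os.path.isfile(path):
        return False
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
```

If the check fails, the safest option is the one suggested above: rerun the generation and let it finish.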
Thanks, I'll try that! How long does it take on your setup? Do you advise increasing the number of jobs? I'm using a Tesla K10 setup.
I managed to get it working, but unfortunately it looks like the VRAM (3.5GB) is not enough. What's the best way to reduce the memory footprint?

P.S.: I'm familiar with Johnson's implementation and know what I can do there, but I still haven't read your blog post and the code documentation :(

Edit 1: At first glance, it looks like reducing batch_size and n_colors might do the trick? I increased them to 8, so maybe that's why it fails.

Edit 2: Is it even possible to squeeze the training into 3.5GB? I started going through the code and noticed that you are already doing a lot of the memory optimizations (e.g. using cudnn and the ADAM optimizer).
Try batch_size = 1 and do not change ncolors. You can also downsize the image, for example to 256x256.
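As a rough illustration of why both suggestions help, the sketch below estimates how activation memory changes relative to a reference configuration. It assumes that activation memory in a convolutional network grows roughly linearly with batch size and with the number of input pixels, and it ignores weights and optimizer state, which do not shrink. The reference batch size of 8 comes from this thread; the 512x512 reference image size is purely an assumption for the example, and `relative_activation_memory` is a hypothetical helper, not something from this repo.

```python
def relative_activation_memory(batch_size, height, width,
                               ref_batch_size=8, ref_height=512, ref_width=512):
    """Estimate activation memory relative to a reference configuration.

    Assumes activations scale linearly with batch size and pixel count.
    The reference image size (512x512) is an illustrative assumption.
    """
    current = batch_size * height * width
    reference = ref_batch_size * ref_height * ref_width
    return current / reference


# Dropping the batch size from 8 to 1 at the same resolution:
print(relative_activation_memory(1, 512, 512))  # 0.125

# Additionally downsizing the image to 256x256:
print(relative_activation_memory(1, 256, 256))  # 0.03125
```

Under these assumptions, halving each image dimension cuts the activation footprint by about 4x on top of whatever the smaller batch saves.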
Looks like batch_size=1 did the trick; I previously tried 2 and 3 with no success. Does this affect the quality or just the speed of the training?
The quality will be ok, I used batch_size = 1, but at test time you need to experiment with
BTW, do you recommend this repo for artistic neural style transfer? To do it well, there should probably be some semantic analysis that determines the masks. Is there any other approach that you can recommend?
Hi, I tried to run the command from the tutorial for model training, but it failed with the following error:
Any ideas why hdf5 might fail with such an error?