CUDA out of memory (with light flag) #23
Comments
The same error appeared in my test when I had trained to step 3000 (1080 Ti). I don't know why.
I know why the error appears at step 3000: at step 3000 one epoch has finished, but the memory is not released, and about 100+ MB more is needed to start a new data loader. When I changed the input image size to something smaller, it worked.
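One way to avoid paying for a fresh data loader at every epoch boundary is to build the `DataLoader` once and only restart its iterator when it is exhausted. This is a minimal sketch of that pattern with a toy dataset; the dataset and names here are placeholders, not the actual UGATIT training code.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in for the image dataset: 8 samples, 2 batches per epoch.
dataset = TensorDataset(torch.arange(8).float().unsqueeze(1))
loader = DataLoader(dataset, batch_size=4, shuffle=True)

# Reuse one DataLoader across epochs; only the iterator is recreated.
data_iter = iter(loader)
for step in range(4):
    try:
        batch, = next(data_iter)
    except StopIteration:
        # Epoch finished: restart the SAME loader instead of building a new one.
        data_iter = iter(loader)
        batch, = next(data_iter)
print(tuple(batch.shape))
```

With multiprocessing workers, PyTorch 1.7+ also offers `DataLoader(..., persistent_workers=True)` to keep worker processes alive between epochs instead of respawning them.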
So every 1000th step it requires 100+ MB more and doesn't release it? Asking since I'm facing the same problem, but at the 2000th epoch and my images are 256x256.
The same error appeared in my test. When I set print_freq = 10000, it works.
Apparently there is a bug in PyTorch: when you open a new dataloader, the older dataloader does not seem to be released. I have run into this many times.
You can open UGATIT.py and wrap the `step % print_freq` block in `with torch.no_grad():`!
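The idea behind that fix, sketched below: forward passes run only to produce sample images for printing don't need gradients, so running them under `torch.no_grad()` stops autograd from recording the graph and holding activations in GPU memory. The function and variable names (`sample_step`, `genA2B`, `real_A`) are placeholders, not the exact identifiers in UGATIT.py.

```python
import torch

def sample_step(genA2B, real_A):
    # Inference-only forward pass: no autograd graph is recorded,
    # so intermediate activations are freed immediately.
    with torch.no_grad():
        fake_B = genA2B(real_A)
    return fake_B

# Minimal usage with a stand-in "generator":
gen = torch.nn.Linear(4, 4)
x = torch.randn(2, 4, requires_grad=True)
out = sample_step(gen, x)
print(out.requires_grad)  # outputs produced under no_grad don't require grad
```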
I was getting the "CUDA out of memory" error at the beginning of training. I solved it by setting a lower base channel number than the default via --ch: I used 32, while the default is 64.
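For reference, an invocation along these lines should apply both of the memory-saving options mentioned in this thread (the `--dataset` name is a placeholder; only `--light` and `--ch` are confirmed above):

```shell
python main.py --dataset YOUR_DATASET --light True --ch 32
```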
Can you reproduce the results in the paper?
…ted by 07hyx06. Surrounded print output with torch.no_grad()
Hi guys!
I'm using an RTX 2080 Ti 11 GB. At first I tried to train on a dataset of 100K images (1000px) with the --light flag, and after the 1000th epoch I got the "CUDA out of memory" error. Then I tried a smaller dataset of 10K images (256px) and got the same error after the 1000th epoch. Finally I tried 3400 images (256px), and nothing changed.
Here is the output: