
Cuda out of memory while training #46

Closed
Deadmin1 opened this issue Dec 5, 2018 · 2 comments

Deadmin1 commented Dec 5, 2018

Hi,

first, thanks for your work here.

I have a problem: whenever training goes from epoch 0 to epoch 1, I get a "CUDA out of memory" error. I decreased the batch size to 1 and still get the error; the first epoch runs fine with any batch size from 8 down.

I am training on a custom dataset. My image sizes vary.

Running it on a GTX 1070.

Thanks in advance

Edit:
multi_scale is set to false.
While training, my GTX's memory usage is 2445/8116 MiB.
After the first epoch the VRAM usage bloats. I could only check it mid epoch-change, and it was nearly completely used before it ran out of memory again. What runs between epochs that is so memory-intensive?
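
One way to narrow this down would be to log GPU memory at the epoch boundary. A minimal sketch using PyTorch's CUDA memory stats (the helper name `log_gpu_memory` is made up; call it right before and after whatever runs between epochs):

```python
# Hypothetical helper to see where VRAM spikes between epochs (plain PyTorch).
import torch

def log_gpu_memory(tag):
    alloc = torch.cuda.memory_allocated() / 1024 ** 2     # tensors currently held
    peak = torch.cuda.max_memory_allocated() / 1024 ** 2  # peak since last reset
    print(f"[{tag}] allocated: {alloc:.0f} MiB, peak: {peak:.0f} MiB")
    torch.cuda.reset_peak_memory_stats()                  # fresh peak window for the next phase
```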

Deadmin1 (Author) commented Dec 5, 2018

OK, found my problem...
If I understand it correctly, after each training epoch a test "epoch" runs.
I forgot to change the settings in test: it tried to load the COCO dataset, which was too big and overloaded my VRAM.
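
For anyone else hitting this, the pattern is roughly the one below (a generic PyTorch sketch with dummy tensors, not this repo's actual code). The test pass has its own dataloader and settings, so it has to point at the custom dataset too:

```python
# Generic train/test alternation with dummy data; illustrative only.
import torch
from torch.utils.data import DataLoader, TensorDataset

train_set = TensorDataset(torch.randn(16, 3, 416, 416))  # stand-in for the custom dataset
test_set = TensorDataset(torch.randn(16, 3, 416, 416))   # if this still points at COCO,
                                                         # the test pass loads COCO
train_loader = DataLoader(train_set, batch_size=1)
test_loader = DataLoader(test_set, batch_size=8)         # the test pass has its own batch size

for epoch in range(2):
    for (images,) in train_loader:
        pass  # training step would run here
    for (images,) in test_loader:
        pass  # the per-epoch test "epoch" runs here, with the settings above
```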

glenn-jocher (Member) commented

Yes, training uses large amounts of GPU RAM; inference uses less. Try decreasing -batch_size in test.py to 16 or 8. The default settings work with a 1080 Ti; on anything smaller you'll need to reduce the batch size.
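
As an illustration only (not the repo's actual test.py), a memory-friendly evaluation pass in plain PyTorch looks roughly like this: run under torch.no_grad() so no autograd graph is kept, and feed small batches:

```python
# Sketch of a low-memory evaluation loop; the model and data are dummies.
import torch
import torch.nn as nn

model = nn.Conv2d(3, 16, 3).cuda()      # stand-in for the detection network
images = torch.randn(64, 3, 416, 416)   # stand-in for the test images

model.eval()
with torch.no_grad():                   # no gradients stored -> far less VRAM
    for i in range(0, len(images), 8):  # batch size 8, per the suggestion above
        batch = images[i:i + 8].cuda()
        _ = model(batch)
```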
