
"CUDA out of memory" on dataset with 300 classes. #22

Open
igorvishnevskiy opened this issue May 25, 2022 · 1 comment
igorvishnevskiy commented May 25, 2022

Let me mention that training on a dataset with 6K inputs and 1 class works great. However, training with 300 classes on a 6000-image dataset causes the following error:

------------CPU Mode for This Batch-------------
2022-05-25 13:52:53 | INFO     | yolox.models.yolo_head:335 - OOM RuntimeError is raised due to the huge memory cost during label assignment. 
CPU mode is applied in this batch. If you want to avoid this issue, try to reduce the batch size or image size.

Training continues for some time, then quits completely with "CUDA out of memory":

RuntimeError: CUDA out of memory. Tried to allocate 22.00 MiB (GPU 0; 7.79 GiB total capacity; 6.20 GiB already allocated; 21.44 MiB free; 6.26 GiB reserved in total by PyTorch)
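
For context, here is my rough guess at where the label-assignment memory goes. As far as I can tell from yolo_head.py (get_assignments), SimOTA builds a few float32 tensors of shape [num_gt, num_candidate_anchors, num_classes] per image (the one-hot targets, the repeated class predictions, and the per-element BCE cost), so this part grows linearly with the number of classes. The ground-truth and anchor counts in this sketch are made-up round numbers, not measurements:

```python
# Back-of-the-envelope estimate (plain Python, no torch needed) of the float32
# tensors SimOTA seems to materialize per image during label assignment, based
# on my reading of yolox/models/yolo_head.py (get_assignments). The ground-truth
# and candidate-anchor counts below are made-up round numbers, not measurements.

def label_assign_bytes(num_gt, num_candidate_anchors, num_classes, n_copies=3):
    # one-hot targets, repeated class predictions and the element-wise BCE cost
    # are each roughly a [num_gt, num_candidate_anchors, num_classes] fp32 tensor
    return n_copies * num_gt * num_candidate_anchors * num_classes * 4

for num_classes in (1, 300):
    mib = label_assign_bytes(num_gt=50, num_candidate_anchors=2000,
                             num_classes=num_classes) / 2**20
    print(f"num_classes={num_classes:>3}: ~{mib:,.1f} MiB per image")
```

With those guesses that's on the order of 1 MiB per image for 1 class but ~340 MiB per image for 300 classes, which would explain why the one-class run is fine while the 300-class run falls back to CPU mode and eventually OOMs.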

I'm trying to fix it. Help from everyone is welcome; please drop your solutions and thoughts here. As soon as I find a solution, I'll share it here too. Thank you.

I'm running on 2 GPUs, a GTX 1070 and an RTX 3070. That should be plenty; the platform needs more optimization.

P.S. Lowering the batch size doesn't help. I set the batch to "-b 2" and devices to "-d 2", i.e. 1 image per GPU. It can't get lower than that.
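
For reference, the launch command is essentially "python -m yolox.tools.train -f <my_exp.py> -d 2 -b 2" (the exp path is just a placeholder here). One thing I haven't tried yet is adding the "--fp16" flag that yolox.tools.train accepts; mixed precision should roughly halve activation memory, though I'm not sure it helps with the label-assignment tensors, since those appear to be cast back to float32 inside get_assignments anyway.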

Image size is set to:
self.input_size = (256, 512)
self.test_size = (256, 512)
That's already a very low resolution.
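
For completeness, the relevant part of my Exp file looks roughly like this (attribute names are the standard ones from yolox.exp.Exp; everything not shown is left at the defaults):

```python
import os

from yolox.exp import Exp as MyExp


class Exp(MyExp):
    def __init__(self):
        super().__init__()
        # values from this issue; everything else stays at YOLOX defaults
        self.num_classes = 300
        self.input_size = (256, 512)   # (height, width) used for training
        self.test_size = (256, 512)    # (height, width) used for evaluation
        self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
```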

igorvishnevskiy (Author) commented:

I just tried cutting the inputs down to 10 images while keeping all 300 classes. The issue still reproduces: same low resolution, same 1 image per GPU. The issue is definitely caused by the high number of classes.
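
If my shape reading above is right, this makes sense: the label-assignment tensors scale with num_gt × candidate_anchors × num_classes per image, so going from 1 class to 300 multiplies that part of the memory by roughly 300x, while the total number of images in the dataset never enters the calculation. That's why shrinking the dataset to 10 images changes nothing.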
