out of memory #79

Open
manyuyuya opened this issue Aug 7, 2018 · 2 comments

Comments

@manyuyuya

Hello! When I run train.py, I get an out-of-memory error after a few epochs. It happens even if I increase the number of GPUs, and I have seen other people hit the same problem, too. I don't know the reason; could you offer some help? Thank you very much!
The relevant output is below:

step 120, image: 005365.jpg, loss: 6.3531, fps: 3.71 (0.27s per batch)
TP: 0.00%, TF: 100.00%, fg/bg=(14/285)
rpn_cls: 0.6417, rpn_box: 0.0229, rcnn_cls: 1.9303, rcnn_box: 0.1354
step 130, image: 009091.jpg, loss: 4.8151, fps: 3.78 (0.26s per batch)
TP: 0.00%, TF: 100.00%, fg/bg=(22/277)
rpn_cls: 0.6486, rpn_box: 0.2012, rcnn_cls: 1.7988, rcnn_box: 0.1184
step 140, image: 008690.jpg, loss: 4.9961, fps: 3.55 (0.28s per batch)
TP: 0.00%, TF: 100.00%, fg/bg=(30/269)
rpn_cls: 0.6114, rpn_box: 0.0690, rcnn_cls: 1.4801, rcnn_box: 0.1088
THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
File "train.py", line 138, in
loss.backward()
File "/usr/local/lib/python2.7/dist-packages/torch/tensor.py", line 93, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "/usr/local/lib/python2.7/dist-packages/torch/autograd/init.py", line 89, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58
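For context: a common cause of an OOM that shows up only after many steps in PyTorch 0.4.x code is accumulating the loss tensor itself for logging, which keeps every iteration's autograd graph alive on the GPU. This may or may not be the leak in this repo's train.py; below is a minimal self-contained sketch of the pattern and the fix (all names are illustrative, not from this repo):

import torch
import torch.nn as nn

model = nn.Linear(10, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

running_loss = 0.0
for step in range(1000):
    x = torch.randn(32, 10, device="cuda")
    y = torch.randn(32, 1, device="cuda")

    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()

    # BAD:  running_loss += loss   # retains each step's graph -> memory grows until OOM
    # GOOD: take the plain Python float so the graph can be freed
    running_loss += loss.item()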

@jinsnowy

Try PyTorch 0.3.1 with cudatoolkit 8.0.
I also hit the same error on version 0.4.1 (maybe a GPU memory leak in the code), so I downgraded PyTorch.
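For anyone trying the downgrade: assuming a conda environment, the old builds could be pinned from the pytorch channel roughly like this (package names are from the CUDA 8 era and availability varies by platform):

conda install pytorch=0.3.1 cuda80 -c pytorch

Note that code written against the 0.4 API (e.g. loss.item() or the device= keyword) may need small changes to run on 0.3.1, where loss.data[0] was the usual idiom.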

@machanic

machanic commented Oct 5, 2018

I think the memory leak is due to the RoI pooling layer: when I copied the RoI pooling code into another project of mine, it also leaked GPU memory.
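If the custom RoI pooling CUDA extension really is the source of the leak, one way to test that hypothesis is to swap in the reference implementation from torchvision (torchvision.ops.roi_pool, available in torchvision 0.3+). This is a suggested workaround, not what this repo ships:

import torch
from torchvision.ops import roi_pool

features = torch.randn(1, 256, 50, 50)  # (N, C, H, W) feature map
# Each RoI is (batch_index, x1, y1, x2, y2) in feature-map coordinates.
rois = torch.tensor([[0.0, 10.0, 10.0, 40.0, 40.0],
                     [0.0,  0.0,  0.0, 25.0, 25.0]])
pooled = roi_pool(features, rois, output_size=(7, 7), spatial_scale=1.0)
print(pooled.shape)  # torch.Size([2, 256, 7, 7])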
