RuntimeError: CUDA out of memory occurred while running demo.py #3

swoook · 2020-11-25T04:52:23Z

Issue description

Running demo.py fails with the error below:

RuntimeError: CUDA out of memory. Tried to allocate 62.00 MiB (GPU 0; 10.76 GiB total capacity; 9.65 GiB already allocated; 45.94 MiB free; 9.91 GiB reserved in total by PyTorch)

Code example

Command to reproduce the bug:

python demo.py --trained_model /swook/model/dsfd/WIDERFace_DSFD_RES152.pth --widerface_root /swook/dataset/wider-face/WIDER_val --save_folder ./save --visual_threshold 0.1 --cuda CUDA

Error messages:

RuntimeError: CUDA out of memory. Tried to allocate 62.00 MiB (GPU 0; 10.76 GiB total capacity; 9.65 GiB already allocated; 45.94 MiB free; 9.91 GiB reserved in total by PyTorch)

Whole stack traces:

Traceback (most recent call last):
  File "demo.py", line 222, in <module>
    test_oneimage()
  File "demo.py", line 201, in test_oneimage
    det_b = infer(net , img , transform , thresh , cuda , bt)
  File "demo.py", line 72, in infer
    y = net(x)      # forward pass
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/swook/repos/tencent/dsfd/face_ssd.py", line 240, in forward
    conv5_3_x = self.layer3(conv4_3_x)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torchvision/models/resnet.py", line 109, in forward
    out = self.bn3(out)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 79, in forward
    exponential_average_factor, self.eps)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/functional.py", line 1670, in batch_norm
    training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: CUDA out of memory. Tried to allocate 62.00 MiB (GPU 0; 10.76 GiB total capacity; 9.65 GiB already allocated; 45.94 MiB free; 9.91 GiB reserved in total by PyTorch)
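
For anyone reading the numbers in this message, here is a minimal sketch that pulls them out and compares the requested allocation with what was free. It uses only the standard library and the error text quoted above; the names `parse_oom_message`, `PATTERNS`, and `UNIT_TO_MIB` are made up for illustration and are not part of PyTorch.

```python
import re

# Error text copied from the report above.
MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 62.00 MiB "
    "(GPU 0; 10.76 GiB total capacity; 9.65 GiB already allocated; "
    "45.94 MiB free; 9.91 GiB reserved in total by PyTorch)"
)

# Binary unit multipliers, expressed in MiB.
UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}

# One regex per figure in the message; the labels are illustrative names.
PATTERNS = {
    "requested": r"Tried to allocate ([\d.]+) (KiB|MiB|GiB)",
    "total": r"([\d.]+) (KiB|MiB|GiB) total capacity",
    "allocated": r"([\d.]+) (KiB|MiB|GiB) already allocated",
    "free": r"([\d.]+) (KiB|MiB|GiB) free",
    "reserved": r"([\d.]+) (KiB|MiB|GiB) reserved",
}


def parse_oom_message(message):
    """Return the memory figures found in the message, converted to MiB."""
    figures = {}
    for label, pattern in PATTERNS.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find {label!r} in message")
        value, unit = match.groups()
        figures[label] = float(value) * UNIT_TO_MIB[unit]
    return figures


if __name__ == "__main__":
    figures = parse_oom_message(MESSAGE)
    for label, mib in figures.items():
        print(f"{label:>9}: {mib:10.2f} MiB")
    # A positive shortfall means the request could not fit in free memory.
    shortfall = figures["requested"] - figures["free"]
    print(f"shortfall: {shortfall:10.2f} MiB")
```

Read this way, the message says the forward pass asked for 62.00 MiB while only 45.94 MiB was free, because 9.65 GiB of the 10.76 GiB card was already allocated by the same process.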

System Info

PyTorch or Caffe2: PyTorch
How you installed PyTorch (conda, pip, source): docker (nvcr.io/nvidia/pytorch)
Build command you used (if compiling from source): None
OS: Ubuntu 16.04 LTS
PyTorch version: 1.4.0
Python version: 3.6
CUDA/cuDNN version: 10.2
GPU models and configuration: 2080 Ti
GCC version (if compiling from source): None
CMake version: None
Versions of any other relevant libraries: None

swoook · 2020-11-25T06:29:08Z

Refer to the issues below

RuntimeError: CUDA out of memory. · Issue #60 · Tencent/FaceDetection-DSFD (github.com)
RuntimeError: CUDA out of memory · Issue #44 · Tencent/FaceDetection-DSFD (github.com)

Recall that the required PyTorch version is 0.3.1.
However, ours is 1.4.0.
Tencent/FaceDetection-DSFD uses some methods that have since been deprecated.
Trying to replace them with their current equivalents.
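
As an illustration of what such a replacement can look like (an assumption, since this thread does not show which calls in demo.py are affected): code written for PyTorch 0.3.x often marked inference inputs with `Variable(x, volatile=True)`. From PyTorch 0.4 on, that flag is deprecated and has no effect, so the autograd graph is recorded during the forward pass unless it runs under `torch.no_grad()`. The tiny network and the `infer_no_grad` helper below are hypothetical stand-ins, not the DSFD model or its `infer` function.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the detector; the real network is not shown here.
net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)
net.eval()  # BatchNorm uses its running statistics instead of batch statistics


def infer_no_grad(model, image):
    """Run a forward pass without recording the autograd graph."""
    # PyTorch 0.3.x style (the flag is ignored from 0.4 on):
    #     x = Variable(image, volatile=True)
    with torch.no_grad():
        return model(image)


if __name__ == "__main__":
    x = torch.randn(1, 3, 32, 32)
    y_tracked = net(x)             # graph recorded, activations kept for backward
    y_free = infer_no_grad(net, x)  # no graph recorded
    print(y_tracked.requires_grad, y_free.requires_grad)
```

Without the `no_grad` context, intermediate activations stay alive for a backward pass that never happens, which can add up on a large backbone and large input images.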

swoook · 2020-11-25T06:31:07Z

~~The suggestions I referred to are also too old for the latest version.~~
~~Using a Docker image for torch==0.3.1 would be much easier.~~

swoook · 2020-11-25T06:46:54Z

NVIDIA says the nvcr.io/nvidia/pytorch image for torch==0.3.1 is nvcr.io/nvidia/pytorch:18.04-py3 [here].
However, it actually contains torch==0.4.0a0.
nvcr.io/nvidia/pytorch:18.03-py3 also contains torch==0.4.0a0.

swoook · 2020-11-25T09:56:59Z

> NVIDIA says the nvcr.io/nvidia/pytorch image for torch==0.3.1 is nvcr.io/nvidia/pytorch:18.04-py3 [here].
> However, it actually contains torch==0.4.0a0.
> nvcr.io/nvidia/pytorch:18.03-py3 also contains torch==0.4.0a0.

~~It seems I have to build it myself.~~

swoook · 2020-11-25T10:02:45Z

> Refer to the issues below
>
> RuntimeError: CUDA out of memory. · Issue #60 · Tencent/FaceDetection-DSFD (github.com)
> RuntimeError: CUDA out of memory · Issue #44 · Tencent/FaceDetection-DSFD (github.com)
>
> Recall that the required PyTorch version is 0.3.1.
> However, ours is 1.4.0.
> Tencent/FaceDetection-DSFD uses some methods that have since been deprecated.
> Trying to replace them with their current equivalents.

The suggestions in issues #60 and #44 identify the cause correctly.
However, the fixes they propose don't solve this problem.

There's another solution in issue #6 of Tencent/FaceDetection-DSFD.

swoook · 2020-11-25T10:10:07Z

Confirmed that the solution from #6 solves the problem.

TODO

Provide the environment for this repo
Support the latest PyTorch

swoook · 2020-11-25T10:25:31Z

Refer to this commit for more details

swoook closed this as completed Nov 25, 2020

swoook added the bug and question labels and removed the bug label Nov 25, 2020
