
Out Of Memory #8

Open
erdog opened this issue Mar 17, 2020 · 13 comments

erdog commented Mar 17, 2020

Running zsm_my_video.sh. No matter what I do, I keep getting out-of-memory errors. I'm using an RTX 2060 with 16 GB of system RAM and 6 GB of dedicated GPU RAM, and I'm running it against a 1-second clip from a 480p video.

RuntimeError: CUDA out of memory. Tried to allocate 676.00 MiB (GPU 0; 6.00 GiB total capacity; 3.49 GiB already allocated; 662.13 MiB free; 3.66 GiB reserved in total by PyTorch) (malloc at ..\c10\cuda\CUDACachingAllocator.cpp:289)


erdog commented Mar 17, 2020

Full stack trace:

Traceback (most recent call last):
  File "video_to_zsm.py", line 131, in <module>
    main()
  File "video_to_zsm.py", line 110, in main
    output = single_forward(model, imgs_in)
  File "video_to_zsm.py", line 89, in single_forward
    model_output = model(imgs_temp)
  File "C:\Users\anon\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\anon\Documents\Zooming-Slow-Mo-CVPR-2020\codes\models\modules\Sakuya_arch.py", line 336, in forward
    feats = self.ConvBLSTM(lstm_feats)
  File "C:\Users\anon\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\anon\Documents\Zooming-Slow-Mo-CVPR-2020\codes\models\modules\Sakuya_arch.py", line 252, in forward
    out_fwd, _ = self.forward_net(x)
  File "C:\Users\anon\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\anon\Documents\Zooming-Slow-Mo-CVPR-2020\codes\models\modules\Sakuya_arch.py", line 219, in forward
    h_temp = self.pcd_h(in_tensor, h)
  File "C:\Users\anon\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\anon\Documents\Zooming-Slow-Mo-CVPR-2020\codes\models\modules\Sakuya_arch.py", line 157, in forward
    aligned_fea = self.pcd_align(fea1, fea2)
  File "C:\Users\anon\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\anon\Documents\Zooming-Slow-Mo-CVPR-2020\codes\models\modules\Sakuya_arch.py", line 91, in forward
    L1_fea = self.L1_dcnpack_1(fea1[0], L1_offset)
  File "C:\Users\anon\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\anon\Documents\Zooming-Slow-Mo-CVPR-2020\codes\models\modules\DCNv2\dcn_v2.py", line 140, in forward
    self.dilation, self.deformable_groups)
  File "C:\Users\anon\Documents\Zooming-Slow-Mo-CVPR-2020\codes\models\modules\DCNv2\dcn_v2.py", line 27, in forward
    ctx.dilation[1], ctx.deformable_groups)
RuntimeError: CUDA out of memory. Tried to allocate 676.00 MiB (GPU 0; 6.00 GiB total capacity; 3.49 GiB already allocated; 662.13 MiB free; 3.66 GiB reserved in total by PyTorch) (malloc at ..\c10\cuda\CUDACachingAllocator.cpp:289) (no backtrace available)

@Mukosame (Owner)

Hi, you can edit zsm_my_video.sh and change "--N_out 7" to a smaller number, like 3 or 5.


erdog commented Mar 17, 2020

Thanks for the fast response. I did try that: both 3 and 5 give the same issue. What is the variable N_in used for? It doesn't appear to be used.


erdog commented Mar 17, 2020

Using --N_out 3:

Traceback (most recent call last):
  File "video_to_zsm.py", line 131, in <module>
    main()
  File "video_to_zsm.py", line 110, in main
    output = single_forward(model, imgs_in)
  File "video_to_zsm.py", line 89, in single_forward
    model_output = model(imgs_temp)
  File "C:\Users\anon\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\anon\Documents\Zooming-Slow-Mo-CVPR-2020\codes\models\modules\Sakuya_arch.py", line 342, in forward
    out = self.lrelu(self.pixel_shuffle(self.upconv2(out)))
  File "C:\Users\anon\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\anon\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\conv.py", line 345, in forward
    return self.conv2d_forward(input, self.weight)
  File "C:\Users\anon\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\conv.py", line 342, in conv2d_forward
    self.padding, self.dilation, self.groups)
RuntimeError: CUDA out of memory. Tried to allocate 3.52 GiB (GPU 0; 6.00 GiB total capacity; 1.98 GiB already allocated; 1.70 GiB free; 2.60 GiB reserved in total by PyTorch)

@Mukosame (Owner)

Hi, please check this line of code: https://github.com/Mukosame/Zooming-Slow-Mo-CVPR-2020/blob/master/codes/video_to_zsm.py#L23
I guess you'll need to try a smaller video for now. I'll try to solve this in the following updates.
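If the clip is still too big for your GPU, one stopgap is to shrink each frame before it gets stacked into the model's input tensor. A rough, untested sketch of that idea with OpenCV (this is not code from this repo; the function name and the 0.5 scale are only placeholders):

```python
# Illustration only: read a frame and downscale it before it is converted to a
# tensor, so the network allocates proportionally smaller feature maps.
import cv2
import numpy as np
import torch

def load_frame_downscaled(path, scale=0.5):
    """Read one frame and shrink it by `scale` along each dimension."""
    img = cv2.imread(path, cv2.IMREAD_COLOR)              # HWC, BGR, uint8
    h, w = img.shape[:2]
    img = cv2.resize(img, (int(w * scale), int(h * scale)),
                     interpolation=cv2.INTER_AREA)
    img = img.astype(np.float32) / 255.0                  # scale to [0, 1]
    return torch.from_numpy(img).permute(2, 0, 1)         # CHW float tensor
```

Halving the resolution roughly quarters the activation memory, so even a milder scale such as 0.7 may already be enough to fit on 6 GB.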


erdog commented Mar 17, 2020

Thanks. I'll try that.


erdog commented Mar 18, 2020

Commenting out the line out = self.lrelu(self.pixel_shuffle(self.upconv2(out))) in Sakuya_arch.py and using N_out=3 allows a few new frames to be created before running out of memory.

@Mukosame (Owner)

Thanks for bringing this up. I think this can be fixed after I optimize the workflow of test.py. I'll let you know as soon as I fix it!


JadeWu233 commented Apr 18, 2020

Hi, I commented out the line out = self.lrelu(self.pixel_shuffle(self.upconv2(out))) in Sakuya_arch.py. Although no error is reported, the output video and pictures all turn blue. I do not recommend this.

@Namnodorel

I have the same issue. I messed around a bit with zsm_my_video.py to make it load smaller batches of images into RAM at a time, but it still quickly runs out of dedicated GPU RAM.
My test video had a resolution of 432x236, but I only have 4 GB of dedicated and 16 GB of "normal" RAM available.
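In case it's useful, the general direction I was experimenting with looks roughly like this (a sketch only; chunked_forward, the window and stride values, and the stitching of the per-window outputs are placeholders rather than actual code from this repo):

```python
# Sketch of chunked inference: run the model on short, overlapping windows of
# frames instead of the whole clip, and free cached GPU memory in between.
import torch

def chunked_forward(model, frames, window=3, stride=2):
    """frames: CPU tensor of shape (1, T, C, H, W). Stitching the per-window
    outputs back into one sequence is left out of this sketch."""
    outputs = []
    T = frames.size(1)
    with torch.no_grad():                       # no autograd buffers during inference
        for start in range(0, T - window + 1, stride):
            chunk = frames[:, start:start + window].cuda()
            out = model(chunk)                  # the same kind of call video_to_zsm.py makes
            outputs.append(out.cpu())           # move results off the GPU right away
            del chunk, out
            torch.cuda.empty_cache()            # release cached blocks between windows
    return outputs
```

The no_grad/empty_cache calls only help at the margins; the peak comes from the forward activations themselves, so the window size and the frame resolution are what matter most.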


NJ2020 commented Jun 16, 2020 via email

@jiqirenno1

I encountered the same problem:

Traceback (most recent call last):
  File "./video_to_zsm.py", line 126, in <module>
    main()
  File "./video_to_zsm.py", line 105, in main
    output = single_forward(model, imgs_in)
  File "./video_to_zsm.py", line 86, in single_forward
    model_output = model(imgs_temp)
  File "/home/ubuntu/miniconda3/envs/my/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/ubuntu/6d2d0331-b42f-419a-b343-592105134c85/ubuntu/data/Prjpython/Zooming-Slow-Mo-CVPR-2020/codes/models/modules/Sakuya_arch.py", line 342, in forward
    out = self.lrelu(self.pixel_shuffle(self.upconv2(out)))
  File "/home/ubuntu/miniconda3/envs/my/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/miniconda3/envs/my/lib/python3.7/site-packages/torch/nn/modules/pixelshuffle.py", line 43, in forward
    return F.pixel_shuffle(input, self.upscale_factor)
RuntimeError: CUDA out of memory. Tried to allocate 2.64 GiB (GPU 0; 7.93 GiB total capacity; 4.13 GiB already allocated; 1.08 GiB free; 6.35 GiB reserved in total by PyTorch)
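One generic workaround for an OOM at the upsampling stage like this is tiled inference: split each frame spatially, run the model tile by tile, and stitch the upscaled tiles back together. A rough sketch of the idea (the 4x scale factor, the tile size, and the function itself are assumptions for illustration, not code from this repo; in practice the tiles would need to overlap and be blended to avoid seams):

```python
# Generic tiled-inference sketch: the upsampling layers only ever see one
# spatial tile at a time, so the largest single allocation stays small.
import torch

def tiled_forward(model, frames, tile=128, scale=4):
    """frames: CPU tensor of shape (1, T, C, H, W). Assumes the model maps a
    (1, T, C, h, w) input to a (1, T_out, C, h*scale, w*scale) output."""
    _, _, _, H, W = frames.shape
    out = None
    with torch.no_grad():
        for y in range(0, H, tile):
            for x in range(0, W, tile):
                patch = frames[..., y:y + tile, x:x + tile].cuda()
                up = model(patch).cpu()                    # upscaled tile
                if out is None:                            # allocate once T_out is known
                    out = torch.zeros(up.size(0), up.size(1), up.size(2),
                                      H * scale, W * scale)
                out[..., y * scale:y * scale + up.size(-2),
                    x * scale:x * scale + up.size(-1)] = up
                del patch, up
                torch.cuda.empty_cache()                   # free cached blocks between tiles
    return out
```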


Mukosame commented Sep 5, 2021

The newest version should consume less system memory and less CUDA memory.
