CUDA out of memory #9

Open
schmacko234 opened this issue Sep 17, 2020 · 4 comments

@schmacko234

Hi,

As the title says, my GPU ran out of memory. Is there a way to reduce the batch size for CUDA? I can't work out which variables in DNP I need to change to reduce the batch size.

@capi1O

capi1O commented Oct 8, 2020

Same error here: RuntimeError: CUDA error: an illegal memory access was encountered

@mosheman5
Owner

Perhaps try running with a shorter file. What's the GPU memory capacity of your machine?
Running on the demo file requires 1520 MiB.
This discussion might be helpful:
https://discuss.pytorch.org/t/weird-cuda-illegal-memory-access-error/8848/18
Let me know if any of it helped
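For a quick test, a rough sketch like the one below would trim the noisy file to its first few seconds before running DNP, so the network input is shorter and needs less GPU memory. This is not part of DNP; it assumes the soundfile package is installed and the input is a plain WAV, and the file names are only examples.

# Rough sketch (not part of DNP): keep only the first few seconds of the noisy file.
import soundfile as sf  # assumed installed: pip install soundfile

seconds = 5
audio, sr = sf.read("demo.wav")                       # audio shape: (num_samples,) or (num_samples, channels)
sf.write("demo_short.wav", audio[: seconds * sr], sr)

# then run DNP on the shorter clip, e.g.:
# python DNP.py --run_name demo --noisy_file demo_short.wav --samples_dir samples --save_every 50 --num_iter 5000 --LR 0.001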

@capi1O

capi1O commented Oct 15, 2020

I ran it on an RTX 2070 with 8 GB of GDDR6. I tried reducing the number of iterations (python DNP.py --run_name demo --noisy_file demo.wav --samples_dir samples --save_every 50 --num_iter 500 --LR 0.001), but I get the same errors, and I also tried the demo file.

In my case I had to add Option "Interactive" "0" to my xorg config to avoid the error Cuda runtime error: the launch timed out and was terminated (the GPU also drives the display, and the kernel kills a CUDA process that takes too long to respond), so the problem might be related to that; the relevant section is sketched below.
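For reference, the option goes in the Device section of the NVIDIA xorg configuration; the identifiers below are from my setup and may differ on yours:

Section "Device"
    Identifier "Device0"
    Driver     "nvidia"
    Option     "Interactive" "0"
EndSection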

I will try the solutions proposed in the link you mentioned.

@th3geek

th3geek commented Nov 4, 2020

I'm having the same problem. I'm not getting the "illegal memory access was encountered" error another user reported above, but DNP says my GPU is out of memory when it shouldn't be. nvidia-smi shows only about 20 MB of the 8 GB in use, yet when I run DNP it reports 6-7.5 GiB used, and I can only get DNP to work on very small snippets of audio. It must be a driver or code issue. For example:

$ nvidia-smi
Wed Nov  4 15:06:54 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.32.00    Driver Version: 455.32.00    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 107...  On   | 00000000:01:00.0 Off |                  N/A |
| N/A   49C    P8     3W /  N/A |     11MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2491      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      2983      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
(DNP) ~/DNP$ python DNP.py --run_name demo --noisy_file s3f4-test.wav --samples_dir samples --save_every 50 --num_iter 5000 --LR 0.001
unet
  0%|                                                                                                                                             | 0/5000 [00:00<?, ?it/s]/home/user/anaconda3/envs/DNP/lib/python3.7/site-packages/torch/nn/functional.py:2351: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")

Traceback (most recent call last):
  File "DNP.py", line 91, in <module>
    , save_every=opts.save_every)
  File "DNP.py", line 68, in dnp
    optimize(model, criterion, input, target, samples_dir, LR, num_iter, sr, save_every, accumulator)
  File "DNP.py", line 18, in optimize
    out = model(input)
  File "/home/user/anaconda3/envs/DNP/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/apoorshlub/DNP/unet.py", line 51, in forward
    x = torch.cat([x,encoder[self.num_layers - i - 1]],dim=1)
RuntimeError: CUDA out of memory. Tried to allocate 764.75 MiB (GPU 0; 7.93 GiB total capacity; 6.69 GiB already allocated; 619.12 MiB free; 134.34 MiB cached)
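In case it helps with debugging, here is a small snippet (illustrative only, not part of DNP) that prints PyTorch's own view of GPU memory; because of the caching allocator, these numbers can differ a lot from what nvidia-smi shows between runs.

# Illustrative snippet, not part of DNP: print PyTorch's view of GPU memory.
import torch

def report_gpu_memory(tag=""):
    if not torch.cuda.is_available():
        print("CUDA not available")
        return
    allocated = torch.cuda.memory_allocated() / 1024 ** 2   # MiB currently held by live tensors
    peak = torch.cuda.max_memory_allocated() / 1024 ** 2    # MiB high-water mark since startup
    print("[{}] allocated={:.0f} MiB, peak={:.0f} MiB".format(tag, allocated, peak))

# e.g. call report_gpu_memory("after forward") inside DNP's optimize loop to see where usage grows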
