GPU Out of Memory #43

Closed
davidmrau opened this issue Jun 18, 2022 · 13 comments

Comments

@davidmrau

davidmrau commented Jun 18, 2022

I was wondering what parameters I could change to be able to run it on a GPU with limited memory. I tried reducing the number of layers to 4, which did not help. Also, it seems the batch size is set to 1 by default. I am using 4x TITAN RTX 24 GB.

@albertfgu
Contributor

What model and config are you running? You can always reduce the model size by reducing number of layers (e.g. model.n_layers=1) or model dimension (e.g. model.d_model=64). Also, are you using either the CUDA extension or the pykeops library? Either of those will save a lot of memory.
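
For example, something like the following (a rough sketch: the experiment name is a placeholder for whichever config you are running, and wandb=null just disables logging):

  python -m train experiment=<your-experiment> wandb=null model.n_layers=2 model.d_model=64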

@davidmrau
Author

I successfully installed the CUDA extension, but the script output indicated the extension was not used. Is there anything in particular I have to do after installing the extension?

@davidmrau
Author

> What model and config are you running? You can always reduce the model size by reducing number of layers (e.g. model.n_layers=1) or model dimension (e.g. model.d_model=64). Also, are you using either the CUDA extension or the pykeops library? Either of those will save a lot of memory.

I am using sashimi and youtube-mix.

@albertfgu
Contributor

Do you see the extension cauchy-mult listed under pip list? What is the output of the train script that indicates the extension is not being used?
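
For example, assuming a Unix shell, something like this should show whether the extension package is installed:

  pip list | grep -i cauchy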

@Bingye-Ren

I am running into OOM too. Where do we set model.n_layers and model.d_model?

Thanks!

@albertfgu
Contributor

You can pass those in on the command line. Which config/command are you running, and are you using either of the efficient kernels?

@Bingye-Ren

I am running this on Colab, so I changed the code a bit so that it works with Python 3.7. I am trying to run the SC09 dataset, and I don't believe I am using any of the efficient kernels. How can I enable one of them?

@albertfgu
Contributor

You can install pykeops following their instructions: https://www.kernel-operations.io/keops/python/installation.html
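
For example (a minimal sketch; see the linked page for the authoritative steps, and the second line assumes your pykeops version ships the test_torch_bindings self-check):

  pip install pykeops
  python -c "import pykeops; pykeops.test_torch_bindings()"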

If you're using the standalone model and not this codebase, you will have to adjust parameters such as d_model by passing it into the Module in a standard way.

@Bingye-Ren

It worked; I just reduced the model size. Thank you so much! For anyone who needs it, the command was: !python -m train experiment=sashimi-sc09 wandb=null model.n_layers=1 model.d_model=32

@albertfgu
Contributor

Great! I'm a little confused why the model needs to be shrunk so aggressively, though. I believe our AR generation configs should all fit on a 16 GB GPU (@krandiash, is that right?)

@krandiash
Contributor

That's correct; you should not need to reduce the size of the model, since our models were trained on single 16 GiB V100s. If you have less GPU memory available, I recommend the following (example commands after this list):

  • reducing dataset.sample_len for YouTubeMix or Beethoven (e.g. try 65536 or 98304)
  • reducing loader.batch_size for SC09
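
For example (a rough sketch: sashimi-sc09 is the experiment name from earlier in this thread, the YouTubeMix experiment name is a placeholder for whichever config you are running, and the batch size of 8 is just an illustrative value):

  python -m train experiment=<youtubemix-experiment> wandb=null dataset.sample_len=65536
  python -m train experiment=sashimi-sc09 wandb=null loader.batch_size=8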

@Bingye-Ren

Sorry for the late reply, but this worked. Thank you so much!

@albertfgu
Contributor

@davidmrau, I'm closing this issue for now. Feel free to reopen or open a new issue for further questions about GPU memory or the CUDA extension.
