GPU Out of Memory #43

Closed
davidmrau opened this issue Jun 18, 2022 · 13 comments

Comments

@davidmrau

davidmrau commented Jun 18, 2022

I was wondering what parameters I could change to be able to run it on a GPU with limited memory. I tried reducing the number of layers to 4, which did not help. Also, it seems the batch size is set to 1 by default. I am using 4x TITAN RTX 24 GB.

@albertfgu
Contributor

What model and config are you running? You can always reduce the model size by reducing number of layers (e.g. model.n_layers=1) or model dimension (e.g. model.d_model=64). Also, are you using either the CUDA extension or the pykeops library? Either of those will save a lot of memory.
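
For example, something like the following (a rough sketch: the experiment name is a placeholder for whichever config you are running, and wandb=null just disables logging):

  python -m train experiment=<your-experiment> wandb=null model.n_layers=2 model.d_model=64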

@davidmrau
Author

I successfully installed the CUDA extension, but the script output indicated the extension was not used. Is there anything in particular I have to do after installing the extension?

@davidmrau
Author

> What model and config are you running? You can always reduce the model size by reducing number of layers (e.g. model.n_layers=1) or model dimension (e.g. model.d_model=64). Also, are you using either the CUDA extension or the pykeops library? Either of those will save a lot of memory.

I am using sashimi and youtube-mix.

@albertfgu
Contributor

Do you see the extension cauchy-mult listed under pip list? What is the output of the train script that indicates the extension is not being used?
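
For example, assuming a Unix shell, something like this should show whether the extension package is installed:

  pip list | grep -i cauchy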

@Bingye-Ren

I am running into OOM too. Where do we set model.n_layers and model.d_model?

Thanks!

@albertfgu
Contributor

You can pass those in on the command line. Which config/command are you running, and are you using either of the efficient kernels?

@Bingye-Ren

I am running this on Colab, so I changed the code a bit so that it works with Python 3.7. I am trying to run the SC09 dataset, and I don't believe I am using any of the efficient kernels. How can I enable one of them?

@albertfgu
Contributor

You can install pykeops following their instructions: https://www.kernel-operations.io/keops/python/installation.html
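
For example (a minimal sketch; see the linked page for the authoritative steps, and the second line assumes your pykeops version ships the test_torch_bindings self-check):

  pip install pykeops
  python -c "import pykeops; pykeops.test_torch_bindings()"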

If you're using the standalone model and not this codebase, you will have to adjust parameters such as d_model by passing it into the Module in a standard way.

@Bingye-Ren

It worked; I just reduced the model size. Thank you so much! For anyone who needs it, the command was: !python -m train experiment=sashimi-sc09 wandb=null model.n_layers=1 model.d_model=32

@albertfgu
Contributor

Great! I'm a little confused why the model needs to be shrunk so aggressively, though. I believe our AR generation configs should all fit on a 16 GB GPU (@krandiash, is that right?)

@krandiash
Contributor

That's correct; you should not need to reduce the size of the model, since our models were trained on single 16 GiB V100s. If you have less GPU memory available, I recommend the following (example commands after this list):

  • reducing dataset.sample_len for YouTubeMix or Beethoven (e.g. try 65536 or 98304)
  • reducing loader.batch_size for SC09
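
For example (a rough sketch: sashimi-sc09 is the experiment name from earlier in this thread, the YouTubeMix experiment name is a placeholder for whichever config you are running, and the batch size of 8 is just an illustrative value):

  python -m train experiment=<youtubemix-experiment> wandb=null dataset.sample_len=65536
  python -m train experiment=sashimi-sc09 wandb=null loader.batch_size=8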

@Bingye-Ren

Sorry for the late reply, but this worked. Thank you so much!

@albertfgu
Contributor

@davidmrau, I'm closing this issue for now. Feel free to reopen or open a new issue for further questions about GPU memory or the CUDA extension.
