
How to avoid "CUDA out of memory" #11

Closed
Lilyo opened this issue Oct 7, 2023 · 5 comments

Comments

@Lilyo

Lilyo commented Oct 7, 2023

Hi,
Thank you for your great work! When I run train.py with the example data (creature), I get the error below. Can this script be run on a single RTX 3090 with 24 GB of VRAM?

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 320.00 MiB (GPU 0; 23.70 GiB total capacity; 22.26 GiB already allocated; 89.44 MiB free; 22.32 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Looking forward to your reply! Thanks!
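For reference, the allocator hint mentioned at the end of the error message can be set before the first CUDA allocation. A minimal sketch is below; the 128 MiB split size is an illustrative value, not a recommendation from this repo.

```python
# Hedged sketch: apply the allocator setting suggested in the OOM message.
# It must be set before the CUDA caching allocator is initialized, so setting
# it before importing torch is the safe option. 128 MiB is illustrative only.
import os
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch

x = torch.zeros(1, device="cuda")  # allocator initializes here with the setting above
```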

@omriav
Collaborator

omriav commented Oct 8, 2023

Hi,
I did not try it on an RTX 3090 myself, but could you please try using xformers by adding the --enable_xformers_memory_efficient_attention flag when executing the training script?
You may need to install the package first via pip install xformers.
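For context, in diffusers-based training scripts this flag typically ends up calling the UNet's xformers hook. A minimal sketch follows; the checkpoint name is only an example, not necessarily what train.py loads.

```python
# Hedged sketch of what --enable_xformers_memory_efficient_attention typically
# toggles inside a diffusers-based training script. Requires `pip install xformers`.
import torch
from diffusers import UNet2DConditionModel

# Example checkpoint only; the model actually loaded by train.py may differ.
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
).to("cuda")

# Swap the standard attention implementation for xformers' memory-efficient
# kernels, which reduces activation memory during training and inference.
unet.enable_xformers_memory_efficient_attention()
```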

@Lilyo
Author

Lilyo commented Oct 9, 2023

Hi,
Eventually I solved it by (1) using 8-bit Adam and (2) setting gradients to None. It works now, thank you!
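For anyone landing here later, a minimal sketch of those two changes is below. It assumes bitsandbytes is installed; the module and learning rate are placeholders, not the repo's actual values.

```python
# Hedged sketch of (1) 8-bit Adam via bitsandbytes and (2) freeing grads with
# set_to_none. The model and hyperparameters are placeholders.
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(768, 768).cuda()  # stand-in for the trainable parameters

# (1) 8-bit AdamW stores optimizer state in 8 bits, cutting optimizer memory.
optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=1e-5)

loss = model(torch.randn(4, 768, device="cuda")).sum()
loss.backward()
optimizer.step()

# (2) set_to_none=True frees the gradient tensors between steps instead of
#     keeping zero-filled buffers allocated.
optimizer.zero_grad(set_to_none=True)
```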

@omriav
Collaborator

omriav commented Oct 9, 2023

Great. Thanks for the update.

@omriav omriav closed this as completed Oct 9, 2023
@whiterose199187

Hello @Lilyo

Could you please share what hardware you are training on? I am trying on a machine with 48 GB of VRAM, generating class images of size 768 x 968, and I get the same error as you even with the suggested optimisation flags.

It works fine if I use 512 x 512 images.

@wdy321

wdy321 commented Jun 28, 2024

> Eventually I solved it by (1) using 8-bit Adam and (2) setting gradients to None. It works now, thank you!

Hello, I use the same GPU as you, but I still run out of memory even with your configuration. Can you help me?
