
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 400.00 GiB #64

Open
Go1denMelody opened this issue Jul 17, 2024 · 3 comments

Comments

@Go1denMelody

When I run the video super-resolution model, I get this error:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 400.00 GiB (GPU 0; 44.52 GiB total capacity; 12.21 GiB already allocated; 31.33 GiB free; 12.83 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Why does it try to allocate 400 GiB? Should I change some settings?
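For context on why the number is so large: a naive scaled-dot-product attention materializes the full N×N score matrix per head, and in a video VAE the sequence length N is the flattened spatial size of each frame, so memory grows quadratically with resolution. A rough sketch with hypothetical shapes (not the exact ones from this traceback):

```python
def attention_matrix_gib(batch: int, heads: int, seq_len: int,
                         bytes_per_elem: int = 4) -> float:
    """GiB needed for the full (seq_len x seq_len) attention score
    matrix that a naive (non-memory-efficient) implementation builds."""
    return batch * heads * seq_len ** 2 * bytes_per_elem / 2 ** 30

# hypothetical shapes: 16 video frames, 1 attention head,
# a 256x256 latent flattened to a 65,536-token sequence, fp32
print(attention_matrix_gib(batch=16, heads=1, seq_len=256 * 256))  # → 256.0
```

Already at these illustrative shapes the score matrix alone needs hundreds of GiB, which is why a memory-efficient attention backend (xformers, flash attention) matters here.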

@Go1denMelody
Author

Traceback (most recent call last):
File "/home/powerop/work/LaVie/vsr/sample.py", line 151, in <module>
main(OmegaConf.load(args.config))
File "/home/powerop/work/LaVie/vsr/sample.py", line 109, in main
upscaled_video_ = pipeline(
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/powerop/work/LaVie/vsr/models/pipeline_stable_diffusion_upscale_video_3d.py", line 766, in __call__
image = self.decode_latents_vsr(latents[start_f:end_f])
File "/home/powerop/work/LaVie/vsr/models/pipeline_stable_diffusion_upscale_video_3d.py", line 356, in decode_latents_vsr
image = self.vae.decode(latents).sample
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
return method(self, *args, **kwargs)
File "/home/powerop/work/LaVie/vsr/models/autoencoder_kl.py", line 197, in decode
decoded = self._decode(z).sample
File "/home/powerop/work/LaVie/vsr/models/autoencoder_kl.py", line 184, in _decode
dec = self.decoder(z)
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/diffusers/models/vae.py", line 233, in forward
sample = self.mid_block(sample)
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/diffusers/models/unet_2d_blocks.py", line 463, in forward
hidden_states = attn(hidden_states)
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/diffusers/models/attention.py", line 162, in forward
hidden_states = F.scaled_dot_product_attention(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 400.00 GiB (GPU 0; 44.52 GiB total capacity; 12.21 GiB already allocated; 31.35 GiB free; 12.80 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
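The error message's own suggestion (max_split_size_mb) can be applied through an environment variable; a minimal sketch, noting that it must be set before the first CUDA allocation and only mitigates fragmentation, not a genuinely 400 GiB request (512 is an illustrative value, not a recommendation from this repo):

```python
import os

# must run before torch touches the GPU in this process
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Equivalently, export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 in the shell before launching sample.py.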

@Go1denMelody
Author

After checking, I found that xformers was broken when I installed the requirements. After fixing it, the code runs successfully.
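For anyone hitting the same thing: diffusers pipelines switch to the xformers backend via enable_xformers_memory_efficient_attention(). A hedged helper sketch, assuming a diffusers-style pipeline object; the guard just reports whether the switch took instead of crashing when xformers is missing or broken:

```python
def try_enable_xformers(pipe) -> bool:
    """Attempt to switch a diffusers-style pipeline to xformers
    memory-efficient attention; return False if it is unavailable."""
    try:
        pipe.enable_xformers_memory_efficient_attention()
        return True
    except (AttributeError, ImportError, ModuleNotFoundError):
        return False
```

With a working memory-efficient backend the full attention score matrix is never materialized, which avoids the huge allocation seen in the traceback above.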

@johndpope

close
