
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 400.00 GiB #64

Open
Go1denMelody opened this issue Jul 17, 2024 · 3 comments

Comments

@Go1denMelody

When I run the video super-resolution model, I get this error:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 400.00 GiB (GPU 0; 44.52 GiB total capacity; 12.21 GiB already allocated; 31.33 GiB free; 12.83 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Why does it try to allocate 400 GiB? Should I change some settings?
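For context on why the number is so large: a naive scaled-dot-product attention materializes the full N×N score matrix per head, and in a video VAE the sequence length N is the flattened spatial size of each frame, so memory grows quadratically with resolution. A rough sketch with hypothetical shapes (not the exact ones from this traceback):

```python
def attention_matrix_gib(batch: int, heads: int, seq_len: int,
                         bytes_per_elem: int = 4) -> float:
    """GiB needed for the full (seq_len x seq_len) attention score
    matrix that a naive (non-memory-efficient) implementation builds."""
    return batch * heads * seq_len ** 2 * bytes_per_elem / 2 ** 30

# hypothetical shapes: 16 video frames, 1 attention head,
# a 256x256 latent flattened to a 65,536-token sequence, fp32
print(attention_matrix_gib(batch=16, heads=1, seq_len=256 * 256))  # → 256.0
```

Already at these illustrative shapes the score matrix alone needs hundreds of GiB, which is why a memory-efficient attention backend (xformers, flash attention) matters here.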

@Go1denMelody
Author

Traceback (most recent call last):
File "/home/powerop/work/LaVie/vsr/sample.py", line 151, in <module>
main(OmegaConf.load(args.config))
File "/home/powerop/work/LaVie/vsr/sample.py", line 109, in main
upscaled_video_ = pipeline(
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/powerop/work/LaVie/vsr/models/pipeline_stable_diffusion_upscale_video_3d.py", line 766, in __call__
image = self.decode_latents_vsr(latents[start_f:end_f])
File "/home/powerop/work/LaVie/vsr/models/pipeline_stable_diffusion_upscale_video_3d.py", line 356, in decode_latents_vsr
image = self.vae.decode(latents).sample
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
return method(self, *args, **kwargs)
File "/home/powerop/work/LaVie/vsr/models/autoencoder_kl.py", line 197, in decode
decoded = self._decode(z).sample
File "/home/powerop/work/LaVie/vsr/models/autoencoder_kl.py", line 184, in _decode
dec = self.decoder(z)
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/diffusers/models/vae.py", line 233, in forward
sample = self.mid_block(sample)
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/diffusers/models/unet_2d_blocks.py", line 463, in forward
hidden_states = attn(hidden_states)
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/powerop/work/LaVie/myenv/lib/python3.10/site-packages/diffusers/models/attention.py", line 162, in forward
hidden_states = F.scaled_dot_product_attention(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 400.00 GiB (GPU 0; 44.52 GiB total capacity; 12.21 GiB already allocated; 31.35 GiB free; 12.80 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
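The error message's own suggestion (max_split_size_mb) can be applied through an environment variable; a minimal sketch, noting that it must be set before the first CUDA allocation and only mitigates fragmentation, not a genuinely 400 GiB request (512 is an illustrative value, not a recommendation from this repo):

```python
import os

# must run before torch touches the GPU in this process
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```

Equivalently, export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 in the shell before launching sample.py.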

@Go1denMelody
Author

After checking, I found that xformers was broken when I installed the requirements. After fixing it, the code runs successfully.
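For anyone hitting the same thing: diffusers pipelines switch to the xformers backend via enable_xformers_memory_efficient_attention(). A hedged helper sketch, assuming a diffusers-style pipeline object; the guard just reports whether the switch took instead of crashing when xformers is missing or broken:

```python
def try_enable_xformers(pipe) -> bool:
    """Attempt to switch a diffusers-style pipeline to xformers
    memory-efficient attention; return False if it is unavailable."""
    try:
        pipe.enable_xformers_memory_efficient_attention()
        return True
    except (AttributeError, ImportError, ModuleNotFoundError):
        return False
```

With a working memory-efficient backend the full attention score matrix is never materialized, which avoids the huge allocation seen in the traceback above.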

@johndpope

close
