
CUDA out of memory. #85

Closed
realcarlos opened this issue May 4, 2023 · 5 comments

Comments

@realcarlos

However, I am using a station with 4 x A100 (40 GB):

from deepfloyd_if.modules import IFStageI, IFStageII, StableStageIII
from deepfloyd_if.modules.t5 import T5Embedder

# one model per GPU
if_I = IFStageI('/IF/deepfloyd-if/IF-I-XL-v1.0', device='cuda:0')
if_II = IFStageII('/IF/deepfloyd-if/IF-II-L-v1.0', device='cuda:1')
if_III = StableStageIII('/IF/deepfloyd-if/stable-diffusion-x4-upscaler', device='cuda:2')
t5 = T5Embedder(device="cuda:3")

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 8.00 GiB (GPU 0; 39.39 GiB total capacity; 29.37 GiB already allocated; 6.90 GiB free; 30.95 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
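The error text itself suggests setting max_split_size_mb via PYTORCH_CUDA_ALLOC_CONF to reduce fragmentation. A minimal sketch of that, assuming the variable is set before torch makes its first CUDA allocation (the 512 value is just an example, not a recommendation):

import os

# must be set before the first CUDA allocation, so put it at the very top of the script
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

import torch  # imported only after the allocator config is in place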

@realcarlos
Author

Solved by adding os.environ["FORCE_MEM_EFFICIENT_ATTN"] = "1".
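
For anyone wondering where to put it: it is an environment variable, so it has to be set before deepfloyd_if builds its attention layers. I am not sure exactly when the library reads it, so the safe option is the very top of the script, before any deepfloyd_if import. A minimal sketch:

import os

# set before importing deepfloyd_if so the memory-efficient attention path is used
os.environ["FORCE_MEM_EFFICIENT_ATTN"] = "1"

from deepfloyd_if.modules import IFStageI, IFStageII, StableStageIII
from deepfloyd_if.modules.t5 import T5Embedder

if_I = IFStageI('/IF/deepfloyd-if/IF-I-XL-v1.0', device='cuda:0')
# ...create if_II, if_III and t5 exactly as before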

@YeHaijia

It works for me too, +1.

@404-xianjin

404-xianjin commented Jun 19, 2023

Hi, I got the same error. The message is as follows:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 16.00 GiB (GPU 0; 23.99 GiB total capacity; 17.99 GiB already allocated; 3.23 GiB free; 18.43 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

After seeing your solution, I still don't know where to set this parameter. Can you tell me? Thanks @realcarlos

@404-xianjin

Bro, I clicked on your profile and saw that you are also in Beijing, so I just went ahead and did it:
[screenshot]
It seems to be working now.

@Delicious-Bitter-Melon

Bro, I clicked on your profile and saw that you are also in Beijing, so I just went ahead and did it: [screenshot] It seems to be working now.

I have the same problem using a 3090. Although I set os.environ["FORCE_MEM_EFFICIENT_ATTN"] = "1", I still get the error. Do you also use a 3090 with 24 GB?
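
What I am trying in the meantime, in case it helps on other 24 GB cards: keep the T5 encoder on the CPU so only the diffusion stages occupy the GPU, and set both environment variables up front. This is just a sketch; I am assuming T5Embedder accepts device='cpu' the same way it accepts a CUDA device, and the paths are the ones from the first post.

import os

os.environ["FORCE_MEM_EFFICIENT_ATTN"] = "1"
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

from deepfloyd_if.modules import IFStageI
from deepfloyd_if.modules.t5 import T5Embedder

# keep the large T5 encoder off the GPU; only the diffusion stages go on the card
t5 = T5Embedder(device='cpu')
if_I = IFStageI('/IF/deepfloyd-if/IF-I-XL-v1.0', device='cuda:0')
# if memory is still tight, create if_II / if_III only after the stage-I pass is done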
