
I want to use it, but my video memory is only 12G #11

Open
jeerychao opened this issue Mar 23, 2025 · 18 comments

Comments

@jeerychao

Anyone know how to reduce the memory requirements?

@niknah

niknah commented Mar 23, 2025

Not working on a 32 GB VRAM card either. I needed an 80 GB VRAM card to avoid the out-of-memory error, but even then it failed for me with another error about a float not being iterable.

@EndlessSora
Collaborator

Thank you for your suggestion. We will consider improving memory usage.

@SGavrylov

SGavrylov commented Mar 26, 2025

Anyone know how to reduce the memory requirements?

I ran it on an NVIDIA 3080 10 GB and got 173 s/it (total generation time was about 1.5 hours). Then I reduced the output size to 512×512, and the time dropped to 50 s/it (total generation time was 25 minutes), which is still long, but definitely better.
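
For reference, reducing the resolution is just a matter of the height/width arguments. A minimal sketch with a generic diffusers FLUX pipeline (illustrative only; InfiniteYou's own entry point wraps this differently):

import torch
from diffusers import FluxPipeline

# Generic FLUX pipeline; InfiniteYou builds on FLUX but loads its own InfuseNet.
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)

# 512x512 has a quarter of the latent area of 1024x1024, which is
# roughly why the per-iteration time dropped so sharply above.
image = pipe("a portrait photo", height=512, width=512, num_inference_steps=30).images[0]
image.save("out.png")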

@jeerychao
Author

Anyone know how to reduce the memory requirements?

I ran it on an NVIDIA 3080 10 GB and got 173 s/it (total generation time was about 1.5 hours). Then I reduced the output size to 512×512, and the time dropped to 50 s/it (total generation time was 25 minutes), which is still long, but definitely better.

How did you do that? Thank you.

@EndlessSora
Collaborator

Please consider following some tips at https://github.com/bytedance/InfiniteYou?#memory-requirements first.
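
In generic diffusers terms, tips like these usually come down to CPU offloading; a minimal sketch (these are standard diffusers calls, not necessarily InfiniteYou's exact flags or API):

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)

# Option 1: keep each submodule on the GPU only while it runs.
# Moderate VRAM savings at a small speed cost.
pipe.enable_model_cpu_offload()

# Option 2: offload layer by layer. Far lower peak VRAM but much slower;
# likely the mechanism behind the very high s/it figures in this thread.
# pipe.enable_sequential_cpu_offload()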

@SGavrylov

Anyone know how to reduce the memory requirements?

I ran it on an NVIDIA 3080 10 GB and got 173 s/it (total generation time was about 1.5 hours). Then I reduced the output size to 512×512, and the time dropped to 50 s/it (total generation time was 25 minutes), which is still long, but definitely better.

How did you do that? Thank you.

Honestly, I'm not sure; I just installed it and ran it.

My desktop setup:

Ryzen 5 3600
MSI B550-A PRO
64 GB RAM
RTX 3080 10 GB
3x SSD (2×1TB M.2, 1×1TB SATA)

I did get an out-of-memory error after bumping the image size to 1024×1024 though.

100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [1:03:21<00:00, 126.72s/it]
CUDA out of memory. Tried to allocate 512.00 MiB. GPU 0 has a total capacty of 10.00 GiB of which 0 bytes is free. Of the allocated memory 38.63 GiB is allocated by PyTorch, and 1.68 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Try decreasing the output size to 512×512.
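
On the allocator hint in that traceback: PYTORCH_CUDA_ALLOC_CONF is only read when CUDA initializes, so it has to be set early. A minimal sketch (the 128 MiB split size is an illustrative value, not a maintainer recommendation):

import os

# Set before importing torch (or at least before the first CUDA allocation),
# otherwise the allocator never sees it.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"  # illustrative value

import torch

print(torch.cuda.is_available())  # CUDA now starts with the fragmentation hint active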

@jeerychao
Author

Anyone know how to reduce the memory requirements?

I ran it on an NVIDIA 3080 10 GB and got 173 s/it (total generation time was about 1.5 hours). Then I reduced the output size to 512×512, and the time dropped to 50 s/it (total generation time was 25 minutes), which is still long, but definitely better.

How did you do that? Thank you.

Honestly, I'm not sure; I just installed it and ran it.

My desktop setup:

Ryzen 5 3600
MSI B550-A PRO
64 GB RAM
RTX 3080 10 GB
3x SSD (2×1TB M.2, 1×1TB SATA)

I did get an out-of-memory error after bumping the image size to 1024×1024 though.

100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 30/30 [1:03:21<00:00, 126.72s/it]
CUDA out of memory. Tried to allocate 512.00 MiB. GPU 0 has a total capacty of 10.00 GiB of which 0 bytes is free. Of the allocated memory 38.63 GiB is allocated by PyTorch, and 1.68 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Try decreasing the output size to 512×512.

This seems impossible, because after I installed and ran it, the built-in InfiniteYou model located in the directory InfiniteYou/infu_flux_v1.0/aes_stage2/InfuseNetModel contains diffusion… which is 11 GB. Additionally, it required downloading the FLUX model, and when I saw the largest one was 23 GB, I immediately aborted the download. Your VRAM likely couldn't handle running it, so I truly don't know how you managed to run it.

@niftyflora

It runs on 10 GB VRAM and I couldn't get it running on 24? Might be a skill issue on my end; I will try again tomorrow.

@SGavrylov

This seems impossible, because after I installed and ran it, the built-in InfiniteYou model located in the directory InfiniteYou/infu_flux_v1.0/aes_stage2/InfuseNetModel contains diffusion… which is 11 GB. Additionally, it required downloading the FLUX model, and when I saw the largest one was 23 GB, I immediately aborted the download. Your VRAM likely couldn't handle running it, so I truly don't know how you managed to run it.

ChatGPT suggested that the folks at ByteDance are likely using CPU offloading.
I checked Task Manager, and indeed, shared GPU memory usage was around 32 GB, on top of the dedicated 10 GB on my RTX 3080.
From what I understand, you probably need at least 64 GB of system RAM for this to work.
So if your machine meets those specs, it should run just like it did for me! 😊
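
If anyone wants to confirm this from Python rather than Task Manager, torch can report the dedicated VRAM directly; a small sketch:

import torch

gib = 1024 ** 3
free, total = torch.cuda.mem_get_info()  # free/total bytes on the current device
print(f"dedicated VRAM: {total / gib:.1f} GiB (free: {free / gib:.1f} GiB)")
print(f"allocated by this process: {torch.cuda.memory_allocated() / gib:.1f} GiB")
# Whatever the model needs beyond 'total' spills into shared/system memory once
# CPU offloading (or the Windows driver's paging) kicks in, which is why Task
# Manager showed about 32 GB of shared GPU memory in use.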

@jeerychao
Author

This seems impossible, because after I installed and ran it, the built-in InfiniteYou model located in the directory InfiniteYou/infu_flux_v1.0/aes_stage2/InfuseNetModel contains diffusion… which is 11 GB. Additionally, it required downloading the FLUX model, and when I saw the largest one was 23 GB, I immediately aborted the download. Your VRAM likely couldn't handle running it, so I truly don't know how you managed to run it.

ChatGPT suggested that the folks at ByteDance are likely using CPU offloading. I checked Task Manager, and indeed, shared GPU memory usage was around 32 GB, on top of the dedicated 10 GB on my RTX 3080. From what I understand, you probably need at least 64 GB of system RAM for this to work. So if your machine meets those specs, it should run just like it did for me! 😊

I think that must be it. What LLM tool are you using to run it? Could you share it? Thank you!

@SGavrylov

I think that must be it. What LLM tool are you using to run it? Could you share it? Thank you!

I use PyCharm 2022.3.1 (Community Edition). It isn't an "LLM tool"; it's an IDE for Python development. Just clone the repository from Git, install the required packages from requirements.txt, and run it.
After the first launch, I waited for all the necessary models to download. Once the download was complete, the project folder had grown to almost 90 GB.
After all the models were downloaded, some kind of image generation started automatically; as I understood it, this was just a test run to check that everything works correctly and to "warm up" the models.
Once that test generation finished (in my case, it took about an hour and a half), Gradio was launched: a web interface for selecting images and triggering generation.

@jeerychao
Author

I think that must be it. What LLM tool are you using to run it? Could you share it? Thank you!

I use PyCharm 2022.3.1 (Community Edition). It isn't an "LLM tool"; it's an IDE for Python development. Just clone the repository from Git, install the required packages from requirements.txt, and run it. After the first launch, I waited for all the necessary models to download. Once the download was complete, the project folder had grown to almost 90 GB. After all the models were downloaded, some kind of image generation started automatically; as I understood it, this was just a test run to check that everything works correctly and to "warm up" the models. Once that test generation finished (in my case, it took about an hour and a half), Gradio was launched: a web interface for selecting images and triggering generation.

Congratulations and thank you for sharing! May I ask, are you in China?

@jeerychao
Author

FETCH ComfyRegistry Data: 70/80
10%|████████▊ | 3/30 [00:04<00:38, 1.41s/it]Requested to load AutoencodingEngine
loaded completely 223.02720260620117 159.87335777282715 True
Requested to load ControlNetFlux
loaded partially 9721.6182182312 9720.537170410156 0
loaded partially 128.0 127.9365234375 0
30%|██████████████████████████▍ | 9/30 [00:19<00:46, 2.21s/it]FETCH ComfyRegistry Data: 75/80
43%|█████████████████████████████████████▋ | 13/30 [00:27<00:36, 2.18s/it]FETCH ComfyRegistry Data: 80/80
FETCH ComfyRegistry Data [DONE]
[ComfyUI-Manager] default cache updated: https://api.comfy.org/nodes
100%|███████████████████████████████████████████████████████████████████████████████████████| 30/30 [00:58<00:00, 1.93s/it]
Requested to load AutoencodingEngine
loaded completely 190.8359375 159.87335777282715 True
Prompt executed in 336.92 seconds

@pppking9527

Anyone know how to reduce the memory requirements?

I ran it on an NVIDIA 3080 10 GB and got 173 s/it (total generation time was about 1.5 hours). Then I reduced the output size to 512×512, and the time dropped to 50 s/it (total generation time was 25 minutes), which is still long, but definitely better.

My RTX 5080 still takes 100 s/it. Why?

@MegaCocos

Anyone know how to reduce the memory requirements?

I ran it on an NVIDIA 3080 10 GB and got 173 s/it (total generation time was about 1.5 hours). Then I reduced the output size to 512×512, and the time dropped to 50 s/it (total generation time was 25 minutes), which is still long, but definitely better.

My RTX 5080 still takes 100 s/it. Why?

Try my workflow with GGUF nodes: https://civitai.com/models/1424364 . On my RTX 4080 SUPER with 16 GB VRAM it takes 70-120 seconds to generate a 1024×1024 image, which I consider a very acceptable speed.

@AgustinJimenez

Working on my RTX 4090, but it took me 3+ hours to generate one image.

@pppking9527

Anyone know how to reduce the memory requirements?

I ran it on an NVIDIA 3080 10 GB and got 173 s/it (total generation time was about 1.5 hours). Then I reduced the output size to 512×512, and the time dropped to 50 s/it (total generation time was 25 minutes), which is still long, but definitely better.

My RTX 5080 still takes 100 s/it. Why?

Try my workflow with GGUF nodes: https://civitai.com/models/1424364 . On my RTX 4080 SUPER with 16 GB VRAM it takes 70-120 seconds to generate a 1024×1024 image, which I consider a very acceptable speed.

Thank you for sharing. Should the workflow be used in ComfyUI?

@MegaCocos

Anyone know how to reduce the memory requirements?

I ran it on an NVIDIA 3080 10 GB and got 173 s/it (total generation time was about 1.5 hours). Then I reduced the output size to 512×512, and the time dropped to 50 s/it (total generation time was 25 minutes), which is still long, but definitely better.

My RTX 5080 still takes 100 s/it. Why?

Try my workflow with GGUF nodes: https://civitai.com/models/1424364 . On my RTX 4080 SUPER with 16 GB VRAM it takes 70-120 seconds to generate a 1024×1024 image, which I consider a very acceptable speed.

Thank you for sharing. Should the workflow be used in ComfyUI?

@pppking9527 Yes, it is for ComfyUI. The node was made by the ZenAI Team (https://github.com/ZenAI-Vietnam/ComfyUI_InfiniteYou) and modded by me to add GGUF support, LoRA support, a SIMILARITY/FLEXIBILITY model switch, and minor adjustments. Just remember to install the requirements first, then carefully follow the model download and installation instructions in the ZenAI fork: https://github.com/ZenAI-Vietnam/ComfyUI_InfiniteYou
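
For anyone who wants the same VRAM savings outside ComfyUI: recent diffusers releases can load GGUF-quantized FLUX transformers too. A minimal sketch (requires the gguf package; the Q4 checkpoint URL below is illustrative, pick a quantization level that fits your card):

import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

# Illustrative community GGUF checkpoint of the FLUX.1-dev transformer.
ckpt = "https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q4_K_S.gguf"

transformer = FluxTransformer2DModel.from_single_file(
    ckpt,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # combine quantization with offloading for low-VRAM cards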

