
When running in Docker, GPU memory keeps increasing until the process is OOM-killed and restarted #245

Closed
PeakLee opened this issue Mar 16, 2024 · 9 comments

Comments

@PeakLee

PeakLee commented Mar 16, 2024

[screenshot attachment]

We deploy fooocus-api online and it gets OOM-killed and restarted periodically!

fooocus-api's memory management needs to be improved.

@mrhan1993
Owner

Which version are you using now?

@PeakLee
Author

PeakLee commented Mar 18, 2024

@mrhan1993

v0.3.29

When the pod restarts, the memory usage progress bar resets to 1%. Then, after processing text2img and img2img requests for a short while, it shows 95%.

Docker command:
sudo docker run --restart always -d -e TZ=Asia/Shanghai -v /data/model_sync:/mnt --name fooocus-v329-cn --cpus 2.5 --gpus '"device=4"' fooocus-v329-v2 python main.py

Hope this helps! Thanks.

@mrhan1993
Owner

I have generated dozens of pictures in succession with the latest version, and there is basically no fluctuation in memory usage. Maybe you can try the latest version.

@PeakLee
Author

PeakLee commented Mar 18, 2024

If Fooocus-API always loads the same model and LoRA files, it works well as expected.
But when a different model and LoRA file are loaded per request, memory usage keeps increasing and everything is held in memory until OOM!
@mrhan1993 help ~~
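
To illustrate (toy sketch only, stand-in code, not from fooocus-api): if each request loads a new checkpoint into a cache that never evicts, every loaded model stays referenced and its GPU memory can never be reclaimed.

import torch

cache = {}  # stands in for a per-request model/LoRA cache that never evicts

def get_model(name):
    if name not in cache:
        # stand-in for loading a checkpoint; each dummy module pins ~100 MB of VRAM
        cache[name] = torch.nn.Linear(5000, 5000).cuda()
    return cache[name]

for name in ("model_a", "model_b", "model_c"):
    get_model(name)
    print(name, torch.cuda.memory_allocated() // 2**20, "MiB allocated")

# allocated memory only grows; reclaiming it requires dropping the reference
# (del cache[name]) and then torch.cuda.empty_cache() to return the blocks to the driver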

@mrhan1993
Owner

OK, I will do more testing.

@PeakLee
Author

PeakLee commented Mar 19, 2024

I just appended the option "-e PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.8" to the docker run command, but I'm not sure whether it works. Could you give me some suggestions to avoid GPU OOM? Really appreciated!! @mrhan1993
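
For reference, a quick sketch (standard PyTorch APIs only) to check that the setting reaches the process and to see what the caching allocator is holding; as I understand it, garbage_collection_threshold can only reclaim cached-but-unused blocks, not VRAM held by models that are still referenced.

import os
import torch

# confirm the allocator option made it into the container's environment
print(os.environ.get("PYTORCH_CUDA_ALLOC_CONF"))  # expect "garbage_collection_threshold:0.8"

# memory held by live tensors vs. memory the caching allocator keeps reserved
print(torch.cuda.memory_allocated() // 2**20, "MiB allocated")
print(torch.cuda.memory_reserved() // 2**20, "MiB reserved")

# manually return unused cached blocks to the driver
torch.cuda.empty_cache()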

@mrhan1993
Owner

mrhan1993 commented Mar 19, 2024

In another branch, FooocusIntegration, I rewrote the task system. After initial testing, neither memory OOM nor GPU memory OOM occurs. But this branch has not been fully tested; if you are interested, you can deploy it locally for testing.

@PeakLee
Author

PeakLee commented Mar 19, 2024

In version v0.3.29, I added the code below in fooocusapi/worker.py, right after the line print(f'Generating and saving time: {execution_time:.2f} seconds'), and it works well now!

print('--memory stats--:', model_management.get_free_memory(torch_free_too=True))
model_management.cleanup_models()    # key1: unload models that are no longer referenced
model_management.soft_empty_cache()  # key2: release cached, unused GPU memory back to the driver
print('--memory stats--:', model_management.get_free_memory(torch_free_too=True))
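
A possible refinement (sketch only, assuming the same model_management module already imported in worker.py and a hypothetical process_generate entry point): wrapping the generation call in try/finally so the cleanup also runs when a request fails.

def generate_with_cleanup(req):
    try:
        return process_generate(req)  # hypothetical stand-in for the actual generation call
    finally:
        model_management.cleanup_models()    # drop models that are no longer referenced
        model_management.soft_empty_cache()  # return cached, unused VRAM to the driver
        print('--memory stats--:', model_management.get_free_memory(torch_free_too=True))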

Really appreciated, thanks a lot! @mrhan1993

@mrhan1993
Owner

Thanks, I will update it after a while.

@PeakLee PeakLee closed this as completed Mar 20, 2024
mrhan1993 added a commit that referenced this issue Mar 20, 2024
After running for a long time, the way models are loaded causes GPU memory and RAM usage to keep increasing, which eventually leads to OOM. Add manual release logic while avoiding repeated loading of the model as much as possible.

Thanks to @PeakLee and his code: #245 (comment)