Diffusion model not fully unloading from gpu when removed from cache #242
It looks like there are a few factors involved here:
It's possible that there is a memory leak on the Python side holding the old diffusion model and forcing it to remain resident, but it's also possible that I just need to run garbage collection on the GPU side after the model is evicted (a rough sketch is below).

For now, increasing the model cache limit to 4-5 (one of each) should be a workaround: `set ONNX_WEB_CACHE_MODELS=5`
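The GPU-side garbage collection I have in mind would look roughly like this; a minimal sketch assuming a Torch/CUDA runtime, with a hypothetical `evict_model` helper and plain dict cache rather than onnx-web's actual cache code:

```python
# Hedged sketch: explicit Python + CUDA garbage collection after a model is
# dropped from the cache. Assumes torch with CUDA support; evict_model and the
# plain dict cache are illustrative, not onnx-web internals.
import gc

import torch


def evict_model(cache: dict, key: str) -> None:
    # Drop the Python reference so the pipeline object becomes collectable.
    cache.pop(key, None)
    # Collect the now-unreferenced pipeline and its tensors.
    gc.collect()
    if torch.cuda.is_available():
        # Ask the CUDA caching allocator to hand unused blocks back to the
        # driver, so tools like task manager reflect the freed memory.
        torch.cuda.empty_cache()
        torch.cuda.ipc_collect()
```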
When you have a chance, please try running the new check-env script:

```
> cd api
> onnx_env\Scripts\Activate.bat
> python3 scripts\check-env.py
```

and post the output. I'm looking for the last few items in particular, which should say whether the CUDA garbage collection functions are available. The output should look something like this:

```
> python3 scripts/check-env.py
['required module onnx is present at version 1.13.0',
 'required module diffusers is present at version 0.12.1',
 'required module safetensors is present at version 0.2.8',
 'required module torch is present at version 1.13.1',
 'runtime module onnxruntime is present at version 1.14.0',
 "unable to import runtime module onnxruntime_gpu: No module named 'onnxruntime_gpu'",
 "unable to import runtime module onnxruntime_rocm: No module named 'onnxruntime_rocm'",
 "unable to import runtime module onnxruntime_training: No module named 'onnxruntime_training'",
 'onnxruntime provider TensorrtExecutionProvider is missing',
 'onnxruntime provider CUDAExecutionProvider is missing',
 'onnxruntime provider MIGraphXExecutionProvider is missing',
 'onnxruntime provider ROCMExecutionProvider is missing',
 'onnxruntime provider OpenVINOExecutionProvider is missing',
 'onnxruntime provider DnnlExecutionProvider is missing',
 'onnxruntime provider TvmExecutionProvider is missing',
 'onnxruntime provider VitisAIExecutionProvider is missing',
 'onnxruntime provider NnapiExecutionProvider is missing',
 'onnxruntime provider CoreMLExecutionProvider is missing',
 'onnxruntime provider ArmNNExecutionProvider is missing',
 'onnxruntime provider ACLExecutionProvider is missing',
 'onnxruntime provider DmlExecutionProvider is missing',
 'onnxruntime provider RknpuExecutionProvider is missing',
 'onnxruntime provider XnnpackExecutionProvider is missing',
 'onnxruntime provider CANNExecutionProvider is missing',
 'onnxruntime provider AzureExecutionProvider is missing',
 'onnxruntime provider CPUExecutionProvider is available',
 'loaded Torch but CUDA was not available']
```
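For reference, the "last few items" boil down to whether the Torch CUDA cleanup functions can be reached. The probe behind them is something along these lines, as a hedged illustration rather than the actual contents of scripts/check-env.py:

```python
# Hedged sketch of the kind of probe that could produce the last few
# check-env items; the real scripts/check-env.py may differ.
import importlib


def check_torch_cuda():
    results = []
    try:
        torch = importlib.import_module("torch")
    except ImportError as err:
        return ["unable to import torch: %s" % err]
    if not torch.cuda.is_available():
        # Matches the final line in the example output above.
        results.append("loaded Torch but CUDA was not available")
        return results
    # These are the CUDA garbage collection functions of interest.
    for name in ("empty_cache", "ipc_collect"):
        state = "available" if hasattr(torch.cuda, name) else "missing"
        results.append("torch.cuda.%s is %s" % (name, state))
    return results


if __name__ == "__main__":
    print(check_torch_cuda())
```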
This is the response I get:
Based on the last item, sticking with the larger model cache looks like the best option for now. If that's not possible, there are a few new optimizations that I want to add that will reduce memory usage (#241), and there will be a button to restart the GPU worker soon (#207).
After updating to v0.8.1, I noticed that my GPU memory was hitting 100% after generating 1 image, significantly impacting generation speed: the first image would generate at ~3 it/s, whereas the second image, with no changes to the model, scheduler, or other settings, would generate at ~4.5 s/it. Following the logs, the diffusion model I was using was being removed from the cache and then reloaded, but it was not fully unloading from my GPU. When I upped my model cache from 2 to 3, the issue was resolved.
Task manager right after finishing the first image:

Task manager right after starting the second image:
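For anyone hitting the same symptom: this behaviour matches a small LRU-style model cache, where using more models/pipelines than the limit allows evicts the oldest entry and forces a slow reload on the next request. A minimal sketch of that pattern, for illustration only (not the onnx-web implementation; the `loader` callback is hypothetical):

```python
from collections import OrderedDict


class ModelCache:
    """Tiny LRU cache: once more models are used than `limit`, the least
    recently used one is dropped and must be reloaded next time."""

    def __init__(self, limit=2):
        self.limit = limit
        self.models = OrderedDict()

    def get(self, name, loader):
        if name in self.models:
            # Cache hit: mark as most recently used and reuse the loaded model.
            self.models.move_to_end(name)
            return self.models[name]
        # Cache miss: the expensive path, reloading weights from disk to the GPU.
        model = loader(name)
        self.models[name] = model
        while len(self.models) > self.limit:
            # Evicted here; if the GPU copy is not also freed, VRAM stays full.
            self.models.popitem(last=False)
        return model
```

With a limit of 2, touching a third cached entry is enough to start constant eviction and reloading (the "4-5 (one of each)" suggestion above implies several model types share one cache), which lines up with raising the limit to 3 making the slowdown go away.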