chore(deps): update onnxruntime-openvino #7854
Conversation
@mertalev Yep, let me have a look and see if this fixes the issue with model loading in different scenarios.
I have noticed that there is an inordinate amount of memory usage when loading the model inside the app versus outside it. Inside the app …
Hmm, that's really weird. I wonder if it's using that fancy AUTO mode where it runs things on CPU until the GPU is ready.
@mertalev maybe I'm missing something, but can't we use the 2023.3.0 OpenVINO docker image? The release page of the Intel onnxruntime fork mentions support for 2023.3.
Oh, you're right. I was looking at the normal 1.17 release where it says …
Yeah, I can debug. Even when using AUTO, there is no separate memory for the integrated GPU, so it would still load things into the computer's RAM. Let me play with some options and see what the best one is.
I looked at it some more and think the app vs. local difference you noticed might just be the fact that it continues compiling the model in the background. When you run inference the first time, there's a background thread that increases RAM usage as it runs. I tried a few things in the provider options, like setting the …, but I also haven't compared with the 1.15 version. How much does it use now?
The silliest of things: we shouldn't use mimalloc. That is causing this issue. I don't think onnxruntime has good support for mimalloc on Linux. Without the LD_PRELOAD stuff in start.sh, the memory consumption was much lower and I was able to run …
And we can remove the chdir stuff, as it is no longer required with this fix.
Good catch! mimalloc is generally great with ONNX Runtime and recommended by them. We rely on it to avoid memory fragmentation and arenas being created for each session. I think this is specifically OpenVINO not working well with it.
That might be the case. I think we can just disable it for the OpenVINO docker image, by using an arg with the shell script or something.
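A minimal sketch of how start.sh could gate the preload behind such a flag. The `DISABLE_MIMALLOC` variable name and the library path are assumptions for illustration, not the actual names used by the project:

```shell
#!/usr/bin/env sh
# Hypothetical sketch: skip the mimalloc preload when a flag is set.
# DISABLE_MIMALLOC and the library path are assumed names; the OpenVINO
# image would set DISABLE_MIMALLOC=true at build time.
lib_mimalloc="/usr/lib/libmimalloc.so"

if [ "${DISABLE_MIMALLOC:-false}" != "true" ] && [ -f "$lib_mimalloc" ]; then
    # Prepend mimalloc, preserving any existing preloads.
    export LD_PRELOAD="$lib_mimalloc${LD_PRELOAD:+:$LD_PRELOAD}"
fi

echo "mimalloc preload: ${LD_PRELOAD:-disabled}"
```

The CPU-only image would leave the flag unset and keep the current behavior, while the OpenVINO image bakes the flag in at build time.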
I tested the current version of this PR with both the default model and …
I'm sorry, but with the latest changes of this PR I'm not able to get smart search to work. With the current release, I'm able to finish the smart search job, but the search function then crashes because of the OpenVINO bugs this PR should fix. With this PR, tested both with immich-machine-learning:pr-7854-openvino and with immich-machine-learning:main-openvino, I get this error:
Also, strangely, I'm seeing very high CPU usage, and checking with intel_gpu_top it seems the GPU is not being used (whereas with the current release it is used during the smart search job). This is with the default model and concurrency=1.
Might be due to a low worker timeout value. Can you try increasing the worker timeout? Set it to 600 or something.
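For reference, raising it in the `.env` file would look like the fragment below. The variable name is my assumption based on this thread's suggestion, so double-check it against the docs:

```shell
# Hedged example: raising the machine-learning worker timeout in the
# immich .env file. 600 seconds gives OpenVINO's one-time model
# compilation enough time to finish before the worker is killed.
MACHINE_LEARNING_WORKER_TIMEOUT=600
```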
Thank you, I've increased the timeout to 600 and now I am able to run the smart search job and also to make searches! I am using the model XLM-Roberta-Large-Vit-B-32; do you think around 7-8 GB of memory usage is normal for this model? Also, just to have a reference, what is your memory usage during search, and which model are you using?
Do you have the request thread pool disabled? I don't see why this should block the server since it runs in a background thread. Or maybe it's just stressing the CPU enough that everything slows to a crawl. The memory usage when running … Caching the compiled model would at least make this a one-time thing since it can be reused.
Nope, I didn't set the MACHINE_LEARNING_REQUEST_THREADS variable in my .env file. The CPU is an Intel N100, so it is not a beast.
The model I am using is about half the parameter count of the … Anyway, I am happy to be able to use the smart search feature, thanks to both of you!
@mertalev Should I make a small PR for this? Also, should it be an option in the form of an environment variable?
A PR would be great! And no, we can just make the cache dir an …
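For anyone following along, the compiled-model caching discussed above would be wired up through the OpenVINO execution provider's `cache_dir` option. A minimal sketch, where the cache path and device type are assumptions and the session call is left commented out since it needs the onnxruntime-openvino build:

```python
# Sketch of OpenVINO EP provider options enabling compiled-model caching.
# "cache_dir" tells OpenVINO where to persist compiled blobs, so the
# expensive compilation step happens once instead of on every restart.
provider_options = {
    "device_type": "GPU",            # assumption: targeting the iGPU
    "cache_dir": "/cache/openvino",  # hypothetical path inside the model cache
}

# With the onnxruntime-openvino package installed, session creation would
# look like this (commented out so the sketch runs anywhere):
# import onnxruntime as ort
# session = ort.InferenceSession(
#     "model.onnx",
#     providers=["OpenVINOExecutionProvider"],
#     provider_options=[provider_options],
# )

print(provider_options["cache_dir"])
```

Pointing `cache_dir` at the existing model cache volume would make the cache survive container restarts without any new configuration.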
Description
This should hopefully fix some of the issues around OpenVINO.
Can you test this? @agrawalsourav98