
chore(deps): update onnxruntime-openvino #7854

Merged
mertalev merged 7 commits into main from chore/upgrade-onnxruntime-openvino on Mar 16, 2024

Conversation

mertalev
Contributor

Description

This should hopefully fix some of the issues around OpenVINO.

Can you test this? @agrawalsourav98


cloudflare-pages bot commented Mar 11, 2024

Deploying immich with Cloudflare Pages

Latest commit: 9c93286
Status: ✅  Deploy successful!
Preview URL: https://204d279c.immich.pages.dev
Branch Preview URL: https://chore-upgrade-onnxruntime-op.immich.pages.dev


@agrawalsourav98
Contributor

@mertalev Yep, let me have a look and see if this fixes the issue with model loading in different scenarios.

@agrawalsourav98
Contributor

> @mertalev Yep, let me have a look and see if this fixes the issue with model loading in different scenarios.

I have noticed an inordinate amount of memory usage when loading the model inside the app compared to outside it. Inside the app, ViT-B-32__openai takes almost 6 GB of RAM, whereas outside (in an interpreter) it takes 1.5 GB. If you can fix the Docker issue, I can run it inside a Docker environment to confirm this behavior. If it holds up, we should investigate before we mark this as a fix for the OpenVINO issues.
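
For reference, a measurement along these lines in a bare interpreter is what I mean by "outside" (a sketch only; the model path is illustrative, and `ru_maxrss` is reported in KB on Linux):

```python
import resource

import onnxruntime as ort

# Load the model with the OpenVINO EP, then report the process's peak RSS.
session = ort.InferenceSession(
    "ViT-B-32__openai/model.onnx",  # illustrative path to the exported model
    providers=["OpenVINOExecutionProvider", "CPUExecutionProvider"],
)

peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss  # KB on Linux
print(f"Peak RSS after model load: {peak_kb / 1024:.0f} MB")
```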

@mertalev
Contributor Author

Hmm, that's really weird. I wonder if it's using that fancy AUTO mode where it runs things on CPU until the GPU is ready.

@dvdblg
Contributor

dvdblg commented Mar 11, 2024

@mertalev maybe I'm missing something, but can't we use the 2023.3.0 OpenVINO Docker image? The release page for the Intel onnxruntime fork mentions support for 2023.3.

@mertalev
Contributor Author

Oh, you're right. I was looking at the normal 1.17 release, where it says "Added support for OpenVINO 2023.2".

@agrawalsourav98
Contributor

Yeah, I can debug. Even when using AUTO, there is no separate memory for the integrated GPU; it would still load everything into the computer's memory. Let me play with some options to see the best one we have.

@mertalev
Contributor Author

I looked at it some more and think the app vs. local difference you noticed might just be the fact that it continues compiling the model in the background. When you run inference the first time, there's a background thread that increases RAM usage as it runs.

I tried a few things in the provider options, like setting the cache_dir, disable_dynamic_shapes=True, num_streams=1, etc., but the memory usage barely changed. Their docs mention lowering compilation threads, but it already looks single-threaded to me.

But I also haven't compared with the 1.15 version. How much does it use now?
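
For context, those options are passed as provider options when the session is created. A sketch of the kind of thing I tried (the option names are from the onnxruntime-openvino docs; the paths and values here are illustrative):

```python
import onnxruntime as ort

# OpenVINO EP options; onnxruntime stringifies the values internally.
openvino_options = {
    "device_type": "GPU",              # target the Intel iGPU
    "cache_dir": "/cache/openvino",    # persist compiled model blobs
    "num_streams": "1",                # limit parallel inference streams
    "disable_dynamic_shapes": "True",  # compile with static shapes
}

session = ort.InferenceSession(
    "model.onnx",
    providers=["OpenVINOExecutionProvider", "CPUExecutionProvider"],
    provider_options=[openvino_options, {}],  # one options dict per provider
)
```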

@agrawalsourav98
Contributor

agrawalsourav98 commented Mar 14, 2024

The silliest of things: we shouldn't use mimalloc. That is what's causing this issue. I don't think onnxruntime has good support for mimalloc on Linux. Without the LD_PRELOAD stuff in start.sh, memory consumption was much lower, and I was able to load ViT-H-14__laion2b-s32b-b79k and run it on the OpenVINO GPU. One thing we should add to the README is to use a larger WORKER_TIMEOUT, as large models take a long time to load on the GPU, and by the time they are loaded they sort of enter a race condition.
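
(For anyone who wants to double-check: an LD_PRELOADed allocator shows up in the process's memory maps, so a quick Linux-only diagnostic sketch from inside the worker would be:)

```python
# Linux-only: scan this process's memory maps for the mimalloc shared library,
# which will be mapped if LD_PRELOAD pulled it in.
with open("/proc/self/maps") as maps:
    preloaded = any("mimalloc" in line for line in maps)

print("mimalloc preloaded:", preloaded)
```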

@agrawalsourav98
Contributor

And we can remove the chdir stuff, as it's no longer required with this fix.

@mertalev
Contributor Author

Good catch! mimalloc is generally great with ONNX Runtime and recommended by them. We rely on it to avoid memory fragmentation and arenas being created for each session. I think this is specifically OpenVINO not working well with it.

@agrawalsourav98
Contributor

> Good catch! mimalloc is generally great with ONNX Runtime and recommended by them. We rely on it to avoid memory fragmentation and arenas being created for each session. I think this is specifically OpenVINO not working well with it.

That might be the case. I think we can just disable it for the OpenVINO Docker image by using an arg with the shell script or something.

@mertalev mertalev force-pushed the chore/upgrade-onnxruntime-openvino branch from c20f8d5 to 5ccdacb on March 15, 2024 22:15
@mertalev
Contributor Author

I tested the current version of this PR with both the default model and ViT-H-14-378-quickgelu__dfn5b on a 13700H and all is well. It seems like they've fixed things for the most part. The memory usage being so different with mimalloc and jemalloc is strange, though, so I think I'll make an issue for that.

@mertalev mertalev merged commit 3a045b3 into main Mar 16, 2024
24 checks passed
@mertalev mertalev deleted the chore/upgrade-onnxruntime-openvino branch March 16, 2024 04:04
@mertalev mertalev mentioned this pull request Mar 16, 2024
@dvdblg
Contributor

dvdblg commented Mar 16, 2024

I'm sorry, but with the latest changes in this PR I'm not able to get smart search to work.

With the current release, I'm able to finish the smart search job, but then the search function crashes because of the OpenVINO bugs that this PR should fix.

With this PR, tested with both immich-machine-learning:pr-7854-openvino and immich-machine-learning:main-openvino, I get this error:

[03/16/24 09:46:15] INFO     Starting gunicorn 21.2.0                           
[03/16/24 09:46:15] INFO     Listening at: http://0.0.0.0:3003 (9)              
[03/16/24 09:46:15] INFO     Using worker: app.config.CustomUvicornWorker       
[03/16/24 09:46:15] INFO     Booting worker with pid: 13                        
[03/16/24 09:46:20] WARNING  Matplotlib created a temporary cache directory at  
                             /tmp/matplotlib-bpxxgisk because the default path  
                             (/.config/matplotlib) is not a writable directory; 
                             it is highly recommended to set the MPLCONFIGDIR   
                             environment variable to a writable directory, in   
                             particular to speed up the import of Matplotlib and
                             to better support multiprocessing.                 
[03/16/24 09:46:22] INFO     Started server process [13]                        
[03/16/24 09:46:22] INFO     Waiting for application startup.                   
[03/16/24 09:46:22] INFO     Created in-memory cache with unloading after 300s  
                             of inactivity.                                     
[03/16/24 09:46:22] INFO     Initialized request thread pool with 4 threads.    
[03/16/24 09:46:22] INFO     Application startup complete.                      
[03/16/24 09:47:21] INFO     Setting 'ViT-B-32__openai' execution providers to  
                             ['OpenVINOExecutionProvider',                      
                             'CPUExecutionProvider'], in descending order of    
                             preference                                         
[03/16/24 09:47:21] INFO     Loading clip model 'ViT-B-32__openai' to memory    
[03/16/24 09:48:23] CRITICAL WORKER TIMEOUT (pid:13)                            
[03/16/24 09:48:23] ERROR    Worker (pid:13) was sent SIGABRT! 

The strange thing is that I'm also seeing very high CPU usage, and checking with intel_gpu_top it seems the GPU is not being used (whereas with the current release it is used during the smart search job).

This is using the default model and concurrency=1

@agrawalsourav98
Contributor

agrawalsourav98 commented Mar 16, 2024

> I'm sorry, but with the latest changes in this PR I'm not able to get smart search to work. […] I get this error:
>
> [03/16/24 09:48:23] CRITICAL WORKER TIMEOUT (pid:13)
> [03/16/24 09:48:23] ERROR    Worker (pid:13) was sent SIGABRT!

This might be due to a low worker timeout value. Can you try increasing it? Set it to 600 or something.

@dvdblg
Contributor

dvdblg commented Mar 16, 2024

> This might be due to a low worker timeout value. Can you try increasing it? Set it to 600 or something.

Thank you, I've increased the timeout to 600 and now I am able to run the smart search job and also to make searches!
To make search work, I also had to increase the memory limit to 10 GB; otherwise I got a SIGKILL from running out of memory.

I am using the XLM-Roberta-Large-Vit-B-32 model; do you think around 7-8 GB of memory usage is normal for it? Also, just to have a reference, what is your memory usage during search, and which model are you using?

@mertalev
Contributor Author

Do you have the request thread pool disabled? I don't see why this should block the server since it runs in a background thread. Or maybe it's just stressing the CPU enough that everything slows to a crawl.

The memory usage when running ViT-H-14-378-quickgelu__dfn5b was definitely much higher than for CUDA or CPU. I think the smart search job used about 8 GB. Based on the OpenVINO docs, this seems to be intended behavior since there's a separate memory-intensive compilation step.

Caching the compiled model would at least make this a one-time thing since it can be reused.

@dvdblg
Contributor

dvdblg commented Mar 16, 2024

> Do you have the request thread pool disabled? I don't see why this should block the server since it runs in a background thread. Or maybe it's just stressing the CPU enough that everything slows to a crawl.

Nope, I didn't set the MACHINE_LEARNING_REQUEST_THREADS variable in my .env file. The CPU is an Intel N100, so it is not a beast.

> The memory usage when running ViT-H-14-378-quickgelu__dfn5b was definitely much higher than for CUDA or CPU. I think the smart search job used about 8 GB. Based on the OpenVINO docs, this seems to be intended behavior since there's a separate memory-intensive compilation step.

The model I am using has about half the parameters of ViT-H-14-378-quickgelu__dfn5b, so the memory usage should be at least somewhat lower than 8 GB, right?

Anyway, I am happy to be able to use the smart search feature; thanks to both of you!

@dvdblg
Contributor

dvdblg commented Mar 16, 2024

@mertalev small update: I tried adding cache_dir to the OpenVINO EP as you suggested, and it indeed greatly reduced memory usage: just ~2.5 GB during smart search with the same model I mentioned above.

Should I make a small PR for this? Also, should it be an option in the form of an environment variable?

@mertalev
Contributor Author

A PR would be great! And no, we can just make the cache dir an openvino folder next to the normal model.onnx, etc.
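
For illustration, something along these lines (a hypothetical sketch; the helper name and layout are assumptions, not the actual immich code):

```python
from pathlib import Path

import onnxruntime as ort

def load_with_openvino_cache(model_path: str) -> ort.InferenceSession:
    """Hypothetical helper: keep compiled OpenVINO blobs next to model.onnx."""
    cache_dir = Path(model_path).parent / "openvino"
    cache_dir.mkdir(parents=True, exist_ok=True)

    return ort.InferenceSession(
        model_path,
        providers=["OpenVINOExecutionProvider", "CPUExecutionProvider"],
        provider_options=[{"cache_dir": str(cache_dir)}, {}],
    )
```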
