feat(ml)!: cuda and openvino acceleration #5619
Conversation
Does this acceleration apply to darktable generation with OpenCL and thumbnails? For large collections it could be awesome. Thanks a lot for this!
Got some issues with this build.
Thanks for testing! It looks like I forgot to change the maxDistance being sent for facial recognition requests. Are you just running on CPU, or are you trying to use an acceleration device? Could you try disabling facial recognition for now, just to see that the other jobs work?
I'm using CUDA, see below. I had to revert to the release build (1.90.2).
That issue should be fixed now.
Rebased on #5667, so that should be merged first.
I did some testing with this and both CUDA and OpenVINO work correctly and actually use the GPU.
LGTM. There are a few places where you still need to replace `hwaccel.yml`: notably the `prepare-release.yml` file, as well as links in the docs to the release artifact.
LGTM! Really like the overhaul of the docker hwaccel, so much more consistent and streamlined now. btw, thanks for adding the documentation for ARMNN :-)
I tested the feat/ml-tensorrt branch yesterday: I let the ML container run on my computer (Arch Linux, GTX 1060) and adjusted the URL for the machine learning server accordingly. @mertalev: Everything worked so far! The face detection step was way faster than on my server CPU (Xeon E3-1220v3). However, the facial recognition part still seemed to be running on the server. Is this not done by the ML container?
Really happy to hear that! Yes, the clustering is all done on CPU. The face detection outputs are stored in Postgres and queried with a special vector search index. The ML service is designed to be very independent - it doesn't integrate with Postgres, Redis, etc. and has no knowledge about what earlier model outputs were. All of that is orchestrated by immich-microservices.
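For intuition, the kind of nearest-neighbor lookup a vector search index enables looks roughly like the sketch below. The table and column names are hypothetical, not Immich's actual schema, and it assumes Postgres with the pgvector extension, whose `<=>` operator computes cosine distance:

```python
# Hypothetical sketch of a nearest-neighbor query over face embeddings.
# Table/column names are invented for illustration; Immich's real schema
# and index differ. Assumes Postgres with the pgvector extension installed.
import psycopg2

conn = psycopg2.connect("dbname=immich")
with conn.cursor() as cur:
    query_embedding = [0.1] * 512  # placeholder vector
    cur.execute(
        """
        SELECT "assetId", embedding <=> %s::vector AS distance
        FROM face_embeddings
        ORDER BY distance
        LIMIT 5;
        """,
        (str(query_embedding),),
    )
    neighbors = cur.fetchall()
```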
I think we can merge this after splitting the documentation into a separate PR that will get merged after the next release. Please feel free to press the green button after doing so! Thank you so much!
Description
Potentially breaking change: `hwaccel.yml` is renamed to `hwaccel.transcoding.yml` and the way it's used in the docker-compose is changed. Existing docker-compose / `hwaccel.yml` setups will continue to work, but if a user who used the `hwaccel.yml` file updates their `docker-compose.yml`, they will need to change to the new format for it to work (or keep the older `extends` section). A sketch of the new layout follows the limitations list below.

This PR adds hardware acceleration support for Nvidia and Intel devices through CUDA and OpenVINO. It uses prebuilt onnxruntime packages for these APIs and updates the ML Dockerfile to conditionally target a device based on the `DEVICE` build arg. There is a check at runtime to detect the available execution providers and set them accordingly (see the second sketch below).

Current limitations:
- onnxruntime-openvino doesn't currently support Python 3.11, so targeting an OpenVINO build will also target 3.10
- The CUDA image is massive, but it can't be helped
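As a rough illustration of the new format, a service definition now pulls in a hardware profile via `extends`. This is a minimal sketch; the service name shown is an assumption to check against the shipped `hwaccel.transcoding.yml`:

```yaml
# Minimal sketch of the new extends-based wiring in docker-compose.yml.
# The service name "nvenc" is illustrative; check hwaccel.transcoding.yml
# for the profiles that actually ship.
services:
  immich-microservices:
    extends:
      file: hwaccel.transcoding.yml
      service: nvenc  # pick the profile matching your hardware
    # ...the rest of the service definition is unchanged
```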
Edit: I'm removing TensorRT support as it's slow to load and uses much more RAM than normal CUDA.
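To make the runtime check concrete, here is a minimal sketch of provider selection with onnxruntime; the preference order and model path are illustrative, not the PR's exact logic:

```python
import onnxruntime as ort

# onnxruntime reports the execution providers compiled into this build.
available = ort.get_available_providers()

# Prefer an accelerated provider when present, falling back to CPU.
# This ordering is illustrative, not Immich's exact logic.
preference = ["CUDAExecutionProvider", "OpenVINOExecutionProvider", "CPUExecutionProvider"]
providers = [p for p in preference if p in available]

# "model.onnx" is a placeholder path for illustration.
session = ort.InferenceSession("model.onnx", providers=providers)
print("Using providers:", session.get_providers())
```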
How has this been tested?
I ran the CPU, CUDA and OpenVINO variants, confirmed successful responses for each task when querying with Postman, and confirmed the CUDA and OpenVINO variants were running on GPU. For OpenVINO, I tested on Linux, but also included the WSL2 configuration recommended by Intel for the OpenVINO image it uses.
While testing, I also ended up increasing test coverage from 72% to 80%.
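For a quick smoke test outside Postman, something like the following can check that the service is up; the port and `/ping` route here are assumptions based on a default setup, not a documented contract:

```python
# Hypothetical smoke test against the ML service. The default port (3003)
# and the /ping health route are assumptions; adjust for your deployment.
import requests

resp = requests.get("http://localhost:3003/ping", timeout=5)
print(resp.status_code, resp.text)  # expect 200 if the service is healthy
```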