feat(ml): ARMNN acceleration #5667

fyfrey · 2023-12-13T00:04:56Z

A very long journey to find a way to use ARMNN (on ARM Mali GPUs) to offload and accelerate the Immich ML models (CLIP vision, face detection and face recognition).

code overview

new native library libann that provides C/Python (ANN) access to the C++ ARMNN library
libann is wrapped as an ONNX-session-like object, allowing it to be a drop-in replacement for all compatible ONNX models (see the minimal code changes to clip.py and facial_recognition.py)
modified Docker image for ARM to always include ARMNN + ANN
added a new exporter that can produce .armnn model files for CLIP vision, ArcFace and RetinaFace (it cannot be integrated into the existing exporter because it needs pytorch==1.x). Suggested workflow: iterate over the compatible models already on Immich's Hugging Face, export the ARMNN models, and upload them to the existing HF models
compatible ARMNN models need to be uploaded to the existing Immich Hugging Face models
ANN is automatically selected at runtime if (1) it is enabled in the config (default: true), (2) the required libraries are mapped into the container, and (3) the model has an .armnn file in addition to its .onnx file
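
The three runtime conditions in the last item can be sketched as a single check. This is only an illustration of the described logic, not the code from this PR: the function name `should_use_ann`, the `ann_enabled` flag, the `library_available` callback and the assumption that both model files sit in one directory are all hypothetical.

```python
from pathlib import Path
from typing import Callable


def should_use_ann(
    model_dir: Path,
    ann_enabled: bool = True,
    library_available: Callable[[], bool] = lambda: False,
) -> bool:
    """Return True only if all three conditions from the description hold.

    All names here are hypothetical; the real implementation may differ.
    """
    # 1. ANN must be enabled in the config (the description says default true).
    if not ann_enabled:
        return False
    # 2. The required native libraries must be mapped into the container.
    #    The caller supplies the probe, e.g. an attempt to load the library.
    if not library_available():
        return False
    # 3. The model must ship an .armnn file in addition to its .onnx file.
    has_onnx = any(model_dir.glob("*.onnx"))
    has_armnn = any(model_dir.glob("*.armnn"))
    return has_onnx and has_armnn
```

If any condition fails, the caller would presumably fall back to the regular ONNX path, which matches the statement that ONNX models still work.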

benchmarks on an Orange Pi 5 (RK3588 SoC)

"ENCODE CLIP" + "RECOGNIZE FACES" job for 3000 images, concurrency 4

ARMNN (fp16):
RAM: 680 MB
CPU: 225%
total time for 3000 images: 3 minutes
SoC temp: 60.1°C

ONNX (fp32):
RAM: 980 MB
CPU: 745%
total time for 3000 images: 45 minutes
SoC temp: 81.3°C
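
For a quick comparison, the two runs above can be reduced to a few ratios. The snippet uses only the figures reported in this benchmark; the variable and function names are illustrative, and the derived numbers are as rough as the rounded minute counts they come from.

```python
# Figures copied from the benchmark above (3000 images, concurrency 4).
IMAGES = 3000
armnn = {"minutes": 3, "ram_mb": 680, "cpu_percent": 225}
onnx = {"minutes": 45, "ram_mb": 980, "cpu_percent": 745}


def images_per_second(minutes: float) -> float:
    """Average throughput over the whole job."""
    return IMAGES / (minutes * 60)


speedup = onnx["minutes"] / armnn["minutes"]          # 45 / 3 = 15.0
ram_saving = 1 - armnn["ram_mb"] / onnx["ram_mb"]     # fraction of RAM saved
cpu_ratio = onnx["cpu_percent"] / armnn["cpu_percent"]

print(f"speedup: {speedup:.1f}x")
print(f"ARMNN throughput: {images_per_second(armnn['minutes']):.2f} img/s")
print(f"ONNX throughput:  {images_per_second(onnx['minutes']):.2f} img/s")
print(f"RAM saving: {ram_saving:.0%}, CPU load ratio: {cpu_ratio:.1f}x")
```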

how has this been tested

built on x86_64 and ARM64
manually added the ARMNN models to the model-cache folder
checked that ONNX models still work (and produce sensible results)
checked that ARMNN models work (and produce sensible results)
performed above benchmark

cloudflare-pages · 2023-12-13T22:52:22Z

Deploying with Cloudflare Pages

Latest commit:	`9f64cd3`
Status:	✅ Deploy successful!
Preview URL:	https://f0d81e3b.immich.pages.dev
Branch Preview URL:	https://ml-armnn.immich.pages.dev

mertalev

This is really cool! It's good aside from some code style comments.

machine-learning/ann/ann.py

machine-learning/app/models/ann.py

machine-learning/ann/ann.py

machine-learning/app/models/ann.py

machine-learning/app/models/base.py

fyfrey · 2023-12-19T12:52:14Z

This is really cool! It's good aside from some code style comments.

Thanks, also for the good review! The only remaining problem is the test/mock failure. I'm not sure how to make the mocking work again.

mertalev

LGTM

bo0tzz added the 🧠machine-learning label Dec 13, 2023

fyfrey force-pushed the ml/armnn branch from 4b50543 to b86f4de Compare December 13, 2023 22:52

fyfrey force-pushed the ml/armnn branch 2 times, most recently from 9b426b4 to 81622d8 Compare December 13, 2023 23:04

fyfrey changed the title ~~feat(ml): ARMNN acceleration for CLIP~~ feat(ml): ARMNN acceleration Dec 18, 2023

fyfrey requested a review from mertalev December 18, 2023 23:12

fyfrey marked this pull request as ready for review December 18, 2023 23:12

fyfrey force-pushed the ml/armnn branch 6 times, most recently from 7e46963 to 1810624 Compare December 18, 2023 23:59

mertalev reviewed Dec 19, 2023

fyfrey force-pushed the ml/armnn branch from 4e3befb to c5943c5 Compare December 19, 2023 12:47

mertalev force-pushed the ml/armnn branch from d4dcdff to aa0dcb7 Compare December 21, 2023 04:17

mertalev approved these changes Dec 21, 2023

mertalev mentioned this pull request Dec 21, 2023

feat(ml)!: cuda and openvino acceleration #5619

Merged

fyfrey added the feel-free-to-merge-👍 label Dec 22, 2023

fyfrey and others added 8 commits January 11, 2024 18:14

feat(ml): ARMNN acceleration for CLIP

809ae4d

wrap ANN as ONNX-Session; try fix tests; review feedback; fix mock; mypy; formatting

strict typing

8abb4a3

encode strings to bytes

729a5ed

normalize ARMNN CLIP embedding

5c164be

mutex to handle concurrent execution

62f5475

make inputs contiguous

080c248

mypy fixes

73f0656

fine-grained locking; concurrent network execution

9f64cd3
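
Several of the commit titles above (wrapping ANN as an ONNX-style session, a mutex for concurrent execution, contiguous inputs, normalized CLIP embeddings) describe small, common techniques. The following is a minimal sketch of how they might fit together, not the code from this PR: `AnnLikeSession`, its `execute` callback (a stand-in for the native libann call) and `normalize` are hypothetical names, and only the `run(output_names, input_feed)` shape is borrowed from ONNX Runtime's `InferenceSession.run`.

```python
import threading
from typing import Callable, Optional

import numpy as np


class AnnLikeSession:
    """Hypothetical ONNX-session-style wrapper around a native inference call."""

    def __init__(
        self, execute: Callable[[list[np.ndarray]], list[np.ndarray]]
    ) -> None:
        self._execute = execute        # stand-in for the native library call
        self._lock = threading.Lock()  # serialize access to the native backend

    def run(
        self,
        output_names: Optional[list[str]],
        input_feed: dict[str, np.ndarray],
    ) -> list[np.ndarray]:
        # output_names is accepted only for signature compatibility here.
        # Native code usually expects C-contiguous buffers, so copy if needed.
        inputs = [np.ascontiguousarray(v) for v in input_feed.values()]
        with self._lock:
            return self._execute(inputs)


def normalize(embedding: np.ndarray) -> np.ndarray:
    """Scale an embedding to unit L2 norm (zero vectors are returned as-is)."""
    norm = np.linalg.norm(embedding)
    return embedding / norm if norm > 0 else embedding
```

A single lock around the whole call is the simplest form; the last commit title suggests the PR later moved to finer-grained locking so that networks can execute concurrently, which this sketch does not attempt.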

fyfrey force-pushed the ml/armnn branch from 3f51cfa to 9f64cd3 Compare January 11, 2024 17:15

fyfrey merged commit 7532929 into main Jan 11, 2024
21 checks passed

fyfrey deleted the ml/armnn branch January 11, 2024 17:26
