
feat(ml): export clip models to ONNX and host models on Hugging Face #4700

Merged
alextran1502 merged 16 commits into main from chore/ml-export-models on Oct 31, 2023

Conversation

mertalev
Contributor

@mertalev mertalev commented Oct 29, 2023

Description

We currently use clip-as-service for downloading CLIP models. The motivation for using it was to avoid exporting models ourselves and to have models ready to use immediately after download, without exporting to ONNX at runtime. However, this has caused a number of issues, particularly because the hosting server is intermittently unavailable.

This PR transitions away from clip-as-service; we now handle model exporting and hosting ourselves. The full ONNX catalog of clip-as-service is supported for feature parity and backwards compatibility, and although models are downloaded into a slightly different cache structure than before, the download happens automatically. As a result, this is a drop-in replacement that should not require any manual intervention.

Exported models are uploaded to a brand new set of Hugging Face repos under a new organization. The relevant model repos are downloaded at runtime, and each repo is completely self-contained with all the files it needs.
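For a rough idea of what the runtime download looks like (a minimal sketch, assuming huggingface_hub is available; the repo name is one of the new immich-app repos, but the cache path and helper function are illustrative rather than the service's actual code):

```python
# Minimal sketch: fetch a self-contained model repo into a local cache directory.
from pathlib import Path

from huggingface_hub import snapshot_download


def download_clip_model(model_name: str, cache_dir: Path) -> Path:
    # The org prefix and cache layout here are illustrative.
    repo_id = f"immich-app/{model_name}"
    target = cache_dir / model_name
    # Downloads every file in the repo (visual/textual ONNX models, tokenizer and
    # preprocessing configs), so the model runs without any export step at runtime.
    snapshot_download(repo_id, local_dir=target)
    return target


model_dir = download_clip_model("ViT-B-32__openai", Path("/cache/clip"))
```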

The CLIP implementation in the ML service has been refactored to integrate with these repos. Moreover, all dependence on PyTorch has been removed from this section of the code: preprocessing is now done exclusively with Pillow and NumPy. This paves the way for shrinking the image size considerably, leaving the image classification code as the only remaining reliance on PyTorch.
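For reference, here is a minimal sketch of CLIP image preprocessing using only Pillow and NumPy (the resolution and normalization constants below are the common OpenAI CLIP defaults and are shown purely for illustration; in practice these values come from each model repo's preprocessing config):

```python
import numpy as np
from PIL import Image

SIZE = 224  # illustrative; the actual size comes from the model's config
MEAN = np.array([0.48145466, 0.4578275, 0.40821073], dtype=np.float32)
STD = np.array([0.26862954, 0.26130258, 0.27577711], dtype=np.float32)


def preprocess(image: Image.Image) -> np.ndarray:
    # Resize the shorter side to SIZE, then center-crop a SIZE x SIZE patch.
    width, height = image.size
    scale = SIZE / min(width, height)
    image = image.resize((round(width * scale), round(height * scale)), Image.BICUBIC)
    left = (image.width - SIZE) // 2
    top = (image.height - SIZE) // 2
    image = image.crop((left, top, left + SIZE, top + SIZE)).convert("RGB")

    # Scale to [0, 1], normalize per channel, and reorder to NCHW for ONNX Runtime.
    arr = np.asarray(image, dtype=np.float32) / 255.0
    arr = (arr - MEAN) / STD
    return np.expand_dims(arr.transpose(2, 0, 1), 0)
```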

While this PR is focused on CLIP, using our own Hugging Face repos for models enables many exciting possibilities in the future. This is just the start.

How has this been tested?

Every model listed here has been tested with Postman for both image and text. Additionally, I tested text search with ViT-B-32__openai before running an Encode CLIP job, confirming the results were relevant (i.e. the model outputs are correct and compatible with existing embeddings). The Encode CLIP job ran successfully as well, as did changing the model to XLM-Roberta-Large-Vit-L-14 (an M-CLIP model that is handled differently than OpenAI and OpenCLIP models).

Fixes #4117


@fyfrey fyfrey (Contributor) left a comment

This is really good work! Looking forward to finally removing the large PyTorch dependency. Even so, removing the dependency on clip-as-service is already great! The fewer deps, the better :-)
I really like the new export functionality/image.

machine-learning/export/env.yaml (review thread resolved)
machine-learning/app/models/clip.py (review thread resolved, outdated)
@fyfrey
Contributor

fyfrey commented Oct 30, 2023

Could you add a short README for the exporter? What it does and how to use it with an example model.

@alextran1502 alextran1502 merged commit 87a0ba3 into main Oct 31, 2023
21 checks passed
@alextran1502 alextran1502 deleted the chore/ml-export-models branch October 31, 2023 10:02
@aviv926
Contributor

aviv926 commented Oct 31, 2023

I'm trying to check whether creating another download source for the models is a problem in terms of copyright. From the license conditions, as far as I understand, there shouldn't be an issue, but I'd rather ask anyway.

If you know the answer to my question, I would love to hear it

waclaw66 pushed a commit to waclaw66/immich that referenced this pull request Nov 1, 2023
…mmich-app#4700)

* export clip models

* export to hf

refactored export code

* export mclip, general refactoring

cleanup

* updated conda deps

* do transforms with pillow and numpy, add tokenization config to export, general refactoring

* moved conda dockerfile, re-added poetry

* minor fixes

* updated link

* updated tests

* removed `requirements.txt` from workflow

* fixed mimalloc path

* removed torchvision

* cleaner np typing

* review suggestions

* update default model name

* update test
@nodis

nodis commented Nov 1, 2023

The model used in my Smart Search is "M-CLIP/XLM-Roberta-Large-Vit-B-16Plus". Do I need to modify this name? For example, to "immich-app/XLM-Roberta-Large-Vit-B-16Plus"? Thank you.

@mertalev
Contributor Author

mertalev commented Nov 1, 2023

It ignores anything before the slash, so you should be fine.
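In other words (a tiny illustrative sketch, not the service's actual resolution code), the part before the slash is simply dropped when the model name is resolved:

```python
# Illustrative only: both spellings resolve to the same underlying model name.
def resolve_model_name(name: str) -> str:
    return name.split("/")[-1]


assert resolve_model_name("M-CLIP/XLM-Roberta-Large-Vit-B-16Plus") == "XLM-Roberta-Large-Vit-B-16Plus"
assert resolve_model_name("XLM-Roberta-Large-Vit-B-16Plus") == "XLM-Roberta-Large-Vit-B-16Plus"
```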

@nodis

nodis commented Nov 1, 2023

It ignores anything before the slash, so you should be fine.

My "model-cache\clip" directory originally had a folder named "M-CLIP_XLM-Robertsa-Large Vit-B-16Plus". After upgrading to 1.84, a folder named "XLM-Robertsa-Large Vit-B-16Plus" appeared. Can I delete the "M-CLIP_XLM-Robertsa-Large Vit-B-16Plus" folder?

@mertalev
Contributor Author

mertalev commented Nov 1, 2023

Yes, that's a stale folder at this point.

rikifrank pushed a commit to shefing/immich that referenced this pull request Nov 6, 2023
claabs pushed a commit to claabs/immich-machine-learning-openvino-kernel-fix that referenced this pull request Aug 11, 2024
@sushilkhadkaanon

Hi @mertalev, @aviv926. Do you provide any script for converting an open_clip model to ONNX? I've been stuck converting apple/DFN2B-CLIP-ViT-L-14. Also, is the model immich-app/ViT-L-14-quickgelu__dfn2b under your Hugging Face repos the same one?

@aviv926
Contributor

aviv926 commented Sep 6, 2024

Hi @mertalev, @aviv926. Do you provide any script for converting an open_clip model to ONNX? I've been stuck converting apple/DFN2B-CLIP-ViT-L-14. Also, is the model immich-app/ViT-L-14-quickgelu__dfn2b under your Hugging Face repos the same one?

@mertalev wrote a script that does this.

The export code is available here: https://github.com/immich-app/immich/tree/main/machine-learning/export

It downloads the open_clip model, traces it to TorchScript, and exports the TorchScript model to ONNX.
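For a rough idea of what that entails, here is a minimal sketch (assuming open_clip and torch are installed; the model/pretrained tags follow the question above, while the input shape, opset version, and output file name are illustrative, and the actual export script also handles the text tower, tokenizer config, and preprocessing settings):

```python
# Minimal sketch: trace the visual tower of an open_clip model and export it to ONNX.
import open_clip
import torch

model, _, _ = open_clip.create_model_and_transforms("ViT-L-14-quickgelu", pretrained="dfn2b")
model.eval()

dummy_image = torch.randn(1, 3, 224, 224)
traced_visual = torch.jit.trace(model.visual, dummy_image)

torch.onnx.export(
    traced_visual,
    dummy_image,
    "visual.onnx",
    input_names=["image"],
    output_names=["embedding"],
    dynamic_axes={"image": {0: "batch"}},  # allow variable batch size
    opset_version=17,
)
```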


Successfully merging this pull request may close these issues.

[BUG] Unable to download CLIP model for search
6 participants