Support for matryoshka embeddings by wirthual · Pull Request #490 · michaelfeil/infinity

wirthual · 2024-12-06T04:13:38Z

Related Issue

Checklist

I have read the CONTRIBUTING guidelines.
I have added tests to cover my changes.
I have updated the documentation (docs folder) accordingly.

Additional Notes

WIP to add matryoshka embeddings.

Is there a CLAP model which supports matryoshka embedding for testing?
Is there a TinyCLIP model which supports matryoshka embedding for testing?

Currently missing:
[ ] Integration into client
[ ] Implementation for dummy model

greptile-apps

PR Summary

Here's my summary of the key changes in this PR:

Adds support for matryoshka (variable-length) embeddings across the infinity library with the following major changes:

Added dimensions field to OpenAI embedding input model in pymodels.py to specify desired embedding length
Modified BatchHandler to truncate embeddings to requested dimension after generation in batch_handler.py
Added matryoshka_dim parameter to embedding methods in AsyncEmbeddingEngine and AsyncEngineArray
Added comprehensive test coverage verifying matryoshka functionality:
- Tests with nomic-embed-text-v1.5 and jina-clip-v2 models
- Validates truncated embeddings maintain semantic similarity
- Verifies correct dimensions in API responses

The implementation enables compatibility with models like OpenAI's text-embedding-3 that support variable-length embeddings while maintaining backward compatibility.
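The truncation step described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from batch_handler.py: the function name `truncate_embedding` and the `renormalize` flag are assumptions made for this example, and the summary does not say whether the PR re-normalizes after truncating.

```python
import math
from typing import Optional, Sequence


def truncate_embedding(
    embedding: Sequence[float],
    matryoshka_dim: Optional[int] = None,
    renormalize: bool = False,  # assumption: not confirmed by the PR
) -> list[float]:
    """Keep only the first `matryoshka_dim` components of an embedding."""
    if matryoshka_dim is None:
        # No dimension requested: return the full-size embedding unchanged.
        return list(embedding)
    truncated = list(embedding[:matryoshka_dim])
    if renormalize:
        # Rescale to unit length so cosine and dot-product scores agree.
        norm = math.sqrt(sum(x * x for x in truncated))
        if norm > 0:
            truncated = [x / norm for x in truncated]
    return truncated


full = [0.5, 0.5, 0.5, 0.5]
short = truncate_embedding(full, matryoshka_dim=2)
unit = truncate_embedding(full, matryoshka_dim=2, renormalize=True)
```

Matryoshka-trained models place the most important information in the leading components, which is why a plain prefix slice is enough here.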

Note: PR is marked WIP and still needs:

Integration into client
Implementation for dummy model
Additional test coverage for edge cases

7 file(s) reviewed, 14 comment(s)

libs/infinity_emb/infinity_emb/fastapi_schemas/pymodels.py

greptile-apps · 2024-12-06T04:14:09Z

libs/infinity_emb/infinity_emb/sync_engine.py


    @add_start_docstrings(AsyncEngineArray.embed.__doc__)
-    def embed(self, *, model: str, sentences: list[str]):
+    def embed(self, *, model: str, sentences: list[str], matryoshka_dim=None):


style: matryoshka_dim parameter lacks type annotation. Should be Optional[int]

greptile-apps · 2024-12-06T04:14:10Z

libs/infinity_emb/infinity_emb/sync_engine.py


    @add_start_docstrings(AsyncEngineArray.image_embed.__doc__)
-    def image_embed(self, *, model: str, images: list[Union[str, bytes]]):
+    def image_embed(self, *, model: str, images: list[Union[str, bytes]], matryoshka_dim=None):


style: matryoshka_dim parameter lacks type annotation. Should be Optional[int]

greptile-apps · 2024-12-06T04:14:11Z

libs/infinity_emb/infinity_emb/sync_engine.py


    @add_start_docstrings(AsyncEngineArray.audio_embed.__doc__)
-    def audio_embed(self, *, model: str, audios: list[Union[str, bytes]]):
+    def audio_embed(self, *, model: str, audios: list[Union[str, bytes]], matryoshka_dim=None):


style: matryoshka_dim parameter lacks type annotation. Should be Optional[int]
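For illustration, the annotation requested in these comments could look like the sketch below. `SyncEngineSketch` is a hypothetical stand-in written for this example, not the real class in sync_engine.py, and its method simply echoes its arguments.

```python
from typing import Optional, get_type_hints


class SyncEngineSketch:
    """Hypothetical stand-in used only to show the annotated signature."""

    def embed(
        self,
        *,
        model: str,
        sentences: list[str],
        matryoshka_dim: Optional[int] = None,
    ) -> tuple[str, list[str], Optional[int]]:
        # A real engine would run the model; this sketch echoes the arguments.
        return model, sentences, matryoshka_dim


# The annotation is now visible to type checkers and to runtime introspection.
hints = get_type_hints(SyncEngineSketch.embed)
```

With the annotation in place, a type checker can flag calls such as `matryoshka_dim="64"` before they reach the model.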

libs/infinity_emb/tests/end_to_end/test_api_with_dummymodel.py

greptile-apps · 2024-12-06T04:14:21Z

libs/infinity_emb/infinity_emb/engine.py


    async def image_embed(
-        self, *, model: str, images: list[Union[str, "ImageClassType"]]
+        self, *, model: str, images: list[Union[str, "ImageClassType"]], matryoshka_dim=None


style: matryoshka_dim parameter is missing type annotation, should be Optional[int]

greptile-apps · 2024-12-06T04:14:21Z

libs/infinity_emb/infinity_emb/engine.py


    async def audio_embed(
-        self, *, model: str, audios: list[Union[str, bytes]]
+        self, *, model: str, audios: list[Union[str, bytes]], matryoshka_dim=None


style: matryoshka_dim parameter is missing type annotation, should be Optional[int]

greptile-apps · 2024-12-06T04:14:37Z

libs/infinity_emb/tests/unit_test/test_engine.py

+    )
+    assert engine.capabilities == {"embed"}
+    async with engine:
+        embeddings, usage = await engine.embed(sentences=sentences, matryoshka_dim=matryoshka_dim)


logic: matryoshka_dim parameter should be validated against model's supported dimensions
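One way to read this suggestion is a small check that runs before truncation. The sketch below assumes that any prefix length up to the model's full embedding size is acceptable; a model may in practice support only specific sizes. The helper name `resolve_matryoshka_dim` and the `model_dim` argument are hypothetical and do not come from the PR.

```python
from typing import Optional


def resolve_matryoshka_dim(requested: Optional[int], model_dim: int) -> int:
    """Return the dimension to truncate to, or raise if it is out of range."""
    if requested is None:
        # No dimension requested: fall back to the model's full size.
        return model_dim
    if not 0 < requested <= model_dim:
        raise ValueError(
            f"matryoshka_dim must be in 1..{model_dim}, got {requested}"
        )
    return requested


# 768 is used here purely as an example of a full embedding size.
chosen = resolve_matryoshka_dim(64, model_dim=768)
```

Failing early with a clear error avoids silently returning a full-size vector when the caller asked for more dimensions than the model produces.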

greptile-apps · 2024-12-06T04:14:38Z

libs/infinity_emb/tests/unit_test/test_engine.py

+        embeddings = np.array(embeddings)
+        assert usage == sum([len(s) for s in sentences])
+        assert embeddings.shape[0] == len(sentences)
+        assert embeddings.shape[1] >= 10


style: redundant assertion since line 408 already checks exact dimension

libs/infinity_emb/tests/end_to_end/test_openapi_client_compat.py

codecov-commenter · 2024-12-06T16:53:33Z

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 92.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 79.63%. Comparing base (be48378) to head (9c811fb).
Report is 19 commits behind head on main.

Files with missing lines	Patch %	Lines
libs/infinity_emb/infinity_emb/engine.py	87.50%	1 Missing ⚠️
libs/infinity_emb/infinity_emb/sync_engine.py	83.33%	1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #490      +/-   ##
==========================================
+ Coverage   79.59%   79.63%   +0.04%     
==========================================
  Files          41       41              
  Lines        3430     3438       +8     
==========================================
+ Hits         2730     2738       +8     
  Misses        700      700


wirthual · 2024-12-06T17:02:51Z

I did a quick test like this:

from openai import OpenAI

client = OpenAI(
    base_url="http://0.0.0.0:7997",
    api_key="sk", 
)

result = client.embeddings.create(
    input=["input","input2"],
    model="nomic-ai/nomic-embed-text-v1.5",
    dimensions=64
)

assert len(result.data[0].embedding) == 64

michaelfeil · 2024-12-09T01:44:16Z

libs/infinity_emb/infinity_emb/fastapi_schemas/pymodels.py

    model: str = "default/not-specified"
    encoding_format: EmbeddingEncodingFormat = EmbeddingEncodingFormat.float
    user: Optional[str] = None
+    dimensions: Optional[int] = None


int should be 0 < x < 8193, using pydantic v2 conint
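A constraint like the one requested here might be written as in the sketch below. `EmbeddingInputSketch` is a trimmed-down, hypothetical model for illustration, not the actual class in pymodels.py, and the merged code may differ. It uses `Annotated` with `Field(gt=0, lt=8193)`; pydantic's `conint(gt=0, lt=8193)` expresses the same bounds.

```python
from typing import Annotated, Optional

from pydantic import BaseModel, Field, ValidationError


class EmbeddingInputSketch(BaseModel):
    """Hypothetical request model showing only the relevant fields."""

    model: str = "default/not-specified"
    user: Optional[str] = None
    # Accept None (full size) or an integer with 0 < dimensions < 8193.
    dimensions: Optional[Annotated[int, Field(gt=0, lt=8193)]] = None


ok = EmbeddingInputSketch(dimensions=64)

try:
    EmbeddingInputSketch(dimensions=0)
    rejected_zero = False
except ValidationError:
    # Out-of-range values are rejected during request validation.
    rejected_zero = True
```

Because the bounds live in the schema, FastAPI would also surface them in the generated OpenAPI spec and reject invalid requests before they reach the engine.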

michaelfeil · 2024-12-09T01:46:11Z

LGTM, if you change the OpenAPI spec for the validation of input and add an end-to-end test

michaelfeil · 2024-12-09T02:05:19Z

@wirthual
Potentially this could be an end-to-end test under open

from openai import OpenAI

client = OpenAI(
    base_url="http://0.0.0.0:7997",
    api_key="sk", 
)

result = client.embeddings.create(
    input=["input","input2"],
    model="nomic-ai/nomic-embed-text-v1.5",
    dimensions=64
)

assert len(result.data[0].embedding) == 64

wirthual · 2024-12-09T17:15:33Z

Sounds good. Is there an example of how to start a FastAPI server within a pytest method without using the AsyncOpenAI client?

michaelfeil · 2024-12-09T17:49:30Z

Just add one here:
https://github.com/michaelfeil/infinity/blob/be483785f23c3e2a738c85028cbac3a390ec2bab/libs/infinity_emb/tests/end_to_end/test_openapi_client_compat.py#L115C9-L115C21
As with the other tests, we mostly avoid pytest.mark.parametrize here so that the server does not need to restart every time.

wirthual · 2024-12-09T17:54:33Z

Like this?

michaelfeil

Okay, nevermind.. :)

initial commits for matryoshka_dim

a0b5cc4

greptile-apps bot reviewed Dec 6, 2024

wirthual added 3 commits December 6, 2024 05:20

add missing type hints

7917c56

format. Use future annotations

d8ad010

add dims to server

1de6c52

michaelfeil reviewed Dec 9, 2024

add constraints for dimensions

9c811fb

michaelfeil approved these changes Dec 9, 2024

wirthual merged commit efe6096 into main Dec 10, 2024

wirthual deleted the matryoshka_dim branch December 10, 2024 02:07
