Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adds custom vectorizer support #156

Merged
merged 9 commits into from
Jun 25, 2024
Merged

adds custom vectorizer support #156

merged 9 commits into from
Jun 25, 2024

Conversation

justin-cechmanek
Copy link
Collaborator

This PR introduces a new CustomTextVectorizer class that allows for users to wrap their own embedding functions and methods to be compatible with RedisVL.

@justin-cechmanek justin-cechmanek self-assigned this Jun 6, 2024
@justin-cechmanek justin-cechmanek marked this pull request as ready for review June 6, 2024 18:25
Copy link
Collaborator

@tylerhutcherson tylerhutcherson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few suggestions


super().__init__(model=model, dims=self._set_model_dims())

def _validate_embed(self, func: Callable):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given this is wrapping a pydantic BaseModel object, these validators should be implemented using the validator operators. (there should be some other examples of this here)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrapping these with @validator decorators calls them before the embed methods are set in super().__init__. Pydantic enforces a certain order in initialization so we end up getting trapped in a circular dependency.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine for now, but we will look at this later though (maybe for 0.3.0) because I believe we can find a cleaner way to use built-in validation, pre or post init.

redisvl/utils/vectorize/text/custom.py Show resolved Hide resolved
Copy link
Collaborator

@tylerhutcherson tylerhutcherson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few more tweaks to the notebook

@tylerhutcherson
Copy link
Collaborator

@justin-cechmanek Can you work on rebasing these changes on top of main? Then we can get this in today.

Copy link
Collaborator

@tylerhutcherson tylerhutcherson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@tylerhutcherson tylerhutcherson merged commit e332bea into main Jun 25, 2024
20 checks passed
@tylerhutcherson tylerhutcherson deleted the jc/vectorizer-support branch June 25, 2024 13:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve & Enrich the LLM client implementation for TextVectorizer
2 participants