[Feature Request]: add more operators in where filter by metadata #2036

Open
stu-GYULA opened this issue Apr 21, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@stu-GYULA

Describe the problem

I created a collection in which every document consists of page_content and a 'style' metadata field. The style value is a comma-joined list of style types, for example 'modern style, log style, french style'. When I run a vector query, I want to keep only the documents whose style metadata contains a given style name. However, I found that $contains is supported only by the where_document filter, not by the where filter. Will this feature be added?

Describe the proposed solution

Add contains and not_contains operators to the where filter.
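
To make the request concrete, here is a minimal sketch of the semantics being asked for, written as a client-side filter over plain Python dicts. It is not Chroma's API: the record layout and the helper names `metadata_contains` and `filter_records` are hypothetical and exist only for illustration. It assumes substring matching on a string metadata value, and that a record missing the key counts as "not containing" the value.

```python
from typing import Any, Dict, List

# Hypothetical record shape: {"page_content": str, "metadata": {...}}
Record = Dict[str, Any]


def metadata_contains(record: Record, key: str, needle: str) -> bool:
    """Return True if the string metadata value under `key` contains `needle`."""
    value = record["metadata"].get(key)
    # Non-string or missing values never match (an assumption of this sketch).
    return isinstance(value, str) and needle in value


def filter_records(
    records: List[Record], key: str, needle: str, negate: bool = False
) -> List[Record]:
    """Keep records that contain `needle` (or that do not, when negate=True)."""
    return [r for r in records if metadata_contains(r, key, needle) != negate]


docs = [
    {"page_content": "A", "metadata": {"style": "modern style, log style, french style"}},
    {"page_content": "B", "metadata": {"style": "modern style"}},
    {"page_content": "C", "metadata": {"style": "french style, rustic style"}},
]

# "contains": documents whose style metadata mentions 'french style'
french = filter_records(docs, "style", "french style")

# "not_contains": documents whose style metadata does not mention it
not_french = filter_records(docs, "style", "french style", negate=True)
```

With the sample data, `french` holds documents A and C and `not_french` holds document B. Filtering like this after retrieval scans every returned record, so it only illustrates the intended behavior and is not a substitute for a server-side operator.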

Alternatives considered

No response

Importance

nice to have

Additional Information

No response

@stu-GYULA stu-GYULA added the enhancement New feature or request label Apr 21, 2024
@jeffchuber
Contributor

@tazarov I'm pretty sure there is already an issue for this. Can you help locate it? Thank you! (Ironically, this is a good use case for vector search, which GitHub does not have natively (yet).)

@tazarov
Contributor

tazarov commented Apr 22, 2024

We have a PR (#1196) with this functionality, which has been pending for a while. A lot of people seem to be interested in this. The main challenge is feature parity with distributed/hosted Chroma. Some things are relatively simple to implement in both, but from discussion with @HammadB and @beggers, it appears that this feature might take some time to support in the Rust backend.

It is worth spending some cycles thinking about the most efficient way to incorporate features for experimentation, so that people can try them out and decide whether they are worth it, and so that the team can figure out what feature parity for this might look like in distributed/hosted.

Here is a practical view of things:

Hypothesis: like/contains mechanics for metadata fields seem like a good idea.
Reality: Empirical evidence of metadata performance shows that a feature like this can be challenging to scale beyond trivial database sizes (e.g. to 100k+ records) on a single-node Chroma. (Side note: metadata performance is something that I am actively exploring.)

@stu-GYULA
Author

ok, thanks
