[FEATURE] Support filtering by Metadata during search in EmbeddingStore #151
Comments
Use two collections, or use a field to separate them at query time. |
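The two options in the comment above can be sketched in plain Java. This is only an illustration under stated assumptions: `TenantStores` and its methods are hypothetical names, not part of the langchain4j API, and plain strings stand in for embedded documents.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the langchain4j API.
public class TenantStores {

    // Option 1: one collection per application.
    private final Map<String, List<String>> collections = new HashMap<>();

    // Option 2: one shared collection where each row carries an app field: {app, text}.
    private final List<String[]> shared = new ArrayList<>();

    // Store the document under both layouts so they can be compared.
    public void add(String app, String text) {
        collections.computeIfAbsent(app, k -> new ArrayList<>()).add(text);
        shared.add(new String[] {app, text});
    }

    // Option 1 lookup: only the application's own collection is ever read.
    public List<String> fromOwnCollection(String app) {
        return collections.getOrDefault(app, List.of());
    }

    // Option 2 lookup: the shared collection is filtered by the app field at query time.
    public List<String> fromSharedCollection(String app) {
        List<String> out = new ArrayList<>();
        for (String[] row : shared) {
            if (row[0].equals(app)) {
                out.add(row[1]);
            }
        }
        return out;
    }
}
```

Either way, a query issued on behalf of App1 never sees Doc2. Separate collections give isolation by construction, while a shared collection depends on every query applying the field filter.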
Hi @TyCoding can you describe your use case in more detail? What is app1 and what is app2? From the first glance it seems that you need 2 different embedding stores for your use case. |
@langchain4j |
The project is great; currently it may be the best open-source Java project for LLMs. The problem is that the project has just started, and some functions are not complete. For example, in the embedding store the fields encapsulated into the collection are hardcoded, and it only wraps part of the functionality (add). If you want to define fields yourself, or add update and delete functions, you need to extend the class yourself or repackage it. |
@TyCoding metadata filtering is something we will work on soon. |
Any update on metadata filtering? It's a feature I could also use for my own project! Thanks |
Hi @stephanj may I ask you about your use case? This is one of our top priorities, and we will start working on it next week. |
We have various properties for documents we want the user to be able to filter before the vector search. For example document type. |
I would also very much appreciate metadata filtering. Spring AI already has something like this; they call it metadata filters. Our use case is that we index different types of documents, e.g. web pages, pages from our documentation, support tickets, knowledge base articles, etc. In total we have several thousand documents. With a metadata filter we could, for example, retrieve relevant support tickets but leave out other document types. Following the Spring AI concept, dev.langchain4j.store.embedding.EmbeddingStore.findRelevant could be extended by adding an optional Filter parameter. A FilterExpressionBuilder could be used to create a Filter. |
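To make the proposal above concrete, here is a minimal, self-contained sketch of a `findRelevant` that takes an extra filter argument. Everything in it is an assumption for illustration: `FilteredSearchSketch`, `Entry`, and `metadataEquals` are hypothetical names, a plain `Predicate` stands in for the proposed `Filter`, and the store is just an in-memory list ranked by cosine similarity. It is not the langchain4j or Spring AI API.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch of filter-then-search; not the langchain4j API.
public class FilteredSearchSketch {

    // A stored embedding with its text and metadata (hypothetical type).
    public static final class Entry {
        final String text;
        final double[] vector;
        final Map<String, String> metadata;

        public Entry(String text, double[] vector, Map<String, String> metadata) {
            this.text = text;
            this.vector = vector;
            this.metadata = metadata;
        }
    }

    // Hypothetical helper: matches entries whose metadata value for key equals value.
    public static Predicate<Entry> metadataEquals(String key, String value) {
        return e -> value.equals(e.metadata.get(key));
    }

    // Cosine similarity of two vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Apply the metadata filter first, then rank the remaining entries by similarity.
    public static List<String> findRelevant(
            List<Entry> store, double[] query, int maxResults, Predicate<Entry> filter) {
        return store.stream()
                .filter(filter)
                .sorted(Comparator.comparingDouble((Entry e) -> -cosine(e.vector, query)))
                .limit(maxResults)
                .map(e -> e.text)
                .collect(Collectors.toList());
    }
}
```

With entries tagged `type=TICKET` or `type=WEB`, calling `findRelevant(store, query, 2, metadataEquals("type", "TICKET"))` only ever ranks the tickets. A real store would push the filter down into the database query rather than scanning in memory.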
Hi all, please share for which embedding stores do you need this feature, so that we can prioritize. |
We are using ElasticSearch on our side. Thx! |
Our team is using Milvus |
As a quick workaround I created my own extension of
Plus I added a
When calling
the above query will limit searches to Documents that have 'metadata.source.keyword' equal to 'TICKET'. But obviously you could add any type of Query you like. This is obviously not a full solution, as you definitely would not want OpenAI specific Queries in langchain4j API. |
@andyflury Thanks for the insights! I am not very familiar with Elasticsearch, but shouldn't filtering be done outside of |
I'm also not an Elasticsearch expert. I basically copied this logic here: |
Hi all, here is a draft, comments are welcome! |
Metadata filtering is now supported in version 0.28.0 for |
Thanks for the project, and I'd like to ask about EmbeddingStore. I know that documents can be stored using EmbeddingStore, but are there any categorizations or divisions between different documents? For example, if I have App1 associated with Doc1 and App2 associated with Doc2 (both stored), when using App1 for chatting, I shouldn't be able to query data from Doc2. How can this be achieved?