feat: Self Query Retriever #1266
…anslator, but a basic translator that can be used with pinecone and chroma is included and can be imported
Excellent work 🔥, just small nitpicks and notes
langchain/src/output_parsers/expression_type_handlers/string_literal_handler.ts
langchain/src/output_parsers/expression_type_handlers/call_expression_handler.ts
Looks great!
lgtm!
This commit introduces the SelfQueryRetriever, a new class ported directly from the Python version (https://github.com/hwchase17/langchain/tree/master/langchain/retrievers/self_query). It retrieves documents from vector stores by using a query-constructing LLM chain to write a structured query. A query translator then converts the LLM-generated output into a query that the vector store can understand. I included basic query translators that work with Chroma and Pinecone, though it shouldn't be hard to create new ones, as long as the retriever/vector store can also accept a filter.
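To make the translation step concrete, here is a minimal sketch of how a structured filter could be turned into the Mongo-style metadata filter (`$eq`, `$gt`, `$and`, ...) that Pinecone and Chroma accept. The type names (`Comparison`, `Operation`, `FilterNode`) and the `translate` function are hypothetical and chosen for illustration only; they are not the classes added in this PR.

```typescript
// Hypothetical shapes for illustration; the types in this PR may differ.
type Comparator = "eq" | "lt" | "gt";
type Operator = "and" | "or";

interface Comparison {
  kind: "comparison";
  comparator: Comparator;
  attribute: string;
  value: string | number;
}

interface Operation {
  kind: "operation";
  operator: Operator;
  args: FilterNode[];
}

type FilterNode = Comparison | Operation;

// Mongo-style filter object, the general form of a metadata filter.
type VectorStoreFilter = { [key: string]: unknown };

// Recursively map each node onto a `$`-prefixed operator.
function translate(node: FilterNode): VectorStoreFilter {
  if (node.kind === "comparison") {
    return { [node.attribute]: { [`$${node.comparator}`]: node.value } };
  }
  return { [`$${node.operator}`]: node.args.map(translate) };
}

const structuredFilter: FilterNode = {
  kind: "operation",
  operator: "and",
  args: [
    { kind: "comparison", comparator: "eq", attribute: "genre", value: "sci-fi" },
    { kind: "comparison", comparator: "gt", attribute: "year", value: 2000 },
  ],
};

const translatedFilter = translate(structuredFilter);
console.log(JSON.stringify(translatedFilter));
// prints {"$and":[{"genre":{"$eq":"sci-fi"}},{"year":{"$gt":2000}}]}
```

A translator for another vector store would only need to swap out the body of `translate`, which is why adding new ones should be cheap as long as the store accepts a filter.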
To support this functionality, three new parsers have been added:
ExpressionParser: Parses JavaScript expressions output by the LLM, using the meriyah package. The parser is general enough to handle any JavaScript call expression.
AsymmetricOutputParser: Like StructuredOutputParser, but can handle input and output of different shapes.
StructuredQueryOutputParser: Parses the output of the query-constructing LLM into a structured query that retrievers can use as a filter. This is a child of the new AsymmetricOutputParser class. Its output still needs to be translated by a Translator before it can be used as a filter.
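As a rough illustration of what parsing a call expression involves, the sketch below turns a string such as `and(eq("genre", "sci-fi"), gt("year", 2000))` into a nested tree. It is a small hand-rolled parser standing in for meriyah so that the example has no dependencies; the names `parseCallExpression`, `ParsedCall`, and `ParsedArg` are hypothetical, and it only supports nested calls, quoted strings without escapes, and plain numbers.

```typescript
// Hypothetical result shape; the PR's ExpressionParser output may differ.
type ParsedArg = string | number | ParsedCall;

interface ParsedCall {
  funcCall: string;
  args: ParsedArg[];
}

function parseCallExpression(source: string): ParsedCall {
  let pos = 0;

  const skipSpaces = (): void => {
    while (pos < source.length && /\s/.test(source[pos])) pos++;
  };

  const expect = (ch: string): void => {
    skipSpaces();
    if (source[pos] !== ch) {
      throw new Error(`Expected "${ch}" at position ${pos}`);
    }
    pos++;
  };

  // An argument is a quoted string, a number, or a nested call.
  const parseArg = (): ParsedArg => {
    skipSpaces();
    const ch = source[pos];
    if (ch === '"' || ch === "'") {
      const end = source.indexOf(ch, pos + 1);
      if (end === -1) throw new Error("Unterminated string literal");
      const text = source.slice(pos + 1, end);
      pos = end + 1;
      return text;
    }
    const num = /^-?\d+(\.\d+)?/.exec(source.slice(pos));
    if (num) {
      pos += num[0].length;
      return Number(num[0]);
    }
    return parseCall();
  };

  // A call is an identifier followed by a parenthesised argument list.
  function parseCall(): ParsedCall {
    skipSpaces();
    const ident = /^[A-Za-z_]\w*/.exec(source.slice(pos));
    if (!ident) throw new Error(`Expected identifier at position ${pos}`);
    pos += ident[0].length;
    expect("(");
    const args: ParsedArg[] = [];
    skipSpaces();
    if (source[pos] === ")") {
      pos++;
      return { funcCall: ident[0], args };
    }
    while (true) {
      args.push(parseArg());
      skipSpaces();
      if (source[pos] === ",") {
        pos++;
        continue;
      }
      expect(")");
      return { funcCall: ident[0], args };
    }
  }

  const parsedCall = parseCall();
  skipSpaces();
  if (pos !== source.length) {
    throw new Error(`Unexpected input at position ${pos}`);
  }
  return parsedCall;
}

const parsedExpression = parseCallExpression(
  'and(eq("genre", "sci-fi"), gt("year", 2000))'
);
console.log(JSON.stringify(parsedExpression));
```

Delegating to a real JavaScript parser such as meriyah, as the PR does, avoids having to maintain grammar details like string escapes by hand.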
Additionally, this commit includes new unit and integration tests, examples, and docs for Pinecone and Chroma self query retrievers.