-
Notifications
You must be signed in to change notification settings - Fork 115
Adds weight
parameter type definition for RRF retrievers.
#5502
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for weighted retrievers in the RRF (Reciprocal Rank Fusion) retriever by introducing a new weight
parameter. The change allows individual retrievers to have different influence levels when combining their results.
- Introduced
RRFRetrieverContainer
class extendingRetrieverContainer
with optionalweight
parameter - Updated
RRFRetriever.retrievers
type fromRetrieverContainer[]
toRRFRetrieverContainer[]
- Added comprehensive documentation for the weight parameter behavior and constraints
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Following you can find the validation changes against the target branch for the API.
You can validate this API yourself by using the |
You'll need to backport this to 9.2 right? |
Thanks for the reminder! I've added the |
The backport job only runs after the PR is merged (if there are conflicts might need to use the manual backport tool to clean those up) |
Thanks for the PR @mridula-s109 ! From looking at the JSON example provided in elastic/elasticsearch#130658, @pmpailis seems to be right that the structure is not correct this way. "retrievers": [ // <- RRFRetriever.retrievers
{
"retriever": { // <- RRFRetrieverComponent.retriever
"standard": { // <- RetrieverContainer variant
"query": { "match": { "description": "pizza" } }
}
},
"weight": 0.7 // <- RRFRetrieverComponent.weigth
}, I think it should rather look like this: export class RRFRetrieverComponent { // name taken from the server code
retriever: RetrieverContainer
weigth?: float
} and // In RRFRetriever:
retrievers: RRFRetrieverComponent[] Btw.: I'm a little bit surprised that we ship this breaking change in a minor release. Is this expected? Side note: For |
I think that this is an extension, i.e. an alternative format that we support, and not a breaking change. The previous format has not gone away (which also implies some adjustments in this PR) |
Ah right, I just saw the corresponding test cases 👍
Unfortunately. This kind of extension will result in an untagged/undiscriminated union in the specification - which is the worst possible construct to have for all statically typed clients 😐 |
I have removed the |
@mridula-s109 Are you going to update the PR? Happy to assist, if you need help. I would suggest something like this: export class RRFRetrieverComponent { // name taken from the server code
retriever: RetrieverContainer
weigth?: float
}
export type RRFRetrieverDescriptor = RetrieverContainer | RRFRetrieverComponent;
// ^ naming is hard. Open for suggestions. // In RRFRetriever:
retrievers: RRFRetrieverDescriptor[] |
Thanks for the suggestion @flobernd, Yes i will update the PR accordingly today and rerequest review on the same. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
* Either a direct RetrieverContainer (backward compatible) or an RRFRetrieverComponent with weight. | ||
* @codegen_names retriever, weighted | ||
*/ | ||
export type RRFRetrieverEntry = RetrieverContainer | RRFRetrieverComponent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 that name fits better
export class RRFRetrieverComponent { | ||
/** The nested retriever configuration. */ | ||
retriever: RetrieverContainer | ||
/** Weight multiplier for this retriever's contribution to the RRF score. Higher values increase influence. Defaults to 1.0 if not specified. Must be non-negative. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/nit: we should also add the @server_default tag as a machine readable alternative to mentioning the default value in the doc string.
Changes
RRFRetriever.retrievers
:RetrieverContainer[]
→RRFRetrieverContainer[]
weight?: float
Related: elastic/elasticsearch#130658, elastic/elasticsearch#136477