Support multiple rescores #4749
This isn't quite ready but it is worth reviewing I think. TODO:
Added test for explanation. It caught that I was building the explanations backwards so the first rescore looked like it processed output from the second when in fact the opposite is true.
Wait! I had it backwards! The code was right and the test was wrong. Fixed.
Added documentation and while I was in there documentation for
I don't think I have anything to do for the rest testing/rest spec because it just describes the body of the search request as "The search definition using the Query DSL" which I think the asciidoc covers.
linearly to produce the final `_score` for each document. The relative
importance of the original query and of the rescore query can be
controlled with the `query_weight` and `rescore_query_weight`
By defaulthe scores from the original query and the rescore query are
space missing between 'default' and 'the'
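For readers following the documentation excerpt under review: the linear combination it describes can be sketched roughly as below. This is a minimal illustration, not the code from this PR; the function name `combine_total` is made up for the example, and the default weights of 1.0 are an assumption.

```python
def combine_total(query_score: float,
                  rescore_score: float,
                  query_weight: float = 1.0,
                  rescore_query_weight: float = 1.0) -> float:
    """Weighted sum of the original query score and the rescore query score.

    Hypothetical helper: the weights mirror the `query_weight` and
    `rescore_query_weight` parameters; defaults of 1.0 are assumed here.
    """
    return query_weight * query_score + rescore_query_weight * rescore_score


if __name__ == "__main__":
    # Weights taken from the example request in the commit message.
    print(combine_total(2.0, 3.0, query_weight=0.7, rescore_query_weight=1.2))
```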
This looks good in general. However, it seems to me that the second rescorer would be applied to all top docs and not only the first 10? (QueryRescorer.rescore rescores all top docs, which was probably OK when there could be only a single rescorer, but now that there can be several, I think the TopDocsFilter should take the window size into account.)
Cool stuff, I like the feature!
@jpountz I'll have a look at that in a bit. I thought I had a test that checked that if the second window is smaller than the first and the first doesn't pull the match into the window then the second one doesn't see it.
@jpountz you were right of course. My test was actually backwards. It was making sure that the second rescore took effect when I wanted the opposite. I've pushed a fix. I also did some reworking on QueryRescorer#rescore because it was a little twisted. The only real change is that TopDocsFilter now takes a maximum number of docs to filter and I set it to the rescore window. I also set the maximum number of docs returned by the searcher to the rescore window rather than the size of topdocs.
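The intended behavior discussed above (each rescorer only touches the hits inside its own window) can be sketched as follows. This is a simplified Python model, not the Java code in QueryRescorer; `rescore_window`, `apply_rescores`, and the toy rescorers are hypothetical names chosen for illustration.

```python
from typing import Callable, List, Tuple

Hit = Tuple[str, float]                 # (doc id, score)
Rescorer = Callable[[str, float], float]


def rescore_window(hits: List[Hit], window_size: int, rescorer: Rescorer) -> List[Hit]:
    """Rescore only the first `window_size` hits; the tail is left untouched."""
    head = [(doc, rescorer(doc, score)) for doc, score in hits[:window_size]]
    head.sort(key=lambda hit: hit[1], reverse=True)   # re-rank inside the window
    return head + hits[window_size:]


def apply_rescores(hits: List[Hit], rescores: List[Tuple[int, Rescorer]]) -> List[Hit]:
    """Apply each (window_size, rescorer) pair in order, one after another."""
    for window_size, rescorer in rescores:
        hits = rescore_window(hits, window_size, rescorer)
    return hits


def boost_c(doc: str, score: float) -> float:
    """Toy first rescorer: strongly boosts document 'c' only."""
    return score + 10.0 if doc == "c" else score


def double(doc: str, score: float) -> float:
    """Toy second rescorer: doubles whatever it sees."""
    return score * 2.0


if __name__ == "__main__":
    hits = [("a", 3.0), ("b", 2.0), ("c", 1.0)]
    # 'c' sits outside the first window of 2, so the first rescorer never
    # pulls it up, and the second rescorer (window 1) never sees it.
    print(apply_rescores(hits, [(2, boost_c), (1, double)]))
```

With a first window of 3 instead of 2, 'c' would be boosted to the top and the second rescorer would then act on it, which is the case the corrected test is meant to distinguish.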
Rebased. Is there anything else I should change in this?
This looks very good. My only concern right now is about the client API. Now that it is possible to have several rescorers per request, it feels wrong to me to have the Something that would be nice also would be to validate that rescore window sizes are in strictly descending order. Otherwise applying a rescorer that is followed by a rescorer with a greater window size would be useless I think?
IMO the rescore window should be
I was thinking it might be nice to have a multiply rescore with a big window after a total rescore with a smaller window. The multiply must come after so you multiply the totalled score. The total has a smaller window because it is more expensive than the multiply.
I'll make this change and see what it looks like.
Agreed, let's not check the window sizes in order to allow for this kind of usage.
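To make the use case above concrete, here is a rough sketch of a request body with a small-window `total` rescore followed by a larger-window `multiply` rescore, written as a Python dict. The rescore queries are borrowed from the example in the commit message; the window sizes, the top-level `query`, and the variable name `search_body` are placeholders assumed for illustration.

```python
import json

# Hypothetical search body: an expensive phrase rescore on a small window,
# then a cheap multiplicative rescore on a larger window.
search_body = {
    "query": {"match": {"field1": "the quick brown"}},   # placeholder main query
    "rescore": [
        {
            "window_size": 10,                           # small: phrase matching is costly
            "query": {
                "score_mode": "total",
                "rescore_query": {
                    "match": {
                        "field1": {
                            "query": "the quick brown",
                            "type": "phrase",
                            "slop": 2,
                        }
                    }
                },
            },
        },
        {
            "window_size": 100,                          # larger: the multiply is cheap
            "query": {
                "score_mode": "multiply",
                "rescore_query": {
                    "function_score": {
                        "script_score": {
                            "script": "log10(doc['numeric'].value + 2)"
                        }
                    }
                },
            },
        },
    ],
}

if __name__ == "__main__":
    print(json.dumps(search_body, indent=2))
```

Because the window sizes here are ascending, a strict descending-order validation would reject this request, which is why the check was dropped.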
Added another commit to make
Looks good to me. I'm going to merge this PR if there are no objections.
Cool. Want me to squash the commits or will you handle it?
LGTM - would be awesome if we could have an issue for this as well to mark the versions etc. otherwise +1 to the feature thanks nik!
Is #4748 what you need?
yeah @jpountz made me see it too :) sorry for the noise! ;)
Actually what would be nice would be to split the change into one commit for documentation of score mode that I'll merge into 1.0,1.x and master and the rest that I'll merge into 1.x and master.
Detects if rescores arrive as an array instead of a plain object. If so
then parse each element of the array as a separate rescore to be executed
one after another. It looks like this:

    "rescore" : [ {
        "window_size" : 100,
        "query" : {
            "rescore_query" : {
                "match" : {
                    "field1" : {
                        "query" : "the quick brown",
                        "type" : "phrase",
                        "slop" : 2
                    }
                }
            },
            "query_weight" : 0.7,
            "rescore_query_weight" : 1.2
        }
    }, {
        "window_size" : 10,
        "query" : {
            "score_mode": "multiply",
            "rescore_query" : {
                "function_score" : {
                    "script_score": {
                        "script": "log10(doc['numeric'].value + 2)"
                    }
                }
            }
        }
    } ]

Rescores as a single object are still supported.

Closes elastic#4748
Done.
Merged, thanks again Nik!
Support multiple rescores
Detects if rescores arrive as an array instead of a plain object. If so
then parse each element of the array as a separate rescore to be executed
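one after another. It looks like this:

    "rescore" : [ {
        "window_size" : 100,
        "query" : {
            "rescore_query" : {
                "match" : {
                    "field1" : {
                        "query" : "the quick brown",
                        "type" : "phrase",
                        "slop" : 2
                    }
                }
            },
            "query_weight" : 0.7,
            "rescore_query_weight" : 1.2
        }
    }, {
        "window_size" : 10,
        "query" : {
            "score_mode": "multiply",
            "rescore_query" : {
                "function_score" : {
                    "script_score": {
                        "script": "log10(doc['numeric'].value + 2)"
                    }
                }
            }
        }
    } ]
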
Rescores as a single object are still supported.
Also add documentation on score_mode when adding documentation about multiple
rescores.
Closes #4748
Closes #4742
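
The array-or-object detection described in the commit message can be modelled with a few lines of Python. This is only an illustrative sketch of the idea, not the parser code merged here; `normalize_rescore` is a hypothetical name.

```python
from typing import Any, Dict, List, Optional, Union

RescoreSpec = Dict[str, Any]


def normalize_rescore(
    rescore: Optional[Union[RescoreSpec, List[RescoreSpec]]]
) -> List[RescoreSpec]:
    """Return the rescores as an ordered list, whatever shape they arrived in.

    A single object is still accepted and becomes a one-element list;
    an array is kept in order so rescores run one after another.
    """
    if rescore is None:
        return []
    if isinstance(rescore, dict):
        return [rescore]
    if isinstance(rescore, list):
        return list(rescore)
    raise TypeError("rescore must be an object or an array of objects")


if __name__ == "__main__":
    print(normalize_rescore({"window_size": 50}))
    print(normalize_rescore([{"window_size": 100}, {"window_size": 10}]))
```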