-
Notifications
You must be signed in to change notification settings - Fork 988
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LUCENE-10480: Move scoring from advance to TwoPhaseIterator#matches to improve disjunction within conjunction #1006
LUCENE-10480: Move scoring from advance to TwoPhaseIterator#matches to improve disjunction within conjunction #1006
Conversation
…o improve disjunction within conjunction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to remove the while loop now, otherwise LGTM.
do { | ||
top.doc = top.iterator.nextDoc(); | ||
top = essentialsScorers.updateTop(); | ||
} while (top.doc == docId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Advancing the priority queue here shouldn't be necessary, we should be able to remove this while loop. Scorers will advance on the next call to advance
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup good catch! I've removed it.
28c02cd
to
c6a3e5a
Compare
Here are the latest benchmark results:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why top-level disjunctions get faster but the change looks good to me.
Oh sorry @jpountz I should have mentioned it explicitly as well. The benchmark baseline I used was prior to all BMM changes (specifically, the head of baseline is 08a9dfd) , so that the results reflect net effect of BMM changes on those boolean queries. Before changes in this PR, I was able to reproduce the slow-down consistently as noted in https://issues.apache.org/jira/browse/LUCENE-10480:
|
Ah, that makes sense to me now! Thanks for explaining. |
No problem! |
…o improve disjunction within conjunction (apache#1006) (cherry picked from commit da8143b)
Description (or a Jira issue link if you have one)
Follow-up changes for https://issues.apache.org/jira/browse/LUCENE-10480 to improve performance for disjunction within conjunction queries.
Benchmark results with
wikinightly.tasks
boolean queries below: