Add BS1 optimization to MaxScoreBulkScorer. #12444
Conversation
Lucene's scorers that can dynamically prune on score provide great speedups when they manage to skip many hits. Unfortunately, there are also cases when they cannot skip hits efficiently, one example case being when there are many clauses in the query. In this case, exhaustively evaluating the set of matches with `BooleanScorer` (BS1) may perform several times faster. This commit adds to `MaxScoreBulkScorer` the BS1 optimization that consists of collecting hits into a bitset to save the overhead of reordering priority queues. This helps make performance degrade much more gracefully when dynamic pruning cannot help much. Closes apache#12439
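The bitset-collection idea can be sketched as follows. This is a hypothetical illustration, not the actual `MaxScoreBulkScorer` code: the `ClauseIterator` interface, method names, and fixed 2048-doc window are assumptions standing in for Lucene's real scorer abstractions. Instead of re-heapifying a priority queue of scorers on every doc, each clause sweeps its matches over the window, ORing doc bits into a shared bitset and accumulating scores in a parallel array:

```java
import java.util.List;

class WindowScorerSketch {
  static final int WINDOW_SIZE = 2048; // BS1 traditionally scores fixed-size windows

  // Minimal stand-in for a Lucene scorer positioned on its current doc (assumed API).
  interface ClauseIterator {
    int docID();
    int nextDoc();  // advances; returns Integer.MAX_VALUE when exhausted
    float score();  // score of the current doc
  }

  // Scores one window [windowMin, windowMin + WINDOW_SIZE): every clause dumps
  // its matches into a shared bitset and accumulates per-doc scores, avoiding
  // per-document priority-queue reordering.
  static void scoreWindow(List<ClauseIterator> clauses, int windowMin,
                          long[] matching, float[] scores) {
    int windowMax = windowMin + WINDOW_SIZE;
    for (ClauseIterator clause : clauses) {
      int doc = clause.docID();
      while (doc < windowMax) {
        int i = doc - windowMin;
        matching[i >>> 6] |= 1L << i; // set bit i (Java shifts long by i mod 64)
        scores[i] += clause.score();
        doc = clause.nextDoc();
      }
    }
    // Hits are then collected in doc-ID order by iterating the set bits.
  }
}
```

Because each clause is iterated sequentially over the window, the cost per match is a bitset OR and a float add, which degrades far more gracefully than queue reordering when most docs in the window match.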
I played with the following tasks file to evaluate the impact of this change:
And here are QPS numbers for various scorers on wikimedium10m. 🔶 denotes the implementation that is used today, 🔷 denotes the implementation that would get used with this change.
Here is the usual set of queries, still on wikimedium10m. Sparser disjunctive queries like
Here is a similar table as above but with low-cardinality clauses instead of high-cardinality clauses in order to show how the overhead of the bitset manifests:
With high-frequency clauses, I'd like to avoid trying to go too far wrt picking the optimal implementation based on the query, which could get quite messy. Maybe we could introduce simple heuristics in a follow-up, such as only using the bulk scorer if the cost is high enough that we'd expect more than X matches per 2048-bit window on average.
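Such a heuristic could look something like the following. This is purely a sketch of the follow-up idea, not actual Lucene code: the class, method, and parameter names are all made up, and the uniform-distribution assumption behind the estimate is mine.

```java
// Hypothetical heuristic: only pick the bitset-based bulk scorer when the
// combined clause cost predicts enough matches per 2048-doc window to
// amortize the overhead of clearing and iterating the bitset.
class BulkScorerHeuristic {
  static final int WINDOW_SIZE = 2048;

  // costSum: sum of the clauses' cost() estimates for the segment.
  // maxDoc: number of docs in the segment.
  // minMatchesPerWindow: the tunable threshold ("X" above).
  static boolean useBitsetScorer(long costSum, int maxDoc, int minMatchesPerWindow) {
    // Expected matches per window, assuming matches are spread uniformly
    // across the segment's doc-ID space.
    long expectedMatchesPerWindow = costSum * WINDOW_SIZE / Math.max(1, maxDoc);
    return expectedMatchesPerWindow >= minMatchesPerWindow;
  }
}
```

For example, a query whose clauses sum to a cost of 1M over a 10M-doc segment predicts roughly 200 matches per window, well above a threshold of 32, while a 10K-cost query predicts about 2 and would fall back to dynamic pruning.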
That's really cool that we handle more cases to apply max score in the bulk scorer!
> I'd like to avoid trying to go too far wrt picking the optimal implementation based on the query, which could get quite messy. Maybe we could introduce simple heuristics in a follow-up, such as only using the bulk scorer if the cost is high enough that we'd expect more than X matches per 2048-bit window on average.
The numbers you shared are a good compromise, but ++ to remaining open to adding more heuristics in follow-ups.
I pushed a couple changes that helped improve performance on sparse clauses a bit, and updated the above performance numbers: