-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LUCENE-10480: Use BulkScorer to limit BMMScorer to only top-level disjunctions #1018
Conversation
Benchmark results with
|
optionalScorers.add(ss.get(Long.MAX_VALUE)); | ||
} | ||
|
||
return new BulkScorer() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could reuse DefaultBulkScorer
instead of this anonymous bulk scorer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion! I gave that a try and it did work, but it would reduce the performance boost for OrHighMed from around 110+% to 70+%, most likely due to the extra logic inside DefaultBulkScorer
. I guess my preference would be to use the anonymous bulk scorer to maintain the performance advantage, but I'm also good with using DefaultBulkScorer
if reducing potentially duplicated code and keeping things consistent are preferred?
TaskQPS baseline StdDevQPS my_modified_version StdDev Pct diff p-value
TermBGroup1M1P 55.89 (7.1%) 54.01 (6.1%) -3.4% ( -15% - 10%) 0.108
TermDateFacets 34.46 (5.8%) 33.58 (5.0%) -2.5% ( -12% - 8%) 0.138
AndHighOrMedMed 90.90 (5.6%) 88.59 (4.6%) -2.5% ( -12% - 8%) 0.115
BrowseDayOfYearSSDVFacets 28.63 (12.5%) 28.01 (14.5%) -2.2% ( -25% - 28%) 0.612
MedTermDayTaxoFacets 79.74 (5.1%) 78.04 (4.3%) -2.1% ( -10% - 7%) 0.150
TermGroup100 36.28 (3.5%) 35.54 (3.1%) -2.1% ( -8% - 4%) 0.050
TermBGroup1M 30.37 (3.7%) 29.87 (3.8%) -1.6% ( -8% - 6%) 0.165
TermGroup10K 41.33 (3.2%) 40.66 (3.3%) -1.6% ( -7% - 5%) 0.117
PKLookup 330.84 (5.1%) 326.20 (4.3%) -1.4% ( -10% - 8%) 0.349
SloppyPhrase 13.56 (2.8%) 13.39 (2.5%) -1.2% ( -6% - 4%) 0.139
TermGroup1M 39.76 (3.2%) 39.32 (3.2%) -1.1% ( -7% - 5%) 0.272
AndMedOrHighHigh 88.13 (5.5%) 87.22 (4.4%) -1.0% ( -10% - 9%) 0.511
BrowseDateSSDVFacets 4.17 (29.0%) 4.13 (29.0%) -0.8% ( -45% - 80%) 0.933
SpanNear 169.70 (2.6%) 168.59 (2.0%) -0.7% ( -5% - 4%) 0.369
Fuzzy2 83.59 (2.4%) 83.12 (2.2%) -0.6% ( -5% - 4%) 0.442
Respell 96.22 (3.1%) 95.85 (2.6%) -0.4% ( -5% - 5%) 0.672
IntervalsOrdered 23.02 (4.3%) 22.94 (4.2%) -0.3% ( -8% - 8%) 0.799
Wildcard 231.51 (4.5%) 230.74 (5.0%) -0.3% ( -9% - 9%) 0.827
AndHighMed 143.73 (5.8%) 143.48 (4.0%) -0.2% ( -9% - 10%) 0.914
AndHighHighDayTaxoFacets 54.80 (1.3%) 54.71 (1.5%) -0.2% ( -2% - 2%) 0.717
Fuzzy1 154.41 (2.8%) 154.44 (1.9%) 0.0% ( -4% - 4%) 0.981
BrowseMonthSSDVFacets 28.06 (10.4%) 28.06 (13.3%) 0.0% ( -21% - 26%) 0.995
OrHighMedDayTaxoFacets 7.38 (5.7%) 7.39 (5.4%) 0.0% ( -10% - 11%) 0.981
AndHighMedDayTaxoFacets 134.39 (2.0%) 134.59 (1.9%) 0.2% ( -3% - 4%) 0.809
Phrase 38.80 (1.8%) 38.86 (2.5%) 0.2% ( -4% - 4%) 0.823
TermMonthSort 357.58 (5.2%) 359.31 (6.7%) 0.5% ( -10% - 13%) 0.801
TermTitleSort 274.32 (5.1%) 275.77 (6.8%) 0.5% ( -10% - 13%) 0.781
TermDayOfYearSort 259.44 (2.5%) 261.32 (5.9%) 0.7% ( -7% - 9%) 0.615
TermDTSort 208.81 (2.6%) 210.65 (5.8%) 0.9% ( -7% - 9%) 0.532
Term 2535.91 (3.4%) 2562.20 (3.1%) 1.0% ( -5% - 7%) 0.317
AndHighHigh 32.15 (3.3%) 32.55 (3.1%) 1.2% ( -5% - 7%) 0.231
BrowseMonthTaxoFacets 32.02 (34.2%) 32.57 (35.8%) 1.7% ( -50% - 109%) 0.876
Prefix3 404.00 (9.7%) 411.80 (8.5%) 1.9% ( -14% - 22%) 0.505
IntNRQ 86.95 (9.5%) 88.74 (8.7%) 2.1% ( -14% - 22%) 0.475
BrowseRandomLabelSSDVFacets 19.82 (7.6%) 20.34 (9.2%) 2.7% ( -13% - 21%) 0.318
BrowseDayOfYearTaxoFacets 28.80 (33.0%) 31.25 (33.8%) 8.5% ( -43% - 112%) 0.420
BrowseDateTaxoFacets 28.68 (33.1%) 31.15 (33.9%) 8.6% ( -43% - 113%) 0.415
BrowseRandomLabelTaxoFacets 28.81 (45.1%) 31.60 (47.4%) 9.7% ( -57% - 186%) 0.508
OrHighHigh 23.12 (3.5%) 29.67 (6.8%) 28.3% ( 17% - 39%) 0.000
OrHighMed 103.75 (3.7%) 180.03 (11.5%) 73.5% ( 56% - 92%) 0.000
TaskQPS baseline StdDevQPS my_modified_version StdDev Pct diff p-value
BrowseDateSSDVFacets 4.27 (27.4%) 3.92 (26.3%) -8.2% ( -48% - 62%) 0.333
BrowseRandomLabelTaxoFacets 32.05 (49.5%) 30.51 (47.1%) -4.8% ( -67% - 181%) 0.753
BrowseDateTaxoFacets 30.22 (32.6%) 29.15 (32.4%) -3.6% ( -51% - 91%) 0.730
BrowseDayOfYearTaxoFacets 30.27 (32.4%) 29.24 (32.4%) -3.4% ( -51% - 90%) 0.738
TermDateFacets 34.74 (4.2%) 33.81 (5.9%) -2.7% ( -12% - 7%) 0.101
Prefix3 475.23 (5.7%) 462.80 (5.7%) -2.6% ( -13% - 9%) 0.146
TermBGroup1M1P 55.99 (7.5%) 54.91 (8.1%) -1.9% ( -16% - 14%) 0.434
MedTermDayTaxoFacets 77.78 (3.6%) 76.33 (4.7%) -1.9% ( -9% - 6%) 0.160
BrowseMonthTaxoFacets 30.33 (37.8%) 29.78 (35.6%) -1.8% ( -54% - 115%) 0.877
PKLookup 332.94 (5.3%) 327.77 (5.3%) -1.6% ( -11% - 9%) 0.351
BrowseRandomLabelSSDVFacets 21.30 (9.6%) 21.01 (8.8%) -1.3% ( -18% - 18%) 0.648
AndHighMedDayTaxoFacets 140.47 (1.7%) 138.62 (4.1%) -1.3% ( -6% - 4%) 0.186
IntervalsOrdered 130.34 (5.8%) 128.98 (6.8%) -1.0% ( -12% - 12%) 0.602
SloppyPhrase 97.23 (7.7%) 96.30 (6.1%) -1.0% ( -13% - 13%) 0.667
AndMedOrHighHigh 35.81 (4.0%) 35.48 (3.3%) -0.9% ( -7% - 6%) 0.431
TermGroup100 36.34 (4.1%) 36.02 (3.9%) -0.9% ( -8% - 7%) 0.481
AndHighOrMedMed 92.43 (6.3%) 91.64 (5.2%) -0.9% ( -11% - 11%) 0.635
AndHighMed 151.18 (5.3%) 150.06 (5.0%) -0.7% ( -10% - 10%) 0.651
AndHighHighDayTaxoFacets 13.90 (2.1%) 13.82 (3.0%) -0.6% ( -5% - 4%) 0.461
Phrase 53.96 (4.9%) 53.66 (4.6%) -0.5% ( -9% - 9%) 0.716
Wildcard 382.71 (4.6%) 380.67 (4.5%) -0.5% ( -9% - 8%) 0.710
TermDTSort 142.16 (3.3%) 141.51 (3.2%) -0.5% ( -6% - 6%) 0.659
Term 3700.77 (3.7%) 3686.00 (3.5%) -0.4% ( -7% - 7%) 0.726
TermBGroup1M 36.19 (4.7%) 36.07 (3.5%) -0.3% ( -8% - 8%) 0.808
TermTitleSort 191.76 (6.3%) 191.17 (7.4%) -0.3% ( -13% - 14%) 0.888
TermDayOfYearSort 211.60 (3.5%) 210.97 (4.6%) -0.3% ( -8% - 8%) 0.819
TermMonthSort 224.12 (6.1%) 223.53 (7.6%) -0.3% ( -13% - 14%) 0.902
TermGroup1M 23.75 (3.9%) 23.72 (3.0%) -0.1% ( -6% - 7%) 0.924
OrHighMedDayTaxoFacets 15.65 (5.1%) 15.64 (5.3%) -0.0% ( -9% - 10%) 0.985
IntNRQ 110.02 (0.4%) 109.99 (0.4%) -0.0% ( 0% - 0%) 0.823
Fuzzy1 164.30 (1.3%) 164.42 (2.0%) 0.1% ( -3% - 3%) 0.889
Fuzzy2 83.64 (1.4%) 83.85 (2.0%) 0.2% ( -3% - 3%) 0.649
AndHighHigh 44.91 (3.5%) 45.09 (3.7%) 0.4% ( -6% - 7%) 0.724
Respell 117.19 (2.2%) 117.69 (2.7%) 0.4% ( -4% - 5%) 0.585
SpanNear 28.79 (2.9%) 29.02 (3.4%) 0.8% ( -5% - 7%) 0.426
TermGroup10K 43.33 (3.8%) 43.73 (2.4%) 0.9% ( -5% - 7%) 0.357
BrowseDayOfYearSSDVFacets 26.67 (12.7%) 26.94 (8.8%) 1.0% ( -18% - 25%) 0.767
BrowseMonthSSDVFacets 27.30 (8.6%) 28.47 (10.1%) 4.3% ( -13% - 25%) 0.149
OrHighHigh 25.20 (4.7%) 39.24 (4.8%) 55.7% ( 44% - 68%) 0.000
OrHighMed 103.99 (5.1%) 184.11 (5.5%) 77.0% ( 63% - 92%) 0.000
TaskQPS baseline StdDevQPS my_modified_version StdDev Pct diff p-value
BrowseMonthSSDVFacets 29.26 (10.2%) 28.24 (5.9%) -3.5% ( -17% - 13%) 0.185
BrowseRandomLabelSSDVFacets 20.62 (6.1%) 20.36 (7.4%) -1.2% ( -13% - 13%) 0.565
Prefix3 475.09 (6.1%) 470.79 (6.7%) -0.9% ( -12% - 12%) 0.655
IntervalsOrdered 13.21 (3.1%) 13.12 (2.9%) -0.7% ( -6% - 5%) 0.461
TermDateFacets 48.80 (6.2%) 48.56 (6.2%) -0.5% ( -12% - 12%) 0.800
PKLookup 329.55 (4.9%) 328.09 (3.9%) -0.4% ( -8% - 8%) 0.750
AndHighOrMedMed 117.25 (3.9%) 116.74 (4.9%) -0.4% ( -8% - 8%) 0.759
MedTermDayTaxoFacets 78.41 (5.5%) 78.16 (5.5%) -0.3% ( -10% - 11%) 0.855
Phrase 232.22 (2.8%) 231.49 (2.3%) -0.3% ( -5% - 4%) 0.694
TermTitleSort 224.25 (6.4%) 223.78 (6.3%) -0.2% ( -12% - 13%) 0.916
TermBGroup1M 39.69 (3.2%) 39.60 (3.9%) -0.2% ( -6% - 7%) 0.852
TermMonthSort 277.13 (6.3%) 276.61 (6.4%) -0.2% ( -12% - 13%) 0.925
TermDayOfYearSort 259.60 (3.2%) 259.15 (3.6%) -0.2% ( -6% - 6%) 0.872
AndHighMed 185.96 (3.9%) 185.74 (5.3%) -0.1% ( -8% - 9%) 0.936
TermGroup100 51.51 (3.3%) 51.52 (3.8%) 0.0% ( -6% - 7%) 0.976
TermGroup1M 39.83 (2.7%) 39.85 (3.4%) 0.1% ( -5% - 6%) 0.957
TermGroup10K 41.38 (2.9%) 41.40 (3.1%) 0.1% ( -5% - 6%) 0.950
Wildcard 238.58 (4.1%) 238.92 (3.4%) 0.1% ( -7% - 7%) 0.904
SloppyPhrase 4.75 (4.1%) 4.76 (3.0%) 0.2% ( -6% - 7%) 0.834
IntNRQ 265.85 (1.0%) 266.60 (0.7%) 0.3% ( -1% - 1%) 0.298
AndHighHighDayTaxoFacets 54.33 (1.6%) 54.49 (1.5%) 0.3% ( -2% - 3%) 0.554
TermBGroup1M1P 62.07 (4.4%) 62.41 (5.6%) 0.5% ( -9% - 11%) 0.735
AndHighMedDayTaxoFacets 202.86 (2.6%) 204.19 (2.6%) 0.7% ( -4% - 6%) 0.434
Respell 111.78 (3.4%) 112.52 (3.1%) 0.7% ( -5% - 7%) 0.519
Fuzzy1 155.04 (3.6%) 156.09 (3.4%) 0.7% ( -6% - 7%) 0.541
SpanNear 13.68 (1.7%) 13.77 (1.3%) 0.7% ( -2% - 3%) 0.142
Fuzzy2 144.74 (2.9%) 145.81 (2.8%) 0.7% ( -4% - 6%) 0.419
TermDTSort 214.37 (7.3%) 216.04 (8.8%) 0.8% ( -14% - 18%) 0.761
AndMedOrHighHigh 35.72 (3.2%) 36.03 (3.8%) 0.9% ( -5% - 8%) 0.441
AndHighHigh 44.40 (2.8%) 44.83 (4.7%) 1.0% ( -6% - 8%) 0.429
OrHighMedDayTaxoFacets 4.34 (5.5%) 4.38 (7.7%) 1.1% ( -11% - 15%) 0.621
BrowseRandomLabelTaxoFacets 29.93 (46.8%) 30.38 (47.5%) 1.5% ( -63% - 179%) 0.921
BrowseDateSSDVFacets 3.98 (33.0%) 4.04 (32.0%) 1.6% ( -47% - 99%) 0.879
Term 2674.87 (3.8%) 2721.55 (4.1%) 1.7% ( -5% - 10%) 0.164
BrowseDayOfYearTaxoFacets 29.20 (31.2%) 29.75 (33.2%) 1.9% ( -47% - 96%) 0.853
BrowseDateTaxoFacets 29.07 (31.3%) 29.63 (33.1%) 1.9% ( -47% - 96%) 0.851
BrowseDayOfYearSSDVFacets 26.74 (9.6%) 27.76 (9.4%) 3.8% ( -13% - 25%) 0.205
BrowseMonthTaxoFacets 30.73 (33.3%) 32.84 (35.6%) 6.9% ( -46% - 113%) 0.528
OrHighHigh 117.70 (4.1%) 161.58 (10.4%) 37.3% ( 21% - 54%) 0.000
OrHighMed 105.09 (4.3%) 186.95 (10.2%) 77.9% ( 60% - 96%) 0.000
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't be slower than the current code in main
since main
is using DefaultBulkScorer
, is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK to keep it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't be slower than the current code in main since main is using DefaultBulkScorer, is it?
The baseline of all of the above benchmark results are still using the head prior to all BMM changes. Since this approach (anonymous bulk scorer + BMM scorer) still has similar performance boost with the previous one (just BMM scorer) for top-level disjunctions, but no impact to nested boolean queries, I would think so? I'm not sure I'm fully understanding this question though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining the motivation for the dedicated bulk scorer, I left some suggestions.
@Override | ||
public int score(LeafCollector collector, Bits acceptDocs, int min, int max) | ||
throws IOException { | ||
max = Math.min(max, maxDoc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this, do tests fail without it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup this is indeed optional and tests didn't fail without it. I've removed it.
return DocIdSetIterator.NO_MORE_DOCS; | ||
} else if (advancedDoc >= max) { | ||
return max; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the entire above if
statement could become:
if (advanceDoc >= max) {
return advanceDoc;
}
since max
is guaranteed to be less than maxDoc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
optionalScorers.add(ss.get(Long.MAX_VALUE)); | ||
} | ||
|
||
return new BulkScorer() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK to keep it.
doc = advancedDoc + 1; | ||
} | ||
|
||
return max == maxDoc ? DocIdSetIterator.NO_MORE_DOCS : max; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could remove the end condition from the for loop, so that we would hit the if (advanceDoc >= max)
condition instead, and remove the above line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
No problem and thanks for the suggestions! I have incorporated them and like how clean the bulk scorer looks now! |
Here are the latest
|
@jpountz If this approach to limiting BMM scorer to top-level disjunctions looks good to you, I can go ahead and update the corresponding tests to make this PR ready? |
I've updated / added the tests, and also tried out the idea to use a dedicated test query to ensure the right scorer under tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left two comments but it looks good to me otherwise.
|
||
int doc = min; | ||
while (true) { | ||
doc = iterator.advance(doc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should only advance the scorer if the current doc is not already greater than or equal to min
, otherwise we should retrieve the doc ID with Scorer#docID
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean something like this?
new BulkScorer() {
final Scorer bmmScorer = new BlockMaxMaxscoreScorer(BooleanWeight.this, optionalScorers);
final DocIdSetIterator iterator = bmmScorer.iterator();
@Override
public int score(LeafCollector collector, Bits acceptDocs, int min, int max)
throws IOException {
collector.setScorer(bmmScorer);
int doc = min;
while (true) {
// updated logic below
if (doc < min) {
doc = iterator.advance(doc);
} else {
doc = iterator.docID();
}
if (doc >= max) {
return doc;
}
if (acceptDocs == null || acceptDocs.get(doc)) {
collector.collect(doc);
}
doc++;
}
}
}
I'm not sure how doc
would get advanced with this though, and the benchmark tests also seems stuck in a loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, indeed, though we might be able to simplify it to look like below:
new BulkScorer() {
final Scorer bmmScorer = new BlockMaxMaxscoreScorer(BooleanWeight.this, optionalScorers);
final DocIdSetIterator iterator = bmmScorer.iterator();
@Override
public int score(LeafCollector collector, Bits acceptDocs, int min, int max)
throws IOException {
collector.setScorer(bmmScorer);
int doc = bmmScorer.docID();
if (doc < min) {
doc = bmmScorer.advance(min);
}
while (doc < max) {
if (acceptDocs == null || acceptDocs.get(doc)) {
collector.collect(doc);
}
doc = bmmScorer.nextDoc();
}
return doc;
}
}
The reason is that a consumer of the bulk scorer could do something like:
bulkScorer.score(collector, null, 0, 1000);
bulkScorer.score(collector, null, 1000, 2000);
If the last match of the first window is say 998 and the first match after the first window is 1005. Then we should make sure to score 1005 when scoring the second window before starting to advance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the last match of the first window is say 998 and the first match after the first window is 1005. Then we should make sure to score 1005 when scoring the second window before starting to advance.
Thanks for the explanation, it makes sense! I thought the old implementation would take care of this boundary case as it initiated doc
with min
before advancing, but the behavior is indeed undefined if the previous call to bulkScorer#score
already advanced its internal scorer past the next min
used. I've updated the code with the above solution.
|
||
@Override | ||
public long cost() { | ||
return maxDoc; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return bmmScorer.cost()
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
Here are the latest benchmark results:
It appears this implementation would bring a -4% performance impact to OrXNotY tasks in general. |
I limited the implementation to pure disjunctions here f4fdfea as well and it does seems to help:
|
Latest
|
Thanks @jpountz for the feedback and approval! |
Sure, thanks for doing all this work! |
Let's merge this PR to have it in 9.3 and resolve LUCENE-10480? |
Sure sounds good! |
…junctions (apache#1018) (cherry picked from commit 28ce8ab)
Description (or a Jira issue link if you have one)
Use BulkScorer to limit BMMScorer to only top-level disjunctions
Note: Tests update pendingTest updated