LUCENE-10380: Further optimize FastTaxonomyFacetCounts#countAll by moving the liveDocs null check outside the loops by gsmiller · Pull Request #606 · apache/lucene

gsmiller · 2022-01-14T15:52:59Z

This change attempts to bring in the other piece of the LUCENE-10350 change without the regressions. See LUCENE-10374 for more details.
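To illustrate what "moving the liveDocs null check outside the loops" means, here is a minimal sketch. It is not the actual `FastTaxonomyFacetCounts` code: the class and method names are hypothetical, `IntPredicate` stands in for Lucene's live-docs bitset, and a plain `int[]` stands in for the per-document ordinals.

```java
import java.util.function.IntPredicate;

/** Hypothetical illustration only; not the Lucene implementation. */
final class HoistedLiveDocsCheck {
  private HoistedLiveDocsCheck() {}

  /** Tests liveDocs for null on every iteration. */
  static int[] countCheckingInsideLoop(int[] ordByDoc, IntPredicate liveDocs, int numOrds) {
    int[] values = new int[numOrds];
    for (int doc = 0; doc < ordByDoc.length; doc++) {
      if (liveDocs == null || liveDocs.test(doc)) {
        values[ordByDoc[doc]]++;
      }
    }
    return values;
  }

  /** Tests liveDocs for null once, then runs a loop specialized for each case. */
  static int[] countCheckingOutsideLoop(int[] ordByDoc, IntPredicate liveDocs, int numOrds) {
    int[] values = new int[numOrds];
    if (liveDocs == null) {
      // No deletions: every document is counted without a per-doc check.
      for (int doc = 0; doc < ordByDoc.length; doc++) {
        values[ordByDoc[doc]]++;
      }
    } else {
      // Some deletions: only live documents are counted.
      for (int doc = 0; doc < ordByDoc.length; doc++) {
        if (liveDocs.test(doc)) {
          values[ordByDoc[doc]]++;
        }
      }
    }
    return values;
  }
}
```

Both variants produce the same counts; the second trades a duplicated loop body for one fewer branch per document, which is the complexity-versus-speed trade-off discussed in this thread.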

rmuir · 2022-01-14T16:01:42Z

+              values[(int) singleValued.longValue()]++;
+            }
+          }
+        } else {


I'm also suspicious of making count() and countAll() bigger and bigger with all these specializations.

I would recommend trying to factor out these little "accumulator" loops into separate methods. They could then be shared across count() and countAll(). At least when I looked at this stuff for Solr's DocValuesFacets, it was needed to get performance across the various specializations there (admittedly this was a while ago; maybe the compiler is smarter now).

You can see what I mean if you start here in this file and scroll down:

https://github.com/apache/solr/blob/0f3893b8e08c7aaa81addda926303f7a0c6ee18c/solr/core/src/java/org/apache/solr/request/DocValuesFacets.java#L262

That's interesting. It's tricky without being able to reproduce that nightly benchmark regression locally, but I'll give it a shot. This change as I have it appears to have no performance impact at all locally, and since it just adds code complexity, it would be silly to move forward with it except as an academic exercise to try to figure out why the nightly benchmarks are regressing. That's interesting and may be worthwhile, but I'll experiment with your idea more before moving forward. Thanks!

Hmm, breaking out separate methods sent qps tanking in my local benchmarks. Any thoughts @rmuir? Maybe I missed the mark on what you were suggesting (entirely possible)? Here's the change: d084f85

                            TaskQPS baseline      StdDevQPS candidate      StdDev                Pct diff p-value
           BrowseMonthTaxoFacets       27.87     (23.7%)       11.73      (1.3%)  -57.9% ( -67% -  -43%) 0.000
            BrowseDateTaxoFacets       21.90     (20.9%)       11.77      (7.9%)  -46.2% ( -62% -  -21%) 0.000
       BrowseDayOfYearTaxoFacets       21.88     (21.1%)       11.83      (8.1%)  -45.9% ( -62% -  -21%) 0.000
     BrowseRandomLabelTaxoFacets       18.22     (17.8%)        9.96      (6.8%)  -45.3% ( -59% -  -25%) 0.000

I would make these simple static methods.

See the solr example again, just like those methods there. Instance methods are probably no good in facets, there are many abstractions, probably just drives compiler more crazy.
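As a rough sketch of the shape being suggested here (small static accumulator loops shared by `count()` and `countAll()`), something like the following could work. All names are hypothetical and plain arrays stand in for Lucene's doc values and collected hits; this is not the code from the PR or from Solr's `DocValuesFacets`. Note that the benchmark results reported in this thread show this kind of refactoring regressing QPS locally, so the sketch shows structure only and makes no performance claim.

```java
/** Hypothetical illustration of shared static accumulators; not the PR's code. */
final class SharedAccumulators {
  private SharedAccumulators() {}

  /** Small static loop for fields with exactly one ordinal per document. */
  static void accumulateSingleValued(int[] docs, int[] ordByDoc, int[] values) {
    for (int doc : docs) {
      values[ordByDoc[doc]]++;
    }
  }

  /** Small static loop for fields with several ordinals per document. */
  static void accumulateMultiValued(int[] docs, int[][] ordsByDoc, int[] values) {
    for (int doc : docs) {
      for (int ord : ordsByDoc[doc]) {
        values[ord]++;
      }
    }
  }

  /** count(): accumulate over the collected hits only. */
  static int[] count(int[] hits, int[] ordByDoc, int numOrds) {
    int[] values = new int[numOrds];
    accumulateSingleValued(hits, ordByDoc, values);
    return values;
  }

  /** countAll(): accumulate over every document, reusing the same accumulator. */
  static int[] countAll(int[] ordByDoc, int numOrds) {
    int[] allDocs = new int[ordByDoc.length];
    for (int doc = 0; doc < allDocs.length; doc++) {
      allDocs[doc] = doc;
    }
    int[] values = new int[numOrds];
    accumulateSingleValued(allDocs, ordByDoc, values);
    return values;
  }
}
```

Materializing `allDocs` as an array keeps the sketch short; a real implementation would iterate without allocating.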

We still have the issue of inconsistent loop types between while and for loops? Maybe now that the accumulators are shared, it becomes more of a problem?

Also, is there really a reason anymore to have count vs. countAll? They look the same to me. The only difference is the liveDocs check, which is shown to do nothing? So if we remove the liveDocs specialization and remove the count-vs-countAll specialization, it should start to be a bit more manageable?

The only option I can think of for this is to put the liveDocs checking behind a DISI abstraction. Then the implementation could be consolidated to just operate on a DISI (which would either be backed by collected hits or by a doc value field with liveDocs validation). The nuance here is that the "standard" count functionality doesn't need to check for deleted docs, as it assumes everything in the FacetsCollector is "live," whereas countAll needs to check for deleted docs. So this check needs to happen somewhere, unless liveDocs is null (indicating there are no deleted docs in the index).
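A minimal sketch of that idea, under stated assumptions: `DocIterator` below is a hypothetical, stripped-down interface loosely modeled on a doc-ID iterator (it is not Lucene's `DocIdSetIterator`), `IntPredicate` stands in for the live-docs bitset, and an `int[]` stands in for the ordinals. The point is only that a single counting loop can serve both cases once deletion filtering lives inside the iterator.

```java
import java.util.function.IntPredicate;

/** Hypothetical minimal iterator; a stand-in, not Lucene's DocIdSetIterator. */
interface DocIterator {
  int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Returns the next doc ID, or NO_MORE_DOCS when exhausted. */
  int nextDoc();
}

final class DocIterators {
  private DocIterators() {}

  /** Backed by collected hits, which are assumed to be live already. */
  static DocIterator overHits(int[] sortedHits) {
    return new DocIterator() {
      private int i = 0;

      @Override
      public int nextDoc() {
        return i < sortedHits.length ? sortedHits[i++] : NO_MORE_DOCS;
      }
    };
  }

  /** Walks every doc in [0, maxDoc) and skips deleted ones; null means no deletions. */
  static DocIterator overLiveDocs(int maxDoc, IntPredicate liveDocs) {
    return new DocIterator() {
      private int doc = -1;

      @Override
      public int nextDoc() {
        while (++doc < maxDoc) {
          if (liveDocs == null || liveDocs.test(doc)) {
            return doc;
          }
        }
        return NO_MORE_DOCS;
      }
    };
  }

  /** The single consolidated counting loop; it knows nothing about liveDocs. */
  static int[] count(DocIterator it, int[] ordByDoc, int numOrds) {
    int[] values = new int[numOrds];
    for (int doc = it.nextDoc(); doc != DocIterator.NO_MORE_DOCS; doc = it.nextDoc()) {
      values[ordByDoc[doc]]++;
    }
    return values;
  }
}
```

With this shape, `count` would pass `overHits(...)` and `countAll` would pass `overLiveDocs(...)`. The extra virtual call per document is one plausible cost of the abstraction, though the thread does not establish what caused the measured regressions.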

I went ahead and tried this out, but am still seeing pretty horrific qps regressions.

                            TaskQPS baseline      StdDevQPS candidate      StdDev                Pct diff p-value
           BrowseMonthTaxoFacets       29.20     (20.0%)       13.73     (15.6%)  -53.0% ( -73% -  -21%) 0.000
     BrowseRandomLabelTaxoFacets       18.33     (14.4%)       10.98     (10.4%)  -40.1% ( -56% -  -17%) 0.000
            BrowseDateTaxoFacets       21.36     (16.4%)       12.98     (10.6%)  -39.2% ( -56% -  -14%) 0.000
       BrowseDayOfYearTaxoFacets       21.33     (16.4%)       13.04     (10.5%)  -38.8% ( -56% -  -14%) 0.000

I'm not sure it's really worth pursuing this further at the moment. No matter how I try to break out functionality into small, static methods, qps is regressing. The first revision of this PR that kept everything in one method but pulled the liveDocs null check out appeared to be flat. Besides trying to chase the oddity of the different results in the nightly run, I don't think there's much value in this change. (That said, I'm still really curious what was going on with that nightly benchmark regression... but I'm not sure chasing it this way is going to be very productive).

Actually, looking at the nightly bench runs over the weekend, at least one task that we were focused on looks like it might just be noisy? https://home.apache.org/~mikemccand/lucenebench/BrowseMonthTaxoFacets.html

So maybe this is just noise after all?

gsmiller · 2022-01-14T19:11:58Z

Benchmarking this change locally shows no impact at all. So I don't think it's actually worth pushing this change unless we just want to isolate where the nightly benchmark runs are different (i.e., see if this change regresses in the nightly run). So if I were to merge this, it would just be to see the nightly benchmark results and then likely revert it back out since it just adds complexity with no apparent value. So I won't merge it right now.

                            TaskQPS baseline      StdDevQPS candidate      StdDev                Pct diff p-value
           BrowseMonthSSDVFacets       15.90     (23.7%)       15.46     (24.2%)   -2.7% ( -40% -   59%) 0.717
           BrowseMonthTaxoFacets       27.47     (26.8%)       27.05     (26.5%)   -1.5% ( -43% -   70%) 0.858
     BrowseRandomLabelSSDVFacets        9.58      (7.0%)        9.44      (4.8%)   -1.4% ( -12% -   11%) 0.455
                    OrHighNotLow     1265.12      (3.1%)     1251.37      (2.6%)   -1.1% (  -6% -    4%) 0.234
                   OrNotHighHigh      765.18      (3.2%)      757.45      (3.1%)   -1.0% (  -7% -    5%) 0.311
                    OrNotHighMed     1020.03      (3.0%)     1010.06      (3.3%)   -1.0% (  -7% -    5%) 0.327
                          IntNRQ      221.28      (1.3%)      219.28      (1.2%)   -0.9% (  -3% -    1%) 0.019
                   OrHighNotHigh      888.59      (3.8%)      880.64      (3.7%)   -0.9% (  -8% -    6%) 0.452
                    OrNotHighLow      828.68      (2.5%)      821.68      (1.7%)   -0.8% (  -4% -    3%) 0.216
            MedTermDayTaxoFacets       31.65      (4.2%)       31.41      (4.2%)   -0.7% (  -8% -    7%) 0.574
                HighSloppyPhrase        3.04      (4.7%)        3.02      (4.6%)   -0.7% (  -9% -    9%) 0.656
                    OrHighNotMed     1008.99      (3.8%)     1002.52      (3.8%)   -0.6% (  -8% -    7%) 0.597
     BrowseRandomLabelTaxoFacets       17.52     (19.6%)       17.41     (18.9%)   -0.6% ( -32% -   47%) 0.916
                      OrHighHigh       17.58      (3.7%)       17.47      (3.8%)   -0.6% (  -7% -    7%) 0.600
                       OrHighMed      137.10      (3.8%)      136.28      (4.6%)   -0.6% (  -8% -    8%) 0.655
                 LowSloppyPhrase       11.93      (3.8%)       11.88      (3.8%)   -0.4% (  -7% -    7%) 0.750
                         LowTerm     1631.30      (2.6%)     1625.23      (2.3%)   -0.4% (  -5% -    4%) 0.630
                 MedSloppyPhrase       67.24      (2.5%)       66.99      (2.6%)   -0.4% (  -5% -    4%) 0.649
                          Fuzzy1       80.85      (1.6%)       80.68      (1.8%)   -0.2% (  -3% -    3%) 0.699
                         Respell       51.74      (1.5%)       51.65      (1.7%)   -0.2% (  -3% -    3%) 0.729
                       OrHighLow      861.47      (2.9%)      860.91      (3.0%)   -0.1% (  -5% -    6%) 0.944
         AndHighMedDayTaxoFacets      110.66      (1.4%)      110.63      (1.7%)   -0.0% (  -3% -    3%) 0.958
                     MedSpanNear       50.07      (3.7%)       50.08      (3.0%)    0.0% (  -6% -    6%) 0.996
            HighTermTitleBDVSort       67.00     (23.3%)       67.01     (17.4%)    0.0% ( -32% -   53%) 0.998
                    HighSpanNear       10.62      (3.7%)       10.62      (2.9%)    0.0% (  -6% -    6%) 0.985
                          Fuzzy2       71.03      (1.5%)       71.06      (1.7%)    0.0% (  -3% -    3%) 0.953
            BrowseDateTaxoFacets       20.83     (22.2%)       20.84     (22.8%)    0.1% ( -36% -   57%) 0.992
                       LowPhrase      604.18      (2.9%)      604.76      (2.7%)    0.1% (  -5% -    5%) 0.914
       BrowseDayOfYearTaxoFacets       20.82     (22.4%)       20.85     (23.0%)    0.1% ( -36% -   58%) 0.984
             MedIntervalsOrdered       79.55      (5.1%)       79.68      (4.5%)    0.2% (  -8% -   10%) 0.915
                      HighPhrase      163.13      (2.8%)      163.42      (2.6%)    0.2% (  -5% -    5%) 0.837
          OrHighMedDayTaxoFacets        7.74      (5.3%)        7.76      (4.4%)    0.2% (  -8% -   10%) 0.888
       BrowseDayOfYearSSDVFacets       12.12     (14.1%)       12.17     (13.8%)    0.4% ( -24% -   32%) 0.930
             LowIntervalsOrdered      187.35      (8.7%)      188.22      (7.6%)    0.5% ( -14% -   18%) 0.857
                         MedTerm     2446.71      (4.1%)     2458.80      (4.8%)    0.5% (  -8% -    9%) 0.728
                      AndHighLow     1427.63      (2.7%)     1435.06      (2.5%)    0.5% (  -4% -    5%) 0.527
        AndHighHighDayTaxoFacets        9.02      (1.7%)        9.06      (2.4%)    0.5% (  -3% -    4%) 0.415
                       MedPhrase       34.65      (2.8%)       34.84      (2.6%)    0.6% (  -4% -    6%) 0.509
                     LowSpanNear       33.70      (5.5%)       33.91      (4.9%)    0.6% (  -9% -   11%) 0.705
            HighIntervalsOrdered       10.85      (8.8%)       10.92      (7.9%)    0.7% ( -14% -   19%) 0.803
                        HighTerm     1543.65      (4.2%)     1556.45      (4.4%)    0.8% (  -7% -    9%) 0.543
                      AndHighMed       83.36      (4.0%)       84.36      (4.2%)    1.2% (  -6% -    9%) 0.356
                        PKLookup      169.56      (3.5%)      171.59      (3.4%)    1.2% (  -5% -    8%) 0.275
                     AndHighHigh       71.11      (4.2%)       72.08      (4.5%)    1.4% (  -7% -   10%) 0.324
               HighTermMonthSort       96.47     (12.4%)       98.07     (18.7%)    1.7% ( -26% -   37%) 0.741
                        Wildcard      112.10      (4.5%)      114.33      (4.8%)    2.0% (  -7% -   11%) 0.180
                      TermDTSort       98.09     (12.0%)      100.68     (18.9%)    2.6% ( -25% -   38%) 0.598
                         Prefix3      206.94     (12.2%)      213.82     (10.4%)    3.3% ( -17% -   29%) 0.351
           HighTermDayOfYearSort       83.83     (17.8%)       86.78     (24.5%)    3.5% ( -32% -   55%) 0.603

rmuir reviewed Jan 14, 2022

gsmiller added 5 commits January 17, 2022 15:16

b78deb6 LUCENE-10380: Further optimize FastTaxonomyFacetCounts#countAll by moving the liveDocs null check outside the loops

5f78a14 move common functionality out into separate methods (and move to while-loops where possible)

f53f632 move to static methods

0f224a4 move to all for-loops

52bf236 refactor liveDoc checking behind disi abstraction

gsmiller force-pushed the LUCENE-10380-move-livedoc-check branch from 47a7be4 to 52bf236 Compare January 17, 2022 23:35

gsmiller closed this Jan 27, 2022

gsmiller deleted the LUCENE-10380-move-livedoc-check branch July 6, 2022 17:32

asfimport mentioned this pull request Jan 27, 2022

Move liveDocs null check outside the loops in FastTaxonomyFacetCounts#countAll [LUCENE-10380] #11416

Closed

gsmiller wants to merge 5 commits into apache:main from gsmiller:LUCENE-10380-move-livedoc-check
