LUCENE-10332: enable twin reading in LongValues #555

Closed

@gf2121 gf2121 commented Dec 20, 2021

In `Lucene90DocValuesProducer`, there are several places that read LongValues with this pattern:

long startOffset = addresses.get(doc);
bytes.length = (int) (addresses.get(doc + 1L) - startOffset);

In these cases, we need to read two numbers that are stored next to each other. It would be great if we could read both longs at once. The luceneutil benchmark shows that some Facets tasks were sped up by 20% or more with this approach (about 25-27% for the three Browse*TaxoFacets tasks):

Benchmark

                            Task  QPS baseline      StdDev  QPS my_modified_version      StdDev                Pct diff p-value
           BrowseMonthSSDVFacets       17.25      (8.6%)       16.78     (17.8%)   -2.7% ( -26% -   25%) 0.536
                         LowTerm     1458.66      (3.6%)     1438.15      (4.4%)   -1.4% (  -9% -    6%) 0.268
           HighTermDayOfYearSort      108.55     (10.0%)      108.04      (9.1%)   -0.5% ( -17% -   20%) 0.874
                      HighPhrase      168.65      (1.9%)      168.06      (2.3%)   -0.3% (  -4% -    3%) 0.602
                    OrNotHighLow     1201.79      (3.4%)     1197.93      (4.6%)   -0.3% (  -8% -    7%) 0.801
                    HighSpanNear       15.26      (1.6%)       15.21      (1.4%)   -0.3% (  -3% -    2%) 0.499
                         Respell       62.61      (1.8%)       62.45      (1.9%)   -0.3% (  -3% -    3%) 0.649
                       MedPhrase       57.57      (1.4%)       57.44      (1.8%)   -0.2% (  -3% -    2%) 0.648
                       OrHighMed      129.10      (3.0%)      128.83      (3.1%)   -0.2% (  -6% -    6%) 0.830
                     MedSpanNear       19.45      (2.3%)       19.41      (2.2%)   -0.2% (  -4% -    4%) 0.784
                      OrHighHigh       34.85      (1.5%)       34.79      (1.4%)   -0.2% (  -3% -    2%) 0.722
            HighIntervalsOrdered       26.92      (4.7%)       26.89      (4.9%)   -0.1% (  -9% -    9%) 0.929
                          IntNRQ      343.52      (1.6%)      343.16      (2.0%)   -0.1% (  -3% -    3%) 0.855
                   OrHighNotHigh      595.61      (3.2%)      595.10      (4.3%)   -0.1% (  -7% -    7%) 0.944
             MedIntervalsOrdered       17.66      (3.6%)       17.65      (3.8%)   -0.1% (  -7% -    7%) 0.961
             LowIntervalsOrdered      109.23      (3.3%)      109.18      (3.5%)   -0.0% (  -6% -    7%) 0.969
                     AndHighHigh       81.09      (1.5%)       81.10      (2.0%)    0.0% (  -3% -    3%) 0.967
                     LowSpanNear      203.33      (2.1%)      203.41      (1.8%)    0.0% (  -3% -    3%) 0.948
                 MedSloppyPhrase       27.15      (1.5%)       27.17      (1.2%)    0.1% (  -2% -    2%) 0.907
                       LowPhrase       75.76      (1.8%)       75.81      (2.0%)    0.1% (  -3% -    3%) 0.904
         AndHighMedDayTaxoFacets       97.27      (1.9%)       97.35      (1.9%)    0.1% (  -3% -    4%) 0.888
                HighSloppyPhrase       14.32      (2.7%)       14.34      (1.8%)    0.1% (  -4% -    4%) 0.870
                          Fuzzy2       76.00      (3.9%)       76.12      (3.4%)    0.2% (  -6% -    7%) 0.894
                        Wildcard      123.51      (1.8%)      123.71      (2.1%)    0.2% (  -3% -    4%) 0.796
                    OrHighNotLow      722.64      (4.4%)      724.15      (5.4%)    0.2% (  -9% -   10%) 0.894
                      AndHighLow      929.73      (4.0%)      931.75      (3.8%)    0.2% (  -7% -    8%) 0.859
                         Prefix3      240.13      (1.5%)      240.69      (1.9%)    0.2% (  -3% -    3%) 0.675
                      AndHighMed      210.17      (1.7%)      210.84      (1.6%)    0.3% (  -2% -    3%) 0.532
                 LowSloppyPhrase      142.83      (1.8%)      143.54      (2.0%)    0.5% (  -3% -    4%) 0.410
                    OrNotHighMed      709.24      (4.4%)      712.78      (4.3%)    0.5% (  -7% -    9%) 0.715
                          Fuzzy1       85.33      (5.7%)       85.77      (6.3%)    0.5% ( -10% -   13%) 0.786
                         MedTerm     1466.50      (3.5%)     1474.85      (3.9%)    0.6% (  -6% -    8%) 0.629
                      TermDTSort      105.51      (7.7%)      106.33      (7.3%)    0.8% ( -13% -   17%) 0.746
                        PKLookup      206.18      (2.9%)      208.68      (2.9%)    1.2% (  -4% -    7%) 0.179
                    OrHighNotMed      876.71      (3.0%)      887.84      (3.9%)    1.3% (  -5% -    8%) 0.251
                   OrNotHighHigh      774.25      (4.7%)      785.03      (6.0%)    1.4% (  -8% -   12%) 0.411
               HighTermMonthSort       74.33      (9.4%)       75.47     (16.3%)    1.5% ( -22% -   30%) 0.716
                       OrHighLow      518.73      (5.2%)      528.27      (5.4%)    1.8% (  -8% -   13%) 0.272
                        HighTerm     1892.16      (3.4%)     1934.63      (5.5%)    2.2% (  -6% -   11%) 0.120
        AndHighHighDayTaxoFacets       16.46      (2.7%)       16.84      (2.3%)    2.3% (  -2% -    7%) 0.004
            HighTermTitleBDVSort      141.39     (14.6%)      145.33     (15.1%)    2.8% ( -23% -   38%) 0.554
            MedTermDayTaxoFacets       27.81      (2.1%)       29.54      (2.3%)    6.2% (   1% -   10%) 0.000
          OrHighMedDayTaxoFacets        3.05      (1.9%)        3.30      (2.2%)    8.3% (   4% -   12%) 0.000
       BrowseDayOfYearSSDVFacets       17.36     (13.0%)       18.97     (15.8%)    9.3% ( -17% -   43%) 0.042
       BrowseDayOfYearTaxoFacets        3.02      (3.6%)        3.79      (2.5%)   25.4% (  18% -   32%) 0.000
            BrowseDateTaxoFacets        3.01      (3.6%)        3.79      (2.5%)   25.6% (  18% -   32%) 0.000
           BrowseMonthTaxoFacets        3.14      (2.1%)        3.99      (2.5%)   27.0% (  21% -   32%) 0.000
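
To make the twin-read idea concrete, here is a minimal, self-contained sketch. It is not the code in this PR and not Lucene's actual `LongValues` API: the `Addresses` class, the `getTwin` method, and the `lookups` counter are all hypothetical names invented for illustration. The assumption is that the saving comes from locating the pair of adjacent values once instead of twice.

```java
import java.util.Objects;

/** Hypothetical sketch of twin reading; not Lucene's LongValues and not this PR's code. */
final class TwinReadSketch {

  /** Offsets table: entry i holds the start offset of value i, entry i + 1 its end. */
  static final class Addresses {
    private final long[] values;
    int lookups = 0; // counts positioning operations, for illustration only

    Addresses(long[] values) {
      this.values = Objects.requireNonNull(values);
    }

    /** Single read: one lookup per call. */
    long get(long index) {
      lookups++;
      return values[Math.toIntExact(index)];
    }

    /** Twin read (hypothetical): one lookup fills out[0] and out[1] with two adjacent values. */
    void getTwin(long index, long[] out) {
      lookups++;
      int i = Math.toIntExact(index);
      out[0] = values[i];
      out[1] = values[i + 1];
    }
  }

  /** The existing pattern: two separate reads to compute one length. */
  static int lengthWithTwoReads(Addresses addresses, int doc) {
    long startOffset = addresses.get(doc);
    return (int) (addresses.get(doc + 1L) - startOffset);
  }

  /** The proposed pattern: both offsets come back from a single read. */
  static int lengthWithTwinRead(Addresses addresses, int doc, long[] scratch) {
    addresses.getTwin(doc, scratch);
    return (int) (scratch[1] - scratch[0]);
  }
}
```

Both helper methods return the same length for a given doc; the twin variant performs one lookup where the original pattern performs two. A reusable `scratch` array is used here only to avoid allocating on each call.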

PS: I'm posting the code here to quickly see whether this approach makes sense to you. I'll add some more tests later :)

@gf2121 gf2121 closed this Dec 20, 2021