LUCENE-10444 BACKPORT: Support alternate aggregation functions in association facets #719

Merged

Conversation

gsmiller
Contributor

@gsmiller gsmiller commented Mar 2, 2022

This is a backport of #718. It provides backward compatibility by delegating the existing "sum association faceting" implementations to the new classes. This also provides an easy way to benchmark the change against the existing version. Results of luceneutil on wikimedium10m are here:

                            Task    QPS baseline      StdDev   QPS candidate      StdDev                Pct diff p-value
       BrowseDayOfYearSSDVFacets       12.45     (17.5%)       12.15     (16.2%)   -2.4% ( -30% -   38%) 0.658
     BrowseRandomLabelTaxoFacets       18.18     (17.9%)       17.78     (16.8%)   -2.2% ( -31% -   39%) 0.693
                       MedPhrase       10.66      (3.1%)       10.53      (3.0%)   -1.2% (  -7% -    5%) 0.220
                     AndHighHigh       70.63      (3.6%)       69.99      (4.3%)   -0.9% (  -8% -    7%) 0.462
                HighSloppyPhrase       31.76      (3.4%)       31.52      (3.3%)   -0.7% (  -7% -    6%) 0.483
                 MedSloppyPhrase        7.89      (4.0%)        7.84      (2.7%)   -0.7% (  -7% -    6%) 0.524
                         Respell       51.61      (1.0%)       51.29      (1.5%)   -0.6% (  -3% -    1%) 0.135
                         LowTerm     1877.52      (3.6%)     1866.08      (4.0%)   -0.6% (  -7% -    7%) 0.615
                      AndHighMed      215.42      (3.1%)      214.17      (3.2%)   -0.6% (  -6% -    5%) 0.560
                          Fuzzy2       71.94      (1.0%)       71.56      (1.5%)   -0.5% (  -2% -    1%) 0.180
                      OrHighHigh       26.01      (3.9%)       25.88      (4.5%)   -0.5% (  -8% -    8%) 0.702
                       LowPhrase      141.37      (2.0%)      140.65      (2.1%)   -0.5% (  -4% -    3%) 0.440
                       OrHighMed       91.32      (3.6%)       90.93      (4.1%)   -0.4% (  -7% -    7%) 0.722
           BrowseMonthTaxoFacets       28.47     (23.2%)       28.35     (23.4%)   -0.4% ( -38% -   60%) 0.955
                          IntNRQ      127.41      (1.5%)      126.93      (1.6%)   -0.4% (  -3% -    2%) 0.449
                          Fuzzy1       64.42      (1.1%)       64.24      (1.9%)   -0.3% (  -3% -    2%) 0.577
                       OrHighLow     1010.14      (2.5%)     1007.54      (2.8%)   -0.3% (  -5% -    5%) 0.756
                      AndHighLow     1629.56      (2.7%)     1626.70      (2.3%)   -0.2% (  -5% -    4%) 0.825
                 LowSloppyPhrase       54.82      (2.1%)       54.75      (1.3%)   -0.1% (  -3% -    3%) 0.810
                        Wildcard      290.56     (10.4%)      290.30     (10.1%)   -0.1% ( -18% -   22%) 0.978
            BrowseDateSSDVFacets        2.35      (7.5%)        2.35      (6.3%)   -0.1% ( -12% -   14%) 0.969
                        HighTerm     1446.58      (4.3%)     1446.18      (4.6%)   -0.0% (  -8% -    9%) 0.985
                      HighPhrase      336.00      (1.2%)      335.97      (1.6%)   -0.0% (  -2% -    2%) 0.983
                    HighSpanNear       17.08      (3.9%)       17.09      (4.2%)    0.0% (  -7% -    8%) 0.993
                     MedSpanNear       18.96      (2.7%)       18.97      (3.3%)    0.0% (  -5% -    6%) 0.966
             MedIntervalsOrdered       16.32      (2.3%)       16.34      (2.0%)    0.1% (  -4% -    4%) 0.857
                         MedTerm     1927.39      (3.3%)     1932.58      (4.8%)    0.3% (  -7% -    8%) 0.836
            HighIntervalsOrdered       14.37      (3.9%)       14.41      (4.4%)    0.3% (  -7% -    8%) 0.828
             LowIntervalsOrdered       13.96      (3.1%)       14.00      (2.1%)    0.3% (  -4% -    5%) 0.699
         AndHighMedDayTaxoFacets       57.83      (2.2%)       58.05      (2.2%)    0.4% (  -3% -    4%) 0.582
                     LowSpanNear       25.48      (2.5%)       25.60      (2.2%)    0.5% (  -4% -    5%) 0.504
        AndHighHighDayTaxoFacets       20.55      (1.9%)       20.66      (2.5%)    0.5% (  -3% -    5%) 0.445
                         Prefix3      174.40     (13.7%)      175.42     (12.2%)    0.6% ( -22% -   30%) 0.887
            MedTermDayTaxoFacets       36.25      (4.1%)       36.46      (4.0%)    0.6% (  -7% -    9%) 0.649
            BrowseDateTaxoFacets       21.85     (21.0%)       21.98     (20.5%)    0.6% ( -33% -   53%) 0.927
                    OrHighNotLow     1422.88      (5.2%)     1431.97      (3.2%)    0.6% (  -7% -    9%) 0.639
                   OrHighNotHigh      909.10      (5.1%)      915.05      (3.0%)    0.7% (  -7% -    9%) 0.617
     BrowseRandomLabelSSDVFacets        9.34      (7.1%)        9.41      (7.2%)    0.7% ( -12% -   16%) 0.764
                    OrHighNotMed      963.33      (4.5%)      969.96      (3.7%)    0.7% (  -7% -    9%) 0.598
       BrowseDayOfYearTaxoFacets       21.89     (21.2%)       22.05     (20.7%)    0.7% ( -33% -   54%) 0.913
           BrowseMonthSSDVFacets       13.44     (15.1%)       13.55     (14.4%)    0.8% ( -24% -   35%) 0.861
                   OrNotHighHigh      890.96      (4.6%)      899.54      (3.2%)    1.0% (  -6% -    9%) 0.444
                        PKLookup      170.01      (3.6%)      171.68      (4.2%)    1.0% (  -6% -    9%) 0.425
                    OrNotHighMed     1002.77      (4.5%)     1013.80      (2.9%)    1.1% (  -5% -    8%) 0.354
                    OrNotHighLow     1401.09      (2.6%)     1416.61      (2.9%)    1.1% (  -4% -    6%) 0.198
          OrHighMedDayTaxoFacets        6.95      (4.5%)        7.04      (4.8%)    1.3% (  -7% -   11%) 0.378
            HighTermTitleBDVSort       45.01     (27.2%)       45.86     (22.6%)    1.9% ( -37% -   71%) 0.810
           HighTermDayOfYearSort       99.77     (16.7%)      103.20     (20.4%)    3.4% ( -28% -   48%) 0.561
               HighTermMonthSort       94.79     (17.3%)      102.32     (26.9%)    7.9% ( -30% -   63%) 0.268
                      TermDTSort       81.57     (26.2%)       89.10     (23.4%)    9.2% ( -31% -   79%) 0.240
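
For readers unfamiliar with the delegation approach mentioned in the description, here is a minimal sketch of the general pattern. It is not the code from this PR: the class names `Aggregation`, `IntAssociationFacets`, and `SumIntAssociationFacets` are hypothetical stand-ins, and the real facet classes do considerably more. The idea shown is only that a legacy sum-only class can be kept as a thin subclass that fixes the aggregation function to "sum", so existing callers continue to work while new callers choose another function.

```java
import java.util.function.IntBinaryOperator;

// Hypothetical stand-in for a pluggable aggregation function (not Lucene's API).
enum Aggregation {
  SUM((a, b) -> a + b),
  MAX(Math::max);

  private final IntBinaryOperator op;

  Aggregation(IntBinaryOperator op) {
    this.op = op;
  }

  int aggregate(int existing, int value) {
    return op.applyAsInt(existing, value);
  }
}

// Hypothetical general class: folds per-ordinal association weights
// using whichever aggregation function the caller supplies.
class IntAssociationFacets {
  private final int[] values; // starts at 0, so MAX here assumes non-negative weights
  private final Aggregation aggregation;

  IntAssociationFacets(int numOrdinals, Aggregation aggregation) {
    this.values = new int[numOrdinals];
    this.aggregation = aggregation;
  }

  void accumulate(int ordinal, int weight) {
    values[ordinal] = aggregation.aggregate(values[ordinal], weight);
  }

  int valueFor(int ordinal) {
    return values[ordinal];
  }
}

// Hypothetical legacy class kept for backward compatibility:
// it delegates everything to the general class with SUM fixed.
@Deprecated
class SumIntAssociationFacets extends IntAssociationFacets {
  SumIntAssociationFacets(int numOrdinals) {
    super(numOrdinals, Aggregation.SUM);
  }
}
```

Because the legacy class runs through the same code path as the new one, a benchmark of existing sum-based tasks exercises the new implementation directly, which is presumably why the comparison above was straightforward to produce.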

@gsmiller gsmiller changed the title LUCENE-10444: Support alternate aggregation functions in association facets LUCENE-10444 BACKPORT: Support alternate aggregation functions in association facets Mar 2, 2022
@gsmiller gsmiller force-pushed the LUCENE-10444-association-facets-on9-pr branch from 195f86f to 4d143c2 Compare March 14, 2022 15:41
@gsmiller gsmiller merged commit 9e10ba0 into apache:branch_9x Apr 7, 2022
@gsmiller gsmiller deleted the LUCENE-10444-association-facets-on9-pr branch April 7, 2022 14:48