jpountz (Contributor) commented Jan 30, 2025

This is a rare bug (for instance none of the queries in nightly benchmarks return different top hits with the fix, and I haven't been able to create a proper test) but still a bug.

@jpountz jpountz added this to the 10.2.0 milestone Jan 30, 2025
jpountz (Contributor, Author) commented Jan 30, 2025

I am annoyed that I'm not able to create a proper test for this. It seems to require very specific conditions; I found it while playing with score quantization.

jpountz (Contributor, Author) commented Jan 30, 2025

Interestingly, this makes queries run faster, most visibly filtered disjunctions (the FilteredOr* tasks, 6.5% to 16.9% higher QPS):

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                      DismaxTerm      812.60      (4.5%)      796.40      (5.8%)   -2.0% ( -11% -    8%) 0.223
                          Phrase       24.37      (3.0%)       23.99      (3.0%)   -1.6% (  -7% -    4%) 0.096
                     CountPhrase        7.23      (2.5%)        7.12      (3.4%)   -1.5% (  -7% -    4%) 0.110
                  FilteredPhrase       56.57      (2.2%)       56.12      (1.9%)   -0.8% (  -4% -    3%) 0.219
             CountFilteredPhrase       51.57      (2.1%)       51.20      (2.4%)   -0.7% (  -5% -    3%) 0.316
             CombinedAndHighHigh       24.95      (2.8%)       24.77      (2.4%)   -0.7% (  -5% -    4%) 0.394
                    TermBGroup1M       43.17      (2.8%)       42.89      (2.6%)   -0.6% (  -5% -    4%) 0.452
                DismaxOrHighHigh      163.56      (3.3%)      162.61      (3.2%)   -0.6% (  -6% -    6%) 0.576
                      OrHighHigh       67.97      (2.5%)       67.59      (2.9%)   -0.6% (  -5% -    4%) 0.509
                CountAndHighHigh      510.48      (4.5%)      507.63      (5.2%)   -0.6% (  -9% -    9%) 0.715
                 DismaxOrHighMed      252.23      (1.7%)      251.05      (1.6%)   -0.5% (  -3% -    2%) 0.371
                       OrHighMed      296.25      (2.5%)      295.00      (2.2%)   -0.4% (  -4% -    4%) 0.571
                    TermGroup10K       30.45      (2.2%)       30.34      (2.6%)   -0.4% (  -5% -    4%) 0.628
                 AndHighOrMedMed       72.04      (2.3%)       71.77      (2.0%)   -0.4% (  -4% -    4%) 0.591
                     TermGroup1M       32.17      (2.0%)       32.06      (2.1%)   -0.3% (  -4% -    3%) 0.604
              CombinedOrHighHigh       29.06      (2.4%)       29.01      (2.0%)   -0.2% (  -4% -    4%) 0.777
              CombinedAndHighMed       96.64      (2.7%)       96.48      (2.0%)   -0.2% (  -4% -    4%) 0.818
                       CountTerm    15186.79      (4.1%)    15167.52      (4.5%)   -0.1% (  -8% -    8%) 0.927
                      OrHighRare      500.40     (11.0%)      499.80     (10.8%)   -0.1% ( -19% -   24%) 0.972
                          Fuzzy1      112.95      (2.9%)      112.83      (2.5%)   -0.1% (  -5% -    5%) 0.896
                    TermGroup100       34.29      (2.7%)       34.26      (2.2%)   -0.1% (  -4% -    4%) 0.895
                          Fuzzy2      106.40      (2.7%)      106.30      (2.2%)   -0.1% (  -4% -    4%) 0.898
                         Respell       76.49      (1.8%)       76.41      (2.5%)   -0.1% (  -4% -    4%) 0.885
                        Or3Terms      224.72      (3.9%)      224.85      (3.6%)    0.1% (  -7% -    7%) 0.963
                AndMedOrHighHigh       90.66      (2.6%)       90.72      (3.3%)    0.1% (  -5% -    6%) 0.940
                    CombinedTerm       50.53      (2.4%)       50.59      (1.8%)    0.1% (  -3% -    4%) 0.861
          CountFilteredOrHighMed      241.92      (1.4%)      242.33      (1.0%)    0.2% (  -2% -    2%) 0.664
                     AndHighHigh       57.20      (3.0%)       57.31      (2.7%)    0.2% (  -5% -    6%) 0.828
                        PKLookup      369.68      (4.1%)      370.44      (4.9%)    0.2% (  -8% -    9%) 0.886
               CombinedOrHighMed      116.27      (2.5%)      116.53      (1.6%)    0.2% (  -3% -    4%) 0.732
               TermDayOfYearSort     3390.38      (2.8%)     3399.86      (2.7%)    0.3% (  -5% -    5%) 0.748
                   TermTitleSort      238.54      (2.3%)      239.22      (2.2%)    0.3% (  -4% -    4%) 0.688
         CountFilteredOrHighHigh      221.59      (1.5%)      222.27      (1.1%)    0.3% (  -2% -    3%) 0.471
                     OrStopWords       44.77      (3.9%)       44.96      (2.9%)    0.4% (  -6% -    7%) 0.704
                  CountOrHighMed      525.36      (1.8%)      527.98      (1.6%)    0.5% (  -2% -    4%) 0.367
              Or2Terms2StopWords      257.41      (3.0%)      258.78      (2.5%)    0.5% (  -4% -    6%) 0.537
                 CountAndHighMed      423.37      (2.2%)      425.64      (1.8%)    0.5% (  -3% -    4%) 0.404
                          IntNRQ      213.65      (5.0%)      214.81      (8.2%)    0.5% ( -12% -   14%) 0.799
                  TermBGroup1M1P       53.65      (2.5%)       53.98      (2.9%)    0.6% (  -4% -    6%) 0.474
            FilteredAndStopWords       81.68      (5.1%)       82.25      (4.0%)    0.7% (  -8% -   10%) 0.630
                      AndHighMed      175.84      (1.8%)      177.14      (1.5%)    0.7% (  -2% -    4%) 0.165
               FilteredAnd3Terms      270.97      (3.2%)      273.32      (3.2%)    0.9% (  -5% -    7%) 0.386
                          OrMany       26.18      (6.5%)       26.41      (4.4%)    0.9% (  -9% -   12%) 0.613
                   TermMonthSort     3460.41      (1.9%)     3491.44      (2.4%)    0.9% (  -3% -    5%) 0.182
                 CountOrHighHigh      468.00      (3.7%)      472.34      (2.5%)    0.9% (  -5% -    7%) 0.359
             CountFilteredOrMany       40.96      (2.9%)       41.35      (2.1%)    0.9% (  -3% -    6%) 0.240
             FilteredAndHighHigh       98.24      (5.1%)       99.19      (3.9%)    1.0% (  -7% -   10%) 0.505
                        Wildcard      130.14      (3.8%)      131.54      (3.5%)    1.1% (  -6% -    8%) 0.352
     FilteredAnd2Terms2StopWords      306.52      (2.9%)      310.42      (1.9%)    1.3% (  -3% -    6%) 0.103
              FilteredAndHighMed      183.32      (2.4%)      186.00      (1.8%)    1.5% (  -2% -    5%) 0.030
                    FilteredTerm      264.80      (3.7%)      269.01      (3.5%)    1.6% (  -5% -    9%) 0.159
                      TermDTSort      597.32      (3.3%)      607.15      (4.2%)    1.6% (  -5% -    9%) 0.168
                    AndStopWords       39.81      (3.8%)       40.47      (3.0%)    1.7% (  -4% -    8%) 0.123
             And2Terms2StopWords      266.94      (2.9%)      271.66      (1.7%)    1.8% (  -2% -    6%) 0.017
                       And3Terms      235.03      (2.7%)      239.30      (1.6%)    1.8% (  -2% -    6%) 0.010
                         Prefix3      232.88      (3.3%)      237.20      (2.4%)    1.9% (  -3% -    7%) 0.043
                            Term      773.39      (6.3%)      792.17      (4.1%)    2.4% (  -7% -   13%) 0.148
               FilteredOrHighMed      238.07      (5.1%)      253.56      (4.3%)    6.5% (  -2% -   16%) 0.000
      FilteredOr2Terms2StopWords      238.50      (4.7%)      255.19      (4.0%)    7.0% (  -1% -   16%) 0.000
                FilteredOr3Terms      234.99      (4.9%)      255.47      (4.6%)    8.7% (   0% -   19%) 0.000
              FilteredOrHighHigh       97.09      (5.1%)      106.85      (4.5%)   10.1% (   0% -   20%) 0.000
             FilteredOrStopWords       71.25      (4.8%)       78.94      (4.4%)   10.8% (   1% -   20%) 0.000
                  FilteredOrMany       19.72      (6.9%)       23.05      (2.3%)   16.9% (   7% -   28%) 0.000
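
For readers unfamiliar with the table, the "Pct diff" column is consistent with the relative change in mean QPS between the two builds. The snippet below is only a sketch of that arithmetic on three rows copied from the table; the function name `pct_diff` is made up for illustration and is not part of the benchmark tooling, and the confidence range and p-value columns are not reproduced here.

```python
def pct_diff(baseline_qps: float, candidate_qps: float) -> float:
    """Relative QPS change of the candidate over the baseline, in percent."""
    return 100.0 * (candidate_qps - baseline_qps) / baseline_qps


# (task, baseline QPS, my_modified_version QPS), copied from the table above.
rows = [
    ("DismaxTerm", 812.60, 796.40),
    ("FilteredOrHighHigh", 97.09, 106.85),
    ("FilteredOrMany", 19.72, 23.05),
]

for task, base, cand in rows:
    # Positive values mean the modified version answers more queries per second.
    print(f"{task:>20} {pct_diff(base, cand):+6.1f}%")
```

Rounded to one decimal place, these reproduce the -2.0%, 10.1% and 16.9% entries reported for the same tasks.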

@jpountz jpountz merged commit 26dbc82 into apache:main Feb 2, 2025
5 checks passed
@jpountz jpountz deleted the fix/bug_filtered_disjunction branch February 2, 2025 19:26
jpountz added a commit that referenced this pull request Feb 2, 2025