Baseline and modified version show different benchmark results even though the codebase is the same #142
Comments
There is no statistical difference here. The final column, p-value, tells
you the probability that the difference you are observing is due to random
chance. It's one.
You can see the absolute values of these random differences shrink by
running larger test samples: more iterations, larger indexes, more queries.
What your A/A test shows you is the magnitude of noise on the system given
your sample size.
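Roughly speaking, that p-value comes from comparing the per-task QPS samples of the two runs. The snippet below is only an illustration with made-up numbers and scipy (not necessarily the exact statistic luceneutil computes), but it shows how heavily overlapping samples yield a large p-value:

# Illustration only: two invented sets of per-iteration QPS measurements
# for the same task from an A/A run. The values overlap heavily, so any
# difference in their means is attributable to noise.
from scipy import stats

baseline_qps = [407.2, 398.5, 412.9, 401.4, 409.8]
modified_qps = [405.1, 399.7, 410.3, 396.6, 411.0]

# Welch's two-sample t-test: a large p-value (far above 0.05) means the
# observed gap is consistent with random chance rather than a real change.
t_stat, p_value = stats.ttest_ind(baseline_qps, modified_qps, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.2f}")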
On Mon, Oct 11, 2021, 5:17 AM praveennish wrote:
Hi @mikemccand,
I have cloned the Lucene 9 code into both the baseline and candidate folders (so the codebase is 100% the same).
I saw that there are performance differences after running the command:
python3 src/python/localrun.py -source wikimedium10k
The first output table shows:
Task    QPS baseline    StdDev    QPS my_modified_version    StdDev    Pct diff    p-value
Prefix3 407.18 (0.0%) 314.24 (0.0%) -22.8% ( -22% - -22%) 1.000
LowSpanNear 1073.76 (0.0%) 852.06 (0.0%) -20.6% ( -20% - -20%) 1.000
MedSloppyPhrase 1140.22 (0.0%) 927.42 (0.0%) -18.7% ( -18% - -18%) 1.000
MedPhrase 964.51 (0.0%) 848.50 (0.0%) -12.0% ( -12% - -12%) 1.000
HighIntervalsOrdered 1002.98 (0.0%) 884.65 (0.0%) -11.8% ( -11% - -11%) 1.000
HighTermMonthSort 4017.92 (0.0%) 3660.73 (0.0%) -8.9% ( -8% - -8%) 1.000
Respell 512.33 (0.0%) 467.72 (0.0%) -8.7% ( -8% - -8%) 1.000
HighSpanNear 893.76 (0.0%) 821.69 (0.0%) -8.1% ( -8% - -8%) 1.000
IntNRQ 1828.06 (0.0%) 1682.03 (0.0%) -8.0% ( -7% - -7%) 1.000
HighTerm 5614.10 (0.0%) 5200.05 (0.0%) -7.4% ( -7% - -7%) 1.000
BrowseMonthTaxoFacets 4142.06 (0.0%) 3870.82 (0.0%) -6.5% ( -6% - -6%) 1.000
HighTermDayOfYearSort 3782.61 (0.0%) 3538.93 (0.0%) -6.4% ( -6% - -6%) 1.000
BrowseDayOfYearSSDVFacets 2665.19 (0.0%) 2514.64 (0.0%) -5.6% ( -5% - -5%) 1.000
LowTerm 6806.33 (0.0%) 6460.07 (0.0%) -5.1% ( -5% - -5%) 1.000
HighSloppyPhrase 886.16 (0.0%) 845.10 (0.0%) -4.6% ( -4% - -4%) 1.000
OrHighMed 898.26 (0.0%) 858.97 (0.0%) -4.4% ( -4% - -4%) 1.000
LowPhrase 988.79 (0.0%) 947.64 (0.0%) -4.2% ( -4% - -4%) 1.000
OrHighLow 1171.10 (0.0%) 1124.50 (0.0%) -4.0% ( -3% - -3%) 1.000
BrowseDateTaxoFacets 3796.98 (0.0%) 3648.76 (0.0%) -3.9% ( -3% - -3%) 1.000
PKLookup 326.99 (0.0%) 315.53 (0.0%) -3.5% ( -3% - -3%) 1.000
BrowseMonthSSDVFacets 3212.18 (0.0%) 3110.22 (0.0%) -3.2% ( -3% - -3%) 1.000
AndHighLow 2763.74 (0.0%) 2691.30 (0.0%) -2.6% ( -2% - -2%) 1.000
MedSpanNear 634.86 (0.0%) 624.48 (0.0%) -1.6% ( -1% - -1%) 1.000
Wildcard 581.94 (0.0%) 572.55 (0.0%) -1.6% ( -1% - -1%) 1.000
HighPhrase 729.77 (0.0%) 720.61 (0.0%) -1.3% ( -1% - -1%) 1.000
BrowseDayOfYearTaxoFacets 3111.47 (0.0%) 3073.01 (0.0%) -1.2% ( -1% - -1%) 1.000
OrHighHigh 430.85 (0.0%) 426.77 (0.0%) -0.9% ( 0% - 0%) 1.000
AndHighHigh 1029.49 (0.0%) 1028.71 (0.0%) -0.1% ( 0% - 0%) 1.000
LowSloppyPhrase 1351.24 (0.0%) 1365.14 (0.0%) 1.0% ( 1% - 1%) 1.000
Fuzzy2 70.31 (0.0%) 71.83 (0.0%) 2.2% ( 2% - 2%) 1.000
Fuzzy1 324.58 (0.0%) 338.44 (0.0%) 4.3% ( 4% - 4%) 1.000
LowIntervalsOrdered 1721.13 (0.0%) 1807.65 (0.0%) 5.0% ( 5% - 5%) 1.000
MedTerm 5749.70 (0.0%) 6042.57 (0.0%) 5.1% ( 5% - 5%) 1.000
MedIntervalsOrdered 1291.17 (0.0%) 1382.36 (0.0%) 7.1% ( 7% - 7%) 1.000
AndHighMed 1322.11 (0.0%) 1575.31 (0.0%) 19.2% ( 19% - 19%) 1.000
My expectation was that the same code would perform the same way in both runs, but you can notice deviations. Can you please explain it? Is this the right way to run the benchmark?
Thanks!
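For reference, the Pct diff column in the table above is just the relative change of the modified QPS against the baseline QPS; assuming that definition, the Prefix3 row works out like this:

# Prefix3 row from the table above: baseline vs. modified QPS.
baseline_qps = 407.18
modified_qps = 314.24

# Relative change of modified vs. baseline, as a percentage.
pct_diff = (modified_qps - baseline_qps) / baseline_qps * 100
print(f"{pct_diff:.1f}%")  # prints -22.8%, matching the Pct diff column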
Thanks for your input! Is there a wiki or any documentation that specifies which fields to look at when measuring the performance of baseline vs. modified?
Hi @msokolov, I am still observing differences in the stats of baseline vs. modified even though they are the same Lucene 9 code. Following are my parameters: corpus - wikimediumall. This is the 63rd iteration value
and this is the 64th iteration.
Kindly educate me on why this is happening and what conclusion we can draw from it?
These are surprisingly/depressingly noisy results. Are you sure they are A/A? Exactly the same?
I am very sorry @mikemccand for the late reply! I wanted to retest today, but after the latest pull I am getting a FileNotFoundException for this file. Where can I download this file, please?
I think this is partly explained by #307 |
Hi @mikemccand,
I have cloned the Lucene 9 code into both the baseline and candidate folders (so the codebase is 100% the same).
I saw that there are performance differences after running the command:
python3 src/python/localrun.py -source wikimedium10k
The first output table shows:
My expectation was that the same code would perform the same way in both runs, but you can notice deviations. Can you please explain it? Is this the right way to run the benchmark?
Thanks!
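As a rough sketch of the "larger samples" advice above (assuming a luceneutil checkout with the wikimediumall corpus already set up, and using the same localrun.py script as the command above), repeating the identical A/A run a few times shows how much the per-task numbers move purely from run-to-run noise:

# Sketch: repeat the same A/A benchmark a few times and eyeball the spread.
# Assumes it is run from the luceneutil checkout, like the command above.
import subprocess

RUNS = 3  # more runs, larger indexes, and more queries shrink the noise floor

for i in range(RUNS):
    print(f"=== A/A run {i + 1} of {RUNS} ===")
    subprocess.run(
        ["python3", "src/python/localrun.py", "-source", "wikimediumall"],
        check=True,
    )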