Computation of the weighted geometric mean

Stefan Krause edited this page Oct 15, 2023 · 2 revisions

For results before Chrome 118, the overall result for the CPU benchmarks was simply the geometric mean of an implementation's slowdown factors across all benchmarks. The slowdown factor is the duration for the implementation and benchmark divided by the duration of the fastest implementation for that benchmark.
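As a rough illustration of the old scheme, here is a minimal Python sketch. The benchmark names, implementation names and durations are made up, and the helpers `slowdown_factors` and `geometric_mean` are hypothetical names, not functions from the benchmark's code base.

```python
import math

def slowdown_factors(durations):
    """Return duration / fastest duration for every implementation of one benchmark."""
    fastest = min(durations.values())
    return {impl: d / fastest for impl, d in durations.items()}

def geometric_mean(values):
    """Plain (unweighted) geometric mean, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical durations per benchmark and implementation (not real results).
durations = {
    "bench_a": {"impl_x": 10.0, "impl_y": 20.0},
    "bench_b": {"impl_x": 30.0, "impl_y": 15.0},
}

# One slowdown factor per (benchmark, implementation) pair.
factors = {bench: slowdown_factors(d) for bench, d in durations.items()}

# Overall result of one implementation: geometric mean over all benchmarks.
score_x = geometric_mean(factors[bench]["impl_x"] for bench in factors)
print(f"impl_x: {score_x:.3f}")
```

Every benchmark enters this mean with the same weight, which is the property the rest of this page is concerned with.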

If you look at the results you see that the spread of the factors differs between benchmarks. The create row factors are much closer together than the select row factors, and so on. If we simply take the geometric mean, we emphasize the influence of those benchmarks that have a large spread. (Sadly, those are also the weakest benchmarks in terms of variance and stability...) So it seems like a good idea to use a weighted geometric mean (https://en.wikipedia.org/wiki/Weighted_geometric_mean).
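To make the effect concrete, here is a small sketch of the weighted geometric mean as defined in the linked Wikipedia article, exp(Σ wᵢ ln xᵢ / Σ wᵢ). The factors and weights are invented for illustration, and the benchmark's actual implementation may be organized differently.

```python
import math

def weighted_geometric_mean(values, weights):
    """exp(sum(w_i * ln(x_i)) / sum(w_i)) for positive values and weights."""
    total_weight = sum(weights)
    log_sum = sum(w * math.log(x) for x, w in zip(values, weights))
    return math.exp(log_sum / total_weight)

# Hypothetical slowdown factors of one implementation:
# one benchmark with a small spread, one with a large spread.
factors = [1.5, 40.0]

# Equal weights reproduce the plain geometric mean.
unweighted = weighted_geometric_mean(factors, [1.0, 1.0])

# Made-up weights that give the large-spread benchmark less influence.
weighted = weighted_geometric_mean(factors, [0.6, 0.2])

print(f"unweighted: {unweighted:.2f}, weighted: {weighted:.2f}")
```

With the smaller weight on the second benchmark, the factor of 40 pulls the overall result up far less than it does in the plain geometric mean.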

What weights could we use for that purpose? For each benchmark we could take 1/factor of the slowest implementation. Select row would then have a weight of 1/47.6684 and create row a weight of 1/3.267. But what if choo changed its (obviously pretty slow) implementation of select row and, let's say, performed as well as blazor-wasm? Then the weight would jump to 1/24.7354, almost twice its previous value, which is obviously a big change.

Thus we're using the 90th percentile of the factors instead, which results in a weight of 1/5.16529 for select row, a much less drastic weight that no longer depends on a single slow implementation.
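The following sketch compares the two candidate weights. It assumes a percentile with linear interpolation between sorted values; which interpolation rule the benchmark really uses is not stated here, so treat that as an assumption. The list of factors is invented, and only the two outlier values (47.6684 and 24.7354) are taken from the text above.

```python
import math

def percentile(values, q):
    """q-th quantile (0 <= q <= 1), linearly interpolated between sorted values."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def weight_from_slowest(factors):
    """Candidate 1: 1 / factor of the slowest implementation."""
    return 1.0 / max(factors)

def weight_from_p90(factors):
    """Candidate 2: 1 / 90th percentile of the factors."""
    return 1.0 / percentile(factors, 0.9)

# Hypothetical factors for one benchmark with a single very slow implementation.
before = [1.0, 1.2, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 6.0, 8.0, 47.6684]
# The same benchmark after the slowest implementation improved.
after = before[:-1] + [24.7354]

print("1/slowest:", weight_from_slowest(before), "->", weight_from_slowest(after))
print("1/p90:    ", weight_from_p90(before), "->", weight_from_p90(after))
```

In this made-up data set the weight based on the slowest implementation nearly doubles, while the percentile-based weight does not move, because the outlier lies above the 90th percentile both before and after the change.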

Those are the current weights:

| benchmark | fastest | 90th percentile | factor at 90th percentile | weight | slowest | factor for slowest |
| --- | --- | --- | --- | --- | --- | --- |
| 01_run1k | 38.72 | 60.24 | 1.56 | 0.64 | 126.51 | 3.27 |
| 02_replace1k | 39.15 | 69.82 | 1.78 | 0.56 | 192.16 | 4.91 |
| 03_update10th1k_x16 | 19.34 | 34.26 | 1.77 | 0.56 | 168.47 | 8.71 |
| 04_select1k | 3.29 | 17.06 | 5.19 | 0.19 | 156.59 | 47.67 |
| 05_swap1k | 22.61 | 171.26 | 7.58 | 0.13 | 328.72 | 14.54 |
| 06_remove-one-1k | 17.72 | 33.58 | 1.89 | 0.53 | 163.77 | 9.24 |
| 07_create10k | 386.77 | 685.22 | 1.77 | 0.56 | 2415.89 | 6.25 |
| 08_create1k-after1k_x2 | 40.88 | 74.22 | 1.82 | 0.55 | 160.16 | 3.92 |
| 09_clear1k_x8 | 13.14 | 31.10 | 2.37 | 0.42 | 53.41 | 4.06 |
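As a sanity check, the weights in the table can be recomputed as fastest / 90th percentile, and then applied to one implementation's factors. The sketch below copies the fastest, 90th percentile and weight columns from the table; the list `hypothetical_factors` is invented and does not belong to any real framework.

```python
import math

# (benchmark, fastest, 90th percentile, published weight), copied from the table above.
ROWS = [
    ("01_run1k", 38.72, 60.24, 0.64),
    ("02_replace1k", 39.15, 69.82, 0.56),
    ("03_update10th1k_x16", 19.34, 34.26, 0.56),
    ("04_select1k", 3.29, 17.06, 0.19),
    ("05_swap1k", 22.61, 171.26, 0.13),
    ("06_remove-one-1k", 17.72, 33.58, 0.53),
    ("07_create10k", 386.77, 685.22, 0.56),
    ("08_create1k-after1k_x2", 40.88, 74.22, 0.55),
    ("09_clear1k_x8", 13.14, 31.10, 0.42),
]

def weighted_geometric_mean(values, weights):
    """exp(sum(w_i * ln(x_i)) / sum(w_i))."""
    log_sum = sum(w * math.log(x) for x, w in zip(values, weights))
    return math.exp(log_sum / sum(weights))

# weight = 1 / (90th percentile / fastest) = fastest / 90th percentile
recomputed = {name: fastest / p90 for name, fastest, p90, _ in ROWS}
for name, _, _, published in ROWS:
    print(f"{name}: recomputed {recomputed[name]:.2f}, published {published:.2f}")

# Made-up slowdown factors of one implementation, in the same benchmark order.
hypothetical_factors = [1.2, 1.3, 1.1, 2.0, 3.0, 1.2, 1.1, 1.2, 1.4]
weights = [published for _, _, _, published in ROWS]

score = weighted_geometric_mean(hypothetical_factors, weights)
plain = weighted_geometric_mean(hypothetical_factors, [1.0] * len(ROWS))
print(f"weighted: {score:.3f}, unweighted: {plain:.3f}")
```

Because select row and swap rows carry the smallest weights, the larger factors assumed for them in this example raise the weighted result less than they raise the plain geometric mean.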

These weights might be readjusted at some point in the future.

If you want to take a closer look, you can play with the weights in an Excel file: results.xlsx