Improve the benchmark for polynomial evaluation #3930

Merged (6 commits) on Mar 29, 2024

@pleroy (Member) commented Mar 29, 2024

  1. Exercise more degrees, to get a sense of which scheme is preferable for which degree.
  2. Exercise evaluation with and without FMA, to get a sense of the impact of FMA.
  3. Use modern benchmark library features to replace our hacky way of preventing optimization.
  4. Use templates instead of code duplication.
  5. Remove the benchmark for Vector<double>: it does not differ from Displacement in any way, and it is not something that we use in practice.
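
As a rough illustration of points 1, 2 and 4, the sketch below shows how a single template can cover several degrees and both the FMA and non-FMA variants without duplicating code. It is not the code from this PR: the names `WithFMA`, `WithoutFMA`, `MultiplyAdd` and `EvaluateHorner` are hypothetical, and it works on plain `double` rather than on Principia quantities. In a Google Benchmark harness the result would typically also be passed to `benchmark::DoNotOptimize` so that the compiler cannot discard the evaluation (point 3).

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <type_traits>

// Hypothetical policy tags, named after the benchmark labels for readability.
struct WithFMA {};
struct WithoutFMA {};

// Computes a * x + b, either with an explicit fused multiply-add or as a
// separate multiplication and addition.  Note that, depending on compiler
// flags, the unfused expression may still be contracted into an FMA.
template<typename Policy>
double MultiplyAdd(double const a, double const x, double const b) {
  if constexpr (std::is_same_v<Policy, WithFMA>) {
    return std::fma(a, x, b);
  } else {
    return a * x + b;
  }
}

// Evaluates c[0] + c[1] * x + ... + c[degree] * x^degree with Horner's
// scheme.  The degree is a template parameter, so one definition serves
// every degree that a benchmark wants to exercise.
template<typename Policy, std::size_t degree>
double EvaluateHorner(std::array<double, degree + 1> const& c,
                      double const x) {
  double result = c[degree];
  for (std::size_t i = degree; i > 0; --i) {
    result = MultiplyAdd<Policy>(result, x, c[i - 1]);
  }
  return result;
}
```

A call such as `EvaluateHorner<WithFMA, 2>(c, x)` with `c = {1, 2, 3}` evaluates 1 + 2x + 3x²; replacing `WithFMA` by `WithoutFMA` selects the unfused variant of the same code.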

Numbers after this PR:

Running C:\Users\phl\Projects\GitHub\Principia\Principia\Release\x64\benchmarks.exe
Run on (48 X 3793 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x24)
  L1 Instruction 32 KiB (x24)
  L2 Unified 512 KiB (x24)
  L3 Unified 32768 KiB (x4)
------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                              Time             CPU   Iterations
------------------------------------------------------------------------------------------------------------------------
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/2                              2.65 ns         2.67 ns    263529412
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/4                              2.79 ns         2.76 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/6                              2.91 ns         2.92 ns    235789474
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/8                              3.54 ns         3.53 ns    194782609
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/10                             3.76 ns         3.85 ns    194782609
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/12                             3.98 ns         4.01 ns    179200000
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/14                             4.20 ns         4.14 ns    165925926
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/16                             4.64 ns         4.60 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/2                              2.66 ns         2.67 ns    263529412
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/4                              2.87 ns         2.89 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/6                              2.91 ns         2.95 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/8                              3.32 ns         3.30 ns    203636364
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/10                             3.79 ns         3.77 ns    186666667
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/12                             3.99 ns         3.99 ns    172307692
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/14                             4.30 ns         4.33 ns    165925926
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/16                             4.87 ns         4.74 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/2                  2.82 ns         2.83 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/4                  3.55 ns         3.53 ns    194782609
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/6                  4.64 ns         4.60 ns    149333333
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/8                  5.64 ns         5.62 ns    100000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/10                 6.34 ns         6.28 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/12                 7.59 ns         7.67 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/14                 10.4 ns         10.5 ns     74666667
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/16                 7.12 ns         7.15 ns     89600000
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/2                              2.67 ns         2.67 ns    263529412
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/4                              2.66 ns         2.61 ns    263529412
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/6                              2.88 ns         2.92 ns    235789474
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/8                              2.88 ns         2.89 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/10                             3.19 ns         3.21 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/12                             3.53 ns         3.53 ns    194782609
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/14                             3.89 ns         3.92 ns    179200000
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/16                             4.39 ns         4.43 ns    165925926
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/2                              2.67 ns         2.67 ns    263529412
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/4                              2.87 ns         2.85 ns    235789474
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/6                              2.80 ns         2.76 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/8                              2.88 ns         2.83 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/10                             3.32 ns         3.37 ns    213333333
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/12                             3.73 ns         3.68 ns    186666667
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/14                             3.91 ns         3.92 ns    179200000
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/16                             4.43 ns         4.44 ns    161858065
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/2                  2.83 ns         2.83 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/4                  3.57 ns         3.61 ns    194782609
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/6                  4.23 ns         4.24 ns    165925926
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/8                  5.13 ns         5.02 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/10                 5.79 ns         5.86 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/12                 6.50 ns         6.56 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/14                 9.90 ns         9.84 ns     74666667
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/16                 8.22 ns         8.20 ns     89600000
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/2                    2.88 ns         2.89 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/4                    2.89 ns         2.89 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/6                    2.88 ns         2.85 ns    235789474
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/8                    2.87 ns         2.89 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/10                   3.12 ns         3.07 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/12                   3.32 ns         3.30 ns    213333333
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/14                   3.54 ns         3.53 ns    194782609
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/16                   3.98 ns         4.10 ns    179200000
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/2                    2.88 ns         2.89 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/4                    3.32 ns         3.35 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/6                    2.88 ns         2.85 ns    235789474
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/8                    2.88 ns         2.95 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/10                   3.32 ns         3.30 ns    203636364
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/12                   3.33 ns         3.30 ns    203636364
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/14                   3.76 ns         3.77 ns    186666667
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/16                   4.04 ns         4.10 ns    179200000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/2        3.10 ns         3.14 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/4        3.83 ns         3.75 ns    179200000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/6        4.87 ns         4.87 ns    144516129
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/8        5.75 ns         5.78 ns    100000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/10       6.90 ns         6.98 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/12       8.05 ns         8.02 ns     89600000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/14       10.5 ns         10.5 ns     64000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/16       10.6 ns         10.7 ns     74666667
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/2                    2.88 ns         2.89 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/4                    2.88 ns         2.85 ns    235789474
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/6                    3.10 ns         3.11 ns    235789474
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/8                    3.48 ns         3.45 ns    194782609
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/10                   4.30 ns         4.30 ns    160000000
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/12                   5.01 ns         5.00 ns    100000000
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/14                   5.68 ns         5.72 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/16                   6.65 ns         6.70 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/2                    2.90 ns         2.92 ns    235789474
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/4                    3.10 ns         3.08 ns    213333333
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/6                    2.98 ns         2.98 ns    235789474
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/8                    3.54 ns         3.61 ns    194782609
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/10                   4.22 ns         4.14 ns    165925926
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/12                   5.01 ns         5.00 ns    100000000
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/14                   5.74 ns         5.72 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/16                   6.64 ns         6.56 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/2        3.10 ns         3.07 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/4        3.78 ns         3.75 ns    179200000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/6        4.91 ns         4.97 ns    144516129
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/8        6.10 ns         6.00 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/10       7.92 ns         7.85 ns     89600000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/12       9.72 ns         9.63 ns     74666667
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/14       10.4 ns         10.5 ns     74666667
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/16       12.0 ns         11.7 ns     56000000
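
For readers unfamiliar with the two schemes named in the table, here is a minimal sketch of each on plain `double` coefficients. It is written for clarity rather than speed, uses runtime-sized vectors, and is not the Principia implementation; the function names `Horner` and `Estrin` are illustrative. Horner's scheme performs one multiply-add per coefficient, each depending on the previous one. Estrin's scheme combines adjacent pairs of coefficients and squares the argument at each round, so the operations within a round are independent of one another.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Horner's scheme: a single chain of dependent multiply-adds, starting from
// the highest-degree coefficient.  c[i] is the coefficient of x^i.
double Horner(std::vector<double> const& c, double const x) {
  double result = 0.0;
  for (std::size_t i = c.size(); i > 0; --i) {
    result = result * x + c[i - 1];
  }
  return result;
}

// Estrin's scheme: each round replaces the coefficients by the pairwise
// combinations c[2k] + c[2k + 1] * y, then squares y.  The combinations
// within one round do not depend on each other.
double Estrin(std::vector<double> c, double const x) {
  if (c.empty()) {
    return 0.0;
  }
  double y = x;
  while (c.size() > 1) {
    std::vector<double> next;
    next.reserve((c.size() + 1) / 2);
    for (std::size_t k = 0; k + 1 < c.size(); k += 2) {
      next.push_back(c[k] + c[k + 1] * y);
    }
    if (c.size() % 2 == 1) {
      // An unpaired trailing coefficient is carried to the next round.
      next.push_back(c.back());
    }
    c = std::move(next);
    y *= y;
  }
  return c[0];
}
```

Both functions compute the same polynomial; for the coefficients {1, 2, 3, 4} at x = 2 each evaluates 1 + 2·2 + 3·4 + 4·8.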

@eggrobin eggrobin added the LGTM label Mar 29, 2024
@pleroy pleroy merged commit 4c8e29f into mockingbirdnest:master Mar 29, 2024
7 checks passed