Optimize the polynomial evaluation #3932

8 commits merged on Mar 30, 2024

@pleroy pleroy commented Mar 29, 2024

  1. Pass quantities by copy rather than by const reference.
  2. Use __vectorcall to return results in registers.
  3. Simplify the control flow for degrees less than 2, where Estrin and Horner are identical.
  4. Do not generate the squares upfront for Estrin; rely on common subexpression elimination instead.
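
For readers unfamiliar with the two schemes, here is a minimal sketch of points 3 and 4. It is not the code in this PR: the names `Horner` and `Estrin7`, and the use of plain `double` with `std::array`, are assumptions made for illustration only.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Horner: a[0] + x * (a[1] + x * (a[2] + ...)).  One FMA per coefficient,
// each depending on the previous one.  The argument is taken by copy.
template<std::size_t n>
double Horner(std::array<double, n> const& a, double const x) {
  static_assert(n >= 1, "need at least one coefficient");
  double result = a[n - 1];
  for (std::size_t i = n - 1; i-- > 0;) {
    result = std::fma(result, x, a[i]);
  }
  return result;
}

// Estrin for degree 7 (hypothetical fixed-degree example).  The squares
// x * x and (x * x) * (x * x) are written where they are used instead of
// being computed upfront; an optimizing compiler is expected, though not
// guaranteed, to evaluate each repeated product once.
inline double Estrin7(std::array<double, 8> const& a, double const x) {
  return std::fma(
      std::fma(std::fma(a[7], x, a[6]), x * x, std::fma(a[5], x, a[4])),
      (x * x) * (x * x),
      std::fma(std::fma(a[3], x, a[2]), x * x, std::fma(a[1], x, a[0])));
}
```

For degree 1 both schemes reduce to the single operation `std::fma(a[1], x, a[0])`, which is why the control flow can be shared below degree 2.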

The benchmarks after this PR show small but fairly consistent improvements, typically 5-10% (compare with #3930).

Run on (48 X 3793 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x24)
  L1 Instruction 32 KiB (x24)
  L2 Unified 512 KiB (x24)
  L3 Unified 32768 KiB (x4)
------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                              Time             CPU   Iterations
------------------------------------------------------------------------------------------------------------------------
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/2                              2.44 ns         2.46 ns    298666667
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/4                              2.43 ns         2.46 ns    298666667
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/6                              2.90 ns         2.89 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/8                              3.10 ns         3.14 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/10                             3.54 ns         3.53 ns    194782609
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/12                             3.77 ns         3.77 ns    186666667
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/14                             4.03 ns         4.01 ns    179200000
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/16                             4.43 ns         4.39 ns    160000000
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/2                              2.22 ns         2.25 ns    320000000
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/4                              2.43 ns         2.40 ns    280000000
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/6                              2.88 ns         2.85 ns    235789474
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/8                              3.10 ns         3.14 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/10                             3.54 ns         3.53 ns    194782609
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/12                             3.78 ns         3.77 ns    186666667
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/14                             3.99 ns         3.99 ns    172307692
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/16                             4.45 ns         4.44 ns    161858065
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/2                  2.67 ns         2.67 ns    263529412
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/4                  3.32 ns         3.37 ns    213333333
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/6                  4.23 ns         4.26 ns    172307692
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/8                  5.33 ns         5.30 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/10                 5.53 ns         5.62 ns    100000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/12                 6.99 ns         7.11 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/14                 8.47 ns         8.37 ns     74666667
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/16                 6.00 ns         6.00 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/2                              2.22 ns         2.20 ns    320000000
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/4                              2.43 ns         2.40 ns    280000000
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/6                              2.44 ns         2.40 ns    280000000
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/8                              2.66 ns         2.64 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/10                             2.88 ns         2.89 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/12                             3.07 ns         3.07 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/14                             3.52 ns         3.53 ns    194782609
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/16                             4.06 ns         3.99 ns    172307692
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/2                              2.21 ns         2.25 ns    320000000
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/4                              2.66 ns         2.61 ns    263529412
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/6                              2.45 ns         2.46 ns    280000000
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/8                              2.66 ns         2.64 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/10                             2.87 ns         2.85 ns    235789474
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/12                             3.07 ns         3.07 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/14                             3.53 ns         3.53 ns    203636364
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/16                             3.95 ns         3.92 ns    179200000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/2                  2.43 ns         2.46 ns    298666667
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/4                  3.33 ns         3.35 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/6                  3.99 ns         3.99 ns    172307692
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/8                  4.92 ns         5.00 ns    100000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/10                 5.31 ns         5.30 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/12                 6.00 ns         5.86 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/14                 8.16 ns         8.20 ns     89600000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/16                 7.10 ns         6.98 ns     89600000
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/2                    2.43 ns         2.40 ns    280000000
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/4                    3.10 ns         3.11 ns    235789474
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/6                    2.88 ns         2.89 ns    248888889
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/8                    3.11 ns         3.14 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/10                   3.10 ns         3.07 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/12                   3.77 ns         3.68 ns    186666667
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/14                   3.55 ns         3.53 ns    203636364
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/16                   3.98 ns         3.99 ns    172307692
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/2                    2.66 ns         2.67 ns    263529412
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/4                    3.10 ns         3.11 ns    235789474
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/6                    2.88 ns         2.85 ns    235789474
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/8                    3.10 ns         3.14 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/10                   3.33 ns         3.35 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/12                   3.32 ns         3.35 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/14                   3.57 ns         3.61 ns    203636364
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/16                   3.98 ns         4.01 ns    179200000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/2        2.21 ns         2.20 ns    320000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/4        3.87 ns         3.85 ns    186666667
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/6        4.87 ns         4.87 ns    144516129
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/8        5.78 ns         5.86 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/10       6.26 ns         6.28 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/12       7.34 ns         7.50 ns     89600000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/14       8.53 ns         8.54 ns     89600000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/16       7.02 ns         7.11 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/2                    2.43 ns         2.46 ns    298666667
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/4                    3.54 ns         3.61 ns    203636364
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/6                    3.10 ns         3.07 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/8                    3.13 ns         3.14 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/10                   3.71 ns         3.66 ns    179200000
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/12                   4.58 ns         4.65 ns    154482759
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/14                   5.14 ns         5.16 ns    100000000
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/16                   6.46 ns         6.42 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/2                    2.67 ns         2.67 ns    263529412
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/4                    3.34 ns         3.30 ns    203636364
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/6                    3.10 ns         3.14 ns    224000000
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/8                    3.32 ns         3.37 ns    213333333
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/10                   3.88 ns         3.81 ns    172307692
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/12                   4.47 ns         4.45 ns    154482759
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/14                   5.15 ns         5.16 ns    100000000
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/16                   6.44 ns         6.42 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/2        2.22 ns         2.20 ns    320000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/4        3.82 ns         3.85 ns    186666667
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/6        4.67 ns         4.60 ns    149333333
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/8        5.58 ns         5.58 ns    112000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/10       7.16 ns         6.98 ns     89600000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/12       8.80 ns         8.79 ns     74666667
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/14       11.6 ns         11.4 ns     56000000
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/16       10.8 ns         11.0 ns     64000000

Comparison:

Benchmark                                                                                       Time             CPU
--------------------------------------------------------------------------------------------------------------------
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/2                                       -0.0800         -0.0797
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/4                                       -0.0501         -0.0556
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/6                                       -0.0023         +0.0233
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/8                                       -0.1253         -0.1364
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/10                                      -0.0622         -0.0630
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/12                                      -0.0565         -0.0822
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/14                                      -0.0512         -0.0535
BM_EvaluatePolynomialInMonomialBasis<double, Estrin>/16                                      -0.0485         -0.0492
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/2                                       -0.0005         +0.0000
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/4                                       -0.1363         -0.1506
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/6                                       -0.0101         +0.0000
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/8                                       -0.0702         -0.0870
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/10                                      -0.0601         -0.0586
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/12                                      -0.0497         -0.0125
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/14                                      -0.1005         -0.0826
BM_EvaluatePolynomialInMonomialBasis<Length, Estrin>/16                                      -0.0910         -0.0968
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/2                           -0.1428         -0.1373
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/4                           -0.0692         -0.0861
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/6                           -0.0540         -0.0257
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/8                           -0.0366         -0.0575
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/10                          -0.0966         -0.0667
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/12                          -0.0842         -0.1136
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/14                          -0.1110         -0.1111
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Estrin>/16                          -0.1572         -0.1569
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/2                                       -0.1547         -0.1486
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/4                                       -0.0773         -0.0761
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/6                                       -0.1470         -0.1383
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/8                                       -0.1099         -0.0991
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/10                                      -0.0244         -0.0111
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/12                                      -0.1143         -0.1304
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/14                                      -0.0881         -0.0800
BM_EvaluatePolynomialInMonomialBasis<double, Horner>/16                                      -0.0766         -0.0675
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/2                                       -0.0843         -0.0784
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/4                                       -0.1436         -0.1429
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/6                                       -0.0534         -0.0556
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/8                                       -0.0047         +0.0000
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/10                                      -0.1315         -0.1163
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/12                                      -0.1714         -0.1498
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/14                                      -0.0988         -0.0948
BM_EvaluatePolynomialInMonomialBasis<Length, Horner>/16                                      -0.1039         -0.0778
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/2                           -0.1503         -0.1554
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/4                           +0.0420         +0.0698
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/6                           -0.0565         -0.0449
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/8                           -0.0861         -0.0784
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/10                          -0.0909         -0.0712
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/12                          -0.0890         -0.0851
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/14                          -0.1502         -0.1522
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, Horner>/16                          -0.1311         -0.1239
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/2                             -0.1589         -0.1579
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/4                             -0.1202         -0.1311
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/6                             +0.0802         +0.0870
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/8                             -0.1874         -0.1927
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/10                            +0.0007         +0.0047
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/12                            -0.0626         -0.0861
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/14                            -0.0100         -0.0222
BM_EvaluatePolynomialInMonomialBasis<double, EstrinWithoutFMA>/16                            -0.0091         +0.0141
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/2                             -0.0746         -0.0636
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/4                             -0.1990         -0.1873
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/6                             +0.0616         +0.0362
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/8                             -0.0778         -0.0851
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/10                            -0.1237         -0.1176
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/12                            +0.0016         +0.0000
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/14                            -0.0597         -0.0426
BM_EvaluatePolynomialInMonomialBasis<Length, EstrinWithoutFMA>/16                            +0.0225         +0.0169
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/2                 -0.3329         -0.3499
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/4                 +0.0248         -0.0062
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/6                 -0.0018         -0.0222
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/8                 -0.0437         -0.0353
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/10                -0.0832         -0.0816
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/12                -0.1206         -0.1087
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/14                -0.0944         -0.0842
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, EstrinWithoutFMA>/16                -0.3263         -0.3355
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/2                             -0.2671         -0.2550
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/4                             -0.1995         -0.1704
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/6                             -0.0723         -0.0851
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/8                             -0.1137         -0.1304
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/10                            -0.1294         -0.1292
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/12                            -0.0908         -0.1052
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/14                            -0.1017         -0.1081
BM_EvaluatePolynomialInMonomialBasis<double, HornerWithoutFMA>/16                            -0.0376         -0.0105
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/2                             -0.2010         -0.1873
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/4                             -0.1865         -0.1927
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/6                             +0.0723         +0.0373
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/8                             -0.0267         -0.0238
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/10                            -0.1319         -0.1498
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/12                            -0.0973         -0.1016
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/14                            -0.1226         -0.1541
BM_EvaluatePolynomialInMonomialBasis<Length, HornerWithoutFMA>/16                            -0.0270         -0.0365
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/2                 -0.2152         -0.2167
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/4                 +0.0312         +0.0233
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/6                 -0.0775         -0.0557
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/8                 -0.0816         -0.0909
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/10                -0.0905         -0.0631
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/12                -0.0896         -0.0761
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/14                +0.1116         +0.0719
BM_EvaluatePolynomialInMonomialBasis<Displacement<ICRS>, HornerWithoutFMA>/16                -0.4625         -0.4615

virtual Value operator()(Argument const& argument) const = 0;
virtual Derivative<Value, Argument> EvaluateDerivative(
    Argument const& argument) const = 0;
virtual Value __vectorcall operator()(Argument argument) const = 0;

Clang seems to pass things in registers with its default calling convention, so let’s make __vectorcall conditional on MSVC.

See https://godbolt.org/z/MqTP8vcK8.
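
One way to do this is a macro that expands to `__vectorcall` only under MSVC. This is a sketch, not the change that was committed: the macro name `EXAMPLE_VECTORCALL` and the `Function`/`Square` classes are made up for illustration, and testing `_MSC_VER` alone is an assumption (clang-cl also defines it, so it may need to be excluded).

```cpp
// Hypothetical macro: expands to the MSVC calling convention keyword where
// it is wanted, and to nothing elsewhere.
#if defined(_MSC_VER)
#  define EXAMPLE_VECTORCALL __vectorcall
#else
#  define EXAMPLE_VECTORCALL
#endif

// Minimal stand-in for an interface with a virtual call operator that takes
// its argument by copy.
struct Function {
  virtual ~Function() = default;
  virtual double EXAMPLE_VECTORCALL operator()(double argument) const = 0;
};

struct Square final : Function {
  double EXAMPLE_VECTORCALL operator()(double argument) const override {
    return argument * argument;
  }
};
```

The override has to repeat the macro so that its calling convention matches the one declared in the base class.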

@eggrobin eggrobin added the LGTM label Mar 30, 2024
Co-authored-by: Robin Leroy <egg.robin.leroy@gmail.com>
@pleroy pleroy merged commit 6b7a8c9 into mockingbirdnest:master Mar 30, 2024
7 checks passed