Add a comparative benchmark for SkM44 vs SkMatrix vs impeller::Matrix #51332

Merged (3 commits) on Mar 13, 2024


@flar (Contributor) commented Mar 11, 2024

In preparation for converting the DisplayList and engine Layer code to use Impeller data structures, the performance differential among the various transform objects is important information. According to this benchmark, Impeller's data structure is sometimes faster and never significantly slower than the Skia objects.

@flar (Contributor, Author) commented Mar 11, 2024

Preliminary results running on my MBP M1 Max:

Results on MBP M1 Max
Benchmark                                          Time             CPU   Iterations
------------------------------------------------------------------------------------
BM_AdapterDispatchOverhead/SkMatrix             1.24 ns         1.24 ns    555895269
BM_AdapterDispatchOverhead/SkM44                1.24 ns         1.24 ns    563884033
BM_AdapterDispatchOverhead/ImpellerMatrix       1.24 ns         1.24 ns    564434195
BM_AdapterRectOverhead/SkMatrix                 1.24 ns         1.24 ns    564329536
BM_AdapterRectOverhead/SkM44                    1.24 ns         1.24 ns    563911289
BM_AdapterRectOverhead/ImpellerMatrix           1.24 ns         1.24 ns    564229464
BM_SetIdentity/SkMatrix                         2.42 ns         2.42 ns    289735099
BM_SetIdentity/SkM44                            1.99 ns         1.99 ns    343745550
BM_SetIdentity/ImpellerMatrix                   1.99 ns         1.99 ns    343413332
BM_SetTranslate/SkMatrix                        3.69 ns         3.68 ns    186232621
BM_SetTranslate/SkM44                           2.21 ns         2.21 ns    317256007
BM_SetTranslate/ImpellerMatrix                  2.20 ns         2.20 ns    317709586
BM_SetScale/SkMatrix                            3.88 ns         3.88 ns    181758697
BM_SetScale/SkM44                               2.17 ns         2.17 ns    318999617
BM_SetScale/ImpellerMatrix                      2.18 ns         2.17 ns    312916290
BM_SetRotate/SkMatrix                           7.03 ns         7.02 ns     99342918
BM_SetRotate/SkM44                              8.68 ns         8.67 ns     80480121
BM_SetRotate/ImpellerMatrix                     3.75 ns         3.74 ns    187162773
BM_IdentityBounds/SkMatrix                      3.27 ns         3.26 ns    215113903
BM_IdentityBounds/SkM44                         6.95 ns         6.94 ns    100771623
BM_IdentityBounds/ImpellerMatrix                6.96 ns         6.95 ns    100532824
BM_TranslateBounds/SkMatrix                     3.28 ns         3.27 ns    215096056
BM_TranslateBounds/SkM44                        6.95 ns         6.94 ns    100710730
BM_TranslateBounds/ImpellerMatrix               6.97 ns         6.94 ns    100625314
BM_ScaleBounds/SkMatrix                         3.77 ns         3.73 ns    188220616
BM_ScaleBounds/SkM44                            7.31 ns         7.29 ns     96010095
BM_ScaleBounds/ImpellerMatrix                   6.95 ns         6.93 ns    100365618
BM_ScaleTranslateBounds/SkMatrix                3.73 ns         3.72 ns    188238331
BM_ScaleTranslateBounds/SkM44                   7.31 ns         7.29 ns     95978501
BM_ScaleTranslateBounds/ImpellerMatrix          6.95 ns         6.94 ns    100665833
BM_RotateBounds/SkMatrix                        9.65 ns         9.64 ns     72754485
BM_RotateBounds/SkM44                           13.3 ns         13.3 ns     53099493
BM_RotateBounds/ImpellerMatrix                  6.97 ns         6.96 ns     98231827
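
The benchmark names above (for example BM_AdapterDispatchOverhead) suggest that each matrix type is driven through a common adapter so that one benchmark body can exercise all three. A minimal sketch of that idea follows. It is an assumption about the general pattern, not the code in this PR: `TransformAdapter`, `Array4x4Adapter`, `Point`, and `NanosPerTransform` are hypothetical names, the stand-in matrix is a plain column-major array rather than SkMatrix, SkM44, or impeller::Matrix, and the timing loop uses std::chrono instead of the Google Benchmark harness that produced the table.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical 2D point type, used only in this sketch.
struct Point {
  float x;
  float y;
};

// Hypothetical adapter interface: one subclass per matrix library would let
// a single benchmark body drive every implementation through virtual calls.
class TransformAdapter {
 public:
  virtual ~TransformAdapter() = default;
  virtual void SetTranslate(float tx, float ty) = 0;
  virtual Point TransformPoint(Point p) const = 0;
};

// Stand-in implementation backed by a column-major 4x4 float array.
class Array4x4Adapter final : public TransformAdapter {
 public:
  void SetTranslate(float tx, float ty) override {
    m_ = {1.0f, 0.0f, 0.0f, 0.0f,   // column 0
          0.0f, 1.0f, 0.0f, 0.0f,   // column 1
          0.0f, 0.0f, 1.0f, 0.0f,   // column 2
          tx,   ty,   0.0f, 1.0f};  // column 3 (translation)
  }

  // Affine-only point transform: z is taken as 0 and w as 1.
  Point TransformPoint(Point p) const override {
    return {m_[0] * p.x + m_[4] * p.y + m_[12],
            m_[1] * p.x + m_[5] * p.y + m_[13]};
  }

 private:
  std::array<float, 16> m_{1.0f, 0.0f, 0.0f, 0.0f, 0.0f, 1.0f, 0.0f, 0.0f,
                           0.0f, 0.0f, 1.0f, 0.0f, 0.0f, 0.0f, 0.0f, 1.0f};
};

// Calls TransformPoint `iterations` times, feeding each result into the next
// call, and returns the average wall-clock nanoseconds per call. The final
// point is written to `sink` so the loop has an observable result.
double NanosPerTransform(const TransformAdapter& adapter,
                         std::size_t iterations, Point* sink) {
  assert(sink != nullptr);
  Point p{1.0f, 2.0f};
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iterations; ++i) {
    p = adapter.TransformPoint(p);
  }
  const auto stop = std::chrono::steady_clock::now();
  *sink = p;
  if (iterations == 0) {
    return 0.0;
  }
  const double total_ns =
      std::chrono::duration<double, std::nano>(stop - start).count();
  return total_ns / static_cast<double>(iterations);
}
```

A real harness such as Google Benchmark chooses the iteration count itself and guards against the optimizer more carefully, so numbers from a hand-rolled loop like this one are not comparable to the table.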

@jonahwilliams (Member) left a comment

LGTM

@flar (Contributor, Author) commented Mar 12, 2024

Newer results based on the new architecture and with many more cases measured:

Results from M1 Max MBP
Benchmark                                                       Time             CPU   Iterations
-------------------------------------------------------------------------------------------------
BM_AdapterDispatchOverhead/SkMatrix                          1.55 ns         1.55 ns    426962043
BM_AdapterDispatchOverhead/SkM44                             1.24 ns         1.24 ns    564261301
BM_AdapterDispatchOverhead/ImpellerMatrix                    1.17 ns         1.17 ns    656260254
BM_SetIdentity/SkMatrix                                      2.15 ns         2.15 ns    313046434
BM_SetIdentity/SkM44                                         2.16 ns         2.16 ns    322724905
BM_SetIdentity/ImpellerMatrix                                1.99 ns         1.99 ns    350408226
BM_Translate/SkMatrix                                        3.42 ns         3.41 ns    204990644
BM_Translate/SkM44                                           5.28 ns         5.27 ns    132683814
BM_Translate/ImpellerMatrix                                  3.41 ns         3.41 ns    205341218
BM_Scale/SkMatrix                                            4.71 ns         4.71 ns    148696042
BM_Scale/SkM44                                               3.44 ns         3.41 ns    205272573
BM_Scale/ImpellerMatrix                                      3.41 ns         3.41 ns    205319536
BM_Rotate/SkMatrix                                           14.8 ns         14.8 ns     47420976
BM_Rotate/SkM44                                              12.5 ns         12.5 ns     56070874
BM_Rotate/ImpellerMatrix                                     11.1 ns         11.0 ns     63456378
BM_Concat/Scale*Translate/SkMatrix                           5.14 ns         5.13 ns    136399065
BM_Concat/Scale*Translate/SkM44                              3.94 ns         3.93 ns    178003362
BM_Concat/Scale*Translate/ImpellerMatrix                     5.34 ns         5.33 ns    131403578
BM_Concat/ScaleTranslate*ScaleTranslate/SkMatrix             5.12 ns         5.12 ns    137152710
BM_Concat/ScaleTranslate*ScaleTranslate/SkM44                3.93 ns         3.92 ns    174519200
BM_Concat/ScaleTranslate*ScaleTranslate/ImpellerMatrix       5.32 ns         5.31 ns    131522086
BM_Concat/ScaleTranslate*Rotate/SkMatrix                     5.99 ns         5.98 ns    116875094
BM_Concat/ScaleTranslate*Rotate/SkM44                        3.91 ns         3.90 ns    179178441
BM_Concat/ScaleTranslate*Rotate/ImpellerMatrix               5.32 ns         5.31 ns    131477621
BM_InvertUnchecked/Identity/SkMatrix                         2.48 ns         2.48 ns    282385593
BM_InvertUnchecked/Identity/SkM44                            20.8 ns         20.7 ns     33772374
BM_InvertUnchecked/Identity/ImpellerMatrix                   15.3 ns         15.3 ns     45699363
BM_InvertUnchecked/Translate/SkMatrix                        3.79 ns         3.78 ns    185148939
BM_InvertUnchecked/Translate/SkM44                           20.8 ns         20.7 ns     33774330
BM_InvertUnchecked/Translate/ImpellerMatrix                  15.3 ns         15.3 ns     45702646
BM_InvertUnchecked/Scale/SkMatrix                            4.04 ns         4.03 ns    173740813
BM_InvertUnchecked/Scale/SkM44                               20.8 ns         20.7 ns     33756414
BM_InvertUnchecked/Scale/ImpellerMatrix                      15.3 ns         15.3 ns     45708614
BM_InvertUnchecked/ScaleTranslate/SkMatrix                   4.04 ns         4.03 ns    173747713
BM_InvertUnchecked/ScaleTranslate/SkM44                      20.8 ns         20.7 ns     33764881
BM_InvertUnchecked/ScaleTranslate/ImpellerMatrix             15.3 ns         15.3 ns     45699363
BM_InvertUnchecked/Rotate/SkMatrix                           9.15 ns         9.13 ns     76158148
BM_InvertUnchecked/Rotate/SkM44                              20.7 ns         20.7 ns     33760321
BM_InvertUnchecked/Rotate/ImpellerMatrix                     15.3 ns         15.3 ns     45697573
BM_InvertAndCheck/Identity/SkMatrix                          2.48 ns         2.48 ns    282280829
BM_InvertAndCheck/Identity/SkM44                             20.8 ns         20.7 ns     33784436
BM_InvertAndCheck/Identity/ImpellerMatrix                    18.5 ns         18.4 ns     37942230
BM_InvertAndCheck/Translate/SkMatrix                         3.79 ns         3.78 ns    185159713
BM_InvertAndCheck/Translate/SkM44                            20.8 ns         20.7 ns     33771234
BM_InvertAndCheck/Translate/ImpellerMatrix                   18.5 ns         18.4 ns     37941408
BM_InvertAndCheck/Scale/SkMatrix                             4.04 ns         4.03 ns    173552572
BM_InvertAndCheck/Scale/SkM44                                20.8 ns         20.7 ns     33770419
BM_InvertAndCheck/Scale/ImpellerMatrix                       18.5 ns         18.4 ns     37891911
BM_InvertAndCheck/ScaleTranslate/SkMatrix                    4.04 ns         4.03 ns    169934672
BM_InvertAndCheck/ScaleTranslate/SkM44                       20.7 ns         20.7 ns     33774818
BM_InvertAndCheck/ScaleTranslate/ImpellerMatrix              18.5 ns         18.5 ns     37840701
BM_InvertAndCheck/Rotate/SkMatrix                            9.07 ns         9.05 ns     76025805
BM_InvertAndCheck/Rotate/SkM44                               21.0 ns         21.0 ns     33775796
BM_InvertAndCheck/Rotate/ImpellerMatrix                      18.7 ns         18.6 ns     36730560
BM_TransformPoint/Identity/SkMatrix                          3.61 ns         3.61 ns    195285801
BM_TransformPoint/Identity/SkM44                             7.41 ns         7.40 ns     94440172
BM_TransformPoint/Identity/ImpellerMatrix                    1.87 ns         1.86 ns    375244579
BM_TransformPoint/Translate/SkMatrix                         3.85 ns         3.84 ns    182194401
BM_TransformPoint/Translate/SkM44                            7.58 ns         7.56 ns     92510606
BM_TransformPoint/Translate/ImpellerMatrix                   1.86 ns         1.86 ns    370764676
BM_TransformPoint/Scale/SkMatrix                             3.85 ns         3.85 ns    182136566
BM_TransformPoint/Scale/SkM44                                7.58 ns         7.56 ns     92465392
BM_TransformPoint/Scale/ImpellerMatrix                       1.86 ns         1.86 ns    375449869
BM_TransformPoint/ScaleTranslate/SkMatrix                    3.92 ns         3.92 ns    178694053
BM_TransformPoint/ScaleTranslate/SkM44                       7.65 ns         7.64 ns     91528393
BM_TransformPoint/ScaleTranslate/ImpellerMatrix              1.87 ns         1.86 ns    374925015
BM_TransformPoint/Rotate/SkMatrix                            4.13 ns         4.12 ns    170018872
BM_TransformPoint/Rotate/SkM44                               7.89 ns         7.87 ns     88906953
BM_TransformPoint/Rotate/ImpellerMatrix                      1.86 ns         1.86 ns    376216658
BM_TransformPoints/Identity/SkMatrix                         13.0 ns         13.0 ns     53923721 items_per_second=7.71264G/s
BM_TransformPoints/Identity/SkM44                            16.9 ns         16.9 ns     41518387 items_per_second=5.92838G/s
BM_TransformPoints/Identity/ImpellerMatrix                   44.8 ns         44.7 ns     15669139 items_per_second=2.23548G/s
BM_TransformPoints/Translate/SkMatrix                        18.3 ns         18.2 ns     38374037 items_per_second=5.48761G/s
BM_TransformPoints/Translate/SkM44                           19.4 ns         19.4 ns     36045500 items_per_second=5.15602G/s
BM_TransformPoints/Translate/ImpellerMatrix                  45.2 ns         44.7 ns     15668438 items_per_second=2.2381G/s
BM_TransformPoints/Scale/SkMatrix                            18.4 ns         18.4 ns     38509358 items_per_second=5.43258G/s
BM_TransformPoints/Scale/SkM44                               19.8 ns         19.7 ns     34722050 items_per_second=5.07025G/s
BM_TransformPoints/Scale/ImpellerMatrix                      44.8 ns         44.7 ns     15640711 items_per_second=2.23469G/s
BM_TransformPoints/ScaleTranslate/SkMatrix                   18.2 ns         18.2 ns     38515291 items_per_second=5.50534G/s
BM_TransformPoints/ScaleTranslate/SkM44                      19.6 ns         19.6 ns     35699532 items_per_second=5.10892G/s
BM_TransformPoints/ScaleTranslate/ImpellerMatrix             45.6 ns         45.5 ns     15531981 items_per_second=2.19797G/s
BM_TransformPoints/Rotate/SkMatrix                           23.1 ns         23.0 ns     29977816 items_per_second=4.33993G/s
BM_TransformPoints/Rotate/SkM44                              27.9 ns         27.9 ns     25105893 items_per_second=3.58652G/s
BM_TransformPoints/Rotate/ImpellerMatrix                     46.2 ns         46.2 ns     15156042 items_per_second=2.16515G/s
BM_TransformRect/Identity/SkMatrix                           3.49 ns         3.49 ns    200817615
BM_TransformRect/Identity/SkM44                              7.31 ns         7.30 ns     95992979
BM_TransformRect/Identity/ImpellerMatrix                     7.54 ns         7.52 ns     92691905
BM_TransformRect/Translate/SkMatrix                          3.50 ns         3.49 ns    200776719
BM_TransformRect/Translate/SkM44                             7.28 ns         7.27 ns     96267569
BM_TransformRect/Translate/ImpellerMatrix                    7.44 ns         7.42 ns     93097486
BM_TransformRect/Scale/SkMatrix                              3.83 ns         3.83 ns    188204422
BM_TransformRect/Scale/SkM44                                 7.70 ns         7.69 ns     90923261
BM_TransformRect/Scale/ImpellerMatrix                        7.54 ns         7.53 ns     93156956
BM_TransformRect/ScaleTranslate/SkMatrix                     3.85 ns         3.85 ns    182029052
BM_TransformRect/ScaleTranslate/SkM44                        7.70 ns         7.69 ns     91061649
BM_TransformRect/ScaleTranslate/ImpellerMatrix               7.54 ns         7.53 ns     92883776
BM_TransformRect/Rotate/SkMatrix                             9.93 ns         9.92 ns     70379345
BM_TransformRect/Rotate/SkM44                                13.4 ns         13.4 ns     52392072
BM_TransformRect/Rotate/ImpellerMatrix                       7.31 ns         7.30 ns     95932464
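
The BM_TransformPoints rows report an items_per_second counter next to the per-iteration time, and the two can be cross-checked. The sketch below assumes the counter is items processed divided by CPU time, which is how Google Benchmark normally reports it; the function names are hypothetical. Multiplying the rate by the CPU time per iteration gives the implied batch size, and inverting the rate gives the cost per point.

```cpp
#include <cassert>
#include <cmath>

// Implied number of items handled in one benchmark iteration:
// rate (items/second) * CPU time per iteration (seconds).
double ImpliedItemsPerIteration(double items_per_second, double cpu_ns) {
  return items_per_second * cpu_ns * 1e-9;
}

// Average cost of a single item in nanoseconds, derived from the rate alone.
double NanosPerItem(double items_per_second) {
  assert(items_per_second > 0.0);
  return 1e9 / items_per_second;
}
```

Applied to the Identity rows above (13.0 ns at 7.71264G/s for SkMatrix, 44.7 ns at 2.23548G/s for ImpellerMatrix), both come out at roughly 100 points per iteration, so the Time column for BM_TransformPoints appears to be the cost of a whole batch rather than of one point.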

@flar (Contributor, Author) commented Mar 12, 2024

I'd like to add Perspective matrices for completeness. They aren't used a lot, but we might as well track them.

@flar requested a review from @jonahwilliams on Mar 12, 2024, 10:57

@flar (Contributor, Author) commented Mar 12, 2024

New perspective cases added. Here is the output from the new tests:

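Perspective results on M1 Max MBP
Benchmark                                                       Time             CPU   Iterations
-------------------------------------------------------------------------------------------------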
BM_SetPerspective/SkMatrix                                   9.37 ns         9.34 ns     75038055
BM_SetPerspective/SkM44                                      8.41 ns         8.38 ns     83403829
BM_SetPerspective/ImpellerMatrix                             5.41 ns         5.39 ns    129615228
BM_Concat/ScaleTranslate*Perspective/SkMatrix                5.16 ns         5.15 ns    135184720
BM_Concat/ScaleTranslate*Perspective/SkM44                   3.91 ns         3.90 ns    178776657
BM_Concat/ScaleTranslate*Perspective/ImpellerMatrix          5.32 ns         5.31 ns    130860689
BM_Concat/Perspective*ScaleTranslate/SkMatrix                5.20 ns         5.19 ns    134264232
BM_Concat/Perspective*ScaleTranslate/SkM44                   3.95 ns         3.94 ns    178492200
BM_Concat/Perspective*ScaleTranslate/ImpellerMatrix          5.35 ns         5.34 ns    131591315
BM_InvertUnchecked/Perspective/SkMatrix                      4.04 ns         4.03 ns    173736933
BM_InvertUnchecked/Perspective/SkM44                         20.8 ns         20.7 ns     33769930
BM_InvertUnchecked/Perspective/ImpellerMatrix                15.3 ns         15.3 ns     45693099
BM_InvertAndCheck/Perspective/SkMatrix                       4.03 ns         4.03 ns    173744695
BM_InvertAndCheck/Perspective/SkM44                          20.7 ns         20.7 ns     33770093
BM_InvertAndCheck/Perspective/ImpellerMatrix                 18.5 ns         18.4 ns     37937089
BM_TransformPoint/Perspective/SkMatrix                       3.96 ns         3.96 ns    182023845
BM_TransformPoint/Perspective/SkM44                          7.57 ns         7.56 ns     92460506
BM_TransformPoint/Perspective/ImpellerMatrix                 1.86 ns         1.86 ns    376174201
BM_TransformPoints/Perspective/SkMatrix                      18.2 ns         18.1 ns     38472108 items_per_second=5.51437G/s
BM_TransformPoints/Perspective/SkM44                         19.7 ns         19.6 ns     35820101 items_per_second=5.09179G/s
BM_TransformPoints/Perspective/ImpellerMatrix                44.8 ns         44.7 ns     15651972 items_per_second=2.23837G/s
BM_TransformRect/Perspective/SkMatrix                        3.73 ns         3.72 ns    188170020
BM_TransformRect/Perspective/SkM44                           7.45 ns         7.44 ns     93641727
BM_TransformRect/Perspective/ImpellerMatrix                  7.31 ns         7.30 ns     95691164
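
For context on what sets the perspective cases apart: with an affine matrix the homogeneous w coordinate of a transformed 2D point stays 1, while a perspective matrix can change w, so the result has to be divided by it. The sketch below illustrates that step under stated assumptions. `Matrix4`, `Point`, and `TransformPointPerspective` are hypothetical names, the layout is column-major, and none of it is taken from SkMatrix, SkM44, or impeller::Matrix.

```cpp
#include <array>
#include <cassert>

// Hypothetical 2D point type, used only in this sketch.
struct Point {
  float x;
  float y;
};

// Hypothetical column-major 4x4 matrix: element (row r, column c) is m[c * 4 + r].
using Matrix4 = std::array<float, 16>;

// Transforms (x, y, 0, 1) by `m` and applies the homogeneous divide.
// When w is zero the point has no finite image; this sketch returns the
// undivided coordinates in that case, which is an arbitrary choice.
Point TransformPointPerspective(const Matrix4& m, Point p) {
  const float x = m[0] * p.x + m[4] * p.y + m[12];
  const float y = m[1] * p.x + m[5] * p.y + m[13];
  const float w = m[3] * p.x + m[7] * p.y + m[15];
  if (w == 0.0f) {
    return {x, y};
  }
  return {x / w, y / w};
}
```

As a usage example, an identity matrix with its element at index 3 set to 0.5 maps the point (2, 4) to w = 2, so the transformed point is (1, 2).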

@jonahwilliams (Member) commented

I think it's fine to add more benchmark cases. These engine benchmarks are very cheap since they run on host machines.

@jonahwilliams (Member) left a comment

LGTM

@flar added the autosubmit label (Merge PR when tree becomes green via auto submit App) on Mar 13, 2024
@auto-submit bot merged commit 323944a into flutter:main on Mar 13, 2024
28 checks passed
engine-flutter-autoroll added a commit to engine-flutter-autoroll/flutter that referenced this pull request Mar 13, 2024
auto-submit bot pushed a commit to flutter/flutter that referenced this pull request Mar 13, 2024
…145094)

flutter/engine@2871c26...6ceccf8

2024-03-13 skia-flutter-autoroll@skia.org Roll Fuchsia Linux SDK from UR-nKoLidl7cVLrrN... to mYZPzdM3hCE1TA91s... (flutter/engine#51370)
2024-03-13 flar@google.com Add a comparative benchmark for SkM44 vs SkMatrix vs impeller::Matrix (flutter/engine#51332)

Also rolling transitive DEPS:
  fuchsia/sdk/core/linux-amd64 from UR-nKoLidl7c to mYZPzdM3hCE1
