[SofaHelper] Reduce AdvancedTimer overhead #2645

Merged 1 commit on Feb 1, 2022

Conversation

alxbilger
Contributor

This is another case where a std::unordered_map is preferable to a std::map. The difference shows in the benchmarks:

Before

-----------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations
-----------------------------------------------------------------------------------
BM_AdvancedTimer_begin_end                     78.4 ns         78.1 ns     11200000
BM_AdvancedTimer_largeNumberTimers/128        0.012 ms        0.012 ms        64000
BM_AdvancedTimer_largeNumberTimers/512        0.073 ms        0.073 ms         8960
BM_AdvancedTimer_largeNumberTimers/4096       0.760 ms        0.767 ms          896
BM_AdvancedTimer_largeNumberTimers/16384       3.39 ms         3.37 ms          204
BM_AdvancedTimer_deepTreeEnabled/1/2          0.000 ms        0.000 ms      1659259 nbTimers=2
BM_AdvancedTimer_deepTreeEnabled/2/2          0.001 ms        0.001 ms       560000 nbTimers=6
BM_AdvancedTimer_deepTreeEnabled/4/2          0.004 ms        0.004 ms       165926 nbTimers=20
BM_AdvancedTimer_deepTreeEnabled/8/2          0.016 ms        0.016 ms        44800 nbTimers=72
BM_AdvancedTimer_deepTreeEnabled/16/2         0.072 ms        0.071 ms         8960 nbTimers=272
BM_AdvancedTimer_deepTreeEnabled/32/2         0.302 ms        0.298 ms         2358 nbTimers=1056
BM_AdvancedTimer_deepTreeEnabled/64/2          1.35 ms         1.35 ms          498 nbTimers=4.16k
BM_AdvancedTimer_deepTreeEnabled/1/3          0.001 ms        0.001 ms      1120000 nbTimers=3
BM_AdvancedTimer_deepTreeEnabled/2/3          0.003 ms        0.003 ms       248889 nbTimers=14
BM_AdvancedTimer_deepTreeEnabled/4/3          0.019 ms        0.019 ms        37333 nbTimers=84
BM_AdvancedTimer_deepTreeEnabled/8/3          0.153 ms        0.150 ms         4480 nbTimers=584
BM_AdvancedTimer_deepTreeEnabled/16/3          1.15 ms         1.15 ms          640 nbTimers=4.368k
BM_AdvancedTimer_deepTreeEnabled/32/3          9.69 ms         9.58 ms           75 nbTimers=33.824k
BM_AdvancedTimer_deepTreeEnabled/64/3           103 ms          103 ms            7 nbTimers=266.304k
BM_AdvancedTimer_deepTreeEnabled/1/4          0.001 ms        0.001 ms       640000 nbTimers=4
BM_AdvancedTimer_deepTreeEnabled/2/4          0.008 ms        0.008 ms        89600 nbTimers=30
BM_AdvancedTimer_deepTreeEnabled/4/4          0.115 ms        0.114 ms         5600 nbTimers=340
BM_AdvancedTimer_deepTreeEnabled/8/4           1.74 ms         1.77 ms          407 nbTimers=4.68k
BM_AdvancedTimer_deepTreeEnabled/16/4          24.7 ms         25.0 ms           30 nbTimers=69.904k
BM_AdvancedTimer_deepTreeEnabled/32/4           759 ms          750 ms            1 nbTimers=1082.4k
BM_AdvancedTimer_deepTreeEnabled/64/4         16211 ms        16203 ms            1 nbTimers=17.0435M
BM_AdvancedTimer_deepTreeDisabled/1/2         0.000 ms        0.000 ms      1120000 nbTimers=2
BM_AdvancedTimer_deepTreeDisabled/2/2         0.002 ms        0.002 ms       407273 nbTimers=6
BM_AdvancedTimer_deepTreeDisabled/4/2         0.006 ms        0.006 ms       112000 nbTimers=20
BM_AdvancedTimer_deepTreeDisabled/8/2         0.025 ms        0.025 ms        28000 nbTimers=72
BM_AdvancedTimer_deepTreeDisabled/16/2        0.142 ms        0.141 ms         4978 nbTimers=272
BM_AdvancedTimer_deepTreeDisabled/32/2        0.702 ms        0.698 ms          896 nbTimers=1056
BM_AdvancedTimer_deepTreeDisabled/64/2         5.51 ms         5.58 ms          112 nbTimers=4.16k
BM_AdvancedTimer_deepTreeDisabled/1/3         0.001 ms        0.001 ms       896000 nbTimers=3
BM_AdvancedTimer_deepTreeDisabled/2/3         0.004 ms        0.004 ms       172308 nbTimers=14
BM_AdvancedTimer_deepTreeDisabled/4/3         0.030 ms        0.030 ms        23579 nbTimers=84
BM_AdvancedTimer_deepTreeDisabled/8/3         0.371 ms        0.368 ms         1867 nbTimers=584
BM_AdvancedTimer_deepTreeDisabled/16/3         5.12 ms         5.16 ms          100 nbTimers=4.368k
BM_AdvancedTimer_deepTreeDisabled/32/3         43.7 ms         43.2 ms           17 nbTimers=33.824k
BM_AdvancedTimer_deepTreeDisabled/64/3          235 ms          234 ms            3 nbTimers=266.304k
BM_AdvancedTimer_deepTreeDisabled/1/4         0.001 ms        0.001 ms       640000 nbTimers=4
BM_AdvancedTimer_deepTreeDisabled/2/4         0.010 ms        0.010 ms        74667 nbTimers=30
BM_AdvancedTimer_deepTreeDisabled/4/4         0.202 ms        0.200 ms         3446 nbTimers=340
BM_AdvancedTimer_deepTreeDisabled/8/4          5.57 ms         5.58 ms          112 nbTimers=4.68k
BM_AdvancedTimer_deepTreeDisabled/16/4         75.2 ms         74.7 ms            9 nbTimers=69.904k
BM_AdvancedTimer_deepTreeDisabled/32/4          624 ms          625 ms            1 nbTimers=1082.4k
BM_AdvancedTimer_deepTreeDisabled/64/4         7576 ms         7562 ms            1 nbTimers=17.0435M

After

-----------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations
-----------------------------------------------------------------------------------
BM_AdvancedTimer_begin_end                     71.3 ns         71.5 ns      8960000
BM_AdvancedTimer_largeNumberTimers/128        0.004 ms        0.004 ms       160000
BM_AdvancedTimer_largeNumberTimers/512        0.019 ms        0.019 ms        37333
BM_AdvancedTimer_largeNumberTimers/4096       0.169 ms        0.169 ms         4073
BM_AdvancedTimer_largeNumberTimers/16384      0.693 ms        0.698 ms         1120
BM_AdvancedTimer_deepTreeEnabled/1/2          0.000 ms        0.000 ms      2800000 nbTimers=2
BM_AdvancedTimer_deepTreeEnabled/2/2          0.001 ms        0.001 ms      1120000 nbTimers=6
BM_AdvancedTimer_deepTreeEnabled/4/2          0.002 ms        0.001 ms       448000 nbTimers=20
BM_AdvancedTimer_deepTreeEnabled/8/2          0.006 ms        0.006 ms       112000 nbTimers=72
BM_AdvancedTimer_deepTreeEnabled/16/2         0.024 ms        0.024 ms        28000 nbTimers=272
BM_AdvancedTimer_deepTreeEnabled/32/2         0.112 ms        0.109 ms         5600 nbTimers=1056
BM_AdvancedTimer_deepTreeEnabled/64/2         0.661 ms        0.663 ms          896 nbTimers=4.16k
BM_AdvancedTimer_deepTreeEnabled/1/3          0.000 ms        0.000 ms      2240000 nbTimers=3
BM_AdvancedTimer_deepTreeEnabled/2/3          0.001 ms        0.001 ms       640000 nbTimers=14
BM_AdvancedTimer_deepTreeEnabled/4/3          0.006 ms        0.006 ms       112000 nbTimers=84
BM_AdvancedTimer_deepTreeEnabled/8/3          0.044 ms        0.043 ms        15448 nbTimers=584
BM_AdvancedTimer_deepTreeEnabled/16/3         0.399 ms        0.399 ms         1723 nbTimers=4.368k
BM_AdvancedTimer_deepTreeEnabled/32/3          4.40 ms         4.45 ms          172 nbTimers=33.824k
BM_AdvancedTimer_deepTreeEnabled/64/3          64.4 ms         65.3 ms           11 nbTimers=266.304k
BM_AdvancedTimer_deepTreeEnabled/1/4          0.000 ms        0.000 ms      1866667 nbTimers=4
BM_AdvancedTimer_deepTreeEnabled/2/4          0.002 ms        0.002 ms       320000 nbTimers=30
BM_AdvancedTimer_deepTreeEnabled/4/4          0.026 ms        0.025 ms        26353 nbTimers=340
BM_AdvancedTimer_deepTreeEnabled/8/4          0.447 ms        0.446 ms         1120 nbTimers=4.68k
BM_AdvancedTimer_deepTreeEnabled/16/4          9.69 ms         9.77 ms           64 nbTimers=69.904k
BM_AdvancedTimer_deepTreeEnabled/32/4           230 ms          229 ms            3 nbTimers=1082.4k
BM_AdvancedTimer_deepTreeEnabled/64/4         10830 ms        10844 ms            1 nbTimers=17.0435M
BM_AdvancedTimer_deepTreeDisabled/1/2         0.000 ms        0.000 ms      4977778 nbTimers=2
BM_AdvancedTimer_deepTreeDisabled/2/2         0.000 ms        0.000 ms      2240000 nbTimers=6
BM_AdvancedTimer_deepTreeDisabled/4/2         0.001 ms        0.001 ms       746667 nbTimers=20
BM_AdvancedTimer_deepTreeDisabled/8/2         0.003 ms        0.003 ms       213333 nbTimers=72
BM_AdvancedTimer_deepTreeDisabled/16/2        0.016 ms        0.016 ms        40727 nbTimers=272
BM_AdvancedTimer_deepTreeDisabled/32/2        0.078 ms        0.078 ms         8960 nbTimers=1056
BM_AdvancedTimer_deepTreeDisabled/64/2        0.502 ms        0.516 ms         1000 nbTimers=4.16k
BM_AdvancedTimer_deepTreeDisabled/1/3         0.000 ms        0.000 ms      3733333 nbTimers=3
BM_AdvancedTimer_deepTreeDisabled/2/3         0.001 ms        0.001 ms      1120000 nbTimers=14
BM_AdvancedTimer_deepTreeDisabled/4/3         0.004 ms        0.004 ms       194783 nbTimers=84
BM_AdvancedTimer_deepTreeDisabled/8/3         0.029 ms        0.029 ms        24889 nbTimers=584
BM_AdvancedTimer_deepTreeDisabled/16/3        0.256 ms        0.255 ms         2635 nbTimers=4.368k
BM_AdvancedTimer_deepTreeDisabled/32/3         2.84 ms         2.85 ms          236 nbTimers=33.824k
BM_AdvancedTimer_deepTreeDisabled/64/3         63.2 ms         62.5 ms           11 nbTimers=266.304k
BM_AdvancedTimer_deepTreeDisabled/1/4         0.000 ms        0.000 ms      3200000 nbTimers=4
BM_AdvancedTimer_deepTreeDisabled/2/4         0.001 ms        0.001 ms       560000 nbTimers=30
BM_AdvancedTimer_deepTreeDisabled/4/4         0.016 ms        0.015 ms        44800 nbTimers=340
BM_AdvancedTimer_deepTreeDisabled/8/4         0.248 ms        0.246 ms         2800 nbTimers=4.68k
BM_AdvancedTimer_deepTreeDisabled/16/4         7.14 ms         7.08 ms           75 nbTimers=69.904k
BM_AdvancedTimer_deepTreeDisabled/32/4          203 ms          203 ms            3 nbTimers=1082.4k
BM_AdvancedTimer_deepTreeDisabled/64/4         4035 ms         4031 ms            1 nbTimers=17.0435M
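
To illustrate the kind of change described at the top of this PR, here is a minimal sketch of replacing a std::map with a std::unordered_map for per-timer lookups. It is not the code from the commit: `TimerRecord`, `recordCalls`, `sameResultWithBothContainers`, and the unsigned integer key are hypothetical names and assumptions chosen for the example. It only shows that, when the ordering of keys is not needed, the two containers are interchangeable for this access pattern, with std::unordered_map offering average constant-time lookup instead of logarithmic.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <unordered_map>

// Hypothetical per-timer record; NOT the actual SOFA data structure.
struct TimerRecord
{
    std::size_t nbCalls = 0;
};

// Counts one call for every timer id in [0, nbTimers), repeated nbSteps times.
// Works with any associative container exposing operator[] and size().
template <class Container>
std::size_t recordCalls(Container& timers, unsigned int nbTimers, unsigned int nbSteps)
{
    for (unsigned int step = 0; step < nbSteps; ++step)
    {
        for (unsigned int id = 0; id < nbTimers; ++id)
        {
            // Lookup by id; a default record is inserted on first use.
            ++timers[id].nbCalls;
        }
    }
    return timers.size();
}

// Runs the same workload on both containers and checks they agree.
inline bool sameResultWithBothContainers(unsigned int nbTimers, unsigned int nbSteps)
{
    std::map<unsigned int, TimerRecord> ordered;             // O(log n) lookup, keys kept sorted
    std::unordered_map<unsigned int, TimerRecord> unordered; // average O(1) lookup, no ordering

    const std::size_t nbOrdered = recordCalls(ordered, nbTimers, nbSteps);
    const std::size_t nbUnordered = recordCalls(unordered, nbTimers, nbSteps);

    if (nbOrdered != nbTimers || nbUnordered != nbTimers)
    {
        return false;
    }
    for (unsigned int id = 0; id < nbTimers; ++id)
    {
        if (ordered.at(id).nbCalls != unordered.at(id).nbCalls)
        {
            return false;
        }
    }
    return true;
}
```

Calling `sameResultWithBothContainers(128, 10)` exercises both containers with 128 hypothetical timers over 10 steps. The trade-off is that a std::unordered_map no longer iterates in key order, so this swap only applies where the surrounding code does not rely on sorted traversal.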

By submitting this pull request, I acknowledge that
I have read, understand, and agree to the SOFA Developer Certificate of Origin (DCO).


Reviewers will merge this pull-request only if

  • it builds with SUCCESS for all platforms on the CI.
  • it does not generate new warnings.
  • it does not generate new unit test failures.
  • it does not generate new scene test failures.
  • it does not break API compatibility.
  • it is more than 1 week old (or has fast-merge label).

@alxbilger added the labels enhancement (About a possible enhancement), pr: fast merge (Minor change that can be merged without waiting for the 7 review days), and pr: status to review (To notify reviewers to review this pull-request) on Jan 28, 2022
@fredroy
Contributor

fredroy commented Jan 28, 2022

[ci-build][with-all-tests]

@fredroy added the label pr: status ready (Approved a pull-request, ready to be squashed) and removed the label pr: status to review (To notify reviewers to review this pull-request) on Jan 31, 2022
@fredroy merged commit 6633adc into sofa-framework:master on Feb 1, 2022
@fredroy added the label pr: highlighted in next release (Highlight this contribution in the notes of the upcoming release) on Feb 2, 2022
@guparan added this to the v22.06 milestone on Jun 23, 2022