Optimize divrem with non-zero immediate rhs values #825

Merged: 12 commits merged into master from rf-opt-unsigned-div on Dec 3, 2023

Robbepop (Member) commented Dec 2, 2023

Closes #807.

paritytech-cicd-pr commented Dec 2, 2023

BENCHMARKS

| Benchmark | Native: master | Native: PR | Native: diff | Wasmtime: master | Wasmtime: PR | Wasmtime: diff | Wasmtime overhead |
|---|---|---|---|---|---|---|---|
| execute/br_table | 1.51ms | 1.49ms | ⚪ -1.35% | 1.34ms | 1.29ms | 🟢 -3.45% | 🟢 -13% |
| execute/call/host/1 | 44.99µs | 45.02µs | ⚪ -0.37% | 66.55µs | 65.10µs | 🟢 -2.25% | 🟢 45% |
| execute/call/rec | 168.54µs | 167.69µs | ⚪ -0.33% | 351.45µs | 355.22µs | ⚪ 0.87% | 🔴 112% |
| execute/count_until | 7.47ms | 6.59ms | 🟢 -10.25% | 7.50ms | 7.49ms | ⚪ -0.17% | 🟢 14% |
| execute/divrem | 6.43ms | 6.21ms | 🟢 -3.33% | 6.81ms | 6.97ms | 🔴 2.34% | 🟢 12% |
| execute/factorial/iter | 231.80µs | 237.10µs | 🔴 2.95% | 323.14µs | 316.27µs | 🟢 -2.18% | 🟢 33% |
| execute/factorial/rec | 697.94µs | 681.03µs | 🟢 -2.39% | 1.38ms | 1.25ms | 🟢 -9.34% | 🟡 84% |
| execute/fibonacci/iter | 1.36ms | 1.36ms | ⚪ 0.20% | 1.19ms | 1.27ms | 🔴 6.65% | 🟢 -7% |
| execute/fibonacci/rec | 6.25ms | 6.07ms | 🟢 -2.83% | 13.28ms | 13.22ms | ⚪ -0.68% | 🔴 118% |
| execute/fibonacci/tail | 1.63ms | 1.50ms | 🟢 -8.07% | 3.79ms | 4.53ms | 🔴 19.33% | 🔴 202% |
| execute/fuse | 8.25ms | 7.35ms | 🟢 -11.11% | 11.50ms | 11.64ms | ⚪ 1.16% | 🟡 58% |
| execute/global/bump | 1.33ms | 1.32ms | ⚪ -0.54% | 1.59ms | 1.54ms | 🟢 -3.35% | 🟢 17% |
| execute/global/get_const | 735.74µs | 711.99µs | 🟢 -3.23% | 761.80µs | 750.85µs | 🟢 -1.46% | 🟢 5% |
| execute/is_even/rec | 1.10ms | 1.08ms | 🟢 -1.95% | 2.21ms | 2.27ms | 🔴 2.75% | 🔴 111% |
| execute/memory/fill_bytes | 1.13ms | 1.08ms | 🟢 -4.68% | 1.23ms | 1.31ms | 🔴 6.64% | 🟢 22% |
| execute/memory/sum_bytes | 1.13ms | 1.10ms | 🟢 -3.39% | 1.24ms | 1.23ms | ⚪ -0.95% | 🟢 12% |
| execute/memory/vec_add | 3.01ms | 2.96ms | ⚪ -3.83% | 3.41ms | 3.62ms | 🔴 6.50% | 🟢 23% |
| execute/recursive_scan | 186.81µs | 187.47µs | ⚪ 0.37% | 373.02µs | 390.69µs | 🔴 4.75% | 🔴 108% |
| execute/recursive_trap | 16.11µs | 15.32µs | 🟢 -4.86% | 35.34µs | 35.33µs | ⚪ -0.17% | 🔴 131% |
| execute/regex_redux | 592.56µs | 604.74µs | 🔴 2.33% | 1.05ms | 1.06ms | ⚪ 0.73% | 🟡 76% |
| execute/rev_complement | 440.62µs | 442.82µs | ⚪ 0.55% | 635.12µs | 640.62µs | ⚪ 0.77% | 🟢 45% |
| execute/tiny_keccak | 368.83µs | 349.83µs | 🟢 -4.78% | 418.06µs | 381.67µs | 🟢 -8.66% | 🟢 9% |
| execute/trunc_f2i | 620.25µs | 616.44µs | ⚪ -0.60% | 943.61µs | 953.94µs | ⚪ 1.00% | 🟡 55% |
| instantiate/wasm_kernel | 54.59µs | 56.61µs | 🔴 3.62% | 54.07µs | 55.94µs | 🔴 4.46% | 🟢 -1% |
| overhead/call/typed/0 | 1.21ms | 1.23ms | 🔴 1.91% | 847.77µs | 777.94µs | 🟢 -8.23% | 🟢 -37% |
| overhead/call/typed/16 | 1.73ms | 1.63ms | 🟢 -5.83% | 2.07ms | 1.94ms | 🟢 -6.61% | 🟢 19% |
| overhead/call/untyped/0 | 1.61ms | 1.62ms | ⚪ 0.80% | 1.38ms | 1.23ms | 🟢 -11.06% | 🟢 -24% |
| overhead/call/untyped/16 | 2.53ms | 2.46ms | 🟢 -2.63% | 3.95ms | 3.97ms | ⚪ 0.44% | 🟡 61% |
| translate/bz2/default | 1.38ms | 1.38ms | ⚪ -1.08% | 2.42ms | 2.46ms | ⚪ 1.40% | 🟡 79% |
| translate/bz2/fuel | 1.43ms | 1.42ms | ⚪ -0.90% | 2.58ms | 2.61ms | ⚪ 0.61% | 🟡 84% |
| translate/erc1155/default | 282.58µs | 279.91µs | ⚪ -0.59% | 476.69µs | 470.62µs | ⚪ -1.24% | 🟡 68% |
| translate/erc1155/fuel | 301.19µs | 299.36µs | ⚪ -0.30% | 503.03µs | 501.65µs | ⚪ -0.36% | 🟡 68% |
| translate/erc20/default | 136.00µs | 134.27µs | ⚪ -1.15% | 222.73µs | 224.29µs | ⚪ 1.01% | 🟡 67% |
| translate/erc20/fuel | 143.84µs | 142.60µs | ⚪ -1.15% | 234.81µs | 235.66µs | ⚪ 0.60% | 🟡 65% |
| translate/erc721/default | 192.09µs | 191.18µs | ⚪ -0.51% | 321.57µs | 324.51µs | ⚪ 0.91% | 🟡 70% |
| translate/erc721/fuel | 202.45µs | 201.59µs | ⚪ -0.48% | 339.60µs | 339.35µs | ⚪ -0.63% | 🟡 68% |
| translate/pulldown_cmark/default | 3.84ms | 3.81ms | ⚪ -0.66% | 6.34ms | 6.40ms | ⚪ 0.93% | 🟡 68% |
| translate/pulldown_cmark/fuel | 3.94ms | 3.93ms | ⚪ -0.20% | 6.64ms | 6.70ms | ⚪ 0.96% | 🟡 70% |
| translate/spidermonkey/default | 0.00ns | 0.00ns | ⚪ -0.27% | 0.00ns | 0.00ns | ⚪ 1.26% | 🟢 0% |
| translate/spidermonkey/fuel | 0.00ns | 0.00ns | ⚪ -0.58% | 0.00ns | 0.00ns | ⚪ 0.73% | 🟢 0% |
| translate/wasm_kernel/default | 5.07ms | 5.04ms | ⚪ -0.57% | 8.47ms | 8.58ms | 🔴 1.19% | 🟡 70% |
| translate/wasm_kernel/fuel | 5.21ms | 5.21ms | ⚪ 0.19% | 8.94ms | 8.91ms | ⚪ -0.28% | 🟡 71% |

codecov-commenter commented Dec 2, 2023

Codecov Report

Attention: 19 lines in your changes are missing coverage. Please review.

Comparison is base (42371ac) 80.88% compared to head (4633081) 81.00%.

| Files | Patch % | Lines |
|---|---|---|
| ...rates/wasmi/src/engine/translator/relink_result.rs | 0.00% | 14 Missing ⚠️ |
| crates/wasmi/src/engine/executor/instrs/binary.rs | 86.66% | 4 Missing ⚠️ |
| crates/wasmi/src/engine/executor/instrs.rs | 90.90% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #825      +/-   ##
==========================================
+ Coverage   80.88%   81.00%   +0.11%     
==========================================
  Files         255      255              
  Lines       22344    22390      +46     
==========================================
+ Hits        18074    18138      +64     
+ Misses       4270     4252      -18     

Robbepop (Member, Author) commented Dec 3, 2023

Locally and on CI we can see a roughly 3-4% performance improvement for the new divrem benchmark, which exclusively tests the new optimization. This is far less than anticipated but still significant. The changes introduced in this PR are not complex, and we may get more performance out of this if we succeed in optimizing instruction dispatch, since instruction dispatch is clearly the bottleneck in the divrem benchmark.
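
For readers unfamiliar with the idea behind the PR title, here is a minimal sketch of why a non-zero immediate right-hand side can make `div`/`rem` cheaper. It is not wasmi's actual code: the `Trap` enum and all function names are hypothetical and exist only for illustration. The assumption is that the translator can prove at translation time that the constant divisor is non-zero, so the executor can skip the division-by-zero check for that instruction.

```rust
use std::num::{NonZeroI32, NonZeroU32};

/// Hypothetical trap type for this sketch (not wasmi's real type).
#[derive(Debug, PartialEq, Eq)]
enum Trap {
    DivisionByZero,
    IntegerOverflow,
}

/// Generic path: the divisor comes from a register, so it has to be
/// checked for zero every time the instruction executes.
fn i32_div_u(lhs: u32, rhs: u32) -> Result<u32, Trap> {
    lhs.checked_div(rhs).ok_or(Trap::DivisionByZero)
}

/// Immediate path: the divisor was proven non-zero at translation time,
/// so unsigned division cannot trap and needs no run-time check.
fn i32_div_u_imm(lhs: u32, rhs: NonZeroU32) -> u32 {
    lhs / rhs
}

/// Same reasoning for the unsigned remainder.
fn i32_rem_u_imm(lhs: u32, rhs: NonZeroU32) -> u32 {
    lhs % rhs
}

/// Signed division with a non-zero immediate can still overflow
/// (i32::MIN / -1), so only the zero check disappears here.
fn i32_div_s_imm(lhs: i32, rhs: NonZeroI32) -> Result<i32, Trap> {
    // `checked_div` returns `None` for a zero divisor or on overflow;
    // the divisor is non-zero, so `None` can only mean overflow.
    lhs.checked_div(rhs.get()).ok_or(Trap::IntegerOverflow)
}

/// Translation-time decision: use the immediate variant only when the
/// constant divisor is non-zero, otherwise fall back to the generic path.
fn non_zero_divisor(imm: u32) -> Option<NonZeroU32> {
    NonZeroU32::new(imm)
}
```

In this sketch a translator would call `non_zero_divisor` once per instruction and, on `Some`, emit the immediate variant. The saving per executed instruction is a single branch, which fits the observation above that instruction dispatch, not the division itself, dominates the divrem benchmark.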

Robbepop merged commit 63b1a63 into master on Dec 3, 2023 (18 checks passed)
Robbepop deleted the rf-opt-unsigned-div branch on December 3, 2023 13:28
Successfully merging this pull request may close these issues.

Optimize {i32,i64}.{rem,div}_{u,s} with immediate non-zero rhs value
3 participants