Cache HashSet in try_to_allocate_bundle_to_reg #90
cfallin merged 1 commit into bytecodealliance:main
Keep `conflict_set` allocated in `Env` instead of allocating a new one on every call. This improves register allocation performance by about 2%.
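For readers unfamiliar with the pattern, here is a minimal sketch of the idea, not the actual regalloc2 code. The `Env` struct below is a stripped-down stand-in, and `count_conflicts`, its parameters, and the `u32` element type are hypothetical names chosen for illustration. The only point it demonstrates is that the set lives in the long-lived struct and is cleared, rather than recreated, on each call.

```rust
use std::collections::HashSet;

// Hypothetical, simplified stand-in for the allocator environment.
// The real `Env` in regalloc2 carries far more state.
struct Env {
    // Scratch set owned by the environment so its backing storage
    // survives across calls instead of being reallocated each time.
    conflict_set: HashSet<u32>,
}

impl Env {
    fn new() -> Self {
        Env {
            conflict_set: HashSet::new(),
        }
    }

    // Hypothetical helper standing in for the hot function: it records
    // every candidate that also appears in `occupied` and returns the
    // number of distinct conflicts found.
    fn count_conflicts(&mut self, candidates: &[u32], occupied: &[u32]) -> usize {
        // `clear` removes the elements but keeps the allocated memory,
        // so later calls reuse the same backing storage.
        self.conflict_set.clear();
        for c in candidates {
            if occupied.contains(c) {
                self.conflict_set.insert(*c);
            }
        }
        self.conflict_set.len()
    }
}
```

Calling `count_conflicts` repeatedly on the same `Env` reuses one allocation. The trade-off is that the function needs `&mut self` and must remember to clear the set before use, since stale entries from a previous call would otherwise leak into the result.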
cfallin left a comment:
An easy win, thanks for finding this!
While double-checking my benchmarking setup before doing some perf experiments on regalloc2, I measured the effect of this PR on wasmtime. Although the effect disappears into the noise on larger inputs, I can confirm that this PR is a win across the board. I figured I might as well share the numbers.

For a small program like the bz2 benchmark from Sightglass, this PR is "1.02 ± 0.01 times faster" by CPU time, according to Hyperfine. Sightglass says it's "1.02x to 1.03x faster" by CPU cycles, and 1.02x faster by instructions retired.

For a slightly larger benchmark (pulldown-cmark), this PR is "1.01 ± 0.01 times faster" according to Hyperfine. Sightglass says "1.00x to 1.01x faster" by CPU cycles, and 1.01x faster by instructions retired.

On our largest benchmark (spidermonkey), this PR is "1.00 ± 0.01 times faster" according to Hyperfine. Sightglass reports "No difference in performance" by CPU cycles, and 1.01x faster by instructions retired.