Add compression-oriented function reordering pass by brendandahl · Pull Request #8696 · WebAssembly/binaryen

brendandahl · 2026-05-13T00:26:01Z

Implement the --reorder-functions-by-similarity optimization pass in wasm-opt.

Gzip and Brotli compression algorithms rely on finding repetitive byte patterns inside a sliding window (e.g., 32KB for Gzip). If structurally similar functions are placed far apart in the Wasm binary, the compressor cannot detect matches across them. While the existing --reorder-functions pass sorts functions strictly by call frequency to shrink LEB128 indexes, it scatters mutually compressible functions and ultimately increases gzipped delivery sizes.

This new pass traverses defined function bodies in post-order and extracts a similarity sorting key based on signature type IDs, local variable types, and structural opcode sequences. By sorting defined functions lexicographically by this key, structurally similar functions are physically grouped together in the output binary, which places compressible bytes next to each other.
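A minimal sketch of the key-and-sort idea described above, assuming a simplified stand-in for a function record. `FuncInfo`, `makeKey`, and `orderBySimilarity` are hypothetical names used only for illustration; they are not the pass's actual code and not Binaryen types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a defined function; not a Binaryen type.
struct FuncInfo {
  std::string name;
  uint32_t signatureId;             // ID of the function's signature type
  std::vector<uint32_t> localTypes; // type IDs of the local variables
  std::vector<uint32_t> opcodes;    // opcode IDs collected in post-order
};

// Concatenate signature, local types, and opcodes into one sortable key.
std::vector<uint32_t> makeKey(const FuncInfo& f) {
  std::vector<uint32_t> key;
  key.reserve(1 + f.localTypes.size() + f.opcodes.size());
  key.push_back(f.signatureId);
  key.insert(key.end(), f.localTypes.begin(), f.localTypes.end());
  key.insert(key.end(), f.opcodes.begin(), f.opcodes.end());
  return key;
}

// Return function names ordered lexicographically by key, so functions
// whose keys share a long prefix end up next to each other.
std::vector<std::string> orderBySimilarity(const std::vector<FuncInfo>& funcs) {
  std::vector<std::pair<std::vector<uint32_t>, std::string>> keyed;
  keyed.reserve(funcs.size());
  for (const auto& f : funcs) {
    keyed.emplace_back(makeKey(f), f.name);
  }
  // std::vector and std::pair both compare lexicographically.
  std::sort(keyed.begin(), keyed.end());
  std::vector<std::string> names;
  names.reserve(keyed.size());
  for (const auto& k : keyed) {
    names.push_back(k.second);
  }
  assert(names.size() == funcs.size());
  return names;
}
```

In this sketch, ties between identical keys are broken by name, which keeps the output deterministic.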

brendandahl · 2026-05-13T00:28:05Z

Below is a comparison of the uncompressed and gzip-compressed binary sizes for both configurations. There are still some tweaks I think we can make. I've been able to get 2% savings on some files, but the pass wasn't doing as well on others (I still need to figure out why).

| Benchmark File | Uncompressed Baseline (bytes) | Uncompressed Similarity (bytes) | Uncompressed Change | Gzip Baseline (bytes) | Gzip Similarity (bytes) | Gzip Change (Savings) |
|---|---|---|---|---|---|---|
| dart-flute-complex.opt.wasm | 1,081,549 | 1,083,288 | `+0.16%` | 392,180 | 386,221 | `-1.52%` |
| dart-flute-complex.unopt.wasm | 1,284,344 | 1,286,148 | `+0.14%` | 458,367 | 452,629 | `-1.25%` |
| dart-pop.unopt.wasm | 398,114 | 398,114 | `0.00%` | 148,474 | 146,737 | `-1.17%` |
| dart-pop.opt.wasm | 350,546 | 350,546 | `0.00%` | 133,329 | 131,929 | `-1.05%` |
| v8_poppler.wasm | 2,067,741 | 2,076,431 | `+0.42%` | 987,474 | 982,825 | `-0.47%` |
| v8_sqlite.c.wasm | 931,440 | 936,924 | `+0.59%` | 378,918 | 375,992 | `-0.77%` |
| v8_box2d.wasm | 86,598 | 86,598 | `0.00%` | 39,983 | 39,978 | `-0.01%` |
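
The change columns are consistent with the usual relative-change formula, (similarity - baseline) / baseline. A small sketch for reproducing them, where `percentChange` is a hypothetical helper and not part of the PR:

```cpp
#include <cassert>

// Relative change from baseline to candidate, in percent.
// Negative values mean the candidate is smaller than the baseline.
double percentChange(double baseline, double candidate) {
  assert(baseline > 0);
  return (candidate - baseline) / baseline * 100.0;
}
```

For example, `percentChange(392180, 386221)` for the gzip sizes of dart-flute-complex.opt.wasm comes out at roughly -1.52.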

tlively

Mostly comments on algorithmic improvements. Let me know if you'd rather land as-is to get the measured benefit without investing more time in algorithmic improvements and I can review with that in mind.

tlively · 2026-05-13T01:30:11Z

+    // Capture important immediate type/operator information
+    // TODO: There's probably more data that would be useful to capture.


You could probably extract and reuse the HashStringifyWalker from Outlining.cpp. It turns expression trees into strings by shallowly hashing each expression, including all of its immediates. You would just want it to use a normal PostWalker (but probably modified to also call addUniqueSymbol at control flow boundaries, e.g. end and else) instead of the custom StringifyWalker it currently uses. Nothing a little extra templating can't solve!

This looks like it will be a bigger change (and potentially much slower). I'd like to save this for a v2 experiment.
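
A rough sketch of the shallow-hashing idea from this thread, not the actual HashStringifyWalker in Outlining.cpp. `Node` and `Stringifier` are hypothetical types, and reserving the top bit for boundary symbols is an assumption made only to keep the example small.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified expression node; not Binaryen's Expression class.
struct Node {
  uint32_t opcode = 0;
  std::vector<uint64_t> immediates; // e.g. local index, constant, type ID
  std::vector<Node> children;
  bool isControlFlow = false;       // block/loop/if in this toy model
};

class Stringifier {
public:
  std::vector<uint64_t> symbols;

  // Post-order walk: children first, then one symbol for the node itself.
  void walk(const Node& node) {
    for (const auto& child : node.children) {
      walk(child);
    }
    symbols.push_back(shallowHash(node));
    if (node.isControlFlow) {
      // Mark the boundary with a symbol that never equals any other symbol.
      assert(nextUnique < kUniqueBit);
      symbols.push_back(kUniqueBit | nextUnique++);
    }
  }

private:
  static constexpr uint64_t kUniqueBit = 1ull << 63;
  uint64_t nextUnique = 0;

  // FNV-1a-style mix over the opcode and immediates only.
  // Children are deliberately left out, which is what makes it "shallow".
  static uint64_t shallowHash(const Node& node) {
    uint64_t h = 14695981039346656037ull;
    auto mix = [&h](uint64_t v) {
      h ^= v;
      h *= 1099511628211ull;
    };
    mix(node.opcode);
    for (uint64_t imm : node.immediates) {
      mix(imm);
    }
    return h & ~kUniqueBit; // keep the top bit free for unique symbols
  }
};
```

Two expressions with the same opcode and immediates produce the same symbol, while each control-flow boundary gets a fresh one, so matches cannot span a boundary.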

tlively · 2026-05-13T01:44:21Z

+    ThreadPool::get()->work(doWorkers);
+
+    // 3. Sort defined functions by the similarity heuristic
+    std::sort(keys.begin(), keys.end());


Sorting only works when the similarities are at the beginning of the strings, right? It seems like looking for matching substrings would be more robust. You could check out what Outlining.cpp does with a suffix tree to find common substrings, for example.

Yeah, the idea here was that shared prologues are very common and that full substring matching is very slow. As mentioned above, this seems like something to explore in v2.
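
To illustrate the limitation raised in this thread, here is a small sketch showing that a lexicographic sort only groups keys by shared prefix. `Key`, `commonPrefix`, and `neighbourPrefixes` are hypothetical names, and the keys are made-up opcode sequences.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Key = std::vector<uint32_t>;

// Number of leading elements two keys have in common.
std::size_t commonPrefix(const Key& a, const Key& b) {
  std::size_t n = std::min(a.size(), b.size());
  std::size_t i = 0;
  while (i < n && a[i] == b[i]) {
    ++i;
  }
  return i;
}

// Sort the keys, then report the shared prefix length of each adjacent pair.
std::vector<std::size_t> neighbourPrefixes(std::vector<Key> keys) {
  std::sort(keys.begin(), keys.end());
  std::vector<std::size_t> result;
  for (std::size_t i = 1; i < keys.size(); ++i) {
    result.push_back(commonPrefix(keys[i - 1], keys[i]));
  }
  return result;
}
```

With the keys `{1,7,8,9}`, `{2,7,8,9}`, `{1,7,8,4}`, and `{1,9,9,9}`, the sorted order is `{1,7,8,4}`, `{1,7,8,9}`, `{1,9,9,9}`, `{2,7,8,9}`. The two keys that share the tail `7,8,9` are separated by `{1,9,9,9}` because they differ in their first element, which is the case substring matching would handle better.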

kripken · 2026-05-13T16:33:40Z

I assume the background here is #4322? Some prior work is there.

brendandahl · 2026-05-13T16:56:23Z

No, though I did find that after starting this. A while ago I was playing with compressed wat vs wasm with brotli/gzip and added a note to try reordering for gzip. I haven't tried out the idea from cromulate. I was also going to ask if you still have your similarity-ordering branch somewhere?

kripken · 2026-05-13T18:14:53Z

Hmm, unfortunately I seem to have deleted it when I moved my branches to my fork, but it isn't there either... Should have been at

https://github.com/kripken/binaryen/tree/similarity-ordering

GitHub had a way to restore deleted branches back in the day, but maybe just for recent ones... anyhow, the code there was probably not great 😄

kripken · 2026-05-13T18:16:09Z

iirc, the approach was to write the binary bytes and compare them (so not at the IR level). Not sure if that is better (certainly slower).

Implement the --reorder-functions-by-similarity optimization pass in wasm-opt. Gzip and Brotli compression algorithms rely on finding repetitive byte patterns inside a sliding window (e.g., 32KB for Gzip). If structurally similar functions are placed far apart in the Wasm binary, the compressor cannot detect matches across them. While the existing --reorder-functions pass sorts functions strictly by call frequency to shrink LEB128 indexes, it scatters mutually compressible functions and ultimately increases gzipped delivery sizes. This new pass traverses defined function bodies in post-order and extracts a similarity sorting key based on signature type IDs, local variable types, and structural opcode sequences. By sorting defined functions lexicographically by this key, structurally similar functions are physically grouped together in the output binary, which places compressible bytes next to each other. Empirical benchmarks on real-world Flutter and Poppler Wasm examples show a significant improvement, saving up to 2.13% and 0.98% in compressed delivery size compared to the baseline (no reordering).

brendandahl · 2026-05-13T23:22:14Z

Added Brotli to the comparison. The pass helps there too, even with Brotli's bigger sliding window of 4 MiB.

| File | Gzip Baseline (bytes) | Gzip Similarity (bytes) | Gzip Diff | Brotli Baseline (bytes) | Brotli Similarity (bytes) | Brotli Diff |
|---|---|---|---|---|---|---|
| `dart-flute-complex.opt.wasm` | 392,180 | 386,221 | -1.52% | 353,061 | 349,273 | -1.07% |
| `dart-flute-complex.unopt.wasm` | 458,367 | 452,629 | -1.25% | 409,995 | 406,029 | -0.97% |
| `dart-pop.unopt.wasm` | 148,474 | 146,737 | -1.17% | 135,640 | 134,493 | -0.85% |
| `dart-pop.opt.wasm` | 133,329 | 131,929 | -1.05% | 122,487 | 121,162 | -1.08% |
| `v8_poppler.wasm` | 987,474 | 982,825 | -0.47% | 924,941 | 921,716 | -0.35% |
| `v8_sqlite.c.wasm` | 378,918 | 375,992 | -0.77% | 329,704 | 327,791 | -0.58% |
| `v8_box2d.wasm` | 39,983 | 39,978 | -0.01% | 37,332 | 37,332 | 0.00% |

MaxGraey · 2026-05-14T04:36:39Z

Could you add zstd to the comparison, please?

kripken · 2026-05-14T15:49:33Z

+  : public PostWalker<OpcodeSequenceBuilder,
+                      UnifiedExpressionVisitor<OpcodeSequenceBuilder>> {
+  std::vector<uint32_t> sequence;
+  const size_t max_len = 512;


Suggested change

-  const size_t max_len = 512;
+  const size_t MaxLen = 512;

kripken · 2026-05-14T15:50:47Z

+    // Capture important immediate type/operator information
+    // TODO: There's probably more data that would be useful to capture.


Another option is to just look at the bytes - that would be most precise (actually use the encoding of the enums), and not hard to do, but slower. Anyhow, yes, larger changes/investigations can be left for later, this looks like a great start!

kripken · 2026-05-14T15:52:29Z

+      sequence.push_back(localSet->type.getID());
+    } else if (auto* const_ = curr->dynCast<Const>()) {
+      sequence.push_back(const_->type.getID());
+    }


For this PR, you can get all enums using wasm-delegations-fields. It would be shorter than the current code. That + the type would make sense I think?

kripken · 2026-05-14T15:54:35Z

+
+  void run(Module* module) override {
+    // If the number of defined functions is small, similarity-based reordering
+    // does not help and can regress size due to increasing LEB size.


Wait, doesn't this matter more for large modules? Where there are enough for LEBs to matter?

I did a deeper dive into this. This heuristic was a quick hack to avoid regressing the small v8_box2d.wasm, where reordering made it worse. I assumed this was due to LEBs or to the original ordering being better, but what was actually happening was that the gzip command was adding the filename into the file!

I got rid of this code locally and now use gzip -9 -k -n, and v8_box2d.wasm also improves by 0.33%.

Oh, funny about the filename! 😄

Nice, yeah, I'd hope this works even on small things.
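
For anyone reproducing the measurements, a quick way to check whether a .gz file carries the original filename is to look at the FNAME flag in the gzip header (RFC 1952); `gzip -n` leaves it unset. A minimal sketch, where `gzipStoresFilename` is a hypothetical helper and the input is assumed to be the raw bytes of the compressed file:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if the gzip member header has the FNAME flag set,
// meaning the original filename is embedded after the fixed header.
bool gzipStoresFilename(const std::vector<uint8_t>& data) {
  // Fixed header layout: ID1, ID2, CM, FLG, MTIME (4 bytes), XFL, OS.
  const std::size_t kFixedHeaderSize = 10;
  if (data.size() < kFixedHeaderSize || data[0] != 0x1f || data[1] != 0x8b) {
    return false; // not a gzip stream, or too short to tell
  }
  const uint8_t kFlagFName = 0x08; // bit 3 of the FLG byte
  return (data[3] & kFlagFName) != 0;
}
```

When the flag is set, the compressed size grows by the filename length plus a terminating zero byte, which is enough to skew comparisons on very small inputs.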

brendandahl requested a review from a team as a code owner May 13, 2026 00:26

brendandahl requested review from tlively and removed request for a team May 13, 2026 00:26

tlively reviewed May 13, 2026

brendandahl force-pushed the reorder branch from e251faf to 651f4c4 Compare May 13, 2026 23:01

tlively approved these changes May 13, 2026

kripken reviewed May 14, 2026
