New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merge-locals pass #1334

Merged
merged 19 commits into from Dec 18, 2017

Conversation

Projects
None yet
2 participants
@kripken
Member

kripken commented Dec 15, 2017

This optimizes the situation described in #1331. Namely, when x is copied into y, then on subsequent gets of x we could use y instead, and vice versa, as their value is equal. Specifically, this seems to get rid of the definite overlap in the live ranges of x and y, as removing it allows coalesce-locals to merge them. The pass therefore does nothing if the live range of y ends there anyhow.

The danger here is that we may extend the live range so that it causes more conflicts with other things, so this is a heuristic, but I've tested it on every codebase I can find and it always produces a net win, even on one I saw a 0.4% reduction of code size, which surprised me.

This is a fairly slow pass, because it uses LocalGraph which isn't much optimized. This PR includes a minor optimization for it, but we should rewrite it. Meanwhile this is just enabled in -O3 and -Oz.

This PR also includes some fuzzing improvements, to better test stuff like this.

kripken added some commits Dec 11, 2017

fix
fix
when fuzzBinary, save/load the binary before opts, which is better fo…
…r fuzzing, as binary saving reorders locals for example, and it's better to get rid of those differences before the optimizer is influenced by them
@kripken

This comment has been minimized.

Member

kripken commented Dec 16, 2017

Thinking a bit more about this, maybe we should check on more code. @dcodeIO, can you maybe try more code from your compiler? I've checked on the one testcase from you, plus a bunch from emscripten, but more testing would be good (since this is more of a heuristic than a guarantee). By testing, I mean to run it on some code and see if it improves things or not, say by measuring the binary size, or the output of wasm-opt input.wasm --metrics --merge-locals --metrics (so it prints a diff per element).

@dcodeIO

This comment has been minimized.

Contributor

dcodeIO commented Dec 16, 2017

I must admit that my tests are fairly limited at this point. What I can do, though, is to run all the tests once with the changes (and ssa/flatten turned on) and make a commit of it that'll show the differences in each one's fixtures. Let me check...

@dcodeIO

This comment has been minimized.

Contributor

dcodeIO commented Dec 16, 2017

Here's the diff: AssemblyScript/assemblyscript@9021a7c (the memcpy one)

Edit: Oh, let me guess, this is not a default optimization yet. Sec

A diff without flatten/ssa and just merge-locals turned on: AssemblyScript/assemblyscript@5c24168

(seems I am not a help here)

@kripken

This comment has been minimized.

Member

kripken commented Dec 18, 2017

This new pass is a default optimization in -O3 and -Oz, but nowhere else, and in particular not in the default level (-Os), yeah.

The diff with flatten looks a little odd, we should investigate that before recommending people use flatten I guess.

Ok, thanks, I guess we've tested this all we can. All signs look positive, so I'll merge this. We should optimize the pass for speed eventually, and turn it on in -Os and -O2.

@kripken kripken merged commit a0de358 into master Dec 18, 2017

@kripken kripken deleted the merge-locals branch Dec 18, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment