cmd/compile: split up SSA rewrite passes #27034
For Go 1.12 I'd like to experiment with splitting up the rewrite rules for generic optimization, lowering and lowered optimization into more phases. This is a tracking issue for that experimentation. Ideas and feedback welcomed.
As we have added more and more optimizations the rules files have become increasingly large. This introduces a few problems. Firstly 'noopt' builds actually do a lot of optimizations, some of which are quite complex and may make programs much harder to debug. Secondly the ways that rules interact is becoming less clear. For example, adding a new set of rules might introduce 'dead ends' into the optimization passes, because, say, the author hasn't take into account special cases such as indexed versions of memory operations. Thirdly there are some rules which will only ever fire once. For example, the rules to explode small struct accesses('SSAable'). This is inefficient since they need to be re-checked every time the optimization pass is re-run (the optimization passes tend to run over and over again until a steady state is reached).
Here is a summary of the existing passes that use code generated from the rewrite rules (I've ignored the 32-bit passes for now):
As a rough guide I see the phases looking something like this (will most likely change quite a lot):
Architecture specific phases (not all architectures will need all of these):
I'm hoping the benefits of being able to reduce the number of rules and perhaps more efficient I-cache usage will make up for the increased number of passes. I'll need to experiment to see.
Most likely some rules will need to be duplicated in multiple passes (particularly the generic optimization passes). This will probably involve splitting the rules into more files and then re-combining them in individual passes (for example, constant propagation rules could get their own file and then be called from both the initial and main generic optimization passes).
There are some TODOs along these lines in the compiler source, but I couldn't find any existing issues, so apologies if this is a dup of another issue.
The text was updated successfully, but these errors were encountered:
This is true, but I wonder how much of a problem it actually is. Last time I checked (which was a while ago), all rewrite rules together were less than 10% of compile time. And splitting up rule sets increases the number of times each value gets looked at also, so splitting up isn't free either.
I think we should be able to make these one pass. From the comment in AMD64.rules:
Splitting lower and optimization would make the lowering one-pass and allow turning off the optimizations with -N.
I think it might be tricky to separate the rules out into a fixed ordering. How do we split up the current rules? For a new rule, where do you decide to put it? How do you record the reason why? I don't want to get to a state where there are opt1.rules through opt6.rules, and there's a rule in opt4.rules which can't be moved to opt3.rules or opt5.rules, but no one knows why. There's a simplicity in putting all the rules in a single pile.
I think it would be worth looking at least at separating out a final one pass ruleset for some very low level transformations. For example prefer this instruction over another one for these range of parameters. e.g. golang.org/cl/115997, MOV 0 -> XOR, maybe LEA 3arg -> 2x LEA2arg, ... that way they will not interfere with more complex optimizations and may reduce complexity in the main ruleset.