When popping children for an expression, greedily pop none-typed expressions below the last value-producing expression in the stack, packaging all the popped instructions into a new block. This avoids leaving the none-typed expressions on top of the stack, which is good because having them on top of the stack would force the use of a scratch local when the next operand is popped.

Besides producing better IR with fewer scratch locals, this also improves round-tripping IR through binaries. The binary writer has an optimization where it elides unnamed blocks because they cannot possibly be branch targets, but this could create "stacky" code that would previously introduce a scratch local when it was parsed back into IR. Now IRBuilder just recreates exactly the unnamed block that had been present in the IR in the first place.

While we don't generally guarantee that we can perfectly round-trip IR through binaries, this reduces the number of cases where round-trips lead to increased code size.

Fixes #8413.
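As an illustration of the description above, here is a minimal toy model of the popping strategy. It is only a sketch and not Binaryen's IRBuilder code: `Expr`, `Popped`, `popChild`, and `countScratchLocals` are hypothetical names, and the model assumes that a scratch local is needed exactly when the last value-producing expression is not on top of the stack.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR expression: a label plus whether it
// produces a value (false models a none-typed expression).
struct Expr {
  std::string name;
  bool producesValue;
};

// Result of popping one operand: the expressions that would be packaged
// into a block, and whether a scratch local was needed (toy assumption).
struct Popped {
  std::vector<Expr> items;
  bool needsScratch;
};

// Pop one operand. Everything from the last value-producing expression up
// to the top is consumed. If `greedy` is set, the run of none-typed
// expressions directly below that value is consumed as well.
Popped popChild(std::vector<Expr>& stack, bool greedy) {
  // Search from the top for the last value-producing expression.
  std::size_t end = stack.size();
  while (end > 0 && !stack[end - 1].producesValue) {
    --end;
  }
  assert(end > 0 && "toy model assumes a value is available");
  std::size_t valIndex = end - 1;
  // None-typed expressions above the value force a scratch local.
  bool needsScratch = valIndex + 1 < stack.size();
  std::size_t start = valIndex;
  if (greedy) {
    while (start > 0 && !stack[start - 1].producesValue) {
      --start;
    }
  }
  auto first = stack.begin() + static_cast<std::ptrdiff_t>(start);
  std::vector<Expr> items(first, stack.end());
  stack.erase(first, stack.end());
  return Popped{items, needsScratch};
}

// Count the scratch locals a two-operand instruction needs for the stack
// [x, nop, y], where x and y produce values and nop does not.
int countScratchLocals(bool greedy) {
  std::vector<Expr> stack = {{"x", true}, {"nop", false}, {"y", true}};
  int scratch = 0;
  for (int operand = 0; operand < 2; ++operand) {
    if (popChild(stack, greedy).needsScratch) {
      ++scratch;
    }
  }
  return scratch;
}
```

In this model, popping without greediness takes `y` alone and leaves `nop` on top of `x`, so the second pop needs a scratch local. With greedy popping, the first pop packages `nop` and `y` together, leaving `x` on top for the second pop.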
@@ -1,44 +0,0 @@
;; NOTE: Assertions have been generated by update_lit_checks.py and should not be edited.
Round-tripping no longer introduces a scratch local, so it no longer tests anything interesting.
It could test that we do not introduce a scratch local?
That behavior is already well tested in e.g. wat-kitchen-sink.wast. There's nothing special about the --roundtrip pass to test here.
if (type.size() == sizeHint || type.size() <= 1) {
if (hoisted.get) {
if (hoisted.valIndex < scope.exprStack.size() - 1) {
Basically, we need to call packageAsBlock whenever we want to treat multiple expressions as a unit, rather than just using the expression on top of the stack. This used to be the case precisely when we also had a get of a scratch local. Now the value-producing expression can be on top of the stack, so no scratch local is needed, while we are still greedily consuming other expressions below it, so we still need a block. We handle this by more explicitly checking whether the hoisted index is something other than the top of the stack.
(I'll rename it from valIndex, since that name no longer gives the right idea.)
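The difference between the old and new conditions can be sketched as two predicates. This is an illustrative model only: `HoistInfo`, `needsBlockOld`, and `needsBlockNew` are hypothetical names, and the real code operates on the hoisted index and scope.exprStack shown in the diff.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of a hoist: the lowest stack index being consumed
// and whether a get of a scratch local was pushed.
struct HoistInfo {
  std::size_t index;
  bool usedScratchGet;
};

// Old condition (as described in the comment): a block was needed exactly
// when a scratch local get had been introduced.
bool needsBlockOld(const HoistInfo& h) { return h.usedScratchGet; }

// New condition: a block is needed whenever the hoisted index is below the
// top of the stack, regardless of whether a scratch local was used.
bool needsBlockNew(const HoistInfo& h, std::size_t stackSize) {
  assert(stackSize > 0); // avoids unsigned wraparound in stackSize - 1
  return h.index < stackSize - 1;
}
```

For a stack of three expressions where the value is on top but a none-typed expression below it is consumed greedily, `HoistInfo{1, false}` yields no block under the old predicate and a block under the new one.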
kripken left a comment
(I don't feel strongly about the removed file)
@kripken, the clusterfuzz test failure here does not look related to this change. Is there a better way to investigate than just fuzzing locally and seeing if something similar shows up again?
I don't think there's a way to download the test directory from CI, unfortunately. Sometimes the error itself has been useful in guiding me to what could be the issue. Otherwise I would ignore this.