Avoid new blocks in binary reading/writing by kripken · Pull Request #1165 · WebAssembly/binaryen

kripken · 2017-09-06T00:05:46Z

Background: wasm allows lists at the top of functions, in loop bodies, and in if arms. Binaryen IR only allows a list in a block (so passes only need to deal with lists in one place), and as a result we may need more blocks than wasm does in some cases.

It turns out we generated more blocks than we needed, which is kind of silly, since just reading and writing a binary repeatedly could make it grow without bound. This PR fixes that. One commit is for writing, one is for reading.
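As a rough illustration of the growth problem, here is a toy model (not Binaryen's real reader or IR). The names `wrapInBlock`, `readAlwaysWrapping`, `readWrappingIfNeeded`, and `roundTrip` are hypothetical, and the model assumes the writer emits whatever the reader produced unchanged. It only shows why a reader that wraps unconditionally grows the output on every round trip, while one that wraps only multi-instruction lists reaches a fixed point.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a wasm "list context": top-level expressions as text.
using List = std::vector<std::string>;

// Wraps a list of expressions in a single block expression.
std::string wrapInBlock(const List& items) {
  std::string out = "(block";
  for (const auto& item : items) out += " " + item;
  return out + ")";
}

// Hypothetical reader that always creates a new block.
std::string readAlwaysWrapping(const List& items) {
  return wrapInBlock(items);
}

// Hypothetical reader that creates a block only for multi-item lists.
std::string readWrappingIfNeeded(const List& items) {
  if (items.size() == 1) return items[0];
  return wrapInBlock(items);
}

// Reads, then "writes" the single resulting expression back out, `rounds` times.
List roundTrip(List items, int rounds, std::string (*read)(const List&)) {
  assert(rounds >= 0);
  for (int i = 0; i < rounds; i++) {
    items = List{read(items)};
  }
  return items;
}
```

In this model, three round trips of a single `(nop)` through the always-wrapping reader nest it three blocks deep, while the other reader returns it unchanged.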

kripken · 2017-09-06T00:07:34Z

  processExpressions();
  size_t end = expressionStack.size();
-  if (start - end == 1) {
+  if (end - start == 1) {


Amusingly, this was wrong all along, so the optimization never kicked in. (If it had, it would have led to breakage, which is fixed in this PR: we need to call getBlock in places where a branch might occur, like the top of a function.)
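A minimal sketch of why the old condition could not fire, assuming (as the diff suggests) that `start` is the stack size before processing, `end` is the size after, and the stack does not shrink in between. Both are `size_t`, so `start - end` wraps around to a huge value instead of going negative. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Old condition: with end > start, the unsigned subtraction wraps around,
// so the result is a very large number rather than -1, and never 1.
bool pushedExactlyOneBuggy(std::size_t start, std::size_t end) {
  return start - end == 1;
}

// Fixed condition: subtract the smaller value from the larger one.
bool pushedExactlyOneFixed(std::size_t start, std::size_t end) {
  assert(end >= start); // assumption: processing only pushes onto the stack
  return end - start == 1;
}
```

For example, with `start == 3` and `end == 4` the buggy check is false and the fixed check is true.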

kripken · 2017-09-08T21:45:57Z

This has been fuzzed heavily and looks correct. Any concerns?

dschuff · 2017-09-08T22:11:04Z

+
+  // Gets a potential list of instructions. This is not a block and cannot be
+  // branched to.
+  Expression* getList(WasmType type);


This name is confusing. There is no Expression that's a list, other than a block. GetMaybeBlock makes sense if only because it could return either a block or a single instruction.

I guess what this means is that it could return a single instruction or an implicit block (which cannot be branched to, and will not appear in the wasm output)?

kripken · 2017-09-09

Yeah, exactly - it's reading a "list context" in a wasm binary, so it could be a list of instructions, but we do know it can't be branched to.

Ideas for a better name? :) Maybe getExpressionList or getExpressionOrList?

dschuff · 2017-09-08T22:11:50Z

+  // Gets a potential list of instructions. This is not a block and cannot be
+  // branched to.
+  Expression* getList(WasmType type);
+  // Gets a potential block. This may be branched to.


I don't really understand what you mean by "potential" here. This always returns an actual block, no? Is the difference just that it may or may not be an implicit block (e.g. a function body or if arm)?

Ok, I see that's wrong. So the only difference from getList is that this could return a single instruction or an explicit block (that can be branched to).

kripken · 2017-09-09

Yeah. Both can return a block or a single instruction if a block isn't needed. A block might be needed if we branch to it (in getBlock, but not getList where we assume no branches), or if we have more than one instruction.
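The rule described here can be written down as a small predicate. This is only a sketch: `needsBlock` is a hypothetical name, not a Binaryen function, and treating an empty list as needing a block is an assumption the thread does not address.

```cpp
#include <cassert>
#include <cstddef>

// Returns true if a real block must be created for a list context.
// - isBranchTarget: something may branch to this position (the getBlock case).
// - numInstructions: how many instructions were read in the list context.
bool needsBlock(std::size_t numInstructions, bool isBranchTarget) {
  // Assumption: zero instructions also gets a block, since there is no
  // single instruction to return in its place.
  return isBranchTarget || numInstructions != 1;
}

// Usage: the cases discussed in the thread.
void checkDiscussedCases() {
  assert(!needsBlock(1, false)); // lone instruction, no branches: return it bare
  assert(needsBlock(1, true));   // lone instruction that is branched to
  assert(needsBlock(2, false));  // more than one instruction
}
```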

kripken · 2017-09-11T16:57:31Z

How about getUnnamedBlockOrSingleton() for getting either a singleton or a block, where the block can't be branched to so we create it without a name, and getBlockOrSingleton() for the case where we can branch to it?

dschuff · 2017-09-12T00:13:11Z

OK, I think I understand better. In both cases you want an expression or list of expressions. In one case (function top level, explicit block, if arm), you need it to be targetable by a branch. In the other case, it must not be targetable because its container is the one that's targetable (loop). In both cases you want to just use a bare expression if there are in fact no branches that target it.

Given that, I have 2 questions:

1. Why does the logic have to be duplicated rather than shared (and have a parameter or something)?
2. Why are if-arms in the first category instead of the second? i.e. instead of containing blocks with names, why aren't the arms targetable the way loops are? Maybe because the jump is forward instead of back?

kripken · 2017-09-12T17:17:38Z

Which logic do you mean is duplicated? If you mean the identical lines in getList and getBlock, it's just a few, and it's simpler that way I think. Otherwise getList could get a parameter, but it would be kind of messy to check it multiple times.

About loops, yeah, the issue is that their branch is backwards. wasm no longer lets them be broken out of - they only have a top label, in other words. getBlock optionally creates a block, which is only a forwards branch. So in loops we getList, and handle the branch to the loop top directly.
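A toy sketch of the loop case, assuming a simplified IR (`Node`, `makeNode`, and `makeLoop` are hypothetical and are not Binaryen's classes). The label lives on the loop itself, so the body is either the bare instruction or an unnamed block that nothing can branch to.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy IR node, for illustration only.
struct Node {
  std::string kind;   // "loop", "block", or an instruction name
  std::string label;  // branch target name; empty if nothing can branch here
  std::vector<std::unique_ptr<Node>> children;
};
using NodePtr = std::unique_ptr<Node>;

NodePtr makeNode(std::string kind) {
  NodePtr node(new Node());
  node->kind = std::move(kind);
  return node;
}

// Builds a loop whose branch target is the loop top. The body list becomes
// a single child: the lone instruction, or an unnamed block holding the list.
NodePtr makeLoop(std::string label, std::vector<NodePtr> body) {
  NodePtr loop = makeNode("loop");
  loop->label = std::move(label);
  if (body.size() == 1) {
    loop->children.push_back(std::move(body[0]));
  } else {
    NodePtr block = makeNode("block"); // no label: branches go to the loop
    block->children = std::move(body);
    loop->children.push_back(std::move(block));
  }
  assert(loop->children.size() == 1);
  return loop;
}
```

A backward branch to the label then targets the loop node directly, which is why no named block is needed inside it.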

dschuff · 2017-09-12T20:59:36Z

OK. The code seems OK; getList is really only for loops (everything else can be branched to), so maybe just say directly in the comment that it's for loops, because the branch target is on the loop instead of the block.

Or actually, even better: let's just put that logic directly in visitLoop, and then maybe getNextExpressionOrBlock or getBlockOrSingleton or something like that for getBlock

kripken · 2017-09-12T21:40:42Z

I like that, nice. Updated to getBlockOrSingleton(), and the loop-specific getList logic is now inside the loop code, and that name is gone.

kripken force-pushed the slim branch from 55c2cac to cb5fb88 Compare September 6, 2017 18:17

kripken added 3 commits September 6, 2017 15:31

don't emit a toplevel block if we don't need to, as in wasm it is a list context

164677f

don't create unnecessary blocks in wasm reading

63388e1

update

a381ba0

kripken force-pushed the slim branch from cb5fb88 to a381ba0 Compare September 6, 2017 22:34

dschuff reviewed Sep 8, 2017

refactor

3159be6

dschuff approved these changes Sep 12, 2017

kripken merged commit 048bcad into master Sep 13, 2017

kripken deleted the slim branch September 13, 2017 00:12
