Make WebAssembly more like assembly #299

Closed
wants to merge 3 commits into from

Conversation

sunfishcode
Member

This replaces high-level control structures with low-level primitive operations. break is generalized to a "branch forward" and continue becomes a generalized "branch backward". Along with switch, this is sufficient to represent all control transfers, making specialized constructs redundant. This implements the idea at the end of #261.

This proposal also adds select operators, which serve the role of "conditional move" instructions in common ISAs.

For example, this proposal would translate code like this:

if (x) {
  body()
} else {
  other()
}

into this (invented syntax):

{{
        br_if !x, L1
        body()
        br L0
} L1
        other()
} L0
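
As a rough illustration of the translation above, here is a minimal Python sketch that lowers a toy if/else node into the same invented syntax. The tuple-based AST, the `lower` function, and the label numbering are assumptions made for this sketch; none of them come from the proposal itself.

```python
import itertools

def lower(node, labels=None, indent=0):
    """Lower a toy AST into the invented br/br_if syntax.

    node is either a string (an opaque statement such as "body()") or a
    tuple ("if", cond, then_node, else_node). Hypothetical representation.
    """
    if labels is None:
        labels = itertools.count()
    pad = "    " * indent
    if isinstance(node, str):
        return [pad + node]
    kind, cond, then_node, else_node = node
    assert kind == "if"
    end = f"L{next(labels)}"   # label closing the outer block (join point)
    els = f"L{next(labels)}"   # label closing the inner block (else entry)
    out = [pad + "{{"]         # opens both scopes, as in the example above
    out.append(pad + f"    br_if !{cond}, {els}")
    out += lower(then_node, labels, indent + 1)
    out.append(pad + f"    br {end}")
    out.append(pad + f"}} {els}")
    out += lower(else_node, labels, indent + 1)
    out.append(pad + f"}} {end}")
    return out

print("\n".join(lower(("if", "x", "body()", "other()"))))
```

Nested if/else nodes are handled by the same function, since every lowering draws fresh labels from the shared counter.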

This design is very close to goto; however, unlike goto, it inherently preserves the well-structured property, so backends can rely on it without verification.

This design would make it natural for CFG-based compilers to target WebAssembly without employing the Relooper algorithm. They'd still need to ensure that loops are single-entry, but given that, it is trivial to translate arbitrary control flow into the constructs proposed here.

This design neither depends on nor conflicts with statements-as-expressions. It works either way.

Big-picture questions:

  • This uses more nodes than the current design, though macro
    compression ought to be able to erase the difference. Is this a
    problem?
  • If-else is an extraordinarily common pattern with convenient
    (but not essential) properties for both humans and compilers. Is it
    worth giving up the dedicated opcode to achieve a simpler opcode set?
    This is partially compensated for by adding select, though on the
    other hand select is technically redundant too.

Small-picture questions:

  • Should block and loop be merged?
  • Should break and break_if be merged? (and continue and
    continue_if)?

@kg
Contributor

kg commented Aug 17, 2015

This requires non-nullary macros to optimize. We haven't proven that we can even do those effectively without a performance hit, nor have we proven that they will eliminate all the size overhead in cases like this. If statements are common enough that I'd be concerned about how much performance we'd be leaving on the table if we replace the if-statement AST node with this.

I find this kinda gross but I'd be OK with it as an alternative alongside the if-statement node.

I would prefer that we not merge the block and loop concepts. I think it would be better to keep the concept of a continue separate from any other sort of branch and then enforce that it can only branch backwards within a loop body.

@sunfishcode
Member Author

For the perspective from which the current control constructs may also be considered ugly, here's an illustration of how the four simple and orthogonal control transfers proposed here, {conditional,unconditional} branch {forward,backward}, map onto the current opcode set:

// Conditional branch forward:
if (x) stuff();
if (x) break label;
if (x) stuff(); else break label;
do { stuff(); if (x) continue; things(); } while (y); // it's in there somewhere

// Conditional branch backward:
if (x) continue label;
if (x) stuff(); else continue label;
do { stuff(); } while (x);

// Unconditional branch forward:
if (x) stuff(); else things(); // hiding in the middle
break label;
do { if (x) { stuff(); continue; } things(); } while (y); // can you find it?

// Unconditional branch backward:
continue label;
forever { stuff(); }

@kripken
Member

kripken commented Aug 17, 2015

Is it fair to say that the benefits here are

  • Compilers emitting wasm can avoid a Relooper-like operation

and the downsides are

  • Increased code size
  • Decreased debuggability

?

@titzer

titzer commented Aug 17, 2015

I'm not sure how this avoids the relooper algorithm. E.g. even something
simple like turning a diamond into a set of properly nested blocks seems to
require computing a dominator tree and then inserting surrounding blocks.
Also, computing a properly enclosing scope for a while seems to require
computing the loop membership and inserting a surrounding block.

@sunfishcode
Member Author

@kripken - benefits also include:

  • improved debuggability for people that want to single-step through instructions
  • the same or perhaps slightly improved debuggability for people debugging at the HLL level through sourcemaps/dwarf/etc.
  • simplified and more transparent implementations

@titzer - It requires knowing where the loops are, and it requires that loops be single-entry. It does not require dominator trees. The algorithm roughly goes like this: Order the blocks in RPO. Identify the loops, and assign the backedges conditional and unconditional branches as appropriate. All remaining CFG edges are now forward branches. The scope for each branch can be placed at the beginning of the innermost loop which contains the branch destination. If multiple scopes begin at the same place, sort them in reverse order of their destinations' positions in the RPO (so the scope with the latest destination is outermost).
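
The first steps of the algorithm described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the CFG is a plain dict from block name to successor list, the function names are made up for this sketch, and the CFG is assumed to be reducible (all loops single-entry). Scope placement is not implemented here.

```python
def reverse_postorder(cfg, entry):
    """Return the blocks reachable from entry in reverse postorder (RPO)."""
    seen, post = set(), []

    def visit(block):
        seen.add(block)
        for succ in cfg.get(block, []):
            if succ not in seen:
                visit(succ)
        post.append(block)   # appended only after all successors are done

    visit(entry)
    return post[::-1]

def classify_edges(cfg, entry):
    """Split CFG edges into forward branches and loop backedges.

    Assumes a reducible CFG, so an edge is a backedge exactly when its
    target does not come later than its source in the RPO.
    """
    rpo = reverse_postorder(cfg, entry)
    index = {block: i for i, block in enumerate(rpo)}
    forward, backward = [], []
    for src in rpo:
        for dst in cfg.get(src, []):
            if index[dst] <= index[src]:
                backward.append((src, dst))
            else:
                forward.append((src, dst))
    return rpo, forward, backward

# A single while-style loop: head tests the condition, body jumps back.
cfg = {
    "entry": ["head"],
    "head": ["exit", "body"],
    "body": ["head"],
    "exit": [],
}
rpo, forward, backward = classify_edges(cfg, "entry")
print(rpo)
print(forward)
print(backward)
```

With the edges classified this way, each backedge would become a branch backward to its loop header and each remaining edge a branch forward out of an enclosing scope.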

@sunfishcode
Member Author

@kg This non-nullary macro compression problem is potentially easier than the general case; we could create predefined macros for things like diamonds and triangles, rather than requiring a tool to automatically discover the macros on its own.

@kripken
Member

kripken commented Aug 17, 2015

I don't understand the debuggability improvements. Are you saying it's more obvious where the next step in the debugger will go, if there is a goto-like thing instead of an if? Surely an if is better?

In your examples, you put everything on one line. I agree that such code is hard to debug, but when code is properly pretty-printed, ifs, loops, continues and breaks are surely the simplest thing for people to debug with? It's what they write in the source code, after all...

@lukewagner
Member

I'm inclined to ignore binary encoding size considerations at this stage for this particular issue and focus on which ops have the ideal semantics for compilation to/from wasm, debugging, etc; at least in polyfill-prototype-1, bytes spent on control flow represent only about 12% of total serialized bytes and I'm bullish on non-nullary macros anyhow.

@MikeHolman
Member

For what it's worth, this maps more closely with our IR format so it makes me happy.

@kripken
Member

kripken commented Aug 17, 2015

I think this is an interesting compromise between goto and fully structured control flow. However it still has some of the downsides of both,

  • It can't handle arbitrary control flow, specifically loops with multiple entries. That means that compilers can't just emit this, they will need to either abort if they hit such a case, or run something like the relooper to structure control flow (and ignoring loops with multiple entries as "not interesting" seems not a reasonable point of view to take in a compiler). In that way, this is not as good as goto, which "just works" for compiler writers.
  • But this is also not as good as fully structured control flow, in that it is somewhat structured but not as readable as normal loops and ifs. It can also create heavily nested code, again requiring something like the relooper in order to generate more balanced output.

Overall, I agree that structured control flow has downsides. The main one is that it requires more work on the compiler emitting wasm. I do see that as a worthwhile sacrifice, though - ensuring a smaller download, better debuggability, etc. are all more important than making things a little easier for compilers, especially since we are even going all the way and writing a full compiler ourselves that people can use (or take code from).

And while plain goto can avoid the work to reloop, it generates larger code unless we do other extra work, such as using macro compression to compress gotoed output. No free lunch.

@lukewagner
Member

@kripken The first-order reasons to do structured control flow have to do with engine impl (e.g.), binary size and the polyfill. The lower-level control flow primitives proposed here preserve all those benefits. As I've argued in other bugs, I don't think we should allow the text format to influence core semantics; providing a nice pleasant source-code-reading experience in devtools should be the job of improved source maps. If you're stepping through raw wasm, you're going to be having an assembly experience anyways (loads from pointers + offsets) so I don't think the presence of if is going to significantly move the needle.

IIUC, this proposal is just a slight tweak to what we currently have based on the realization that if and do while are higher-level constructs than needed by compilers. One ramification is that these lower-level primitives more directly represent the generated machine code and this means less work (particularly in dumb compilers) to generate the optimal code. That is, if (cond) break naively compiles to a test, conditional jump, then fall through to an unconditional jump to the target whereas br_if cond naively compiles to a test and conditional jump directly to the target. As always, the "but pattern matching" argument applies, but we're not talking about adding both, just one or the other so the one that maps more directly to machine instructions seems preferable.
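
To make the "naive compilation" point concrete, here is a small Python sketch of a code generator that emits pseudo machine instructions one node at a time, with no pattern matching across nodes. The node shapes, the mnemonics (`test`, `jz`, `jnz`, `jmp`), and the `skip_` label naming are all assumptions invented for this illustration.

```python
def naive_codegen(node):
    """Emit pseudo machine instructions for one toy node at a time."""
    kind = node[0]
    if kind == "break":                    # ("break", label)
        return [f"jmp {node[1]}"]
    if kind == "br_if":                    # ("br_if", cond, label)
        return [f"test {node[1]}", f"jnz {node[2]}"]
    if kind == "if":                       # ("if", cond, then_node)
        skip = f"skip_{node[1]}"           # hypothetical local label
        body = naive_codegen(node[2])
        # Jump over the body when the condition is false.
        return [f"test {node[1]}", f"jz {skip}"] + body + [f"{skip}:"]
    raise ValueError(f"unknown node kind: {kind}")

with_if = naive_codegen(("if", "cond", ("break", "target")))
with_br_if = naive_codegen(("br_if", "cond", "target"))
print(with_if)
print(with_br_if)
```

In this toy model the if-plus-break form yields a test, a conditional jump, and an unconditional jump, while the br_if form yields only a test and a conditional jump straight to the target.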

@kripken
Member

kripken commented Aug 18, 2015

I agree with most of your first paragraph, but I think we just diverge on the issue of code size. To me, that this increases code size in a way that must be offset elsewhere is a downside. I guess to you, that is acceptable, as you said earlier.

I just don't see making compilers' lives slightly easier in return for an increase in code size as worthwhile, given the "easier" here is so small - it doesn't even enable us to avoid the relooper (which, if that were possible, is a larger benefit, and goto achieves it).

@lukewagner
Member

Actually, talking about this earlier with @sunfishcode, it seemed like code size could go either way: if the new relooper strategy was generating a ton of if (cond) break, then br_if replaces 2 ops with 1. It depends on the real-world distributions of ops which we'll just have to measure. My point, though, wasn't that size doesn't matter (it matters a great deal) but, rather, that we shouldn't at this stage in the design process choose worse semantics in the AST for a small predicted % layer 0 win that layer 1 seems totally capable of erasing (we've used this reasoning to simplify the AST/layer 0 a few times now). Definitely we can reconsider these things later when we get serious about testing binary formats, the macro layer, etc.

That all being said, I'm still not 100% sure these opcodes are really so much better from a semantic perspective; definitely interested to hear more on this line of discussion.

@kg
Contributor

kg commented Aug 18, 2015

That all being said, I'm still not 100% sure these opcodes are really so much better from a semantic perspective; definitely interested to hear more on this line of discussion.

Yeah, part of my objection to this proposal is that the upside isn't particularly clear to me and they feel like an awkward midpoint between fully structured control flow (if, loops, etc) and gotos. As a compiler author I'm not sure how I would benefit from this compared to either alternative.

If we're concerned with optimizing the relooper, shouldn't we just design one or more opcodes for that? Then we don't sacrifice the space efficiency/decode complexity of the (vastly more common) traditional ifs/loops and we don't compromise reloopered code either.

Separate opcodes also make it easier for a runtime to identify 'complex' functions which I think is valuable in terms of simpler validation (plus, interpreters/naive compilers/etc will have an easier time).

Also, to be clear, when I say 'performance hit' for non-nullary macros, I mean both file size and decode efficiency. We've been very concerned about decode performance in the past and something like this could increase decode/validation cost to a not-insignificant degree, since loops and branches are so common. So that's part of why I'm wary.

@qwertie

qwertie commented Aug 18, 2015

I like the simplicity of this proposal - and I especially like it more than the previous text, because the previous text does not say that you can break or continue from a nested loop to an outer loop. But it sounds like we can't get a consensus without more prototyping & data. Possible compromise: predefine non-nullary macros if and ifelse that reduce to these simpler constructs. Predefined macros could be left unexpanded by the polyfill, thus translated directly to JS.

@sunfishcode
Member Author

@kg:

Yeah, part of my objection to this proposal is that the upside isn't particularly clear to me and they feel like an awkward midpoint between fully structured control flow (if, loops, etc) and gotos. As a compiler author I'm not sure how I would benefit from this compared to either alternative.

I sketched out an algorithm for converting a CFG to structured form above, and it's simpler than the Relooper algorithm. I didn't cover how to reduce multiple-entry loops or deal with switches, but I believe it's straightforward (ignoring the need for indirect branching, which neither approach handles well).

If we're concerned with optimizing the relooper, shouldn't we just design one or more opcodes for that? Then we don't sacrifice the space efficiency/decode complexity of the (vastly more common) traditional ifs/loops and we don't compromise reloopered code either.

Separate opcodes also make it easier for a runtime to identify 'complex' functions which I think is valuable in terms of simpler validation (plus, interpreters/naive compilers/etc will have an easier time).

On space efficiency, there is an assumption of a non-nullary macro layer, because it's useful for other things as well. If this is not true, the outlook would be different.

The proposal here reduces decode complexity because it makes the control constructs that must be handled much simpler. I expect interpreters and naive compilers to be among the beneficiaries of this proposal. I'm not clear on the meaning of identifying 'complex' functions.

Also, to be clear, when I say 'performance hit' for non-nullary macros, I mean both file size and decode efficiency. We've been very concerned about decode performance in the past and something like this could increase decode/validation cost to a not-insignificant degree, since loops and branches are so common. So that's part of why I'm wary.

It's true that I don't have data here, and won't be able to until a lot more infrastructure is in place. However, another plausible perspective is that it's a historical accident that WebAssembly started with if-else and do-while as the baseline in the first place, and that perhaps WebAssembly should instead start with the simpler operators in this proposal, and add if-else and do-while later based on demonstrated need. I'm open to discussion.

@lukewagner:

That all being said, I'm still not 100% sure these opcodes are really so much better from a semantic perspective; definitely interested to hear more on this line of discussion.

There's no difference in expressiveness. We could translate from either form to the other losslessly. What specifically are you interested in?

@qwertie:
We can break/continue out of nested loops and blocks in the current design too, even if the present wording doesn't quite say that. It's essential.

@titzer

titzer commented Aug 18, 2015

I'm not sure why we're spending opcodes on not-equal and all combinations
of greater-than/less-than/equal but can't afford "if".

@lukewagner
Member

@sunfishcode I wasn't talking about expressiveness, just the subjective: is this a more natural set of primitives to generate or consume (which of course is what makes this a hard issue to come to any definitive conclusion on).

@sunfishcode
Member Author

@titzer It's a fair point, and I don't deny there's subjectivity involved. Not-equal operators improve readability while adding very little burden to producers or consumers. If-else and do-while improve readability in many cases but also harm it in some, and are a moderate additional burden to CFG-based producers and simple consumers.

I think the subjective part of the question is how we want the language to feel. If WebAssembly is a machine language, rules like what continue does in different contexts feel alien. However, as a programming language, if-else and do-while feel pretty natural.

@creationix

I've been going back and forth on this very question in my own VM for uscript. I've implemented at least a dozen VM interpreters in the last few weeks and for the life of me can't decide between structured programming, goto, or something in between.

Personally I would be sad if web assembly was hard to write by hand as a human.

In my experiments, I have yet to see any real performance benefit from choosing a machine style over a higher-level expression-tree style with structured logic. In fact, sometimes the higher-level constructs are much faster, but that's likely because I'm writing a pure interpreter. As I understand it, web-assembly will often use AOT or JIT compilation in production which changes the costs dramatically.

@kg
Contributor

kg commented Aug 18, 2015

Hand-writing wasm applications is definitely not high on our list of concerns. As much as we might like to make those use cases nice, it's not on the list of priorities, so we can't optimize for it at the expense of anything else.

Wasm modules will be produced by compilers. Naturally, we don't want to assume a single particular compiler. In some cases a wasm module will be produced from another wasm module via a tool, like one that generates instrumentation, adds validation code, obfuscates it, or strips unused information.

At some point user-mode JITs will use a subset of the wasm module format and IR to generate code at runtime, as well, but that's not a direct concern right now.

@titzer

titzer commented Aug 19, 2015

The more I think about this proposal, the more it seems like just adding
"conditional break" and "conditional continue" constructs. You can, of
course, already express conditional break and conditional continue with "if
condition break else nop". In that sense "if" with "break" and "continue"
is just as foundational. I'd have to run some examples but I'm pretty sure
you get exactly the same CFG from the decoder in either case.
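
The equivalence described here can be sketched as a pair of rewrites over a toy node representation. The tuple shapes, the `NOP` constant, and the function names are assumptions made for this illustration, and only the break case is covered.

```python
NOP = ("nop",)

def to_br_if(node):
    """Rewrite ("if", cond, ("break", label), NOP) into ("br_if", cond, label)."""
    if node[0] == "if" and node[2][0] == "break" and node[3] == NOP:
        return ("br_if", node[1], node[2][1])
    return node

def to_if_break(node):
    """Rewrite ("br_if", cond, label) back into the if/break/nop form."""
    if node[0] == "br_if":
        return ("if", node[1], ("break", node[2]), NOP)
    return node

original = [("call", "stuff"), ("if", "x", ("break", "L0"), NOP)]
lowered = [to_br_if(n) for n in original]
raised = [to_if_break(n) for n in lowered]
print(lowered)
print(raised == original)
```

Because each rewrite is the inverse of the other on these shapes, nothing is lost going in either direction in this toy model.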

@sunfishcode
Member Author

@titzer Indeed; if with break and continue is the inspiration for this idea. Yes, you can get the same CFG from the decoder either way. But as @lukewagner observed above, if producers are going to emit this kind of code, they'd be better served by having opcodes that think this way too.

This PR isn't about altering what's possible, or about seeking a minimal basis. It's mainly about how people think about and use wasm.

@sunfishcode
Member Author

An observation made recently is that there are important use cases where it's desirable to process WebAssembly code in a streaming fashion, and insert code into the stream. For example, several of the JIT library use cases involve this.

Branches with either absolute or relative offsets would make this awkward because they require patching if code moves around. The hybrid proposal here preserves the ability to have branches merely reference a nesting level, which avoids this problem. One can easily insert code without the need to update any branches.
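
A toy comparison of the two branch encodings, written as a Python sketch. Both mini-interpreters, the instruction shapes, and the names `run_flat` and `run_nested` are hypothetical and exist only to illustrate the patching problem described above.

```python
class Branch(Exception):
    """Signals a branch out of `depth + 1` enclosing blocks."""
    def __init__(self, depth):
        super().__init__(depth)
        self.depth = depth

def run_flat(code):
    """Run code whose jumps carry absolute instruction indices."""
    trace, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if isinstance(op, str):
            trace.append(op)
            pc += 1
        else:                      # ("jmp", absolute_index)
            pc = op[1]
    return trace

def run_nested(code, trace=None):
    """Run code whose branches name a nesting depth instead of an offset."""
    trace = [] if trace is None else trace
    for op in code:
        if isinstance(op, str):
            trace.append(op)
        elif op[0] == "br":        # ("br", depth)
            raise Branch(op[1])
        else:                      # ("block", [ops...])
            try:
                run_nested(op[1], trace)
            except Branch as b:
                if b.depth > 0:    # not ours: keep unwinding outward
                    raise Branch(b.depth - 1) from None
    return trace

flat = ["a", ("jmp", 3), "b", "target"]
nested = ["a", ("block", [("br", 0), "b"]), "target"]

# Insert one instruction at the front of each stream without patching.
print(run_flat(["new"] + flat))     # the stale index now lands on "b"
print(run_nested(["new"] + nested)) # still skips "b"
```

The offset-based stream would need its jump rewritten from 3 to 4 after the insertion, whereas the depth-based stream needs no change.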

@sunfishcode
Member Author

@JSStats I expect loop+switch will be rare in practice with the proposal in this PR. In particular, I don't foresee it becoming common to use as a general way to translate a CFG to wasm. If you show me code you think will need it, I can show you what we might do for it. If you are advocating instead for a CFG representation that would be used commonly, I consider that a separate purpose.

@dschuff
Member

dschuff commented Sep 24, 2015

I'm in favor of moving in this direction in general. In particular I always
thought it was a little odd that it seemed a foregone conclusion that we
would have structured control flow instead of a CFG, when both the
producers and the consumers would just have to convert to/from CFGs
internally anyway. Obviously it was needed when all we had was JavaScript
but we actually get to decide what we should have in this case.

Taking this a step further (and probably outside the scope of this PR), I
would even argue that we should start with the assumption that we should
use a CFG, and then use structured control flow only if there is any
compelling reason.
The reasons I could think of are

  1. That's what emscripten/asm.js do already, so using a CFG might make a
    polyfill more complex and/or slower as it converts from a CFG to asm.js.
    This could partly be mitigated by removing some problematic control flow
    structures ahead of time on the producer side for the benefit of polyfills.
  2. There seems to have been an assumption that structured control flow will
    compress better than a CFG. I haven't seen any numbers to back this up, but
    I think it's an experiment that we should do to find out, especially since
    we are as far down the structured-control path as we are already.

@rossberg
Member

I think we should avoid biased assumptions about the predominance of
CFG-based compilers. It is probably true for the kind of "native" language
compiler we are currently focussing on. But for a broader perspective, take
a look at this list:

https://github.com/jashkenas/coffeescript/wiki/list-of-languages-that-compile-to-js

For many of the compilers in this list (especially those further down the
list), Wasm would be a much more suitable target. Wouldn't it be great if
the Wasm ecosystem could replace the JS kludge for them eventually?

But to get there, we shouldn't make it harder than necessary to write
light-weight compilers emitting Wasm. Instead, we should encourage
everyone to target Wasm. Smaller language projects with limited resources
might benefit greatly from it. I have talked to many people who hope to be
able to do exactly that in the future.

Turning Wasm into a CFG format has minor benefits for CFG-based compilers.
But these are already comparably complex beasts. The simplifying benefit
for them will be lost in the noise of their overall complexity.

But in return, you would be making the life of a whole range of other, more
light-weight compilers measurably harder. That seems like the wrong
trade-off to me.

/Andreas

@titzer

titzer commented Sep 24, 2015

There are advantages to structuring the control flow with "if" and "block"
and "loop" for all tools that touch the format. As Andreas mentioned,
simple producers don't have to do any work to break down their natural
source-level constructs to control instructions. Tools that do simple
transforms don't have to worry about patching offsets or managing blocks,
splitting them, or introducing new control flow. Inlining really can be
syntactic substitution. Humans that read the format immediately see an
if-then-else or a loop, as opposed to having to recover that information
from the tangle of gotos and jumps that are in a CFG. And I'd argue that
it's actually easier for decoders too, even for ones that go to a
sophisticated compiler IR like TurboFan. "if" just creates a split and join
point for SSA renaming that are obvious, and "loop" starts a new SSA
renaming environment for the body. It's more memory efficient to decode
since these environments follow a stack discipline and can be discarded
when the stack is popped. I've implemented SSA renaming and block building
for a general CFG in a Java JIT and I can definitely say that decoding a
structured control flow system like this is palpably easier and less error
prone. If I were to write a baseline (single-pass) JIT, I think "if" and
"loop" and "break" are just as easy to deal with, if not easier, than jumps.

As for this concrete proposal, we've already seen that the conditional
break and conditional continue needed by the simpler CFG->wasm algorithm
can be easily expressed with if(cond) break or if(cond) continue. So
technically no changes are necessary to support this different algorithm.
This proposal instead seeks to take away expressive power, which makes the
code less dense (by introducing many more control instructions per
source-level construct), no easier to decode, and harder to produce for
some producers.

I have no problem with LLVM generating "if (cond) break" instead of doing
the relooper algorithm which tries to nest control flow. That's completely
a front-end implementation choice. But, if we refrain from removing the
structured constructs we have now, then it's entirely conceivable that a
standalone tool can convert the "unnested" style to the "nested" style in a
totally independent pass, saving space and making the code more readable in
the process.

So why do we need to remove expressive power right now?

On Thu, Sep 24, 2015 at 6:55 PM, rossberg-chromium <notifications@github.com> wrote:

I think we should avoid biased assumptions about the predominance of
CFG-based compilers. It is probably true for the kind of "native" language
compiler we are currently focussing on. But for a broader perspective, take
a look at this list:

https://github.com/jashkenas/coffeescript/wiki/list-of-languages-that-compile-to-js

For many of the compilers in this list (especially those further down the
list), Wasm would be a much more suitable target. Wouldn't it be great if
the Wasm ecosystem could replace the JS kludge for them eventually?

But to get there, we shouldn't make it harder than necessary to write
light-weight compilers emitting Wasm. Instead, we should encourage
everyone to target Wasm. Smaller language projects with limited resources
might benefit greatly from it. I have talked to many people who hope to be
able to do exactly that in the future.

Turning Wasm into a CFG format has minor benefits for CFG-based compilers.
But these are already comparably complex beasts. The simplifying benefit
for them will be lost in the noise of their overall complexity.

But in return, you would be making the life of a whole range of other, more
light-weight compilers measurably harder. That seems like the wrong
trade-off to me.

/Andreas

On 24 September 2015 at 18:07, Derek Schuff notifications@github.com
wrote:

I'm in favor of moving in this direction in general. In particular I always
thought it was a little odd that it seemed a foregone conclusion that we
would have structured control flow instead of a CFG, when both the
producers and the consumers would just have to convert to/from CFGs
internally anyway. Obviously it was needed when all we had was JavaScript
but we actually get to decide what we should have in this case.

Taking this a step further (and probably outside the scope of this PR), I
would even argue that we should start with the assumption that we should
use a CFG, and then use structured control flow only if there is any
compelling reason.
The reasons I could think of are

  1. That's what emscripten/asm.js do already, so using a CFG might make a
    polyfill more complex and/or slower as it converts from a CFG to asm.js.
    This could partly be mitigated by removing some problematic control flow
    structures ahead of time on the producer side for the benefit of
    polyfills.
  2. There seems to have been an assumption that structured control flow
    will compress better than a CFG. I haven't seen any numbers to back this
    up, but I think it's an experiment that we should do to find out,
    especially since we are as far down the structured-control path as we
    are already.

On Wed, Sep 23, 2015 at 9:41 PM Dan Gohman notifications@github.com
wrote:

@JSStats https://github.com/JSStats I expect loop+switch will be rare
in practice with the proposal in this PR. In particular, I don't foresee it
becoming common to use as a general way to translate a CFG to wasm. If you
show me code you think will need it, I can show you what we might do for
it. If you are advocating instead for a CFG representation that would be
used commonly, I consider that a separate purpose.



@sunfishcode
Member Author

Lowering if and if-else to the constructs in this proposal is not difficult, even for trivial single-pass compilers. There's not even any backpatching required.
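
As an illustration of that claim, the following is a minimal Python sketch of a one-pass lowering from `if` / `if-else` nodes to the block-and-branch form, printed in the invented syntax from the PR description. The tuple-based AST, the helper names `lower` and `lower_program`, and the label numbering are assumptions made for this example, not part of the proposal.

```python
import itertools


def lower(node, out, labels):
    """Append flat pseudo-instructions for `node` to `out` in one forward pass."""
    kind = node[0]
    if kind == "call":
        # ("call", name)
        out.append(f"{node[1]}()")
    elif kind == "seq":
        # ("seq", child, child, ...)
        for child in node[1:]:
            lower(child, out, labels)
    elif kind == "if":
        # ("if", cond, then_body, else_body_or_None)
        _, cond, then_body, else_body = node
        end = f"L{next(labels)}"
        if else_body is None:
            out.append("{")
            out.append(f"br_unless {cond}, {end}")
            lower(then_body, out, labels)
            out.append(f"}} {end}")
        else:
            other = f"L{next(labels)}"
            out.append("{{")
            out.append(f"br_unless {cond}, {other}")
            lower(then_body, out, labels)
            out.append(f"br {end}")
            out.append(f"}} {other}")
            lower(else_body, out, labels)
            out.append(f"}} {end}")
    else:
        raise ValueError(f"unknown node kind: {kind!r}")


def lower_program(node):
    """Lower a whole tree and return the emitted lines."""
    out = []
    lower(node, out, itertools.count())
    return out


for line in lower_program(("if", "x", ("call", "body"), ("call", "other"))):
    print(line)
```

Every line is final at the moment it is appended: the labels are chosen before the branches that use them are emitted, so nothing written earlier is revisited.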

The proposal here follows a stack discipline; blocks can be discarded when the stack is popped. Join points are easily identifiable. AST-based SSA construction works with this proposal in the same way that it already works in JS engines with labeled break and continue. The proposal also clearly identifies loops (nice for optimizers and humans). Inlining still is syntactic substitution. People interested in general CFGs instead are encouraged to submit a separate proposal so we can discuss it separately.

The proposal here does not remove expressive power. It also does not add any. It does: simplify the model of control transfers, and the relationship between a producer's control flow, wasm, and the control flow of an engine. From my experience with Emscripten and asm.js, the more times control flow is translated (source -> LLVM CFG -> asm.js AST -> OdinMonkey CFG -> machine code), the harder it is to follow correspondences through the system when working system-wide.

Humans are important. Ideally, we should enable humans writing high-level language code to debug their code using high-level-language debuggers. Stepping down to wasm should ideally be for low-level concerns, where eg. single-step debugging is a common activity, and it's best if every instruction does one thing. In the proposal here, the special block and loop nodes have a scope, but they themselves don't do anything, and the control transfer instructions themselves all just do one thing and complete.

Since if (x) break L0; is literally a branch-past-a-branch, which isn't how we want to think about it, the current design has the following effective control constructs:

if (x) y();
if (x) y(); else z();
if (x) break L0;
switch (w) { case C: y(); ... }
switch (w) { case C: break L0; ... }
loop { ... }
loop { ... if (x) break; }
loop { ... if (x) y(); }

The list grows if we consider break vs continue, x vs !x, loop vs forever vs do_while, switch cases not having fallthrough or requiring default to go at the end, or combinations thereof. In contrast, the proposal here has:

br L0;
br_if x, L0
br_unless x, L0
switch w, L0, ...

The list grows if we consider block vs loop.

I find arithmetic operators easier to reason about than control transfers because they have a more localized effect (the only non-local behavior is trapping). Consequently, I find greater appeal in reducing the control opcode set than the arithmetic opcode set. In this proposal, the control operators achieve a level of locality which is not far from that of the arithmetic operators.

@kg
Contributor

kg commented Sep 24, 2015

From my experience with Emscripten and asm.js, the more times control flow is translated (source -> LLVM CFG -> asm.js AST -> OdinMonkey CFG -> machine code), the harder it is to follow correspondences through the system when working system-wide.

Then why shouldn't I emit my if statements as if nodes and my while loops as while or loop nodes? 😐

@sunfishcode
Member Author

Then why shouldn't I emit my if statements as if nodes and my while loops as while or loop nodes? 😐

Because looking at it system-wide (the context for this quote), your code will get lowered to a CFG-like form eventually anyway.

@kg
Contributor

kg commented Sep 24, 2015

Because looking at it system-wide (the context for this quote), your code will get lowered to a CFG-like form eventually anyway.

Unless it's being decoded by a runtime that doesn't use a CFG, or the polyfill is converting it to javascript if statements, or it's being run by an interpreter, or being manipulated by developer tools...

I totally get the premise here that LLVM happens to use a CFG, and OdinMonkey uses a CFG. That doesn't surprise me. Making it easy for LLVM and emscripten to encode a CFG also makes sense. I would be happy with some mechanisms being added to enable expressing CFGs, as a replacement for things like loop+switch. But I can't get on board with removing primitives like if statements.

FWIW, most code JSIL encounters is best expressed as if statements and loop blocks. Occasionally it does need something CFG-like (code that used goto, mostly) and in that case I have to use loop-switch. So for that subset of code I'd love some sort of CFG mechanism, but if I had to get rid of if statements and loop blocks that would be a pretty steep price to pay. I wouldn't be surprised if the same was true for some other code generators.

Does the GWT team want CFG? What about the Unity team? How about people authoring compilers for native languages like Rust, Go, etc? Does everyone want CFG and CFG only? I don't get the impression that we actually know this for a fact.

@titzer

titzer commented Sep 24, 2015

On Thu, Sep 24, 2015 at 11:32 PM, Dan Gohman notifications@github.com
wrote:

Lowering if and if-else to the constructs in this proposal is not
difficult, even for trivial single-pass compilers. There's not even any
backpatching required.

The proposal here follows a stack discipline; blocks can be discarded when
the stack is popped. Join points are easily identifiable. AST-based SSA
construction works with this proposal in the same way that it already works
in JS engines with labeled break and continue. The proposal also clearly
identifies loops (nice for optimizers and humans). Inlining still is
syntactic substitution. People interested in general CFGs instead are
encouraged to submit a separate proposal so we can discuss it separately.

The proposal here does not remove expressive power
https://en.wikipedia.org/wiki/Expressive_power_%28computer_science%29#Information_Description.
It also does not add any. It does: simplify the model of control transfers,
and the relationship between a producer's control flow, wasm, and the
control flow of an engine. From my experience with Emscripten and asm.js,
the more times control flow is translated (source -> LLVM CFG -> asm.js AST
-> OdinMonkey CFG -> machine code), the harder it is to follow
correspondences through the system when working system-wide.

Humans are important. Ideally, we should enable humans writing high-level
language code to debug their code using high-level-language debuggers.
Stepping down to wasm should ideally be for low-level concerns, where eg.
single-step debugging is a common activity, and it's best if every
instruction does one thing. In the proposal here, the special block and
loop nodes have a scope, but they themselves don't do anything, and the
control transfer instructions themselves all just do one thing and complete.

Since if (x) break L0; is literally a branch-past-a-branch, which isn't
how we want to think about it,

Oh, I agree absolutely that it's not how we want to think about it.
Everyone except a compiler backend wants to think about "if-then-else" and
"loop".

the current design has the following effective control constructs:

if (x) y();
if (x) y(); else z();
if (x) break L0;

This is not a separate construct. There is just "if".

switch (w) { case C: y(); ... }
switch (w) { case C: break L0; ... }
loop { ... }

loop { ... if (x) break; }

loop { ... if (x) y(); }

This is also not a separate construct. There is just "loop".

The list grows if we consider break vs continue, x vs !x, loop vs forever
vs do_while, switch cases not having fallthrough or requiring default to go
at the end, or combinations thereof. In contrast, the proposal here has:

br L0;
br_if x, L0
br_unless x, L0
switch w, L0, ...

The list grows if we consider block vs loop.

I find arithmetic operators easier to reason about than control transfers
because they have a more localized effect (the only non-local behavior is
trapping).

Control is not local, and that's part of the point. If we go low level with
control, then it takes a whole concert of blocks and breaks just to make
something simple like if. We gain nothing by having only these lower level
primitives.

Consequently, I find greater appeal in reducing the control opcode set
than the arithmetic opcode set. In this proposal, the control operators
achieve a level of locality which is not far from that of the arithmetic
operators.

I would argue the opposite, since they now all require label references,
which are decidedly nonlocal. And appeal is subjective.



@sunfishcode
Member Author

@titzer

Everyone except a compiler backend wants to think about "if-then-else" and "loop".

I interpret it to be within the scope of the high-level goal to "Define a [...] format to serve as a compilation target". But it's nice for other low-level purposes too.

if (x) break L0;

This is not a separate construct. There is just "if".

What I mean is, the combination will want to be recognized as if it were a distinct construct to avoid thinking about branch-past-a-branch.

Control is not local, and that's part of the point.

Consider it as temporal locality; "do one thing and complete" vs "do one thing, stick around, then do something else later".

@kg

Unless it's being decoded by a runtime that doesn't use a CFG,

My proposal happens to be great for simple compilers that want to translate straight from WebAssembly to native assembly.

or the polyfill is converting it to javascript if statements

My proposal can be translated to javascript using the if (x) break L0; technique.
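
A minimal Python sketch of what such a translation could look like follows. The nested-tuple input format and the function name `to_js` are hypothetical, and the sketch assumes that a loop body which falls off its end leaves the loop; none of this is taken from an actual polyfill.

```python
def to_js(ops, scopes=None, indent=0):
    """Render nested pseudo-ops as a list of JavaScript source lines."""
    scopes = dict(scopes or {})  # label -> "block" or "loop"
    pad = "  " * indent
    lines = []
    for op in ops:
        kind = op[0]
        if kind == "call":
            # ("call", name)
            lines.append(f"{pad}{op[1]}();")
        elif kind in ("block", "loop"):
            # ("block", label, body) or ("loop", label, body)
            _, label, body = op
            inner = {**scopes, label: kind}
            if kind == "block":
                lines.append(f"{pad}{label}: {{")
                lines += to_js(body, inner, indent + 1)
            else:
                lines.append(f"{pad}{label}: for (;;) {{")
                lines += to_js(body, inner, indent + 1)
                # Assumption of this sketch: falling off the end exits the loop.
                lines.append(f"{pad}  break {label};")
            lines.append(f"{pad}}}")
        elif kind in ("br", "br_if", "br_unless"):
            # ("br", label), ("br_if", cond, label), ("br_unless", cond, label)
            label = op[-1]
            # A branch to a block label goes forward, to a loop label backward.
            jump = "break" if scopes[label] == "block" else "continue"
            stmt = f"{jump} {label};"
            if kind == "br_if":
                stmt = f"if ({op[1]}) {stmt}"
            elif kind == "br_unless":
                stmt = f"if (!({op[1]})) {stmt}"
            lines.append(f"{pad}{stmt}")
        else:
            raise ValueError(f"unknown op kind: {kind!r}")
    return lines


if_else = [("block", "L0", [
    ("block", "L1", [("br_unless", "x", "L1"), ("call", "body"), ("br", "L0")]),
    ("call", "other"),
])]
print("\n".join(to_js(if_else)))
```

Each branch becomes a single `if (...) break L;` or `if (...) continue L;` statement, which is the shape the reply above refers to.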

or it's being run by an interpreter,

People interested in interpreting wasm directly like my proposal.

or being manipulated by developer tools...

Debuggers are simpler with my proposal. Injecting code into a wasm stream remains simple. Inlining remains syntactic substitution. Are there specific things you have in mind?

Occasionally [JSIL] does need something CFG-like (code that used goto, mostly) [...]

I agree, and the current design shares this limitation. I'm interested in ways we can address it, but it's a separate topic.

Does the GWT team want CFG? What about the Unity team? How about people authoring compilers for native languages like Rust, Go, etc? Does everyone want CFG and CFG only? I don't get the impression that we actually know this for a fact.

Yes, I actually do know that Unity, Rust, and Go would be ok with this. I also know that numerous other native compilers would be ok with this.

I don't know about GWT. However, I'll recall my assertion above that lowering high-level constructs to the constructs in this proposal is not difficult (people do this quite a lot), even for trivial single-pass compilers, so I don't expect GWT would have difficulty.

@kg
Contributor

kg commented Sep 25, 2015

My proposal can be translated to javascript using the if (x) break L0; technique.
...
Debuggers are simpler with my proposal. Injecting code into a wasm stream remains simple. Inlining remains syntactic substitution. Are there specific things you have in mind?

I feel like we're talking in circles here. I must have failed to communicate my concern here: My concern is not 'we can't implement this'; we're all programmers here, we know it's possible to implement this. My concern is that it makes things gross and awkward for the average user.

Being able to translate it to javascript with if (x) break L0; is beside the point because that's not an if statement anymore, it's an awkward goto. Debuggers being simpler to implement isn't helpful if the output in the debugger looks more like x86 assembly than it does javascript or C or your source language.

One of the advantages of having an AST with nodes like if and loop is that we retain a lot more source-level structure and it's easier to reason about the AST when you're looking at it. Your proposal destroys a lot of that, which is part of why I dislike it.

Debugging asm.js applications is enough of a nightmare to begin with. Please, please, please, we should not make wasm even worse.

@lukewagner
Member

One of the advantages of having an AST with nodes like if and loop is that we retain a lot more
source-level structure and it's easier to reason about the AST when you're looking at it. Your
proposal destroys a lot of that, which is part of why I dislike it.

Debugging asm.js applications is enough of a nightmare to begin with. Please, please, please, we
should not make wasm even worse.

Making wasm close to the source is definitely not a stated goal (high-level or otherwise) of wasm. The path to supporting source-level reasoning about wasm code is already in Tooling.md: debug info (improved source maps). The goal of wasm is to be a good compiler target and that is the context in which we should consider this proposal.

In particular, one common argument we make is that there should be a relatively obvious mapping from wasm ops to machine code; I think br_if is a good addition according to this criterion since it avoids yet another pattern match and we can expect br_if to be really common (and thus a size/decode-speed win) for certain codegen strategies. I'll suggest that we don't need to remove if-else and do-while, though: they are still useful and common patterns and could provide wins for other codegen strategies. While control flow ops aren't as "cheap" to add as arith ops, they're still pretty easy, so I don't think we should strive for a minimalist basis here.

@AndrewScheidecker

I agree with @lukewagner that br_if and the lower-level switch seem like good additions, and that it seems unnecessary to remove the structured if and loop. I like these operations:

This lets a producer provide additional information to the consumer by using the more constrained control flow operations, or to generate a CFG node when necessary or convenient.

@rossberg
Member

Dan Gohman:

The proposal here does not remove expressive power
https://en.wikipedia.org/wiki/Expressive_power_%28computer_science%29#Information_Description.
It also does not add any. It does: simplify the model of control transfers,
and the relationship between a producer's control flow, wasm, and the
control flow of an engine. From my experience with Emscripten and asm.js,
the more times control flow is translated (source -> LLVM CFG -> asm.js AST
-> OdinMonkey CFG -> machine code), the harder it is to follow
correspondences through the system when working system-wide.

I agree that the issue isn't losing expressiveness. But it is losing
structure. I think there are various implicit assumptions with considering
that a simplification, e.g. that neither producers nor consumers ever care
about this structure, because they are CFG-based.

Luke Wagner:

In particular, one common argument we make is that there should be a
relatively obvious mapping from wasm ops to machine code

Being close to machine code is not a value in itself, though. When CPU
manufacturers have to choose between ease of execution in hardware, and
easing the life of producers, they always choose to put the burden on the
latter. I hope we agree that we need not and should not make the same call
for Wasm unconditionally. Wasm isn't built in hardware, it is designed to
do a non-trivial compilation step. So it doesn't quite have the same
constraints, and abstracting away from some of the intricacies of actual
hardware and machine code can be valuable at that level.

@lukewagner
Member

Being close to machine code is not a value in itself, though.

You're right; I think I should have instead stated that the goal is predictability of good codegen. As already argued, br_if maps a little more directly (to jCC) by not relying on branch folding (of if(cond)break and variations thereof) and, for certain relooping strategies, this is going to be a super-common pattern. Yes, as always, we can pattern match, even in a dumb compiler, but as we've decided several times already when faced with this situation before, when a composition of primitives that represents 1 underlying operation is really common, it's valuable to express it directly. So I'd argue that adding br_if (and not removing if-else) is not putting a burden on anyone; it's just indicating to code generators: here is a predictably-efficient construct you can use as needed.
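
For comparison, this is roughly the pattern match a consumer performs when br_if is not available directly: a minimal Python sketch that folds `if (cond) break L` and `if (cond) continue L` into a single conditional branch node. The tuple AST and the name `fold_branches` are hypothetical, made up for this illustration.

```python
def fold_branches(node):
    """Rewrite ("if", cond, ("break"|"continue", label), None) to ("br_if", cond, label)."""
    if not isinstance(node, tuple):
        return node  # strings (names, labels) and None pass through unchanged
    node = tuple(fold_branches(child) for child in node)
    if (
        len(node) == 4
        and node[0] == "if"
        and node[3] is None                   # no else arm
        and isinstance(node[2], tuple)
        and len(node[2]) == 2
        and node[2][0] in ("break", "continue")
    ):
        # The label's scope (block or loop) already says which way the branch goes.
        return ("br_if", node[1], node[2][1])
    return node


before = ("seq", ("if", "x", ("break", "L0"), None), ("call", "y"))
after = fold_branches(before)
```

An `if` with an else arm, or with any body other than a lone break or continue, is left as it is.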

@sunfishcode
Member Author

When Sun designed Java, they were also freed from the constraints of direct hardware execution. They chose a CFG. Java's CFG causes some headaches (for VMs), but when Microsoft designed CIL they had the benefit of learning from Java. They also chose a CFG. Both Java bytecode and CIL are higher-level than WebAssembly, and yet it wasn't important to their creators to preserve source-level control constructs in their bytecodes.

The typical human programmer does not see Java or CIL bytecode disassembled on their screen during normal development. Java and C# debuggers don't show users disassembled bytecode (by default); they show the source code.

Obviously users sometimes do encounter bytecode, such as when porting or interfacing different languages. However, while people on the internet are not shy about complaining about aspects of Java bytecode or CIL that make porting new languages to those systems difficult, the lack of source-level-like control structures in the bytecode is not something that people mention.

I am struggling to reconcile these circumstances with this conversation.

@titzer

titzer commented Sep 29, 2015

I hope we can reach consensus on this point, because I think there is an
underlying design philosophy (structure) which WebAssembly has pursued
until now that this PR seems directed at reducing.

I don't see a problem with accepting some "sugar" forms of control like
if_break and if_continue that target labels of blocks in which they are
nested. At this point the only advantage I personally see in this is that
they avoid an intermediate branch on some hypothetical naive engines. In
v8-native-prototype, an "if(cond) break label;" will not generate any
intermediate branches and will result in exactly the CFG you would get with
the special form "if_break cond label".

But I think it's going backwards to remove the structured "if" and "loop"
constructs from the current design and won't support it.

On Fri, Sep 25, 2015 at 6:24 PM, Dan Gohman notifications@github.com
wrote:

When Sun designed Java, they were also freed from the constraints of
direct hardware execution. They chose a CFG. Java's CFG causes some
headaches (for VMs), but when Microsoft designed CIL they had the benefit
of learning from Java. They also chose a CFG. Both Java bytecode and CIL
are higher-level than WebAssembly, and yet it wasn't important to their
creators to preserve source-level control constructs in their bytecodes.

Neither of these formats are actually CFGs, they are just a sequence of
bytecodes with labels and conditional branches, out of which compilers
construct CFGs. Both of these bytecodes were optimized for interpreter
speed. They both have registers; the JVM has an additional operand stack.
In WebAssembly we've avoided optimizing for interpreter speed and have
eliminated the operand stack in favor of the structure and density possible
with expressions. I see structured control flow constructs in a similar
light. The designers of these bytecodes just simply didn't envision
standardizing a structured language because they would have considered a
translation step too costly--which I actually think is kind of ironic,
considering how expensive bytecode verification turned out to be.

The typical human programmer does not see Java or CIL bytecode
disassembled on their screen during normal development. Java and C#
debuggers don't show users disassembled bytecode (by default); they show
the source code.

This argument is a red herring because we aren't talking about end users
or even high-level programmers, we're talking about producers and consumers
of the bytecode format; compilers, tooling, analysis, and engines.

Obviously users sometimes do encounter bytecode, such as when porting or
interfacing different languages. However, while people on the internet are
not shy about complaining about aspects of Java bytecode or CIL that make
porting new languages to those systems difficult, the lack of
source-level-like control structures in the bytecode is not something that
people mention.

Well that might be more a result of most people doing implementations of
imperative languages sticking to bytecodes and others sticking to ASTs. The
functional language community operates almost solely with IRs based on
trees and structured control constructs, and don't spend much time messing
with bytecode.

I am struggling to reconcile these circumstances with this conversation.



@lukewagner
Member

Ok, so then could the resolution of this issue be: add break_if and continue_if?

@rossberg
Member

rossberg commented Sep 29, 2015 via email

@titzer

titzer commented Sep 29, 2015

On Tue, Sep 29, 2015 at 3:23 PM, Luke Wagner notifications@github.com
wrote:

Ok, so then could the resolution of this issue be: add break_if and
continue_if?

I'm fine with that.



@AndrewScheidecker

Ok, so then could the resolution of this issue be: add break_if and continue_if?

To dispose of the other changes in this PR:

  1. Making switch into a branch
  2. Removing if
  3. Adding a select operation
  4. Merging label and block
  5. Merging break and continue
  6. Merging do_while and forever into loop

We can continue discussion of 1 further in #322, and 5 in #310. 2 seems to have pretty strong opposition. Does anybody object to 3? What about 4 and 6? I think 4 is good. We should either do 6 or add do_while to the spec interpreter.

@sunfishcode
Member Author

When Sun designed Java, they were also freed from the constraints of
direct hardware execution. They chose a CFG. Java's CFG causes some
headaches (for VMs), but when Microsoft designed CIL they had the benefit
of learning from Java. They also chose a CFG. Both Java bytecode and CIL
are higher-level than WebAssembly, and yet it wasn't important to their
creators to preserve source-level control constructs in their bytecodes.

Neither of these formats are actually CFGs, they are just a sequence of
bytecodes with labels and conditional branches, out of which compilers
construct CFGs. Both of these bytecodes were optimized for interpreter
speed. They both have registers; the JVM has an additional operand stack.

While Java bytecode was designed to be interpreted, this doesn't appear to be the case for CIL (here, here). CIL even makes direct interpretation difficult, for example by making simple instructions like add polymorphic.

Obviously users sometimes do encounter bytecode, such as when porting or
interfacing different languages. However, while people on the internet are
not shy about complaining about aspects of Java bytecode or CIL that make
porting new languages to those systems difficult, the lack of
source-level-like control structures in the bytecode is not something that
people mention.

Well that might be more a result of most people doing implementations of
imperative languages sticking to bytecodes and others sticking to ASTs. The
functional language community operates almost solely with IRs based on
trees and structured control constructs, and don't spend much time messing
with bytecode.

I'm very interested in supporting people working from high-level ASTs, and I expect they'll want many high-level features that WebAssembly isn't likely to support directly, which is why I'm promoting the idea of a JIT library. Among other higher-level features, such a library could also easily provide even higher-level control constructs like while or for.

@jbondc
Contributor

jbondc commented Oct 7, 2015

Agree more with @rossberg-chromium. As for the JIT library that "could also easily provide even higher-level control constructs like while or for": can't the reverse just as easily be said, that the JIT library could offer a transform(AST) method to turn it into a less structured shape?

In WebAssembly we've avoided optimizing for interpreter speed and have eliminated the operand stack in favor of the structure and density possible with expressions

Yes please, that's what makes WebAssembly so interesting 👍

@sunfishcode
Member Author

Closing and following up in #403.

@sunfishcode sunfishcode deleted the control-flow branch October 31, 2015 17:21