Get wasm2asm building again#1107
Conversation
Updates CMakeLists.txt to have wasm2asm built by default, updates wasm2asm.h to account for recent interface changes, and restores JSPrinter functionality.
|
This the beginning of my work on wasm2asm. Getting it fully functional will allow me to use it in the Rust compiler to support asm.js and eventually remove Rust's dependency on Fastcomp. |
|
Glad to see interest in working on this :) We should figure out a plan for how this will work - perhaps you've already thought through these points, but these are the issues that stalled the initial work on this:
The good news is that I think (1) can be handled by running the flatten-control-flow pass first; then what is left is much simpler and should map 1:1 to asm.js. And (3) is much less of an issue since the binaryen optimizer is now good enough to optimize away temp locals like that, so all optimization could be done in binaryen, without depending on running an asm.js optimizer. Also the tradeoffs need to be thought out. Sounds like your focus is on offline compilation, while I think it's also useful to have client-side compilation (e.g., a polyfill). That could affect the design here. Thoughts? |
|
@kripken thanks for bringing up those issues. My hope is that the control flow flattening will work, but if it doesn't then I'm prepared to do work to find a transformation. I will probably try to emit and use library code for i64 support initially, just to get something working, but it would probably be good to emit smarter code later. My priority is definitely offline compilation, but I agree that client-side seems useful as well. What kind of tradeoffs do you think would be important there? |
|
The main tradeoff there is compilation speed, I think. But maybe it's not an issue any more, in the past we would have needed to run the slower asm.js optimizer on the asm.js, but now it's probably enough to just run the fast binaryen optimizer before doing the conversion to asm.js. Could be fine on the client. |
| emit(node->getCString()); | ||
| } | ||
|
|
||
| static char* numToString(double d, bool finalize=true) { |
There was a problem hiding this comment.
This line seems overindented
| } | ||
| // figure out the table size | ||
| tableSize = wasm->table.names.size(); | ||
| // tableSize = wasm->table.names.size(); |
There was a problem hiding this comment.
Remove commented-out code
| table.resize(tableSize); | ||
| for (size_t j = 0; j < tableSize; j++) { | ||
| table[j] = fromName(name); | ||
| // for (size_t i = 0; i < wasm->table.names.size(); i++) { |
| } | ||
| default: abort(); | ||
| default: | ||
| std::cerr << curr << '\n'; |
There was a problem hiding this comment.
Yay error messages.
Might be worth adding more context though? "Unexpected type while visiting const" or something?
| // Utils | ||
|
|
||
| void ensure(int safety=100) { | ||
| if (size < used + safety) { |
There was a problem hiding this comment.
Can reduce a level of nesting here by changing this to early-return style:
if (size >= used + safety) {
return;
}| return; | ||
| } | ||
| if (node->isAssign()) { | ||
| printAssign(node); |
There was a problem hiding this comment.
No return after printAssign?
There was a problem hiding this comment.
This code had previously been deleted, so I just brought it back exactly how it was. I don't see why there wouldn't be a return here, but I don't want to mess with it before I have tests running.
| break; | ||
| } | ||
| default: { | ||
| printf("cannot yet print %s\n", type.str); |
There was a problem hiding this comment.
Should this print to stderr?
jgravelle-google
left a comment
There was a problem hiding this comment.
Looking at the test dir briefly I found this:
https://github.com/WebAssembly/binaryen/blob/master/test/address.2asm.js
Which was added in this commit: 299ab38#diff-391e3722bdf96d68c289a54f04d54d97
There's also this commit that removed the wasm2asm tests from check.py: 0783d60#diff-80ee07c6b10902b1bae15660a736de65
So, this PR should probably turn those tests on too? Or at least bring them back if 0:'d, and ran locally
| } | ||
| // if (theVar[1]->size() == 0) { XXX ??? | ||
| // if (theVar[1]->size() == 0) { | ||
| // XXX What should splice do??? |
There was a problem hiding this comment.
Ah, good reword of the comment here to be more explicit about what's "???"
Yeah this seems like the kind of thing we should know? But also seems like the kind of thing that should already be broken. Which ties in nicely to the whole concept of "having tests."
Which is to say I'm ok with leaving this as-is until we have a failing test to motivate what it should be. I also think we should get some tests for pretty much asap.
|
Test-wise it would be good to (eventually, not necessarily in this PR, depending on when it lands) have some reference tests (i.e., compare the asm.js output to a correct output) and functional tests. For functional tests,
It would be good to have both, although the main use case discussed here would be the second of those, so the first might be left for later. |
| run_command(ASM2WASM + [asmjs, '-o', 'b.wast', '-S']) | ||
| assert open('b.wast', 'rb').read()[0] != '\0', 'we emit text with -S' | ||
|
|
||
| ''' wasm2asm tests disabled for now. Use scripts/test/wasm2asm.py to test |
There was a problem hiding this comment.
Better idea: have check.py import scripts.test.wasm2asm as wasm2asm, wrap the check code in a function, check_wasm2asm, and call it from here. (Then comment out the call)
|
I was failing CI earlier because of unused imports in wasm2asm.py. Does that lint only run on the files under scripts? Edit: It looks like it does only run for scripts |
|
Oh, I didn't think about that as being a lint failure but no yeah that makes sense. Probably fine as-is then. |
|
The difference in parentheses and semicolons between the current code and the previous working version of wasm2asm is due to the removal of STAT nodes in f6c0bc2. @kripken do you know if there are any situations in which missing these semicolons would be semantically incorrect? If so, I should bring back STAT nodes. |
| raise Exception(('run_command failed', err)) | ||
| if expected_err is not None and err != expected_err: | ||
| err_correct = expected_err is None or \ | ||
| (expected_err in err if err_contains else expected_err == err) |
There was a problem hiding this comment.
Why not indent this line two extra spaces, instead of disabling that warning project-wide?
|
Hmm, STAT nodes only matter for printing of JS, they say where to emit a semicolon, and the asm.js subset is simple enough that I think we'd either be ok without semicolons, or would know trivially where to add them. In other words, we might need to change the asm.js AST => asm.js text code a little, but I'm hoping not much, probably just something like add a semicolon
|
|
Cool, I know semicolons are emitted after ifs and loops, so I just added a special case for statements in blocks. This PR should be good to go now. |
kripken
left a comment
There was a problem hiding this comment.
Overall this looks ok, aside from one issue in the tests - is that a temporary limitation, no longer emitting x = x | 0; etc.? And are the tests purposefully disabled?
I didn't read the logic changes too carefully, it'll be easier to just see test outputs when they are available.
|
|
||
| s2wasm.test_s2wasm() | ||
| s2wasm.test_linker() | ||
| # wasm2asm.test_wasm2asm() |
There was a problem hiding this comment.
Yes, since there aren't many tests right now anyway. I was planning to enable the tests in a future PR that would finish 32 bit support and add more tests.
| if proc.returncode != expected_status: | ||
| raise Exception(('run_command failed', err)) | ||
| if expected_err is not None and err != expected_err: | ||
| err_correct = expected_err is None or \ |
There was a problem hiding this comment.
why do we need the new contained-but-not-equal mode? is equality too much for some test?
There was a problem hiding this comment.
Yes, the MOZJS case in test_wasm2asm() cannot use equality because the error message contains timing information that is not test-independent.
There was a problem hiding this comment.
Do we need timing information? What is actually used there?
There was a problem hiding this comment.
The full stderr of mozjs for successful test looks like a.2asm.js:warning: Successfully compiled asm.js code (total compilation time 4ms; caching disabled by missing command-line arguments). The only part the test cares about is Successfully compiled asm.js code. I don't know of a way to stop the timing information from being emitted, which would allow us to use equality here.
There was a problem hiding this comment.
Oh, I see. Yeah, that's hard to remove. Looks good then, maybe add a comment though to say why we just check for the subset (if there isn't one already).
| table.resize(tableSize); | ||
| for (size_t j = 0; j < tableSize; j++) { | ||
| table[j] = fromName(name); | ||
| for (Table::Segment &seg : wasm->table.segments) { |
There was a problem hiding this comment.
& should be on the type, not the var
| function add(x, y) { | ||
| x = x | 0; | ||
| y = y | 0; | ||
| x = (x | 0) |
There was a problem hiding this comment.
this looks worse than before - the parens are not needed, and the ; is not strictly needed but is safer to emit. it's actually possible some asm.js compilers depend on the lack of parens and the existence of the ';', I'm not sure - the spec can be read as assuming the very canonical form x = x | 0; (up to whitespace)
There was a problem hiding this comment.
These changes are a result of STAT nodes being removed from the code base. I will put them back in to fix this.
There was a problem hiding this comment.
We took out STAT nodes as an optimization, though. I think we should emit ;s without needing an AST node for them. It should be easy as discussed before, to just emit one after each element in a block (and loop body, and if arms).
There was a problem hiding this comment.
The loop body and if arms are already taken of, but when I tried making it add a ; after each element of a block, I got a lot of ugly double semicolons and semicolons on their own line. Is that preferable to leaving them out or reinstating STAT nodes? I have a separate simple fix for the parentheses problem.
There was a problem hiding this comment.
It's safer to leave extra semicolons in (no semicolons can run into automatic semicolon insertion nonsense in js...). Let's do that, and maybe figure out how to remove then later.
|
Would be quite useful in binaryen.js as well, i.e. as |
|
@kripken The semicolons should be good enough for now. Is there anything else you see that should be fixed? |
|
lgtm, thanks. |
Updates CMakeLists.txt to have wasm2asm built by default, updates
wasm2asm.h to account for recent interface changes, and restores
JSPrinter functionality.