WebAssembly

Alon Zakai edited this page Jan 4, 2017 · 40 revisions

WebAssembly is a new binary format for executing code on the web. The overall plan is:

  • WebAssembly will allow much faster start times (small download, much faster parsing in browsers) for Emscripten projects.
  • Emscripten supports compiling to WebAssembly with a compiler flag, so it is easy for projects to target both WebAssembly and asm.js.
  • Binaryen is a new C++ library for WebAssembly that helps Emscripten in this area: it optimizes WebAssembly and helps compile to it. It's being written as a separate project for modularity.

For more background, see

Building to WebAssembly

Setup

Things are still in progress, so the process for building WebAssembly is not yet fully optimized. Here are the current steps:

  • Get and use emscripten's incoming branch.
  • Build with emcc [..args..] -s WASM=1

Then run the output .js or .html file. If you run the code in a JS engine with WebAssembly support, it will try to use that support. Because the binary format is still in flux, this might fail: what Binaryen emits might not be in sync with what the browser accepts. To force usage of the interpreter, which should always work properly (but slowly), use the following additional flag:

  • -s "BINARYEN_METHOD='interpret-binary'"

Notes:

  • You can use BINARYEN=1 instead of WASM=1; they do the same thing.
  • The WASM, BINARYEN_METHOD, etc. options only matter when compiling to your final executable. In other words, the same .o files are used for both asm.js and WebAssembly. Only when linking them and compiling to asm.js or WebAssembly do you need to specify WebAssembly if you want that. This makes it easy to build your project to both asm.js and WebAssembly.

Binaryen methods

When using Binaryen with Emscripten, it can load the compiled code using one of several methods. By setting -s BINARYEN_METHOD='..' you can specify those methods as a comma-separated list (note: on the command line you might need to quote twice, -s "BINARYEN_METHOD='..'"). It will try them one by one, in the order given, which allows fallbacks.

By default, it will try native support. The full list of methods is:

  • native-wasm: Use native binary wasm support in the browser.
  • interpret-s-expr: Load a .wast, which contains wasm in s-expression format, and interpret it.
  • interpret-binary: Load a .wasm, which contains wasm in binary format, and interpret it.
  • interpret-asm2wasm: Load .asm.js, compile to wasm on the fly, and interpret that.
  • asmjs: Load .asm.js and just run it, no wasm. Useful for comparisons, or as a fallback for browsers without WebAssembly support.

For more details, see src/js/wasm.js-post.js in the Binaryen repo. The function integrateWasmJS is where all the integration between JavaScript and WebAssembly happens.
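As a rough illustration of the try-in-order behaviour described above, the following JavaScript sketch walks a comma-separated method list and stops at the first method that succeeds. It is not the code in integrateWasmJS: the function name tryMethods and the handlers object are hypothetical, made up for this example.

```javascript
// Hypothetical sketch: NOT the real integrateWasmJS logic.
// `handlers` maps a method name to a function that returns the loaded
// exports on success, or returns a falsy value (or throws) on failure.
function tryMethods(methodList, handlers) {
  var methods = methodList.split(',');
  for (var i = 0; i < methods.length; i++) {
    var name = methods[i].trim();
    var handler = handlers[name];
    if (typeof handler !== 'function') continue; // unknown method: skip it
    try {
      var result = handler();
      if (result) return { method: name, exports: result };
    } catch (e) {
      // This method failed; fall through to the next one in the list.
    }
  }
  return null; // no method worked
}

// Usage: pretend native support is missing, so asm.js is used instead.
var outcome = tryMethods('native-wasm,asmjs', {
  'native-wasm': function () { throw new Error('no native wasm here'); },
  'asmjs': function () { return { main: function () { return 0; } }; }
});
console.log(outcome.method); // prints "asmjs"
```

The order of the list matters: swapping it to 'asmjs,native-wasm' would make the sketch pick asm.js even when native support is available.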

Codegen effects

Note that the methods you specify affect what is emitted. For example, -s "BINARYEN_METHOD='native-wasm,asmjs'" will try native support, and if that fails, will use asm.js. This avoids using the WebAssembly polyfill interpreter in both cases, so the interpreter won't be linked into your code.

Another effect is that if you specify asmjs as one of the methods, then you will get a "compromise" build:

  • Some WebAssembly-specific optimizations will be prevented (like native i64s).
  • Build times will be slower than a WebAssembly-only build (in which we would use only the fast Binaryen optimizer; but for asm.js, we can't do that).
  • The asm.js code will be marked "almost asm" instead of "use asm", as a build that does either WebAssembly or asm.js will use a WebAssembly Memory during startup, and that is not compatible with asm.js optimizations.

As a result, if you want maximal performance, instead of using native-wasm,asmjs (which would try WebAssembly and fall back to asm.js if necessary), you can create two separate builds as described earlier, and run the asm.js one if WebAssembly is not present in the user's browser.
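One way to choose between two separate builds at load time is a small feature check in the page. The sketch below assumes the two builds were emitted as app.wasm.js and app.asm.js; those file names, and the helper pickBuild, are hypothetical and not something emcc produces by default.

```javascript
// Hypothetical file names: substitute the outputs of your own two builds.
var WASM_BUILD = 'app.wasm.js';
var ASMJS_BUILD = 'app.asm.js';

// Returns the script to load, given the global object to inspect.
function pickBuild(globalObject) {
  var wasm = globalObject.WebAssembly;
  var hasWasm = typeof wasm === 'object' && wasm !== null &&
                typeof wasm.instantiate === 'function';
  return hasWasm ? WASM_BUILD : ASMJS_BUILD;
}

// In a browser, load the chosen build by adding a script tag.
// (Skipped when there is no DOM, e.g. when run in a shell.)
if (typeof document !== 'undefined') {
  var script = document.createElement('script');
  script.src = pickBuild(window);
  (document.body || document.documentElement).appendChild(script);
}
```

This only checks that the WebAssembly API is present. It does not guard against a browser whose wasm support is present but out of sync with the emitted binary.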

Binaryen codegen options

Precise mode

By default emscripten will emit code that behaves precisely the same as asm.js. However, WebAssembly has slightly different semantics than JavaScript and asm.js, and we need to work around those differences. That can add overhead, which you can avoid by passing

-s BINARYEN_IMPRECISE=1

The difference between precise and imprecise mode is in areas that are considered undefined behavior in C and C++, like trying to round a very large float to an integer. WebAssembly will trap on such things. As a result, in precise mode we make sure to behave like most native platforms and asm.js do (which is to do some reasonable default behavior for undefined float rounding), while in imprecise mode if you hit such a float, you will see a trap and the application will halt.
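To make the difference concrete, the JavaScript sketch below contrasts the two behaviours for a float-to-int conversion. jsToInt32 uses the ordinary JavaScript (and asm.js) coercion, which never fails. trappingToInt32 is a hypothetical model of a trapping conversion, written for illustration only: it throws where a signed 32-bit wasm truncation would trap. Neither function is Emscripten's actual implementation.

```javascript
// asm.js / JavaScript semantics: `x | 0` applies ToInt32, which never fails.
function jsToInt32(x) {
  return x | 0;
}

// Hypothetical model of a trapping conversion: throw where wasm would trap
// (NaN, or a truncated value outside the signed 32-bit range).
function trappingToInt32(x) {
  var t = Math.trunc(x);
  if (t !== t || t < -2147483648 || t > 2147483647) {
    throw new RangeError('float out of int32 range: ' + x);
  }
  return t | 0;
}

console.log(jsToInt32(3.7));   // prints 3
console.log(jsToInt32(1e10));  // prints 1410065408 (wrapped modulo 2^32)
console.log(jsToInt32(NaN));   // prints 0
console.log(trappingToInt32(3.7)); // prints 3
// trappingToInt32(1e10) and trappingToInt32(NaN) would throw instead.
```

For in-range values the two agree; they differ only on inputs that are undefined behavior in C and C++ to begin with.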

The overhead of precise mode can be noticeable in benchmarks, so for anything speed-intensive you should use imprecise mode. There is some debate as to which mode should be on by default: https://github.com/kripken/emscripten/issues/4625

Compiler output

When using emcc to build to WebAssembly, you will see a .wasm file containing that code, as well as the usual .js file that is the main target of compilation. Those two are built to work together: run the .js (or .html, if that's what you asked for) file, and it will load and set up the WebAssembly code for you, properly setting up imports and exports for it, etc. Basically, you don't need to care about whether the compiled code is asm.js or WebAssembly; it's just a compiler flag, and otherwise everything should just work (except the WebAssembly should be faster).

  • Note that the .wasm file is not standalone - it's not easy to manually run it without that .js code, as it depends on getting the proper imports that integrate with JS. For example, it receives imports for syscalls so that it can do things like print to the console. There is work in progress towards ways to create standalone .wasm files, see the WebAssembly Standalone page.
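The dependence on imports can be seen with a tiny hand-written module. The sketch below is not Emscripten output: it is a made-up binary that imports a single function f from a module named env, and it assumes a JS engine that accepts version 1 of the binary format. Instantiation fails unless the caller supplies that import, which is the job the generated .js file does for a real build (with many more imports).

```javascript
// A tiny hand-written wasm binary that imports one function, env.f.
// (Made-up module for illustration; Emscripten output imports far more.)
var bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d,             // magic: "\0asm"
  0x01, 0x00, 0x00, 0x00,             // binary format version 1
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00, // type section: one type, () -> ()
  0x02, 0x09, 0x01,                   // import section: one import
  0x03, 0x65, 0x6e, 0x76,             //   module name "env"
  0x01, 0x66,                         //   field name "f"
  0x00, 0x00                          //   kind: function, type index 0
]);

var wasmModule = new WebAssembly.Module(bytes);

// Returns true if the module can be instantiated with the given imports.
function canInstantiate(imports) {
  try {
    new WebAssembly.Instance(wasmModule, imports);
    return true;
  } catch (e) {
    return false;
  }
}

console.log(canInstantiate({}));                             // prints false
console.log(canInstantiate({ env: { f: function () {} } })); // prints true
```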

You may also see additional files generated, like a .data file if you are preloading files into the virtual filesystem. All that is exactly the same as when building to asm.js. One difference you may notice is the lack of a .mem file, which for asm.js contains the static memory initialization data, which in WebAssembly we can pack more efficiently into the WebAssembly binary itself.

Testing native WebAssembly in browsers

WebAssembly isn't released yet, so it isn't enabled by default. But you can test it in development builds:

  • In Firefox, use Nightly and set javascript.options.wasm in about:config.
  • In Chrome, use Canary and enable chrome://flags/#enable-webassembly.

Debugging

asm.js support is considered very stable now, and you can change between it and wasm with -s WASM=1, so if you see something odd in a wasm build, comparing to a parallel asm.js build can help. In general, any difference between the two could be a compiler bug or browser bug, but there are a few legitimate causes of different behavior between the two, that you may want to rule out:

  • wasm allows unaligned accesses, i.e. it will load 4 bytes from an unaligned address the same way x86 does (it doesn't care that it's unaligned). asm.js works more like ARM CPUs, which mostly don't accept such things (but they often trap, while asm.js just returns a wrong result). To rule this out, you can build with -s SAFE_HEAP=1, which will catch all such invalid accesses.
  • Timing issues - wasm might run faster or slower. To some extent you can mitigate that by building with -s DETERMINISTIC=1.
  • Precise/imprecise mode. As mentioned above, we generate wasm that should behave precisely like asm.js by default, but as an optimization you can change that (by enabling imprecise mode). Make sure you aren't doing that when doing such comparisons.
  • Minor libc and runtime differences. To eliminate any possible difference due to that, use builds that support both, i.e. use the same runtime etc. for both approaches, using e.g. -s "BINARYEN_METHOD='native-wasm,asmjs'" for a build that can do both, but defaults to wasm, and -s "BINARYEN_METHOD='asmjs,native-wasm'" for what is an identical build that does asm.js first. (In fact, since the builds are identical, you can make one and edit the native-wasm,asmjs string manually in the generated JS, to switch between asm.js and wasm.) Note: Such builds disable some optimizations, as mentioned above, so it's not a good idea in general.
  • Browser instability: It's worth testing multiple browsers, as one might have a wasm bug that another doesn't. You can also test the Binaryen interpreter (e.g. using the interpret-binary method, as discussed above).
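The unaligned-access difference in the list above can be reproduced in plain JavaScript. In this sketch, loadUnaligned models a wasm-style load, and loadAsmjsStyle models what an asm.js heap access of the form HEAP32[ptr >> 2] does: the shift discards the low two bits of the address. Both helper names are made up for this example.

```javascript
var buffer = new ArrayBuffer(8);
new Uint8Array(buffer).set([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88]);
var view = new DataView(buffer);

// wasm-style load: reads the 4 bytes starting exactly at `ptr`.
function loadUnaligned(ptr) {
  return view.getInt32(ptr, true); // true = little-endian
}

// asm.js-style load: HEAP32[ptr >> 2] drops the low 2 address bits,
// so it reads from the aligned address at or below `ptr`.
function loadAsmjsStyle(ptr) {
  return view.getInt32((ptr >> 2) << 2, true);
}

console.log(loadUnaligned(1).toString(16));  // prints "55443322"
console.log(loadAsmjsStyle(1).toString(16)); // prints "44332211"
```

For an aligned pointer such as 4, both functions return the same value, which is why code that only makes aligned accesses behaves identically in the two builds.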

Notes

  • Build with EMCC_DEBUG=1 in the env to see Emscripten's debug output as it runs the various tools, and also to save the intermediate files in /tmp/emscripten_temp. It will save both the .s and .wast files there (in addition to other files it normally saves).

Known issues

  • Closure compiler (--closure to emcc) makes a few emscripten test suite cases fail with Binaryen.

New WebAssembly backend + Binaryen's s2wasm

The steps in the previous section all use Binaryen's asm2wasm tool to compile asm.js to WebAssembly. This option is considered stable as it passes the test suite.

There is a new LLVM backend being developed for WebAssembly. It will eventually provide another way to compile to wasm, but it is not yet ready; see the new wasm backend web page.