Fast validation by kripken · Pull Request #1204 · WebAssembly/binaryen

kripken · 2017-09-29T00:06:00Z

This makes wasm validation parallel (the per-function part). Loading+validating tanks (a 12MB wasm file) becomes 2.3x faster on a 4-core machine (from 3.5 to 1.5 seconds). It's a big speedup because most of loading+validating was actually validating.
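A minimal sketch of the idea, not binaryen's actual code: if each function can be checked without touching any other function, worker threads can pull function indices from a shared atomic counter. `Func`, `validateFunction`, and `validateAll` are hypothetical names invented for illustration. The fallback when a thread cannot be created mirrors the last commit in this PR, but the mechanism shown here is an assumption.

```cpp
#include <atomic>
#include <cstddef>
#include <system_error>
#include <thread>
#include <vector>

// Hypothetical stand-in for a wasm function; not binaryen's Function class.
struct Func {
  int declaredResults; // what the signature promises
  int actualResults;   // what the body produces
};

// Hypothetical per-function check. It reads only its own function,
// which is what makes running it on several threads safe.
inline bool validateFunction(const Func& f) {
  return f.declaredResults == f.actualResults;
}

// Validates every function, using up to `numThreads` threads in total
// (the calling thread counts as one of them).
bool validateAll(const std::vector<Func>& funcs, unsigned numThreads) {
  std::atomic<std::size_t> next{0}; // index of the next unclaimed function
  std::atomic<bool> valid{true};
  auto worker = [&]() {
    for (;;) {
      std::size_t i = next.fetch_add(1);
      if (i >= funcs.size()) return;
      if (!validateFunction(funcs[i])) valid = false;
    }
  };
  std::vector<std::thread> threads;
  for (unsigned t = 1; t < numThreads; t++) {
    try {
      threads.emplace_back(worker);
    } catch (const std::system_error&) {
      break; // could not create a thread: continue with the ones we have
    }
  }
  worker(); // the calling thread takes work too
  for (auto& th : threads) th.join();
  return valid;
}
```

Calling `validateAll(funcs, 4)` on a 4-core machine would spread the checks over four threads. Module-level checks that look across functions would still need a single-threaded pass.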

It's also noticeable during compilation, since we validate by default at the end: 8% faster on -O2 and 23% on -O0, so it's fairly significant on -O0 builds.

As a bonus, this PR also moves the code from being 99% in the header to being 1% in the header, which I think @dschuff will appreciate ;)

binji · 2017-09-29T18:06:38Z

This is a bit surprising to me, since wabt's validation is actually one of the faster parts of loading. I understand binaryen's validation is different, but I assumed it was still single pass, right?

I annotated wasm2wat w/ timings when loading tanks:

read file: 0.0144852s
read binary: 0.684837s
validate: 0.190429s
apply names: 0.116855s
write wat: 1.28688s

These probably could be a lot faster actually... :-} But the point is that validation is actually pretty small.

kripken · 2017-09-29T18:19:28Z

Yeah, this is a little weird I guess. Binaryen does some fast validation during reading, like a wasm reader would, but most of the work is done after reading, on the IR. And we validate some internal IR details (like unreachable types being correct, node types not being stale, nodes appearing only once in the IR, etc.), so it's maybe more like LLVM's validation than wasm binary validation in that respect. And those haven't been much optimized - I think there are some places where it might not be linear. So it's both doing more work and not doing it super-efficiently ;)
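To make one of those internal IR checks concrete, here is a hedged sketch of "nodes appearing only once in the IR". The `Node` type and `eachNodeAppearsOnce` are hypothetical and far simpler than binaryen's `Expression`. A hash set keeps the walk at expected linear time, which is an assumption about how such a check could be written, not a description of the existing validator.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical minimal IR node; binaryen's Expression is far richer.
struct Node {
  std::vector<Node*> children;
};

// Returns true if every node reachable from `root` is reached exactly once,
// i.e. the IR is a tree and no node is shared between two parents.
bool eachNodeAppearsOnce(Node* root) {
  std::unordered_set<Node*> seen;
  std::vector<Node*> stack{root};
  while (!stack.empty()) {
    Node* curr = stack.back();
    stack.pop_back();
    if (!curr) continue;                         // tolerate missing children
    if (!seen.insert(curr).second) return false; // second visit: shared node
    for (Node* child : curr->children) stack.push_back(child);
  }
  return true;
}
```

A node used as both operands of the same parent, or reachable through two different parents, makes the function return false.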

For now, parallelizing is pretty easy to do and gives a big speedup. We should also probably give an option to not validate, like I think LLVM does.

And thanks for those numbers, that's interesting. Seems like binaryen loading is in the right ballpark but still kind of slow. Btw, is all that single-threaded in wabt?

cc @yurydelendik - @binji's wabt numbers on wasm loading might interest you as you were measuring related things I think.

binji · 2017-09-29T18:34:35Z

Yeah, all single-threaded for now. It's also worth noting that when I enable expression folding it gets quite a bit slower (unoptimized, so probably could improve). I assume binaryen is doing something like this when creating the IR (what I call "read binary" here), so maybe that explains some additional speed differences.

read file: 0.0145758s
read binary: 0.672801s
validate: 0.158918s
apply names: 0.0924712s
write wat: 2.66662s

kripken · 2017-09-29T19:43:35Z

What do you mean by expression folding?

binji · 2017-09-29T19:53:52Z

Turning stuff like this:

i32.const 0
i32.const 1
i32.add

into this:

(i32.add (i32.const 0) (i32.const 1))

That's what the spec calls it, anyway: https://webassembly.github.io/spec/text/instructions.html#folded-instructions
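A rough sketch of that transformation, under simplifying assumptions: every instruction pushes exactly one result, and each one declares how many operands it pops. `Instr` and `fold` are hypothetical names and this is not wabt's implementation, which also has to deal with blocks and with instructions that produce no value.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flat instruction: its text and how many operands it pops.
// Every instruction here is assumed to push exactly one result.
struct Instr {
  std::string text;
  int arity;
};

// Folds a flat, stack-ordered instruction sequence into nested S-expressions.
// Returns one string per value left on the stack, or an empty vector if an
// instruction needs more operands than are available.
std::vector<std::string> fold(const std::vector<Instr>& instrs) {
  std::vector<std::string> stack;
  for (const Instr& in : instrs) {
    if (stack.size() < static_cast<std::size_t>(in.arity)) return {};
    std::string folded = "(" + in.text;
    auto firstOperand = stack.end() - in.arity;
    for (auto it = firstOperand; it != stack.end(); ++it) folded += " " + *it;
    folded += ")";
    stack.erase(firstOperand, stack.end()); // operands are now nested inside
    stack.push_back(folded);
  }
  return stack;
}
```

Feeding it the three instructions from the example (`i32.const 0`, `i32.const 1`, then `i32.add` with arity 2) leaves a single entry, `(i32.add (i32.const 0) (i32.const 1))`. The extra string building per instruction gives a feel for why the folded output takes more work to write.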

kripken · 2017-09-29T19:58:44Z

I see, yeah. Makes sense that takes more work.

jgravelle-google · 2017-10-02T18:05:22Z

-  void validateMemBytes(uint8_t bytes, WasmType type, Expression* curr);
-  void validateBinaryenIR(Module& wasm);
+struct WasmValidator {
+  bool validate(Module& module, bool validateWeb = false, bool validateGlobally = true, bool quiet = false);


Three bool args make me think this should be flags.

enum ValidatorFlags {
  ValidateWeb = 1 << 0,
  ValidateGlobally = 1 << 1,
  ValidateQuiet = 1 << 2
}
//...
bool WasmValidator::validate(Module& module, ValidatorFlags flags);
//...
validator.flags(wasm, ValidateGlobally | ValidateQuiet);

kripken · 2017-10-02

Yeah, makes sense. How about if I do that in a followup? (I'm trying to change the API as little as possible in this PR)

jgravelle-google · 2017-10-02

Sounds reasonable.
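For reference, a compilable sketch of the flags idea from this review thread. The flag names follow the suggestion, but the `operator|` overload and the `has` helper are assumptions added here, and the follow-up PR may well have chosen a different shape. The overload matters because OR-ing two unscoped enum values yields an integer, which does not convert back to the enum type implicitly.

```cpp
#include <cstdint>

// Flag names taken from the review suggestion; the rest is illustrative.
enum ValidatorFlags : uint32_t {
  ValidateWeb = 1 << 0,
  ValidateGlobally = 1 << 1,
  ValidateQuiet = 1 << 2
};

// Combine flags while keeping the enum type, so the result can be passed
// to a function that takes ValidatorFlags.
inline ValidatorFlags operator|(ValidatorFlags a, ValidatorFlags b) {
  return static_cast<ValidatorFlags>(static_cast<uint32_t>(a) |
                                     static_cast<uint32_t>(b));
}

// Hypothetical helper: is `bit` set in `flags`?
inline bool has(ValidatorFlags flags, ValidatorFlags bit) {
  return (static_cast<uint32_t>(flags) & static_cast<uint32_t>(bit)) != 0;
}
```

Compared with three positional bools, a call site such as `ValidateGlobally | ValidateQuiet` names each option it turns on, so arguments cannot be swapped by accident.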

kripken added 8 commits September 29, 2017 14:14

28c5258 wip
b508032 builds
15482ac really builds
720e9ec deadlock
8b54074 fix
1883f41 rewrite parallel logging logic
24e9f0f don't use multiple threads in torture tests, which are parallel anyhow
38e6c31 if we fail to create a thread, don't use multiple threads

kripken force-pushed the fast-validation branch from f567bc5 to 38e6c31 Compare September 29, 2017 21:17

kripken mentioned this pull request Sep 29, 2017

Thread fixes #1205

Merged

jgravelle-google reviewed Oct 2, 2017


kripken merged commit 1f8d8a5 into master Oct 2, 2017

kripken deleted the fast-validation branch October 2, 2017 20:52

kripken mentioned this pull request Oct 2, 2017

Refactor validator API to use enums #1209

Merged
