|
This is a bit surprising to me, since wabt's validation is actually one of the faster parts of loading. I understand binaryen's validation is different, but I assumed it was still single pass, right? I annotated These probably could be a lot faster actually... :-} But the point is that validation is actually pretty small. |
|
Yeah, this is a little weird I guess. Binaryen does some fast validation during reading, like a wasm reader would, but most of the work is done after reading, on the IR. And we validate some internal IR details (like unreachable types being correct, node types not being stale, nodes appearing only once in the IR, etc.), so it's maybe more like LLVM's validation than wasm binary validation in that respect. And those haven't been much optimized - I think there are some places where it might not be linear. So it's both doing more work and not doing it super-efficiently ;) For now, parallelizing is pretty easy to do and gives a big speedup. We should also probably give an option to not validate, like I think LLVM does. And thanks for those numbers, that's interesting. Seems like binaryen loading is in the right ballpark but still kind of slow. Btw, is all that single-threaded in wabt? cc @yurydelendik - @binji's wabt numbers on wasm loading might interest you as you were measuring related things I think. |
|
Yeah, all single-threaded for now. It's also worth noting that when I enable expression folding it gets quite a bit slower (unoptimized, so probably could improve). I assume binaryen is doing something like this when creating the IR (what I call "read binary" here), so maybe that explains some additional speed differences. |
|
What do you mean by expression folding? |
|
Turning stuff like this: into this: That's what the spec calls it, anyway: https://webassembly.github.io/spec/text/instructions.html#folded-instructions |
|
I see, yeah. Makes sense that takes more work. |
void validateMemBytes(uint8_t bytes, WasmType type, Expression* curr);
void validateBinaryenIR(Module& wasm);
struct WasmValidator {
  bool validate(Module& module, bool validateWeb = false, bool validateGlobally = true, bool quiet = false);
Three bool args make me think this should be flags.
enum ValidatorFlags {
  ValidateWeb = 1 << 0,
  ValidateGlobally = 1 << 1,
  ValidateQuiet = 1 << 2
};
//...
// flags is a bitwise OR of ValidatorFlags values, so it is taken as an integer
bool WasmValidator::validate(Module& module, uint32_t flags);
//...
validator.validate(wasm, ValidateGlobally | ValidateQuiet);
Yeah, makes sense. How about if I do that in a followup? (I'm trying to change the API as little as possible in this PR)
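One wrinkle worth noting for that followup: with a plain C++ enum, `ValidateGlobally | ValidateQuiet` evaluates to an int, which does not convert back to the enum type implicitly. If the parameter is typed as the enum itself, an overloaded `operator|` is one way to keep call sites tidy. Below is a minimal sketch of that option. `ValidateNone`, `hasFlag`, `ValidatorOptions` and `decodeFlags` are hypothetical names made up for illustration, not anything in Binaryen.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag set mirroring the three bool parameters of validate().
enum ValidatorFlags : uint32_t {
  ValidateNone     = 0,
  ValidateWeb      = 1 << 0,
  ValidateGlobally = 1 << 1,
  ValidateQuiet    = 1 << 2
};

// A plain enum decays to an integer under |, so restore the enum type here.
constexpr ValidatorFlags operator|(ValidatorFlags a, ValidatorFlags b) {
  return static_cast<ValidatorFlags>(static_cast<uint32_t>(a) |
                                     static_cast<uint32_t>(b));
}

// True if the given single flag is set in flags.
constexpr bool hasFlag(ValidatorFlags flags, ValidatorFlags flag) {
  return (static_cast<uint32_t>(flags) & static_cast<uint32_t>(flag)) != 0;
}

// Hypothetical stand-in for the validator's internal settings.
struct ValidatorOptions {
  bool web;
  bool globally;
  bool quiet;
};

// Unpack the flag word into the three options the bools used to carry.
ValidatorOptions decodeFlags(ValidatorFlags flags) {
  // Reject bits outside the three known flags.
  assert((static_cast<uint32_t>(flags) & ~uint32_t(7)) == 0);
  return ValidatorOptions{hasFlag(flags, ValidateWeb),
                          hasFlag(flags, ValidateGlobally),
                          hasFlag(flags, ValidateQuiet)};
}
```

A call site would then read `decodeFlags(ValidateGlobally | ValidateQuiet)`, which is easier to scan than three positional bools. Taking a plain integer parameter instead avoids the operator overload at the cost of some type safety.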
This PR makes wasm validation parallel (the per-function part). Loading+validating tanks (a 12MB wasm file) becomes 2.3x faster on a 4-core machine (from 3.5 to 1.5 seconds). It's a big speedup because most of loading+validating was actually validating.
It's also noticeable during compilation, since we validate by default at the end. 8% faster on -O2 and 23% on -O0. So actually fairly significant on -O0 builds.
As a bonus, this PR also moves the code from being 99% in the header to being 1% in the header, which I think @dschuff will appreciate ;)
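For readers who want the general shape of per-function parallel validation, here is a rough sketch using plain `std::thread`. It is not the code in this PR and does not use Binaryen's own threading or pass infrastructure. `ToyFunction`, `validateFunction` and `validateFunctionsInParallel` are hypothetical names, and the "validation" is a toy check. The idea it illustrates is that each function can be checked independently, so workers just pull the next function index from a shared atomic counter.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a function; not Binaryen's Function class.
struct ToyFunction {
  int declaredResults; // number of results the signature promises
  int producedResults; // number of values the body actually produces
};

// Toy per-function check. It reads only the function it is given,
// which is what makes it safe to run on several threads at once.
bool validateFunction(const ToyFunction& func) {
  return func.declaredResults == func.producedResults;
}

// Validate all functions using numThreads worker threads.
// Returns true only if every function passes.
bool validateFunctionsInParallel(const std::vector<ToyFunction>& funcs,
                                 unsigned numThreads) {
  assert(numThreads > 0);
  std::atomic<std::size_t> next{0}; // index of the next unclaimed function
  std::atomic<bool> valid{true};    // cleared by any worker that sees a failure

  auto worker = [&]() {
    for (;;) {
      std::size_t i = next.fetch_add(1);
      if (i >= funcs.size()) {
        return; // no work left
      }
      if (!validateFunction(funcs[i])) {
        valid.store(false);
      }
    }
  };

  std::vector<std::thread> threads;
  for (unsigned t = 0; t < numThreads; ++t) {
    threads.emplace_back(worker);
  }
  for (auto& thread : threads) {
    thread.join();
  }
  return valid.load();
}
```

Module-level checks (imports, exports, the table, and so on) would still run on a single thread in this scheme; only the function bodies are spread across workers.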