Lazy validation of functions #1464

Open · gahaas opened this issue Nov 9, 2022 · 15 comments

@gahaas (Contributor) commented Nov 9, 2022

Back in 2016, an agreement was made (see #719 (comment)) to allow lazy validation of functions. This means that if a WebAssembly module contains an invalid function, module compilation is still allowed to succeed, but executing the invalid function traps.

My question now is, where is lazy validation of functions mentioned in the spec? I cannot find it anywhere.

If lazy validation was accidentally left out of the spec back then, should we add it now? In V8 we did performance measurements with lazy validation, and especially in combination with lazy compilation it shows interesting performance gains.
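For readers less familiar with the JS API side, here is a minimal sketch of the observable difference (assuming `moduleBytes` holds the module's contents; this is an editorial illustration, not from the original comment). An embedder that wants an eager check can always request it explicitly, while compilation itself may or may not surface errors in never-called functions:

```js
// WebAssembly.validate() checks the module up front (this is also how feature
// detection is typically done, as discussed further down in this thread).
if (!WebAssembly.validate(moduleBytes)) {
  throw new Error("module failed validation");
}

// With lazy validation permitted, compile() may succeed even if some function
// body is invalid; the error would then only surface when that function is called.
const module = await WebAssembly.compile(moduleBytes); // inside an async context
```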

@rossberg (Member) commented Nov 9, 2022

@gahaas, it's specified here.

That said, I know that some of us were actually hoping that we could remove this eventually, since it is gross and was mainly introduced to accommodate a single engine that is no longer maintained. I would be rather sad if V8 were to start picking it up. I understand the advantages of lazy compilation, but does it need lazy validation?

@lukewagner (Member)

+1 to @rossberg's point

@penzn commented Nov 10, 2022

Wouldn't this actually be useful for large libraries? It's not hard to imagine how it would lead to performance improvements.

@eqrion commented Nov 10, 2022

I would be concerned that if lazy validation starts to be widely used in engines, we may eventually find that all engines are required to implement it to remain web-compatible.

A concrete issue would be if a widely used module has a function with a validation error in it, but the function is never called in practice. On an engine with lazy validation, the module would run fine. On an engine with eager validation, the module would fail to compile and the web page would be broken. In order to resolve this, the second engine would need to implement lazy validation even if it doesn't make sense in its compilation model.
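To make that scenario concrete, here is a minimal sketch with hand-assembled module bytes (purely illustrative, not taken from this thread): the module exports a valid function `good` and a function `bad` whose body fails validation because it returns nothing despite a declared `i32` result.

```js
// Illustrative only: a tiny module with one valid and one invalid function body.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,       // type section: () -> i32
  0x03, 0x03, 0x02, 0x00, 0x00,                   // function section: two funcs of type 0
  0x07, 0x0e, 0x02,                               // export section: "good" (func 0), "bad" (func 1)
  0x04, 0x67, 0x6f, 0x6f, 0x64, 0x00, 0x00,
  0x03, 0x62, 0x61, 0x64, 0x00, 0x01,
  0x0a, 0x09, 0x02,                               // code section: two bodies
  0x04, 0x00, 0x41, 0x01, 0x0b,                   // good: i32.const 1; end
  0x02, 0x00, 0x0b,                               // bad: empty body -> validation error
]);

WebAssembly.instantiate(bytes)
  .then(({ instance }) => {
    // Only reachable on an engine that defers validation of function bodies.
    console.log(instance.exports.good()); // 1
    try {
      instance.exports.bad();             // fails only when first invoked
    } catch (e) {
      console.log("deferred failure:", e);
    }
  })
  .catch((e) => console.log("rejected eagerly:", e)); // CompileError under eager validation
```

An eagerly validating engine rejects the whole module up front; a lazily validating one runs `good` just fine and only reports a problem if `bad` is ever called, which is exactly the divergence described above.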

@rrwinterton

There are other concerns too, about validation of the wasm if it is cached. Perhaps this isn't relevant, but a strict validation of the wasm code is going to require more than a hash check. Are these two different validations? They could be, but probably shouldn't be.

@penzn commented Nov 10, 2022

There is only one Wasm validation, but the question is whether or not it is OK to delay validating a function if it is not currently getting called.

On a side note, the engine that introduced this is, strictly speaking, maintained, but it does not meet our requirements for a "web runtime" as it is not part of a browser.

> A concrete issue would be if a widely used module has a function with a validation error in it, but the function is never called in practice. On an engine with lazy validation, the module would run fine.

This can be a way to support standard instructions and instructions from a proposal in a single module, though the check "is the proposal supported" would have to be externalized. However, you are right that it would create a situation where this module is both broken and correct at the same time.

@gahaas do you have any data that you can share about this?

@gahaas (Contributor, Author) commented Nov 14, 2022

@penzn here is some data I measured:

I measured the validation time of a roughly 40 MB Wasm module with a bit more than 100,000 functions, on a 4-core machine (I configured my workstation to use only 4 cores). When validation is done as a separate step, it takes about 125ms. When done as part of compilation, it only adds 22ms of overhead compared to compilation without validation (with validation: 758ms, without validation: 736ms). With lazy compilation, validation has to be done separately, so validation causes around 100ms of overhead. Note that the actual overhead is higher, especially during startup, because most functions don't get executed during startup and would therefore not have to be validated. For the module I measured, only about 20% of the code gets executed during startup.

On weaker devices we see validation times of more than 1 second for the same module.

We see in big applications like Photoshop that, especially during startup, the CPU is a bottleneck. Therefore we try to reduce the CPU time spent on code compilation, validation, and optimization. The advantage of lazy compilation and validation is not just that compilation and validation get postponed; most of that work never happens at all, because many functions never get executed in the first place.
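For reference, here is a rough sketch of how such a split can be approximated from JS (this is not the harness used for the numbers above, and engine-internal phases such as lazy tier-up are not separable this way; `moduleBytes` is assumed to hold the module's contents):

```js
// Rough timing of validation alone vs. validation + compilation.
async function measure(moduleBytes) {
  const t0 = performance.now();
  const valid = WebAssembly.validate(moduleBytes);  // validation only
  const t1 = performance.now();
  await WebAssembly.compile(moduleBytes);           // validation + compilation
  const t2 = performance.now();
  console.log(`valid: ${valid}`);
  console.log(`validate: ${(t1 - t0).toFixed(1)} ms, compile: ${(t2 - t1).toFixed(1)} ms`);
}
```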

@eqrion commented Nov 14, 2022

> I measured the validation time of a roughly 40 MB Wasm module with a bit more than 100,000 functions, on a 4-core machine (I configured my workstation to use only 4 cores). When validation is done as a separate step, it takes about 125ms. When done as part of compilation, it only adds 22ms of overhead compared to compilation without validation (with validation: 758ms, without validation: 736ms).

I may be misunderstanding: what does this number mean? How are you compiling without validating?

> With lazy compilation, validation has to be done separately, so validation causes around 100ms of overhead. Note that the actual overhead is higher, especially during startup, because most functions don't get executed during startup and would therefore not have to be validated. For the module I measured, only about 20% of the code gets executed during startup.

So am I reading this right that in lazy compilation mode for V8 on this module, skipping validation for functions that are not called during startup reduces startup time by 100ms on your machine?

@gahaas (Contributor, Author) commented Nov 14, 2022

> I measured the validation time of a roughly 40 MB Wasm module with a bit more than 100,000 functions, on a 4-core machine (I configured my workstation to use only 4 cores). When validation is done as a separate step, it takes about 125ms. When done as part of compilation, it only adds 22ms of overhead compared to compilation without validation (with validation: 758ms, without validation: 736ms).

> I may be misunderstanding: what does this number mean? How are you compiling without validating?

For the experiment I built my own version of V8 in which no function validation happens. I wanted to measure how much overhead validation introduces when it is done during compilation. So I built two versions of V8, one where validation happens during compilation and one where validation is not done at all. The performance difference between the two configurations was 22ms.

> With lazy compilation, validation has to be done separately, so validation causes around 100ms of overhead. Note that the actual overhead is higher, especially during startup, because most functions don't get executed during startup and would therefore not have to be validated. For the module I measured, only about 20% of the code gets executed during startup.

> So am I reading this right that in lazy compilation mode for V8 on this module, skipping validation for functions that are not called during startup reduces startup time by 100ms on your machine?

I guess I mixed up some thoughts in this paragraph. As I wrote above, if validation is done as part of compilation, it only takes 22ms in my benchmark. If validation is done in a separate step, then validation takes 125ms. With eager validation and lazy compilation, validation has to happen in a separate step, so lazy compilation has to spend 125ms on validation during module initialization. Later during module execution, when functions get compiled lazily, function compilation can be 3% faster because no validation is needed anymore. However, even if all functions of a module were executed, this 3% speedup would only result in savings of 22ms. So altogether, eager validation adds a 125ms performance penalty to lazy compilation while only providing the potential to save 22ms later.
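Restating the trade-off with the numbers above (no new data, just the arithmetic):

```js
// With lazy compilation on this module:
const eagerValidationUpFront = 125; // ms paid at WebAssembly.compile() time
const maxLaterSavings = 22;         // ms saved during lazy compiles, and only if
                                    // every function eventually gets compiled
console.log(`net cost of eager validation >= ${eagerValidationUpFront - maxLaterSavings} ms`); // 103 ms
```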

@ajklein commented Nov 15, 2022

@rossberg @lukewagner Can you say more than "it's gross" about your concerns with lazy validation? If it was acceptable at Wasm's inception, it's not clear to me why it would be less acceptable today.

@lukewagner (Member)

Lazy validation was a compromise to get 4 browsers to agree to ship a 1.0 release; Chakra sort of forced the issue. It seemed fine then because, with only 1 of 4 engines doing the lazy thing, we wouldn't end up in the situation that @eqrion describes above. If we did end up in that state, we'd probably need to specify deterministically-lazy validation semantics. Implementation-wise, I don't think this would be much of a problem (since an AOT compiler can just compile an invalid function body to unreachable). But it would end up indirectly supporting feature detection, wherein a toolchain could intentionally ship a single .wasm with possibly-not-implemented instructions and then import a host-supplied value (say, computed from WebAssembly.validate()) to direct control flow appropriately. But maybe that's a reasonable compromise on the whole feature-detection issue anyway?
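A sketch of that feature-detection pattern (the module names and the `env.hasFeature` import are illustrative, not from an actual toolchain):

```js
// Probe: a tiny module that uses only the possibly-unsupported instructions.
const hasFeature = WebAssembly.validate(featureProbeBytes);

// Main module: contains both code paths and branches on the imported flag,
// so the functions using the unsupported (invalid-to-this-engine) instructions
// are never called when hasFeature is 0. This only works if the engine defers
// validation of those function bodies.
const { instance } = await WebAssembly.instantiate(mainModuleBytes, {
  env: { hasFeature: hasFeature ? 1 : 0 },
});
instance.exports.run();
```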

@rossberg (Member)

For all practical purposes, allowing lazy validation means multiple runtime behaviours: programs that succeed in one place may fail in another. It's essentially non-deterministic.

As @lukewagner says, we could require lazy validation, but that again seems undesirable for many engines and use cases.

@gahaas, if you say it costs 100ms, how much is that relative to the overall initialisation time? Also, has the implementation of validation been optimised for that purpose?

@gahaas (Contributor, Author) commented Nov 18, 2022

@rossberg With lazy compilation, validation accounts for more than 60% of the time spent by WebAssembly.compile, even though validation is reasonably optimized.

I read again through #719, and I think the assumptions made there to reject lazy validation turned out to be wrong:

  1. It was assumed there that all engines would eventually switch to eager AOT compilation, so lazy validation would just be unnecessary complexity, see Function bodies deferred error reporting #719 (comment). So far, however, fast startup has turned out to be much more important than fast peak performance, and the resource consumption of AOT compilation has turned out to be too high. V8 has therefore now switched to lazy compilation.
  2. It was assumed that eager validation is fast enough, see Function bodies deferred error reporting #719 (comment). It turns out that some modules are just really big, and low-end devices are slow. The 125ms I measured above were on a high-end machine; on low-end devices we see validation times of more than 1 second.
  3. It was assumed that lazy validation makes feature detection more difficult. But feature detection uses WebAssembly.validate anyway, so using lazy validation for compilation is not a problem. As @lukewagner even writes above, lazy validation could allow better ways of doing feature detection.

Chakra's arguments for lazy validation, on the other hand, turned out to be correct. Fast startup turned out to be more important than reaching peak performance quickly. Also, most functions don't get executed during startup, or even at all, so validating and compiling them eagerly turned out to be a waste of resources. That's why V8 is now switching to lazy compilation.

@lukewagner (Member)

It does feel a bit unfortunate to back off from what was one of the original wins of wasm in the browser, namely smooth startup based on streaming, parallel AOT compilation. I fully believe that for many or most of the practical use cases you're looking at, lazy compilation is a net win. But it's sad that certain workloads would permanently lose the ability to AOT-compile in the cases where it would have been beneficial.

I wonder if specifying deterministic lazy validation could complement another old idea that we used to discuss for optimizing load time: allowing producer toolchains to indicate which functions to compile eagerly vs. lazily. As a custom section of hints, this quickly opens Pandora's box, and so we haven't. But if we're talking about semantically visible lazy validation anyway, then making lazy validation specifiable per function could perhaps maintain the good parts of the predictable cost model, by allowing engines to meaningfully say when they really do want eager treatment. I'm not sure this is a good idea, but it does seem related to the basic eager-vs-lazy discussion, so I wanted to see if it seemed beneficial to the folks actually measuring this now.
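Purely to illustrate the shape of such a hint (the section name `eager-hints` and its encoding are invented for this sketch, not an actual proposal format):

```js
// Hypothetical: a producer appends a custom section listing function indices
// it wants validated/compiled eagerly. Custom sections are ignored by engines
// that don't understand them, so this stays backwards compatible.
function appendEagerHints(moduleBytes, eagerFuncIndices) {
  const name = new TextEncoder().encode("eager-hints");
  // For simplicity this assumes all lengths and indices fit in a single LEB128 byte (< 128).
  const payload = [name.length, ...name, eagerFuncIndices.length, ...eagerFuncIndices];
  const section = [0x00 /* custom section id */, payload.length, ...payload];
  const out = new Uint8Array(moduleBytes.length + section.length);
  out.set(moduleBytes, 0);
  out.set(section, moduleBytes.length);
  return out;
}
```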

@ppenzin commented Jun 9, 2024

@camio, I remembered we had this discussion after the talk you gave at the CG meeting.

Large applications are one of the cases where this can make a difference. For a "palette"-style GUI app, we don't really expect every tool in it to be used every single time the user opens the app, yet every instruction in every tool has to pass Wasm validation, even if it is never going to be called. @gahaas has some synthetic benchmark data above; you can probably compare that to function and instruction count estimates for the apps you are dealing with.

Another place where this might have an impact is code that distinguishes between x86 and Arm: every function optimized via this technique would have two versions, one of which is bound never to run, yet both have to be validated. For a large kernel library this might be measurable, though I don't think we have data on that.
