Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory growth and JS #82

Closed
kripken opened this issue Aug 28, 2019 · 22 comments
Closed

Memory growth and JS #82

kripken opened this issue Aug 28, 2019 · 22 comments
Labels
discussion A discussion that doesn't yet have a specific conclusion or actionable proposal.

Comments

@kripken
Copy link
Member

kripken commented Aug 28, 2019

It looks like the current wasi libc implements sbrk using the clang builtin to grow, and there isn't a wasi API for growth. I think this may be a problem for a JS embedding (that is, running a wasi program with JS implementing the wasi APIs etc.), as any JS views on the buffer used in the wasm Memory will become invalid - they don't resize automatically, and must be manually recreated. In particular I think the current Web polyfill for wasi probably doesn't fully work with memory growth.

There is no event callback for when a Memory grows, but even if there were, it wouldn't be enough, just like with pthreads - the event would happen on a later JS event loop iteration, and not when we need it.

One possible solution here would be to add an API to wasi that either does the growth (__wasi_grow_memory?), or that notifies the runtime about the growth (__wasi_notify_memory_growth?).

@sbc100
Copy link
Member

sbc100 commented Aug 28, 2019

The solution I used my tiny wasi.js that we use of the waterfall is to call checkHeap at the start of each syscall that needs memory: https://github.com/WebAssembly/waterfall/blob/fe3feca48ae596780282d9fc36f876b9a3131688/src/wasi.js#L76

@devsnek
Copy link
Member

devsnek commented Aug 28, 2019

@pchickey
Copy link
Contributor

pchickey commented Aug 28, 2019

We even do the equivalent in Lucet - this isn't restricted to just JS embeddings https://github.com/fastly/lucet/blob/master/lucet-runtime/lucet-runtime-internals/src/vmctx.rs#L142

@kripken
Copy link
Member Author

kripken commented Aug 28, 2019

It's possible at the beginning of every call (into the runtime and out to wasm), yeah - emscripten does that for pthreads + memory growth. But it's pretty slow... I think it would be nice to have a proper API for this.

Another option is for the VM to instrument the wasm before running it, replacing every memory growth instruction with a notification. But that has other downsides...

@devsnek
Copy link
Member

devsnek commented Aug 28, 2019

I don't think this is an inherent limitation of wasi or wasm, but it might be a limitation of the js api. A native implementation can just have an api where native functions are provided the current memory. Perhaps something worth requesting on the JS api is a "memory update" callback.

@kripken
Copy link
Member Author

kripken commented Aug 28, 2019

@devsnek - I think a normal JS event callback would happen in a later frame, which is too late (see note earlier). Maybe a nonstandard callback that happens synchronously could work? But I suspect that would be controversial on the Web.

@devsnek
Copy link
Member

devsnek commented Aug 28, 2019

@kripken yes i was imagining a function passed directly to instantiation, so it would be called synchronously.

WebAssembly.instantiate(module, imports, {
  onMemoryUpdate: () => {},
});

I think we could add the callback in https://webassembly.github.io/spec/js-api/index.html#reset-the-memory-buffer

@alexcrichton
Copy link
Contributor

Memory growth doesn't necessarily always happen in the standard library (e.g. wasi-libc or something like that) but rather since it's just an instruction any function can execute it. (or rather any particular library). As a result if we were to do something like this then for a solution like __wasi_grow_memory there'd need to be a postprocessing pass to wire up memory.grow instructions to that intrinsic, or with __wasi_notify_memory_growth it'd still have to be a postprocessing pass to inject after memory.grow instructions. Either way though it seems weird to have to postprocess a module like this to me?

I'd personally be more in favor of saying "the JS side has to check" and do what it does today with comparing ArrayBuffer instances before handing out views.

@kripken
Copy link
Member Author

kripken commented Aug 28, 2019

@alexcrichton I'm skeptical of the JS side having to check because it means every single location that calls into wasm needs extra work, and if you forget one you get breakage later. That would make it much more clumsy to use wasi binaries from JS...

@devsnek I'm also skeptical of a synchronous callback - is that common on the Web? Offhand the only example I can think of is sorting, and it's not even a real callback there, since the entire thing happens immediately...

@guybedford
Copy link
Contributor

@kripken script load callbacks on the web are synchronous.

@kripken
Copy link
Member Author

kripken commented Aug 28, 2019

@guybedford Interesting, thanks, I guess I'm not enough of a web dev to know this stuff :) Reading this page I don't see anything clear enough on it being synchronous, or maybe I don't understand what you mean by "synchronous" here? Or am I looking in the wrong place?

@devsnek
Copy link
Member

devsnek commented Aug 28, 2019

@kripken splitting your question into two parts:

EventTarget (web) and EventEmitter (node) dispatch events synchronously, but calls dispatch those events are generally queued via the event loop. The reason the event loop is used is generally because its reacting to things that happen while js is running (users can click buttons even when js is running), and both node and the web have a rule about state not changing observably in the js thread while js is running.

the wasm spec doesn't assume anything about the environment besides it being javascript, so we can't even use EventTarget or EventEmitter anyway, and so i think any precedence using them might bring is moot.

If the call stack is already in js land, i don't think it matters that much whether the callback is async or not.

@sbc100
Copy link
Member

sbc100 commented Aug 28, 2019

@kripken are you worried about the cost of the if (mem.buffer) at the start of each JS syscall? Or are you worried about all the other JS functions that emscripten users might want to expose to wasm?

If its the latter then presumably we can wrap all JS functions that we export to wasm a helper that does this? It would be useful to have that ability anyway for other things such as tracing that wasm boundary.

@kripken
Copy link
Member Author

kripken commented Aug 28, 2019

Thanks @devsnek! It does sound like that may be a good option then. I opened WebAssembly/design#1296 , please let me know if I wrote this up ok.

@sbc100 yeah, but not just syscalls - any time you enter JS, if wasm might have run, you need to do that, and any time right after you call into wasm and return from there.

So if someone loads a wasi wasm file in JS, and interleaves calling exports from there with JS looking at the memory, bugs can easily happen. Like imagine box2d.wasm,

jsGlue.addObject(); // JS looking at views
box2D.calculatePhysics(); // wasm runs
jsGlue.readWorldState(); // JS looking at views

For that to be correct, we'd need to check if memory grew right after that wasm call. In other words, users need to be very careful...

@sbc100
Copy link
Member

sbc100 commented Aug 28, 2019

@kripken could we not do that automatically since we control the list of functions we give to wasm that the functions we get back.

It would mean that the functions we expose to user JS are not that actual wasm functions but wrappers. And the JS function we give to wasm would also be wrappers.

This is obviously overhead, I'm just not sure how much given that the crossing from JS to wasm presumably already has a fair amount.

@kripken
Copy link
Member Author

kripken commented Aug 28, 2019

@sbc100 Yes, some automatic instrumentation is possible here. But it means that every user that uses wasi in JS must remember to do that, and do it properly - this isn't just for emscripten users!

@sbc100
Copy link
Member

sbc100 commented Aug 28, 2019

Emscripten is a little different in that its users tend to expose arbitrary functions to their wasm programs. In the WASI world the idea is that the program will only be exposed to the WASI syscall APIs, not arbitrary host functions.

I'm not saying the current situation is ideal. Its certainly on overhead for JS embedders that they have to handle this, but hopefully its something that can be done at embedder level not something that individual developer or app will need to care about.

@kripken
Copy link
Member Author

kripken commented Aug 28, 2019

In the WASI world the idea is that the program will only be exposed to the WASI syscall APIs, not arbitrary host functions.

Will wasi programs not want to provide exports that can be called from JS?

@sunfishcode
Copy link
Member

At the moment, all WASI programs are Commands, which means they just have a _start function that needs to get called. We're working on establishing Reactors and Libraries as two additional forms of WASI programs, which would support calls from the outside, but at the moment those concepts are still being developed.

@devsnek
Copy link
Member

devsnek commented Aug 29, 2019

Another solution might be interface types. If wasi apis passed live memory views instead of pointers, this wouldn't be a problem anymore.

@skoppe
Copy link

skoppe commented Oct 23, 2019

There is no event callback for when a Memory grows

I have had the same problem in skoppe/spasm and - like everyone else - ended up using a check at the beginning of each JS function.

In my case I was statically linking imgui (compiled to wasm with wasi-libc) with a webgl backend written in Dlang. Both the D program and wasi-libc were issuing calls to the memory grow intrinsic, at which point I realised I couldn't intercept calls to the grow intrinsic in the D program, since it would miss those issued from wasi-libc.

So I ended up injecting memory checks in each js function. It didn't feel right when I wrote it, and it still doesn't.

While I understand that a reallocation invalidates pointers, I am of the opinion that the wasm runtime ought to fix that pointer after a memory grow.

To check whether the typed js array is still valid I check the length property, which equals 0 when the memory is invalid. This suggests to me there is already some bounds checking and memory validation code whenever JS accesses the typed array. If that is the case, having the wasm runtime update the typed array when a memory grow happens should cost very little.

I stand corrected. That is not what is happening. I now favor kripken's proposal to add load/store functions on the WebAssembly.Memory object, as outlined in https://gist.github.com/kripken/949eab99b7bc34f67c12140814d2b595.

@linclark
Copy link
Member

This has been filed as a core Wasm issue, and it makes more sense to handle it at the Wasm level than WASI, so I'll go ahead and close this one out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion A discussion that doesn't yet have a specific conclusion or actionable proposal.
Projects
None yet
Development

No branches or pull requests

9 participants