Add ability to stop running `wasmtime::Instance` #860

rylev · 2020-01-24T20:12:54Z

Once a wasmtime::Instance is created, the start function will be run. This may lead to the current thread being blocked for a long time. It would be helpful to have the ability to stop or pause a wasmtime::Instance while it is running.

alexcrichton · 2020-01-24T23:42:56Z

I'd be curious to hear others' thoughts on this as well, but two "lightweight" ways to do this could be:

One would not be wasmtime-specific at all: a transformation pass on a *.wasm file that inserts calls to an imported function at "notable locations" like loops or function entries. The imported function would then be called periodically, and a trap from it could signify "time to halt".
Alternatively we could make this wasmtime-specific and provide it as an option during module compilation, where it does roughly the same thing but directly on the cranelift IR: an imported function is called every so often and has the option of trapping.
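
A minimal sketch of the instrumentation idea, written in plain Rust rather than against any Wasmtime or Cranelift API. Everything here is hypothetical: `CheckIn` stands in for the imported host function, `Halt` for the trap, and `guest_sum` for guest code after the pass has inserted a call at the loop header.

```rust
/// Hypothetical stand-in for the trap that signifies "time to halt".
#[derive(Debug, PartialEq)]
pub struct Halt;

/// Hypothetical stand-in for the imported host function.
/// It allows a fixed number of calls and then asks the guest to halt.
pub struct CheckIn {
    remaining: u64,
}

impl CheckIn {
    pub fn new(budget: u64) -> Self {
        CheckIn { remaining: budget }
    }

    /// Called at every "notable location". `Err(Halt)` plays the role of a trap.
    pub fn call(&mut self) -> Result<(), Halt> {
        if self.remaining == 0 {
            return Err(Halt);
        }
        self.remaining -= 1;
        Ok(())
    }
}

/// Stand-in for instrumented guest code: sums 0..n, checking in once per
/// loop iteration, as if a transformation pass had inserted the call.
pub fn guest_sum(n: u64, check_in: &mut CheckIn) -> Result<u64, Halt> {
    let mut total = 0u64;
    let mut i = 0u64;
    while i < n {
        check_in.call()?; // inserted at the loop header
        total += i;
        i += 1;
    }
    Ok(total)
}
```

With a budget of 100 check-ins, `guest_sum(10, ..)` runs to completion; with a budget of 5 it halts partway through. The policy inside `call` (a counter here) is an assumption; a host could just as well consult a clock or a flag.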

Out of curiosity, you say "stop or pause" but I suspect "stop" is going to be a whole lot easier than "pause". Does your use case require the pausing aspect of resuming again later if something was stopped?

eminence · 2020-01-24T23:47:22Z

Related to #712

thomastaylor312 · 2020-01-28T12:03:42Z

I think both would be helpful. There is a use case for pausing a module as well, but for this specific issue, just being able to stop it is the main feature request.

alexcrichton · 2020-02-01T18:05:00Z

Some other notes:

In #43 @sunfishcode says:

cranelift-wasm's FuncEnviron has a translate_loop_header callback which allows embedders to insert custom timeout code at the tops of loops (in addition to the tops of potentially recursive functions). Wasmtime doesn't yet implement timeouts, but this hook is one way it could do so.

In #712 @pchickey says:

Over in Lucet, @iximeow was able to find an implementation bytecodealliance/lucet#150 that makes it possible to terminate running Wasm code due to either timeout or other external triggers, but it is pretty specific to the way stack switching to Wasm code works in Lucet.

We'd like to port Lucet's stack management code into Wasmtime as part of merging the Lucet project in, so I'd expect the kill switch mechanism to be ported in as well.

thomastaylor312 · 2020-02-03T10:06:44Z

@alexcrichton that sounds similar to the path we were starting to go down, so I'll follow along with those issues as well.

moonheart08 · 2020-03-09T16:10:36Z

I personally probably have one of the sillier use cases for the ability to pause/resume (I wanted to put WASM computers in Minecraft as an exercise), but it sheds some light on a few very important use cases for Wasmtime:

The traditional browser use case of having separate, isolated Wasmtime VMs up and running at once
The use case of embedding into another application, say a game, where a hung VM would be disastrous and you're under a strict schedule to get each frame out on time (and as such need a watchdog system, where each VM gets a per-frame budget of how much it may execute before pausing)

fitzgen · 2020-03-09T17:13:57Z

Note that pausing is harder than stopping because you have to save the execution state (stack and all that) so that you can pick it back up again later. We want this eventually since wasm will eventually get stack switching, but we can probably support interrupting and stopping long-running wasm modules sooner.

In addition to the translate loop header stuff, what we would need is a timer thread or something that sets a flag once a module has taken too long, and then the loop header (and function prologues!) would check that flag and do a custom interrupt trap if it is set.
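
The timer-plus-flag scheme described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Wasmtime code: `spin_until_interrupted` is a hypothetical stand-in for a guest stuck in an infinite loop, and the `AtomicBool` load is the check that compiled code would perform at each loop header.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for the custom interrupt trap.
#[derive(Debug, PartialEq)]
pub struct Interrupted;

/// Stand-in for a guest with an infinite loop. The only way out is the
/// flag check at the loop header.
pub fn spin_until_interrupted(flag: &AtomicBool) -> Result<(), Interrupted> {
    loop {
        // The check a compiler would emit at the top of every loop.
        if flag.load(Ordering::Relaxed) {
            return Err(Interrupted);
        }
        std::hint::spin_loop();
    }
}

/// Runs the "guest" on the current thread while a timer thread sets the
/// flag once `timeout` has elapsed.
pub fn run_with_timeout(timeout: Duration) -> Result<(), Interrupted> {
    let flag = Arc::new(AtomicBool::new(false));
    let timer_flag = Arc::clone(&flag);
    let timer = thread::spawn(move || {
        thread::sleep(timeout);
        timer_flag.store(true, Ordering::Relaxed);
    });
    let result = spin_until_interrupted(&flag);
    timer.join().expect("timer thread panicked");
    result
}
```

Calling `run_with_timeout(Duration::from_millis(20))` returns `Err(Interrupted)` shortly after the timer fires. A real implementation would also need the check in function prologues, as noted above, so that unbounded recursion is covered too.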

This commit is a relatively large change for wasmtime with two main goals:

* Primarily this enables interrupting executing wasm code with a trap, preventing infinite loops in wasm code. Note that resumption of the wasm code is not a goal of this commit.
* Additionally this commit reimplements how we handle stack overflow to ensure that host functions always have a reasonable amount of stack to run on. This fixes an issue where we might longjmp out of a host function, skipping destructors.

Lots of various odds and ends end up falling out in this commit once the two goals above were implemented. The strategy for implementing this was also lifted from Spidermonkey and existing functionality inside of Cranelift. I've tried to write up thorough documentation of how this all works in `crates/environ/src/cranelift.rs` where gnarly-ish bits are.

A brief summary of how this works is that each function and each loop header now checks to see if they're interrupted. Interrupts and the stack overflow check are actually folded into one now, where function headers check to see if they've run out of stack and the sentinel value used to indicate an interrupt, checked in loop headers, tricks functions into thinking they're out of stack. An interrupt is basically just writing a value to a location which is read by JIT code.

When interrupts are delivered and what triggers them has been left up to embedders of the `wasmtime` crate. The `wasmtime::Store` type has a method to acquire an `InterruptHandle`, where `InterruptHandle` is a `Send` and `Sync` type which can travel to other threads (or perhaps even a signal handler) to get notified from. It's intended that this provides a good degree of flexibility when interrupting wasm code. Note though that this does have a large caveat where interrupts don't work when you're interrupting host code, so if you've got a host import blocking for a long time an interrupt won't actually be received until the wasm starts running again.

Some fallout included from this change is:

* Unix signal handlers are no longer registered with `SA_ONSTACK`. Instead they run on the native stack the thread was already using. This is possible since stack overflow isn't handled by hitting the guard page, but rather it's explicitly checked for in wasm now. Native stack overflow will continue to abort the process as usual.
* Unix sigaltstack management is now no longer necessary since we don't use it any more.
* Windows no longer has any need to reset guard pages since we no longer try to recover from faults on guard pages.
* On all targets probestack intrinsics are disabled since we use a different mechanism for catching stack overflow.
* The C API has been updated with interrupt handles. An example has also been added which shows off how to interrupt a module.

Closes bytecodealliance#139
Closes bytecodealliance#860
Closes bytecodealliance#900
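
To illustrate how an interrupt can be folded into the stack overflow check, here is a simplified model in plain Rust. It is a sketch under assumptions, not the code from the commit: the `StackLimit` type, the `Trap` enum, the choice of `usize::MAX` as the sentinel, and the simulated stack pointer are all hypothetical. Only the idea comes from the commit message: the stack grows down, every function entry compares the stack pointer against a stored limit, and an interrupt is delivered by overwriting that limit with a value no stack pointer can satisfy.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical sentinel: no simulated stack pointer is ever above it,
/// so every check fails once it has been written.
pub const INTERRUPTED: usize = usize::MAX;

#[derive(Debug, PartialEq)]
pub enum Trap {
    StackOverflow,
    Interrupt,
}

/// The single location that generated code reads and an interrupt writes.
pub struct StackLimit(AtomicUsize);

impl StackLimit {
    pub fn new(limit: usize) -> Self {
        StackLimit(AtomicUsize::new(limit))
    }

    /// Delivering an interrupt is just a store to the shared location.
    pub fn interrupt(&self) {
        self.0.store(INTERRUPTED, Ordering::SeqCst);
    }

    /// The check performed at function entry (and loop headers).
    pub fn check(&self, sp: usize) -> Result<(), Trap> {
        let limit = self.0.load(Ordering::SeqCst);
        if sp > limit {
            return Ok(());
        }
        // The check failed; the sentinel tells the two causes apart.
        if limit == INTERRUPTED {
            Err(Trap::Interrupt)
        } else {
            Err(Trap::StackOverflow)
        }
    }
}

/// Stand-in for a recursive guest function using `frame` bytes per call.
/// Returns the recursion depth reached on success.
pub fn recurse(depth: usize, sp: usize, frame: usize, limit: &StackLimit) -> Result<usize, Trap> {
    limit.check(sp)?;
    if depth == 0 {
        return Ok(0);
    }
    Ok(1 + recurse(depth - 1, sp.saturating_sub(frame), frame, limit)?)
}
```

With a limit of 1000, a starting stack pointer of 2000, and 100-byte frames, shallow recursion succeeds and deep recursion reports `Trap::StackOverflow`; after `interrupt()` the very first check reports `Trap::Interrupt` instead. The appeal of the design is that the hot path stays a single compare against one memory location.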

* Implement interrupting wasm code, reimplement stack overflow
* Update comment about magical interrupt value
* Store stack limit as a global value, not a closure
* Run rustfmt
* Handle review comments
* Add a comment about SA_ONSTACK
* Use `usize` for type of `INTERRUPTED`
* Parse human-readable durations
* Bring back sigaltstack handling

  Allows libstd to print out stack overflow on failure still.
* Add parsing and emission of stack limit-via-preamble
* Fix new example for new apis
* Fix host segfault test in release mode
* Fix new doc example

zeroexcuses · 2023-11-15T04:44:21Z

Sorry, is there a way to pause + resume wasmtime instances right now?

Context: imagine an ant colony simulation, where each ant = tiny wasm vm; we want to run every ant for a bit, then pause it, let other ants run, etc ...

alexcrichton · 2023-11-15T15:08:33Z

Yes, to do that you'd want to use Wasmtime's support for async. Fuel and epochs are the mechanisms to suspend computations. I'd also recommend opening a new issue and/or a thread on Zulip instead of posting on older issues.
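
As a rough picture of what fuel-based time slicing looks like for the ant colony scenario, here is a toy scheduler in plain Rust. It deliberately does not use the `wasmtime` crate, so none of these names are Wasmtime APIs: `Ant`, `Step`, and `run_colony` are hypothetical. Each ant keeps its own state between turns, runs until its per-turn fuel is used up, and then yields so the next ant can run.

```rust
/// Outcome of giving one ant a turn.
#[derive(Debug, PartialEq)]
pub enum Step {
    /// Fuel ran out; the ant can be resumed later.
    Yielded,
    /// The ant finished and produced a result.
    Done(u64),
}

/// Hypothetical stand-in for one tiny VM. Its "program" sums 0..end,
/// and all execution state lives in the struct so it can be resumed.
pub struct Ant {
    next: u64,
    end: u64,
    acc: u64,
}

impl Ant {
    pub fn new(end: u64) -> Self {
        Ant { next: 0, end, acc: 0 }
    }

    /// Runs until the work is done or `fuel` units have been consumed.
    pub fn run(&mut self, mut fuel: u64) -> Step {
        while self.next < self.end {
            if fuel == 0 {
                return Step::Yielded;
            }
            fuel -= 1;
            self.acc += self.next;
            self.next += 1;
        }
        Step::Done(self.acc)
    }
}

/// Round-robin scheduler: every unfinished ant gets `fuel_per_turn`
/// units per pass until all ants are done.
pub fn run_colony(mut ants: Vec<Ant>, fuel_per_turn: u64) -> Vec<u64> {
    assert!(fuel_per_turn > 0, "ants need fuel to make progress");
    let mut results = vec![0; ants.len()];
    let mut finished = vec![false; ants.len()];
    let mut remaining = ants.len();
    while remaining > 0 {
        for (i, ant) in ants.iter_mut().enumerate() {
            if finished[i] {
                continue;
            }
            if let Step::Done(value) = ant.run(fuel_per_turn) {
                results[i] = value;
                finished[i] = true;
                remaining -= 1;
            }
        }
    }
    results
}
```

In this toy the resumable state is an explicit struct. With Wasmtime's async support, the suspended computation is an ordinary wasm call that the embedder polls, so the guest does not need to be written as a state machine.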

rylev mentioned this issue Jan 28, 2020

Implement start_container deislabs/wok#106

Merged

This was referenced Feb 1, 2020

How to interrupt execution #43

Closed

Stopping hanging WASM code #712

Closed

alexcrichton mentioned this issue Feb 18, 2020

Ability to both pause and resume a WASM instance #950

Closed

pepyakin mentioned this issue Feb 27, 2020

Tracking issue for guest stack switching in Wasmtime #1007

Closed

alexcrichton mentioned this issue Mar 10, 2020

A way to stop WASM instance when execution hangs #1268

Closed

alexcrichton mentioned this issue Apr 9, 2020

Implement interrupting wasm code, reimplement stack overflow #1490

Merged

sunfishcode closed this as completed in #1490 Apr 21, 2020

r12f mentioned this issue Nov 20, 2022

Add ability to stop sleeping WASM instances #5306

Open

rahulksnv mentioned this issue Dec 3, 2023

Trap on excessive storage use #7550

Closed
