Add support for stack traces from wasm #827

Closed
h3r2tic opened this issue Sep 23, 2019 · 8 comments
Labels
🎉 enhancement New feature!

Comments

@h3r2tic

h3r2tic commented Sep 23, 2019

Motivation

Stack traces can be immensely helpful in tracking down bugs. In their absence, one resorts to "printf debugging" and binary-chopping the code. At Embark we use wasmer-runtime as our primary extension driver, meaning a lot of code goes through the WebAssembly path. As the complexity and volume of that code increase, so does our need for better debugging tools.

Are there any current plans in wasmer to support stack tracing? Is it something that you have in your roadmap, or is it a lower priority thing, for which you would rather accept pull requests?

Proposed solution

In an effort to understand what would be involved to get stack tracing to work, I have implemented a proof-of-concept specifically for the Cranelift backend on x86_64 Linux. The platform-specific code is minimal, and it might even work on Mac OS already.

I opted to capture stack traces by following the linked list of stack frames via the RBP register. Perhaps a more robust approach would involve using stack tracing libraries, but this was the lightest solution without any extra dependencies.
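To make the frame-pointer walk concrete, here is a minimal sketch that runs over a simulated stack rather than real memory. It assumes the usual x86_64 layout when frame pointers are kept: the word at RBP holds the caller's saved RBP, and the next word up holds the return address. The function name, the `memory` slice and the `max_frames` limit are all illustrative and are not taken from the proof-of-concept.

```rust
/// Walks a chain of frame pointers over a simulated stack.
///
/// `memory[i]` stands for the 8-byte word at "address" `i * 8`. This is a
/// stand-in for real stack memory so the sketch stays safe and testable.
fn walk_frame_pointers(memory: &[u64], mut rbp: u64, max_frames: usize) -> Vec<u64> {
    let mut return_addresses = Vec::new();
    while rbp != 0 && return_addresses.len() < max_frames {
        // Reject misaligned frame pointers instead of reading garbage.
        if rbp % 8 != 0 {
            break;
        }
        let slot = (rbp / 8) as usize;
        // Both the saved RBP and the return address must be in range.
        if slot + 1 >= memory.len() {
            break;
        }
        let saved_rbp = memory[slot];
        let return_address = memory[slot + 1];
        return_addresses.push(return_address);
        // The stack grows down, so a caller's frame must sit at a higher
        // address; anything else would loop forever or walk backwards.
        if saved_rbp != 0 && saved_rbp <= rbp {
            break;
        }
        rbp = saved_rbp;
    }
    return_addresses
}
```

A real implementation would read RBP from the CPU and dereference raw pointers, which is where the platform-specific and unsafe parts come in. The bounds and ordering checks here mirror the sanity checks such a walker would likely need.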

Stacks are captured in guest land (wasm), immediately before leaving it via longjmp(). The latter seems to modify the stack, as I could not get valid traces back in host land. I stash the captured instruction pointers into a thread-local Cell similarly to how CAUGHT_ADDRESSES works in the Cranelift backend runtime.
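The thread-local stash could look roughly like the sketch below. The names `CAPTURED_TRACE`, `stash_trace` and `take_trace` are hypothetical. The only thing taken from the description above is the idea of a thread-local Cell that the guest side fills in and the host side empties.

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical slot: the most recent trace captured on this thread.
    static CAPTURED_TRACE: Cell<Option<Vec<usize>>> = Cell::new(None);
}

/// Called on the guest side, just before control leaves wasm.
fn stash_trace(instruction_pointers: Vec<usize>) {
    CAPTURED_TRACE.with(|slot| slot.set(Some(instruction_pointers)));
}

/// Called on the host side after the trap; leaves the slot empty again.
fn take_trace() -> Option<Vec<usize>> {
    CAPTURED_TRACE.with(|slot| slot.take())
}
```

Taking the value out of the slot, rather than cloning it, means a stale trace from an earlier trap cannot be reported for a later one.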

The stack trace is returned in a rather ad-hoc struct dubbed WasmTrapWrapper, which also carries what was previously returned as the error info, WasmTrapInfo:

#[repr(C)]
pub struct WasmTrapWrapper {
    pub info: WasmTrapInfo,
    pub stack_trace: Option<StackTrace>,
}

Ideally the stack addresses would be resolved into function names and passed along in another (user-facing) Error type, but in this quick implementation they are instead printed in a rather arbitrary location via the print_stack_trace function. That function needs a Ctx instance in order to match addresses to functions, and functions to names.
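As an illustration of the address-to-function-to-name matching, here is a minimal sketch. It assumes the runtime can provide a sorted table of code ranges per compiled function. `CodeRange`, `function_for_address` and `format_backtrace` are hypothetical names, not wasmer APIs, and the real print_stack_trace gets its data from a Ctx instead.

```rust
use std::collections::HashMap;

/// Hypothetical description of where one compiled function lives in memory.
struct CodeRange {
    start: usize,
    len: usize,
    func_index: u32,
}

/// Finds the function whose code contains `ip`.
/// `ranges` must be sorted by `start` and must not overlap.
fn function_for_address(ranges: &[CodeRange], ip: usize) -> Option<u32> {
    // Index of the first range that starts after `ip`; the only possible
    // match is the range just before it.
    let next = ranges.partition_point(|r| r.start <= ip);
    let candidate = ranges.get(next.checked_sub(1)?)?;
    if ip - candidate.start < candidate.len {
        Some(candidate.func_index)
    } else {
        None
    }
}

/// Renders one line per captured instruction pointer, using "???" for
/// addresses that do not fall inside any known wasm function.
fn format_backtrace(
    ranges: &[CodeRange],
    names: &HashMap<u32, String>,
    ips: &[usize],
) -> Vec<String> {
    ips.iter()
        .enumerate()
        .map(|(depth, &ip)| {
            let name = function_for_address(ranges, ip)
                .and_then(|index| names.get(&index))
                .map(String::as_str)
                .unwrap_or("???");
            format!("{:>4}: {}", depth, name)
        })
        .collect()
}
```

Returning the lines instead of printing them is a deliberate choice in this sketch: it would let the caller attach the resolved trace to a user-facing error rather than emit it from an arbitrary location.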

Symbol names are extracted from the "name" custom section in WebAssembly binaries, and matched via function indices.
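For reference, a minimal sketch of reading the function-names subsection (id 1) out of the "name" custom section, following the layout in the WebAssembly specification. It assumes `payload` is the custom section's content after its own "name" string, and the helper names are illustrative. This is not the parser used in the proof-of-concept.

```rust
use std::collections::HashMap;

/// Reads an unsigned LEB128-encoded u32 at `*pos`, advancing `*pos`.
fn read_u32_leb(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut result: u32 = 0;
    let mut shift = 0;
    loop {
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        result |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
        if shift >= 35 {
            return None; // more than five bytes cannot be a u32
        }
    }
}

/// Parses the "name" section payload and returns function index -> name.
/// Subsections other than function names (id 1) are skipped.
fn parse_function_names(payload: &[u8]) -> Option<HashMap<u32, String>> {
    let mut names = HashMap::new();
    let mut pos = 0;
    while pos < payload.len() {
        let subsection_id = payload[pos];
        pos += 1;
        let size = read_u32_leb(payload, &mut pos)? as usize;
        let end = pos.checked_add(size)?;
        if end > payload.len() {
            return None; // truncated subsection
        }
        if subsection_id == 1 {
            // Restrict reads to this subsection so a bad count cannot overrun it.
            let sub = &payload[..end];
            let mut p = pos;
            let count = read_u32_leb(sub, &mut p)?;
            for _ in 0..count {
                let index = read_u32_leb(sub, &mut p)?;
                let len = read_u32_leb(sub, &mut p)? as usize;
                let name_end = p.checked_add(len)?;
                let raw = sub.get(p..name_end)?;
                names.insert(index, String::from_utf8(raw.to_vec()).ok()?);
                p = name_end;
            }
        }
        pos = end;
    }
    Some(names)
}
```

The indices in this map are wasm function indices, which count imported functions first. Any table that maps compiled code back to names has to use the same numbering.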

Alternatives

Clearly the implementation is not production-ready, and some work would be required to clean it up, as well as add support for the other platforms and backends.

Since my familiarity with wasmer is limited, I'm probably missing other potential mechanisms of accomplishing the same goal. Adding information about the source file and line for the stack trace would likely entail a separate investigation.

Additional context

Below you can find output from a modified version of the plugin example, hard-wired to panic in wasm by accessing an out-of-bounds vector index.

I hope this implementation will be useful should you choose to implement stack tracing! Otherwise perhaps we could discuss how this aligns with your plans and ideas, and if the code could be brought up to your standards (either by ourselves or some other contributors).

Thanks!

h3@h3deb:~/code/ext/wasmer$ cargo run --example plugin
    Finished dev [unoptimized + debuginfo] target(s) in 0.08s
     Running `target/debug/examples/plugin`
Hello from outside WASI
wasm backtrace:
   0: rust_panic
   1: std::panicking::rust_panic_with_hook::h6fe9d3817474ba5a
   2: std::panicking::continue_panic_fmt::h1b22eea7f33d6c0a
   3: rust_begin_unwind
   4: core::panicking::panic_fmt::h2daf88b2616ca2b2
   5: core::panicking::panic_bounds_check::h0537ade040df571e
   6: <usize as core::slice::SliceIndex<[T]>>::index::hfa5dcbec686669f7
   7: core::slice::<impl core::ops::index::Index<I> for [T]>::index::h6b2602c1b01906b0
   8: <alloc::vec::Vec<T> as core::ops::index::Index<I>>::index::h61a312a26c0e59c9
   9: plugin_for_example::wasm_fn1::h1c32c2f9d0027f31
  10: core::ops::function::Fn::call::h09b485fa29a128b8
  11: <alloc::boxed::Box<F> as core::ops::function::Fn<A>>::call::h4dbd8e9b2d2c2200
  12: plugin_entrypoint
  13: ???
  14: ???
  15: ???
  16: ???
thread 'main' panicked at 'failed to execute plugin: WebAssembly trap occurred during runtime: unknown', src/libcore/result.rs:999:5
h3r2tic added the 🎉 enhancement New feature! label Sep 23, 2019
@losfair
Contributor

losfair commented Sep 24, 2019

Actually, Wasmer already has backtrace support for the singlepass and llvm backends. To enable it:

  1. Enable the managed feature at compile time.
  2. Run with the WASMER_BACKTRACE=1 environment variable set.

The produced backtrace contains WASM function indexes, locals and stack values for each stack frame.

What's missing now is parsing function names from the WASM binary, and your implementation here looks good!

Also, it would be better to implement the ModuleStateMap API to enable backtraces in Cranelift, to stay consistent with the other two backends (and also to get state preservation working out of the box).

@h3r2tic
Author

h3r2tic commented Sep 24, 2019

Oh wow, I've completely missed the relevant code in state.rs. That looks way more comprehensive than my toy implementation :) Thank you for the information!

I've now had another look around the code base using the clues you provided, and it was actually still a bit tricky to figure out how I would get stack tracing to work using just the runtime. It seems that currently the support is there mostly for the wasmer command line interface.

Nevertheless, by mirroring some of the CLI's code, I've managed to convince the plugin example (using unmodified wasmer) to give me stack traces on panics and signals. For the singlepass backend, the only thing required was a call to push_code_version, which I found by inserting dbg! into some of the code, and tracking what was happening inside fault.rs. The llvm backend seems a bit more nuanced though, as that call alone wasn't sufficient. I did manage to get it to work via run_tiering though.

I am assuming the functionality is a work in progress, as for example the Rust crate documentation for the runtime doesn't mention tiering or stack tracing. It also looks like tiered execution is a part of a longer story around a multi-backend runtime, which probably comes with its own set of tradeoffs. It is a bit unclear whether user code should be following the patterns that run_tiering implements for tracing support, or whether it's an area where breaking changes are expected.

With this in mind (and please correct me if my assumptions are wrong), what are your plans for making stack tracing / debugging more of a first-class citizen?

@vavrusa
Contributor

vavrusa commented Nov 12, 2019

@losfair is the goal to use the common trap and signal handling code from runtime-core/fault.rs, or to have each backend use custom logic? The LLVM backend, for example, implements trap catching for Func and DynFunc calls in C++, which is pretty similar to catch_unsafe_unwind in fault.rs. But then it also uses a different trap-catching function when in tiered mode. I wanted to add stack trace capture for faults triggered in regular calls, but because of the code duplication I'm not sure where you would prefer the logic to live. Also, is the goal to make all WASM function calls on an alternative stack to make them preemptible, or only when in tiered mode?

@thedavidmeister

where should we be enabling managed?

@syrusakbary
Member

We added support for stack traces in the refactor. Once it lands in master we should be able to mark the issue as resolved :)

@thedavidmeister

@syrusakbary that's great, where is that documented?

@syrusakbary
Member

Thanks to the refactor, we now support stack traces when using the Cranelift backend.

Closing the issue

@kaimast

kaimast commented Apr 3, 2021

> @syrusakbary that's great, where is that documented?

I would love to know that too. I cannot find the managed feature flag in any of the crates and it does not seem that WASMER_BACKTRACE does anything.

With the llvm backend, setting RUST_BACKTRACE does print a stack trace, but it is not really useful:

   0: rust_begin_unwind
             at /rustc/d474075a8f28ae9a410e95d849d009006db4b176/library/std/src/panicking.rs:493:5
   1: core::panicking::panic_fmt
             at /rustc/d474075a8f28ae9a410e95d849d009006db4b176/library/core/src/panicking.rs:92:14
   2: core::panicking::panic
             at /rustc/d474075a8f28ae9a410e95d849d009006db4b176/library/core/src/panicking.rs:50:5
   3: wasmer_vm::trap::traphandlers::raise_lib_trap::{{closure}}
   4: wasmer_vm::trap::traphandlers::tls::with::{{closure}}
   5: std::thread::local::LocalKey<T>::try_with
   6: std::thread::local::LocalKey<T>::with
   7: wasmer_vm::trap::traphandlers::tls::with
   8: wasmer_vm::trap::traphandlers::raise_lib_trap
   9: wasmer_raise_trap
  10: wasmer_function__212
  11: wasmer_function__144
  12: wasmer_function__228
  13: wasmer_function__227
  14: wasmer_function__31
  15: wasmer_function__115
  16: wasmer_function__106
  17: wasmer_trampoline_function_call__4
  18: wasmer_vm::trap::traphandlers::catch_traps::call_closure
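One way such a trace could be made more readable is to relabel the generated symbols after the fact. The sketch below assumes, without confirmation from the thread, that the number in a symbol such as wasmer_function__212 identifies a function the caller can map to a name. It also assumes a table of index to name is available, for example one built from the module's "name" section. `relabel_frame` is a hypothetical helper, not a wasmer API.

```rust
use std::collections::HashMap;

/// Replaces a `wasmer_function__N` symbol with a known name for index N.
/// Any other symbol, or an index without a name, is returned unchanged.
///
/// Assumption: N can be used as a key into `names`; whether it matches the
/// wasm function index exactly is not confirmed by the thread.
fn relabel_frame(symbol: &str, names: &HashMap<u32, String>) -> String {
    symbol
        .strip_prefix("wasmer_function__")
        .and_then(|suffix| suffix.parse::<u32>().ok())
        .and_then(|index| names.get(&index))
        .cloned()
        .unwrap_or_else(|| symbol.to_string())
}
```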

(In general, is there a list of things that changed during the "big refactor"? It's so hard to figure out which information in this repo is still up to date.)
