More optimizations for calling into WebAssembly #2759

Merged
merged 4 commits into from
Mar 24, 2021
Changes from 1 commit
1 change: 1 addition & 0 deletions Cargo.lock

46 changes: 26 additions & 20 deletions crates/runtime/src/traphandlers.rs
@@ -38,19 +38,32 @@ cfg_if::cfg_if! {

 pub use sys::SignalHandler;

-/// This function performs the low-overhead platform-specific initialization
-/// that we want to do eagerly to ensure a more-deterministic global process
-/// state.
+/// Globally-set callback to determine whether a program counter is actually a
+/// wasm trap.
 ///
-/// This is especially relevant for signal handlers since handler ordering
-/// depends on installation order: the wasm signal handler must run *before*
-/// the other crash handlers and since POSIX signal handlers work LIFO, this
-/// function needs to be called at the end of the startup process, after other
-/// handlers have been installed. This function can thus be called multiple
-/// times, having no effect after the first call.
-pub fn init_traps() -> Result<(), Trap> {
+/// This is initialized during `init_traps` below. The definition lives within
+/// `wasmtime` currently.
+static mut IS_WASM_PC: fn(usize) -> bool = |_| false;
+
+/// This function is required to be called before any WebAssembly is entered.
+/// This will configure global state such as signal handlers to prepare the
+/// process to receive wasm traps.
+///
+/// This function must not only be called globally once before entering
+/// WebAssembly but it must also be called once-per-thread that enters
+/// WebAssembly. Currently in wasmtime's integration this function is called on
+/// creation of a `Store`.
+///
+/// The `is_wasm_pc` argument is used when a trap happens to determine if a
+/// program counter is the pc of an actual wasm trap or not. This is then used
+/// to disambiguate faults that happen due to wasm and faults that happen due to
+/// bugs in Rust or elsewhere.
+pub fn init_traps(is_wasm_pc: fn(usize) -> bool) -> Result<(), Trap> {
     static INIT: Once = Once::new();
-    INIT.call_once(|| unsafe { sys::platform_init() });
+    INIT.call_once(|| unsafe {
+        IS_WASM_PC = is_wasm_pc;
+        sys::platform_init();
+    });
     sys::lazy_per_thread_init()
 }
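
The once-per-process plus once-per-thread contract described in the `init_traps` doc comment can be sketched in isolation. This is a minimal illustration, not the code from this PR: it swaps the `static mut` and `Once` pair for `std::sync::OnceLock`, and `THREAD_READY` and the `bool` return value are hypothetical names introduced only for the example.

```rust
use std::cell::Cell;
use std::sync::OnceLock;

/// Process-global predicate; the first caller of `init_traps` wins.
/// (Assumption: `OnceLock` stands in for the PR's `static mut` + `Once`.)
static IS_WASM_PC: OnceLock<fn(usize) -> bool> = OnceLock::new();

thread_local! {
    /// Hypothetical flag: has this thread already run its per-thread setup?
    static THREAD_READY: Cell<bool> = Cell::new(false);
}

/// Global setup runs once per process, per-thread setup once per calling
/// thread. Returns `true` if this call performed the per-thread setup.
pub fn init_traps(is_wasm_pc: fn(usize) -> bool) -> bool {
    // Later calls keep the predicate installed by the first call.
    IS_WASM_PC.get_or_init(|| is_wasm_pc);
    // `replace` returns the previous value, so only the first call per
    // thread observes `false` here.
    THREAD_READY.with(|ready| !ready.replace(true))
}

/// Faults are treated as "not ours" until a predicate has been installed.
pub fn is_wasm_pc(pc: usize) -> bool {
    IS_WASM_PC.get().is_some_and(|f| f(pc))
}
```

Calling `init_traps` again from the same thread is then a cheap no-op, which is the property that makes it reasonable to call on every `Store` creation.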

@@ -208,10 +221,6 @@ pub unsafe trait TrapInfo {
     /// Converts this object into an `Any` to dynamically check its type.
     fn as_any(&self) -> &dyn Any;

-    /// Returns whether the given program counter lies within wasm code,
-    /// indicating whether we should handle a trap or not.
-    fn is_wasm_trap(&self, pc: usize) -> bool;
-
     /// Uses `call` to call a custom signal handler, if one is specified.
     ///
     /// Returns `true` if `call` returns true, otherwise returns `false`.
@@ -290,6 +299,7 @@ impl<'a> CallThreadState<'a> {
     /// instance, and the trap handler should quickly return.
     /// * a different pointer - a jmp_buf buffer to longjmp to, meaning that
     /// the wasm trap was successfully handled.
+    #[cfg_attr(target_os = "macos", allow(dead_code))] // macOS is more raw and doesn't use this
     fn jmp_buf_if_trap(
         &self,
         pc: *const u8,
@@ -318,7 +328,7 @@ impl<'a> CallThreadState<'a> {
         }

         // If this fault wasn't in wasm code, then it's not our problem
-        if !self.trap_info.is_wasm_trap(pc as usize) {
+        if unsafe { !IS_WASM_PC(pc as usize) } {
             return ptr::null();
         }

@@ -383,10 +393,6 @@ mod tls {

         #[inline(never)] // see module docs for why this is here
         pub fn replace(val: Ptr) -> Ptr {
-            // Mark the current thread as handling interrupts for this specific
-            // CallThreadState: may clobber the previous entry.
-            super::super::sys::register_tls(val);
-
             PTR.with(|p| p.replace(val))
         }
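
With the `register_tls` hook gone, `replace` is just a thread-local pointer swap. The pattern can be sketched on its own as below. The `CallState` struct, `set`, and `with` are hypothetical stand-ins for `CallThreadState` and the `tls` module, assumed here for illustration rather than copied from the crate.

```rust
use std::cell::Cell;
use std::ptr;

/// Hypothetical per-call state; the real `CallThreadState` carries more.
pub struct CallState {
    pub jmp_buf: Cell<*const u8>,
}

thread_local! {
    /// Raw pointer to the state of the innermost active call on this thread.
    static PTR: Cell<*const CallState> = Cell::new(ptr::null());
}

/// Installs `state` for the duration of `body`, then restores the previous
/// pointer (even on panic) so nested calls unwind in order.
pub fn set<R>(state: &CallState, body: impl FnOnce() -> R) -> R {
    struct Reset(*const CallState);
    impl Drop for Reset {
        fn drop(&mut self) {
            PTR.with(|p| p.set(self.0));
        }
    }
    // `replace` hands back the previously installed pointer.
    let _reset = Reset(PTR.with(|p| p.replace(state)));
    body()
}

/// Runs `f` with the current thread's state, if any is installed.
pub fn with<R>(f: impl FnOnce(Option<&CallState>) -> R) -> R {
    let p = PTR.with(|p| p.get());
    // Safety: `set` only installs pointers to states that outlive `body`,
    // and it clears them again before returning.
    f(unsafe { p.as_ref() })
}
```

Because nothing but the thread-local is touched on entry and exit, the per-call cost no longer includes a mutex and hash-map update, which appears to be the optimization this hunk is after.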

82 changes: 17 additions & 65 deletions crates/runtime/src/traphandlers/macos.rs
@@ -21,19 +21,18 @@
 //! port. This means that, unlike signals, threads can't fix their own traps.
 //! Instead a helper thread is spun up to service exception messages. This is
 //! also in conflict with Wasmtime's exception handling currently which is to
-//! use a thread-local to figure out whether a pc is a wasm pc or not on a
-//! trap. To work around this we have a global map from mach thread numbers to
-//! the state for that thread, updated on entry/exit from wasm. This is likely
-//! slower than signals which do less updating on wasm entry/exit, but hopefully
-//! by the time this is a problem we can figure out a better solution.
+//! use a thread-local to store information about how to unwind. Additionally
+//! this requires that the check of whether a pc is a wasm trap or not is a
+//! global check rather than a per-thread check. This necessitates the existence
+//! of `GlobalFrameInfo` in the `wasmtime` crate.
 //!
 //! Otherwise this file heavily uses the `mach` Rust crate for type and
 //! function declarations. Many bits and pieces are copied or translated from
 //! the SpiderMonkey implementation and it should pass all the tests!

 #![allow(non_snake_case)]

-use crate::traphandlers::{tls, CallThreadState, Trap, Unwind};
+use crate::traphandlers::{tls, Trap, Unwind};
 use mach::exception_types::*;
 use mach::kern_return::*;
 use mach::mach_init::*;
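
The module docs above require the "is this pc a wasm trap" check to be global rather than per-thread. One way such a check could be shaped is sketched below. It is not the `GlobalFrameInfo` implementation: `RANGES`, `register`, and `unregister` are hypothetical names, and a plain `RwLock` over a `BTreeMap` is assumed for simplicity.

```rust
use std::collections::BTreeMap;
use std::sync::RwLock;

/// Hypothetical registry of compiled code ranges, keyed by each range's last
/// byte so that a lookup is a single ordered-map query.
static RANGES: RwLock<BTreeMap<usize, usize>> = RwLock::new(BTreeMap::new());

/// Records that `start..end` (end exclusive) holds wasm code.
pub fn register(start: usize, end: usize) {
    assert!(start < end);
    RANGES.write().unwrap().insert(end - 1, start);
}

/// Forgets a range previously passed to `register`.
pub fn unregister(end: usize) {
    RANGES.write().unwrap().remove(&(end - 1));
}

/// Global, thread-independent check usable as an `is_wasm_pc` callback.
pub fn is_wasm_pc(pc: usize) -> bool {
    let ranges = RANGES.read().unwrap();
    // The first range whose last byte is at or after `pc` is the only
    // candidate; `pc` is inside it if the range also starts at or before `pc`.
    match ranges.range(pc..).next() {
        Some((_last, start)) => *start <= pc,
        None => false,
    }
}
```

Since `is_wasm_pc` has the signature `fn(usize) -> bool`, it could be passed directly to a function shaped like `init_traps`. A lock is taken on every lookup here, which would deserve extra scrutiny if the same function were called from a POSIX signal handler.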
@@ -43,10 +42,7 @@ use mach::port::*;
 use mach::thread_act::*;
 use mach::traps::*;
 use std::cell::Cell;
-use std::collections::HashMap;
 use std::mem;
-use std::ptr;
-use std::sync::Mutex;
 use std::thread;

 /// Other `mach` declarations awaiting https://github.com/fitzgen/mach/pull/64 to be merged.
@@ -154,20 +150,10 @@ pub enum Void {}
 /// Wasmtime on macOS.
 pub type SignalHandler<'a> = dyn Fn(Void) -> bool + 'a;

-/// Process-global map for mapping thread names to their state to figure out
-/// whether a thread's trap is related to wasm or not. This is extremely
-/// unsafe and caution must be used when accessing. Be sure to read
-/// documentation below on this.
-static mut MAP: *mut Mutex<HashMap<mach_port_name_t, *const CallThreadState<'static>>> =
-    ptr::null_mut();
-
 /// Process-global port that we use to route thread-level exceptions to.
 static mut WASMTIME_PORT: mach_port_name_t = MACH_PORT_NULL;

 pub unsafe fn platform_init() {
-    // Initialize the process global map
-    MAP = Box::into_raw(Default::default());
-
     // Allocate our WASMTIME_PORT and make sure that it can be sent to so we
     // can receive exceptions.
     let me = mach_task_self();
@@ -289,7 +275,7 @@ unsafe fn handle_exception(request: &mut ExceptionRequest) -> bool {

             let get_pc = |state: &ThreadState| state.__rip as *const u8;

-            let resume = |state: &mut ThreadState, pc: usize, jmp_buf: usize| {
+            let resume = |state: &mut ThreadState, pc: usize| {
                 // The x86_64 ABI requires a 16-byte stack alignment for
                 // functions, so typically we'll be 16-byte aligned. In this
                 // case we simulate a `call` instruction by decrementing the
@@ -315,7 +301,6 @@ unsafe fn handle_exception(request: &mut ExceptionRequest) -> bool {
                 }
                 state.__rip = unwind as u64;
                 state.__rdi = pc as u64;
-                state.__rsi = jmp_buf as u64;
             };
             let mut thread_state = ThreadState::new();
         } else if #[cfg(target_arch = "aarch64")] {
@@ -325,18 +310,17 @@ unsafe fn handle_exception(request: &mut ExceptionRequest) -> bool {

             let get_pc = |state: &ThreadState| state.__pc as *const u8;

-            let resume = |state: &mut ThreadState, pc: usize, jmp_buf: usize| {
+            let resume = |state: &mut ThreadState, pc: usize| {
                 // Clobber LR with the faulting PC, so unwinding resumes at the
                 // faulting instruction. The previous value of LR has been saved
                 // by the callee (in Cranelift generated code), so no need to
                 // stash it.
                 state.__lr = pc as u64;

-                // Fill in the 2 arguments to unwind here, and set PC to it, so
+                // Fill in the argument to unwind here, and set PC to it, so
                 // it looks like a call to unwind.
-                state.__pc = unwind as u64;
                 state.__x[0] = pc as u64;
-                state.__x[1] = jmp_buf as u64;
+                state.__pc = unwind as u64;
             };
             let mut thread_state = mem::zeroed::<ThreadState>();
         } else {
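
The aarch64 `resume` closure above makes the suspended thread look as if it had just called `unwind(pc)`. The register rewrite can be modelled on a plain struct as in the sketch below. `ThreadState` here is a hypothetical, trimmed-down register file and `redirect_to` is an invented helper, not the mach type or any function in this crate.

```rust
/// Hypothetical register file with only the fields the sketch needs; the real
/// `ThreadState` comes from the mach headers and has more fields.
#[derive(Default)]
pub struct ThreadState {
    pub x: [u64; 29],
    pub lr: u64,
    pub pc: u64,
}

/// Rewrites `state` so that resuming the thread behaves like a call to
/// `target(faulting_pc)` made from the faulting instruction.
pub fn redirect_to(state: &mut ThreadState, target: u64) {
    let faulting_pc = state.pc;
    // Link register: where the simulated call appears to have come from.
    state.lr = faulting_pc;
    // First integer argument register carries the faulting pc.
    state.x[0] = faulting_pc;
    // Execution resumes at the start of `target`.
    state.pc = target;
}
```

After this PR only one argument register is written, because the `jmp_buf` is read from thread-local state once execution is back on the faulting thread instead of being passed in by the helper thread.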
@@ -372,27 +356,15 @@ unsafe fn handle_exception(request: &mut ExceptionRequest) -> bool {
     // pointer value and if `MAP` changes happen after we read our entry that's
     // ok since they won't invalidate our entry.
     let pc = get_pc(&thread_state);
-    let state = (*MAP)
-        .lock()
-        .unwrap_or_else(|e| e.into_inner())
-        .get(&origin_thread)
-        .copied();
-    let jmp_buf = match state {
-        Some(state) => (*state).jmp_buf_if_trap(pc, |_| false),
-        None => ptr::null(),
-    };
-    if jmp_buf.is_null() {
-        return false;
-    }
-    if jmp_buf as usize == 1 {
+    if !super::IS_WASM_PC(pc as usize) {
         return false;
     }

     // We have determined that this is a wasm trap and we need to actually
     // force the thread itself to trap. The thread's register state is
     // configured to resume in the `unwind` function below, we update the
     // thread's register state, and then we're off to the races.
-    resume(&mut thread_state, pc as usize, jmp_buf as usize);
+    resume(&mut thread_state, pc as usize);
     let kret = thread_set_state(
         origin_thread,
         thread_state_flavor,
@@ -409,13 +381,13 @@ unsafe fn handle_exception(request: &mut ExceptionRequest) -> bool {
 /// a native backtrace once we've switched back to the thread itself. After
 /// the backtrace is captured we can do the usual `longjmp` back to the source
 /// of the wasm code.
-unsafe extern "C" fn unwind(wasm_pc: *const u8, jmp_buf: *const u8) -> ! {
-    tls::with(|state| {
-        if let Some(state) = state {
-            state.capture_backtrace(wasm_pc);
-        }
+unsafe extern "C" fn unwind(wasm_pc: *const u8) -> ! {
+    let jmp_buf = tls::with(|state| {
+        let state = state.unwrap();
+        state.capture_backtrace(wasm_pc);
+        state.jmp_buf.get()
     });
-
+    debug_assert!(!jmp_buf.is_null());

Member:
There was a check in the unix implementation that the jmp_buf could be set to 1 (meaning a custom signal handler has run and the trap handler should return early). Is it not necessary anymore?

Member Author:
Ah yeah I realized that it was no longer necessary since we don't run the custom trap handler on macOS, it's only run on Unix and Windows now.

     Unwind(jmp_buf);
 }

@@ -474,23 +446,3 @@ pub fn lazy_per_thread_init() -> Result<(), Trap> {
     });
     Ok(())
 }
-
-/// This hook is invoked whenever TLS state for the current thread is updated
-/// to the `ptr` specified.
-///
-/// The purpose for hooking this on macOS is we register in a process-global map
-/// that our mach thread's state is `ptr` at this time. This allows the
-/// exception handling thread to lookup in this map later if our thread
-/// generates an exception.
-///
-/// Note that in general this is quite unsafe since we're moving non-Send state
-/// (`ptr`) which is also only valid for a short portion of the program (it
-/// lives on the stack) into a global portion of the program. This needs to be
-/// kept tightly in sync with `handle_exception` above where it's accessed in a
-/// very limited fashion.
-pub fn register_tls(ptr: *const CallThreadState<'static>) {
-    unsafe {
-        let me = MY_PORT.with(|p| p.0);
-        (*MAP).lock().unwrap().insert(me, ptr);
-    }
-}
4 changes: 0 additions & 4 deletions crates/runtime/src/traphandlers/unix.rs
@@ -250,7 +250,3 @@ pub fn lazy_per_thread_init() -> Result<(), Trap> {
         }
     }
 }
-
-pub fn register_tls(_: *const CallThreadState<'static>) {
-    // Unused on unix
-}
4 changes: 0 additions & 4 deletions crates/runtime/src/traphandlers/windows.rs
@@ -77,7 +77,3 @@ pub fn lazy_per_thread_init() -> Result<(), Trap> {
     // Unused on Windows
     Ok(())
 }
-
-pub fn register_tls(_: *const CallThreadState<'static>) {
-    // Unused on Windows
-}
1 change: 1 addition & 0 deletions crates/wasmtime/Cargo.toml
@@ -36,6 +36,7 @@ bincode = "1.2.1"
 indexmap = "1.6"
 paste = "1.0.3"
 psm = "0.1.11"
+lazy_static = "1.4"

 [target.'cfg(target_os = "windows")'.dependencies]
 winapi = "0.3.7"