cranelift: stack-switching support #11003
Conversation
This initial commit represents the "pr2" base commit with minimal merge conflicts resolved. Due to OOB conflicts, this commit is not functional as-is, but it is used as a base in order to allow easier reviewing of the delta from this commit to what will be used for the PR against upstream. Co-authored-by: Daniel Hillerström <daniel.hillerstrom@ed.ac.uk> Co-authored-by: Paul Osborne <paul.osborne@fastly.com>
This first set of changes updates the base PR in order to compile and pass basic checks (compile, clippy, fmt), with the biggest part of the change being the elimination of tracing/assertion injection in JIT'ed code.
…c_environ members
At this point, the only bit we really branch on is what we do in order to avoid problems tying into wasmtime_environ. This is based on the approach and macro used by the gc code for converting the presence/absence of the cranelift feature flag into a compile-time decision. This is a bit of a half-measure for now as we still compile most stack-switching code in cranelift, but it does enough to avoid causing problems with missing definitions in wasmtime_environ.
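As a rough illustration of that presence/absence pattern, here is a minimal self-contained sketch using plain cfg-gated functions rather than the actual macro borrowed from the gc code; the function name, signature, and error string are hypothetical stand-ins:

#[cfg(feature = "stack-switching")]
fn translate_cont_new_stub(func_index: u32) -> Result<String, String> {
    // With the feature enabled, the real translation code would run here;
    // this stand-in just records that it was reached.
    Ok(format!("translated cont.new for function {func_index}"))
}

#[cfg(not(feature = "stack-switching"))]
fn translate_cont_new_stub(_func_index: u32) -> Result<String, String> {
    // With the feature compiled out, callers get an error instead of
    // tripping over missing definitions.
    Err("stack-switching support was not enabled at compile time".to_string())
}

fn main() {
    println!("{:?}", translate_cont_new_stub(0));
}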
Replace either with infallible From or fallible, panicking TryFrom alternatives where required.
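For illustration only, a self-contained sketch of that conversion split, using a hypothetical ControlEffectKind discriminant rather than the actual ControlEffect type:

use std::convert::TryFrom;

#[derive(Debug, Clone, Copy, PartialEq)]
enum ControlEffectKind {
    Return = 0,
    Suspend = 1,
    Switch = 2,
}

// Infallible direction: every kind has a u32 encoding, so From is enough.
impl From<ControlEffectKind> for u32 {
    fn from(kind: ControlEffectKind) -> u32 {
        kind as u32
    }
}

// Fallible direction: not every u32 is a valid kind, so use TryFrom and let
// the call site decide whether to propagate the error or panic via expect().
impl TryFrom<u32> for ControlEffectKind {
    type Error = u32;
    fn try_from(raw: u32) -> Result<Self, u32> {
        match raw {
            0 => Ok(ControlEffectKind::Return),
            1 => Ok(ControlEffectKind::Suspend),
            2 => Ok(ControlEffectKind::Switch),
            other => Err(other),
        }
    }
}

fn main() {
    let raw: u32 = ControlEffectKind::Suspend.into();
    let kind = ControlEffectKind::try_from(raw).expect("valid discriminant");
    assert_eq!(kind, ControlEffectKind::Suspend);
}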
After removing emission of runtime trace logging and assertions, there were several unused parameters. Remove those from the ControlEffect signatures completely.
This matches a change to the mirrored runtime type in the upstream changes.
Thanks! Lots of comments below but for the most part these should be pretty straightforward to fix.
Co-authored-by: Daniel Hillerström <daniel.hillerstrom@ed.ac.uk>
The extra parameters here used to be used for emitting runtime assertions, but with those gone we just had unused params and lifetimes; clean those out.
There's already a stub elsewhere and this is not called; when exceptions are added and it is time to revisit, this method can be restored.
Rename VMHostArray -> VMHostArrayRef. Change the impl to compute the address with offset upfront rather than on each load.
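A plain-Rust sketch of the "address upfront" change; the field layout and names here are hypothetical stand-ins, since the real code emits Cranelift loads against a VMHostArrayRef:

#[repr(C)]
struct VmHostArray {
    capacity: u32,
    length: u32,
}

// Wrapper that computes the array's address (object base + field offset)
// once at construction; every subsequent load goes through that address
// instead of re-deriving base + offset on each access.
struct VmHostArrayRef {
    addr: *const VmHostArray,
}

impl VmHostArrayRef {
    unsafe fn new(obj_base: *const u8, offset: usize) -> Self {
        Self { addr: obj_base.add(offset).cast::<VmHostArray>() }
    }

    unsafe fn length(&self) -> u32 {
        (*self.addr).length
    }

    unsafe fn capacity(&self) -> u32 {
        (*self.addr).capacity
    }
}

fn main() {
    let array = VmHostArray { capacity: 4, length: 2 };
    let base = (&array as *const VmHostArray).cast::<u8>();
    let aref = unsafe { VmHostArrayRef::new(base, 0) };
    assert_eq!(unsafe { aref.length() }, 2);
    assert_eq!(unsafe { aref.capacity() }, 4);
}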
Pushed most of the updates but still need to make the suggested updates around the fat pointer stuff and resolve conflicts with upstream.
This matches the directory structure for gc and aids in visibility for a few members required by stack-switching code in cranelift.
…nelift As part of this, updated translate_ref_is_null to use the wasm type rather than branching on the ir type being an i128.
@fitzgen I believe this is ready for review again, let me know if there are pieces I missed in my previous updates. As noted, I chose to defer fat pointer changes for now, to be potentially addressed as part of (or following) changes to the translation stack.
LGTM modulo a few final comments below; we should be able to merge this and move on to follow-ups after they are addressed.
Thanks!
@@ -138,7 +139,7 @@ pub struct FuncEnvironment<'module_environment> {
     pcc_vmctx_memtype: Option<ir::MemoryType>,

     /// Caches of signatures for builtin functions.
-    builtin_functions: BuiltinFunctions,
+    pub(crate) builtin_functions: BuiltinFunctions,
This shouldn't need to be pub(crate) after the stack_switching module moved to func_environ::stack_switching, right?
    /// Used by the stack switching feature. If set, we have allocated a
    /// slot on this function's stack to be used for the
    /// current stack's `handler_list` field.
    pub(crate) stack_switching_handler_list_buffer: Option<ir::StackSlot>,

    /// Used by the stack switching feature. If set, we have allocated a
    /// slot on this function's stack to be used for the
    /// current continuation's `values` field.
    pub(crate) stack_switching_values_buffer: Option<ir::StackSlot>,
Ditto
pub fn translate_cont_bind(
    &mut self,
    builder: &mut FunctionBuilder<'_>,
    contobj: ir::Value,
    args: &[ir::Value],
) -> ir::Value {
    stack_switching::instructions::translate_cont_bind(self, builder, contobj, args)
}

pub fn translate_cont_new(
    &mut self,
    builder: &mut FunctionBuilder<'_>,
    _state: &FuncTranslationState,
    func: ir::Value,
    arg_types: &[WasmValType],
    return_types: &[WasmValType],
) -> WasmResult<ir::Value> {
    stack_switching::instructions::translate_cont_new(
        self,
        builder,
        func,
        arg_types,
        return_types,
    )
}

pub fn translate_resume(
    &mut self,
    builder: &mut FunctionBuilder<'_>,
    type_index: u32,
    contobj: ir::Value,
    resume_args: &[ir::Value],
    resumetable: &[(u32, Option<ir::Block>)],
) -> WasmResult<Vec<ir::Value>> {
    stack_switching::instructions::translate_resume(
        self,
        builder,
        type_index,
        contobj,
        resume_args,
        resumetable,
    )
}

pub fn translate_suspend(
    &mut self,
    builder: &mut FunctionBuilder<'_>,
    tag_index: u32,
    suspend_args: &[ir::Value],
    tag_return_types: &[ir::Type],
) -> Vec<ir::Value> {
    stack_switching::instructions::translate_suspend(
        self,
        builder,
        tag_index,
        suspend_args,
        tag_return_types,
    )
}

/// Translates switch instructions.
pub fn translate_switch(
    &mut self,
    builder: &mut FunctionBuilder,
    tag_index: u32,
    contobj: ir::Value,
    switch_args: &[ir::Value],
    return_types: &[ir::Type],
) -> WasmResult<Vec<ir::Value>> {
    stack_switching::instructions::translate_switch(
        self,
        builder,
        tag_index,
        contobj,
        switch_args,
        return_types,
    )
}
I think it makes sense to add disas tests for these operations in tests/disas/stack-switching/*. Should be as easy as writing a couple .wat files and then running WASMTIME_TEST_BLESS=1 cargo test --test disas to initialize the golden output. Doesn't need to be in this PR, but if not in this PR, then it should be added to the tracking issue.
let (lsbs, msbs) = pos.ins().isplit(contobj);

let (revision_counter, contref) = match env.isa().endianness() {
    ir::Endianness::Little => (lsbs, msbs),
    ir::Endianness::Big => {
        let pad_bits = 64 - env.pointer_type().bits();
        let contref = pos.ins().ushr_imm(lsbs, pad_bits as i64);
        (msbs, contref)
    }
};
let contref = if env.pointer_type().bits() < I64.bits() {
    pos.ins().ireduce(env.pointer_type(), contref)
} else {
    contref
};
(revision_counter, contref)
Can this just be something like
let ptr_ty = env.pointer_type();
assert!(ptr_ty.bits() <= 64);
let contref = pos.ins().ireduce(ptr_ty, contobj);
let shifted = pos.ins().ushr_imm(contobj, 64);
let revision_counter = pos.ins().ireduce(ir::types::I64, shifted);
(revision_counter, contref)
?
That is, we define the fat pointer as 128 bits where the upper 64 are the revision counter and the bottom sizeof(pointer) bits are the pointer to the continuation. This way I don't think we ever have to branch on endianness or pointer width.
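For reference, the same layout written as plain integer arithmetic (a sketch only; the real code would build these operations as CLIF instructions via the builder, as in the snippets above):

// Sketch of the proposed fat-pointer layout using plain integers: the upper
// 64 bits hold the revision counter, the low pointer-sized bits hold the
// continuation reference.

fn pack_contobj(revision_counter: u64, contref_addr: u64) -> u128 {
    ((revision_counter as u128) << 64) | (contref_addr as u128)
}

fn unpack_contobj(contobj: u128) -> (u64, u64) {
    let revision_counter = (contobj >> 64) as u64;
    let contref_addr = contobj as u64; // truncate to the low 64 bits
    (revision_counter, contref_addr)
}

fn main() {
    let packed = pack_contobj(7, 0x1000);
    assert_eq!(unpack_contobj(packed), (7, 0x1000));
}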
Yeah, this makes sense to me, will adopt this approach (with a quick round of testing) until we can split into individual ir values.
let contref_addr = if env.pointer_type().bits() < I64.bits() {
    pos.ins().uextend(I64, contref_addr)
} else {
    contref_addr
};
let (msbs, lsbs) = match env.isa().endianness() {
    ir::Endianness::Little => (contref_addr, revision_counter),
    ir::Endianness::Big => {
        let pad_bits = 64 - env.pointer_type().bits();
        let lsbs = pos.ins().ishl_imm(contref_addr, pad_bits as i64);
        (revision_counter, lsbs)
    }
};

let lsbs = pos.ins().uextend(ir::types::I128, lsbs);
let msbs = pos.ins().uextend(ir::types::I128, msbs);
let msbs = pos.ins().ishl_imm(msbs, 64);
pos.ins().bor(lsbs, msbs)
And then construction would become something like this:
assert!(env.pointer_type().bits() <= 64);
let contref_addr = pos.ins().uextend(ir::types::I128, contref_addr);
let revision_counter = pos.ins().uextend(ir::types::I128, revision_counter);
let shifted_counter = pos.ins().ishl_imm(revision_counter, 64);
pos.ins().bor(shifted_counter, contref_addr)
Note: if you want to switch the order of the contref pointer and the counter in the fat pointer, that's fine (I think I unwittingly switched it from what is in the code now), but the important point is that we should be able to do all of this fat pointer construction and destruction without branching on endianness and pointer width.
These changes pull in the cranelift changes from #10177, with some additional stacked changes to resolve conflicts, align with previous changes in the stack-switching series, and address feedback items raised on previous iterations of the PR (mostly without changing anything of significant substance). Tracking Issue: #10248.
The stack-switching feature flag is retained and used minimally in this changeset in order to avoid compilation problems, but is not really used beyond that. There is at least one item in the tracking issue already related to finding a way to drop these compilation flags in most places, but I think it is worth deferring that here as it will require touching code more broadly.

CC @frank-emrich @dhil