Skip to content

cranelift: stack-switching support #11003

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

posborne
Copy link
Collaborator

These changes pull in the cranelift changes from #10177 with some additional stacked changes to resolve conflicts, align with previous changes in the stack-switching series, and address feedback items which were raised on previous iterations of the PR (but mostly not changing anything of significant substance). Tracking Issue: #10248.

The stack-switching feature flag is retained and used minimally in this changeset in order to avoid compilation problems, but not really used beyond that. There is at least one item in the tracking issue already related to likely finding a way to drop these compilation flags in most places but I think it is worth deferring that here as it will required touching code more broadly.

CC @frank-emrich @dhil

frank-emrich and others added 7 commits June 10, 2025 16:17
This initial commit represents the "pr2" base commit with
minimal merge conflicts resolved.  Due to OOB conflicts, this
commit is not functional as-is, but using it as a base in
order to allow for easier reviewing of the delta from
this commit to what will be used for the PR against upstream.

Co-authored-by: Daniel Hillerström <daniel.hillerstrom@ed.ac.uk>
Co-authored-by: Paul Osborne <paul.osborne@fastly.com>
This first set of changes updates the base pr in order to
compiled and pass basic checks (compile, clippy, fmt) with
the biggest part of the change being to eliminate injection
of tracing/assertions in JIT'ed code.
At this point, the only bit we really branch on is what we
do in order to avoid problems tying into wasmtime_environ.
This is basd on the approach and macro used by the gc code for
converting presence/absence of the cranelift feature flag to
cranelift compile time.  This is a bit of a half-measure for now
as we still compile most stack-switching code in cranelift, but
this does enough to avoid causing problems with missing definitions
in wasmtime_environ.
Replace either with infallible From or fallible, panicing
TryFrom alternatives where required.
After removing emission of runtime trace logging and assertions,
there were several unused parameters.  Remove those from the
ControlEffect signatures completely.
This matches a change to the mirrored runtime type in
the upstream changes.
@posborne posborne requested review from a team as code owners June 10, 2025 17:47
@posborne posborne requested review from alexcrichton and removed request for a team June 10, 2025 17:47
@fitzgen fitzgen requested review from fitzgen and removed request for alexcrichton June 10, 2025 18:13
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen labels Jun 10, 2025
Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Lots of comments below but for the most part these should be pretty straightforward to fix.

posborne and others added 10 commits June 11, 2025 13:27
Co-authored-by: Daniel Hillerström <daniel.hillerstrom@ed.ac.uk>
The extra parameters here used to be used for emitting runtime
assertions, but with those gone we just had unused params
and lifetimes, clean those out.
There's already a stub elsewhere and this is not called, when
exceptions are added and it is time to revisit, this method
can be restored.
Rename VMHostArray -> VMHostArrayRef
Change impl to compute address with offset upfront rather than
on each load.
@posborne
Copy link
Collaborator Author

Pushed most of the updates but still need to make the suggested updates around the fat pointer stuff and resolve conflicts with upstream.

This matches the directory structure for gc and aids in visibility
for a few members required by stack-switching code in cranelift.
posborne added 2 commits June 18, 2025 19:41
…nelift

As part of this, updated translate_ref_is_null to
use the wasm type rather than brancing on the ir type
being an i128.
@posborne
Copy link
Collaborator Author

@fitzgen I believe this is ready for review again, let me know if there's pieces I missed in my previous updates. As noted, I chose to defer fatpointer changes for now to be potentially addressed as part of (or following) changes to the translation stack.

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo a few final comments below, we should be able to merge this and move on to follow ups after they are addressed.

Thanks!

@@ -138,7 +139,7 @@ pub struct FuncEnvironment<'module_environment> {
pcc_vmctx_memtype: Option<ir::MemoryType>,

/// Caches of signatures for builtin functions.
builtin_functions: BuiltinFunctions,
pub(crate) builtin_functions: BuiltinFunctions,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't need to be pub(crate) after the stack_switching module moved to func_environ::stack_switching, right?

Comment on lines +178 to +187

/// Used by the stack switching feature. If set, we have a allocated a
/// slot on this function's stack to be used for the
/// current stack's `handler_list` field.
pub(crate) stack_switching_handler_list_buffer: Option<ir::StackSlot>,

/// Used by the stack switching feature. If set, we have a allocated a
/// slot on this function's stack to be used for the
/// current continuation's `values` field.
pub(crate) stack_switching_values_buffer: Option<ir::StackSlot>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Comment on lines +3288 to +3365
pub fn translate_cont_bind(
&mut self,
builder: &mut FunctionBuilder<'_>,
contobj: ir::Value,
args: &[ir::Value],
) -> ir::Value {
stack_switching::instructions::translate_cont_bind(self, builder, contobj, args)
}

pub fn translate_cont_new(
&mut self,
builder: &mut FunctionBuilder<'_>,
_state: &FuncTranslationState,
func: ir::Value,
arg_types: &[WasmValType],
return_types: &[WasmValType],
) -> WasmResult<ir::Value> {
stack_switching::instructions::translate_cont_new(
self,
builder,
func,
arg_types,
return_types,
)
}

pub fn translate_resume(
&mut self,
builder: &mut FunctionBuilder<'_>,
type_index: u32,
contobj: ir::Value,
resume_args: &[ir::Value],
resumetable: &[(u32, Option<ir::Block>)],
) -> WasmResult<Vec<ir::Value>> {
stack_switching::instructions::translate_resume(
self,
builder,
type_index,
contobj,
resume_args,
resumetable,
)
}

pub fn translate_suspend(
&mut self,
builder: &mut FunctionBuilder<'_>,
tag_index: u32,
suspend_args: &[ir::Value],
tag_return_types: &[ir::Type],
) -> Vec<ir::Value> {
stack_switching::instructions::translate_suspend(
self,
builder,
tag_index,
suspend_args,
tag_return_types,
)
}

/// Translates switch instructions.
pub fn translate_switch(
&mut self,
builder: &mut FunctionBuilder,
tag_index: u32,
contobj: ir::Value,
switch_args: &[ir::Value],
return_types: &[ir::Type],
) -> WasmResult<Vec<ir::Value>> {
stack_switching::instructions::translate_switch(
self,
builder,
tag_index,
contobj,
switch_args,
return_types,
)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to add disas tests for these operations in tests/disas/stack-switching/*. Should be as easy as writing a couple .wat files and then running WASMTIME_TEST_BLESS=1 cargo test --test disas to initialize the golden output.

Doesn't need to be in this PR, but if not in this PR, then should be added to the tracking issue.

Comment on lines +25 to +40
let (lsbs, msbs) = pos.ins().isplit(contobj);

let (revision_counter, contref) = match env.isa().endianness() {
ir::Endianness::Little => (lsbs, msbs),
ir::Endianness::Big => {
let pad_bits = 64 - env.pointer_type().bits();
let contref = pos.ins().ushr_imm(lsbs, pad_bits as i64);
(msbs, contref)
}
};
let contref = if env.pointer_type().bits() < I64.bits() {
pos.ins().ireduce(env.pointer_type(), contref)
} else {
contref
};
(revision_counter, contref)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this just be something like

Suggested change
let (lsbs, msbs) = pos.ins().isplit(contobj);
let (revision_counter, contref) = match env.isa().endianness() {
ir::Endianness::Little => (lsbs, msbs),
ir::Endianness::Big => {
let pad_bits = 64 - env.pointer_type().bits();
let contref = pos.ins().ushr_imm(lsbs, pad_bits as i64);
(msbs, contref)
}
};
let contref = if env.pointer_type().bits() < I64.bits() {
pos.ins().ireduce(env.pointer_type(), contref)
} else {
contref
};
(revision_counter, contref)
let ptr_ty = env.pointer_type();
assert!(ptr_ty.bits() <= 64);
let contref = pos.ins().ireduce(ptr_ty, contobj);
let shifted = pos.ins().ushr_imm(contobj, 64);
let revision_counter = pos.ins().ireduce(it::types::I64, shifted);
(revision_counter, contref)

?

That is, we define the fat pointer as 128 bits where the upper 64 are the revision counter and the bottom sizeof(pointer) bits are the pointer to the continuation. This way I don't think we ever have to branch on endianness or pointer width.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this makes sense to me, will adopt this approach (with a quick round of testing) until we can split into individual ir values.

Comment on lines +53 to +70
let contref_addr = if env.pointer_type().bits() < I64.bits() {
pos.ins().uextend(I64, contref_addr)
} else {
contref_addr
};
let (msbs, lsbs) = match env.isa().endianness() {
ir::Endianness::Little => (contref_addr, revision_counter),
ir::Endianness::Big => {
let pad_bits = 64 - env.pointer_type().bits();
let lsbs = pos.ins().ishl_imm(contref_addr, pad_bits as i64);
(revision_counter, lsbs)
}
};

let lsbs = pos.ins().uextend(ir::types::I128, lsbs);
let msbs = pos.ins().uextend(ir::types::I128, msbs);
let msbs = pos.ins().ishl_imm(msbs, 64);
pos.ins().bor(lsbs, msbs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And then construction would become something like this:

Suggested change
let contref_addr = if env.pointer_type().bits() < I64.bits() {
pos.ins().uextend(I64, contref_addr)
} else {
contref_addr
};
let (msbs, lsbs) = match env.isa().endianness() {
ir::Endianness::Little => (contref_addr, revision_counter),
ir::Endianness::Big => {
let pad_bits = 64 - env.pointer_type().bits();
let lsbs = pos.ins().ishl_imm(contref_addr, pad_bits as i64);
(revision_counter, lsbs)
}
};
let lsbs = pos.ins().uextend(ir::types::I128, lsbs);
let msbs = pos.ins().uextend(ir::types::I128, msbs);
let msbs = pos.ins().ishl_imm(msbs, 64);
pos.ins().bor(lsbs, msbs)
assert!(env.pointer_type().bits() <= 64);
let contref_addr = pos.ins().uextend(ir::types::I28, contref_addr);
let revision_counter = pos.ins().uextend(ir::types::I128, revision_counter);
let shifted_counter = pos.ins().ishl_imm(revision_counter, 64);
pos.ins().bor(shifted_counter, contref_addr)

Note: if you want to switch the order of the contref pointer and the counter in the fat pointer, that's fine (I think I unwittingly switched it from what is in the code now), but the important point is that we should be able do all of this fat pointer construction and destruction without branching on endianness and pointer width.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants