cranelift: stack-switching support #11003

posborne · 2025-06-10T17:47:43Z

These changes pull in the cranelift changes from #10177 with some additional stacked changes to resolve conflicts, align with previous changes in the stack-switching series, and address feedback items which were raised on previous iterations of the PR (but mostly not changing anything of significant substance). Tracking Issue: #10248.

The stack-switching feature flag is retained and used minimally in this changeset in order to avoid compilation problems, but not really used beyond that. There is at least one item in the tracking issue already related to likely finding a way to drop these compilation flags in most places but I think it is worth deferring that here as it will required touching code more broadly.

CC @frank-emrich @dhil

This initial commit represents the "pr2" base commit with minimal merge conflicts resolved. Due to OOB conflicts, this commit is not functional as-is, but using it as a base in order to allow for easier reviewing of the delta from this commit to what will be used for the PR against upstream. Co-authored-by: Daniel Hillerström <daniel.hillerstrom@ed.ac.uk> Co-authored-by: Paul Osborne <paul.osborne@fastly.com>

This first set of changes updates the base pr in order to compiled and pass basic checks (compile, clippy, fmt) with the biggest part of the change being to eliminate injection of tracing/assertions in JIT'ed code.

…c_environ members

At this point, the only bit we really branch on is what we do in order to avoid problems tying into wasmtime_environ. This is basd on the approach and macro used by the gc code for converting presence/absence of the cranelift feature flag to cranelift compile time. This is a bit of a half-measure for now as we still compile most stack-switching code in cranelift, but this does enough to avoid causing problems with missing definitions in wasmtime_environ.

Replace either with infallible From or fallible, panicing TryFrom alternatives where required.

After removing emission of runtime trace logging and assertions, there were several unused parameters. Remove those from the ControlEffect signatures completely.

This matches a change to the mirrored runtime type in the upstream changes.

crates/cranelift/src/stack_switching/instructions.rs

fitzgen

Thanks! Lots of comments below but for the most part these should be pretty straightforward to fix.

cranelift/codegen/src/ir/function.rs

crates/cranelift/src/stack_switching/mod.rs

crates/cranelift/src/func_environ.rs

crates/cranelift/src/translate/code_translator.rs

crates/cranelift/src/stack_switching/fatpointer.rs

crates/cranelift/src/stack_switching/instructions.rs

crates/cranelift/src/stack_switching/mod.rs

Co-authored-by: Daniel Hillerström <daniel.hillerstrom@ed.ac.uk>

The extra parameters here used to be used for emitting runtime assertions, but with those gone we just had unused params and lifetimes, clean those out.

There's already a stub elsewhere and this is not called, when exceptions are added and it is time to revisit, this method can be restored.

Rename VMHostArray -> VMHostArrayRef Change impl to compute address with offset upfront rather than on each load.

posborne · 2025-06-16T16:29:24Z

Pushed most of the updates but still need to make the suggested updates around the fat pointer stuff and resolve conflicts with upstream.

This matches the directory structure for gc and aids in visibility for a few members required by stack-switching code in cranelift.

…nelift As part of this, updated translate_ref_is_null to use the wasm type rather than brancing on the ir type being an i128.

posborne · 2025-06-23T16:15:48Z

@fitzgen I believe this is ready for review again, let me know if there's pieces I missed in my previous updates. As noted, I chose to defer fatpointer changes for now to be potentially addressed as part of (or following) changes to the translation stack.

fitzgen

LGTM modulo a few final comments below, we should be able to merge this and move on to follow ups after they are addressed.

Thanks!

fitzgen · 2025-06-24T15:56:52Z

crates/cranelift/src/func_environ.rs

@@ -138,7 +139,7 @@ pub struct FuncEnvironment<'module_environment> {
    pcc_vmctx_memtype: Option<ir::MemoryType>,

    /// Caches of signatures for builtin functions.
-    builtin_functions: BuiltinFunctions,
+    pub(crate) builtin_functions: BuiltinFunctions,


This shouldn't need to be pub(crate) after the stack_switching module moved to func_environ::stack_switching, right?

fitzgen · 2025-06-24T15:56:57Z

crates/cranelift/src/func_environ.rs

+
+    /// Used by the stack switching feature. If set, we have a allocated a
+    /// slot on this function's stack to be used for the
+    /// current stack's `handler_list` field.
+    pub(crate) stack_switching_handler_list_buffer: Option<ir::StackSlot>,
+
+    /// Used by the stack switching feature. If set, we have a allocated a
+    /// slot on this function's stack to be used for the
+    /// current continuation's `values` field.
+    pub(crate) stack_switching_values_buffer: Option<ir::StackSlot>,


fitzgen · 2025-06-24T16:17:52Z

crates/cranelift/src/func_environ.rs

+    pub fn translate_cont_bind(
+        &mut self,
+        builder: &mut FunctionBuilder<'_>,
+        contobj: ir::Value,
+        args: &[ir::Value],
+    ) -> ir::Value {
+        stack_switching::instructions::translate_cont_bind(self, builder, contobj, args)
+    }
+
+    pub fn translate_cont_new(
+        &mut self,
+        builder: &mut FunctionBuilder<'_>,
+        _state: &FuncTranslationState,
+        func: ir::Value,
+        arg_types: &[WasmValType],
+        return_types: &[WasmValType],
+    ) -> WasmResult<ir::Value> {
+        stack_switching::instructions::translate_cont_new(
+            self,
+            builder,
+            func,
+            arg_types,
+            return_types,
+        )
+    }
+
+    pub fn translate_resume(
+        &mut self,
+        builder: &mut FunctionBuilder<'_>,
+        type_index: u32,
+        contobj: ir::Value,
+        resume_args: &[ir::Value],
+        resumetable: &[(u32, Option<ir::Block>)],
+    ) -> WasmResult<Vec<ir::Value>> {
+        stack_switching::instructions::translate_resume(
+            self,
+            builder,
+            type_index,
+            contobj,
+            resume_args,
+            resumetable,
+        )
+    }
+
+    pub fn translate_suspend(
+        &mut self,
+        builder: &mut FunctionBuilder<'_>,
+        tag_index: u32,
+        suspend_args: &[ir::Value],
+        tag_return_types: &[ir::Type],
+    ) -> Vec<ir::Value> {
+        stack_switching::instructions::translate_suspend(
+            self,
+            builder,
+            tag_index,
+            suspend_args,
+            tag_return_types,
+        )
+    }
+
+    /// Translates switch instructions.
+    pub fn translate_switch(
+        &mut self,
+        builder: &mut FunctionBuilder,
+        tag_index: u32,
+        contobj: ir::Value,
+        switch_args: &[ir::Value],
+        return_types: &[ir::Type],
+    ) -> WasmResult<Vec<ir::Value>> {
+        stack_switching::instructions::translate_switch(
+            self,
+            builder,
+            tag_index,
+            contobj,
+            switch_args,
+            return_types,
+        )
+    }


I think it makes sense to add disas tests for these operations in tests/disas/stack-switching/*. Should be as easy as writing a couple .wat files and then running WASMTIME_TEST_BLESS=1 cargo test --test disas to initialize the golden output.

Doesn't need to be in this PR, but if not in this PR, then should be added to the tracking issue.

fitzgen · 2025-06-24T16:27:01Z

crates/cranelift/src/func_environ/stack_switching/fatpointer.rs

+    let (lsbs, msbs) = pos.ins().isplit(contobj);
+
+    let (revision_counter, contref) = match env.isa().endianness() {
+        ir::Endianness::Little => (lsbs, msbs),
+        ir::Endianness::Big => {
+            let pad_bits = 64 - env.pointer_type().bits();
+            let contref = pos.ins().ushr_imm(lsbs, pad_bits as i64);
+            (msbs, contref)
+        }
+    };
+    let contref = if env.pointer_type().bits() < I64.bits() {
+        pos.ins().ireduce(env.pointer_type(), contref)
+    } else {
+        contref
+    };
+    (revision_counter, contref)


Can this just be something like

Suggested change

let (lsbs, msbs) = pos.ins().isplit(contobj);

let (revision_counter, contref) = match env.isa().endianness() {

ir::Endianness::Little => (lsbs, msbs),

ir::Endianness::Big => {

let pad_bits = 64 - env.pointer_type().bits();

let contref = pos.ins().ushr_imm(lsbs, pad_bits as i64);

(msbs, contref)

}

};

let contref = if env.pointer_type().bits() < I64.bits() {

pos.ins().ireduce(env.pointer_type(), contref)

} else {

contref

};

(revision_counter, contref)

let ptr_ty = env.pointer_type();

assert!(ptr_ty.bits() <= 64);

let contref = pos.ins().ireduce(ptr_ty, contobj);

let shifted = pos.ins().ushr_imm(contobj, 64);

let revision_counter = pos.ins().ireduce(it::types::I64, shifted);

(revision_counter, contref)

?

That is, we define the fat pointer as 128 bits where the upper 64 are the revision counter and the bottom sizeof(pointer) bits are the pointer to the continuation. This way I don't think we ever have to branch on endianness or pointer width.

Yeah, this makes sense to me, will adopt this approach (with a quick round of testing) until we can split into individual ir values.

fitzgen · 2025-06-24T16:36:40Z

crates/cranelift/src/func_environ/stack_switching/fatpointer.rs

+    let contref_addr = if env.pointer_type().bits() < I64.bits() {
+        pos.ins().uextend(I64, contref_addr)
+    } else {
+        contref_addr
+    };
+    let (msbs, lsbs) = match env.isa().endianness() {
+        ir::Endianness::Little => (contref_addr, revision_counter),
+        ir::Endianness::Big => {
+            let pad_bits = 64 - env.pointer_type().bits();
+            let lsbs = pos.ins().ishl_imm(contref_addr, pad_bits as i64);
+            (revision_counter, lsbs)
+        }
+    };
+
+    let lsbs = pos.ins().uextend(ir::types::I128, lsbs);
+    let msbs = pos.ins().uextend(ir::types::I128, msbs);
+    let msbs = pos.ins().ishl_imm(msbs, 64);
+    pos.ins().bor(lsbs, msbs)


And then construction would become something like this:

Suggested change

let contref_addr = if env.pointer_type().bits() < I64.bits() {

pos.ins().uextend(I64, contref_addr)

} else {

contref_addr

};

let (msbs, lsbs) = match env.isa().endianness() {

ir::Endianness::Little => (contref_addr, revision_counter),

ir::Endianness::Big => {

let pad_bits = 64 - env.pointer_type().bits();

let lsbs = pos.ins().ishl_imm(contref_addr, pad_bits as i64);

(revision_counter, lsbs)

}

};

let lsbs = pos.ins().uextend(ir::types::I128, lsbs);

let msbs = pos.ins().uextend(ir::types::I128, msbs);

let msbs = pos.ins().ishl_imm(msbs, 64);

pos.ins().bor(lsbs, msbs)

assert!(env.pointer_type().bits() <= 64);

let contref_addr = pos.ins().uextend(ir::types::I28, contref_addr);

let revision_counter = pos.ins().uextend(ir::types::I128, revision_counter);

let shifted_counter = pos.ins().ishl_imm(revision_counter, 64);

pos.ins().bor(shifted_counter, contref_addr)

Note: if you want to switch the order of the contref pointer and the counter in the fat pointer, that's fine (I think I unwittingly switched it from what is in the code now), but the important point is that we should be able do all of this fat pointer construction and destruction without branching on endianness and pointer width.

frank-emrich and others added 7 commits June 10, 2025 16:17

cranelift: stack-switching updates pass 1

0853a16

This first set of changes updates the base pr in order to compiled and pass basic checks (compile, clippy, fmt) with the biggest part of the change being to eliminate injection of tracing/assertions in JIT'ed code.

cranelift: stack-switching: restore original visibility for a few fun…

858b22a

…c_environ members

cranelift: avoid "as" casts in stack-switching

100d621

Replace either with infallible From or fallible, panicing TryFrom alternatives where required.

cranelift: cleanup stack-switching control_effect signatures

00c2b56

After removing emission of runtime trace logging and assertions, there were several unused parameters. Remove those from the ControlEffect signatures completely.

cranelift: rename stack-switching VMArray to VMHostArray

e9fa92d

This matches a change to the mirrored runtime type in the upstream changes.

posborne requested review from a team as code owners June 10, 2025 17:47

posborne requested review from alexcrichton and removed request for a team June 10, 2025 17:47

posborne commented Jun 10, 2025

View reviewed changes

crates/cranelift/src/stack_switching/instructions.rs Outdated Show resolved Hide resolved

fitzgen requested review from fitzgen and removed request for alexcrichton June 10, 2025 18:13

github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen labels Jun 10, 2025

fitzgen reviewed Jun 11, 2025

View reviewed changes

dhil reviewed Jun 11, 2025

View reviewed changes

crates/cranelift/src/stack_switching/fatpointer.rs Outdated Show resolved Hide resolved

crates/cranelift/src/stack_switching/instructions.rs Outdated Show resolved Hide resolved

crates/cranelift/src/stack_switching/mod.rs Outdated Show resolved Hide resolved

posborne and others added 10 commits June 11, 2025 13:27

stack-switching: fix typo

866fbff

Co-authored-by: Daniel Hillerström <daniel.hillerstrom@ed.ac.uk>

stack-switching: used Index impl for get_stack_slot_data

115e503

stack-switching: use smallvec over vec in several cases

11c6b4a

stack-switching: avoid resumetable naming confusion

fa8001e

stack-switching: cleanup unused params from unchecked_get_continuation

51017f2

The extra parameters here used to be used for emitting runtime assertions, but with those gone we just had unused params and lifetimes, clean those out.

stack_switching: simplify store_data_entries assertion

44c2b34

stack-switching: simplify translate_table_{grow,fill} control flow

b8fb8b3

stack-switching: remove translate_resume_throw stub

a693350

There's already a stub elsewhere and this is not called, when exceptions are added and it is time to revisit, this method can be restored.

stack-switching: compute control_context_size based on target triple

eeab1b1

stack-switching: VMHostArrayRef updates

3bd138f

Rename VMHostArray -> VMHostArrayRef Change impl to compute address with offset upfront rather than on each load.

stack-switching: move cranelift code to live under func_environ

49313bc

This matches the directory structure for gc and aids in visibility for a few members required by stack-switching code in cranelift.

posborne added 2 commits June 18, 2025 19:41

Merge remote-tracking branch 'upstream/main' into stack-switching-cra…

5ccd3f7

…nelift As part of this, updated translate_ref_is_null to use the wasm type rather than brancing on the ir type being an i128.

stack-switching: formatting fix

c86b06a

fitzgen reviewed Jun 24, 2025

View reviewed changes

cranelift: stack-switching support #11003

Are you sure you want to change the base?

cranelift: stack-switching support #11003

Uh oh!

Conversation

posborne commented Jun 10, 2025

Uh oh!

Uh oh!

fitzgen left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

posborne commented Jun 16, 2025

Uh oh!

posborne commented Jun 23, 2025

Uh oh!

fitzgen left a comment

Choose a reason for hiding this comment

Uh oh!

fitzgen Jun 24, 2025

Choose a reason for hiding this comment

Uh oh!

fitzgen Jun 24, 2025

Choose a reason for hiding this comment

Uh oh!

fitzgen Jun 24, 2025

Choose a reason for hiding this comment

Uh oh!

fitzgen Jun 24, 2025

Choose a reason for hiding this comment

Uh oh!

posborne Jun 25, 2025

Choose a reason for hiding this comment

Uh oh!

fitzgen Jun 24, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!