[Task 55] CPS handler codegen: foundation through Phase 4a (multi-arm dispatch) by boldfield · Pull Request #22 · boldfield/sigil

boldfield · 2026-04-26T04:44:51Z

Status: ready for review

Plan B Task 55 — CPS expansion for Expr::Handle and Expr::Perform. Lifts the E0133 + E0134 staged-feature gates and wires effect handlers through the runtime ABI from PR #21 (Task 56). Lands across nine commits on plan-b-task-55 per [DEVIATION Task 55 — Foundation phase ships separately…] in PLAN_B_DEVIATIONS.md; the deviation entry's Closure point is this PR's squash-merge.

Diff stats: 8 files, +2,214 / −266 (~1,950 LOC net). Compiler lib tests 419 → 421 (+2). Three new e2e tests + one differential property test (24 trials, ~50 compile-and-run cycles).

What's in scope

The PR ships the supported subset of handler shapes; restrictions outside the subset are rejected at codegen entry by unsupported_handle_construct with a Phase-4-pointing diagnostic. Each restriction has a [DEVIATION Task 55 …] entry naming its Closure point (the Phase-4 sub-task that lifts it).

Phase	Commit	What it ships
Foundation	`b3af204`	Lift E0133; route `Item::Effect` through codegen entry walker.
2	`2d69b52`	Lift E0134; effect/op IDs assigned at typecheck; `Expr::Handle` body-pass-through (no-perform case); codegen-entry guard for shapes outside the supported subset.
3a	`ef4be8d`	Wire `sigil_handler_frame_new` / `_push` / `_pop` around every handle body (arm fn pointers still null).
3b	`d0aa4c4`	Per-handler-arm synthetic CPS fn synthesis; `sigil_handler_frame_set_arm` + `sigil_perform` lowering. First real handler dispatch end-to-end.
3b fixup	`2e7c0de`	Route `sigil_perform`'s `NextStep::Call` through `sigil_run_loop` (the dispatcher).
4a	`adcb897`	Multi-arm single-effect handlers.
Review fixups #1	`54b4a60`	Walker recursion bug; `Stmt::Perform` non-IO crash; cleanup (dead-code, op_names.dedup, span param, stale comments, Closure-point lines, etc.).
Review fixups #2	`95472ec`	MF2 differential identity property test; deviation entries for PB1/PB3/PB4.
Elaborate fix	`6057373`	Pre-existing bug exposed by MF2: `elaborate` skipped recursion into `Expr::Call` / `Expr::RecordLit` / `Expr::Handle` children.

Supported handler subset (Phase 3b/4a)

1+ op-arms, all targeting the same effect (multi-effect → Phase 4e).
Each op has zero user args (args-buffer packing → Phase 4b).
Each arm body is Expr::IntLit only (richer arm bodies → Phase 4c).
Arms cannot reference k (continuation reification → Phase 4d).
No return arm (synthetic return-fn registration → Phase 4f).
Handle body's non-IO performs must be zero-arg (lifts in Phase 4b alongside the op-arg restriction).

Programs outside this subset surface a clear in-progress diagnostic at codegen entry instead of compiling and crashing.

Tests

Lib tests (compiler crate): 419 → 421 (+2 — the elaborate recursion regressions). Existing typecheck handle tests (9) all pass under the new gate state.

e2e tests added on this PR:

effect_decl_with_no_handler_use_compiles_and_runs — foundation phase, effect-only program compiles end-to-end.
handle_with_no_perform_in_body_compiles_and_runs — Phase 2 body-pass-through path.
handle_with_non_io_perform_runs_arm_and_returns_value — Phase 3b first real handler dispatch (prints 42).
handle_with_two_arms_dispatches_correct_arm_by_op_id — Phase 4a multi-arm dispatch.
handle_with_mixed_effect_arms_is_rejected_at_codegen — Phase 4e blocker pinned.
handle_with_non_intlit_arm_body_is_rejected_at_codegen — IntLit-only restriction (Call shape + Binary shape).
nested_handle_in_outer_body_propagates_inner_unsupported_diagnostic — walker-recursion regression.
statement_form_non_io_perform_inside_handle_compiles_and_runs — Stmt::Perform non-IO crash regression.
cps_wrapped_identity_matches_native_on_native_eligible_programs — MF2 differential property test, 24 deterministic trials comparing native vs handle EXPR with { E_eff.op(k) => 999 } wrapped output.

Review-comment items (closed)

Item	Closed by
#1 Walker doesn't recurse into handle body	`54b4a60` (recursion + regression test).
#2 Dead `body_contains_non_io_perform` helpers	`54b4a60` (deleted ~115 LOC).
#3 `op_names.dedup()` masks invariant	`54b4a60` (replaced with `debug_assert!`).
#4 Duplicate Phase 3a paragraph in PROGRESS	`54b4a60`.
#5 Misleading test name `…_uses_k_…`	`54b4a60` (renamed + added `1+1` variant).
#6 No stackmap entry on synthetic arm fns	`54b4a60` (TODO with Phase 4c closure).
#7 Unused `span` param in `check_handle`	`54b4a60` (dropped from signature).
MF1 `Stmt::Perform` non-IO crash	`54b4a60` (dispatch + e2e test).
MF2 Differential identity property test	`95472ec` + `6057373`.
MF3 Color stub comment	`54b4a60` (named Phase 4d as closer).
MF4 Stale offset-24 comments	`54b4a60`.
MF5 Duplicate Phase 3a paragraph	`54b4a60`.
MF6 Stale E0133/E0134 comments	`54b4a60` (5 sites + TODO@2955).
PB1 Walker doesn't follow call edges	`95472ec` (deviation entry; closure: Phase 4b).
PB2 Closure-ptr null marker	`54b4a60` (`PHASE-4-RESTRICTION` comment).
PB3 Frame leak on body unwind	`95472ec` (deviation entry; closure: Task 57).
PB4 Synchronous `sigil_run_loop` from native callers	`95472ec` (deviation entry; closure: Phase 4d).
Closure-point lines on Task 55 deviation entries	`54b4a60`.

Known carry-forwards (out of Task 55 scope; tracked elsewhere)

Per-monomorph color variance under handler-context (PR [Tasks 51 + 52] Stage 5 closeout: generic_map example + P16/P17 prompts #18 / [Task 54] Type checker: handler typing + E0220 one-shot linearity #20 / [Task 56] Runtime: HandlerFrame, arena, sigil_perform, run_loop, counters #21 reviewer's Stage 6 ask). The colorer doesn't refine handler-discharge in the general case (only the inline-handle case via find_non_io_perform_in_expr). Closes alongside Phase 4d (the comment in compiler/src/color.rs::find_non_io_perform_in_expr names this dependency).
Reviewer's [PLAN-B] Task 54: handler arm surface (qualified-only vs bare-op-as-sugar) QUESTIONS.md entry. Stays open for the Stage 6 review checkpoint.

Reviewer-only run notes

If the e2e tests fail with link errors against _sigil_handle_push / _sigil_handler_frame_new etc., the local target/debug/libsigil_runtime.a is stale. Rebuild via cargo build -p sigil-runtime and re-run.

Test plan

CI green on ubuntu-24.04 + macos-14 for build + test and cold-checkout at 6057373.
Pod-verify clean (cargo check workspace, fmt, clippy on both crates one-at-a-time, runtime lib tests, discipline greps).
Reproducibility script: 4/4 example binaries byte-identical across builds — synthetic sigil_handler_arm_<global_index> symbol naming verified deterministic.
All Task 54 typecheck tests still pass (handler-typing, E0220 linearity, E0136-E0141, no spurious E0046 on arm-param/k/return-arm bindings).
All Task 55 codegen tests pass.
MF2 differential property test green (24 trials, native vs CPS-wrapped identity holds).

…gen entry walker Foundation phase of Plan B Task 55. Asymmetric gate state: E0133 lifted, E0134 still firing. This is the first commit on `plan-b-task-55`; the CPS calling convention, body transform, per-arm closure synthesis, and E0134 lift land in follow-up commits on this same branch (no PR until the full path runs at least one real handler example end-to-end). - typecheck: drop E0133 emission for `Item::Effect` (effect decls now flow through with only the registry-population walk + op-signature type checks). E0134 emission for `Expr::Handle` retained with an updated message pointing at the in-progress task. - codegen: drop the `Item::Effect → return true` short-circuit in `contains_apply_or_generic_ref`. Effect declarations have no codegen output; downstream passes (monomorphize, color, closure_convert) already pass them through unchanged from Task 54. - catalog: remove the E0133 entry; rewrite the E0134 entry's long-form text to describe the in-progress codegen status. - typecheck tests: 21 co-firing `assert!(has_code(&errs, "E013[34]"))` lines stripped via perl one-liner; 4 pure-emission tests rewritten for the asymmetric state — `well_formed_effect_decl_typechecks_cleanly` and `effect_decl_with_invalid_op_type_emits_e0112` for the lifted half, `handle_expr_still_emits_e0134_until_codegen_lands` and the two co-firing variants for the still-gated half. - codegen tests: new positive walker test `walker_accepts_program_with_effect_decl` pins the new behavior. - e2e: new `effect_decl_with_no_handler_use_compiles_and_runs` test proves an effect-only program compiles end-to-end through codegen for the first time (CI-only; pod can't run sigil binaries). - discipline: `no_user_facing_error_uses_e0001` sweep programs unchanged; comment updated to describe the asymmetric state. - PLAN_B_PROGRESS / PLAN_B_DEVIATIONS updated. New deviation entry documents the foundation/CPS commit split on a single branch. Pod-verify clean. Compiler lib-test count delta: net +1 (4 pure-emission tests rewritten in place + 1 new walker test). Catalog gains 0 entries (E0133 removed, E0134 kept).

…-perform case) Phase 2 minimum of Plan B Task 55. Both staged-feature gates are now lifted (E0133 in `b3af204`, E0134 here). Well-formed `handle` expressions reach codegen for the first time. The full handler-frame setup + per-arm CPS fn synthesis + `sigil_perform` wiring lands in Phase 3+ on this same branch. - typecheck: drop the E0134 emission from `check_handle`. All handler- typing diagnostics emitted above the gate (E0046, E0220, E0044, E0136-E0141) continue to surface as before. Catalog entry for E0134 removed; the code number stays retired per the `seed_entries_are_unique_and_non_empty` discipline. - typecheck: add `effect_ids: BTreeMap<String, u32>` and `op_ids: BTreeMap<(String, String), u32>` to `CheckedProgram`, populated at the end of `typecheck()` with deterministic alphabetical assignment over `tc.effects`. These are the IDs the runtime ABI uses (`sigil_handler_frame_new(effect_id, ...)`, `sigil_perform(effect_id, op_id, ...)`); same source program → same IDs across builds. 3 new tests pin the assignment scheme. - codegen: replace the `unreachable!` in `lower_expr` for `Expr::Handle` with a body-pass-through (`self.lower_expr(body)`). `type_of_expr` for `Expr::Handle` forwards to body's type. The pass-through is safe because the codegen-entry guard `unsupported_handle_construct` rejects programs whose body would actually perform a non-IO effect with a clear in-progress message. IO `perform` in handle bodies is fine — `IO` is hard-wired in `lower_perform`. - codegen: new `unsupported_handle_construct(program) -> Option<String>` walker traverses every fn body, every block, every expression, flagging handle expressions whose body contains a non-IO `Expr::Perform` site or a `return` arm. Conservative scope: only direct `Expr::Perform` in the body is inspected (transitive performs through called fns aren't followed); acceptable for the Phase 2 e2e test program (body is a literal) but a documented limitation until Phase 3 ships the real handler-frame setup. - typecheck tests: the four asymmetric-state tests added in `b3af204` rewritten as positive `well_formed_handle_expr_typechecks_cleanly` + co-firing-diagnostic tests (`handle_expr_with_nested_body_type_error_surfaces`, `handle_arm_body_type_mismatch_surfaces_e0044_or_e0065`). - e2e: new `handle_with_no_perform_in_body_compiles_and_runs` — first time a handle expression compiles + runs end-to-end (prints `42`, exit 0). New `handle_with_non_io_perform_in_body_is_rejected_at_codegen` pins the codegen-time guard fires cleanly with a Task-55-referencing error message. - discipline sweep `no_user_facing_error_uses_e0001` programs unchanged; comment updated. - color.rs synthetic test fixture extended with the two new `CheckedProgram` fields (empty BTreeMaps). - PLAN_B_PROGRESS / PLAN_B_DEVIATIONS updated. New deviation entry documents the Phase 2 → Phase 3+ split. Pod-verify clean. 41 related compiler unit tests pass locally.

… handle bodies Phase 3a of Plan B Task 55. Wires the runtime handler-frame ABI from Task 56 around every `handle` body: each `handle BODY with { arms }` now allocates a frame, pushes it on the runtime handler stack, evaluates the body inline, then pops the frame. Arm fn pointers are not yet set (left null by `sigil_handler_frame_new`'s zero-init); Phase 3b adds per-arm CPS fn synthesis + `sigil_handler_frame_set_arm` + `sigil_perform` lowering. The frame is functionally a no-op in Phase 3a — its arm slots are never read because the codegen-entry guard `unsupported_handle_construct` rejects programs whose body would actually `perform` the handled effect — but the FFI plumbing is real, observable in compiled output, and the path Phase 3b builds on. - codegen: declare 3 new runtime imports — `sigil_handler_frame_new(effect_id: u32, arm_count: u32) -> *mut HandlerFrame`, `sigil_handle_push(frame)`, and `sigil_handle_pop() -> *mut HandlerFrame`. - codegen: `Lowerer` gains `handler_frame_new_ref` / `handle_push_ref` / `handle_pop_ref` FuncRef fields, plus an `effect_ids: &BTreeMap<String, u32>` borrow from the `CheckedProgram` for effect-id lookup at handle sites. - codegen: replace the Phase 2 body-pass-through in `Expr::Handle` with the frame_new + push + body-eval + pop sequence. The handle's Cranelift value is still the body's value (the frame allocation has no effect on the body's evaluation). - codegen: `unsupported_handle_construct` tightened — also rejects multi-arm handles, return arms, and (defensively) zero-arm handles. Phase 3a's codegen path indexes `op_arms[0]` for the unique effect-name lookup, so any handle that escapes these checks would trip out-of-bounds; the guard front-loads a clear in-progress diagnostic instead. - The Phase 2 e2e test `handle_with_no_perform_in_body_compiles_and_runs` continues to pass — it now exercises the frame ABI for the first time end-to-end (runtime allocates + pushes + pops a frame for the `Raise` handler around the literal `42`). The `handle_with_non_io_perform_in_body_is_rejected_at_codegen` test continues to pin the codegen-time rejection. Tiny runtime regression: the no-perform handle now allocates + pushes + pops a frame on every invocation (previously a no-op pass-through). Acceptable Phase 3a cost; Phase 3b makes the frame functional rather than just present. PLAN_B_PROGRESS / PLAN_B_DEVIATIONS updated. New `[DEVIATION Task 55 — 2026-04-26]` entry on the Phase 3a / Phase 3b split. Pod-verify clean.

…; first real handler dispatch Phase 3b of Plan B Task 55. Handlers now dispatch arms at runtime — the first time a `perform` actually invokes a user-written handler arm end-to-end. Combined with Phase 3a's frame ABI (`ef4be8d`), this completes the simplest meaningful subset of the handler runtime: single-arm, single-effect, zero-arg op, k-discarding arm, literal arm body. Phase 4+ lifts each restriction. - codegen: 3 new FFI declarations — `sigil_handler_frame_set_arm`, `sigil_perform`, `sigil_next_step_done`. - codegen: pre-pass `collect_handle_arms_in_block` / `_in_expr` walks every user fn body, finds every `Expr::Handle`, allocates one synthetic CPS-fn FuncId per arm with the uniform calling convention `extern "C" fn(closure_ptr, args_ptr, args_len) -> *mut NextStep`. Symbol naming `sigil_handler_arm_<global_index>`. Side-table `handler_arm_indices: BTreeMap<Span, Vec<usize>>` threads per-handle FuncIds through to the calling fn's lowering. - codegen: synthetic-fn definition pass at the bottom of `emit_object` lowers each arm body via a small hand-rolled Cranelift sequence: `iconst(value)` → `call sigil_next_step_done(value)` → `return result`. Phase 3b restricts arm bodies to `Expr::IntLit` so this hand-roll suffices. - codegen: `Expr::Handle` lowering in `Lowerer::lower_expr` extended to populate arm slots between `frame_new` and `push`. For each arm: `func_addr` to get the synthetic fn's pointer, then `sigil_handler_frame_set_arm(frame, op_id, fn_ptr, null_closure)`. - codegen: new `lower_perform_non_io_to_value` method emits `sigil_perform(effect_id, op_id, args_ptr=null, args_len=0, k_closure=null, k_fn=null)` for non-IO performs, then reads `(*next_step).value` at offset 24 (per the `#[repr(C)]` NextStep layout) as i64. `Expr::Perform` arm in `lower_expr` dispatches: IO → existing path returning Unit (i8 0); non-IO → CPS path returning the value. - codegen: `type_of_expr` for `Expr::Perform` looks up the op's declared return type from the `effects` registry for non-IO; defaults to I8 for IO. New `effects` and `op_ids` borrows on `Lowerer`. - codegen: `unsupported_handle_construct` walker updated to enforce Phase 3b restrictions: arm body is `IntLit` only, arm has no user op-args (only `k`), body's non-IO perform must be zero-arg. The body-perform check from Phase 2/3a is gone (now supported); multi-arm/return-arm checks remain (Phase 4+). - e2e: replaced `handle_with_non_io_perform_in_body_is_rejected_at_codegen` (Phase 2 rejection test, no longer rejected) with two new tests: `handle_with_non_io_perform_runs_arm_and_returns_value` (the Phase 3b milestone — `handle (perform Raise.fail()) with { Raise.fail(k) => 42 }` compiles, runs, prints `42`), `handle_with_arm_that_uses_k_is_rejected_at_codegen` (pins the Phase 3b restriction; arm body referencing `k` rejected with a Phase-4-pointing message). - The Phase 2/3a `handle_with_no_perform_in_body_compiles_and_runs` test continues to pass, now exercising the full `frame_new` + `set_arm` + `push` + `pop` sequence (set_arm is new — the arm fn pointer is registered even though the handler is never invoked). PLAN_B_PROGRESS / PLAN_B_DEVIATIONS updated. New `[DEVIATION Task 55 — 2026-04-26]` entry on Phase 3b restrictions. Pod-verify clean. Local e2e tests can't run (Cranelift OOM on pod); CI is the source of truth for the new e2e tests.

…gh sigil_run_loop CI failure on the new e2e test (`handle_with_non_io_perform_runs_arm_and_returns_value`) revealed an incorrect assumption: `sigil_perform` does NOT invoke the arm directly — it builds a `NextStep::Call` to the arm + packed `(args, k_closure, k_fn)` and returns that Call NextStep. The caller is responsible for dispatching it (i.e. running the trampoline) to reach the terminal `Done(value)`. The previous Phase 3b lowering read `(*ns).value` at offset 24 from the returned NextStep, which for a Call NextStep is unwritten — so on Linux + macOS the value came back as 0 (arena zero-init), making the e2e test print "0\n" instead of "42\n". Fix: declare `sigil_run_loop(initial: *mut NextStep) -> u64` as a new FFI import and pass `sigil_perform`'s Call NextStep to it inside `lower_perform_non_io_to_value`. `sigil_run_loop` invokes the arm fn, then dispatches any further Calls the arm returns, until a terminal Done — returning the Done's value as u64. Native code uses the u64 directly. This is also the more general path: even when arm bodies grow more complex (Phase 4+ — k usage, op-args, multi-step computations), `sigil_run_loop` continues to be the right dispatcher; only the codegen for arm bodies and perform-side args packing changes. Pod-verify clean. Single-fixup commit.

Phase 4a of Plan B Task 55. Handlers can now have multiple op-arms when all arms target the same effect. The runtime `HandlerFrame` already supports up to MAX_HANDLER_ARMS=14 arm slots (Task 56); codegen iterates `op_arms` to emit one `sigil_handler_frame_set_arm(frame, op_id, fn_ptr, null_closure)` call per arm. The `unsupported_handle_construct` codegen-entry guard is updated to: - Allow `op_arms.len() > 1` (was rejected in Phase 3a/3b). - Reject mixed-effect handles with a clear "Phase 4e" diagnostic (the runtime frame's `effect_id` is single-valued; multi-effect needs frame-per-effect). - Defensively reject duplicate `(effect, op)` arm pairs (typecheck E0140 catches this earlier; double-check for safety). - Per-arm checks (IntLit body, no op-args) now run for each arm (was checked only for `op_arms[0]` in Phase 3b). The synthetic-fn pre-pass and definition pass already handle multiple arms per handle (one synthetic CPS fn per arm); no changes needed there. - e2e: new `handle_with_two_arms_dispatches_correct_arm_by_op_id` test runs `handle (perform Choose.right()) with { Choose.left(k) => 10, Choose.right(k) => 20 }` → prints `20` (op_id=1 dispatch through the runtime arm-slot table; alphabetical ordering puts `left` at op_id=0, `right` at op_id=1). - e2e: new `handle_with_mixed_effect_arms_is_rejected_at_codegen` pins the Phase 4a restriction (mixed-effect arms still rejected). Pod-verify clean. Builds on Phase 3b (`2e7c0de`).

boldfield

Review of PR #22 (Plan B Task 55 — handle codegen, Phase 2 → 3b → 4a)

Overall the change is well-scoped and the asymmetric-gate progression is sensible. Tests pin both the lifted gate and the in-progress restrictions. Several issues to address before this leaves draft.

Bugs

1. unsupported_handle_construct doesn't recurse into the handle's body.

In compiler/src/codegen.rs::expr_unsupported_handle, the Expr::Handle arm checks the outer handle's restrictions, walks body_contains_non_io_perform_with_args(body) (which short-circuits on Expr::Handle { .. } => None by design), and then recurses only into arm.body — never into body itself.

A nested handle inside the body bypasses every Phase 3b restriction. Example:

handle (handle 0 with { A.op1(k) => 1, B.op2(k) => 2 })  // multi-effect — never checked
with { Outer.op(k) => 0 }

Because the inner handle is only consulted via body_contains_non_io_perform_filtered (which returns None for Expr::Handle { .. }), its multi-effect / non-IntLit / return-arm / multi-arg restrictions are never enforced. At runtime this can register arms under the wrong effect_id and crash inside sigil_perform's handler-stack walk.

Fix: add if let Some(msg) = expr_unsupported_handle(body) { return Some(msg); } to the Expr::Handle arm of expr_unsupported_handle. Add a regression test exercising a nested multi-effect handle inside another handle's body.

Quality issues

2. Dead code: body_contains_non_io_perform and block_contains_non_io_perform are unused.

Both are marked #[allow(dead_code)] and are entirely subsumed by body_contains_non_io_perform_filtered / block_contains_non_io_perform_filtered. The #[allow(dead_code)] is a smell — just delete them. ~115 lines of cruft.

3. op_names.dedup() silently masks a typecheck invariant.

In compiler/src/typecheck.rs (op_ids assignment loop):

let mut op_names: Vec<&str> = eff_decl.ops.iter().map(|o| o.name.as_str()).collect();
op_names.sort();
op_names.dedup();

Duplicate op names within an effect already fire E0137 upstream. The dedup() here is either dead defensive code, OR (worse) silently corrupts ID assignment if duplicates somehow leak through with errors collected but not aborted. Replace with debug_assert!(op_names.windows(2).all(|w| w[0] != w[1]), "E0137 should have caught this"), or just drop the dedup() entirely.

4. PLAN_B_PROGRESS.md has a duplicated Phase 3a paragraph.

Diff lines 65 and 66 add two near-identical paragraphs:

Phase 3a (ef4be8d): wires the runtime handler-frame ABI ...
Phase 3a (this commit): wires the runtime handler-frame ABI ...

Same content, just different prefix. Looks like a copy-paste leftover from the Phase 3a → Phase 3b transition where the old "(this commit)" version wasn't cleaned up. Drop the second one.

5. Test name handle_with_arm_that_uses_k_is_rejected_at_codegen misleads about what's being tested.

The arm body is k(0) — an Expr::Call, not an Expr::IntLit. The unsupported_handle_construct guard rejects it at the IntLit check, never reaching any "uses k" logic (because there is none — k usage is implicitly forbidden by IntLit-only). The test would pass equally for arm body 1 + 1 or if true then 1 else 2. Either:

Rename to handle_with_non_intlit_arm_body_is_rejected_at_codegen and add a 1 + 1 variant, OR
Add a real "k is referenced inside an IntLit context" test once Phase 4c lands richer arm bodies and a true k-usage check is needed.

Minor

6. Synthetic arm fns at the bottom of emit_object have no stackmap entries.

The hand-rolled iconst → call sigil_next_step_done → return sequence calls into the runtime, which can allocate (NextStep arena bump). There's no stackmap.push_placeholder for that call. Safe in Phase 3b only because (a) closure_ptr arg is null and (b) IntLit body produces no GC roots. But the closure_ptr param IS in the CPS calling convention — once Phase 4c adds richer arm bodies with captures, closure_ptr becomes a live GC root across the sigil_next_step_done call and you'll silently miss it on the stackmap. Worth a TODO comment at minimum, or wire it through now.

7. let _ = span; in check_handle after E0134 lift.

The span arg was used for E0134 emission; with the gate lifted it's only kept warm via let _ = span;. Either drop the param from check_handle's signature or comment why span is retained for Phase 4+.

Recommended order: fix #1 (correctness) and #4 (doc cleanup) before any further commits land on this branch; #2/#3/#5 anytime; #6/#7 can ride along with later phase work.

boldfield · 2026-04-26T15:49:45Z

WIP review — items to address before draft → ready

Reviewing at adcb897. Mechanical state is clean (clippy/fmt/runtime tests/lib tests/pod-verify/reproducibility all pass; e2e 28/30 with 2 environmental perf failures unrelated to this PR). Verdict shape is "items by phase," not "merge / request changes," since this is an explicit WIP draft. Per your standing directive on this branch, all real issues below are must-fix-before-ready.

Per-phase status

Phase	Status
Foundation (E0133 lift)	landed, clean
Phase 2 (E0134 lift, body-pass-through, effect/op IDs)	landed, clean
Phase 3a (frame ABI)	landed, clean
Phase 3b (sigil_perform + arm dispatch)	landed; one real crash bug (see MF1), several stale comments
Phase 3b fixup (Call NextStep through run_loop)	landed; correct shape
Phase 4a (multi-arm single-effect)	landed, clean

Mechanical (verified locally)

cargo fmt, clippy -D warnings on all four workspace crates: clean
cargo test -p sigil-runtime: 64 passed (incl. 3 GC-stress via subprocess)
cargo test -p sigil-compiler --lib: 419 passed
cargo test -p sigil-compiler --test e2e: 28/30 (2 fails are fib_perf / tree perf-floor, environmental, pre-existing — see PLAN_B_DEVIATIONS.md verification-debt entry)
bash scripts/pod-verify.sh: clean
bash scripts/reproducibility.sh: 4/4 example binaries byte-identical across builds — synthetic sigil_handler_arm_<global_index> symbol naming verified deterministic
Note for reviewers running locally: initial e2e run failed all 3 handle tests with link errors against _sigil_handle_push etc. Caused by a stale target/debug/libsigil_runtime.a. Rebuilding the runtime debug staticlib (cargo build -p sigil-runtime) resolved. Worth flagging in the PR description so future reviewers don't chase a phantom link bug.

Must-fix before draft → ready

[MF1] Statement-form non-IO perform crashes the compiler. Lowerer::lower_stmt for Stmt::Perform(p) unconditionally calls self.lower_perform(p) at compiler/src/codegen.rs:2053-2055, which asserts p.effect == "IO" at :2061. Verified by inspection. Repro:

effect Raise { fail: () -> Int }
fn helper() -> Int ![Raise] {
  perform Raise.fail();
  0
}
fn main() -> Int ![IO] {
  let n: Int = handle helper() with { Raise.fail(k) => 42 };
  perform IO.println(int_to_string(n));
  0
}

→ assertion left == right failed: non-IO effect reached lower_perform; left: "Raise" right: "IO".

The unsupported_handle_construct walker doesn't catch this — helper() is a Call expression in the handle body and the walker only inspects direct Expr::Perform sites. Two fixes needed:

lower_stmt must dispatch IO vs non-IO the same way Expr::Perform does (call lower_perform_non_io_to_value for non-IO and discard the value, or refactor the dispatch into a shared helper).
Add an e2e test pinning Stmt::Perform non-IO. The existing handle e2e tests all use expression-form perform in tail position; statement-form is uncovered.

[MF2] Differential identity property test (standing precondition [P1]) is not present. Searched for differential|prop.*identity|Native.*equiv|cps_eq_native|proptest across both crates — zero matches. This was flagged as a Task 55 must-fix precondition in the PR #21 review and on the pre-registered Stage 6 worry list. It's the only evidence you can trust without auditing the CPS transform's correctness reasoning by hand.

Phase 3b lowers the first real CPS-flavoured path (synthetic arm fns + perform routing through run_loop); Phases 4b/4c will lift restrictions and the property test becomes harder to add later, not easier. Land before draft → ready, ideally as a small AST fuzzer plus a runner that compares direct-style native execution to the CPS lowering on Native-eligible programs (no perform).

[MF3] Color analysis still on Task 53 stub. compiler/src/color.rs:380-411 find_non_io_perform_in_expr's Expr::Handle arm carries a "Task 55 will replace this stub with the proper handler-context color rule the PR #18 reviewer flagged for Stage 6" comment. Task 55 hasn't replaced it. With Phase 3b shipping real handler dispatch, the color analyser's classification of handle-containing fns now drives codegen behavior. Today it's still acting as if the handle discharges all body effects from the parent fn's color — correct for current Phase 3b/4a restrictions (synchronous, single-shot, IntLit arm, no k), but Phase 4b+ (k usage, multi-step bodies) requires the rule to be properly stated. Either replace the stub now per PR #18's reviewer ask, or upgrade the comment to spell out exactly which Phase-4 sub-task closes the loop.

[MF4] Stale comments documenting offset-24 reads that no longer happen.

compiler/src/codegen.rs:2207 in Expr::Perform's match arm: "we extract the arm's value from (*ns).value at offset 24." Verified the actual code at :2147 just returns sigil_run_loop's u64 result. The Phase 3b fixup commit (2e7c0de) routed through run_loop but didn't update this comment.
PLAN_B_PROGRESS.md Phase 3b entry: same staleness.

Either delete the offset reference or rewrite to accurately describe the run_loop path. Layout-coupling concerns from the prior PR's worry list are now moot for codegen (no offset-24 read), but document explicitly so future reviewers don't chase a phantom.

[MF5] Duplicate Phase 3a entry in PLAN_B_PROGRESS.md. Two consecutive bullet points say "Phase 3a (ef4be8d)" and "Phase 3a (this commit)" with identical content. One was meant to describe a prior commit, the other a future state. Pick one.

[MF6] Stale E0133 / E0134 comments in compiler/src/typecheck.rs describing those gates as "still firing" when they've been lifted: lines 345-349, 473-476, 2380-2389, 2770-2774, 4218-4222.

The TODO at compiler/src/typecheck.rs:2955-2964 is the worrying one — explicitly says "Today the gate fires first so this case is unreachable in user-visible output." Verified: gate is gone (lifted in Phase 2 via 2d69b52). The case is now reachable. Either prove it can't happen with current Phase 4a restrictions (and update the comment to reflect that), or write a test program demonstrating it doesn't break, or address the underlying inference question (pin via body's perform-call return-type chain, or surface E0132-style ambiguous-polymorphism diagnostic).

Phase-blocking restrictions (must land in subsequent phases on this branch)

[PB1] unsupported_handle_construct doesn't follow call edges (Phase 4 blocker). Already noted in MF1 — walker rejects only direct Expr::Perform in handle bodies. Once MF1 fixes the compiler crash, the walker isn't load-bearing for soundness, but Phase 4b's lifting of the IntLit-only-arm-body restriction will need a real story for transitive perform analysis. Document the closure point in the deviation entry.

[PB2] Closure capture across CPS boundary not yet addressed. Expr::Handle lowering passes closure_ptr: null to sigil_handler_frame_set_arm (codegen.rs:2308, 2322). Correct today (arm bodies are IntLit). Phase 4b/4c (k usage + multi-step bodies + outer-scope captures) will require a real closure record threaded into each arm slot. Add a // PHASE-4-RESTRICTION marker on the null_ptr line with the specific phase that lifts it, so the dependency is grep-findable.

[PB3] Frame leak on body panic. Expr::Handle lowering does frame_new → set_arm* → push → body → pop (codegen.rs:2285-2334). If the body aborts (today: only sigil_panic_arith_error, which kills the process — fine), the frame leaks on the runtime handler stack. When Task 57 replaces arith panic with Raise[ArithError], this leak becomes a real bug — the recovery handler will see a stale frame from a different scope. Phase 4 needs to either (a) introduce a scope guard that pops on early exit, or (b) make pop idempotent and explicitly tear down on the unwind path. Document the dependency on Task 57.

[PB4] Tail-call preservation worry. Phase 3b's lower_perform_non_io_to_value synchronously calls sigil_perform then sigil_run_loop from native code. Works for IntLit-only single-step arms (run_loop completes in 1-2 dispatches and returns). When Phase 4c lifts to multi-step arm bodies and Phase 4d wires k invocation, native callers driving run_loop becomes a stack-blocking pattern that defeats the trampoline's purpose — the fn calling the synchronous-blocking shape needs to be Cps-color so the perform site emits a real NextStep::Call to its caller's trampoline. This is the [MF3] color stub bite. Document explicitly in a deviation entry.

Deferred (genuine future-task / v2+ work only)

Layout-coupling against NextStep shape is moot today (no offset-24 read in codegen); revisit at Task 58 (multi-shot may add fields) or v2.
match-arm CPS join-point unification (worry Plan A2 Task 29: parser extensions (lambda syntax) #5) and let-bindings under nested forms (worry Plan A2 Task 30: typechecker fn types + application unification + capture analysis #6) — Phase 4c/4d concerns; today's restrictions make them unreachable.
Walker consolidation: unsupported_handle_construct is 240+ lines split across expr_unsupported_handle, block_unsupported_handle, body_contains_non_io_perform_with_args, etc. Worth consolidating in Phase 4 when it grows further; not urgent today.
body_contains_non_io_perform at compiler/src/codegen.rs:789 is #[allow(dead_code)] — either delete or wire the live caller.

Spec adherence

The PR ships toward what Task 55 specifies (CPS expansion for CPS-color monomorphs, perform → sigil_perform, handler frame push/pop), with phase restrictions appropriate for an in-progress branch. The 4 deviation entries (foundation, Phase 2, Phase 3a, Phase 3b) follow the file's stated 4-section format but lack an explicit "Closure point" field that prior entries use (Task 4.5.5, Task 47, Task 48, etc.). Recommend adding a Closure point line to each Task-55 deviation naming the specific Phase 4 sub-task (or the PR-22 merge) that lifts each restriction, so reviewers know where each loop closes.

effect_ids / op_ids deterministic-alphabetical assignment: verified clean. tc.effects is BTreeMap (sorted iteration); op-name list explicitly sorted before enumeration. Reproducibility script confirms 4/4 example binaries byte-identical across builds.

Once MF1–MF6 land, the foundation + Phase 4a state is solid enough to base Phase 4b on. PB1–PB4 must close before draft → ready (specifically: PB3 hard-depends on Task 57, PB4 hard-depends on MF3).

Fixes correctness items + cleanup from the WIP review at adcb897: Bugs: - Walker recursion: expr_unsupported_handle's Expr::Handle arm now recurses into the outer handle's body, not just into arm bodies. Without this, a nested handle inside another handle's body bypassed every Phase 4 restriction (multi-effect, return-arm, etc.) and could register arms under the wrong effect_id at runtime. Regression test: nested_handle_in_outer_body_propagates_inner_unsupported_diagnostic. - Stmt::Perform crash: lower_stmt now dispatches IO vs non-IO the same way Expr::Perform does. Previously, statement-form non-IO performs hit the IO-only assertion in lower_perform and crashed the compiler. Pinned by statement_form_non_io_perform_inside_handle_compiles_and_runs. Cleanup: - Delete unused body_contains_non_io_perform / _block walkers (~115 lines, fully subsumed by the _filtered variants). - Replace op_names.dedup() with debug_assert (E0137 catches dups upstream; the dedup masked a typecheck invariant). - Rename handle_with_arm_that_uses_k_is_rejected_at_codegen to handle_with_non_intlit_arm_body_is_rejected_at_codegen and add a '1+1' variant; the original test exercised the IntLit guard, not any 'k usage' check. - Add stackmap TODO on synthetic arm-fn definition pass (Phase 4c closes once arm bodies grow GC roots). - Drop unused span param from check_handle's signature (E0134 lift made it dead). - Replace 'we extract from (*ns).value at offset 24' codegen comment with the actual run_loop dispatch description (Phase 3b fixup routed through run_loop but didn't update the comment). - Remove duplicate Phase 3a paragraph from PLAN_B_PROGRESS.md; refresh status line + commits list to reflect Phase 3b/3b-fixup/4a. - Refresh stale 'E0133/E0134 still firing' comments at five sites in typecheck.rs to point at the lift commits (b3af204 / 2d69b52) and the Phase-4 work that follows. - Upgrade color.rs's Task 53 'stub' comment to name Phase 4d as the closing dependency (the handler-context color rule is load-bearing once arms reify k). - Add PHASE-4-RESTRICTION marker on the null closure_ptr passed to sigil_handler_frame_set_arm so the Phase 4b/4c/4d dependency is grep-findable. - Add Closure point lines to all four Task 55 deviation entries (per other entries' convention) naming the Phase that lifts each restriction.

Closes the remaining must-fix and phase-blocking items from the PR #22 WIP review. MF2 — Differential identity property test. - New e2e test cps_wrapped_identity_matches_native_on_native_eligible_programs generates 24 deterministic Native-eligible Int expressions (small ints + arithmetic + if-else), compiles each twice (raw vs wrapped in `handle EXPR with { E_eff.op(k) => 999 }`), and asserts identical stdout. Catches CPS-lowering bugs that hand-rolled tests would miss by sampling the expression-shape space rather than hitting one case. Embeds an xorshift64 PRNG inline rather than pulling in proptest; determinism + reproducibility take priority over fuzz coverage at this scale. Deviation entries (closure points named): - PB1: walker doesn't follow call edges. After MF1's dispatch fix, this is no longer a soundness concern — transitive performs lower correctly through sigil_perform/run_loop. The walker stays as a Phase-4 ergonomic gate; closure point is Phase 4b's args check. - PB3: handler frame leak on body unwind. Today only sigil_panic_arith_error aborts mid-body (kills process, leak invisible). Closure point is Task 57 — when ArithError becomes recoverable, every Expr::Handle codegen site needs a scope guard or unwind-safe pop. - PB4: native callers drive sigil_run_loop synchronously. Correct for Phase 3b/4a's restricted shape (1-2 dispatches per perform); the synchronous-blocking pattern only defeats the trampoline's stack discipline at Phase 4c+ depths. Closure point is Phase 4d (k-using arms via continuation reification), which closes alongside the colorer's handler-discharge refinement (PR #18 reviewer's Stage-6 ask).

…rdLit/Handle children The MF2 differential property test in the previous commit (95472ec) caught a real preexisting bug: `elaborate` skipped recursion into the children of `Expr::Call`, `Expr::RecordLit`, and `Expr::Handle`, so an `Expr::If` nested inside a call argument / record-literal field value / handle body was never desugared to `Expr::Match`. The unelaborated `Expr::If` then tripped the `unreachable!("Expr::If should have been desugared by elaborate")` arm in `lower_expr` and crashed the compiler. The bug was unreachable in earlier phases because: - Call: existing tests bind compound values to `let` first; no test passed an if-else directly as a call argument. - RecordLit: same pattern — fields were bound to lets first. - Handle: E0134 staged-feature gate short-circuited the pipeline before elaborate; lifted in Task 55 Phase 2 (`2d69b52`). Fix: each arm now recurses into its children with `need_trivial=false` (Stage-3 ANF flattening at call sites is still deferred per the original comment) and aggregates hoisted bindings appropriately: - Call: hoisted bindings bubble up to the caller (a Call is not a scope boundary). - RecordLit: same. - Handle: body + return-arm body + each op-arm body are inner scopes; hoisted bindings are wrapped in `Expr::Block` so they cannot leak past the handle/arm boundary, mirroring the existing Match-arm and Lambda-body patterns. Two new lib tests pin the recursion behaviour directly (independent of the MF2 e2e property test): - `if_inside_call_args_is_desugared_to_match` - `if_inside_handle_body_is_desugared_to_match` `compiler/src/elaborate.rs` 410 → 412 lib tests (+2). Pod-verify clean.

boldfield

Re-review of PR #22 (Plan B Task 55 — handle codegen, foundation through Phase 4a)

All seven items from the prior review (#1 walker recursion, #2 dead code, #3 op_names.dedup, #4 PROGRESS duplicate, #5 misleading test name, #6 stackmap TODO, #7 unused span param) are addressed cleanly. The MF2 differential property test, the elaborate-non-recursion fix, and the Stmt::Perform non-IO crash fix are all good additions that the author found and closed independently. The deviation entries for PB1/PB3/PB4 with named closure points are well-formed and correctly downgrade PB1 from "soundness" to "ergonomic gate" given the MF1 dispatch fix.

A few new concerns from this read.

Substantive

1. indices.insert(span.clone(), arm_indices) overwrites silently.

compiler/src/codegen.rs::collect_handle_arms_in_expr (the Expr::Handle arm) does:

indices.insert(span.clone(), arm_indices);

If two Expr::Handle nodes share a span — e.g., a future mono pass that clones a generic fn body, a closure-conversion lambda lift that duplicates a handle, or any other AST-cloning transform — the second insert silently overwrites the first. lower_expr then looks up handler_arm_refs_per_handle.get(span) and gets the wrong FuncRefs for one of the handles.

Today this is safe only because Phase 3b restricts arm bodies to IntLit (parameter-free, so all aliased monomorphs would share semantics by construction). Phase 4c lifts that restriction. The PR description acknowledges per-monomorph color variance as a Stage-6 carry-forward; the bug becomes user-observable the moment that lands.

Fix: debug_assert!(indices.insert(span.clone(), arm_indices).is_none(), "duplicate handle span — pre-pass invariant broke for {span:?}"); — cheap, fails loudly when the assumption stops holding. Same idea as the op_names dedup→debug_assert swap from review-fixup commit 54b4a60.

2. MF2 property test doesn't exercise the perform-dispatch path.

The wrapper is handle EXPR with { E_eff.op(k) => 999 } where EXPR is the generated Native-eligible body. The body never performs E_eff.op, so the synthetic arm fn is dead code at runtime — the test validates only the Phase 3a frame plumbing (frame_new + set_arm + push + pop) doesn't corrupt the body's native semantics.

The Phase 3b/4a dispatch path (sigil_perform → sigil_run_loop → arm → next_step_done → value extraction) is not validated by the property test despite being the load-bearing mechanism Phase 4 builds on. Hand-rolled tests cover one or two cases (Raise.fail() => 42, Choose.right() => 20) but they don't sample the shape space.

Fix recommendation: add a paired generator that emits a wrapper-with-perform shape — handle (perform E_eff.op()) with { E_eff.op(k) => N } for some constant N, asserting stdout N\n. The body is IntLit-bodied (so it stays inside Phase 4a's subset) but the dispatch loop runs end-to-end on every trial. Even 8–12 trials would catch a real regression and stay inside the CI time budget.

This isn't a blocker for merge, but the MF2 test as it stands is named "differential identity" yet tests only the no-perform half of identity. Worth a docstring caveat at minimum.

3. No e2e test for op_id ≥ 2.

Phase 4a's runtime supports MAX_HANDLER_ARMS = 14 (per Task 56 / PR #21). The current matrix exercises 1-arm (Raise.fail at op_id=0) and 2-arm (Choose.{left,right} at op_id ∈ {0,1}). A 3+-arm test would validate op_id arithmetic isn't off-by-one for op_id ≥ 2 — e.g., effect Pick { a, b, c } performing Pick.c() would expect op_id=2 dispatch.

Cheap to add, would catch a class of subtle indexing bugs that 1- and 2-arm tests can't surface.

Nits (no action needed unless trivial)

4. Xorshift64::bool method name shadows the built-in bool type identifier at call sites. Calling rng.bool() reads ambiguously — next_bool() would mirror next() and make method-call sites unambiguous. Cosmetic.

5. The Phase 3b deviation entry's "FFI surface" diagram still says "next_step_done → NextStep value extraction" rather than "→ run_loop dispatch". The lowering changed in fixup 2e7c0de to route through sigil_run_loop, but the old diagram wording in the deviation rationale wasn't fully updated. Code comments are correct; only the deviation-entry rationale text is stale. (The neighbouring deviation entries that name the closure point ARE updated.)

The substantive items (#1, #2, #3) are not merge blockers individually, but #1 in particular is the kind of latent invariant that's much cheaper to assert now than to debug post-Phase-4c when it actually bites. Recommend addressing #1 in this PR; #2/#3 can land in a follow-up.

boldfield · 2026-04-26T17:01:24Z

Re-review at `6057373`

PR moved from draft → ready. Three new fix-up commits since prior review (54b4a60, 95472ec, 6057373). Mechanical state still clean — all 4 CI lanes SUCCESS, 419 compiler lib tests pass locally. Per your standing directive on this branch, the items still pending below are must-fix.

Headline result

The MF2 differential property test caught a real preexisting bug. Commit 6057373 documents this directly: the property test exercised expression shapes that hand-rolled tests had missed, and elaborate was skipping recursion into Expr::Call / Expr::RecordLit / Expr::Handle children — so a nested Expr::If inside a call argument / record field / handle body was never desugared to Expr::Match, then crashed lower_expr's unreachable!. Was unreachable in earlier phases for incidental reasons (existing tests bound compound values to let first); Phase 2's E0134 lift exposed it. Two regression tests added at the elaborate level. Exactly the structural-correctness category I argued the property test would catch.

Per-item verification

Item	Status
MF1 — `Stmt::Perform` non-IO crash	✅ fixed; regression test `statement_form_non_io_perform_inside_handle_compiles_and_runs` added
MF2 — differential identity property test	✅ added; substantive (xorshift64 PRNG, 24 deterministic Native-eligible expressions, vacuous-handler wrapping, raw vs wrapped stdout comparison). Caught the elaborate non-recursion bug.
MF3 — color stub at `color.rs:380-411`	❌ comment essentially unchanged. The PB4 deviation entry now names Phase 4d as the closure point for the color-handler-discharge refinement, but the in-source comment still reads "Task 55 will replace this stub" without referencing PB4 / Phase 4d. Update the comment to cross-reference the PB4 deviation entry's Phase 4d closure point.
MF4 — `codegen.rs:2207` stale "offset 24" comment	❌ unchanged. Verified: the comment still says "we extract the arm's value from `(*ns).value` at offset 24" but the actual code routes through `lower_perform_non_io_to_value` → `sigil_run_loop` and reads its u64 return. Either delete the offset reference or rewrite to describe the run_loop path.
MF5 — duplicate Phase 3a entry in `PLAN_B_PROGRESS.md`	❌ unchanged. Lines 103 + 104 still both labeled "Phase 3a" with identical content. Delete one.
MF6 — TODO at `typecheck.rs:2955`	❌ unchanged. Still says "Today the gate fires first so this case is unreachable in user-visible output." Verified: E0134 was lifted in Phase 2 (`2d69b52`). The case IS now reachable. Either prove it can't happen with current Phase 4a restrictions and update the TODO comment, or write a test program that demonstrates it doesn't break, or address the underlying inference question.
MF6 — stale `E0133`/`E0134` comments in `typecheck.rs`	❌ unchanged. Lines 345-346, 473, 621, 2386, 2770 all still describe gates as firing. Sample at `:473`: "E0133 still fires per `Item::Effect`" — not true since `b3af204` lifted it. The sweep test at `typecheck.rs:7008-7015` correctly asserts `!has_code(&errs, "E0133")` so the test agrees with the actual behavior, but the prose comments contradict it.
PB1 — walker doesn't follow call edges	✅ deviation entry added (Closure point: Phase 4b). Substantive.
PB2 — `// PHASE-4-RESTRICTION` markers on null_closure lines	❌ not added. `codegen.rs:1736-1737` and `:2368-2370` still just say `let null_closure = builder.ins().iconst(pointer_ty, 0)` with no Phase-4-RESTRICTION marker. Add the marker on each `null_closure` line so the dependency on Phase 4b/4c (closure capture lift) is grep-findable.
PB3 — handler frame leak on body unwind	✅ deviation entry added (Closure point: Task 57). Substantive.
PB4 — native callers drive `sigil_run_loop` synchronously	✅ deviation entry added (Closure point: Phase 4d, tied to colorer refinement). Substantive.

Mechanical (verified locally)

cargo test -p sigil-compiler --lib: 419 passed
CI: all 4 lanes SUCCESS on 6057373
The dead-code body_contains_non_io_perform walker (~120 lines) was deleted as a bonus during the fix-up — clean

Remaining must-fix list (post-prior-review)

MF3 — update compiler/src/color.rs:380-411 comment to cross-reference the new PB4 deviation entry (or its Phase 4d closure point), so the in-source dependency is discoverable.
MF4 — fix compiler/src/codegen.rs:2207 stale offset-24 comment.
MF5 — remove the duplicate Phase 3a entry in PLAN_B_PROGRESS.md (line 103 or line 104).
MF6 — address the TODO at compiler/src/typecheck.rs:2955 (the "gate fires first → unreachable" claim is now false).
MF6 — sweep compiler/src/typecheck.rs for stale "E0133/E0134 fires" comments and update each to reflect the lift (lines 345-346, 473, 621, 2386, 2770 known; consider a git grep for additional sites).
PB2 — add // PHASE-4-RESTRICTION (closure capture, Phase 4b/4c) markers next to the two null_closure lines in compiler/src/codegen.rs.

All six are documentation/comment fixes — none touches functional code or invalidates the test suite. They're small but they're the kind of drift that compounds across phases and confuses future readers (or future agents resuming this work). The PB4 deviation entry itself is excellent — it just needs the in-source comment at color.rs to point at it.

Once these six land, this is ready to merge.

Substantive items: - indices.insert overwrite now asserted via debug_assert!(prev.is_none(), ...) at codegen.rs:967-985. Two `Expr::Handle` nodes sharing a span would silently lose one's `FuncRef`s; today AST-cloning passes don't do this, but the assertion fails loudly when the assumption stops holding (mirrors the op_names dedup → debug_assert pattern from 54b4a60). Insert lives outside the assert so the side effect runs in release builds. - Paired MF2 generator `cps_dispatch_returns_arm_value_across_op_id_shape_space` exercises the full `sigil_perform → sigil_run_loop → arm fn → next_step_done` dispatch path on every trial (12 trials with arm values sampled across [-99, 99]). The original MF2 test only exercised the no-perform half of identity; this closes that gap. - New e2e `handle_with_three_arms_dispatches_op_id_two`: 3-arm handler (Pick.{a,b,c}), performs `Pick.c()`, asserts dispatch returns 2. Validates op_id assignment (alphabetical: a=0, b=1, c=2), arm-slot bitmap indexing at i=2, and `sigil_perform`'s linear walk for op_id ≥ 2 — coverage gap not closed by the existing 1-arm and 2-arm e2e tests. Documentation/comment fixes: - color.rs `find_non_io_perform_in_expr` Phase-coverage comment now cross-references the PB4 deviation entry by title and names the Phase 4d closure point. - typecheck.rs:2962 TODO rewritten: the unsolved-handler-overall edge case is closed today not by typecheck but by codegen's IntLit-only restriction (every arm pins handler_overall to Int); Phase 4c lifts the restriction at which point this path needs a decision. - typecheck.rs asymmetric-gate-tests section comment updated to reflect both gates now lifted. - typecheck.rs:7248 + :7280 stale "still fires" / "still gated" comments updated to point at the lift commits. - codegen.rs `lower_perform_non_io_to_value` now has explicit PHASE-4-RESTRICTION markers on `null_args_ptr` (Phase 4b — args packing) and `null_k_closure`/`null_k_fn` (Phase 4d — continuation reification). Pairs with the existing marker on the set_arm null_ptr at line 2242. - PLAN_B_DEVIATIONS.md Phase 3b entry's FFI-surface diagram updated: `sigil_perform → sigil_run_loop dispatch → arm fn → next_step_done → run_loop returns u64 → pop` (was: `→ NextStep value extraction → pop`, stale since the 2e7c0de fixup). Nit: - Xorshift64::bool → Xorshift64::next_bool (unambiguous at call sites, mirrors `next` and `range`).

boldfield

Re-re-review of PR #22 — fixup commit `ca326594`

All five items from the prior re-review are addressed cleanly. No new issues.

Substantive items closed:

#1 indices.insert overwrite — debug_assert!(prev.is_none(), ...) added at compiler/src/codegen.rs:967-985. Insert lives outside the assert so the side effect runs in release. Comment correctly references the op_names dedup→debug_assert pattern from 54b4a60 and names PB4's Phase 4d as the most likely future trigger.
#2 MF2 dispatch coverage — new cps_dispatch_returns_arm_value_across_op_id_shape_space (12 trials, arm values sampled from [-99, 99]). Body is (perform E_eff.op()) so the full sigil_perform → run_loop → arm fn → next_step_done path runs every iteration. Different PRNG seed (0x4D_46_32_44_49_53_50_00 = "MF2DISP\0") avoids overlap with the original MF2's seed. Verified the parser folds -N literals into Expr::IntLit(-N) (parser.rs:907), so negative arm values pass the codegen IntLit-only guard.
#3 3-arm e2e test — new handle_with_three_arms_dispatches_op_id_two. effect Pick { a, b, c } performs Pick.c(), asserts dispatch returns 2 (op_id=2). Validates alphabetical ID assignment, arm-slot indexing at i≥2, and sigil_perform's linear walk past unmatched arms 0/1.

Nits closed:

#4 Xorshift64::bool → next_bool — done at all 3 call sites.
#5 Phase 3b deviation FFI diagram — updated to frame_new → set_arm → push → sigil_perform → sigil_run_loop dispatch → arm fn → next_step_done → run_loop returns u64 → pop, with an explanatory sentence about the 2e7c0de fixup that introduced the run_loop step.

Bonus cleanups (not requested, all reasonable):

color.rs find_non_io_perform_in_expr — cross-references PB4 deviation entry by title and names Phase 4d as the closure point.
typecheck.rs:2962 TODO rewritten as a Closure-point entry; the unsolved-handler-overall edge case is correctly identified as closed by codegen's IntLit-only restriction (every IntLit arm body unifies handler_overall to Int), with Phase 4c as the lift point.
typecheck.rs "asymmetric gate state" header + two inline test comments updated to reflect both gates lifted.
codegen.rs::lower_perform_non_io_to_value — PHASE-4-RESTRICTION markers on null_args_ptr/zero_len (Phase 4b) and null_k_closure/null_k_fn (Phase 4d), pairing with the existing marker on set_arm's null_ptr. Phase-4 dependency edges are now grep-findable.

CI green at ca326594 on both ubuntu-24.04 and macos-14 (build+test and cold-checkout).

Nothing else to flag from my side. Ready to leave draft once any other reviewers sign off.

boldfield · 2026-04-26T17:25:47Z

Re-review at `ca32659` — merge

All 6 items from the prior re-review are addressed. Plus three bonus additions per the commit message (defensive indices.insert assertion, paired MF2 generator exercising the full dispatch path, 3-arm e2e validating op_id ≥ 2). Mechanical state clean.

(Triage note: my initial verification on the prior pass missed the fixes because my local checkout was 4 commits behind origin — git pull --ff-only resolved. Worth flagging that the agent should keep working locally only if confident the local state matches the PR; the squash-merged result is what reviewers see, but pre-merge mid-flight reviews need the actual pulled branch.)

Per-item verification (all post-pull)

Item	Verification
MF3 — color stub cross-reference	✅ `color.rs:399` now contains explicit "See `PLAN_B_DEVIATIONS.md` entries:" block referencing the PB4 deviation entry by title and naming Phase 4d as the closure point
MF4 — codegen.rs:2207 stale offset-24	✅ `rg 'offset 24' compiler/src/codegen.rs` returns nothing. Stale comment removed
MF5 — duplicate Phase 3a in PROGRESS.md	✅ One Phase 3a entry remains (the original `(\ef4be8d``)`-tagged one); duplicate "(this commit)" entry deleted
MF6 — typecheck.rs:2955 TODO	✅ Rewritten as a "Closure point (Plan B Task 55, Phase 4c)" comment explaining the case is closed today by codegen's IntLit-only restriction (every IntLit arm pins handler_overall to Int), with Phase 4c lifting the restriction as the named closure point
MF6 — stale "E0133/E0134 still fires" comments	✅ My grep for `still fires\|still gated` in typecheck.rs returns only E0066 / E0112 references (unrelated diagnostics that genuinely still fire). All E0133/E0134 stale comments updated
PB2 — PHASE-4-RESTRICTION markers	✅ Three markers added: `codegen.rs:2054` (Phase 4b — args packing), `:2065` (Phase 4d — continuation reification), `:2266` (Phase 4b/4c — closure capture). Grep-findable for future phase work

Bonus additions (per `ca32659` commit message)

indices.insert overwrite now asserted via debug_assert!(prev.is_none(), ...) at codegen.rs:967-985 — defensive guard against silent FuncRef loss if two Expr::Handle nodes ever share a span
Paired MF2 generator cps_dispatch_returns_arm_value_across_op_id_shape_space exercises the full sigil_perform → sigil_run_loop → arm fn → next_step_done dispatch path (12 trials, arm values across [-99, 99]) — closes the gap where the original MF2 only exercised the no-perform half of identity
New e2e handle_with_three_arms_dispatches_op_id_two validates op_id assignment (alphabetical: a=0, b=1, c=2), arm-slot bitmap indexing at i=2, and sigil_perform's linear walk for op_id ≥ 2 — coverage gap not closed by 1-arm and 2-arm tests
Xorshift64::bool → Xorshift64::next_bool (consistency with next and range)

Mechanical

cargo test -p sigil-compiler --lib (post-pull): 419 passed, 0 failed
CI status at time of review: macos-14 lanes SUCCESS; ubuntu-24.04 lanes IN_PROGRESS. Expect green based on local verification + macos parity
All prior must-fix items closed; no new findings

Once CI confirms ubuntu-24.04 lanes pass, ready to merge.

Addresses PR #23 review feedback at `dea0106`. MF1 (args-content verification gap): the three Phase 4b e2e tests verify FFI plumbing compiles + runs but cannot observe arg values arriving correctly because arm bodies are still IntLit-only. Per reviewer's option (3) — cheapest of the three offered — defer arg-content verification to Phase 4c with an explicit pre-registered acceptance criterion in the Phase 4b deviation entry. Phase 4c's PR is now required to ship 4 named arg-content readback tests (Int / Bool+Char / String / multi-arg-in-declared-order) before merging. A bisecting agent investigating wrong-args-value bugs after Phase 4c lands gets a pointer back to Phase 4b as the suspect. MF2 (stack-slot lifetime tied to synchronous call pattern): add a `// PHASE-4-RESTRICTION (Plan B Task 55, Phase 4d):` marker on the `create_sized_stack_slot` line. Phase 4d converts perform sites to return `NextStep::Call` to the caller's trampoline rather than synchronously calling `sigil_run_loop` from native code; at that point the stack slot dies before the trampoline reads it on the next dispatch and the args buffer must migrate to arena allocation. Marker is grep-findable; cross-references the existing `Native callers drive sigil_run_loop synchronously` deviation entry. Closure-point note also added to the new Phase 4b deviation entry. MF3 (PR #22 single-branch deviation entry is now stale): PR #22's foundation-phase entry claimed "only one PR opens for Task 55." That turned out wrong in practice — Phase 4b landed as a separate PR (#23) on its own branch (`plan-b-task-55-phase-4b`). Add a "Cadence pivot" addendum to that entry recording the per-phase-PR cadence with reasoning (reviewability, independent CI cycles, rollback granularity, review checkpoint cadence). Original "single-PR convention" claim preserved unchanged for historical accuracy. Future Task 55 phase work expected to follow the per-phase-PR cadence. Review #2 (doc/code drift on assert vs debug_assert): the new Phase 4b deviation entry said "defensive `debug_assert`" but the actual code was `assert!`. Per reviewer's recommendation, change the code to `debug_assert!` and let the runtime's `sigil_perform` overflow check (which has a named `effect_id` / `op_id` message) be the source of truth for users; the compiler-side `debug_assert!` exists to catch the bug pre-link in dev builds before the runtime guard fires. Deviation text updated to match. v1's effect arities (0–2 user args) sit so far below the 30-arg cap that this delta has zero observable surface today. Review #3 (widening fallthrough silently miscompiles future floats in release builds): change `debug_assert_eq!` to `assert!` so a future F32/F64 surface type or 32-bit-target port that smuggles a non-pointer-width value through the args-buffer path panics in *both* dev and release builds rather than silently storing the bit-pattern as if it were a pointer-sized value. Comment expanded to explain the trade-off. Float / 32-bit-target safety net documented in the Phase 4b deviation entry. Pod-verify clean.

boldfield added 6 commits April 26, 2026 04:30

boldfield commented Apr 26, 2026

View reviewed changes

boldfield added 3 commits April 26, 2026 16:06

boldfield changed the title ~~[Task 55 WIP] Foundation: lift E0133 + route Item::Effect through codegen~~ [Task 55] CPS handler codegen: foundation through Phase 4a (multi-arm dispatch) Apr 26, 2026

boldfield marked this pull request as ready for review April 26, 2026 16:49

boldfield commented Apr 26, 2026

View reviewed changes

boldfield merged commit 2f56e87 into main Apr 26, 2026
4 checks passed

boldfield mentioned this pull request Apr 26, 2026

[Task 55] Phase 4b: args-buffer packing on perform side #23

Merged

11 tasks

This was referenced Apr 26, 2026

[Task 55] Phase 4c: richer arm bodies via Lowerer + arm-side arg unpacking #24

Merged

[Task 55] Phase 4d MVP: closure-capture + tail-position k via identity continuation #25

Merged

[Task 55] Phase 4e: comprehensive — foundation commit #26

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Task 55] CPS handler codegen: foundation through Phase 4a (multi-arm dispatch)#22

[Task 55] CPS handler codegen: foundation through Phase 4a (multi-arm dispatch)#22
boldfield merged 10 commits into
mainfrom
plan-b-task-55

boldfield commented Apr 26, 2026 •

edited

Loading

Uh oh!

boldfield left a comment

Uh oh!

boldfield commented Apr 26, 2026

Uh oh!

boldfield left a comment

Uh oh!

boldfield commented Apr 26, 2026

Uh oh!

boldfield left a comment

Uh oh!

boldfield commented Apr 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

boldfield commented Apr 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Status: ready for review

What's in scope

Supported handler subset (Phase 3b/4a)

Tests

Review-comment items (closed)

Known carry-forwards (out of Task 55 scope; tracked elsewhere)

Reviewer-only run notes

Test plan

Uh oh!

boldfield left a comment

Choose a reason for hiding this comment

Review of PR #22 (Plan B Task 55 — handle codegen, Phase 2 → 3b → 4a)

Bugs

Quality issues

Minor

Uh oh!

boldfield commented Apr 26, 2026

WIP review — items to address before draft → ready

Per-phase status

Mechanical (verified locally)

Must-fix before draft → ready

Phase-blocking restrictions (must land in subsequent phases on this branch)

Deferred (genuine future-task / v2+ work only)

Spec adherence

Uh oh!

boldfield left a comment

Choose a reason for hiding this comment

Re-review of PR #22 (Plan B Task 55 — handle codegen, foundation through Phase 4a)

Substantive

Nits (no action needed unless trivial)

Uh oh!

boldfield commented Apr 26, 2026

Re-review at 6057373

Headline result

Per-item verification

Mechanical (verified locally)

Remaining must-fix list (post-prior-review)

Uh oh!

boldfield left a comment

Choose a reason for hiding this comment

Re-re-review of PR #22 — fixup commit ca326594

Uh oh!

boldfield commented Apr 26, 2026

Re-review at ca32659 — merge

Per-item verification (all post-pull)

Bonus additions (per ca32659 commit message)

Mechanical

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

boldfield commented Apr 26, 2026 •

edited

Loading

Re-review at `6057373`

Re-re-review of PR #22 — fixup commit `ca326594`

Re-review at `ca32659` — merge

Bonus additions (per `ca32659` commit message)