Skip to content

[Task 55] CPS handler codegen: foundation through Phase 4a (multi-arm dispatch)#22

Merged
boldfield merged 10 commits into
mainfrom
plan-b-task-55
Apr 26, 2026
Merged

[Task 55] CPS handler codegen: foundation through Phase 4a (multi-arm dispatch)#22
boldfield merged 10 commits into
mainfrom
plan-b-task-55

Conversation

@boldfield
Copy link
Copy Markdown
Owner

@boldfield boldfield commented Apr 26, 2026

Status: ready for review

Plan B Task 55 — CPS expansion for Expr::Handle and Expr::Perform. Lifts the E0133 + E0134 staged-feature gates and wires effect handlers through the runtime ABI from PR #21 (Task 56). Lands across nine commits on plan-b-task-55 per [DEVIATION Task 55 — Foundation phase ships separately…] in PLAN_B_DEVIATIONS.md; the deviation entry's Closure point is this PR's squash-merge.

Diff stats: 8 files, +2,214 / −266 (~1,950 LOC net). Compiler lib tests 419 → 421 (+2). Three new e2e tests + one differential property test (24 trials, ~50 compile-and-run cycles).

What's in scope

The PR ships the supported subset of handler shapes; restrictions outside the subset are rejected at codegen entry by unsupported_handle_construct with a Phase-4-pointing diagnostic. Each restriction has a [DEVIATION Task 55 …] entry naming its Closure point (the Phase-4 sub-task that lifts it).

Phase Commit What it ships
Foundation b3af204 Lift E0133; route Item::Effect through codegen entry walker.
2 2d69b52 Lift E0134; effect/op IDs assigned at typecheck; Expr::Handle body-pass-through (no-perform case); codegen-entry guard for shapes outside the supported subset.
3a ef4be8d Wire sigil_handler_frame_new / _push / _pop around every handle body (arm fn pointers still null).
3b d0aa4c4 Per-handler-arm synthetic CPS fn synthesis; sigil_handler_frame_set_arm + sigil_perform lowering. First real handler dispatch end-to-end.
3b fixup 2e7c0de Route sigil_perform's NextStep::Call through sigil_run_loop (the dispatcher).
4a adcb897 Multi-arm single-effect handlers.
Review fixups #1 54b4a60 Walker recursion bug; Stmt::Perform non-IO crash; cleanup (dead-code, op_names.dedup, span param, stale comments, Closure-point lines, etc.).
Review fixups #2 95472ec MF2 differential identity property test; deviation entries for PB1/PB3/PB4.
Elaborate fix 6057373 Pre-existing bug exposed by MF2: elaborate skipped recursion into Expr::Call / Expr::RecordLit / Expr::Handle children.

Supported handler subset (Phase 3b/4a)

  • 1+ op-arms, all targeting the same effect (multi-effect → Phase 4e).
  • Each op has zero user args (args-buffer packing → Phase 4b).
  • Each arm body is Expr::IntLit only (richer arm bodies → Phase 4c).
  • Arms cannot reference k (continuation reification → Phase 4d).
  • No return arm (synthetic return-fn registration → Phase 4f).
  • Handle body's non-IO performs must be zero-arg (lifts in Phase 4b alongside the op-arg restriction).

Programs outside this subset surface a clear in-progress diagnostic at codegen entry instead of compiling and crashing.

Tests

Lib tests (compiler crate): 419 → 421 (+2 — the elaborate recursion regressions). Existing typecheck handle tests (9) all pass under the new gate state.

e2e tests added on this PR:

  • effect_decl_with_no_handler_use_compiles_and_runs — foundation phase, effect-only program compiles end-to-end.
  • handle_with_no_perform_in_body_compiles_and_runs — Phase 2 body-pass-through path.
  • handle_with_non_io_perform_runs_arm_and_returns_value — Phase 3b first real handler dispatch (prints 42).
  • handle_with_two_arms_dispatches_correct_arm_by_op_id — Phase 4a multi-arm dispatch.
  • handle_with_mixed_effect_arms_is_rejected_at_codegen — Phase 4e blocker pinned.
  • handle_with_non_intlit_arm_body_is_rejected_at_codegen — IntLit-only restriction (Call shape + Binary shape).
  • nested_handle_in_outer_body_propagates_inner_unsupported_diagnostic — walker-recursion regression.
  • statement_form_non_io_perform_inside_handle_compiles_and_runsStmt::Perform non-IO crash regression.
  • cps_wrapped_identity_matches_native_on_native_eligible_programsMF2 differential property test, 24 deterministic trials comparing native vs handle EXPR with { E_eff.op(k) => 999 } wrapped output.

Review-comment items (closed)

Item Closed by
#1 Walker doesn't recurse into handle body 54b4a60 (recursion + regression test).
#2 Dead body_contains_non_io_perform helpers 54b4a60 (deleted ~115 LOC).
#3 op_names.dedup() masks invariant 54b4a60 (replaced with debug_assert!).
#4 Duplicate Phase 3a paragraph in PROGRESS 54b4a60.
#5 Misleading test name …_uses_k_… 54b4a60 (renamed + added 1+1 variant).
#6 No stackmap entry on synthetic arm fns 54b4a60 (TODO with Phase 4c closure).
#7 Unused span param in check_handle 54b4a60 (dropped from signature).
MF1 Stmt::Perform non-IO crash 54b4a60 (dispatch + e2e test).
MF2 Differential identity property test 95472ec + 6057373.
MF3 Color stub comment 54b4a60 (named Phase 4d as closer).
MF4 Stale offset-24 comments 54b4a60.
MF5 Duplicate Phase 3a paragraph 54b4a60.
MF6 Stale E0133/E0134 comments 54b4a60 (5 sites + TODO@2955).
PB1 Walker doesn't follow call edges 95472ec (deviation entry; closure: Phase 4b).
PB2 Closure-ptr null marker 54b4a60 (PHASE-4-RESTRICTION comment).
PB3 Frame leak on body unwind 95472ec (deviation entry; closure: Task 57).
PB4 Synchronous sigil_run_loop from native callers 95472ec (deviation entry; closure: Phase 4d).
Closure-point lines on Task 55 deviation entries 54b4a60.

Known carry-forwards (out of Task 55 scope; tracked elsewhere)

Reviewer-only run notes

If the e2e tests fail with link errors against _sigil_handle_push / _sigil_handler_frame_new etc., the local target/debug/libsigil_runtime.a is stale. Rebuild via cargo build -p sigil-runtime and re-run.

Test plan

  • CI green on ubuntu-24.04 + macos-14 for build + test and cold-checkout at 6057373.
  • Pod-verify clean (cargo check workspace, fmt, clippy on both crates one-at-a-time, runtime lib tests, discipline greps).
  • Reproducibility script: 4/4 example binaries byte-identical across builds — synthetic sigil_handler_arm_<global_index> symbol naming verified deterministic.
  • All Task 54 typecheck tests still pass (handler-typing, E0220 linearity, E0136-E0141, no spurious E0046 on arm-param/k/return-arm bindings).
  • All Task 55 codegen tests pass.
  • MF2 differential property test green (24 trials, native vs CPS-wrapped identity holds).

…gen entry walker

Foundation phase of Plan B Task 55. Asymmetric gate state: E0133 lifted,
E0134 still firing. This is the first commit on `plan-b-task-55`; the
CPS calling convention, body transform, per-arm closure synthesis, and
E0134 lift land in follow-up commits on this same branch (no PR until
the full path runs at least one real handler example end-to-end).

- typecheck: drop E0133 emission for `Item::Effect` (effect decls now
  flow through with only the registry-population walk + op-signature
  type checks). E0134 emission for `Expr::Handle` retained with an
  updated message pointing at the in-progress task.
- codegen: drop the `Item::Effect → return true` short-circuit in
  `contains_apply_or_generic_ref`. Effect declarations have no codegen
  output; downstream passes (monomorphize, color, closure_convert)
  already pass them through unchanged from Task 54.
- catalog: remove the E0133 entry; rewrite the E0134 entry's long-form
  text to describe the in-progress codegen status.
- typecheck tests: 21 co-firing `assert!(has_code(&errs, "E013[34]"))`
  lines stripped via perl one-liner; 4 pure-emission tests rewritten
  for the asymmetric state — `well_formed_effect_decl_typechecks_cleanly`
  and `effect_decl_with_invalid_op_type_emits_e0112` for the lifted
  half, `handle_expr_still_emits_e0134_until_codegen_lands` and the
  two co-firing variants for the still-gated half.
- codegen tests: new positive walker test
  `walker_accepts_program_with_effect_decl` pins the new behavior.
- e2e: new `effect_decl_with_no_handler_use_compiles_and_runs` test
  proves an effect-only program compiles end-to-end through codegen
  for the first time (CI-only; pod can't run sigil binaries).
- discipline: `no_user_facing_error_uses_e0001` sweep programs
  unchanged; comment updated to describe the asymmetric state.
- PLAN_B_PROGRESS / PLAN_B_DEVIATIONS updated. New deviation entry
  documents the foundation/CPS commit split on a single branch.

Pod-verify clean. Compiler lib-test count delta: net +1 (4 pure-emission
tests rewritten in place + 1 new walker test). Catalog gains 0 entries
(E0133 removed, E0134 kept).
…-perform case)

Phase 2 minimum of Plan B Task 55. Both staged-feature gates are now
lifted (E0133 in `b3af204`, E0134 here). Well-formed `handle`
expressions reach codegen for the first time. The full handler-frame
setup + per-arm CPS fn synthesis + `sigil_perform` wiring lands in
Phase 3+ on this same branch.

- typecheck: drop the E0134 emission from `check_handle`. All handler-
  typing diagnostics emitted above the gate (E0046, E0220, E0044,
  E0136-E0141) continue to surface as before. Catalog entry for E0134
  removed; the code number stays retired per the
  `seed_entries_are_unique_and_non_empty` discipline.
- typecheck: add `effect_ids: BTreeMap<String, u32>` and
  `op_ids: BTreeMap<(String, String), u32>` to `CheckedProgram`,
  populated at the end of `typecheck()` with deterministic
  alphabetical assignment over `tc.effects`. These are the IDs the
  runtime ABI uses (`sigil_handler_frame_new(effect_id, ...)`,
  `sigil_perform(effect_id, op_id, ...)`); same source program → same
  IDs across builds. 3 new tests pin the assignment scheme.
- codegen: replace the `unreachable!` in `lower_expr` for `Expr::Handle`
  with a body-pass-through (`self.lower_expr(body)`). `type_of_expr`
  for `Expr::Handle` forwards to body's type. The pass-through is
  safe because the codegen-entry guard `unsupported_handle_construct`
  rejects programs whose body would actually perform a non-IO effect
  with a clear in-progress message. IO `perform` in handle bodies is
  fine — `IO` is hard-wired in `lower_perform`.
- codegen: new `unsupported_handle_construct(program) -> Option<String>`
  walker traverses every fn body, every block, every expression,
  flagging handle expressions whose body contains a non-IO
  `Expr::Perform` site or a `return` arm. Conservative scope: only
  direct `Expr::Perform` in the body is inspected (transitive performs
  through called fns aren't followed); acceptable for the Phase 2
  e2e test program (body is a literal) but a documented limitation
  until Phase 3 ships the real handler-frame setup.
- typecheck tests: the four asymmetric-state tests added in `b3af204`
  rewritten as positive `well_formed_handle_expr_typechecks_cleanly`
  + co-firing-diagnostic tests (`handle_expr_with_nested_body_type_error_surfaces`,
  `handle_arm_body_type_mismatch_surfaces_e0044_or_e0065`).
- e2e: new `handle_with_no_perform_in_body_compiles_and_runs` —
  first time a handle expression compiles + runs end-to-end (prints
  `42`, exit 0). New `handle_with_non_io_perform_in_body_is_rejected_at_codegen`
  pins the codegen-time guard fires cleanly with a Task-55-referencing
  error message.
- discipline sweep `no_user_facing_error_uses_e0001` programs
  unchanged; comment updated.
- color.rs synthetic test fixture extended with the two new
  `CheckedProgram` fields (empty BTreeMaps).
- PLAN_B_PROGRESS / PLAN_B_DEVIATIONS updated. New deviation entry
  documents the Phase 2 → Phase 3+ split.

Pod-verify clean. 41 related compiler unit tests pass locally.
… handle bodies

Phase 3a of Plan B Task 55. Wires the runtime handler-frame ABI from
Task 56 around every `handle` body: each `handle BODY with { arms }`
now allocates a frame, pushes it on the runtime handler stack,
evaluates the body inline, then pops the frame. Arm fn pointers are
not yet set (left null by `sigil_handler_frame_new`'s zero-init);
Phase 3b adds per-arm CPS fn synthesis + `sigil_handler_frame_set_arm`
+ `sigil_perform` lowering. The frame is functionally a no-op in
Phase 3a — its arm slots are never read because the codegen-entry
guard `unsupported_handle_construct` rejects programs whose body
would actually `perform` the handled effect — but the FFI plumbing
is real, observable in compiled output, and the path Phase 3b builds
on.

- codegen: declare 3 new runtime imports —
  `sigil_handler_frame_new(effect_id: u32, arm_count: u32) -> *mut HandlerFrame`,
  `sigil_handle_push(frame)`, and `sigil_handle_pop() -> *mut HandlerFrame`.
- codegen: `Lowerer` gains `handler_frame_new_ref` / `handle_push_ref`
  / `handle_pop_ref` FuncRef fields, plus an `effect_ids: &BTreeMap<String, u32>`
  borrow from the `CheckedProgram` for effect-id lookup at handle sites.
- codegen: replace the Phase 2 body-pass-through in `Expr::Handle`
  with the frame_new + push + body-eval + pop sequence. The handle's
  Cranelift value is still the body's value (the frame allocation
  has no effect on the body's evaluation).
- codegen: `unsupported_handle_construct` tightened — also rejects
  multi-arm handles, return arms, and (defensively) zero-arm handles.
  Phase 3a's codegen path indexes `op_arms[0]` for the unique
  effect-name lookup, so any handle that escapes these checks would
  trip out-of-bounds; the guard front-loads a clear in-progress
  diagnostic instead.
- The Phase 2 e2e test `handle_with_no_perform_in_body_compiles_and_runs`
  continues to pass — it now exercises the frame ABI for the first
  time end-to-end (runtime allocates + pushes + pops a frame for the
  `Raise` handler around the literal `42`). The
  `handle_with_non_io_perform_in_body_is_rejected_at_codegen` test
  continues to pin the codegen-time rejection.

Tiny runtime regression: the no-perform handle now allocates +
pushes + pops a frame on every invocation (previously a no-op
pass-through). Acceptable Phase 3a cost; Phase 3b makes the frame
functional rather than just present.

PLAN_B_PROGRESS / PLAN_B_DEVIATIONS updated. New
`[DEVIATION Task 55 — 2026-04-26]` entry on the Phase 3a / Phase 3b
split. Pod-verify clean.
…; first real handler dispatch

Phase 3b of Plan B Task 55. Handlers now dispatch arms at runtime —
the first time a `perform` actually invokes a user-written handler
arm end-to-end. Combined with Phase 3a's frame ABI (`ef4be8d`), this
completes the simplest meaningful subset of the handler runtime:
single-arm, single-effect, zero-arg op, k-discarding arm, literal
arm body. Phase 4+ lifts each restriction.

- codegen: 3 new FFI declarations — `sigil_handler_frame_set_arm`,
  `sigil_perform`, `sigil_next_step_done`.
- codegen: pre-pass `collect_handle_arms_in_block` / `_in_expr`
  walks every user fn body, finds every `Expr::Handle`, allocates
  one synthetic CPS-fn FuncId per arm with the uniform calling
  convention `extern "C" fn(closure_ptr, args_ptr, args_len) ->
  *mut NextStep`. Symbol naming `sigil_handler_arm_<global_index>`.
  Side-table `handler_arm_indices: BTreeMap<Span, Vec<usize>>`
  threads per-handle FuncIds through to the calling fn's lowering.
- codegen: synthetic-fn definition pass at the bottom of
  `emit_object` lowers each arm body via a small hand-rolled
  Cranelift sequence: `iconst(value)` →
  `call sigil_next_step_done(value)` → `return result`. Phase 3b
  restricts arm bodies to `Expr::IntLit` so this hand-roll suffices.
- codegen: `Expr::Handle` lowering in `Lowerer::lower_expr` extended
  to populate arm slots between `frame_new` and `push`. For each
  arm: `func_addr` to get the synthetic fn's pointer, then
  `sigil_handler_frame_set_arm(frame, op_id, fn_ptr, null_closure)`.
- codegen: new `lower_perform_non_io_to_value` method emits
  `sigil_perform(effect_id, op_id, args_ptr=null, args_len=0,
  k_closure=null, k_fn=null)` for non-IO performs, then reads
  `(*next_step).value` at offset 24 (per the `#[repr(C)]` NextStep
  layout) as i64. `Expr::Perform` arm in `lower_expr` dispatches:
  IO → existing path returning Unit (i8 0); non-IO → CPS path
  returning the value.
- codegen: `type_of_expr` for `Expr::Perform` looks up the op's
  declared return type from the `effects` registry for non-IO;
  defaults to I8 for IO. New `effects` and `op_ids` borrows on
  `Lowerer`.
- codegen: `unsupported_handle_construct` walker updated to enforce
  Phase 3b restrictions: arm body is `IntLit` only, arm has no
  user op-args (only `k`), body's non-IO perform must be zero-arg.
  The body-perform check from Phase 2/3a is gone (now supported);
  multi-arm/return-arm checks remain (Phase 4+).
- e2e: replaced `handle_with_non_io_perform_in_body_is_rejected_at_codegen`
  (Phase 2 rejection test, no longer rejected) with two new tests:
  `handle_with_non_io_perform_runs_arm_and_returns_value` (the
  Phase 3b milestone — `handle (perform Raise.fail()) with {
  Raise.fail(k) => 42 }` compiles, runs, prints `42`),
  `handle_with_arm_that_uses_k_is_rejected_at_codegen` (pins the
  Phase 3b restriction; arm body referencing `k` rejected with a
  Phase-4-pointing message).
- The Phase 2/3a `handle_with_no_perform_in_body_compiles_and_runs`
  test continues to pass, now exercising the full `frame_new` +
  `set_arm` + `push` + `pop` sequence (set_arm is new — the arm fn
  pointer is registered even though the handler is never invoked).

PLAN_B_PROGRESS / PLAN_B_DEVIATIONS updated. New
`[DEVIATION Task 55 — 2026-04-26]` entry on Phase 3b restrictions.

Pod-verify clean. Local e2e tests can't run (Cranelift OOM on pod);
CI is the source of truth for the new e2e tests.
…gh sigil_run_loop

CI failure on the new e2e test (`handle_with_non_io_perform_runs_arm_and_returns_value`)
revealed an incorrect assumption: `sigil_perform` does NOT invoke
the arm directly — it builds a `NextStep::Call` to the arm + packed
`(args, k_closure, k_fn)` and returns that Call NextStep. The
caller is responsible for dispatching it (i.e. running the
trampoline) to reach the terminal `Done(value)`.

The previous Phase 3b lowering read `(*ns).value` at offset 24 from
the returned NextStep, which for a Call NextStep is unwritten — so
on Linux + macOS the value came back as 0 (arena zero-init), making
the e2e test print "0\n" instead of "42\n".

Fix: declare `sigil_run_loop(initial: *mut NextStep) -> u64` as a
new FFI import and pass `sigil_perform`'s Call NextStep to it
inside `lower_perform_non_io_to_value`. `sigil_run_loop` invokes
the arm fn, then dispatches any further Calls the arm returns,
until a terminal Done — returning the Done's value as u64. Native
code uses the u64 directly.

This is also the more general path: even when arm bodies grow more
complex (Phase 4+ — k usage, op-args, multi-step computations),
`sigil_run_loop` continues to be the right dispatcher; only the
codegen for arm bodies and perform-side args packing changes.

Pod-verify clean. Single-fixup commit.
Phase 4a of Plan B Task 55. Handlers can now have multiple op-arms
when all arms target the same effect. The runtime `HandlerFrame`
already supports up to MAX_HANDLER_ARMS=14 arm slots (Task 56);
codegen iterates `op_arms` to emit one
`sigil_handler_frame_set_arm(frame, op_id, fn_ptr, null_closure)`
call per arm. The `unsupported_handle_construct` codegen-entry
guard is updated to:

- Allow `op_arms.len() > 1` (was rejected in Phase 3a/3b).
- Reject mixed-effect handles with a clear "Phase 4e" diagnostic
  (the runtime frame's `effect_id` is single-valued; multi-effect
  needs frame-per-effect).
- Defensively reject duplicate `(effect, op)` arm pairs (typecheck
  E0140 catches this earlier; double-check for safety).
- Per-arm checks (IntLit body, no op-args) now run for each arm
  (was checked only for `op_arms[0]` in Phase 3b).

The synthetic-fn pre-pass and definition pass already handle
multiple arms per handle (one synthetic CPS fn per arm); no
changes needed there.

- e2e: new `handle_with_two_arms_dispatches_correct_arm_by_op_id`
  test runs `handle (perform Choose.right()) with { Choose.left(k)
  => 10, Choose.right(k) => 20 }` → prints `20` (op_id=1 dispatch
  through the runtime arm-slot table; alphabetical ordering puts
  `left` at op_id=0, `right` at op_id=1).
- e2e: new `handle_with_mixed_effect_arms_is_rejected_at_codegen`
  pins the Phase 4a restriction (mixed-effect arms still
  rejected).

Pod-verify clean. Builds on Phase 3b (`2e7c0de`).
Copy link
Copy Markdown
Owner Author

@boldfield boldfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of PR #22 (Plan B Task 55 — handle codegen, Phase 2 → 3b → 4a)

Overall the change is well-scoped and the asymmetric-gate progression is sensible. Tests pin both the lifted gate and the in-progress restrictions. Several issues to address before this leaves draft.

Bugs

1. unsupported_handle_construct doesn't recurse into the handle's body.

In compiler/src/codegen.rs::expr_unsupported_handle, the Expr::Handle arm checks the outer handle's restrictions, walks body_contains_non_io_perform_with_args(body) (which short-circuits on Expr::Handle { .. } => None by design), and then recurses only into arm.body — never into body itself.

A nested handle inside the body bypasses every Phase 3b restriction. Example:

handle (handle 0 with { A.op1(k) => 1, B.op2(k) => 2 })  // multi-effect — never checked
with { Outer.op(k) => 0 }

Because the inner handle is only consulted via body_contains_non_io_perform_filtered (which returns None for Expr::Handle { .. }), its multi-effect / non-IntLit / return-arm / multi-arg restrictions are never enforced. At runtime this can register arms under the wrong effect_id and crash inside sigil_perform's handler-stack walk.

Fix: add if let Some(msg) = expr_unsupported_handle(body) { return Some(msg); } to the Expr::Handle arm of expr_unsupported_handle. Add a regression test exercising a nested multi-effect handle inside another handle's body.

Quality issues

2. Dead code: body_contains_non_io_perform and block_contains_non_io_perform are unused.

Both are marked #[allow(dead_code)] and are entirely subsumed by body_contains_non_io_perform_filtered / block_contains_non_io_perform_filtered. The #[allow(dead_code)] is a smell — just delete them. ~115 lines of cruft.

3. op_names.dedup() silently masks a typecheck invariant.

In compiler/src/typecheck.rs (op_ids assignment loop):

let mut op_names: Vec<&str> = eff_decl.ops.iter().map(|o| o.name.as_str()).collect();
op_names.sort();
op_names.dedup();

Duplicate op names within an effect already fire E0137 upstream. The dedup() here is either dead defensive code, OR (worse) silently corrupts ID assignment if duplicates somehow leak through with errors collected but not aborted. Replace with debug_assert!(op_names.windows(2).all(|w| w[0] != w[1]), "E0137 should have caught this"), or just drop the dedup() entirely.

4. PLAN_B_PROGRESS.md has a duplicated Phase 3a paragraph.

Diff lines 65 and 66 add two near-identical paragraphs:

  • Phase 3a (ef4be8d): wires the runtime handler-frame ABI ...
  • Phase 3a (this commit): wires the runtime handler-frame ABI ...

Same content, just different prefix. Looks like a copy-paste leftover from the Phase 3a → Phase 3b transition where the old "(this commit)" version wasn't cleaned up. Drop the second one.

5. Test name handle_with_arm_that_uses_k_is_rejected_at_codegen misleads about what's being tested.

The arm body is k(0) — an Expr::Call, not an Expr::IntLit. The unsupported_handle_construct guard rejects it at the IntLit check, never reaching any "uses k" logic (because there is none — k usage is implicitly forbidden by IntLit-only). The test would pass equally for arm body 1 + 1 or if true then 1 else 2. Either:

  • Rename to handle_with_non_intlit_arm_body_is_rejected_at_codegen and add a 1 + 1 variant, OR
  • Add a real "k is referenced inside an IntLit context" test once Phase 4c lands richer arm bodies and a true k-usage check is needed.

Minor

6. Synthetic arm fns at the bottom of emit_object have no stackmap entries.

The hand-rolled iconst → call sigil_next_step_done → return sequence calls into the runtime, which can allocate (NextStep arena bump). There's no stackmap.push_placeholder for that call. Safe in Phase 3b only because (a) closure_ptr arg is null and (b) IntLit body produces no GC roots. But the closure_ptr param IS in the CPS calling convention — once Phase 4c adds richer arm bodies with captures, closure_ptr becomes a live GC root across the sigil_next_step_done call and you'll silently miss it on the stackmap. Worth a TODO comment at minimum, or wire it through now.

7. let _ = span; in check_handle after E0134 lift.

The span arg was used for E0134 emission; with the gate lifted it's only kept warm via let _ = span;. Either drop the param from check_handle's signature or comment why span is retained for Phase 4+.


Recommended order: fix #1 (correctness) and #4 (doc cleanup) before any further commits land on this branch; #2/#3/#5 anytime; #6/#7 can ride along with later phase work.

@boldfield
Copy link
Copy Markdown
Owner Author

WIP review — items to address before draft → ready

Reviewing at adcb897. Mechanical state is clean (clippy/fmt/runtime tests/lib tests/pod-verify/reproducibility all pass; e2e 28/30 with 2 environmental perf failures unrelated to this PR). Verdict shape is "items by phase," not "merge / request changes," since this is an explicit WIP draft. Per your standing directive on this branch, all real issues below are must-fix-before-ready.

Per-phase status

Phase Status
Foundation (E0133 lift) landed, clean
Phase 2 (E0134 lift, body-pass-through, effect/op IDs) landed, clean
Phase 3a (frame ABI) landed, clean
Phase 3b (sigil_perform + arm dispatch) landed; one real crash bug (see MF1), several stale comments
Phase 3b fixup (Call NextStep through run_loop) landed; correct shape
Phase 4a (multi-arm single-effect) landed, clean

Mechanical (verified locally)

  • cargo fmt, clippy -D warnings on all four workspace crates: clean
  • cargo test -p sigil-runtime: 64 passed (incl. 3 GC-stress via subprocess)
  • cargo test -p sigil-compiler --lib: 419 passed
  • cargo test -p sigil-compiler --test e2e: 28/30 (2 fails are fib_perf / tree perf-floor, environmental, pre-existing — see PLAN_B_DEVIATIONS.md verification-debt entry)
  • bash scripts/pod-verify.sh: clean
  • bash scripts/reproducibility.sh: 4/4 example binaries byte-identical across builds — synthetic sigil_handler_arm_<global_index> symbol naming verified deterministic
  • Note for reviewers running locally: initial e2e run failed all 3 handle tests with link errors against _sigil_handle_push etc. Caused by a stale target/debug/libsigil_runtime.a. Rebuilding the runtime debug staticlib (cargo build -p sigil-runtime) resolved. Worth flagging in the PR description so future reviewers don't chase a phantom link bug.

Must-fix before draft → ready

[MF1] Statement-form non-IO perform crashes the compiler. Lowerer::lower_stmt for Stmt::Perform(p) unconditionally calls self.lower_perform(p) at compiler/src/codegen.rs:2053-2055, which asserts p.effect == "IO" at :2061. Verified by inspection. Repro:

effect Raise { fail: () -> Int }
fn helper() -> Int ![Raise] {
  perform Raise.fail();
  0
}
fn main() -> Int ![IO] {
  let n: Int = handle helper() with { Raise.fail(k) => 42 };
  perform IO.println(int_to_string(n));
  0
}

assertion left == right failed: non-IO effect reached lower_perform; left: "Raise" right: "IO".

The unsupported_handle_construct walker doesn't catch this — helper() is a Call expression in the handle body and the walker only inspects direct Expr::Perform sites. Two fixes needed:

  1. lower_stmt must dispatch IO vs non-IO the same way Expr::Perform does (call lower_perform_non_io_to_value for non-IO and discard the value, or refactor the dispatch into a shared helper).
  2. Add an e2e test pinning Stmt::Perform non-IO. The existing handle e2e tests all use expression-form perform in tail position; statement-form is uncovered.

[MF2] Differential identity property test (standing precondition [P1]) is not present. Searched for differential|prop.*identity|Native.*equiv|cps_eq_native|proptest across both crates — zero matches. This was flagged as a Task 55 must-fix precondition in the PR #21 review and on the pre-registered Stage 6 worry list. It's the only evidence you can trust without auditing the CPS transform's correctness reasoning by hand.

Phase 3b lowers the first real CPS-flavoured path (synthetic arm fns + perform routing through run_loop); Phases 4b/4c will lift restrictions and the property test becomes harder to add later, not easier. Land before draft → ready, ideally as a small AST fuzzer plus a runner that compares direct-style native execution to the CPS lowering on Native-eligible programs (no perform).

[MF3] Color analysis still on Task 53 stub. compiler/src/color.rs:380-411 find_non_io_perform_in_expr's Expr::Handle arm carries a "Task 55 will replace this stub with the proper handler-context color rule the PR #18 reviewer flagged for Stage 6" comment. Task 55 hasn't replaced it. With Phase 3b shipping real handler dispatch, the color analyser's classification of handle-containing fns now drives codegen behavior. Today it's still acting as if the handle discharges all body effects from the parent fn's color — correct for current Phase 3b/4a restrictions (synchronous, single-shot, IntLit arm, no k), but Phase 4b+ (k usage, multi-step bodies) requires the rule to be properly stated. Either replace the stub now per PR #18's reviewer ask, or upgrade the comment to spell out exactly which Phase-4 sub-task closes the loop.

[MF4] Stale comments documenting offset-24 reads that no longer happen.

  • compiler/src/codegen.rs:2207 in Expr::Perform's match arm: "we extract the arm's value from (*ns).value at offset 24." Verified the actual code at :2147 just returns sigil_run_loop's u64 result. The Phase 3b fixup commit (2e7c0de) routed through run_loop but didn't update this comment.
  • PLAN_B_PROGRESS.md Phase 3b entry: same staleness.

Either delete the offset reference or rewrite to accurately describe the run_loop path. Layout-coupling concerns from the prior PR's worry list are now moot for codegen (no offset-24 read), but document explicitly so future reviewers don't chase a phantom.

[MF5] Duplicate Phase 3a entry in PLAN_B_PROGRESS.md. Two consecutive bullet points say "Phase 3a (ef4be8d)" and "Phase 3a (this commit)" with identical content. One was meant to describe a prior commit, the other a future state. Pick one.

[MF6] Stale E0133 / E0134 comments in compiler/src/typecheck.rs describing those gates as "still firing" when they've been lifted: lines 345-349, 473-476, 2380-2389, 2770-2774, 4218-4222.

The TODO at compiler/src/typecheck.rs:2955-2964 is the worrying one — explicitly says "Today the gate fires first so this case is unreachable in user-visible output." Verified: gate is gone (lifted in Phase 2 via 2d69b52). The case is now reachable. Either prove it can't happen with current Phase 4a restrictions (and update the comment to reflect that), or write a test program demonstrating it doesn't break, or address the underlying inference question (pin via body's perform-call return-type chain, or surface E0132-style ambiguous-polymorphism diagnostic).

Phase-blocking restrictions (must land in subsequent phases on this branch)

[PB1] unsupported_handle_construct doesn't follow call edges (Phase 4 blocker). Already noted in MF1 — walker rejects only direct Expr::Perform in handle bodies. Once MF1 fixes the compiler crash, the walker isn't load-bearing for soundness, but Phase 4b's lifting of the IntLit-only-arm-body restriction will need a real story for transitive perform analysis. Document the closure point in the deviation entry.

[PB2] Closure capture across CPS boundary not yet addressed. Expr::Handle lowering passes closure_ptr: null to sigil_handler_frame_set_arm (codegen.rs:2308, 2322). Correct today (arm bodies are IntLit). Phase 4b/4c (k usage + multi-step bodies + outer-scope captures) will require a real closure record threaded into each arm slot. Add a // PHASE-4-RESTRICTION marker on the null_ptr line with the specific phase that lifts it, so the dependency is grep-findable.

[PB3] Frame leak on body panic. Expr::Handle lowering does frame_new → set_arm* → push → body → pop (codegen.rs:2285-2334). If the body aborts (today: only sigil_panic_arith_error, which kills the process — fine), the frame leaks on the runtime handler stack. When Task 57 replaces arith panic with Raise[ArithError], this leak becomes a real bug — the recovery handler will see a stale frame from a different scope. Phase 4 needs to either (a) introduce a scope guard that pops on early exit, or (b) make pop idempotent and explicitly tear down on the unwind path. Document the dependency on Task 57.

[PB4] Tail-call preservation worry. Phase 3b's lower_perform_non_io_to_value synchronously calls sigil_perform then sigil_run_loop from native code. Works for IntLit-only single-step arms (run_loop completes in 1-2 dispatches and returns). When Phase 4c lifts to multi-step arm bodies and Phase 4d wires k invocation, native callers driving run_loop becomes a stack-blocking pattern that defeats the trampoline's purpose — the fn calling the synchronous-blocking shape needs to be Cps-color so the perform site emits a real NextStep::Call to its caller's trampoline. This is the [MF3] color stub bite. Document explicitly in a deviation entry.

Deferred (genuine future-task / v2+ work only)

  • Layout-coupling against NextStep shape is moot today (no offset-24 read in codegen); revisit at Task 58 (multi-shot may add fields) or v2.
  • match-arm CPS join-point unification (worry Plan A2 Task 29: parser extensions (lambda syntax) #5) and let-bindings under nested forms (worry Plan A2 Task 30: typechecker fn types + application unification + capture analysis #6) — Phase 4c/4d concerns; today's restrictions make them unreachable.
  • Walker consolidation: unsupported_handle_construct is 240+ lines split across expr_unsupported_handle, block_unsupported_handle, body_contains_non_io_perform_with_args, etc. Worth consolidating in Phase 4 when it grows further; not urgent today.
  • body_contains_non_io_perform at compiler/src/codegen.rs:789 is #[allow(dead_code)] — either delete or wire the live caller.

Spec adherence

The PR ships toward what Task 55 specifies (CPS expansion for CPS-color monomorphs, perform → sigil_perform, handler frame push/pop), with phase restrictions appropriate for an in-progress branch. The 4 deviation entries (foundation, Phase 2, Phase 3a, Phase 3b) follow the file's stated 4-section format but lack an explicit "Closure point" field that prior entries use (Task 4.5.5, Task 47, Task 48, etc.). Recommend adding a Closure point line to each Task-55 deviation naming the specific Phase 4 sub-task (or the PR-22 merge) that lifts each restriction, so reviewers know where each loop closes.

effect_ids / op_ids deterministic-alphabetical assignment: verified clean. tc.effects is BTreeMap (sorted iteration); op-name list explicitly sorted before enumeration. Reproducibility script confirms 4/4 example binaries byte-identical across builds.


Once MF1–MF6 land, the foundation + Phase 4a state is solid enough to base Phase 4b on. PB1–PB4 must close before draft → ready (specifically: PB3 hard-depends on Task 57, PB4 hard-depends on MF3).

Fixes correctness items + cleanup from the WIP review at adcb897:

Bugs:
- Walker recursion: expr_unsupported_handle's Expr::Handle arm now
  recurses into the outer handle's body, not just into arm bodies.
  Without this, a nested handle inside another handle's body bypassed
  every Phase 4 restriction (multi-effect, return-arm, etc.) and
  could register arms under the wrong effect_id at runtime.
  Regression test: nested_handle_in_outer_body_propagates_inner_unsupported_diagnostic.
- Stmt::Perform crash: lower_stmt now dispatches IO vs non-IO the
  same way Expr::Perform does. Previously, statement-form non-IO
  performs hit the IO-only assertion in lower_perform and crashed
  the compiler. Pinned by statement_form_non_io_perform_inside_handle_compiles_and_runs.

Cleanup:
- Delete unused body_contains_non_io_perform / _block walkers
  (~115 lines, fully subsumed by the _filtered variants).
- Replace op_names.dedup() with debug_assert (E0137 catches dups
  upstream; the dedup masked a typecheck invariant).
- Rename handle_with_arm_that_uses_k_is_rejected_at_codegen to
  handle_with_non_intlit_arm_body_is_rejected_at_codegen and add a
  '1+1' variant; the original test exercised the IntLit guard, not
  any 'k usage' check.
- Add stackmap TODO on synthetic arm-fn definition pass (Phase 4c
  closes once arm bodies grow GC roots).
- Drop unused span param from check_handle's signature (E0134 lift
  made it dead).
- Replace 'we extract from (*ns).value at offset 24' codegen comment
  with the actual run_loop dispatch description (Phase 3b fixup
  routed through run_loop but didn't update the comment).
- Remove duplicate Phase 3a paragraph from PLAN_B_PROGRESS.md;
  refresh status line + commits list to reflect Phase 3b/3b-fixup/4a.
- Refresh stale 'E0133/E0134 still firing' comments at five sites
  in typecheck.rs to point at the lift commits (b3af204 / 2d69b52)
  and the Phase-4 work that follows.
- Upgrade color.rs's Task 53 'stub' comment to name Phase 4d as the
  closing dependency (the handler-context color rule is load-bearing
  once arms reify k).
- Add PHASE-4-RESTRICTION marker on the null closure_ptr passed to
  sigil_handler_frame_set_arm so the Phase 4b/4c/4d dependency is
  grep-findable.
- Add Closure point lines to all four Task 55 deviation entries
  (per other entries' convention) naming the Phase that lifts each
  restriction.
Closes the remaining must-fix and phase-blocking items from the PR #22
WIP review.

MF2 — Differential identity property test.
- New e2e test cps_wrapped_identity_matches_native_on_native_eligible_programs
  generates 24 deterministic Native-eligible Int expressions (small
  ints + arithmetic + if-else), compiles each twice (raw vs wrapped
  in `handle EXPR with { E_eff.op(k) => 999 }`), and asserts identical
  stdout. Catches CPS-lowering bugs that hand-rolled tests would miss
  by sampling the expression-shape space rather than hitting one case.
  Embeds an xorshift64 PRNG inline rather than pulling in proptest;
  determinism + reproducibility take priority over fuzz coverage at
  this scale.

Deviation entries (closure points named):
- PB1: walker doesn't follow call edges. After MF1's dispatch fix,
  this is no longer a soundness concern — transitive performs lower
  correctly through sigil_perform/run_loop. The walker stays as a
  Phase-4 ergonomic gate; closure point is Phase 4b's args check.
- PB3: handler frame leak on body unwind. Today only sigil_panic_arith_error
  aborts mid-body (kills process, leak invisible). Closure point
  is Task 57 — when ArithError becomes recoverable, every Expr::Handle
  codegen site needs a scope guard or unwind-safe pop.
- PB4: native callers drive sigil_run_loop synchronously. Correct
  for Phase 3b/4a's restricted shape (1-2 dispatches per perform);
  the synchronous-blocking pattern only defeats the trampoline's
  stack discipline at Phase 4c+ depths. Closure point is Phase 4d
  (k-using arms via continuation reification), which closes
  alongside the colorer's handler-discharge refinement (PR #18
  reviewer's Stage-6 ask).
…rdLit/Handle children

The MF2 differential property test in the previous commit (95472ec)
caught a real preexisting bug: `elaborate` skipped recursion into
the children of `Expr::Call`, `Expr::RecordLit`, and `Expr::Handle`,
so an `Expr::If` nested inside a call argument / record-literal field
value / handle body was never desugared to `Expr::Match`. The
unelaborated `Expr::If` then tripped the
`unreachable!("Expr::If should have been desugared by elaborate")`
arm in `lower_expr` and crashed the compiler.

The bug was unreachable in earlier phases because:
- Call: existing tests bind compound values to `let` first; no
  test passed an if-else directly as a call argument.
- RecordLit: same pattern — fields were bound to lets first.
- Handle: E0134 staged-feature gate short-circuited the pipeline
  before elaborate; lifted in Task 55 Phase 2 (`2d69b52`).

Fix: each arm now recurses into its children with `need_trivial=false`
(Stage-3 ANF flattening at call sites is still deferred per the
original comment) and aggregates hoisted bindings appropriately:
- Call: hoisted bindings bubble up to the caller (a Call is not a
  scope boundary).
- RecordLit: same.
- Handle: body + return-arm body + each op-arm body are inner
  scopes; hoisted bindings are wrapped in `Expr::Block` so they
  cannot leak past the handle/arm boundary, mirroring the existing
  Match-arm and Lambda-body patterns.

Two new lib tests pin the recursion behaviour directly (independent
of the MF2 e2e property test):
- `if_inside_call_args_is_desugared_to_match`
- `if_inside_handle_body_is_desugared_to_match`

`compiler/src/elaborate.rs` 410 → 412 lib tests (+2). Pod-verify clean.
@boldfield boldfield changed the title [Task 55 WIP] Foundation: lift E0133 + route Item::Effect through codegen [Task 55] CPS handler codegen: foundation through Phase 4a (multi-arm dispatch) Apr 26, 2026
@boldfield boldfield marked this pull request as ready for review April 26, 2026 16:49
Copy link
Copy Markdown
Owner Author

@boldfield boldfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review of PR #22 (Plan B Task 55 — handle codegen, foundation through Phase 4a)

All seven items from the prior review (#1 walker recursion, #2 dead code, #3 op_names.dedup, #4 PROGRESS duplicate, #5 misleading test name, #6 stackmap TODO, #7 unused span param) are addressed cleanly. The MF2 differential property test, the elaborate-non-recursion fix, and the Stmt::Perform non-IO crash fix are all good additions that the author found and closed independently. The deviation entries for PB1/PB3/PB4 with named closure points are well-formed and correctly downgrade PB1 from "soundness" to "ergonomic gate" given the MF1 dispatch fix.

A few new concerns from this read.

Substantive

1. indices.insert(span.clone(), arm_indices) overwrites silently.

compiler/src/codegen.rs::collect_handle_arms_in_expr (the Expr::Handle arm) does:

indices.insert(span.clone(), arm_indices);

If two Expr::Handle nodes share a span — e.g., a future mono pass that clones a generic fn body, a closure-conversion lambda lift that duplicates a handle, or any other AST-cloning transform — the second insert silently overwrites the first. lower_expr then looks up handler_arm_refs_per_handle.get(span) and gets the wrong FuncRefs for one of the handles.

Today this is safe only because Phase 3b restricts arm bodies to IntLit (parameter-free, so all aliased monomorphs would share semantics by construction). Phase 4c lifts that restriction. The PR description acknowledges per-monomorph color variance as a Stage-6 carry-forward; the bug becomes user-observable the moment that lands.

Fix: debug_assert!(indices.insert(span.clone(), arm_indices).is_none(), "duplicate handle span — pre-pass invariant broke for {span:?}"); — cheap, fails loudly when the assumption stops holding. Same idea as the op_names dedup→debug_assert swap from review-fixup commit 54b4a60.

2. MF2 property test doesn't exercise the perform-dispatch path.

The wrapper is handle EXPR with { E_eff.op(k) => 999 } where EXPR is the generated Native-eligible body. The body never performs E_eff.op, so the synthetic arm fn is dead code at runtime — the test validates only the Phase 3a frame plumbing (frame_new + set_arm + push + pop) doesn't corrupt the body's native semantics.

The Phase 3b/4a dispatch path (sigil_performsigil_run_loop → arm → next_step_done → value extraction) is not validated by the property test despite being the load-bearing mechanism Phase 4 builds on. Hand-rolled tests cover one or two cases (Raise.fail() => 42, Choose.right() => 20) but they don't sample the shape space.

Fix recommendation: add a paired generator that emits a wrapper-with-perform shape — handle (perform E_eff.op()) with { E_eff.op(k) => N } for some constant N, asserting stdout N\n. The body is IntLit-bodied (so it stays inside Phase 4a's subset) but the dispatch loop runs end-to-end on every trial. Even 8–12 trials would catch a real regression and stay inside the CI time budget.

This isn't a blocker for merge, but the MF2 test as it stands is named "differential identity" yet tests only the no-perform half of identity. Worth a docstring caveat at minimum.

3. No e2e test for op_id ≥ 2.

Phase 4a's runtime supports MAX_HANDLER_ARMS = 14 (per Task 56 / PR #21). The current matrix exercises 1-arm (Raise.fail at op_id=0) and 2-arm (Choose.{left,right} at op_id ∈ {0,1}). A 3+-arm test would validate op_id arithmetic isn't off-by-one for op_id ≥ 2 — e.g., effect Pick { a, b, c } performing Pick.c() would expect op_id=2 dispatch.

Cheap to add, would catch a class of subtle indexing bugs that 1- and 2-arm tests can't surface.

Nits (no action needed unless trivial)

4. Xorshift64::bool method name shadows the built-in bool type identifier at call sites. Calling rng.bool() reads ambiguously — next_bool() would mirror next() and make method-call sites unambiguous. Cosmetic.

5. The Phase 3b deviation entry's "FFI surface" diagram still says "next_step_done → NextStep value extraction" rather than "→ run_loop dispatch". The lowering changed in fixup 2e7c0de to route through sigil_run_loop, but the old diagram wording in the deviation rationale wasn't fully updated. Code comments are correct; only the deviation-entry rationale text is stale. (The neighbouring deviation entries that name the closure point ARE updated.)


The substantive items (#1, #2, #3) are not merge blockers individually, but #1 in particular is the kind of latent invariant that's much cheaper to assert now than to debug post-Phase-4c when it actually bites. Recommend addressing #1 in this PR; #2/#3 can land in a follow-up.

@boldfield
Copy link
Copy Markdown
Owner Author

Re-review at 6057373

PR moved from draft → ready. Three new fix-up commits since prior review (54b4a60, 95472ec, 6057373). Mechanical state still clean — all 4 CI lanes SUCCESS, 419 compiler lib tests pass locally. Per your standing directive on this branch, the items still pending below are must-fix.

Headline result

The MF2 differential property test caught a real preexisting bug. Commit 6057373 documents this directly: the property test exercised expression shapes that hand-rolled tests had missed, and elaborate was skipping recursion into Expr::Call / Expr::RecordLit / Expr::Handle children — so a nested Expr::If inside a call argument / record field / handle body was never desugared to Expr::Match, then crashed lower_expr's unreachable!. Was unreachable in earlier phases for incidental reasons (existing tests bound compound values to let first); Phase 2's E0134 lift exposed it. Two regression tests added at the elaborate level. Exactly the structural-correctness category I argued the property test would catch.

Per-item verification

Item Status
MF1 — Stmt::Perform non-IO crash ✅ fixed; regression test statement_form_non_io_perform_inside_handle_compiles_and_runs added
MF2 — differential identity property test ✅ added; substantive (xorshift64 PRNG, 24 deterministic Native-eligible expressions, vacuous-handler wrapping, raw vs wrapped stdout comparison). Caught the elaborate non-recursion bug.
MF3 — color stub at color.rs:380-411 ❌ comment essentially unchanged. The PB4 deviation entry now names Phase 4d as the closure point for the color-handler-discharge refinement, but the in-source comment still reads "Task 55 will replace this stub" without referencing PB4 / Phase 4d. Update the comment to cross-reference the PB4 deviation entry's Phase 4d closure point.
MF4 — codegen.rs:2207 stale "offset 24" comment ❌ unchanged. Verified: the comment still says "we extract the arm's value from (*ns).value at offset 24" but the actual code routes through lower_perform_non_io_to_valuesigil_run_loop and reads its u64 return. Either delete the offset reference or rewrite to describe the run_loop path.
MF5 — duplicate Phase 3a entry in PLAN_B_PROGRESS.md ❌ unchanged. Lines 103 + 104 still both labeled "Phase 3a" with identical content. Delete one.
MF6 — TODO at typecheck.rs:2955 ❌ unchanged. Still says "Today the gate fires first so this case is unreachable in user-visible output." Verified: E0134 was lifted in Phase 2 (2d69b52). The case IS now reachable. Either prove it can't happen with current Phase 4a restrictions and update the TODO comment, or write a test program that demonstrates it doesn't break, or address the underlying inference question.
MF6 — stale E0133/E0134 comments in typecheck.rs ❌ unchanged. Lines 345-346, 473, 621, 2386, 2770 all still describe gates as firing. Sample at :473: "E0133 still fires per Item::Effect" — not true since b3af204 lifted it. The sweep test at typecheck.rs:7008-7015 correctly asserts !has_code(&errs, "E0133") so the test agrees with the actual behavior, but the prose comments contradict it.
PB1 — walker doesn't follow call edges ✅ deviation entry added (Closure point: Phase 4b). Substantive.
PB2 — // PHASE-4-RESTRICTION markers on null_closure lines ❌ not added. codegen.rs:1736-1737 and :2368-2370 still just say let null_closure = builder.ins().iconst(pointer_ty, 0) with no Phase-4-RESTRICTION marker. Add the marker on each null_closure line so the dependency on Phase 4b/4c (closure capture lift) is grep-findable.
PB3 — handler frame leak on body unwind ✅ deviation entry added (Closure point: Task 57). Substantive.
PB4 — native callers drive sigil_run_loop synchronously ✅ deviation entry added (Closure point: Phase 4d, tied to colorer refinement). Substantive.

Mechanical (verified locally)

  • cargo test -p sigil-compiler --lib: 419 passed
  • CI: all 4 lanes SUCCESS on 6057373
  • The dead-code body_contains_non_io_perform walker (~120 lines) was deleted as a bonus during the fix-up — clean

Remaining must-fix list (post-prior-review)

  1. MF3 — update compiler/src/color.rs:380-411 comment to cross-reference the new PB4 deviation entry (or its Phase 4d closure point), so the in-source dependency is discoverable.
  2. MF4 — fix compiler/src/codegen.rs:2207 stale offset-24 comment.
  3. MF5 — remove the duplicate Phase 3a entry in PLAN_B_PROGRESS.md (line 103 or line 104).
  4. MF6 — address the TODO at compiler/src/typecheck.rs:2955 (the "gate fires first → unreachable" claim is now false).
  5. MF6 — sweep compiler/src/typecheck.rs for stale "E0133/E0134 fires" comments and update each to reflect the lift (lines 345-346, 473, 621, 2386, 2770 known; consider a git grep for additional sites).
  6. PB2 — add // PHASE-4-RESTRICTION (closure capture, Phase 4b/4c) markers next to the two null_closure lines in compiler/src/codegen.rs.

All six are documentation/comment fixes — none touches functional code or invalidates the test suite. They're small but they're the kind of drift that compounds across phases and confuses future readers (or future agents resuming this work). The PB4 deviation entry itself is excellent — it just needs the in-source comment at color.rs to point at it.

Once these six land, this is ready to merge.

Substantive items:
- indices.insert overwrite now asserted via debug_assert!(prev.is_none(),
  ...) at codegen.rs:967-985. Two `Expr::Handle` nodes sharing a span
  would silently lose one's `FuncRef`s; today AST-cloning passes don't
  do this, but the assertion fails loudly when the assumption stops
  holding (mirrors the op_names dedup → debug_assert pattern from
  54b4a60). Insert lives outside the assert so the side effect runs
  in release builds.
- Paired MF2 generator `cps_dispatch_returns_arm_value_across_op_id_shape_space`
  exercises the full `sigil_perform → sigil_run_loop → arm fn →
  next_step_done` dispatch path on every trial (12 trials with arm
  values sampled across [-99, 99]). The original MF2 test only
  exercised the no-perform half of identity; this closes that gap.
- New e2e `handle_with_three_arms_dispatches_op_id_two`: 3-arm handler
  (Pick.{a,b,c}), performs `Pick.c()`, asserts dispatch returns 2.
  Validates op_id assignment (alphabetical: a=0, b=1, c=2), arm-slot
  bitmap indexing at i=2, and `sigil_perform`'s linear walk for
  op_id ≥ 2 — coverage gap not closed by the existing 1-arm and
  2-arm e2e tests.

Documentation/comment fixes:
- color.rs `find_non_io_perform_in_expr` Phase-coverage comment now
  cross-references the PB4 deviation entry by title and names the
  Phase 4d closure point.
- typecheck.rs:2962 TODO rewritten: the unsolved-handler-overall
  edge case is closed today not by typecheck but by codegen's
  IntLit-only restriction (every arm pins handler_overall to Int);
  Phase 4c lifts the restriction at which point this path needs a
  decision.
- typecheck.rs asymmetric-gate-tests section comment updated to
  reflect both gates now lifted.
- typecheck.rs:7248 + :7280 stale "still fires" / "still gated"
  comments updated to point at the lift commits.
- codegen.rs `lower_perform_non_io_to_value` now has explicit
  PHASE-4-RESTRICTION markers on `null_args_ptr` (Phase 4b — args
  packing) and `null_k_closure`/`null_k_fn` (Phase 4d — continuation
  reification). Pairs with the existing marker on the
  set_arm null_ptr at line 2242.
- PLAN_B_DEVIATIONS.md Phase 3b entry's FFI-surface diagram
  updated: `sigil_perform → sigil_run_loop dispatch → arm fn →
  next_step_done → run_loop returns u64 → pop` (was: `→ NextStep
  value extraction → pop`, stale since the 2e7c0de fixup).

Nit:
- Xorshift64::bool → Xorshift64::next_bool (unambiguous at call
  sites, mirrors `next` and `range`).
Copy link
Copy Markdown
Owner Author

@boldfield boldfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-re-review of PR #22 — fixup commit ca326594

All five items from the prior re-review are addressed cleanly. No new issues.

Substantive items closed:

  • #1 indices.insert overwritedebug_assert!(prev.is_none(), ...) added at compiler/src/codegen.rs:967-985. Insert lives outside the assert so the side effect runs in release. Comment correctly references the op_names dedup→debug_assert pattern from 54b4a60 and names PB4's Phase 4d as the most likely future trigger.
  • #2 MF2 dispatch coverage — new cps_dispatch_returns_arm_value_across_op_id_shape_space (12 trials, arm values sampled from [-99, 99]). Body is (perform E_eff.op()) so the full sigil_perform → run_loop → arm fn → next_step_done path runs every iteration. Different PRNG seed (0x4D_46_32_44_49_53_50_00 = "MF2DISP\0") avoids overlap with the original MF2's seed. Verified the parser folds -N literals into Expr::IntLit(-N) (parser.rs:907), so negative arm values pass the codegen IntLit-only guard.
  • #3 3-arm e2e test — new handle_with_three_arms_dispatches_op_id_two. effect Pick { a, b, c } performs Pick.c(), asserts dispatch returns 2 (op_id=2). Validates alphabetical ID assignment, arm-slot indexing at i≥2, and sigil_perform's linear walk past unmatched arms 0/1.

Nits closed:

  • #4 Xorshift64::boolnext_bool — done at all 3 call sites.
  • #5 Phase 3b deviation FFI diagram — updated to frame_new → set_arm → push → sigil_perform → sigil_run_loop dispatch → arm fn → next_step_done → run_loop returns u64 → pop, with an explanatory sentence about the 2e7c0de fixup that introduced the run_loop step.

Bonus cleanups (not requested, all reasonable):

  • color.rs find_non_io_perform_in_expr — cross-references PB4 deviation entry by title and names Phase 4d as the closure point.
  • typecheck.rs:2962 TODO rewritten as a Closure-point entry; the unsolved-handler-overall edge case is correctly identified as closed by codegen's IntLit-only restriction (every IntLit arm body unifies handler_overall to Int), with Phase 4c as the lift point.
  • typecheck.rs "asymmetric gate state" header + two inline test comments updated to reflect both gates lifted.
  • codegen.rs::lower_perform_non_io_to_valuePHASE-4-RESTRICTION markers on null_args_ptr/zero_len (Phase 4b) and null_k_closure/null_k_fn (Phase 4d), pairing with the existing marker on set_arm's null_ptr. Phase-4 dependency edges are now grep-findable.

CI green at ca326594 on both ubuntu-24.04 and macos-14 (build+test and cold-checkout).

Nothing else to flag from my side. Ready to leave draft once any other reviewers sign off.

@boldfield
Copy link
Copy Markdown
Owner Author

Re-review at ca32659 — merge

All 6 items from the prior re-review are addressed. Plus three bonus additions per the commit message (defensive indices.insert assertion, paired MF2 generator exercising the full dispatch path, 3-arm e2e validating op_id ≥ 2). Mechanical state clean.

(Triage note: my initial verification on the prior pass missed the fixes because my local checkout was 4 commits behind origin — git pull --ff-only resolved. Worth flagging that the agent should keep working locally only if confident the local state matches the PR; the squash-merged result is what reviewers see, but pre-merge mid-flight reviews need the actual pulled branch.)

Per-item verification (all post-pull)

Item Verification
MF3 — color stub cross-reference color.rs:399 now contains explicit "See PLAN_B_DEVIATIONS.md entries:" block referencing the PB4 deviation entry by title and naming Phase 4d as the closure point
MF4 — codegen.rs:2207 stale offset-24 rg 'offset 24' compiler/src/codegen.rs returns nothing. Stale comment removed
MF5 — duplicate Phase 3a in PROGRESS.md ✅ One Phase 3a entry remains (the original (\ef4be8d`)`-tagged one); duplicate "(this commit)" entry deleted
MF6 — typecheck.rs:2955 TODO ✅ Rewritten as a "Closure point (Plan B Task 55, Phase 4c)" comment explaining the case is closed today by codegen's IntLit-only restriction (every IntLit arm pins handler_overall to Int), with Phase 4c lifting the restriction as the named closure point
MF6 — stale "E0133/E0134 still fires" comments ✅ My grep for still fires|still gated in typecheck.rs returns only E0066 / E0112 references (unrelated diagnostics that genuinely still fire). All E0133/E0134 stale comments updated
PB2 — PHASE-4-RESTRICTION markers ✅ Three markers added: codegen.rs:2054 (Phase 4b — args packing), :2065 (Phase 4d — continuation reification), :2266 (Phase 4b/4c — closure capture). Grep-findable for future phase work

Bonus additions (per ca32659 commit message)

  • indices.insert overwrite now asserted via debug_assert!(prev.is_none(), ...) at codegen.rs:967-985 — defensive guard against silent FuncRef loss if two Expr::Handle nodes ever share a span
  • Paired MF2 generator cps_dispatch_returns_arm_value_across_op_id_shape_space exercises the full sigil_perform → sigil_run_loop → arm fn → next_step_done dispatch path (12 trials, arm values across [-99, 99]) — closes the gap where the original MF2 only exercised the no-perform half of identity
  • New e2e handle_with_three_arms_dispatches_op_id_two validates op_id assignment (alphabetical: a=0, b=1, c=2), arm-slot bitmap indexing at i=2, and sigil_perform's linear walk for op_id ≥ 2 — coverage gap not closed by 1-arm and 2-arm tests
  • Xorshift64::boolXorshift64::next_bool (consistency with next and range)

Mechanical

  • cargo test -p sigil-compiler --lib (post-pull): 419 passed, 0 failed
  • CI status at time of review: macos-14 lanes SUCCESS; ubuntu-24.04 lanes IN_PROGRESS. Expect green based on local verification + macos parity
  • All prior must-fix items closed; no new findings

Once CI confirms ubuntu-24.04 lanes pass, ready to merge.

@boldfield boldfield merged commit 2f56e87 into main Apr 26, 2026
4 checks passed
boldfield added a commit that referenced this pull request Apr 26, 2026
Addresses PR #23 review feedback at `dea0106`.

MF1 (args-content verification gap): the three Phase 4b e2e
tests verify FFI plumbing compiles + runs but cannot observe
arg values arriving correctly because arm bodies are still
IntLit-only. Per reviewer's option (3) — cheapest of the
three offered — defer arg-content verification to Phase 4c
with an explicit pre-registered acceptance criterion in the
Phase 4b deviation entry. Phase 4c's PR is now required to
ship 4 named arg-content readback tests (Int / Bool+Char /
String / multi-arg-in-declared-order) before merging. A
bisecting agent investigating wrong-args-value bugs after
Phase 4c lands gets a pointer back to Phase 4b as the
suspect.

MF2 (stack-slot lifetime tied to synchronous call pattern):
add a `// PHASE-4-RESTRICTION (Plan B Task 55, Phase 4d):`
marker on the `create_sized_stack_slot` line. Phase 4d
converts perform sites to return `NextStep::Call` to the
caller's trampoline rather than synchronously calling
`sigil_run_loop` from native code; at that point the stack
slot dies before the trampoline reads it on the next
dispatch and the args buffer must migrate to arena
allocation. Marker is grep-findable; cross-references the
existing `Native callers drive sigil_run_loop synchronously`
deviation entry. Closure-point note also added to the new
Phase 4b deviation entry.

MF3 (PR #22 single-branch deviation entry is now stale):
PR #22's foundation-phase entry claimed "only one PR opens
for Task 55." That turned out wrong in practice — Phase 4b
landed as a separate PR (#23) on its own branch
(`plan-b-task-55-phase-4b`). Add a "Cadence pivot" addendum
to that entry recording the per-phase-PR cadence with
reasoning (reviewability, independent CI cycles, rollback
granularity, review checkpoint cadence). Original "single-PR
convention" claim preserved unchanged for historical
accuracy. Future Task 55 phase work expected to follow the
per-phase-PR cadence.

Review #2 (doc/code drift on assert vs debug_assert): the
new Phase 4b deviation entry said "defensive `debug_assert`"
but the actual code was `assert!`. Per reviewer's
recommendation, change the code to `debug_assert!` and let
the runtime's `sigil_perform` overflow check (which has a
named `effect_id` / `op_id` message) be the source of truth
for users; the compiler-side `debug_assert!` exists to catch
the bug pre-link in dev builds before the runtime guard
fires. Deviation text updated to match. v1's effect arities
(0–2 user args) sit so far below the 30-arg cap that this
delta has zero observable surface today.

Review #3 (widening fallthrough silently miscompiles future
floats in release builds): change `debug_assert_eq!` to
`assert!` so a future F32/F64 surface type or 32-bit-target
port that smuggles a non-pointer-width value through the
args-buffer path panics in *both* dev and release builds
rather than silently storing the bit-pattern as if it were
a pointer-sized value. Comment expanded to explain the
trade-off. Float / 32-bit-target safety net documented in
the Phase 4b deviation entry.

Pod-verify clean.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant