Skip to content

Plan A1 code-review fixes: 6 issues from the post-review audit#1

Merged
boldfield merged 9 commits into
mainfrom
fix/a1-review
Apr 23, 2026
Merged

Plan A1 code-review fixes: 6 issues from the post-review audit#1
boldfield merged 9 commits into
mainfrom
fix/a1-review

Conversation

@boldfield
Copy link
Copy Markdown
Owner

Summary

Post-Plan-A1 code-review pass. Six concrete issues the review found are fixed cleanly; the vertical slice (hello-world compiles + runs) is preserved.

  • Fix 1 ([Task 7] d1c91b2): Typecheck now emits user-facing codes E0040–E0046, one per class (no fn main / wrong signature / effect not in row / arg count / arg type / let-type mismatch / unknown ident). Previously every typecheck error used E0001 "internal compiler error". Adds unit tests covering each new code plus a negative sweep asserting no program surfaces E0001.
  • Fix 2 ([DEVIATION Task 0.11] d3e9966): Introduces a real StackMapBuilder in compiler/src/codegen.rs. Versions the stackmap section ("SGST" magic + u32 version = 0 + u32 record_count); every Plan A1 record carries STACKMAP_FLAG_PLACEHOLDER = 0x0001 so a Plan B v1 reader can detect Stage 1 data and bail. runtime/src/stackmap.rs grows a parse_section function + unit tests; runtime/README.md is rewritten to lead with the "v0 placeholder" caveat and document the v0 → v1 upgrade path. PLAN_A1_PROGRESS.md Task 0.11 status moves from done to done-with-caveat and the false "Compiler-side StackMapBuilder ships with task 12" self-report is corrected.
  • Fix 3 ([Task 7] 77e79aa): Replaces Expr::Ident(_, _) => Some(Ty::Unit) placeholder with a real lookup through a per-function BTreeMap<String, Ty>. Unknown identifiers emit E0046 with a span. Positive-path test (IO.println(greeting: String)) would previously have passed under Ty::Unit and now exercises the env.
  • Fix 4 ([Task 4] 196a377): Lexer emits E0050 on integer literals that overflow i64 instead of silently substituting 0. Test covers overflow + i64::MAX boundary.
  • Fix 5 ([Task 1] 44f70dd): Removes direct object = "=0.39.1" workspace dep. Imports go through cranelift_object::object::* re-export. cargo tree -e normal -p sigil-compiler | grep -w object now shows object as a transitive-only dep of cranelift-object.
  • Fix 6 ([CHORE] e9d1af8): Consolidates six hand-rolled ErrorCode::new(..).unwrap_or(panic) sites across lexer/parser/resolve/typecheck into a single errors::code(..) -> ErrorCode helper backed by unreachable!. Adds disallowed-macros = [{ path = "std::panic", ... }] to clippy.toml. Non-test panic! surface is now zero.

Acceptance evidence (laptop, aarch64-apple-darwin)

command result
cargo fmt --all -- --check clean
cargo clippy --no-deps --workspace --all-targets -- -D warnings clean (new panic rule active)
cargo test --workspace --no-fail-fast 67 pass / 0 fail (vs 43 baseline: +14 typecheck, +4 codegen, +3 lexer, +1 e2e stackmap-parse, +2 runtime stackmap-parse +2 runtime stackmap-round-trip)
scripts/smoke.sh smoke: OK (hello-world printed as expected)
scripts/reproducibility.sh reproducibility: OK (sha256=183e74cc…)
scripts/check-no-interior-pointers.sh check-no-interior-pointers: OK
./target/release/sigil examples/hello.sigil -o /tmp/sigil_hello && /tmp/sigil_hello hello, world
sigil --print-runtime-stats examples/hello.sigil -o /tmp/sigil_hello_stats SIGIL_COUNTER_BOEHM_ALLOC_COUNT=1, SIGIL_COUNTER_ARENA_ESCAPE_COUNT=0
otool -l /tmp/sigil_hello | grep -A4 __stackmaps present, size 0x2c = 44 bytes (= 12 header + 4 records × 8, one per Cranelift call site)
grep -rn 'panic!' compiler/src/ | grep -v '#\[cfg(test)\]' comments/doc-strings only; all live sites are inside #[cfg(test)] modules with matching #[allow]
cargo tree -e normal -p sigil-compiler | grep object object v0.39.1 appears only as transitive of cranelift-object

Test plan

  • Unit tests per fix (see commit descriptions for each)
  • e2e hello still passes
  • New e2e stackmap_section_parses_v0_placeholder validates the v0 format end-to-end through the runtime parser
  • Reproducibility across two same-host builds
  • Manual hello-world on macOS aarch64

Scope fence

  • No changes to docs/plans/2026-04-21-sigil-design.md (in the boldfield/designs repo).
  • No new dependencies added; the object direct dep was removed.
  • No Plan A2 features implemented. The Expr::Call typecheck site was moved from E0001 to E0043 for consistency; it remains unreachable in Plan A1 programs.
  • Commits follow [Task N] / [DEVIATION Task N] / [CHORE] prefix discipline. Deviation entry for Fix 2 landed in the implementing commit (matches the existing 51655e7 precedent); hash backfill is the [CHORE] Backfill commit that follows (matches fb7d832 precedent).

🤖 Generated with Claude Code

boldfield and others added 9 commits April 22, 2026 22:15
Post-review hardening (Fix 6). Replaces six hand-rolled
`ErrorCode::new(..) -> panic!` call sites across lexer, parser, resolve,
and typecheck with a single `errors::code(..) -> ErrorCode` helper
backed by `unreachable!` — the catalog seed is a build-time invariant.
Adds `disallowed-macros = [{ path = "std::panic", ... }]` to
`clippy.toml` so future non-test `panic!` sites fail CI. Test modules
extend their `#[allow(..)]` attribute to include
`clippy::disallowed_macros`.

Non-test `panic!` surface outside this one `unreachable!` call is now
zero.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… diagnostic

Post-review fix (Fix 1). Adds catalog entries E0040..E0046 and routes
every user-reachable diagnostic from `compiler/src/typecheck.rs`
through one of them. Previously every typecheck diagnostic was emitted
with E0001 — "internal compiler error — always a compiler bug" — which
defeated the error-catalog pedagogy.

New codes (stable once minted):

- E0040 — program has no `fn main`
- E0041 — `fn main` has wrong signature (return type, params, or
          effect-row contents beyond `IO`)
- E0042 — effect used but not declared in enclosing row (also covers
          "unknown effect operation" for Plan A1's minimal surface)
- E0043 — argument count mismatch at call site
- E0044 — argument type mismatch at call site
- E0045 — let-binding declared type does not match initializer
- E0046 — unknown identifier (catalog entry lands here; the call site
          arrives with Fix 3 — currently masked by the
          `Ident => Some(Ty::Unit)` placeholder)

Adds unit tests covering each code plus a negative sweep that asserts
no typecheck test program surfaces E0001. `sigil explain <code>` prints
the long form for every new code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…unknown refs

Post-review fix (Fix 3). Replaces `Expr::Ident(_, _) => Some(Ty::Unit)`
— which silently typed every identifier as Unit — with a lookup in a
per-function type environment. Parameters populate the env on entry to
`check_fn`; `let` bindings extend it as their statements are checked.
Unknown identifiers emit E0046 with a span pointing at the reference
and return None so downstream checks don't cascade from a fabricated
type.

The placeholder was benign in Plan A1 (hello-world never references a
user binding) but would have silently accepted nonsense once A2 adds
arithmetic. The two new positive-path tests (`bound_ident_resolves_to_its_type`,
`bound_ident_wrong_type_is_e0044`) would both fail under the old code
because IO.println(x: Int) would have passed as IO.println(x: Unit).

Uses BTreeMap for the env to preserve deterministic-iteration
discipline even though Plan A1 never iterates the env (only
point-lookups).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Post-review fix (Fix 4). Replaces `lit.parse::<i64>().unwrap_or(0)` in
`compiler/src/lexer.rs` — which silently turned any over-range literal
into zero — with a positioned E0050 diagnostic. A zero-valued token is
still produced at the literal's span so subsequent parse positions do
not shift; the compile aborts after the errors sweep regardless.

Adds catalog entry E0050 ("integer literal out of range") and unit
tests covering the overflow path and the i64::MAX boundary (which must
still lex clean).

This was a latent wrong-code bug: Plan A1's hello-world never hit it,
but any Plan A2 arithmetic expression using an over-range constant
would have silently computed from 0 instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Post-review fix (Fix 5). The direct `object = "=0.39.1"` dependency was
added alongside Tasks 3-15 without a logged deviation. The plan's
dependency allow-list is cranelift, cranelift-module, cranelift-object,
target-lexicon, the Boehm GC crate (replaced by direct FFI),
include_dir, and insta — `object` was not on it.

`cranelift-object` already re-exports `object` at its crate root
(`pub use object;`) and pins the same `=0.39.1` version, so removing
the direct dep requires only switching the imports in
`compiler/src/codegen.rs` to go through `cranelift_object::object`.
`cargo tree -e normal -p sigil-compiler | grep -w object` now shows
`object` as a transitive of `cranelift-object` only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Post-review fix (Fix 2). Corrects four sub-issues in Task 0.11's
original landing, logged as a single deviation in PLAN_A1_DEVIATIONS:

1. Introduces a real `StackMapBuilder` type in compiler/src/codegen.rs
   (the plan and runtime README both referenced it, but the prior
   codegen inlined a `Vec<u32>` serialization at section-emit time).
2. Gives the section a versioned header ("SGST" magic, u32 version = 0,
   u32 record_count). v0 flags every record with
   STACKMAP_FLAG_PLACEHOLDER (0x0001) so a Plan B v1 reader can detect
   Stage 1 placeholders and bail / resynthesise from relocations rather
   than consuming opaque Cranelift Inst handles as real PC offsets.
3. Declares `live_count = 0` as an asserted v0 invariant on both emit
   and parse paths.
4. Rewrites runtime/README.md's stackmap section to lead with the
   "Plan A1 limitation: placeholder format" caveat, document the v0
   wire format that is actually emitted, and specify the v0 -> v1
   upgrade path. Fixes PLAN_A1_PROGRESS Task 0.11's false self-report
   (status bumped from `done` to `done-with-caveat`).

Adds runtime-side parser `runtime::stackmap::parse_section` with
unit tests for each error mode (truncated header, bad magic, unknown
version, truncated records, empty ok, with-records round-trip) and a
compiler-side e2e test `stackmap_section_parses_v0_placeholder` that
compiles hello.sigil to .o and parses the resulting section through
the runtime parser — asserting the v0 invariants end-to-end.

Real post-regalloc PC offsets + live-value lists remain deferred to
Plan B's Cranelift safepoint integration. The constants, header
layout, and StackMapBuilder API are designed so Plan B's changes are
additive: push_placeholder becomes push, version bumps to 1, flags
bit 0 clears. Compiler and runtime tests both become forcing inputs
for Plan B's work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Matches the existing backfill pattern (see fb7d832): the
`[DEVIATION Task 0.11]` entry and the Task 0.11 PROGRESS line both
now cite the implementing commit hash rather than the pre-commit
placeholder.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Post-Fix-2 fmt fixup: `cargo fmt` rejoins
`STACKMAP_HEADER_SIZE.saturating_add(count.saturating_mul(...))`
onto one line. The expression fits in rustfmt's default width and the
original multi-line form was an artifact of the earlier clippy
`manual_saturating_arithmetic` rewrite.

No behaviour change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ut sha256sum

macOS CI (`macos-14` runner) has no GNU coreutils on PATH by default,
so the hardcoded `sha256sum` invocation failed with `command not
found`. Local macOS runs worked because the laptop has Homebrew
coreutils installed — this was masked until CI exercised a clean
runner.

Prefers `sha256sum` when present (no behaviour change on Linux CI)
and falls back to `shasum -a 256`, which is the portable Perl script
that ships with macOS by default. Both emit the hash as the first
whitespace-separated field, so the downstream `awk '{print $1}'`
extraction is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@boldfield boldfield merged commit 5a2e55c into main Apr 23, 2026
2 checks passed
@boldfield boldfield deleted the fix/a1-review branch April 23, 2026 05:54
boldfield added a commit that referenced this pull request Apr 24, 2026
…0111

PR #12 review flagged that `Expr::RecordLit`'s "not yet implemented"
stub emitted E0001, which the catalog defines as "internal compiler
error — always a compiler bug, not a user error." Every well-formed
user program that uses a record literal today hits that message,
falsely claiming a compiler bug. Third time this anti-pattern has
been flagged on the project (PR #1, PR #5/6, now #12).

Changes:
- Add E0111 catalog entry — "record literal / constructor application
  pending Plan A3 Task 38". The long-form text documents this as a
  staged-unimplemented-feature diagnostic that Task 38 removes.
- Emit E0111 (not E0001) at typecheck.rs:527-538 for RecordLit.
- Extend `no_user_facing_error_uses_e0001` self-test with a record-
  literal program. This closes the coverage gap that let the
  regression through; any future user-reachable typecheck stub added
  in Tasks 38-45 must be covered by this sweep. Test fails under the
  old E0001 path (verified before the catalog+typecheck change),
  passes under E0111.
- Add positive-path parser test `record_literal_in_call_arg_of_match_
  scrutinee` pinning the no_record_lits flag-reset through a call arg
  list when the match scrutinee sets the flag. Twin of the existing
  if-cond coverage.

pod-verify green (fmt, check, clippy, runtime lib tests, interior-
pointer + HashMap discipline greps).
boldfield added a commit that referenced this pull request Apr 25, 2026
…tests

Addresses PR #16 review (boldfield):

- Comment 4318163326 #1 (mangling format ambiguity, soundness hazard):
  switch separator from `_`/`__` to `$`/`$$`. `$` is lexer-rejected
  per Plan A2's `$lambda_N` precedent, so user identifiers cannot
  contain it. `type List_Option[A]` instantiated at `Int` no longer
  collides with `List[Option[Int]]` — the former mangles to
  `List_Option$$Int`, the latter to `List$$Option$Int`. Codegen's
  `mangle_user_fn` rewrites `$` to `__` for ELF/Mach-O compat;
  AST-level uniqueness is preserved.

- Comment 4318163326 #2 (E0132 ambiguous polymorphism): new
  diagnostic in error catalog plus end-of-typecheck check. After
  resolving pending instantiations through `subst.apply_ty`, any
  type-arg that is still `Ty::Var(_)` AND not an outer-fn's
  generic-param id (i.e. not legitimately deferred to clone time)
  fires E0132 with the affected param name. Three tests pin the
  diagnostic: fires on unconstrained `nothing[A]()`, doesn't fire
  on constrained `id[A](42)`, doesn't fire on outer-fn-bound
  `use_id[B](y) { id(y) }`.

- Comment 4318161359 #2 (canon_ty placeholder collision): the
  `Ty::Var(_)` and `Ty::Fn(_)` arms in `canon_ty` and
  `ty_to_type_expr` now `unreachable!` instead of emitting
  placeholder strings that two distinct vars would collide at.
  Reachable only when the upstream invariant breaks — E0132
  catches ambiguous-poly cases at typecheck, and `Ty::Fn` requires
  a `TypeExpr::Fn` surface that v1 deliberately omits.

- Comment 4318161359 #1 (call-site ctor lookup span fragility):
  the `Expr::Call` arm now explicitly checks `ctor_sites.get(span)`
  by the Call's span before falling through to the recursive
  callee rewrite. No longer relies on the parser's incidental
  `Call.span == callee_ident.span` for ctor calls.

- Comment 4318161359 #3 (pattern sub-ctor unimplemented! guard):
  `rewrite_pattern`'s `Pattern::Ctor` arm now `unreachable!`s when
  the ctor's owning type is generic but doesn't match the
  scrutinee's User type. v1 surface can't construct this case;
  surfacing loudly closes a silent-miscompile path.

- Comment 4318163326 #3 (test additions/strengthening):
  - `match_against_generic_scrutinee_rewrites_pattern_ctors`:
    strengthened to assert *both* `Nada$$Int` and `Just$$Int`
    appear, not just `>= 1`.
  - `recursive_generic_type_termination_one_clone_per_arg_tuple`:
    new test pinning that a recursive generic type
    (`type List[A] = | Nil | Cons(A, List[A])`) produces exactly
    one `List$$Int` clone given recursive use.
  - `self_recursive_generic_fn_terminates`: new test on a
    self-recursive generic fn.
  - `end_to_end_nested_generic_clone_name_matches_pinned_format`:
    new e2e variant of `mangle_fn_handles_nested_generics` that
    runs through the full pipeline.
  - `ctor_call_site_callee_ident_is_rewritten`: new test asserting
    the Call's callee Ident text is the mangled name post-mono
    (closes Comment 1 #1's gap).
  - `generic_fn_calling_generic_fn_resolves_var_chain`: new test
    exercising the `Ty::Var → Ty::Var → concrete` chain through
    `Substitution::by_var`.
  - `two_fn_clones_with_nested_generic_type_args`: new test
    confirming two clones for two distinct nested instantiations.
  - `canon_ty_disambiguates_underscore_named_user_types`: new
    unit test pinning the soundness fix.

- Comment 4318163326 #4 (Deviation #2 reframe): effect-rows-not-
  monomorphized is reframed from open-ended ("v2 might revisit")
  to permanent v1 design choice; informational. Closure-path = none.

- Comment 4318161359 #5 (apply_to_ty Ty::Fn comment): added inline
  comment pointing at Deviation #2 so the unchanged `effect_row_var`
  copy isn't mistaken for a substitution bug.

- Pattern::Var nullary-ctor promotion in monomorph: typecheck's Plan
  A3 task 38.3 nullary-ctor promotion treats `Pattern::Var(name)`
  semantically as `Pattern::Ctor` when the name matches a Unit
  variant of the scrutinee's user type, but doesn't rewrite the AST.
  Monomorph now performs that rewrite explicitly (with mangled name
  `Nada$$Int` etc) so codegen's `ctor_index` lookup succeeds
  post-mono. This was the root cause of the strengthened
  match-pattern test failing initially.

DEVIATIONS Deviation #1 updated to reflect the `$` separator and the
hardening rationale. Deviation #2 reframed.

Test deltas: 309 → 319 compiler lib tests (+10).
- 28 → 35 monomorphize tests (+7 new).
- 281 → 284 typecheck tests (+3 E0132 tests).

Pod-verify green: cargo check workspace, cargo fmt --check, cargo
clippy -p sigil-runtime -p sigil-compiler --all-targets -D warnings,
cargo test -p sigil-runtime --lib, scripts/check-no-interior-pointers.sh.
boldfield added a commit that referenced this pull request Apr 25, 2026
…de reasons

Addresses both review comments on PR #17. Six structural changes plus
five new tests; existing 32 unit tests + 2 e2e tests still pass.

Bugs / correctness:

1. Edge insertion now drives off CheckedProgram::call_site_instantiations
   instead of the bare-Ident-name heuristic. Typecheck's env-precedence
   rules win over fn_schemes lookup at every Ident span, so a parameter
   or `let` binding shadowing a top-level fn name does NOT produce a
   spurious outgoing edge. The bare-Ident heuristic over-approximated
   here — the new code uses information typecheck already computed.
   Pinned by parameter_shadowing_top_level_fn_does_not_taint_caller, a
   real-front-end test that runs through typecheck.

2. tarjan_scc is now iterative with an explicit work stack — eliminates
   the recursive-frame stack-overflow risk on long monomorph chains.
   Reverse-topological emission order preserved; behavior on existing
   tests unchanged (verified by re-running all 15 prior color tests
   plus the 5 new ones).

Diagnostic / API:

3. Per-node reason text is now the node's *own* most-proximate cause:
   intrinsic-CPS members keep their specific local reason; bridge
   members (with an outgoing edge to a CPS SCC in a different SCC) get
   `cps: transitively calls <callee> which is cps`; non-bridge members
   propagated through SCC membership get either
   `cps: in SCC with intrinsically-cps member <name>` or
   `cps: in SCC bridging to cps callee via <name>` depending on which
   peer caused the taint. Two-variant SCC-fallback wording lets a
   --dump-color reader follow the causal chain in adversarial review.

4. pipeline::dump_color returns Result<String, DumpColorError> with
   typed ReadFailed / FrontEndErrors(usize) variants instead of
   conflating both as `usize`. Caller still maps to exit code 1 today,
   but downstream tooling can branch on the variant.

5. main.rs now warns on stderr when `-o <path>` is supplied alongside
   --dump-color: "warning: `-o <path>` ignored under --dump-color (no
   executable produced)". The flag is still accepted (shell-history
   ergonomics) but the misuse is no longer silent.

6. unused_param_warning_silenced renamed to
   synth_fn_with_param_classifies_native — the old name didn't match
   the test body.

New tests (in addition to the rename):

- self_loop_single_node_scc_native + self_loop_single_node_scc_cps —
  pin Tarjan correctness on a single-node-with-self-loop SCC.
- let_bound_fn_value_taints_parent_via_outgoing_edge — pins the
  load-bearing soundness claim that `let f = some_cps_fn` taints the
  parent even though f is never invoked.
- cps_propagates_through_three_scc_chain — pins reverse-topological
  propagation across depth-N hops, not just single-hop.
- scc_taint_via_transitive_only_branch — exercises the
  transitive-only SCC reason branch (no intrinsic member, only
  bridge), pins the new
  "in SCC bridging to cps callee via <name>" reason wording.
- parameter_shadowing_top_level_fn_does_not_taint_caller — the
  Review #1 reproducer; real front-end run; pins precision.

Pod-verify green. Full lib tests + new e2e behavior (the -o warning)
deferred to CI.
boldfield added a commit that referenced this pull request Apr 25, 2026
* [Task 50] Color inference: per-monomorph, SCC-aware, --dump-color

Replaces the Plan A1 always-Native stub in compiler/src/color.rs with
real per-monomorph analysis. Each top-level fn (each post-mono
"monomorph") is classified Native or Cps:

- Native: closed effect row of [] or [IO], no `perform` of a non-IO
  effect anywhere in the body or nested lambdas, and (after
  propagation) no transitive call into a Cps-color monomorph.
- Cps: anything else.

The propagation pass uses a Tarjan SCC over the monomorph call graph.
Within an SCC, if any member is locally Cps the whole SCC is Cps;
otherwise if any member calls into a downstream SCC that's Cps, the
SCC is Cps; otherwise Native. This matches Plan B's "SCC-aware"
guidance and avoids over-pessimizing one fn because of an unrelated
cycle.

Reasons are stable, machine-readable, and human-readable, covering:
- `cps: open effect row`
- `cps: row contains effect <E>`
- `cps: performs <E>.<op>`
- `cps: transitively calls <callee> which is cps`
- `cps: in SCC with cps member <name>`
- `native: pure row` / `native: row is ![IO]`

`--dump-color` lands in cli.rs + main.rs + new pipeline::dump_color
helper. The flag runs the front end through color inference and prints
`<name> native|cps <reason>` per fn (one line per monomorph in program
order) to stdout, no codegen. `-o` is silently accepted but ignored;
`--print-runtime-stats` conflicts with `--dump-color` and emits a
usage error.

Tests:
- 15 color::tests cover open-row Cps, non-IO-row Cps,
  perform-non-IO-with-IO-row Cps (via the body-walk fallback),
  caller-callee both Native, mutual recursion Native, mutual
  recursion with one Cps member tainting whole SCC, transitive Cps
  taint, unrelated cycle non-pessimization, and lambda body perform
  tainting parent.
- 4 cli::tests pin --dump-color parsing, default and human formats,
  -o silent ignore, and the conflict with --print-runtime-stats.
- 2 e2e tests exercise `sigil <input> --dump-color` end-to-end.

Pod-verify green; full lib + e2e tests deferred to CI.

* [Task 50] Review fixups: iterative Tarjan + span-keyed edges + per-node reasons

Addresses both review comments on PR #17. Six structural changes plus
five new tests; existing 32 unit tests + 2 e2e tests still pass.

Bugs / correctness:

1. Edge insertion now drives off CheckedProgram::call_site_instantiations
   instead of the bare-Ident-name heuristic. Typecheck's env-precedence
   rules win over fn_schemes lookup at every Ident span, so a parameter
   or `let` binding shadowing a top-level fn name does NOT produce a
   spurious outgoing edge. The bare-Ident heuristic over-approximated
   here — the new code uses information typecheck already computed.
   Pinned by parameter_shadowing_top_level_fn_does_not_taint_caller, a
   real-front-end test that runs through typecheck.

2. tarjan_scc is now iterative with an explicit work stack — eliminates
   the recursive-frame stack-overflow risk on long monomorph chains.
   Reverse-topological emission order preserved; behavior on existing
   tests unchanged (verified by re-running all 15 prior color tests
   plus the 5 new ones).

Diagnostic / API:

3. Per-node reason text is now the node's *own* most-proximate cause:
   intrinsic-CPS members keep their specific local reason; bridge
   members (with an outgoing edge to a CPS SCC in a different SCC) get
   `cps: transitively calls <callee> which is cps`; non-bridge members
   propagated through SCC membership get either
   `cps: in SCC with intrinsically-cps member <name>` or
   `cps: in SCC bridging to cps callee via <name>` depending on which
   peer caused the taint. Two-variant SCC-fallback wording lets a
   --dump-color reader follow the causal chain in adversarial review.

4. pipeline::dump_color returns Result<String, DumpColorError> with
   typed ReadFailed / FrontEndErrors(usize) variants instead of
   conflating both as `usize`. Caller still maps to exit code 1 today,
   but downstream tooling can branch on the variant.

5. main.rs now warns on stderr when `-o <path>` is supplied alongside
   --dump-color: "warning: `-o <path>` ignored under --dump-color (no
   executable produced)". The flag is still accepted (shell-history
   ergonomics) but the misuse is no longer silent.

6. unused_param_warning_silenced renamed to
   synth_fn_with_param_classifies_native — the old name didn't match
   the test body.

New tests (in addition to the rename):

- self_loop_single_node_scc_native + self_loop_single_node_scc_cps —
  pin Tarjan correctness on a single-node-with-self-loop SCC.
- let_bound_fn_value_taints_parent_via_outgoing_edge — pins the
  load-bearing soundness claim that `let f = some_cps_fn` taints the
  parent even though f is never invoked.
- cps_propagates_through_three_scc_chain — pins reverse-topological
  propagation across depth-N hops, not just single-hop.
- scc_taint_via_transitive_only_branch — exercises the
  transitive-only SCC reason branch (no intrinsic member, only
  bridge), pins the new
  "in SCC bridging to cps callee via <name>" reason wording.
- parameter_shadowing_top_level_fn_does_not_taint_caller — the
  Review #1 reproducer; real front-end run; pins precision.

Pod-verify green. Full lib tests + new e2e behavior (the -o warning)
deferred to CI.

* [Task 50] Review-fixup-2: discriminating shadowing test

Addresses the residual non-blocking concern on PR #17:
`parameter_shadowing_top_level_fn_does_not_taint_caller` was a
tautology under Stage 5 (typecheck rejects non-IO effects, so
`dangerous` ended up Native and the test passed under either the
precise edge logic or the old name-only heuristic).

Reviewer suggested three options; this commit applies option 3
(synthetic-test span uniqueness):

- New `unique_span()` test helper: atomic-counter span generator
  that produces a fresh span per call. Used only by tests that need
  to disambiguate individual `Expr::Ident` occurrences via their
  span. The constant `span()` helper is unchanged for the rest of
  the synth tests.

- New `synth_program_with_calls(items, calls)` test helper that
  accepts a caller-supplied `call_site_instantiations` map. The
  auto-built map produced by `build_synthetic_calls_map` is a
  name-only heuristic, not env-precedence-aware; tests modeling
  shadowing supply their own map manually.

- Rewrote `parameter_shadowing_top_level_fn_does_not_taint_caller`
  as a synthetic program: `dangerous` is intrinsically CPS (Raise
  effect), `unrelated` is Native, `caller(dangerous: Int)` body has
  a param-ref `dangerous` Ident (unique span deliberately omitted
  from the calls map) AND a real call to `unrelated()` (Ident span
  recorded in the calls map). Asserts `caller` Native.

  This is now genuinely discriminating: under the old name-only
  heuristic, caller would acquire a spurious edge to `dangerous`
  (CPS) and falsely classify CPS; under the precise calls-map-driven
  edge logic, the shadow Ident produces no edge and caller stays
  Native.

Also addresses the non-blocking nit on `bridge_callee_of`: replaced
the misleading "skip when already-decided" hint with an honest
comment explaining we always compute the map (the per-node reason
loop needs it for both intrinsic-present and intrinsic-absent SCCs)
and that pure leaf SCCs naturally hit the empty case without an
explicit skip.

Pod-verify green; existing 32+1 unit tests pass.
boldfield added a commit that referenced this pull request Apr 26, 2026
Addresses all critical/important/minor items from PR #21 review across
the structural review pass and the context-aware must-fix pass.

## Critical / GC rooting (M1, Critical #1, #2)

- Add `GC_add_roots`, `GC_remove_roots`, `GC_gcollect`,
  `GC_register_my_thread`, `GC_unregister_my_thread`,
  `GC_allow_register_threads` to gc.rs externs.
- `sigil_gc_init` registers the calling thread's `HANDLER_STACK` cell
  and `ARENA` storage range with `GC_add_roots`. `cfg(not(test))` so
  test threads don't auto-register (auto-registration leaks ranges
  across cargo test's per-test thread teardowns).
- `register_*_for_calling_thread` / `unregister_*_for_calling_thread`
  pairs in handlers.rs and arena.rs return the registered range so
  test infrastructure can symmetrically remove it.
- `GcThreadEnrolment` in test_support.rs is now an RAII guard that
  enrols the thread, registers both TLS roots on Acquire, and removes
  both roots + unregisters the thread on Drop.
- gc.rs comment at the GC_malloc/atomic selector clarifies the
  bitmap's v1 effect (binary signal) vs the per-bit precision being
  v2-forward-compat metadata.
- arena.rs and handlers.rs module docs gain "GC reachability" sections
  documenting the rooting model + the conservative-scan pinning
  tradeoff for arena bytes.

## Critical #3: Vec::reserve panic across FFI

- arena.rs `ensure_capacity_or_abort` uses `try_reserve_exact` and
  aborts on Err rather than panicking. Backing storage type changed
  to `Vec<u64>` for guaranteed 8-byte alignment (Critical #5).

## Important fixes

- #4: `round_up_to_align` uses `checked_add`; aborts cleanly on overflow.
- #5: `Vec<u64>` backing — natural u64 alignment, no system-allocator
  dependency. Test asserts absolute 8-byte alignment of every return.
- #6: `sigil_handler_frame_new` explicitly zeros the variable-length
  arms region with `ptr::write_bytes` rather than relying on the
  Boehm-allocator-zero contract. Comment updated.
- #7 / M5: `sigil_perform` bound-checks `args_len + 2` against
  `MAX_INLINE_ARGS` at entry, naming `effect_id` / `op_id` in the abort
  message. `sigil_next_step_call` does the same against `arg_count`.
  Trampoline-side check kept as defense-in-depth.
- #8: `MAX_INLINE_ARGS = 32` promoted to `pub const` at the
  `handlers` module top. The trampoline's stack-resident `args_buf`
  uses the same constant.
- #9: arena overflow has a `#[ignore]`-d test (`arena_overflow_aborts`)
  + a test-only `force_capacity_for_test` hook for manual verification
  via `cargo test -- --ignored`.
- #10: new `perform_walks_three_deep_prev_chain_to_match` test —
  3 frames pushed, `sigil_perform` walks past 2 unrelated outer
  frames to reach the deepest matching frame, depth-counter delta = 3.
- #11: marked `sigil_arena_reset` and `sigil_handle_pop` `unsafe extern "C"`
  for FFI consistency. Test-side callers updated.
- #14: `sigil_handle_push` debug-asserts `(*frame).prev.is_null()` to
  catch double-push at the push site, not later. `sigil_handle_pop`
  clears `prev` so repush of the same frame in a loop is supported.
- #15: `payload_words` cast in `sigil_handler_frame_new` uses
  `try_into` with abort fallback documenting the invariant.
- #16: INVARIANT comment near `Header::new` call about Boehm consuming
  bitmap-only.

## M2: GC stress tests

Three new tests verifying the rooting contract holds under forced GC:

- `handler_frame_survives_forced_gc_while_pushed` — push frame, alloc-spam
  to overwrite stack aliases, GC_gcollect, perform succeeds.
- `closure_in_handler_arm_slot_survives_gc` — closure in arm 0's
  closure_ptr slot survives GC because the frame is rooted via
  HANDLER_STACK and the bitmap selects GC_malloc (conservative scan).
- `closure_in_next_step_survives_gc_via_arena_root` — closure stored
  in arena via NextStep::Call survives GC because the arena range is
  rooted.

All three are `#[ignore]`-gated with explanatory text — Boehm thread
enrolment composes poorly with cargo test's per-test thread teardowns
even with explicit `GC_unregister_my_thread`. Each passes in
isolation; manual verification via `cargo test -- --ignored survives_gc`.

## M3: MAX_HANDLER_ARMS bumped to 14

Off-by-one in the original doc-comment at `handlers.rs:79-83` (claimed
"bit 31 corresponds to arm 13"). Bumped cap to 14 so `i ∈ [0, 13]`
fully utilises the 32-bit bitmap with bit 31 set at i=13. Updated the
doc-comment and the deviation entry to match.

## M4: Boundary-arity test

`handler_frame_dispatch_at_max_arm_count` allocates a frame with
`MAX_HANDLER_ARMS` op-arms, sets every arm to a real handler fn,
pushes, performs against the LAST arm (op = MAX_HANDLER_ARMS - 1),
and verifies dispatch succeeds. Exercises the full alloc + bitmap
+ perform path against the cap.

## M6: Counter semantics documented

Module-level docs clarify `HANDLER_WALK_COUNT` increments per perform
*attempt* (regardless of match), `HANDLER_WALK_DEPTH_SUM` sums frames
inspected including matching frame on a hit OR full stack depth on
unhandled-effect abort. Average walk depth = SUM / COUNT.

## M7: Arena reentrancy contract documented

Module-level docs spell out the `RefCell` no-reentrancy invariant: the
trampoline upholds it by reading dispatch info into stack locals and
calling `sigil_arena_reset` BEFORE invoking the carried `cps_fn`. Plan
B v1 has no path that nests `sigil_arena_alloc` calls within a single
trampoline iteration; codegen (Task 55) preserves this by emitting a
single `NextStep` allocation per cps_fn return.

## M8: Deviation entries updated

- New `[DEVIATION Task 56] Runtime TLS roots: register/unregister via
  Boehm GC_add_roots` covering the rooting contract + test-mode
  caveat + conservative-scan pinning tradeoff.
- New `[DEVIATION Task 56] MAX_INLINE_ARGS = 32 cap with bound-check
  at perform site` documenting the cap, where it's checked, and the
  Task 55 codegen impact.
- New `[DEVIATION Task 56] Vec::reserve panic-on-OOM does NOT cross
  the FFI boundary` documenting the try_reserve_exact swap.
- New `[DEVIATION Task 56] Arena alignment via Vec<u64> backing
  storage` documenting the alignment guarantee.

## M9: Tagged-vs-raw closure point updated

The `[VERIFICATION DEBT] Tagged-vs-raw ABI contract enforcement`
entry's closure point updated from "Task 56" to "Task 55 (when
codegen lowers the first Int-typed user arg into args_buf)" with a
one-line rationale: Task 56's runtime structs hold `*mut u8` pointers
and raw `u64` slots, not Int-typed slots; the newtype contract lands
when codegen does.

## Arena reset zero-fill

Added: `sigil_arena_reset` zeros the `[start, start + len*8)` region
before clearing `len`. Required because `GC_add_roots` covers the
full arena capacity range — without zeroing, stale pointers from
prior iterations or initial garbage from `try_reserve_exact` alias
freed Boehm blocks during conservative scan, segfaulting collections.
Cost is a `len*8`-byte memset per reset, typically tens of bytes per
trampoline iteration.

## Test deltas

61 passing + 4 ignored (arena_overflow_aborts + 3 GC stress tests)
prior pass-rate stays green. Pod-verify clean.
boldfield added a commit that referenced this pull request Apr 26, 2026
PR #21 re-review item A flagged that the three GC stress tests
(`handler_frame_survives_forced_gc_while_pushed`,
`closure_in_handler_arm_slot_survives_gc`,
`closure_in_next_step_survives_gc_via_arena_root`) were `#[ignore]`-d
because Boehm thread enrolment / re-enrolment doesn't compose with
cargo test's per-test thread teardowns. The tests are exactly the
ones that prove the rooting fixes from Critical #1 and #2 actually
work, so leaving them gated meant the load-bearing rooting contract
was implemented but not CI-validated. A future refactor breaking the
rooting could ship green.

This commit removes the `#[ignore]` and adopts the re-review's
"out-of-process test runner" suggestion (Option 1 from the review).
Each `#[test]` body checks an env var: when unset (outer mode), it
re-execs the test binary with `--exact handlers::tests::<name>
--nocapture` and `SIGIL_GC_STRESS_INNER=1`, then asserts the child
exited zero. When set (inner mode), it runs the actual stress body.

The subprocess inherits a pristine Boehm state. Only one test runs
per child process (filtered via `--exact`), so the child's
`GcThreadEnrolment` is acquired and dropped exactly once, and the
process exits before any stale registration can leak to a sibling
test. Cost: ~30ms process spawn × 3 tests = ~90ms added to
`cargo test`. Acceptable.

## Implementation

`runtime/src/handlers.rs`:
- New `STRESS_INNER_VAR` const, `in_stress_subprocess()` helper, and
  `run_stress_in_subprocess(test_name)` outer-mode wrapper. Errors
  from `current_exe()` / `Command::status()` route through the
  project's `eprintln! + std::process::abort()` pattern (clippy
  disallows `unwrap`/`expect`/`panic!` per `clippy.toml`).
- Three `#[ignore]` attributes removed. Each test gains a four-line
  prologue that delegates to `run_stress_in_subprocess` in outer
  mode and falls through to the real body in inner mode.

`PLAN_B_DEVIATIONS.md`:
- Verification block of the "Runtime TLS roots" deviation entry
  rewritten to reflect the subprocess pattern. Closure point flipped
  from "open for the test-harness composition issue" to "closed at
  PR #21 merge" — the test-harness issue is sidestepped, not solved,
  but the deviation's actual subject (production rooting) is closed.

## Verification

`cargo test -p sigil-runtime --lib`: 64 passed; 0 failed; 1 ignored
(only `arena_overflow_aborts` remains ignored, which is intentional —
it asserts an abort path the harness can't directly observe).

`cargo clippy -p sigil-runtime --all-targets -- -D warnings`: clean.

`cargo fmt --all -- --check`: clean.

Three subprocess invocations visible in test output, each running its
single inner-mode test successfully:

  test result: ok. 1 passed; 0 failed; 0 ignored; ... (×3)
  test result: ok. 64 passed; 0 failed; 1 ignored; ... (parent)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
boldfield added a commit that referenced this pull request Apr 26, 2026
Addresses PR #24 review feedback at cb050b6.

REVIEW BUG #1 (walker accepts ClosureEnvLoad):
The walker runs against the post-`closure_convert` AST.
When a `Handle` lives inside a `Lambda`, closure_convert
recurses into op-arm bodies and rewrites every captured-name
`Ident` into a `ClosureEnvLoad`. The walker's existing
`arm_body_walk` arm for `ClosureEnvLoad` returned `None`
(treated as benign alongside literals), so the rewritten
capture slipped past the Phase 4c "no outer-scope captures"
gate. At runtime the synthetic CPS arm fn (with
`closure_ptr = null` in Phase 4c) would null-deref when
lowering the ClosureEnvLoad to `load(closure_ptr + 16 +
8*index)`.

Fix: add an `Expr::ClosureEnvLoad { name, .. }` arm to
`arm_body_walk` returning the same Phase-4d-pointing capture-
rejected diagnostic. New test
`arm_inside_lambda_captures_outer_via_closure_env_load_is_rejected_at_codegen`
covers the path; the existing
`arm_captures_outer_scope_is_rejected_at_codegen` test only
covers the top-level-fn-param capture path which closure_convert
leaves as raw `Ident`.

REVIEW #2 (assert_eq vs debug_assert_eq asymmetry):
Phase 4c deviation entry says the arm-side fallthroughs
"mirror Phase 4b's widening-fallthrough hardening" but two
of the three fallthroughs used `debug_assert_eq!`. Changed
both `compiler/src/codegen.rs:2048` (arg unpack pointer
fallthrough) and `compiler/src/codegen.rs:2114` (body widen
pointer fallthrough) to `assert_eq!` so a future floats /
32-bit-target regression panics in *both* dev and release
on either side of the FFI boundary. Without this fix the
arm-side would silently miscompile while the perform-side
panicked — the worst failure mode (asymmetric hardening).

REVIEW #3 (silent I64 fallback in lower_perform_non_io_to_value):
The two registry-lookup fallbacks (effect missing, op missing)
silently returned `I64`. Per surrounding pre-pass invariants
these are unreachable from a clean typecheck (E0042 / E0043
catch them). Replaced with `unreachable!` so a typecheck
regression that left the registry incomplete surfaces at
codegen time instead of silently emitting wrong-typed
Cranelift values for non-Int return types.

REVIEW MF1 (Char arg-readback test missing):
Added `arm_reads_char_arg_branches_on_codepoint`. Bool (test
2a) exercises I8 width of the widen/ireduce path; this test
exercises I32 width. Distinct Cranelift instructions
operating on distinct widths — Bool-only coverage left I32
verifier-checked but not value-checked. Closes part of
the Phase 4b deviation entry's pre-registered acceptance
precondition's "Bool / Char arg readback" item.

REVIEW MF2 (perform-side narrow dead-coded by Int/String-only
return tests):
Added `perform_side_narrow_to_bool_value_checked` and
`perform_side_narrow_to_char_value_checked`. All preconditon-
matrix tests declare ops returning `Int` (or `String`) so
the perform-side narrow always took the `return_ty == I64`
no-op branch. The new `ireduce(return_ty, widened)` was
verifier-checked but not value-checked. Op declared
`(Int) -> Bool` / `(Int) -> Char` forces the I8 / I32
narrow branches; arm body returns a literal of that type;
surrounding code observes the value through `if b` /
`c == 'Y'` equality. Bool covers I8, Char covers I32 —
symmetric to MF1's split on the perform→arm widen leg.

Pod-verify clean.
boldfield added a commit that referenced this pull request Apr 26, 2026
…cking (#24)

Lifts the IntLit-only arm body restriction. The synthetic CPS arm fn now lowers its body through a real `Lowerer` instance with op-args bound from `args_ptr` at fn entry; arm bodies can use any expression over op-args + globals (top-level fns, ctors, builtins) — arithmetic, conditionals, calls, IO performs.

Closes the **Phase 4b deviation entry's pre-registered acceptance precondition** for arg-content verification (Int / Bool / Char / String / multi-arg-in-declared-order).

Code changes:
- `HandlerArmSynth` extended with `arg_names`, `arg_types`, `body_ty` resolved from the matching `EffectOp` declaration via `cranelift_ty_for_type_expr`.
- Pre-pass `collect_handle_arms_in_*` family threaded an `ArmSynthCtx` (`cps_arm_sig` + `op_ids` + `effects` + `pointer_ty`).
- Synthetic-fn definition pass at the bottom of `emit_object` rewritten: each arm fn declares its own per-fn FFI refs (mirrors the user-fn loop because Cranelift's `declare_func_in_func` is per-fn-scoped); op-args are loaded from `args_ptr` at fn entry (`load.i64` + `ireduce` to declared type for narrower Bool/Byte/Unit/Char), bound in a fresh `Lowerer` env. Body lowered via `lower_expr`; result widened to I64, wrapped via `sigil_next_step_done(value)` + return.
- `lower_perform_non_io_to_value` mirror-narrows `sigil_run_loop`'s I64 result back to the op's declared return type.
- `unsupported_handle_construct` walker drops the IntLit-only rejection; replaces with three new gates via `arm_body_phase_4c_violations`: scope-tracking walk checking for k-name references, outer-scope captures (incl. `ClosureEnvLoad` after closure_convert), and nested Lambda/ClosureRecord shapes. Walker collects globals (top-level fn names + ctor names + `int_to_string` builtin) once at entry.

Review-fixup commit `c75ef85` addressed: closure-env-load walker bug (handle inside lambda captures slipping past gate; null-deref at runtime), `debug_assert_eq!`/`assert_eq!` asymmetry on arm-side fallthroughs, silent I64 fallback in `lower_perform_non_io_to_value` (replaced with `unreachable!`), Char arg-readback test, perform-side narrow value-checks (Bool + Char return types).

CI fixups `97b5127`, `cb050b6`, `f69b562` addressed Sigil v1 surface syntax constraints (no `let _ = ...`; Block not parseable as arm body expression; `TypeExpr::Fn` deferred so lambdas must IIFE).

Tests: 9 new e2e tests (4 Phase 4b acceptance precondition + 2 perform-side narrow value-checks + 1 Char arg-readback + 1 Bug #1 closure-env walker + 3 bonus richer-shape tests; minus 1 stale Phase 3b IntLit-only-rejection test deleted).
boldfield added a commit that referenced this pull request May 3, 2026
…riants, field promotion

Addresses both review comments on PR #81:

## Code changes

1. **Tighten `arm_body_needs_branched_routing` to mirror lowering
   capability** (Review 1 main, Review 2 #1 partial).
   Earlier detector used transitive `expr_has_k_call` scan that
   returned true for k-calls anywhere in a sub-tree (let-RHS,
   binary operands, perform args, etc.). Lowering only handles k
   at branch-tail / scrutinee / Block-tail positions; previous
   detector relied on the walker's non-tail-k rejection to catch
   unsupported shapes before codegen. Architecturally fragile —
   future walker relaxation would silently expose `unreachable!`
   panics. New detector's recursion shape mirrors
   `lower_arm_body_to_next_step` exactly. Deleted obsolete
   `expr_has_k_call` and `block_has_k_call` helpers; added
   `expr_has_branchable_k` and `block_tail_has_branchable_k`.

2. **Promote `next_step_discharged_ref` to a Lowerer field**
   (Review 2 #5 plumbing). Was threaded as a parameter through
   3 helper methods while sibling FuncRefs (`next_step_call_ref`,
   `next_step_args_ptr_ref`, `continuation_identity_ref`,
   `run_loop_ref`) lived as Lowerer fields. Promoted for
   symmetry; saves parameters at the synth-arm-fn body emit
   call site and the helpers' recursive callers. Threading
   added at every Lowerer construction site (9) + 2 destructuring
   bind-renames (already-bound at 5 others).

## Doc-comment changes

3. **Explicit invariant doc on `lower_arm_body_to_next_step`**
   (Review 2 #3). The discharged-leaf catch-all (`_ =>` arm)
   assumes the walker rejects k-calls in subexpressions reached
   by `lower_expr`. Future walker relaxation would surface as
   hard `unreachable!` panic at lower_call's "no signature
   source registered" site. Documented as load-bearing
   invariant. Also documented the k-as-scrutinee multi-shot
   dependency on Task 117 RELINK_STACK substrate.

4. **Sync "two ways" -> "three ways"** on
   `lower_synth_arm_k_call_as_value` (Review 2 #5 nit). Doc
   comment said "two" but listed three differences from
   `lower_k_pair_call`.

5. **FIXME on `current_fn_name` resolved-Ty lookup miss**
   (Review 1 minor + Review 2 #2). The per-clone key
   `(current_fn_name, scrut_span)` essentially never hits in
   synth arm-fn Lowerers (current_fn_name = String::new());
   span-only fallback saves non-generic surfaces. Generic
   dischargers with k-as-scrutinee untested. Same fragility
   in `lower_match` (pre-existing).

## DEVIATIONS updates

6. **Performance note accuracy** (Review 2 #4). Earlier note
   undersold the cumulative-per-recursion-depth cost of nested
   `sigil_run_loop` drives. Updated to reflect O(N) NextStep
   allocs + O(N) host stack frames for N-candidate recursive
   Choose. Sudoku already uses Task 117's binary-choose 2-let
   chain; not on Task 118 / Stage 13 acceptance gate.

7. **Pre-existing v1 limitations surfaced by review** added as
   a new sub-section in `[DEVIATION Task 118]`:
   - Outer-`k` reference inside nested-handle inner-arm body
     (walker accepts; codegen would panic at inner arm fn's
     lower_call). Pre-existing — Stage 6.8 Task 107
     `arm_k_pair_captures` handles k-capture-by-lifted-lambda,
     not k-capture-by-nested-handle-inner-arm. New tighter
     detector intentionally doesn't scan inner handle op_arms
     (would over-route).
   - Generic dischargers with k-as-scrutinee untested.

## Verified locally

- pod-verify clean (cargo check + clippy + fmt + runtime lib tests)
- 117/117 codegen lib tests pass

CI to confirm e2e tests on both hosts.
boldfield added a commit that referenced this pull request May 3, 2026
…ng path (#81)

* [Plan D Task 118] Drop walker rejection of conditional/branched k-call (incomplete lift)

Plan D Task 118 lifts `arm_body_walk`'s rejection of:
- `Eff.op(k) => if cond { k(x) } else { k(y) }`
- `Eff.op(k) => match s { A => k(x), B => k(y) }`
- `Eff.op(k) => if cond { k(x) } else { 42 }` (one-branch discharge)

This commit applies the MINIMAL walker change (propagating `tail`
into If/Match branch tails so the Expr::Call `!tail` reject no
longer fires) plus the four targeted e2e tests + a pod-safe lib
unit test.

## Status: incomplete — codegen side not yet implemented

The plan body's premise that this lift is "primarily a removal of
walker rejections + verification that step 117's machinery
generalizes" is structurally insufficient. Static analysis:

- The synth arm-fn Lowerer sets `arm_k_pair_self = None`
  (codegen.rs:9557). The k-pair dispatch in `lower_call`
  (codegen.rs:15120) is gated on `arm_k_pair_self.is_some()` —
  fires only for lifted-lambda synth fns (Plan B' Stage 6.8 Task
  107 machinery), not synth arm-fn bodies.
- `arm_body_tail_is_k_call` (codegen.rs:4792) only recurses
  through `Expr::Block.tail`, not If/Match branches; the synth-
  pass routes the new shapes into the "Non-tail / no-`k` path"
  which calls `lower_expr` on the whole if/match body.
- `lower_expr` of `k(arg)` inside a branch falls through every
  arm of `lower_call` and hits `unreachable!("codegen invariant:
  walker accepted callee shape but no signature source
  registered")` at codegen.rs:16072.

The four e2e tests are expected to fail at codegen-time
(unreachable!) on this commit. The pod-safe lib test
`task_118_walker_accepts_conditional_k_call_inside_if` confirms
the walker change is correct in isolation.

## What's needed for completion

Per the plan body: "Each branch's k-call site gets its own
post_arm_k synth-cont; the surrounding match/if lowers to the
standard branch-with-tail discipline." Concretely:

1. Generalize the synth arm-fn body's k-call dispatch — either
   set `arm_k_pair_self = Some(...)` on synth arm-fn Lowerers
   (sourcing k_closure/k_fn from arm_k_closure_v/arm_k_fn_v
   instead of closure record loads), or refactor
   `lower_k_pair_call` to accept the source as a parameter.
2. Allocate a post_arm_k synth-cont per branch's k-call site
   (mirrors Slice C's chain machinery, spread across branch
   tails).
3. Lower if/match as separate branch blocks each producing a
   NextStep ptr, merge in a join block returning the result.

Estimate: ~150-300 lines of codegen.rs additions. Substantially
exceeds the ~30-line ceiling the task instructions specify as
the re-scope trigger. Surfaced for Stage 13 review.

* [Plan D Task 118] Revert minimal walker change; mark e2e tests #[ignore]

Empirical CI on the prior commit (PR #81 first run):
- 4 task_118_* e2e tests panicked at codegen.rs:16070
  `unreachable!("codegen invariant: walker accepted callee shape but
  no signature source registered — callee = Ident(\"k\", ...)`
- Same panic pattern across all 4 shapes (if-k, match-k,
  one-branch-discharge, recursive-Choose).
- 2 pre-existing rejection-pin tests broke at the same panic site
  (their guard message was the walker rejection that no longer
  fires post-walker-change):
  - `arm_uses_k_inside_if_branch_is_rejected_pointing_at_phase_4e`
  - `arm_uses_k_inside_match_arm_is_rejected_pointing_at_phase_4e`
- task_117_smoke_gate_sudoku_solves_4x4 still passes (the walker
  change didn't regress Sudoku).

Disposition: revert the walker change; mark the 4 task_118_* e2e
tests `#[ignore]` with an explicit message pointing at this PR's
surface. Pre-existing rejection-pin tests pass unchanged with the
revert. Lib unit test renamed to
`task_118_walker_rejects_conditional_k_call_inside_if_pre_lift`
and asserts the rejection (matches the pre-lift state).

PR is now a clean **surface artifact** for the architectural
re-scope decision:
- Test scaffolding (4 e2e + 1 unit) ready for the lift PR to
  un-ignore + flip to acceptance.
- Empirical evidence in commit history that minimal walker
  removal is structurally insufficient.
- Architectural recommendation in PR body.

* [Plan D Task 118] Conditional/branched k-call lift via branched-routing path

Closes Plan D Task 118 (architectural slice option C). User-authorized
this session to bundle the architectural work into this PR.

## What ships

A new "branched-routing" path in the synth arm-fn body lowering
that recursively descends the body's tail expression, emitting one
*NextStep ptr per leaf:

- k(arg) tail call → NextStep::Call (Slice A trailing-pair convention)
- non-k tail value → NextStep::Discharged
- If: lower cond, branch + recurse into both branches, merge in a
  join block (block-param of pointer_ty)
- Match: lower scrutinee (special-cased for k-as-scrutinee), branch
  per arm, recurse arm bodies via the helper, merge

For k-as-scrutinee (`match k(arg) { ... }`), a new helper
`lower_synth_arm_k_call_as_value` drives a nested sigil_run_loop on
the k-call from inside the synth arm fn; the result is narrowed to
the match scrutinee's Cranelift type (via match_scrut_tys_resolved /
match_scrut_tys side-tables). Differs from the lifted-lambda
`lower_k_pair_call` in three ways: (a) sources k_closure/k_fn from
the synth arm-fn's loaded args_ptr trailing slots, no closure
record indirection, (b) does NOT re-push the originating handler
frame (synth arm-fn executes within the trampoline that's
processing the perform), (c) does NOT apply the return-arm wrap
(value consumed at source-language level as match scrutinee).

## Routing

Detector `arm_body_needs_branched_routing` gates entry. New
branched-routing path is mutually exclusive with the existing
Slice C / Slice B / tail-k / discharged paths — placed BEFORE
them in the synth arm-fn body emit's if/elif/else chain.

Walker (arm_body_walk) updates:
- Expr::If propagates `tail` into branch tails (was always false)
- Expr::Match propagates `tail` into arm bodies (was always false)
- Expr::Match scrutinee in tail context: when scrutinee is k(arg),
  walk only the arg in non-tail (skipping the regular Expr::Call
  walker which would reject k-as-callee in non-tail)

## Tests

- 4 e2e tests (un-ignored): conditional k inside if, conditional k
  inside match, k in one branch + discharge in other, recursive
  Choose first_choice 3-candidate
- 1 lib unit test flipped from rejection-pin to acceptance
- Pre-existing rejection-pin tests
  arm_uses_k_inside_if_branch_is_rejected_pointing_at_phase_4e and
  arm_uses_k_inside_match_arm_is_rejected_pointing_at_phase_4e
  removed (superseded by task_118_*)

## Verified locally

- pod-verify clean (cargo check + clippy + fmt + runtime lib tests)
- 117/117 codegen lib tests pass

CI to confirm e2e tests on both hosts.

* [Plan D Task 118] Update PROGRESS + DEVIATIONS for branched-routing path

- PLAN_D_PROGRESS.md: Task 118 status flipped from todo to
  done-pending-ci with the implementing-mechanism summary +
  pointer to [DEVIATION Task 118].
- PLAN_D_DEVIATIONS.md: new entry [DEVIATION Task 118]
  documenting the empirical surface, the architectural mechanism
  (branched-routing path), the differences from `lower_k_pair_call`,
  test coverage, closure points closed, and the three implementing
  commits.

* [Plan D Task 118] Address PR #81 reviews — tighten detector, doc invariants, field promotion

Addresses both review comments on PR #81:

## Code changes

1. **Tighten `arm_body_needs_branched_routing` to mirror lowering
   capability** (Review 1 main, Review 2 #1 partial).
   Earlier detector used transitive `expr_has_k_call` scan that
   returned true for k-calls anywhere in a sub-tree (let-RHS,
   binary operands, perform args, etc.). Lowering only handles k
   at branch-tail / scrutinee / Block-tail positions; previous
   detector relied on the walker's non-tail-k rejection to catch
   unsupported shapes before codegen. Architecturally fragile —
   future walker relaxation would silently expose `unreachable!`
   panics. New detector's recursion shape mirrors
   `lower_arm_body_to_next_step` exactly. Deleted obsolete
   `expr_has_k_call` and `block_has_k_call` helpers; added
   `expr_has_branchable_k` and `block_tail_has_branchable_k`.

2. **Promote `next_step_discharged_ref` to a Lowerer field**
   (Review 2 #5 plumbing). Was threaded as a parameter through
   3 helper methods while sibling FuncRefs (`next_step_call_ref`,
   `next_step_args_ptr_ref`, `continuation_identity_ref`,
   `run_loop_ref`) lived as Lowerer fields. Promoted for
   symmetry; saves parameters at the synth-arm-fn body emit
   call site and the helpers' recursive callers. Threading
   added at every Lowerer construction site (9) + 2 destructuring
   bind-renames (already-bound at 5 others).

## Doc-comment changes

3. **Explicit invariant doc on `lower_arm_body_to_next_step`**
   (Review 2 #3). The discharged-leaf catch-all (`_ =>` arm)
   assumes the walker rejects k-calls in subexpressions reached
   by `lower_expr`. Future walker relaxation would surface as
   hard `unreachable!` panic at lower_call's "no signature
   source registered" site. Documented as load-bearing
   invariant. Also documented the k-as-scrutinee multi-shot
   dependency on Task 117 RELINK_STACK substrate.

4. **Sync "two ways" -> "three ways"** on
   `lower_synth_arm_k_call_as_value` (Review 2 #5 nit). Doc
   comment said "two" but listed three differences from
   `lower_k_pair_call`.

5. **FIXME on `current_fn_name` resolved-Ty lookup miss**
   (Review 1 minor + Review 2 #2). The per-clone key
   `(current_fn_name, scrut_span)` essentially never hits in
   synth arm-fn Lowerers (current_fn_name = String::new());
   span-only fallback saves non-generic surfaces. Generic
   dischargers with k-as-scrutinee untested. Same fragility
   in `lower_match` (pre-existing).

## DEVIATIONS updates

6. **Performance note accuracy** (Review 2 #4). Earlier note
   undersold the cumulative-per-recursion-depth cost of nested
   `sigil_run_loop` drives. Updated to reflect O(N) NextStep
   allocs + O(N) host stack frames for N-candidate recursive
   Choose. Sudoku already uses Task 117's binary-choose 2-let
   chain; not on Task 118 / Stage 13 acceptance gate.

7. **Pre-existing v1 limitations surfaced by review** added as
   a new sub-section in `[DEVIATION Task 118]`:
   - Outer-`k` reference inside nested-handle inner-arm body
     (walker accepts; codegen would panic at inner arm fn's
     lower_call). Pre-existing — Stage 6.8 Task 107
     `arm_k_pair_captures` handles k-capture-by-lifted-lambda,
     not k-capture-by-nested-handle-inner-arm. New tighter
     detector intentionally doesn't scan inner handle op_arms
     (would over-route).
   - Generic dischargers with k-as-scrutinee untested.

## Verified locally

- pod-verify clean (cargo check + clippy + fmt + runtime lib tests)
- 117/117 codegen lib tests pass

CI to confirm e2e tests on both hosts.
boldfield added a commit that referenced this pull request May 3, 2026
…y_name, comment cleanups, invariant hardening

Addresses inline review #4215867100 on PR #83:

## Mechanical fixes

1. **#1 debug_assert! for MAX_INLINE_ARGS** at both wrapper-Call emit sites (helper-body Phase 6 + Middle-step CallCps). Mirrors the perform-site debug_assert at `lower_perform_to_value` (codegen.rs:14586). Defense-in-depth — runtime `sigil_next_step_call` aborts on overflow; debug builds catch it before linking.

2. **#2 + #3 hoist fns_by_name + extract is_tail_perform_cps_user_fn helper.** Both `compute_user_fn_abi` (per-fn loop) and the synth-cont allocation site previously rebuilt the fns lookup each iteration → O(n²) over program items. Hoisted to `build_fns_by_name(&ColoredProgram)` called once at `emit_object` entry; threaded as `&BTreeMap<&str, &FnDecl>` through `compute_user_fn_abi`'s new parameter. The 18-line closure construction extracted into top-level fn `is_tail_perform_cps_user_fn` (callee_name, fns_by_name, colored, ctors). Both call sites now invoke the same helper.

3. **#5 unreachable!() at silent fall-through** in `compute_user_fn_abi`'s per-stmt extraction. The classifier guarantees every stmt is Stmt::Let with Perform OR Call(Ident) value; the catchall arm now panics (mirrors the pre-pass site's existing discipline). Was a silent skip that dropped the binding.

## Doc fixes

4. **#4 prior_was_call_cps comment.** Replaced the thinking-out-loud derivation ("Wait, this is off-by-one. step_0 fires AFTER ... hmm actually step_i fires AFTER ...") with a one-sentence summary: `prior_was_call_cps for step_i is true iff steps[i] is CallCps — step_i's synth-cont fires after steps[i]'s dispatch (helper-body for i=0; step_{i-1}'s Middle for i>0)`.

5. **#6 stale 1099/1108 comment.** The comment at the helper-body / Middle-step CallCps emit sites referenced the reverted-by-7b56eec attempt at unconditional OUTER_POST_ARM_K push and the resulting test failures. Both task_78_5_g4_approach6_risk3_* tests pass post-classifier-restriction (commit f5a2618) — sub_cps_fn falls back to Sync, lower_call's Cps branch handles via SAVE+CLEAR+RESTORE. Comment rewritten to point at [DEVIATION Task 112b].

6. **#7 BODY_RETURN_ARM_STACK leak-on-arm-abandon assumption.** Added a paragraph at the chain step entry POP site documenting that the POP relies on chain progression: if a discharge-with-lambda arm captures `k` into a lambda but never invokes it, the (null, null) entry stays on stack. Phase 4g treats it as plain Done (no return-arm wrap) — correct-by-coincidence. The discharge-with-lambda handlers in std/state + std/random + std/clock all invoke `k` (verified); assumption holds for the test corpus today.

7. **#9 wrapper-of-wrapper recursion termination note.** Added doc-comment paragraph at `is_tail_perform_cps_user_fn` noting that tail-perform bodies can't themselves be wrappers (no Expr::Call; expr_is_pure rejects non-ctor calls in args), so the single-hop lookup is total — no recursion-termination concern.

## Out-of-scope

- **#8 (compute_user_fn_abi + emit_object re-walk body)** — pre-existing structural waste flagged by the reviewer as out-of-scope follow-up. Not addressed in this commit.

## Verified locally

- pod-verify clean
- 117/117 codegen lib tests pass

CI to confirm task_112_*, task_78_5_g4_approach6_risk3_*, and stdlib regressions all stay green.
boldfield added a commit that referenced this pull request May 3, 2026
…-let-yield wrapper deferred to Task 112b (#83)

* [Plan D Task 112] Wrapper-fn-frame composition fix — chained-let-yield classifier extension + Sync→Cps interop k-pair threading

Closes [DEVIATION Task 72] constraint #3 (wrapper-fn-frame composition gap)
deferred during Plan B'/Plan C and again during Plan D execution. The
deferral chain assumed Task 117's substrate would unblock the lift;
empirical architectural read (this session's preceding investigation)
showed the surfaces are disjoint and Task 112 needs its own architectural
slice.

## Mechanism (Candidate (a))

Extend the chained-let-yield classifier to accept `let _ = wrapper_call(args)`
let-RHS shapes (in addition to the existing `let _ = perform Eff.op(args)`)
when `wrapper_call`'s callee is a Cps-color top-level user fn. The body
then classifies as Cps and gets a synth-cont chain; the helper-body and
Middle-step emit thread the chain's k-pair through the wrapper boundary
via the trailing-pair args-buffer convention.

## Codegen sites changed

- `is_simple_chained_let_yield_then_pure_tail_body` (codegen.rs:19277):
  accepts Expr::Call let-RHS when callee is Cps-color top-level fn; takes
  new `is_cps_user_fn` lookup parameter.
- `compute_user_fn_abi` (codegen.rs:189): supplies `colored.needs_cps_transform`
  as the lookup; updated K+N captures-cap check to extract
  ChainedNextStep enum.
- `walk_collect_captures` (codegen.rs:3378): descends into Expr::Call
  args (was a defensive skip pre-Task-112).
- `collect_chained_synth_cont_captures` (codegen.rs:2922): iterates over
  ChainedNextStep enum (was &[PerformExpr]) — walks Perform args OR Call
  args per step kind.
- `ChainedNextStep` enum: new sum type with `Perform(PerformExpr)` and
  `CallCps { callee_name, args }` variants; replaces `next_perform:
  PerformExpr` in `ChainStepRole::Middle`.
- Pre-pass per-stmt loop (codegen.rs:7460-7573): extracts ChainedNextStep
  per step (Perform or CallCps).
- Helper-body Phase 6 emit (codegen.rs:8785-9220): branches on
  body_first_step kind. Perform → existing sigil_perform call. CallCps →
  resolves callee's func_addr from user_fns, lowers args, packs args +
  (k_closure_loaded, k_fn_loaded) into trailing slots via
  k_closure_offset / k_fn_offset, builds NextStep::Call.
- Middle-step emit (codegen.rs:11898+): branches on next_step kind.
  Same Perform-vs-CallCps split, with the chain's NEXT step's
  closure_ptr / fn_addr as the trailing pair (instead of helper's k-pair
  loaded from args_ptr).

## Why it works (Candidate (a) over (b))

Initial architectural read recommended Candidate (b) — push to
OUTER_POST_ARM_K_STACK, emit NextStep::Call with (null, identity).
Closer analysis showed (b) fails for the discharge-with-lambda
shape: the lambda invocation goes through `lower_k_pair_call` which
reads k from the closure record and drives a NESTED run_loop;
multi-shot composition via OUTER_POST_ARM_K_STACK only routes the
OUTER trampoline's terminal Done, not the lambda's nested
invocation. Candidate (a) — direct k-pair threading via the args-
buffer trailing slots — composes uniformly: the wrapper's tail-
perform Cps body already loads its k-pair from args_ptr trailing
slots and forwards to its perform site. The arm captures the
chain's k-pair (NOT identity); lambda's `k(arg)` invokes the next
synth-cont. Same mechanism as inline-perform.

## Tests

- New self-contained e2e `task_112_wrapper_fn_frame_composition_state_set_get_returns_11`
  pinning the canonical `set 10, get, +1 = 11` shape.
- Sister tests:
  - `task_112_wrapper_chain_three_sets_then_get_returns_3` — chain length 4.
  - `task_112_wrapper_returns_binding_used_in_tail` — binding flows
    into non-trivial tail.
  - `task_112_mixed_inline_perform_and_wrapper_in_chain` — mixed
    Perform + Call let-RHS in the same chain.
- Un-ignored: `std_state_run_state_via_wrappers_pending_v2_wrapper_fn_frame_fix`
  (the original deferral test).
- Updated existing lib unit test
  `chained_captures_recurses_into_call_in_tail_post_task_112`
  (renamed from `..._does_not_recurse_into_call_in_tail`) to pin the
  new walker behavior.

## Verified locally

- pod-verify clean (cargo check + clippy + fmt + runtime lib tests)
- 117/117 codegen lib tests pass

CI to confirm e2e tests on both hosts.

* [Plan D Task 112] Fix CI failures — Risk 3 BODY_RETURN_ARM_STACK protection + OOB args buffer + test rename

Three fixes responding to CI on commit ac45a09:

## 1. OOB args buffer write (helper-body + Middle-step CallCps emit)

Previous emit passed `arg_count = user_arg_count` to
`sigil_next_step_call`. The runtime allocates `arg_count * 8` bytes
for the args buffer; for 0-user-arg wrappers (e.g., `random_int()`,
`now()`, `get_state()`), this allocated 0 bytes. Writing the
trailing-pair k_closure / k_fn at offsets 0/8 wrote into NULL
(`sigil_next_step_args_ptr` returns null when arg_count == 0) →
SIGSEGV.

Fix: pass `arg_count = user_arg_count + 2` (matches the synth-arm-fn
tail-k path's convention). The Cps callee's body ignores args_len
at runtime and uses the static user_arg_count from f.params.len(),
so this count is only consumed by the runtime arena allocator and
the trampoline's MAX_INLINE_ARGS check.

## 2. Risk 3 BODY_RETURN_ARM_STACK protection

`task_78_5_g4_approach6_risk3_*` tests broke because body_fn now
classifies as Cps (chained-let-yield with wrapper-Call let-RHS).
The chain emit returns `NextStep::Call(sub_cps_fn, ...)` to the
OUTER trampoline; sub_cps_fn's natural-exit emit then reads
BODY_RETURN_ARM_STACK top — which contains the OUTER body's
return-arm pair (pushed by main's `handle` expression). The
outer return arm wraps sub_cps_fn's value erroneously, producing
2100 instead of 1100.

`lower_call`'s Cps branch (codegen.rs:16294) handles this for
synchronous Sync→Cps interop by PUSHing (null, null) onto
BODY_RETURN_ARM_STACK before driving its nested `run_loop`,
POPping after. For Task 112's chain emit, no nested run_loop —
the Call returns to the OUTER trampoline, so PUSH/POP must be
async across two synth-cont fns.

Fix:
- Add `prior_was_call_cps: bool` to `ChainedLetBindStep` (set at
  pre-pass per chain step from `steps[step]`'s kind).
- PUSH (null, null) onto BODY_RETURN_ARM_STACK at the helper-body
  Phase 6 CallCps emit AND the Middle-step CallCps emit (before
  building NextStep::Call).
- POP at every chain step's body entry, gated on
  `prior_was_call_cps`.

## 3. `_` shadowing in new sister tests

Two new tests (`task_112_wrapper_chain_three_sets_then_get_returns_3`,
`task_112_mixed_inline_perform_and_wrapper_in_chain`) used `let _`
multiple times in the same body. resolve.rs doesn't catch `_`
shadowing (treats it as a regular identifier); typecheck's
env_insert debug_assert fires. Renamed to `_a`/`_b`/`_c`.

## Verified locally

- pod-verify clean (cargo check + clippy + fmt + runtime lib tests)
- 117/117 codegen lib tests pass

CI to verify regressions are gone + Task 112 tests pass.

* [Plan D Task 112] Add OUTER_POST_ARM_K_STACK chain-routing push + fix chain-of-3 test source

Two follow-on fixes for the 3 remaining CI failures on commit 8290719:

## 1. Add OUTER_POST_ARM_K_STACK chain-routing push at CallCps emit sites

For chained-let-yield Cps WRAPPERS (callee is itself a chained-let-yield
Cps fn whose body's first step is a perform — sub_cps_fn in the Risk 3
test), the wrapper IGNORES the trailing-pair k_pair from args_ptr (uses
its own internal chain pair instead). Without an OUTER_POST_ARM_K_STACK
push of the chain's next-step pair, the wrapper's natural-exit Done
terminates the chain prematurely — the wrap-handler fires on the
wrapper's value (Risk-3-like leak through the OUTER_POST_ARM_K
side-channel), producing 1099 / 1108 instead of 1100 / 1105.

Fix: at helper-body Phase 6 CallCps emit, PUSH (k_closure_loaded,
k_fn_loaded) onto OUTER_POST_ARM_K_STACK. At Middle-step CallCps emit,
PUSH (next_closure_ptr, next_step_fn_addr). The trampoline's Done
routing pops these and dispatches the chain's next step.

For tail-perform wrappers (set_state in the canonical state test) this
push is redundant — the trailing-pair k_pair forwarding handles
routing via the perform's arm — and the leaked push gets drained on
the Discharged terminal.

## 2. Fix chain-of-3 test source

`task_112_wrapper_chain_three_sets_then_get_returns_3` had `get_state()`
as the body's tail (impure call). The chained-let-yield classifier
requires a pure tail (`expr_is_pure` rejects non-ctor calls), so the
body fell back to Sync ABI and the chain-emit path never ran — state
threading went through three independent Sync→Cps interop wrappers,
each losing context. Rewrote to `let v: Int = get_state(); v` so the
body has 4 lets with a pure tail (Ident); the chained-let-yield
classifier accepts it, body classifies as Cps, and the chain-emit
path runs.

## 3. Bind `outer_post_arm_k_push_ref` in helper-body destructuring

The PerFnRefs destructuring at line 7949 had `outer_post_arm_k_push_ref:
_` (discard). The new helper-body Phase 6 CallCps emit needs this
FuncRef. Bound it (and the other 4 destructuring sites that DON'T need
it stay as `_` to avoid unused-variable warnings).

## Verified locally

- pod-verify clean
- 117/117 codegen lib tests pass

CI to verify the 3 failing tests now pass + no new regressions.

* [Plan D Task 112] Restrict classifier to tail-perform Cps wrappers; revert OUTER_POST_ARM_K push

Previous commit (7b56eec) added an unconditional OUTER_POST_ARM_K_STACK
push at CallCps emit sites to route the wrapper's natural-exit Done
back to the chain. This worked for chained-let-yield Cps wrappers
(Risk 3 shape) but caused re-dispatch abort for tail-perform Cps
wrappers (the routing pop dispatched the SAME chain step that already
fired via the perform's k_pair → infinite loop / abort, exit -1).

## Two-part fix

### Part 1: Revert the OUTER_POST_ARM_K push at CallCps emit sites

Restores the previous behavior for tail-perform wrappers. The push
was load-bearing only for chained-let-yield Cps wrappers (Risk 3
shape); without it, those would route their Done value to the wrong
handler. Without the push, that case is broken — but Part 2
prevents it from being reached.

### Part 2: Restrict classifier to tail-perform Cps wrappers only

`is_simple_chained_let_yield_then_pure_tail_body` now accepts
`Expr::Call` let-RHS only when the callee is a TAIL-PERFORM Cps
user fn (its body matches `is_simple_tail_perform_with_pure_args_body`).
Tail-perform Cps wrappers (set_state, get_state, random_int, now,
etc.) FORWARD the trailing-pair k_pair to their inner perform site,
making them transparent to the chain's k-pair propagation —
state-threading and normal-resume both work via the existing path.

Chained-let-yield Cps wrappers (e.g., `sub_cps_fn` whose body is
`let _ = perform E.op(); body_tail`) IGNORE the trailing-pair
k_pair (use their own internal chain pair). The classifier now
rejects them, body falls back to Sync ABI, and `lower_call`'s Cps
branch handles via the SAVE+CLEAR+RESTORE BODY_RETURN_ARM pattern
(restoring pre-Task-112 Risk 3 protection for these shapes).

## Caller plumbing

Both `compute_user_fn_abi` and the pre-pass build a `fns_by_name:
BTreeMap<&str, &FnDecl>` from the program's items, and pass an
`is_tail_perform_cps_user_fn` closure that:
1. Filters by `colored.needs_cps_transform(name)` (Cps-color check).
2. Looks up the callee's body and tests
   `is_simple_tail_perform_with_pure_args_body`.

## Disposition for chained-let-yield Cps wrappers

Documented as Plan D Task 112 follow-up: the
chain-routing OUTER_POST_ARM_K_STACK push needs to be wrapper-shape-
conditional (only fire when callee is chained-let-yield Cps). That
requires a different codegen path; deferred to a follow-up. Risk 3
tests pass as before via lower_call's Cps branch fallback.

## Verified locally

- pod-verify clean
- 117/117 codegen lib tests pass

CI to verify all task_112 tests + all regressions are gone.

* [Plan D Task 112] Split into 112a (shipped) + 112b (deferred to follow-up); update PROGRESS + DEVIATIONS

Per user direction (Option B disposition), split the original Task 112
deferral into two named sub-tasks:

- Task 112a: tail-perform Cps wrapper composition — CLOSED via PR #83.
- Task 112b: chained-let-yield Cps wrapper composition — DEFERRED with
  named closure path.

PLAN_D_PROGRESS.md: Task 112 status flipped from "deferred" to
"112a done (PR #83); 112b deferred". Task 112a entry added with
done-pending-ci status. Task 112b entry added with todo status and
cross-reference to its deviation entry.

PLAN_D_DEVIATIONS.md: original [DEVIATION Task 112] entry preserved
verbatim with a SUPERSEDED 2026-05-03 marker and forward-pointers to
the new entries. Added [DEVIATION Task 112a] (CLOSED) and
[DEVIATION Task 112b] (DEFERRED) with full context, mechanism,
closure path, and gate disposition for Task 119 closeout.

Docs-only commit; no code changes.

* [Plan D Task 112a] Address PR #83 review — debug_asserts, hoist fns_by_name, comment cleanups, invariant hardening

Addresses inline review #4215867100 on PR #83:

## Mechanical fixes

1. **#1 debug_assert! for MAX_INLINE_ARGS** at both wrapper-Call emit sites (helper-body Phase 6 + Middle-step CallCps). Mirrors the perform-site debug_assert at `lower_perform_to_value` (codegen.rs:14586). Defense-in-depth — runtime `sigil_next_step_call` aborts on overflow; debug builds catch it before linking.

2. **#2 + #3 hoist fns_by_name + extract is_tail_perform_cps_user_fn helper.** Both `compute_user_fn_abi` (per-fn loop) and the synth-cont allocation site previously rebuilt the fns lookup each iteration → O(n²) over program items. Hoisted to `build_fns_by_name(&ColoredProgram)` called once at `emit_object` entry; threaded as `&BTreeMap<&str, &FnDecl>` through `compute_user_fn_abi`'s new parameter. The 18-line closure construction extracted into top-level fn `is_tail_perform_cps_user_fn` (callee_name, fns_by_name, colored, ctors). Both call sites now invoke the same helper.

3. **#5 unreachable!() at silent fall-through** in `compute_user_fn_abi`'s per-stmt extraction. The classifier guarantees every stmt is Stmt::Let with Perform OR Call(Ident) value; the catchall arm now panics (mirrors the pre-pass site's existing discipline). Was a silent skip that dropped the binding.

## Doc fixes

4. **#4 prior_was_call_cps comment.** Replaced the thinking-out-loud derivation ("Wait, this is off-by-one. step_0 fires AFTER ... hmm actually step_i fires AFTER ...") with a one-sentence summary: `prior_was_call_cps for step_i is true iff steps[i] is CallCps — step_i's synth-cont fires after steps[i]'s dispatch (helper-body for i=0; step_{i-1}'s Middle for i>0)`.

5. **#6 stale 1099/1108 comment.** The comment at the helper-body / Middle-step CallCps emit sites referenced the reverted-by-7b56eec attempt at unconditional OUTER_POST_ARM_K push and the resulting test failures. Both task_78_5_g4_approach6_risk3_* tests pass post-classifier-restriction (commit f5a2618) — sub_cps_fn falls back to Sync, lower_call's Cps branch handles via SAVE+CLEAR+RESTORE. Comment rewritten to point at [DEVIATION Task 112b].

6. **#7 BODY_RETURN_ARM_STACK leak-on-arm-abandon assumption.** Added a paragraph at the chain step entry POP site documenting that the POP relies on chain progression: if a discharge-with-lambda arm captures `k` into a lambda but never invokes it, the (null, null) entry stays on stack. Phase 4g treats it as plain Done (no return-arm wrap) — correct-by-coincidence. The discharge-with-lambda handlers in std/state + std/random + std/clock all invoke `k` (verified); assumption holds for the test corpus today.

7. **#9 wrapper-of-wrapper recursion termination note.** Added doc-comment paragraph at `is_tail_perform_cps_user_fn` noting that tail-perform bodies can't themselves be wrappers (no Expr::Call; expr_is_pure rejects non-ctor calls in args), so the single-hop lookup is total — no recursion-termination concern.

## Out-of-scope

- **#8 (compute_user_fn_abi + emit_object re-walk body)** — pre-existing structural waste flagged by the reviewer as out-of-scope follow-up. Not addressed in this commit.

## Verified locally

- pod-verify clean
- 117/117 codegen lib tests pass

CI to confirm task_112_*, task_78_5_g4_approach6_risk3_*, and stdlib regressions all stay green.
boldfield added a commit that referenced this pull request May 3, 2026
…sole channel (#92)

* [Plan D Task 111d] TLS removal — caller-owned TerminalResult slot is sole channel

Final of 4 PRs implementing Task 111. Brian-authorized 4-PR breakdown
2026-05-03. Closes the cross-fn discharge propagation gap that the
prior multi-return / out-pointer / per-fn-stack-slot attempts on PR
#50 could not address (see `[DEVIATION Task 111]`); the option-(C)
"thread `*mut TerminalResult` through every fn ABI" architectural
slice is now complete.

**Channel transition.** 111a–c ran TLS and the caller-owned
`TerminalResult` pointer in dual-write. 111d switches handle-exit
reads from TLS to the pointer and removes the TLS path entirely:

- `LAST_TERMINAL_TAG` / `LAST_TERMINAL_VALUE` thread-local statics
  removed from `runtime/src/handlers.rs`.
- 4 FFI helpers removed: `sigil_last_terminal_tag`,
  `sigil_reset_last_terminal_tag`, `sigil_last_terminal_value`,
  `sigil_reset_last_terminal_value`. 4 corresponding
  `module.declare_function` entries in codegen + the `PerFnRefsCtx`
  / `PerFnRefs` / `Lowerer` ref fields all removed; the destructure
  clauses at every Lowerer construction site are correspondingly
  cleaned up.
- TLS dual-write at `sigil_run_loop`'s DONE + DISCHARGED terminals
  removed. The PR #90 R1 issue 3 TLS-vs-pointer agreement
  debug_asserts removed (no TLS to compare against).
- The slot becomes the sole terminal channel; codegen always
  passes a non-null pointer (main shim + Sync/Cps/synth fn ABI
  threading guarantee). Null is still tolerated at run_loop for
  runtime tests that drive the trampoline directly without observing
  the terminal.

**Codegen sites switched to load/store on the slot.**

- 5 handle-exit tag / value queries (`Expr::Handle`'s outer
  return-arm / no-return-arm / k-pair-call branches): replace
  `call sigil_last_terminal_tag/value` with
  `load i64 [terminal_out_param + {0,8}]`. The `discharged_const`
  iconst widens from I32 to I64 to match the slot field width.
- 2 handle-entry resets (was `call sigil_reset_last_terminal_*`)
  become two `store i64` at offsets 0 and 8: `(value=0, tag=DONE)`
  before body lowering. Same role as the old TLS resets — keeps
  no-perform handles seeing a clean DONE state, and prevents a
  prior handle's terminal from leaking into the next handle in
  the same fn.
- 2 new `i32` constants `TERMINAL_RESULT_VALUE_OFF = 0` /
  `TERMINAL_RESULT_TAG_OFF = 8` mirror the runtime's `TerminalResult`
  `#[repr(C)]` layout.

**Field type relaxed.** `Lowerer.terminal_out_param: Option<Value>`
collapses to `Value` now that all 9 construction sites populate
`Some(...)`. The `terminal_out_or_null` helper (which existed only
to cover the dead `None` branch) is gone; call sites read
`self.terminal_out_param` directly. 13 call sites updated.

**End-to-end test added.** `task_111d_terminal_channel_propagation_-
through_nested_sync_calls` pins the new pointer-side path: a
3-deep Sync user-fn call chain (`a → b → c`) where `c` performs an
effect whose handler discharges. The (value=17, tag=DISCHARGED)
written by `sigil_run_loop` at `c`'s perform-site terminal must
propagate through `b`'s and `a`'s returns into the handle-exit's
load from the SAME caller-owned slot, route through the
discharge_block, and surface `17` to stdout. Closes the test
coverage gap Brian flagged in PR #91 R1 issue 4.

**Runtime / codegen comment refresh.** `TerminalResult` docstring,
`sigil_perform`'s "outer codegen logic" reference, and the
`sigil_run_loop` chain-routing note all update to reflect the
post-111d state. The transitional 111a/b/c notes are gone.

**Verification.**

- `cargo build --workspace --release` — clean. (Required for
  `libsigil_runtime.a` to be rebuilt; the compiler's `link.rs`
  prefers the release lib over debug.)
- `cargo clippy --workspace --all-targets` — clean.
- `cargo fmt --all -- --check` — clean.
- `bash scripts/check-no-interior-pointers.sh` — OK.
- `cargo test --workspace` — 275 passed (incl. new e2e); 3 failed
  (pre-existing perf timing tests fib_perf / fib_cps_perf /
  tree_example under parallel test-runner contention).

**Closure.** `[DEVIATION Task 111]` (deferred 2026-04-30) is now
fully closed. Plan B' Stage-6.8-followup carryover #1 (TLS → caller-
owned terminal channel) closes alongside. Plan D Task 119 closeout
audit's Task 111 line-item is unblocked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [Plan D Task 111d] PR #92 R1 review fixes — issues 1 (nested-handle leak), 2 (doc rot), quality nit

Addresses Brian's R1 review on PR #92.

**Issue 1 (medium, real correctness bug) — nested-handle slot leak.**
Confirmed via repro: when an outer handle's body contains an inner
handle whose op-arm DISCHARGES, the inner's `sigil_run_loop` writes
`(value, DISCHARGED)` to the fn-wide slot. The inner handle's exit
loads tag/value and routes through its discharge_block correctly,
but the slot RETAINS the inner's terminal state. The outer body
continues evaluating post-inner-handle expressions synchronously
(no further `sigil_run_loop` writes). When the outer handle's
exit-tag query loads from the slot, it reads the inner's leftover
DISCHARGED tag and incorrectly:

1. Skips the outer's return arm, AND
2. Loads the inner's leftover value as the handle's overall.

Pre-111d this leak existed identically in TLS form — the inner's
TLS write to `LAST_TERMINAL_TAG` clobbered the outer's expected
DONE state — but no test exercised the composition. Confirmed
pre-existing by repro on 111c (also outputs `99`); the slot is
load-bearing post-111d so the leak was newly the only source of
truth.

Fix: snapshot/restore at every handle entry/exit. At handle entry,
load the slot's pre-handle `(value, tag)` into local Cranelift
Values (`snap_value_v`, `snap_tag_v`). At every exit path (return-
arm and no-return-arm merge-blocks), restore the snapshot to the
slot before yielding the handle's overall. Pinned by new e2e
`task_111d_nested_handle_inner_discharge_does_not_leak_to_outer`
(invariant: `1090\n` post-fix; `99\n` pre-fix).

**Issue 2 (doc rot) — TLS/FFI references scattered through the
tree.** Brian flagged 13 hits across 5 files where stale references
to the now-removed `LAST_TERMINAL_*` thread-locals + `sigil_last_-
terminal_*` / `sigil_reset_last_terminal_*` FFI helpers persisted
in docstrings (most prominently `sigil_run_loop`'s contract docstring
which still described the dual-write transitional state). Updated
all to reflect the post-111d reality (caller-owned slot is sole
terminal channel) while preserving the historical "previously TLS"
provenance:

- `runtime/src/lib.rs:39-46` — module FFI list now notes the four
  TLS helpers were removed by 111d.
- `runtime/src/handlers.rs:1693-1701` — `sigil_run_loop`'s contract
  docstring rewritten: "slot is the sole terminal channel post-111d";
  null tolerance scoped to runtime unit tests.
- `compiler/src/codegen.rs:7189-7192, 10930-10934, 14704-14708,
  15375-15379, 16528-16532, 16729-16732, 16976-16983` — 7 codegen
  comments updated.
- `abi/src/effect.rs:43-48` — `NEXT_STEP_TAG_DISCHARGED` doc
  updated.
- `compiler/tests/e2e.rs:1001, 4600, 9550` — 3 test docstrings
  updated.

**Quality nit — `!out.is_null()` annotation.** Brian asked for a
SAFETY/NOTE line at the null check explaining "unreachable from
generated code post-111d; only runtime unit tests pass null." The
DONE branch already had a brief comment pointing at the DISCHARGED
bypass; expanded both to be self-contained and explicit about the
unreachability.

**Verification.**

- `cargo build --workspace --release` — clean.
- `cargo clippy --workspace --all-targets` — clean.
- `cargo fmt --all -- --check` — clean.
- `bash scripts/check-no-interior-pointers.sh` — OK.
- `cargo test --workspace` — 276 passed (incl. new nested-handle
  e2e); 3 failed (pre-existing perf timing tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
boldfield added a commit that referenced this pull request May 3, 2026
…ewrite, Stage 11+13 sign-offs (#93)

* [Plan D Task 119a] Closeout audit — flip Plan-D-closed entries; spec §14 rewrite; Stage 11+13 sign-offs

Plan D Task 119 closeout audit, slice (a). Splits the plan body's
single-task scope into two PRs — this PR ships the documentation
sweep + spec update + sign-offs (low-risk, no architectural
changes); 119b queues the JSON parser implementation as a follow-
up since the parser is a substantial Sigil program (Task 80 part 2
per `examples/json.sigil`'s v1-vs-v2 note) and per Plan D's hard-
rule split belongs to Plan C completion (stdlib/demo migrations
consuming Plan D's surface).

**Deviation entries flipped from "Open" / "Deferred" to "Closed by
Plan D Task NN" with PR refs:**

- `PLAN_C_DEVIATIONS.md` `[DEVIATION Task 72]` constraint #3
  (wrapper-fn-frame discharge composition) → Closed by Plan D
  Task 112 (a + b + c). Tail-perform Cps wrapper composition
  (PR #83), chained-let-yield Cps wrapper composition (PR #85),
  Case D wrapper-in-chain + Slice B outer arm (PR #86).
- `PLAN_C_DEVIATIONS.md` `[DEVIATION Task 73]` constraint #2
  (Static-N arm-body chain) → Closed by Plan D Task 117.
- `PLAN_C_DEVIATIONS.md` `[DEVIATION Task 73]` constraint #3
  (no first-class continuations) → Closed by Plan D Task 117
  (PR #60 substrate `4b3f0b4` + follow-up type-position surface).
- `PLAN_C_DEVIATIONS.md` `[DEVIATION Task 73]` constraint #4
  (no conditional / branched k-call) → Closed by Plan D Task 118
  (PR #81 squash `3904a12`).
- `PLAN_B_PRIME_DEVIATIONS.md` Stage-6.8-followup architectural
  carryover #1 (TLS out-channel → packed multi-return) → Closed
  by Plan D Task 111 (a + b + c + d via PRs #89/90/91/92).
  Carryover #2 (Sync shim emission gating) stays open per its
  named GitHub issue tracking.

**Spec §14 rewrite.** v1 limits section restructured around the
post-Plan-D state:

- 6 of 9 originally-listed limits closed by Plan D (first-class
  continuations, conditional/branched k-call, wrapper-fn-frame
  composition, type-parameterized effect rows, tuple types,
  row-poly fn-typed params); each entry retains a closure log
  entry pointing at the implementing PR.
- 3 survivors documented as permanent v1 surface decisions or
  future-plan-tier candidates (`Float`, `Unit` literal,
  `for` / `while`).
- 1 NEW v1 limit added: **E0145** captured-k inside lambdas in
  programs containing generic fns. Introduced by Task 117's
  escape barrier; mechanical fix paths documented (move handler
  into non-generic wrapper, or rewrite arm body to call k(arg)
  directly without intermediate lambda capture).

**`#[ignore]` audit.** Only one closeout-candidate test remained
in the pre-119 inventory (`task_78_5_pending_g2a_bracketed_e_-
alias_unconstrained`). Un-ignored briefly to verify post-Plan-D
status; surface symptom changed E0128 → E0145, confirming Plan D
Task 116 closed the original G2.b row-poly lambda gap but
Task 117's escape barrier surfaced the captured-k-across-generic-
boundary v1 limit (orthogonal closure, not a regression). Test
re-ignored with updated rationale linking to E0145. Post-119a:
zero `#[ignore]`s in the compiler/runtime test surface point at
Plan D closure paths; all surviving `#[ignore]`s are non-
architectural test-infra gaps (piped-stdin harness for
`std_io_read_line_via_piped_stdin_pending_test_infra`,
`cargo test --ignored` for `arena_overflow_aborts`).

**Sudoku smoke gate** already in place at `task_117_smoke_gate_-
sudoku_solves_4x4` — `examples/sudoku.sigil` compiles and runs
end-to-end via e2e harness on both hosts; passes deterministically
post-Plan-D. The Sudoku side of the smoke-gate criterion is met.

**Stage 11 review checkpoint update — Task 111 closure (2026-05-
03).** The Stage 11 deferral note for Task 111 updates with the
revised closure path's shipping reference (4-PR sequence
PRs #89/90/91/92). `[DEVIATION Task 111]` is now FULLY closed.

**Stage 13 review checkpoint sign-off (2026-05-03 / 119a).** Four
criteria from the plan body validated: lifted-lambda closure-
record discipline (preserved through Tasks 117 + 111d), arena
escape rate (0% baseline preserved), Step 118 minimality (shipped
as architectural slice per `[DEVIATION Task 118]`), Sudoku smoke
(passing). Stage 13 sign-off COMPLETE.

**Plan D status.** Task 119 status flipped to "done (119a shipped;
119b — JSON parser smoke gate — queued as follow-up)". Plan D's
core architectural cluster (Tasks 111-118) is shipped + verified.
JSON parser implementation routes to Plan C completion per the
hard-rule split.

**Verification.**

- `cargo build --workspace --release` — clean.
- `cargo clippy --workspace --all-targets` — clean.
- `cargo fmt --all -- --check` — clean.
- `bash scripts/check-no-interior-pointers.sh` — OK.
- `cargo test --workspace` — 276 passed; 3 failed (pre-existing
  perf timing tests under parallel test-runner contention).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [Plan D Task 119a] PR #93 R1 review fixes — issues 1, 2, 3

Addresses Brian's R1 review on PR #93. All 3 issues are
documentation polish; no behavior change.

**Issue 1 — §14 prose promised inline links it didn't deliver.**
Replaced the misleading "links to its corresponding `[DEVIATION
Task NN]` entry" closing paragraph with an explicit grep-target
hint (search by `[DEVIATION Task <N>]` in the named files). Avoids
maintaining cross-file anchors that drift as deviation entries are
renumbered or merged.

**Issue 2 — Stale codegen line refs in `PLAN_C_DEVIATIONS.md`
Task 73 entries.** Brian flagged that constraints #2 / #3 / #4
cite `compiler/src/codegen.rs` line numbers and a rejection
diagnostic string ("first-class continuations are deferred to v2")
that no longer match HEAD post-Plan-D. Removed the stale
breadcrumbs (functions referenced by name only) and added a
parenthetical note pointing readers at the function names as the
durable grep targets. Constraint #3 also gains a note that the
pre-Task-117 rejection diagnostic is retired and replaced by the
**E0145** escape-barrier check on captured-k inside lambdas in
programs containing generic fns (so a future reader following the
breadcrumb won't be confused).

**Issue 3 — Stage 11 contradiction in `PLAN_D_PROGRESS.md`.** The
original Stage 11 review checkpoint paragraph reads as currently
true ("Task 111 deferred"). Added an inline `[SUPERSEDED
2026-05-03 — Task 111 deferral resolved; see Stage 11 closure
update paragraph immediately below.]` tag mirroring the convention
`[DEVIATION Task 112]` already uses for similar resolutions in
`PLAN_D_DEVIATIONS.md`.

**Test plan note (Brian's nit, no source change).** The 3
pre-existing perf failures are: `fib_perf_example_prints_6765_-
under_50ms`, `fib_cps_perf_example_prints_6765_under_500ms`,
`tree_example_prints_32767_under_500ms`. Verified pre-existing
by reproducing on pre-Plan-D state in earlier 111d session;
all three exhibit the same pattern (parallel test-runner
contention pushing wall-clock past the floor — runs cleanly
in isolation).

**Verification.**

- `cargo fmt --all -- --check` — clean.
- `bash scripts/check-no-interior-pointers.sh` — OK.
- No code changes; no rebuild required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
boldfield added a commit that referenced this pull request May 4, 2026
…ename, breadcrumb

5 review items addressed:

1. **Coverage gap (item #1)** — added `std_choose_all_choices_-
   two_perform_then_two_pure_lets_pure_tail` e2e test pinning the
   pure-tail variant's `seen_pure_after_yield` path with hand-written
   multi-prefix lets (`let s = a + b; let s2 = s + 1; s2`). The
   symmetric branched-tail tests don't exercise this independently.

2. **Doc rot (item #2)** — `detect_pattern_c_dispatch` docstring at
   codegen.rs updated to reflect the 7-tuple return shape (added
   `then_stmts` / `else_stmts` describing arm-body Block stmts and
   how the work-stack emit consumes them).

3. **`tail_prefix_lets` widening misnaming (item #3)** — local
   renamed from `widened` to `env_value`. For non-Int slot kinds
   the path is no longer disguised as widening; instead each
   non-Int arm asserts the lower_expr contract (`debug_assert_eq!`
   against the slot kind's natural Cranelift width: I8 for Bool/
   Byte/Unit, I32 for Char, pointer_ty for String/Closure/User).
   A future lower_expr change that breaks the contract now fails
   loudly instead of silently bit-mismatching at use sites.

5. **Pre-yield pure-let breadcrumb (item #5)** — branched-tail
   variant's catch-all `_ => return None` arm now carries the same
   one-line note the pure-tail variant has, calling out that pure
   lets BEFORE the first yield require a pre-yield env-prep emit
   phase that the current Plan C Task 81 extension doesn't ship.

(Item #4 — PR description out of sync — addressed via `gh pr edit`
in a separate operation since it doesn't touch the diff.)

cargo fmt --all clean. 292/293 e2e green (1 pre-existing macOS perf-
floor failure unrelated; up from 290 via the new pure-tail test).
156 runtime + 0 regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
boldfield added a commit that referenced this pull request May 4, 2026
* [Task 81] Fix DONE-path try_pop entry_depth respect; pin pure-tail multi-perform composition

Bug: nested sigil_run_loop (inside sigil_continuation_invoke's Phase 1)
consumed outer_post_arm_k entries belonging to the OUTER run_loop. The
DONE-path's `outer_post_arm_k_try_pop` at handlers.rs:1869 didn't gate
on `outer_post_arm_k_entry_depth`, so the first inner DONE pop succeeded
(removed outer's entry, leaving the slot null), and the next inner invoke's
DONE pop dereferenced a null fn_ptr → SIGSEGV.

Fix: cap try_pop at entry_depth so each run_loop only consumes entries it
owns. Mirrors the symmetric DISCHARGED-path drain discipline (Layer 3c
from PR #39 review §5).

Closes the chained-let-yield + pure-tail multi-perform composition gap
under runtime-N dischargers (`std/choose.sigil`'s `all_choices` /
`first_choice` over a body like `let a = perform; let b = perform; a + b`).
2 new e2e tests pin: (1) inline single-shot k(0) baseline (#[test]),
(2) `all_choices` over chained-let-yield + pure tail (#[test]).

Two architectural gaps remain v2 work (#[ignore]'d e2e tests document
the closure path):
- Nested-if branched tail not classified by
  is_let_yield_prefix_then_branched_cps_tail_body (only Pure/CpsCall/
  Perform leaves, not nested If/Match)
- Pure let-bindings between perform let-yields get ANF-lifted, breaking
  the chain classifier's "every stmt is Perform/Call" rule

Both are needed for natural Sudoku-shape bodies; PLAN_C_PROGRESS Task 81
records the architectural finding and v2 follow-up scope.

156 runtime + 288/289 e2e green (1 pre-existing macOS perf-floor failure
unrelated).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [Task 81] ANF-tolerant chain classifier: accept pure trailing let-yields

Body shapes like:
  let a = perform A;
  let b = perform B;
  if a == 1 && b == 1 { 99 } else { perform Choose.fail() }

ANF-elaborate to:
  let a = perform A;
  let b = perform B;
  let _0 = a == 1;       // ANF-lifted intermediate
  let _1 = b == 1;       // ANF-lifted intermediate
  if _0 && _1 { 99 } else { perform Choose.fail() }

The chain classifiers (`is_simple_chained_let_yield_then_pure_tail_body`
and `is_let_yield_prefix_then_branched_cps_tail_body`) previously
rejected the lifted body because their per-stmt match required every
`Stmt::Let` value to be `Expr::Perform` or `Expr::Call`. Falling
through to Sync ABI silently broke `first_choice` / `all_choices`
composition: the second perform never made it through the runtime-N
discharger machinery.

Fix:
- Both classifiers now walk stmts and accept `Stmt::Let` with pure
  RHS interspersed AFTER the last perform/call yield. Yield count is
  returned (= chain_length); pure trailing intermediates are the
  closure path for ANF-lifted compound subexpressions.
- New `TailPrefixLet` struct + `tail_prefix_lets: Vec<TailPrefixLet>`
  field on `ChainStepRole::Final`. Pre-pass partitions stmts into
  yield-bearing (`Perform` / `Call`) chain steps and pure trailing
  intermediates; FINAL synth-cont's emit lowers each pure intermediate
  in order (binding into env via `lower_expr` + `env.insert`) before
  the existing pattern-c-detect / standard-tail dispatch.
- `collect_chained_synth_cont_captures` extended to walk
  `tail_prefix_lets` values BEFORE adding their names to the bound
  set, so helper params referenced only in lifted intermediates
  (e.g. `let _0 = x + threshold` where `threshold` is a helper param)
  flow through the closure record.
- Both `compute_user_fn_abi` chain-classification arms updated to
  tolerate pure stmts (the chain-step push loop now skips them; the
  cap check `captures.len() + chain_length + 1 < MAX_CLOSURE_ENV_SLOTS`
  remains correct against `chain_length` as yield count).
- Test-only `collect_chained_synth_cont_captures` callers (7 sites)
  updated to pass empty `&[]` for `tail_prefix_lets` (no behavioral
  change for those tests).

New e2e test pins the ANF case end-to-end:
- `std_choose_first_choice_two_sequential_performs_anf_intermediates`
  — body uses `if a == 1 && b == 1` shape; ANF lifts both compares
  into trailing pure lets; chain classifier accepts; FINAL emit
  evaluates them; first_choice short-circuits on `a=1, b=1 → 99`.

288/289 e2e green (1 pre-existing macOS perf-floor failure unrelated);
156 runtime + 0 regressions.

Remaining Plan C Task 81 work: nested-If branched-tail
classifier+emit (recursive walk for body shapes like `if a == 1 {
if b == 1 { ... } else { perform fail } } else { perform fail }`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [Task 81] Nested-If branched-tail classifier+emit: BranchedCpsLeaf::Nested

Body shapes like:
  let a = perform A;
  let b = perform B;
  if a == 1 {
    if b == 1 { 99 } else { perform Choose.fail() }
  } else {
    perform Choose.fail()
  }

previously rejected by `is_let_yield_prefix_then_branched_cps_tail_body`
because `classify_branched_cps_tail_branch_expr` only accepted Pure /
CpsCall / Perform leaves at one level. Nested If/Match leaves fell to
Sync ABI; first_choice / all_choices then saw only the outer perform's
continuation.

Fix:
- New `BranchedCpsLeaf::Nested` variant. classify_branched_cps_tail_-
  branch_expr recursively classifies nested If (and the if-desugar
  Match-on-Bool 2-arm shape) when both sub-branches classify. At
  least one sub-branch must be Cps-eligible (CpsCall / Perform /
  Nested) per a new `leaf_is_cps_eligible` helper; otherwise the
  whole tail is wholly pure and the existing Pure path handles it
  (lower_expr lowers nested If as a value via internal phi).
- classify_branched_cps_tail_branch (block variant) refactored to
  delegate to the bare-expr classifier, sharing the nested
  recognition.
- is_let_yield_prefix_then_branched_cps_tail_body's has_cps_call
  check uses leaf_is_cps_eligible (covers Nested transitively).
- FINAL synth-cont's branched-tail emit refactored from a 2-element
  `for` loop to a `while let Some(work.pop())` loop. When the popped
  leaf_kind is Nested, the loop calls detect_pattern_c_dispatch on
  the nested tail, lowers the nested cond, brifs into 2 fresh blocks,
  and enqueues each sub-branch's (block, expr, kind) tuple. Pure /
  CpsCall / Perform leaves dispatch as before. Termination: each
  iteration either emits a leaf (return_/jump terminates the block)
  or pushes 2 strictly-smaller sub-trees; finite tail expression
  bounds the iteration.

Two new e2e tests pin the nested case end-to-end:
- `std_choose_first_choice_two_sequential_performs_nested_if_tail`
  — body uses `if a == 1 { if b == 1 { 99 } else { fail } } else
  { fail }`; first_choice short-circuits on a=1, b=1 → 99.
- (also kept `..._anf_intermediates` from the prior commit which
  exercises the && form via ANF-lifted intermediates.)

290/291 e2e green (1 pre-existing macOS perf-floor failure unrelated);
156 runtime + 0 regressions.

Remaining Plan C Task 81 work: cross-fn perform composition through
Cps-call branched-tail leaves. The natural Sudoku `solve` body's
recursive `solve(set(...), cell+1)` Cps-call leaf composes via the
existing chain-step machinery (CpsCall leaf forwards caller_k_pair),
but the cross-fn end-to-end behaviour under first_choice needs
verification — a focused e2e test (`pick_outer` calls `pick_inner`
across a branched-tail CpsCall leaf) currently returns -1 (None);
investigating whether the issue is the Cps-call leaf's caller_k_pair
forward arithmetic or the helper-fn's chain caller_k_pair load.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [Task 81] Arm-body Block stmts threading: cross-fn perform composition

Arm bodies in branched tails frequently contain ANF-lifted pure
intermediates as `Stmt::Let` stmts inside an `Expr::Block`:
  if x == 2 {
    p + x * 10   // ANF lifts to: { let _0 = x * 10; p + _0 }
  } else {
    perform Choose.fail()
  }

The pre-extension `classify_branched_cps_tail_branch` rejected any
non-empty Block stmts, falling through to Sync ABI. Cross-fn
composition (`pick_outer` → `pick_inner(p)` via Cps-call branched-
tail leaf) silently broke under `first_choice` — both fns dropped to
Sync, the perform-site continuations weren't captured through the
runtime-N discharger machinery, and the discharger saw only the
outer perform's continuation.

Fix:
- `classify_branched_cps_tail_branch` accepts arm-body `Block`s
  whose stmts are all `Stmt::Let` with pure RHS. Each pure stmt is
  an ANF-lifted intermediate; the leaf-emit walker lowers them
  before the leaf-emit dispatch.
- `detect_pattern_c_dispatch` returns 7-tuple: `(cond, then_stmts,
  then_leaf, else_stmts, else_leaf, then_kind, else_kind)`. Stmts
  are `&[Stmt]` slices into the arm body's Block (or `&[]` for
  bare-Expr arm bodies). Per-branch `extract_leaf` returns both
  stmts and leaf for both arm-body shapes (Block-wrapped, bare).
- Work-stack item structure extended to carry `&[Stmt]` per branch.
  The work-stack iteration lowers each arm-body stmt's RHS via
  `lower_expr` and binds the result into env via `lowerer.env.insert`
  before dispatching the leaf-emit (Pure / CpsCall / Perform / Nested).
  Nested re-dispatch threads its own arm-body stmts through the
  recursive `detect_pattern_c_dispatch` call.
- `leaf_is_cps_eligible` (Plan C Task 81 helper) replaces inline
  `matches!(.., CpsCall | Perform)` checks at both classifier and
  detect-dispatch sites for consistency.

New e2e test pins cross-fn end-to-end:
- `std_choose_first_choice_multi_perform_site_recursive_branched`
  — `pick_outer` performs `Choose.choose(2)`, branches on `p == 1`
  to either `pick_inner(p)` (Cps-call leaf) or `perform fail`.
  `pick_inner` performs `Choose.choose(3)`, branches on `x == 2` to
  either `p + x * 10` (Pure leaf with ANF-lifted `_0 = x * 10`) or
  `perform fail`. `first_choice` short-circuits at (p=1, x=2) → 21.

290/291 e2e green (1 pre-existing macOS perf-floor failure unrelated);
156 runtime + 0 regressions. The 9×9 std.choose-based Sudoku demo
is now expressible end-to-end via the natural `solve(board, cell)`
recursive shape (classifier accepts; emit composes through
first_choice).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [CHORE] cargo fmt --all

Fixes rustfmt diffs introduced by the Plan C Task 81 commits:
- collect_chained_synth_cont_captures call sites at lines 324, 403:
  multi-line argument layout per rustfmt's width rules
- chained-let-yield pre-pass binding_kinds.push at lines 8119, 8137:
  single-line method-chain
- Tail-prefix-let lower at lines 13014, 13022: single-line
  builder.func.dfg.value_type / builder.ins().uextend
- classify_branched_cps_tail_branch_expr nested Match arm at line
  22977: tuple-let formatting
- e2e.rs trailing whitespace at line 8890

No behavioral changes; cargo fmt --all -- --check now clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [Task 81] Address PR #96 review: pure-tail test, doc rot, env_value rename, breadcrumb

5 review items addressed:

1. **Coverage gap (item #1)** — added `std_choose_all_choices_-
   two_perform_then_two_pure_lets_pure_tail` e2e test pinning the
   pure-tail variant's `seen_pure_after_yield` path with hand-written
   multi-prefix lets (`let s = a + b; let s2 = s + 1; s2`). The
   symmetric branched-tail tests don't exercise this independently.

2. **Doc rot (item #2)** — `detect_pattern_c_dispatch` docstring at
   codegen.rs updated to reflect the 7-tuple return shape (added
   `then_stmts` / `else_stmts` describing arm-body Block stmts and
   how the work-stack emit consumes them).

3. **`tail_prefix_lets` widening misnaming (item #3)** — local
   renamed from `widened` to `env_value`. For non-Int slot kinds
   the path is no longer disguised as widening; instead each
   non-Int arm asserts the lower_expr contract (`debug_assert_eq!`
   against the slot kind's natural Cranelift width: I8 for Bool/
   Byte/Unit, I32 for Char, pointer_ty for String/Closure/User).
   A future lower_expr change that breaks the contract now fails
   loudly instead of silently bit-mismatching at use sites.

5. **Pre-yield pure-let breadcrumb (item #5)** — branched-tail
   variant's catch-all `_ => return None` arm now carries the same
   one-line note the pure-tail variant has, calling out that pure
   lets BEFORE the first yield require a pre-yield env-prep emit
   phase that the current Plan C Task 81 extension doesn't ship.

(Item #4 — PR description out of sync — addressed via `gh pr edit`
in a separate operation since it doesn't touch the diff.)

cargo fmt --all clean. 292/293 e2e green (1 pre-existing macOS perf-
floor failure unrelated; up from 290 via the new pure-tail test).
156 runtime + 0 regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
boldfield added a commit that referenced this pull request May 4, 2026
Addresses no-regret items from PR #97 review:

**#1 — silent binding drop (`emit_object`).** The chained-let-yield
emit pass at codegen.rs:8136 had `_ => continue` for non-Ident
callees on `Expr::Call` let-RHS, which silently dropped the binding
when the classifier had accepted it as a tail-prefix let. Mirrors
the latent-bug pattern PR #83 review #5 closed via `unreachable!()`.
Fixed by treating non-Ident-callee Calls as tail-prefix lets
(symmetric with the Ident-callee non-Cps-wrapper arm and the
`other =>` arm). Classifier and emitter now agree on the shape.

**Sentinel `ChainedNextStep::Perform` removed (codegen.rs:9471).**
The chain_length=0 branch synthesized a `ChainedNextStep::Perform`
with empty `effect`/`op` fields purely to satisfy the
`(ChainedNextStep, Option<FuncId>)` tuple shape, with the comment
"isn't actually used at Phase 6 in this case". Reviewer flagged
that any code path between Phase 4 and Phase 6 that started
inspecting `body_first_step` would silently misbehave. Refactored
the tuple to `(Option<ChainedNextStep>, Option<FuncId>)` and added
an `unreachable!()` guard at the Phase 6 match site so a future
regression that lands in the match path with a None
`body_first_step` (i.e., `is_zero_chain` accidentally false when
it should be true) panics with a clear message instead of
silently misbehaving.

**Magic name `__zero_chain_dummy__` → `$zero_chain_dummy`
(codegen.rs:8266).** Reviewer flagged that the env namespace is
shared with user identifiers, so a fn parameter or local named
`__zero_chain_dummy__` would silently collide. Renamed to the
`$`-prefixed convention used by `$lambda_N` synth fns; the lexer
rejects `$` as an identifier-start character (lexer.rs:126:
`is_ascii_alphabetic() || c == '_'`), making the name
syntactically un-namable in user source. Comment near the
constructor cites the lexer rule + naming-convention
cross-reference.

**Stale docstring `cap = 32 → 4096` (handlers.rs:968).**
Reviewer nit; `sigil_body_return_arm_push`'s doc-comment
mentioned the old cap.

**Not addressed in this commit** (deferred per PR thread; see
review for rationale):

- **#2 — match-tail bool-only gate is silent miscompile of other
  match shapes.** The classifier's BoolLit gate at codegen.rs:
  23274 keeps emit/classifier in lockstep for the cases the
  emit-side `detect_pattern_c_dispatch` actually handles. A
  Match-on-Int / Match-on-sum-type body that should be Cps falls
  through to the Sync-ABI shim, which under DISCHARGED-arm
  scenarios drops the discharge tag in `terminal_out` because
  the OUTER run_loop terminates with DONE wrapping the discharged
  value. Surfacing this as a compiler error requires either fixing
  the underlying Sync→Cps interop tag-overwrite (option A on the
  PR thread, ~1–2 days) or adding a pre-emit gate. Pending the
  option-A decision.

- **#3 — `expr_is_pure → !expr_contains_perform` admits mutating
  builtins under multi-shot.** Same root cause as #2 (Sync→Cps
  interop discharge seam). Fix is option A.

- **#4 — `BODY_RETURN_ARM_STACK_SIZE: 32 → 4096` is a workaround,
  not a discipline rework.** Comment in handlers.rs already
  records the v2 follow-up (dynamic resize OR an interop
  discipline that doesn't push per nested invocation). Tracking
  issue would belong in the planning repo, not source.

- **#6 — additional classifier/emit unit tests.** Specifically:
  chain_length=0 with non-Sudoku body shape; arm-body Cps-Call
  let-RHS DISCHARGED propagation in isolation; match-tail
  rejection sum-type pattern; non-Ident-callee binding-drop
  regression. Not blocking the merge, but the bugs the review
  surfaced would have been caught earlier with these.

Verification: 290/293 e2e single-threaded (only pre-existing
perf-floor failures, same as b9c5afe baseline). Sudoku 9×9 and
all 3 raise/catch tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
boldfield added a commit that referenced this pull request May 4, 2026
Closes the iter-2 review's items #1-#4 (the dominant ones) plus #7
(comment) and #8 (negative test). Items #5 (cap) and #6 (expr_-
contains_perform over-rejection) are already accepted as v1
follow-ups; the queued plan `2026-05-04-sigil-return-arm-via-args.md`
in the designs repo tracks #5's architectural fix.

**#1 + #2 + #3 + #4 — extract discharge-gate helper.** ~230 lines
of duplicated reset → lower_expr → tag check → branch → propagate
→ continue boilerplate across four sites collapsed into two
Lowerer methods:

- `Lowerer::emit_terminal_out_reset_to_done()` writes (0, DONE)
  to `terminal_out` before `lower_expr`. Mirrors the handle
  expression's pre-body slot reset (Bug 1) shape.
- `Lowerer::emit_discharge_propagation_check()` loads
  `terminal_out.tag` after `lower_expr`, branches on DISCHARGED,
  emits `NextStep::Discharged(terminal_out.value)` + `return_`
  on the discharge path, leaves the builder positioned at the
  continuation block on the non-discharge path.

The four call sites — tail-prefix-let lowering, Pattern C Pure
leaf emit, standard tail emit, arm-body Cps Call let-RHS — now
each read:

```
lowerer.emit_terminal_out_reset_to_done();
let v = lowerer.lower_expr(...);
lowerer.emit_discharge_propagation_check();
```

The reviewer's specific concerns at the arm-body site (#1: no
reset; #2: bare `8` instead of `TERMINAL_RESULT_TAG_OFF`; #3:
narrowed `v` instead of `terminal_out.value` for NextStep::-
Discharged) all collapse into the unified path: the helpers use
the named constants, do the reset, and read from `terminal_out.
value` consistently. Asymmetry is no longer possible — there's
one path.

**#7 — comment the non-Ident-callee `continue` in `compute_user_-
fn_abi`.** The classifier accepts non-Ident-callee Calls only as
tail-prefix lets (per `is_simple_chained_let_yield_then_pure_-
tail_body`'s `yield_count > 0 && !expr_contains_perform` clause);
the matching emit-pass site at `~8136` materializes the binding
into `tail_prefix_lets`. This pass only counts `chain_length`
and collects captures; non-Ident callees don't contribute a
chain step. The asymmetry between `continue` here and
`tail_prefix_lets.push(...)` at the emit site is intentional —
not the latent-bug pattern PR #83 review #5 closed.

**#8 — negative regression test.** `task_81_sudoku_unsolvable_-
puzzle_returns_no_solution` patches the canonical `examples/-
sudoku.sigil` to introduce a contradictory clue (two `5`s in
row 0), exercising the discharge-propagation path through ~50
nested `solve_with_undo` handles all the way to the outermost
first_choice's `None` arm. Asserts exit code 1 + stdout
`"no solution\n"`. Complements the positive-path test which
exercises `Choose.fail` propagation DURING the search but never
reaches the top-level discharge path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
boldfield added a commit that referenced this pull request May 7, 2026
Combined fixes for:
- **Review B item 3** (PR review 4246507154, 17:59) — comment style
- **Review C items 9 + 10** (issue comment 4399992055, 18:30) —
  Float/Int64 == extension + e2e lossy-decode test
- **Review D items 1–4** (PR review 4246835141, 18:46) — E0060 both
  operands, decoder dedup, spec §3.4 typo, std/string duplicate

### B item 3 — Plan-C-addendum comment spam pruned

Deleted 13 redundant single-line `// Plan C addendum (Char) — boxed
Char is pointer-typed.` comments adjacent to `EnvSlotKind::Char | ...`
match arms across `compiler/src/codegen.rs`. The match-arm context
alone makes the change self-evident; the load-bearing ones (literal
lowering, type-mapping fns, dispatch arms, helpers, FuncIds struct,
runtime counters, ast.rs `is_pointer()`, layout.rs
`is_gc_pointer_ty`) stay.

### C item 9 — Heap-boxed-primitive == rejection extended to Float / Int64

The earlier Char-only E0060 rejection now generalizes to all three
heap-boxed primitives. New `BoxedPrim` enum + `boxed_primitive_-
eq_name` helper drive the typecheck arm; per-type suggestion +
ordering hint string. Float adds the NaN-aware caveat in the error
message. Four new typecheck unit tests pin Float / Int64 `==` and
`!=` rejection.

### C item 10 — e2e test for user-visible lossy UTF-8 decode

`string_chars_invalid_utf8_replaces` constructs a `ByteArray` with
a known-invalid byte (`0xFF`) via `byte_array_alloc` +
`byte_array_concat`, bypasses validation via
`string_from_bytes_alloc` (the post-validation primitive copies
bytes verbatim), and verifies `string_chars` emits U+FFFD (65533)
for the invalid byte. Closes the runtime → user-program coverage
gap.

### D item 1 — E0060 char check now guards both operands

The earlier check only inspected `lt.as_ref()`. Now both `lt` and
`rt` are checked via `lt_boxed.or(rt_boxed)`, so `42 == 'a'` and
similar shapes still fire the named-function suggestion even when
LHS is non-Char or `None`.

### D item 2 — UTF-8 decoder deduplicated

Extracted `decode_next_codepoint(bytes, offset) -> (cp, len)` as
the single source of truth for Sigil's lossy UTF-8 decode.
`decode_codepoints_lossy` (drives `string_chars`) and
`find_nth_codepoint` (drives `string_char_at`) both step through
it, so codepoint-boundary agreement is now identical-by-
construction rather than agree-by-coincidence. New runtime test
`string_char_at_overlong_replaces` exercises `find_nth_codepoint`
on `[0xC0, 0x80, b'a']` (overlong + ASCII) to pin that the two
entry points produce the same codepoint count for invalid input.

### D item 3 — spec §3.4 → §3.1.1

The Char literal entry's "use the named functions (§3.4)" pointer
referenced "Inference rules (overview)"; corrected to §3.1.1
("Char and codepoint string operations") which is the new
subsection.

### D item 4 — `std/string.sigil` duplicate removed

The byte-indexed surface preamble listed `string_byte_at` twice;
replaced the second occurrence with `string_length`.

### Out of scope (acknowledged in PR reply)

- D non-blocking observation #1 (test gap on find_nth_codepoint with
  invalid input): closed by `string_char_at_overlong_replaces` above.
- D non-blocking observation #2 (`\\u{HEX}` / `\\0` not supported in
  string literals): tracked as a separate follow-up since the
  current behavior produces a clear E0010, not a silent miss.
boldfield added a commit that referenced this pull request May 7, 2026
* [Task CH1] Char header tag + runtime primitives

Adds TAG_CHAR=0x0C and a complete `runtime/src/char.rs` module
covering boxing, equality/ordering, conversion, ASCII classifiers,
ASCII case folding, and the load-bearing UTF-8 walkers
(`string_chars`, `string_char_at`, `string_from_chars`).

`string_chars` and `string_from_chars` accept Cons / Nil header
words and discriminants from the codegen call site so the runtime
stays free of compile-time `List[Char]` layout knowledge while
still emitting well-typed `List[Char]` values. Lossy UTF-8 decode
emits U+FFFD per invalid byte and resyncs at the next valid leading
byte.

Adds `CharAllocCount` / `CharAllocBytes` counters; 32 unit tests
cover box/unbox round-trips for ASCII / 2-byte / 3-byte / 4-byte
codepoints, every classifier, ASCII case passthrough on non-ASCII,
`int_to_char_validate` boundary cases (surrogates, > 0x10FFFF,
negative), `string_chars` on valid + invalid UTF-8 input, codepoint-
indexed `string_char_at`, and `string_from_chars` round-trip through
`string_chars`.

* [Task CH2] Lexer + parser — Char literal extensions

Extends the existing `'x'` Char literal lexer with the full plan-spec
escape and validity surface:

- New escape sequences: `\"`, `\0`, `\u{HEX}` (1–6 hex digits).
- Bare multi-byte codepoints (`'é'`, `'中'`, `'😀'`) decoded via a
  new UTF-8 peek/advance helper pair on the byte-based cursor; the
  prior `self.src[pos] as char` shortcut would otherwise see N
  source bytes and reject as multi-codepoint.
- Compile-time rejection of `\u{...}` values > 0x10FFFF and any
  surrogate `0xD800..=0xDFFF` (E0010 with descriptive message).
- Compile-time rejection of empty `\u{}` and >6-hex-digit escapes.
- Multi-codepoint literal bodies (`'ab'`, `'ab\u{41}'`) now produce
  the targeted "Char literal must be a single codepoint; got N"
  diagnostic rather than the prior "expected closing `'`" surface.

Token type stays `CharLit(char)` — Rust's `char` already enforces
the post-validation invariant (no surrogates, ≤ 0x10FFFF). Existing
parser-side mapping (`Token::CharLit` → `Expr::CharLit`) is unchanged.

15 new lexer tests cover every escape, each Unicode boundary
(0xD7FF / 0xE000 / 0x10FFFF), each rejection case (empty braces,
too many digits, out-of-range, low/high surrogates), and the three
multi-byte bare-codepoint widths.

* [Task CH3] Char builtin schemes — 19 user-facing primitives

Registers the user-facing Char primitives via a new
`register_builtin_char_schemes(tc)` mirroring
`register_builtin_float_schemes`:

- 5 equality / ordering: char_eq, char_lt, char_le, char_gt, char_ge
  — `(Char, Char) -> Bool`
- 3 conversion: char_to_int `(Char) -> Int`, int_to_char
  `(Int) -> Option[Char]`, char_to_string `(Char) -> String`
- 5 ASCII classifiers: is_ascii, is_ascii_digit, is_ascii_alpha,
  is_ascii_alphanumeric, is_ascii_whitespace — `(Char) -> Bool`
- 2 ASCII case: to_lower_ascii, to_upper_ascii — `(Char) -> Char`
- 4 string codepoint ops: string_chars `(String) -> List[Char]`,
  string_char_at `(String, Int) -> Option[Char]`, string_from_chars
  `(List[Char]) -> String`

Char itself is already a Ty::Char primitive (pre-existing) and
literal type inference (`Expr::CharLit -> Ty::Char`) is unchanged.
The validator helpers `int_to_char_validate` /
`string_char_at_validate` are codegen-internal — not registered as
user-callable schemes; codegen will lower the user-facing
`int_to_char` / `string_char_at` to the validate-then-construct
pattern in CH4.

13 typecheck unit tests cover each user-facing scheme: round-trip
type inference, Option[Char] / List[Char] return types, E0044 on
wrong-argument-type calls, E0045 on Char-vs-Int annotation
mismatch, and Char literal default inference.

* [Task CH4] Codegen — boxed Char + 19-primitive dispatch

Converts Char from an I32 codepoint immediate to a `TAG_CHAR`-headed
heap pointer (mirroring Float / Int64 / String) and wires the 19
user-facing Char primitives through `lower_call`.

Type-mapping changes:

- `cranelift_ty_for_type_expr("Char")` and `cranelift_ty_of_ty(Ty::Char)`
  return `pointer_ty` instead of `types::I32`.
- `Expr::CharLit(c, _)` lowers to `iconst(I64, codepoint) →
  sigil_char_box → pointer_ty` (with stackmap placeholder), no longer
  bare `iconst(I32, c)`.
- `type_of_expr(Expr::CharLit)` returns `pointer_ty`.
- `Pattern::CharLit(c)` loads the u32 codepoint at offset 8 from the
  boxed Char scrutinee, zero-extends to i64, and tests for equality
  against the literal codepoint.

Primitive dispatch:

- 17 simple FFI dispatch arms in `lower_call` for char_eq / lt / le /
  gt / ge / char_to_int / char_to_string / is_ascii(_digit / _alpha /
  _alphanumeric / _whitespace) / to_lower_ascii / to_upper_ascii.
- `int_to_char` and `string_char_at` lower to a validate-then-construct
  pattern: validator → `brif(ok==0)` → Some/None branches → merge
  block with a `pointer_ty` block-param. Some-branch builds via
  `lower_ctor_alloc(Option$$Char, some_idx, [char_ptr])`; None via
  `lower_ctor_alloc(..., none_idx, [])`.
- `string_chars` and `string_from_chars` thread the codegen-computed
  `List$$Char` Cons / Nil header words and discriminants to the
  runtime as i64 immediates; runtime stamps them through `sigil_alloc`
  to build well-typed `List[Char]` cells.

`option_layout_for(Ty)` and `list_char_layout_immediates()` are the
two private helpers that resolve the monomorphized layout via
`mangle_type` / `mangle_ctor` and the `ctor_index` / `type_layouts`
side-tables.

`type_of_expr` predictions extended with the 19 Char op return types
(I8 for boolean classifiers / comparators, I64 for `char_to_int`,
`pointer_ty` for the rest). Globals set extended with the 17 user-
callable surface names (the validator helpers
`int_to_char_validate` / `string_char_at_validate` are codegen-
internal only).

* [Task CH4] Slot widening + GC bitmap for boxed Char

Boxed Char (TAG_CHAR) is a heap pointer; the closure-record /
sum-ctor / tuple-element slot-widening logic pre-Plan-C-addendum
treated `EnvSlotKind::Char` and `Ty::Char` as a sub-word I32
primitive (uextend on store, ireduce(I32) on load). Both directions
must drop for boxed Char — the slot already holds a `pointer_ty`
value, and narrowing a pointer to I32 corrupts it.

Updates:

- `EnvSlotKind::is_pointer()` now matches `Char` alongside
  `String / Closure / User`, so closure-record bitmap bits are
  set correctly for boxed-Char captures.
- Every match site that branched on `EnvSlotKind::Char` (closure
  store / load, synth-cont capture pack / unpack, post-arm-k
  capture pack, tail-prefix-let widen, narrow_for_kind helpers,
  `type_of_expr` for `Expr::ClosureEnvLoad`) routes Char through
  the pointer-typed branch.
- `Ty::Char` in `load_field_value` and the tuple-pattern element
  loader drops the I32 narrow.
- `is_gc_pointer_ty(Ty::Char)` returns true so sum-type field
  bitmaps mark Char fields for GC tracing (cosmetic on Boehm's
  conservative scan, load-bearing for any future precise GC).

Existing e2e tests that exercised Char-as-I32 semantics:

- `perform_side_narrow_to_char_value_checked`: rewrites
  `c == 'Y'` to `char_eq(c, 'Y')` (pointer equality wouldn't
  match codepoint equality for boxed Chars).
- `cps_abi_captures_bearing_with_char_capture_exercises_widen_-
   narrow_symmetry`: same `==` → `char_eq` rewrite; the test
  still pins capture flow through synth-cont closure records,
  but with no width narrowing.
- `arm_reads_char_arg_branches_on_codepoint`: same `==` →
  `char_eq` rewrite; the I32 → I8 split with the Bool test
  collapses (both are now pointer-store paths).
- `task_78_5_g4_approach6_b_neq_r_pointer_return_arm_through_-
   char_body`: B != R width discrepancy collapses (B = Char =
  pointer_ty, R = String = pointer_ty); test docstring updated
  to reflect post-addendum reality. Test still pins the wrapper
  composition end-to-end.

* [Task CH5] e2e tests + std/char.sigil + std/string.sigil + PLAN_C docs

- 14 e2e tests cover the full Char surface: char_literal_round_trips_via_to_string,
  char_codepoint_round_trip, int_to_char_rejects_out_of_range,
  int_to_char_rejects_surrogate, int_to_char_accepts_valid,
  is_ascii_classifiers_basic, to_lower_upper_ascii_passthrough,
  string_chars_ascii, string_chars_multibyte,
  string_char_at_codepoint_index, string_from_chars_round_trip,
  char_pattern_match_against_literal, char_eq_distinguishes_different_codepoints,
  char_doc_only_import.

- std/char.sigil ships as a documentation-only file (added to
  BUILTIN_INJECTED skip-list). Covers literal syntax, the 19
  user-facing primitives, the ASCII-only v1 scope + v2 closure path,
  the byte-vs-codepoint cross-reference to std/string.sigil, and a
  worked count_digits example.

- std/string.sigil doc preamble rewritten — the "Codepoint-aware
  variants deferred" note becomes a positive byte-vs-codepoint
  surface description; the new codepoint ops surface
  (string_chars, string_char_at, string_from_chars) is added to the
  builtin operations table with a cross-reference to std/char.sigil.

- PLAN_C_PROGRESS.md gets a Task CH section documenting the
  addendum's runtime + compiler + stdlib + test surface and the
  pre-existing Char-as-I32 → boxed-Char representation switch.

- PLAN_C_DEVIATIONS.md Task 68 entry's "deferred class 1"
  (codepoint-aware ops) and "deferred class 5"
  (char_to_int / int_to_char) marked CLOSED by this addendum.
  Class 3 (Float helpers) marked CLOSED by PR #101 (already shipped).
  Class 2 (codepoint-aware string_split / string_replace) remains
  deferred — depends on stdlib namespace qualification, not Char.

* [Task CH6] Spec update — Char primitive + codepoint string ops

Updates `spec/language.md` for the Plan C addendum:

- §1 (Lexical structure / literals): expanded `Char` literal entry
  covers the boxed representation (TAG_CHAR=0x0C, 16 bytes), the
  closed codepoint range (0x000000..=0x10FFFF excluding surrogates),
  the full escape table (`\n`, `\t`, `\r`, `\\`, `\'`, `\"`, `\0`,
  `\u{HEX}` 1–6 hex digits), bare-codepoint UTF-8 literals, and
  parse-time rejection of multi-codepoint / out-of-range / surrogate
  inputs. Calls out the deliberate absence of `==` / `<` operator
  overloading on Char.

- §3.1 (Built-in types): `Char` row updated to "Boxed Unicode
  codepoint (TAG_CHAR=0x0C, 21-bit codepoint payload)" — replaces
  the pre-addendum "1-byte codepoint" placeholder.

- §3.1.1 (new subsection): full `Char` reference covering literal
  syntax, the 19 user-facing operations grouped by purpose
  (equality / ordering, conversion, ASCII classifiers, ASCII case),
  the codepoint-aware string operations (`string_chars`,
  `string_char_at`, `string_from_chars`), the byte-vs-codepoint
  surface coexistence, and a worked `count_digits` example.

- §13 (Stdlib reference): adds a `std.char` row; extends the
  `std.string` row with the codepoint-indexed surface.

- §14 (v1 limits): new §14.1 "Deferred to follow-up plans" table
  documenting closure paths for codepoint-aware `string_split` /
  `string_replace`, Unicode-aware classifiers / case / normalization,
  and a hypothetical strict-UTF-8 `string_chars_strict`.

* [Task CH4 fix] is_gc_pointer_ty test + e2e UTF-8 source bytes

CI on PR #105 surfaced two regressions from the boxed-Char
representation switch:

1. `is_gc_pointer_ty_matches_expected_types` (layout.rs unit test)
   pinned `!is_gc_pointer_ty(Ty::Char)` — pre-addendum Char was an
   I32 immediate, so it correctly wasn't a GC pointer. Updated to
   assert `is_gc_pointer_ty(Ty::Char)` instead, with a Plan-C-
   addendum justification comment.

2. Three CH5 e2e tests embedded `\u{HEX}` escapes inside `"..."`
   string literals. Sigil's *string* lexer accepts only
   `\n / \t / \r / \\\\ / \\\"`; the `\u{...}` escape lives in
   *char* literals only. Rewrote the tests to use bare UTF-8 bytes
   in source (`"héllo"`, `"héllo 😀"`) — the bytes are valid UTF-8
   that Rust's source-tree handling preserves into the embedded
   test string verbatim, and Sigil's runtime treats `String` as a
   raw UTF-8 byte buffer.

* [Task CH4 fix 2] String lexer UTF-8 preservation + e2e mono trigger

Two more CI failures fixed:

1. Sigil's string lexer pre-Plan-C-addendum read source bytes as
   Latin-1 chars (`self.src[pos] as char`) and pushed them to a
   Rust `String` via `String::push(char)`, which UTF-8 re-encodes.
   Multi-byte source sequences double-encoded — `é` (0xC3 0xA9) →
   4 bytes (0xC3 0x82 0xC2 0xA9) inside the resulting StringLit.
   This regressed nothing pre-addendum because `string_chars` /
   `string_char_at` didn't exist; the addendum surfaces the bug
   immediately. `take_string_lit` now uses the `peek_utf8` /
   `advance_utf8` helpers (added in CH2) to decode multi-byte
   source bytes as a single codepoint and push that codepoint
   verbatim. Two new lexer tests pin the round-trip for 2-byte
   `é` and 4-byte `😀`.

2. `string_from_chars_round_trip` had no explicit `List[Char]`
   annotation in its source, so monomorphize never saw the type
   and codegen panicked with "ctor `Cons$$Char` not registered".
   Added a `let xs: List[Char] = string_chars(s)` binding to
   trigger mono via the type annotation's Apply node, mirroring
   the working `string_chars_multibyte` test's shape.

* [Review] Address PR #105 review items 1–5 + 7

PR #105 review (boldfield, 2026-05-07): one MUST-FIX, four follow-
ups, two deferred. All five non-deferred items addressed plus a
small note for item 7.

**1 (MUST-FIX) — Reject `==` / `!=` on `Char` at typecheck.** Pre-
fix, `'a' == 'a'` typechecked and lowered to `icmp Equal l r` on
boxed Char pointers — pointer identity, not codepoint equality —
silently returning `false` at runtime. New typecheck arm in
`check_binop`'s `BinOp::Eq | BinOp::NotEq` rejects `Ty::Char`
operands with E0060 pointing at `char_eq(a, b)` (or
`!char_eq(a, b)` for `!=`). Two new typecheck unit tests pin the
rejection. Float / Int64 (also boxed) inherit the same trap;
generalizing the rejection across all heap-boxed primitives is
queued as a follow-up since the existing `float_eq` / `int64_eq`
discipline currently hides the bug.

**2 — Refactor `string_char_at` to early-exit shared helper.** Pre-
refactor `sigil_string_char_at_validate` and `sigil_string_char_at`
each called `decode_codepoints_lossy(slice)` (full-pass + Vec
allocation) — making `for i in 0..n: char_at(s, i)` O(n²) with O(n)
transient allocations per call. New shared helper
`find_nth_codepoint(bytes, idx) -> Option<u32>` walks bytes
front-to-back with early-exit; both entry points use it. Each call
is now O(idx + decode_cost) and allocates nothing on the hot path.

**3 — `lower_ctor_alloc` comment fix.** Dropped `Char` from the
"sub-word primitives (Bool, Byte, Char, Unit)" widen-on-store
comment; boxed Char takes the pass-through branch.

**4 — Decoder overlong-rejection tests.** Two new runtime tests:
`string_chars_overlong_2byte_replaces` (`[0xC0, 0x80]` 2-byte
overlong of U+0000) and `string_chars_overlong_3byte_surrogate_-
replaces` (`[0xED, 0xA0, 0x80]` 3-byte UTF-8 form of surrogate
U+D800). Each pins a distinct decoder branch.

**5 — Lexer multi-codepoint count uses codepoint-step.** The
"Char literal must be a single codepoint; got N" diagnostic's
count loop now uses `peek_utf8` / `advance_utf8`, parity with the
literal-body decoder above. New lexer test pins
`'aé'` reports "got 2", not "got 3".

**7 (DEFERRED note) — U+FFFD merging.** Sigil v1's per-byte
replacement (matching `String::from_utf8_lossy`) is now
documented in `std/char.sigil` alongside a forward reference to
the v2 `string_chars_strict` follow-up.

Item 6 (Char in `is_gc_pointer_ty` — note for v2 precise GC) is
deferred per reviewer.

* [Review 2/3/4] Address all 4 outstanding boldfield reviews

Combined fixes for:
- **Review B item 3** (PR review 4246507154, 17:59) — comment style
- **Review C items 9 + 10** (issue comment 4399992055, 18:30) —
  Float/Int64 == extension + e2e lossy-decode test
- **Review D items 1–4** (PR review 4246835141, 18:46) — E0060 both
  operands, decoder dedup, spec §3.4 typo, std/string duplicate

### B item 3 — Plan-C-addendum comment spam pruned

Deleted 13 redundant single-line `// Plan C addendum (Char) — boxed
Char is pointer-typed.` comments adjacent to `EnvSlotKind::Char | ...`
match arms across `compiler/src/codegen.rs`. The match-arm context
alone makes the change self-evident; the load-bearing ones (literal
lowering, type-mapping fns, dispatch arms, helpers, FuncIds struct,
runtime counters, ast.rs `is_pointer()`, layout.rs
`is_gc_pointer_ty`) stay.

### C item 9 — Heap-boxed-primitive == rejection extended to Float / Int64

The earlier Char-only E0060 rejection now generalizes to all three
heap-boxed primitives. New `BoxedPrim` enum + `boxed_primitive_-
eq_name` helper drive the typecheck arm; per-type suggestion +
ordering hint string. Float adds the NaN-aware caveat in the error
message. Four new typecheck unit tests pin Float / Int64 `==` and
`!=` rejection.

### C item 10 — e2e test for user-visible lossy UTF-8 decode

`string_chars_invalid_utf8_replaces` constructs a `ByteArray` with
a known-invalid byte (`0xFF`) via `byte_array_alloc` +
`byte_array_concat`, bypasses validation via
`string_from_bytes_alloc` (the post-validation primitive copies
bytes verbatim), and verifies `string_chars` emits U+FFFD (65533)
for the invalid byte. Closes the runtime → user-program coverage
gap.

### D item 1 — E0060 char check now guards both operands

The earlier check only inspected `lt.as_ref()`. Now both `lt` and
`rt` are checked via `lt_boxed.or(rt_boxed)`, so `42 == 'a'` and
similar shapes still fire the named-function suggestion even when
LHS is non-Char or `None`.

### D item 2 — UTF-8 decoder deduplicated

Extracted `decode_next_codepoint(bytes, offset) -> (cp, len)` as
the single source of truth for Sigil's lossy UTF-8 decode.
`decode_codepoints_lossy` (drives `string_chars`) and
`find_nth_codepoint` (drives `string_char_at`) both step through
it, so codepoint-boundary agreement is now identical-by-
construction rather than agree-by-coincidence. New runtime test
`string_char_at_overlong_replaces` exercises `find_nth_codepoint`
on `[0xC0, 0x80, b'a']` (overlong + ASCII) to pin that the two
entry points produce the same codepoint count for invalid input.

### D item 3 — spec §3.4 → §3.1.1

The Char literal entry's "use the named functions (§3.4)" pointer
referenced "Inference rules (overview)"; corrected to §3.1.1
("Char and codepoint string operations") which is the new
subsection.

### D item 4 — `std/string.sigil` duplicate removed

The byte-indexed surface preamble listed `string_byte_at` twice;
replaced the second occurrence with `string_length`.

### Out of scope (acknowledged in PR reply)

- D non-blocking observation #1 (test gap on find_nth_codepoint with
  invalid input): closed by `string_char_at_overlong_replaces` above.
- D non-blocking observation #2 (`\\u{HEX}` / `\\0` not supported in
  string literals): tracked as a separate follow-up since the
  current behavior produces a clear E0010, not a silent miss.
boldfield added a commit that referenced this pull request May 8, 2026
Review #1 follow-ups:

- Header stack-depth doc was stale on TCO. Walker is mutually
  tail-recursive across `__fmt_walk` / `__fmt_open_brace` /
  `__fmt_close_brace` (matching signatures); per Task TCO's
  return_call lowering this is O(1) stack regardless of template
  length. Re-anchored doc + spec §13 row + PLAN_C_PROGRESS.md
  entry.

- Spec §13 std.format row gains a perf note: concat-chain
  accumulator is O(L²) in output length L; suitable for log lines
  / error messages / few-KB strings; large-output renders should
  use StringBuilder directly under `![Mem]`. The constraint is
  forced by `format`'s `![]` declaration (sb_* gates on Mem).

- Two new e2e tests pin EOF edge cases that existed in code but
  weren't tested explicitly: `std_format_trailing_open_brace_at_eof_emits_marker`
  (`{` at end → `{?`) and `std_format_trailing_close_brace_at_eof_is_literal`
  (`}` at end → literal `}`).

Review #2 follow-ups:

- `__fmt_close_brace` collapses two duplicate emit-and-advance-1
  branches via `let advance: Int = if ... { 2 } else { 1 };`. The
  three call sites become one, and the dispatch reads as
  "decide advance, then walk".

- New `__fmt_flush_run` helper centralises the flush logic and
  adds a `run_start < i` guard so adjacent specials (e.g. `{}{}`)
  don't allocate spurious empty-substring concats. Previously each
  consecutive special incurred one extra empty-string substring +
  acc-identical concat per byte.

- Two new e2e tests pin paths now covered: `std_format_empty_template_returns_empty_string`
  (the `i >= len` base case) and `std_format_adjacent_placeholders_no_separator`
  (exercises the empty-run-flush path on `{}{}` → `12`).

Review #2 point 5 (`__` prefix convention) does not need action: the
prefix is already established across `std/ordering.sigil`,
`std/list.sigil`, and `std/map.sigil` (`__string_compare_at`,
`__reverse_acc`, `__map_skew`, etc.). New code follows the existing
convention.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant