
[Plan E2 Phase 1] Task 2b — helper refactor + cat 2 (partial)#159

Draft

boldfield wants to merge 3 commits into main from v2-precise-gc-task2b
Conversation

@boldfield (Owner)

Draft — pushed early so you can see helper refactor + early category 2 coverage. Category 2 has residual work (record/variant field destructuring loads); category 3 not yet started.

What's in this PR so far

Commit 1 — Helper refactor (41ea95b): All 62 alloc sites in codegen.rs now funnel through lower_alloc_call. The only declare_value_needs_stack_map in codegen.rs is inside the helper itself. Closes reviewer's structural concern from PR #156 re-review #2 ("makes the next 23-site miss impossible to commit"). Net –354 LOC.

Commit 2 — Category 2 partial (2051acb): Heap-pointer load flagging via type-aware classification.

  • New helper is_heap_pointer_ty(ty: &Ty) -> bool — single source of truth for "is this Sigil type a GC-traced heap pointer?" Centralised near cranelift_ty_of_ty. Key distinction: on 64-bit hosts, Int and heap pointers have the same Cranelift width (i64), but Int is a raw scalar and must NOT be flagged — over-flagging would crash Phase 2's precise marker on the raw i64.
  • lower_closure_env_load now flags every Char / String / Closure / User slot at load time, keyed on EnvSlotKind. Single change covers every closure-capture read.
  • 24 named heap-pointer loads flagged via a multi-line regex sweep over named-by-purpose loads (ra_closure*, ra_fired*, return_arm_closure_v, caller_k_closure, k_closure*, post_arm_k_closure*). Companion ra_fn / k_fn loads (function pointers, NOT GC refs) correctly left unflagged.
  • sigil_ref_deref result flagged when T is heap-bearing (uses existing lookup_call_callee_ty infrastructure).

What's NOT covered yet

Category 2 residual — record/variant field destructuring loads. Sigil has no direct FieldAccess expression; record/variant field reads happen via match-arm destructuring, scattered across multiple match-lowering sites. Each load there needs the field's Sigil type threaded through — non-trivial refactor.

Category 3 — block-arg confluences + fn-entry block-params on tail-callable fns (the Task 3 dependency surfaced in PR #157). 48 brif/jump sites + 14 block-param add sites need type-aware sweeping. Not started.

Question for reviewer

This PR is a draft to surface scope reality. Doing all of Task 2b in one PR is a 4–8 hour effort with high regression risk on a ~1000 LOC diff. Two paths:

  • Option A: keep it one PR — I push through the residual category 2 work + all of category 3 on this branch.
  • Option B: ship this draft as Task 2b (categories 1 helper-refactor + 2 partial), file Task 2c for the residual category 2 destructuring sweep, file Task 2d for category 3. Smaller, reviewable, but admits per-task PR cadence at the cost of bundled-completeness.

I lean toward Option A given your "no deferral" steer; flagging it here in case the scope visibility changes your call.

Verification (commits 1 + 2)

Marking-site count progression

| State | Sites | Notes |
| --- | --- | --- |
| Pre-PR (main) | 62 | 61 surgical + 1 in helper (PR #156 final) |
| After commit 1 (helper refactor) | 1 | only inside the helper |
| After commit 2 (cat 2 partial) | 27 | helper + closure_env_load + 24 named loads + ref_deref |

boldfield added 3 commits May 11, 2026 21:20
…loc_call

Reviewer's N1 from PR #156: "Sweep methodology is fragile." Of the
62 alloc sites already marked, only one (`float_add` at codegen.rs:24812)
was refactored to use the `lower_alloc_call` helper introduced in
PR #156. The remaining 61 sites were surgical (call + extract + flag
in 5 lines, by convention). This commit refactors all 61 to funnel
through the helper.

After this commit:

- 62 alloc sites all use either:
  - `self.lower_alloc_call(callee, &[args])` — Lowerer method, 54 sites
  - `lower_alloc_call(&mut <builder>, <stackmap>, callee, &[args])` —
    free function, 8 emit_object sites
- The only place `declare_value_needs_stack_map` appears in
  `codegen.rs` is inside `lower_alloc_call` itself (one site at
  codegen.rs:27191). Any future contributor adding a new alloc site
  must funnel through the helper; "mark by convention" is no longer
  possible.
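To illustrate the funnel shape, here is a minimal, self-contained sketch: the stub builder below is hypothetical (the real helper drives a Cranelift `FunctionBuilder` and real `Value`s), but it shows why the helper makes "mark by convention" unnecessary — the call result is flagged inside the helper, unconditionally.

```rust
// Hypothetical stub types standing in for the Cranelift builder and
// stack-map plumbing in codegen.rs; only the funnel structure is real.
#[derive(Clone, Copy, PartialEq, Debug)]
struct Value(u32);

#[derive(Default)]
struct StubBuilder {
    next: u32,
    stack_map_values: Vec<Value>, // values flagged for precise GC tracking
}

impl StubBuilder {
    // Stand-in for emitting a call and extracting its single result.
    fn emit_call(&mut self, _callee: &str, _args: &[Value]) -> Value {
        let v = Value(self.next);
        self.next += 1;
        v
    }
    // Stand-in for Cranelift's `declare_value_needs_stack_map`.
    fn declare_value_needs_stack_map(&mut self, v: Value) {
        self.stack_map_values.push(v);
    }
}

// The funnel: every alloc site goes through here, so the allocation
// result is always flagged — no per-site convention to forget.
fn lower_alloc_call(builder: &mut StubBuilder, callee: &str, args: &[Value]) -> Value {
    let result = builder.emit_call(callee, args);
    builder.declare_value_needs_stack_map(result);
    result
}

fn main() {
    let mut b = StubBuilder::default();
    let ptr = lower_alloc_call(&mut b, "sigil_alloc", &[]);
    assert!(b.stack_map_values.contains(&ptr));
    println!("flagged {} value(s)", b.stack_map_values.len());
}
```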

Refactor methodology: two perl scripts (in /tmp at edit time, not
committed) applied multi-line regex substitutions for the regular
patterns (single-line .call vs multi-line builder chain × tail-expr
vs bound-name × Lowerer vs emit_object). A handful of irregular sites
(cont_alloc with multi-arg, string_chars / string_from_chars with
5-arg / 3-arg lists, `balloc` and `step_0_closure_ptr` emit_object
shapes) were hand-edited. Final declare-site count: 1 (in helper);
final `lower_alloc_call` call count: 65 (62 site invocations + the
Lowerer-method delegate + the free-function definition self-call + 1
internal).

Verification:

- pod-verify.sh green (cargo check + clippy + fmt + discipline greps).
- `cargo test --test cranelift_stackmap_spike -p sigil-compiler`:
  PR #151's 2 tests still pass (no API change).
- e2e behavioural tests: CI-authoritative. Behaviour is identical to
  PR #156's final state — the helper is the same code path the
  surgical pattern emitted, just consolidated.

This commit closes the reviewer's structural concern from PR #156's
re-review #2: "Either makes the next 23-site miss impossible to
commit. Without it, I expect round 4." The helper is now the only
path; no future alloc site can be added without it.

Next commits on this branch: heap-pointer loads sweep + block-arg
confluences sweep (Task 2b categories 2 and 3 from the plan body).
… heap loads + ref deref

Plan E2 Phase 1 Task 2b category 2 — flag heap-pointer loads as GC refs
via `declare_value_needs_stack_map`. Per-site classification uses the
Sigil-level type at each load to decide flag-or-not — flagging an
Int-typed slot would crash Phase 2's precise marker on the raw i64.

## Helper added: `is_heap_pointer_ty(ty: &Ty) -> bool`

Centralised classification function at codegen.rs near
`cranelift_ty_of_ty`. Returns `true` for the Sigil types whose codegen
representation is a heap-managed pointer (`Char` boxed, `String`,
`Fn(_)`, `User(_,_)`, `Tuple(_)`, `Continuation(_)`). `Int` / `Bool` /
`Byte` / `Unit` return false — `Int` is the load-bearing distinction:
on 64-bit hosts an `Int` slot has the same Cranelift type (i64) as a
heap pointer, but it's a raw scalar and must not be flagged.
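A sketch of the classifier's shape, using a hypothetical flattened `Ty` mirroring the cases named above (the compiler's real enum carries payloads such as `Fn(_)` and `User(_, _)`):

```rust
// Hypothetical flattened mirror of the Sigil `Ty` cases named in this
// commit message; the real enum has more structure.
#[derive(Debug)]
enum Ty {
    Int,
    Bool,
    Byte,
    Unit,
    Char,
    String,
    Fn,           // Fn(_) in the real Ty
    User,         // User(_, _)
    Tuple,        // Tuple(_)
    Continuation, // Continuation(_)
}

// Single source of truth: `true` iff the type's codegen representation is
// a GC-traced heap pointer. The load-bearing case is `Int`: same Cranelift
// width (i64) as a pointer, but a raw scalar — flagging it would crash the
// Phase 2 precise marker.
fn is_heap_pointer_ty(ty: &Ty) -> bool {
    match ty {
        Ty::Char | Ty::String | Ty::Fn | Ty::User | Ty::Tuple | Ty::Continuation => true,
        Ty::Int | Ty::Bool | Ty::Byte | Ty::Unit => false,
    }
}

fn main() {
    assert!(is_heap_pointer_ty(&Ty::String));
    assert!(!is_heap_pointer_ty(&Ty::Int)); // raw i64 scalar, never flagged
    println!("ok");
}
```

The exhaustive `match` (no `_` arm) is the point: adding a new `Ty` variant forces a decision here.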

## Sites flagged

1. **Closure-env loads** (codegen.rs:26077 — `lower_closure_env_load`).
   Flagging is keyed on the slot's `EnvSlotKind`: `Char` / `String` /
   `Closure` / `User` → flag; `Int` / `Bool` / `Byte` / `Unit` →
   don't. This single change covers every closure-capture read from
   `closure_ptr` at offset `16 + 8*i`.

2. **Named heap-pointer loads** — 24 sites covering:
   - `ra_closure*` (return-arm closure pointer)
   - `ra_fired*` (return-arm fired-cell pointer)
   - `return_arm_closure_v` / `return_arm_fired_v`
   - `caller_k_closure`
   - `k_closure*` (continuation k pointer)
   - `post_arm_k_closure*` / `caller_post_arm_k_closure`

   Applied via `/tmp/flag_heap_loads.pl` (a multi-line regex sweep
   keyed on the variable name pattern + the precise five-line load
   shape: `let NAME = BUILDER.ins().load(pointer_ty, MemFlags::trusted(), BASE, OFFSET,);`).
   The companion `ra_fn` / `k_fn` / `caller_k_fn` loads — which load
   *function pointers*, not GC refs — are correctly NOT flagged.

3. **`sigil_ref_deref` result** (codegen.rs:25232).
   `Ref[T]::deref()` returns the stored T value. The site already
   looked up the callee's Sigil signature via `lookup_call_callee_ty`
   for narrowing; same lookup now decides whether to flag the raw i64
   result. If T is heap-bearing (per `is_heap_pointer_ty`), flag;
   otherwise leave raw.
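The closure-env decision in site 1 reduces to a small predicate on the slot kind. A hypothetical sketch (the real `EnvSlotKind` lives in the compiler; the variant names follow the list above):

```rust
// Hypothetical mirror of EnvSlotKind with the variants named above.
#[derive(Debug)]
enum EnvSlotKind {
    Char,
    String,
    Closure,
    User,
    Int,
    Bool,
    Byte,
    Unit,
}

// Sketch of the decision inside `lower_closure_env_load`: flag the loaded
// i64 only when the slot holds a GC-traced pointer.
fn slot_needs_stack_map(kind: &EnvSlotKind) -> bool {
    matches!(
        kind,
        EnvSlotKind::Char | EnvSlotKind::String | EnvSlotKind::Closure | EnvSlotKind::User
    )
}

fn main() {
    assert!(slot_needs_stack_map(&EnvSlotKind::Closure));
    assert!(!slot_needs_stack_map(&EnvSlotKind::Int)); // raw scalar slot
    println!("ok");
}
```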

## What's NOT covered in this commit

Category 2's plan-body wording is "loaded heap pointers (record-field
reads, closure-env loads)." The closure-env loads are done. Record/
variant field reads happen via match-arm destructuring in Sigil
(there is no direct `FieldAccess` expression); destructuring loads
are scattered across multiple match-lowering sites and need a
threaded-type-info refactor to classify correctly. **Deferred to a
follow-up commit on this branch** (next commit if scope permits, else
spelled out in the PR body as a deferred residual).

## Category 3 — NOT STARTED

Block-arg confluences + fn-entry block-params on tail-callable fns
(the Task 3 dependency from PR #157) are NOT in this commit. Will be
the next commit on this branch.

## Verification

- pod-verify.sh green (cargo check + clippy + fmt + discipline greps).
- `cargo test --test cranelift_stackmap_spike -p sigil-compiler`:
  PR #151's 2 tests still pass.
- e2e behavioural tests: CI-authoritative. The added flag calls are
  metadata for the safepoint pass; programs run identically.

Marking site count: 27 (1 in `lower_alloc_call` helper + 1 in
`lower_closure_env_load` + 24 named-load flags + 1 `sigil_ref_deref`
flag). Per-site type
correctness is the load-bearing claim — over-flagging an Int slot
would surface in Phase 2 as a precise-marker crash; this commit's
classifications are reviewable by inspecting `is_heap_pointer_ty`
plus each per-site `if is_heap_t` / `match kind` guard.

Continues Task 2b on the same branch. Two more layers of heap-pointer
flagging are now in place.

## Cat 2 residual — record / variant / tuple field reads

Sigil has no direct `FieldAccess` expression; field reads happen via
match-arm destructuring at two centralised lowering sites:

1. `Pattern::Tuple` arm (codegen.rs:26486) — loads tuple element at
   offset `8 + 8*i`, dispatches on `elem_ty` for narrowing. The
   `Char | String | Fn(_) | User(_,_) | Tuple(_)` arm now flags the
   raw i64 result via `declare_value_needs_stack_map` before
   returning. `Int` / `Bool` / `Byte` / `Unit` arms intentionally
   don't flag (raw scalar).

2. `load_field_value` (codegen.rs:26556) — loads a record/variant
   field at `16 + 8*index`, same `match field_ty` shape. Same
   change: heap-bearing arms flag the raw i64.

Per-arm classification in both sites uses the existing per-element
type discrimination — the change is one extra line per heap-bearing
arm, no new threading required. Coverage is automatic for every
record / variant / tuple destructuring in user code.
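A self-contained sketch of the "one extra line per heap-bearing arm" change at both sites, with stub recording in place of the real Cranelift load and `declare_value_needs_stack_map` call (the offsets match the layouts described above; the names are hypothetical):

```rust
// Hypothetical flattened Ty and classifier, as in the earlier commit.
#[derive(Debug)]
enum Ty { Int, Bool, Byte, Unit, Char, String, Fn, User, Tuple }

fn is_heap_pointer_ty(ty: &Ty) -> bool {
    matches!(ty, Ty::Char | Ty::String | Ty::Fn | Ty::User | Ty::Tuple)
}

// Tuple elements live at offset 8 + 8*i past the header...
fn tuple_elem_offset(i: i32) -> i32 { 8 + 8 * i }
// ...record/variant fields at 16 + 8*index.
fn field_offset(index: i32) -> i32 { 16 + 8 * index }

// Stand-in for the per-arm load: the real code emits
// builder.ins().load(pointer_ty, MemFlags::trusted(), base, offset)
// and then conditionally flags the result.
fn lower_destructured_load(elem_ty: &Ty, offset: i32, flagged: &mut Vec<i32>) {
    if is_heap_pointer_ty(elem_ty) {
        flagged.push(offset); // the one extra line per heap-bearing arm
    }
}

fn main() {
    let mut flagged = Vec::new();
    lower_destructured_load(&Ty::String, tuple_elem_offset(0), &mut flagged);
    lower_destructured_load(&Ty::Int, field_offset(2), &mut flagged);
    assert_eq!(flagged, vec![8]); // only the String element was flagged
    println!("ok");
}
```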

## Cat 3 — fn-entry block-params on tail-callable fns

Adds `flag_heap_pointer_user_args(builder, entry_block, base_offset,
is_heap_per_arg)` helper at codegen.rs:~530, plus the TypeExpr-level
predicate `is_heap_pointer_type_expr(te) -> bool` (companion to the
existing `is_heap_pointer_ty` on `Ty`).

Applied at the user-fn body entry site (codegen.rs:10081) for the
`Sync` ABI — entry-block-params layout there is `[null_closure,
*user_args, terminal_out]`, user args at indices 1..=N. Each pointer-
typed user-arg whose surface `TypeExpr` is heap-bearing (anything
not `Int` / `Bool` / `Byte` / `Unit`) is flagged. `null_closure` at
index 0 and `terminal_out` at index N+1 are infrastructure pointers
(constant null at direct-call time / caller-stack pointer
respectively); not flagged.
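The index arithmetic for the Sync ABI layout can be sketched as follows; this is a hypothetical stub (the real helper flags the entry block's param `Value`s via `declare_value_needs_stack_map`), returning the param indices it would flag:

```rust
// Hypothetical sketch for the Sync ABI layout described above:
// entry-block params are [null_closure, user_args..., terminal_out],
// so user args sit at indices 1..=N.
fn flag_heap_pointer_user_args(is_heap_per_arg: &[bool]) -> Vec<usize> {
    let mut flagged = Vec::new();
    for (i, &is_heap) in is_heap_per_arg.iter().enumerate() {
        let param_index = 1 + i; // skip null_closure at index 0
        if is_heap {
            flagged.push(param_index); // real code flags the block-param Value
        }
    }
    // terminal_out at index N + 1 is an infrastructure pointer: never flagged.
    flagged
}

fn main() {
    // Two user args: a String (heap-bearing) then an Int (raw scalar).
    let flagged = flag_heap_pointer_user_args(&[true, false]);
    assert_eq!(flagged, vec![1]); // only the String arg's param index
    println!("ok");
}
```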

This closes the **Task 3 dependency** filed in PR #157 / `PLAN_E2_PROGRESS.md`:
"every fn-entry block-param of pointer type for any fn callable via
`return_call` / `return_call_indirect` must be flagged
`declare_value_needs_stack_map`." Without this flagging, heap pointers
passed across a tail call would lose precise tracking once Phase 3
drops Boehm's conservative stack scan.

For Cps ABI user fns, the entry-block-params are `[closure_ptr,
args_ptr, k_closure, k_fn]` — `args_ptr` is a caller-stack pointer
(not heap), `k_fn` is a function pointer (not GC ref), and the
user args are loaded from `args_ptr` later (covered by `load_field_*`
or per-site flagging there). So the Cps fn-entry case is a no-op
here, by design — flagging machinery for the closure_ptr / k_closure
of Cps fns lives at the load sites where they're consumed (already
covered in commit 2's named-load sweep).

Synth-arm-fn entries (codegen.rs:12381, 12783, 13852, 14171, 14455,
15285, 19078) are NOT tail-call destinations in Sigil per the
PR #157 audit. Their fn-entry block-params don't need flagging for
Task 3's soundness contract — covered by inspection of the
audit's "exactly two return_call* sites" finding (both target
direct-Sync callees, never synth-arms).

## Verification

- pod-verify.sh green.
- `cargo test --test cranelift_stackmap_spike -p sigil-compiler`:
  PR #151's 2 tests still pass.
- e2e behavioural tests: CI-authoritative.

Marking-site count after this commit: 30 (1 helper + 1
`lower_closure_env_load` + 24 named loads + 1 `sigil_ref_deref` +
1 `Pattern::Tuple` heap arm + 1 `load_field_value` heap arm + 1
`flag_heap_pointer_user_args` per-param). Each per-param flag
expands at runtime to one stack-map entry per heap-bearing user-arg
position per Sync user fn — true count depends on the program.

## Remaining work — block-arg confluences (brif/jump heap-arg sweep)

48 brif/jump call sites in codegen.rs pass block-args through control
flow. For each, if any block-arg is a heap pointer, the destination
block's corresponding param needs flagging. This sweep is broader
than fn-entry params and requires per-site type tracking.
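The per-confluence rule from the plan body can be sketched ahead of that commit: a destination block param needs flagging if any predecessor edge passes a heap pointer into that position. A hypothetical, self-contained reduction of the decision (the real sweep walks brif/jump sites and flags Cranelift block params):

```rust
// edges[e][p] == true when predecessor edge `e` passes a heap pointer
// as block-arg `p`. A param needs flagging if ANY edge feeds it a
// heap pointer — matching "phi confluences where any input is a heap
// pointer" from the plan body.
fn params_needing_flags(edges: &[Vec<bool>]) -> Vec<bool> {
    let nparams = edges.first().map_or(0, |e| e.len());
    (0..nparams)
        .map(|p| edges.iter().any(|e| e[p]))
        .collect()
}

fn main() {
    // Two edges into one block with two params: only the second param
    // ever carries a heap pointer, so only it needs flagging.
    let edges = vec![vec![false, true], vec![false, false]];
    assert_eq!(params_needing_flags(&edges), vec![false, true]);
    println!("ok");
}
```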

Plan-body's "phi confluences where any input is a heap pointer"
explicitly names this. Will be the next commit on this branch.
