[Plan E2 Phase 1] Task 2b — helper refactor + cat 2 (partial)#159
Draft
boldfield wants to merge 3 commits into
Draft
[Plan E2 Phase 1] Task 2b — helper refactor + cat 2 (partial)#159boldfield wants to merge 3 commits into
boldfield wants to merge 3 commits into
Conversation
…loc_call Reviewer's N1 from PR #156: "Sweep methodology is fragile." Of the 62 alloc sites already marked, only one (`float_add` at codegen.rs:24812) was refactored to use the `lower_alloc_call` helper introduced in PR #156. The remaining 61 sites were surgical (call + extract + flag in 5 lines, by convention). This commit refactors all 61 to funnel through the helper. After this commit: - 62 alloc sites all use either: - `self.lower_alloc_call(callee, &[args])` — Lowerer method, 54 sites - `lower_alloc_call(&mut <builder>, <stackmap>, callee, &[args])` — free function, 8 emit_object sites - The only place `declare_value_needs_stack_map` appears in `codegen.rs` is inside `lower_alloc_call` itself (one site at codegen.rs:27191). Any future contributor adding a new alloc site must funnel through the helper; "mark by convention" is no longer possible. Refactor methodology: two perl scripts (in /tmp at edit time, not committed) applied multi-line regex substitutions for the regular patterns (single-line .call vs multi-line builder chain × tail-expr vs bound-name × Lowerer vs emit_object). A handful of irregular sites (cont_alloc with multi-arg, string_chars / string_from_chars with 5-arg / 3-arg lists, `balloc` and `step_0_closure_ptr` emit_object shapes) were hand-edited. Final declare-site count: 1 (in helper); final `lower_alloc_call` call count: 65 (62 site invocations + the Lowerer-method delegate + the free-function definition self-call + 1 internal). Verification: - pod-verify.sh green (cargo check + clippy + fmt + discipline greps). - `cargo test --test cranelift_stackmap_spike -p sigil-compiler`: PR #151's 2 tests still pass (no API change). - e2e behavioural tests: CI-authoritative. Behaviour is identical to PR #156's final state — the helper is the same code path the surgical pattern emitted, just consolidated. This commit closes the reviewer's structural concern from PR #156's re-review #2: "Either makes the next 23-site miss impossible to commit. Without it, I expect round 4." The helper is now the only path; no future alloc site can be added without it. Next commits on this branch: heap-pointer loads sweep + block-arg confluences sweep (Task 2b categories 2 and 3 from the plan body).
… heap loads + ref deref Plan E2 Phase 1 Task 2b category 2 — flag heap-pointer loads as GC refs via `declare_value_needs_stack_map`. Per-site classification uses the Sigil-level type at each load to decide flag-or-not — flagging an Int-typed slot would crash Phase 2's precise marker on the raw i64. ## Helper added: `is_heap_pointer_ty(ty: &Ty) -> bool` Centralised classification function at codegen.rs near `cranelift_ty_of_ty`. Returns `true` for the Sigil types whose codegen representation is a heap-managed pointer (`Char` boxed, `String`, `Fn(_)`, `User(_,_)`, `Tuple(_)`, `Continuation(_)`). `Int` / `Bool` / `Byte` / `Unit` return false — `Int` is the load-bearing distinction: on 64-bit hosts an `Int` slot has the same Cranelift type (i64) as a heap pointer, but it's a raw scalar and must not be flagged. ## Sites flagged 1. **Closure-env loads** (codegen.rs:26077 — `lower_closure_env_load`). Flagging is keyed on the slot's `EnvSlotKind`: `Char` / `String` / `Closure` / `User` → flag; `Int` / `Bool` / `Byte` / `Unit` → don't. This single change covers every closure-capture read from `closure_ptr` at offset `16 + 8*i`. 2. **Named heap-pointer loads** — 24 sites covering: - `ra_closure*` (return-arm closure pointer) - `ra_fired*` (return-arm fired-cell pointer) - `return_arm_closure_v` / `return_arm_fired_v` - `caller_k_closure` - `k_closure*` (continuation k pointer) - `post_arm_k_closure*` / `caller_post_arm_k_closure` Applied via `/tmp/flag_heap_loads.pl` (a multi-line regex sweep keyed on the variable name pattern + the precise five-line load shape: `let NAME = BUILDER.ins().load(pointer_ty, MemFlags::trusted(), BASE, OFFSET,);`). The companion `ra_fn` / `k_fn` / `caller_k_fn` loads — which load *function pointers*, not GC refs — are correctly NOT flagged. 3. **`sigil_ref_deref` result** (codegen.rs:25232). `Ref[T]::deref()` returns the stored T value. The site already looked up the callee's Sigil signature via `lookup_call_callee_ty` for narrowing; same lookup now decides whether to flag the raw i64 result. If T is heap-bearing (per `is_heap_pointer_ty`), flag; otherwise leave raw. ## What's NOT covered in this commit Category 2's plan-body wording is "loaded heap pointers (record-field reads, closure-env loads)." The closure-env loads are done. Record/ variant field reads happen via match-arm destructuring in sigil (there is no direct `FieldAccess` expression); destructuring loads are scattered across multiple match-lowering sites and need a threaded-type-info refactor to classify correctly. **Deferred to a follow-up commit on this branch** (next commit if scope permits, else spelled out in the PR body as a defered residual). ## Category 3 — NOT STARTED Block-arg confluences + fn-entry block-params on tail-callable fns (the Task 3 dependency from PR #157) are NOT in this commit. Will be the next commit on this branch. ## Verification - pod-verify.sh green (cargo check + clippy + fmt + discipline greps). - `cargo test --test cranelift_stackmap_spike -p sigil-compiler`: PR #151's 2 tests still pass. - e2e behavioural tests: CI-authoritative. The added flag calls are metadata for the safepoint pass; programs run identically. Marking site count: 26 (1 in `lower_alloc_call` helper + 1 in `lower_closure_env_load` + 24 named-load flags). Per-site type correctness is the load-bearing claim — over-flagging an Int slot would surface in Phase 2 as a precise-marker crash; this commit's classifications are reviewable by inspecting `is_heap_pointer_ty` plus each per-site `if is_heap_t` / `match kind` guard.
Continues Task 2b on the same branch. Two more layers of heap- pointer flagging now in place. ## Cat 2 residual — record / variant / tuple field reads Sigil has no direct `FieldAccess` expression; field reads happen via match-arm destructuring at two centralised lowering sites: 1. `Pattern::Tuple` arm (codegen.rs:26486) — loads tuple element at offset `8 + 8*i`, dispatches on `elem_ty` for narrowing. The `Char | String | Fn(_) | User(_,_) | Tuple(_)` arm now flags the raw i64 result via `declare_value_needs_stack_map` before returning. `Int` / `Bool` / `Byte` / `Unit` arms intentionally don't flag (raw scalar). 2. `load_field_value` (codegen.rs:26556) — loads a record/variant field at `16 + 8*index`, same `match field_ty` shape. Same change: heap-bearing arms flag the raw i64. Per-arm classification in both sites uses the existing per-element type discrimination — the change is one extra line per heap-bearing arm, no new threading required. Coverage is automatic for every record / variant / tuple destructuring in user code. ## Cat 3 — fn-entry block-params on tail-callable fns Adds `flag_heap_pointer_user_args(builder, entry_block, base_offset, is_heap_per_arg)` helper at codegen.rs:~530, plus the TypeExpr-level predicate `is_heap_pointer_type_expr(te) -> bool` (companion to the existing `is_heap_pointer_ty` on `Ty`). Applied at the user-fn body entry site (codegen.rs:10081) for the `Sync` ABI — entry-block-params layout there is `[null_closure, *user_args, terminal_out]`, user args at indices 1..=N. Each pointer- typed user-arg whose surface `TypeExpr` is heap-bearing (anything not `Int` / `Bool` / `Byte` / `Unit`) is flagged. `null_closure` at index 0 and `terminal_out` at index N+1 are infrastructure pointers (constant null at direct-call time / caller-stack pointer respectively); not flagged. This closes the **Task 3 dependency** filed in PR #157 / `PLAN_E2_PROGRESS.md`: "every fn-entry block-param of pointer type for any fn callable via `return_call` / `return_call_indirect` must be flagged `declare_value_needs_stack_map`." Without this flagging, heap pointers passed across a tail call would lose precise tracking once Phase 3 drops Boehm's conservative stack scan. For Cps ABI user fns, the entry-block-params are `[closure_ptr, args_ptr, k_closure, k_fn]` — `args_ptr` is a caller-stack pointer (not heap), `k_fn` is a function pointer (not GC ref), and the user args are loaded from `args_ptr` later (covered by `load_field_*` or per-site flagging there). So the Cps fn-entry case is a no-op here, by design — flagging machinery for the closure_ptr / k_closure of Cps fns lives at the load sites where they're consumed (already covered in commit 2's named-load sweep). Synth-arm-fn entries (codegen.rs:12381, 12783, 13852, 14171, 14455, 15285, 19078) are NOT tail-call destinations in sigil per the PR #157 audit. Their fn-entry block-params don't need flagging for Task 3's soundness contract — covered by inspection of the audit's "exactly two return_call* sites" finding (both target direct-Sync callees, never synth-arms). ## Verification - pod-verify.sh green. - `cargo test --test cranelift_stackmap_spike -p sigil-compiler`: PR #151's 2 tests still pass. - e2e behavioural tests: CI-authoritative. Marking-site count after this commit: 30 (1 helper + 1 `lower_closure_env_load` + 24 named loads + 1 `sigil_ref_deref` + 1 `Pattern::Tuple` heap arm + 1 `load_field_value` heap arm + 1 `flag_heap_pointer_user_args` per-param). Each per-param flag expands at runtime to one stack-map entry per heap-bearing user-arg position per Sync user fn — true count depends on the program. ## Remaining work — block-arg confluences (brif/jump heap-arg sweep) 48 brif/jump call sites in codegen.rs pass block-args through control flow. For each, if any block-arg is a heap pointer, the destination block's corresponding param needs flagging. This sweep is broader than fn-entry params and requires per-site type tracking. Plan-body's "phi confluences where any input is a heap pointer" explicitly names this. Will be the next commit on this branch.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Draft — pushed early so you can see helper refactor + early category 2 coverage. Category 2 has residual work (record/variant field destructuring loads); category 3 not yet started.
What's in this PR so far
Commit 1 — Helper refactor (
41ea95b): All 62 alloc sites incodegen.rsnow funnel throughlower_alloc_call. The onlydeclare_value_needs_stack_mapincodegen.rsis inside the helper itself. Closes reviewer's structural concern from PR #156 re-review #2 ("makes the next 23-site miss impossible to commit"). Net –354 LOC.Commit 2 — Category 2 partial (
2051acb): Heap-pointer load flagging via type-aware classification.is_heap_pointer_ty(ty: &Ty) -> bool— single source of truth for "is this Sigil type a GC-traced heap pointer?" Centralised nearcranelift_ty_of_ty. Key distinction: on 64-bit hosts,Intand heap pointers have the same Cranelift width (i64), butIntis a raw scalar and must NOT be flagged — over-flagging would crash Phase 2's precise marker on the raw i64.lower_closure_env_loadnow flags everyChar/String/Closure/Userslot at load time, keyed onEnvSlotKind. Single change covers every closure-capture read.ra_closure*,ra_fired*,return_arm_closure_v,caller_k_closure,k_closure*,post_arm_k_closure*). Companionra_fn/k_fnloads (function pointers, NOT GC refs) correctly left unflagged.sigil_ref_derefresult flagged when T is heap-bearing (uses existinglookup_call_callee_tyinfrastructure).What's NOT covered yet
Category 2 residual — record/variant field destructuring loads. Sigil has no direct
FieldAccessexpression; record/variant field reads happen via match-arm destructuring, scattered across multiple match-lowering sites. Each load there needs the field's Sigil type threaded through — non-trivial refactor.Category 3 — block-arg confluences + fn-entry block-params on tail-callable fns (the Task 3 dependency surfaced in PR #157). 48 brif/jump sites + 14 block-param add sites need type-aware sweeping. Not started.
Question for reviewer
This PR is a draft to surface scope reality. Doing all of Task 2b in one PR is a 4–8 hour effort with high regression risk on a ~1000 LOC diff. Two paths:
Lean Option A given your "no deferral" steer; flagging it here in case the scope visibility changes your call.
Verification (commits 1 + 2)
cargo test --test cranelift_stackmap_spike -p sigil-compiler— PR [Plan E2 Phase 1] Cranelift stackmap API spike #151 spike tests still pass.Marking-site count progression