[Task 49] Monomorphization: reachability-bounded, typed IR preserved#16
Conversation
Plan B Task 49. New compiler/src/monomorphize.rs (~960 lines) replaces
the Plan A1 identity stub with a full reachability-bounded whole-program
specialisation pass.
Three-step pipeline:
1. Instantiation capture during typecheck. New `pending_call_instantiations`
and `pending_ctor_instantiations` on `Tc` track fresh-var ids per
`Expr::Ident` use of a top-level fn (via new `instantiate_with_vars`)
and per ctor use of a generic user type (via new
`fresh_user_instance_with_subst_and_ids`). End-of-typecheck resolves
them through `subst` to produce concrete `Vec<Ty>` per use site,
exposed as `CheckedProgram::call_site_instantiations` and
`CheckedProgram::ctor_site_instantiations` keyed by use-site span.
`CheckedProgram::fn_schemes` is also exposed publicly so
monomorphization can map fresh-var ids back to declared
generic-parameter names when descending into a generic body.
2. Post-elaborate worklist BFS rooted at `main`. `program_has_generics`
early-returns for non-generic programs (every Plan A1/A2/A3 program),
preserving zero overhead. Otherwise `Monomorphizer::run` drains fn
and type worklists to fixpoint, cloning each generic decl per
(name, type-args) tuple. Unreachable generics are dropped. `enqueue_*`
dedupes by mangled name (Ty doesn't impl Ord).
3. AST rewrite pass. Every cloned body has its `TypeExpr::Apply` and
generic-param `TypeExpr::Named` resolved to concrete primitives or
mangled user-type names. Every fn-call callee Ident rewritten to
`mangle_fn(name, args)`. Every ctor site rewritten to
`mangle_ctor(name, type_args)`. Pattern ctors rewritten using
`match_scrut_tys` + new `ctor_to_type` reverse index.
Canonical mangling (formalised in module docs and pinned by tests):
- `canon(Int)` = "Int", same for other primitives
- `canon(User(name, []))` = name
- `canon(User(name, [a1, ...]))` = name_<canon(a1)>_<canon(a2)>_...
- Top-level fn: `f__<canon(T1)>__<canon(T2)>__...`
- Top-level type: `Foo__<canon(T1)>__<canon(T2)>__...`
- Ctor of generic type: `C__<canon(T1)>__<canon(T2)>__...`
Plan example pinned: list_map__List_Option_Int__List_Option_Int.
Effect rows preserved per Plan B v1 (rows aren't monomorphized; codegen
erases them). Codegen-entry walker at compiler/src/codegen.rs no longer
rejects `f.effect_row_var.is_some()` or
`Expr::Lambda { effect_row_var: Some(_), .. }` — the rejection is
narrowed to surface generic syntax only.
Test deltas: 281 → 309 compiler lib tests (+28 new). Coverage:
- canon_ty rendering for all primitive and user variants
- mangle_fn/mangle_type/mangle_ctor for empty/single/multi/nested args
- Substitution::apply_to_ty resolving Var/User/Fn through subst
- ty_to_type_expr round-trip
- 9 end-to-end pipeline tests through lex → parse → resolve →
typecheck → elaborate → monomorphize covering: non-generic
pass-through, generic fn called at Int produces concrete Int clone
with substituted Param.ty / return_type, two instantiations of one
fn produce two clones, unreachable generic fn dropped, generic type
with unit ctor clones with mangled ctor names, two-instantiation
generic type produces distinct ctor names, **post-mono program
passes the codegen-entry walker** (the load-bearing invariant),
imports preserved, match-against-generic-scrutinee rewrites
pattern ctors.
PLAN_B_PROGRESS.md flips Task 49 to done-pending-ci.
PLAN_B_DEVIATIONS.md gains three Task-49 entries:
- Canonical mangling format and ordering (positional, not lex-sort
per plan body's literal reading; asymmetric `_`/`__` separator).
- Effect rows preserved through monomorphization (codegen-entry
walker rule narrowed).
- Pattern-ctor rewrite via match_scrut_tys + ctor_to_type, deferring
per-pattern instantiation indices to a future plan if needed.
Pod-verify green: cargo check workspace, fmt, clippy on runtime and
compiler (one crate at a time), runtime tests, all 309 compiler lib
tests pass (`--test-threads=1`).
Code Review — Request changesThorough work overall — the three-step pipeline is sound, mangling format is clearly specified, and the deviations are well-documented. 28 new unit tests, all 309 lib tests pass locally. But there are two correctness time bombs and several smaller issues that should be addressed before merge. Must fix1. Fragile call-site ctor lookup — depends on parser span coincidence
Expr::Call { callee, args, span } => {
let new_callee = self.rewrite_expr(callee, subst);
// ...just delegates to Ident's lookup — never checks ctor_sites.get(call_span)
}This works today only because the parser sets Risk: A reasonable parser change widening the Call span to cover the parens would silently break every positional ctor call of a generic type. Codegen would panic at Fix: Either (a) make 2. Silent collisions in Ty::Var(_) => \"VarUnresolved\".to_string(),
Ty::Fn(_) => \"Fn\".to_string(),The doc comment claims this makes failures "visible" — but two distinct Fix: Should fix3. Pattern sub-ctor handling is more limited than the deviation claims The deviation document says nested same-type ctors work via let type_args: Vec<Ty> = match (owning_type.as_deref(), scrut_ty) {
(Some(t), Some(Ty::User(scrut_name, args))) if t == scrut_name => args.clone(),
_ => Vec::new(), // ← any nested pattern of a different generic type silently un-mangled
};For The deviation's "acceptable in v1" claim is true only because v1's surface is too shallow to construct such a case in tests. Fix: either (a) tighten the deviation language to say nested-different-type generic patterns are unsupported, or (b) add an 4. Consistent with the "rows aren't monomorphized" deviation, but worth a one-line comment pointing to that deviation so future readers don't think it's a substitution bug. 5. Test coverage gaps
Nits (defer or fold in)6. 7. 8. What's good
Bottom lineIssues #1 and #2 are correctness time bombs even if not exercised today — please fix both. Issue #3 is a documented limitation but should either be tightened or guarded loudly. The rest can land in a follow-up if you prefer. |
Review verdict: request changesThe implementation is fundamentally sound — three-step pipeline matches the plan, BFS reachability + dedup-by-mangled-name is correct, the load-bearing reproducibility property holds (script green), 309 compiler-lib + 36 runtime tests pass, CI green on both hosts. The agent's deviation entries are well-reasoned. But there's one real soundness hazard in the mangling format that must close before Task 51 lands Must-fix before merge1. Mangling format is not unambiguous (real soundness hazard)The module docs at Construct two distinct types in the same program:
Today this is unreachable (no Plan A3 example uses single-underscore identifiers), but Task 51's Pick one fix:
I'd take the separator change. It's mechanical, the format becomes structurally unambiguous, and Plan A2 has already established Update 2. Ambiguous polymorphism reaches mangling without an error
No test asserts what happens. No Pick one fix:
The E-code is the right call. Sigil's whole thesis is that wrong-looking code should fail loudly; silent placeholder-mangling is the opposite. 3. Strengthen test coverageFour gaps the agent's tests don't pin:
4. Deviation #2 acceptance criterion is wrong shape
Reframe the entry: status Out of scope (genuine future-task work)
What's good (preserve through the rewrite)
ContinuationAfter items 1–4 land on this branch and CI is green:
Expect ~10 minutes for the spot-check re-review. Plan B Stage 5 review checkpoint sits between Tasks 52 and 53; this PR closes Task 49 once items 1–4 land. Tasks 50–52 (color inference, generic_map example, P16/P17 prompts) come next. |
Follow-up: all four items are blocking. Address them all on this PR.Prior comment (4318163326) listed four items under "Must-fix before merge." To remove any ambiguity: All four must land on this PR. None defer to a follow-up PR or a future task.
Why the explicit promotionSame pattern as PRs #14 / #15: "deferred" without a tracked closure path is how debt accumulates. Item 1 is the most urgent — once Task 51's Items 2 and 3 are catch-up: the discipline that's been holding through Plans A1–A3 says "every user-reachable diagnostic has an E-code" and "every claimed invariant has a test that pins it." Both currently fail for ambiguous-polymorphism and for the four named gaps. Item 4 is bookkeeping: an open-ended deviation entry that's actually a permanent design choice should be marked as such, so future readers don't waste time looking for the "v2 closure path." ContinuationPing when all four land and CI is green. Spot-check re-review (~10 min) verifies:
Plan B continues with Tasks 50–52 after this lands. |
…tests Addresses PR #16 review (boldfield): - Comment 4318163326 #1 (mangling format ambiguity, soundness hazard): switch separator from `_`/`__` to `$`/`$$`. `$` is lexer-rejected per Plan A2's `$lambda_N` precedent, so user identifiers cannot contain it. `type List_Option[A]` instantiated at `Int` no longer collides with `List[Option[Int]]` — the former mangles to `List_Option$$Int`, the latter to `List$$Option$Int`. Codegen's `mangle_user_fn` rewrites `$` to `__` for ELF/Mach-O compat; AST-level uniqueness is preserved. - Comment 4318163326 #2 (E0132 ambiguous polymorphism): new diagnostic in error catalog plus end-of-typecheck check. After resolving pending instantiations through `subst.apply_ty`, any type-arg that is still `Ty::Var(_)` AND not an outer-fn's generic-param id (i.e. not legitimately deferred to clone time) fires E0132 with the affected param name. Three tests pin the diagnostic: fires on unconstrained `nothing[A]()`, doesn't fire on constrained `id[A](42)`, doesn't fire on outer-fn-bound `use_id[B](y) { id(y) }`. - Comment 4318161359 #2 (canon_ty placeholder collision): the `Ty::Var(_)` and `Ty::Fn(_)` arms in `canon_ty` and `ty_to_type_expr` now `unreachable!` instead of emitting placeholder strings that two distinct vars would collide at. Reachable only when the upstream invariant breaks — E0132 catches ambiguous-poly cases at typecheck, and `Ty::Fn` requires a `TypeExpr::Fn` surface that v1 deliberately omits. - Comment 4318161359 #1 (call-site ctor lookup span fragility): the `Expr::Call` arm now explicitly checks `ctor_sites.get(span)` by the Call's span before falling through to the recursive callee rewrite. No longer relies on the parser's incidental `Call.span == callee_ident.span` for ctor calls. - Comment 4318161359 #3 (pattern sub-ctor unimplemented! guard): `rewrite_pattern`'s `Pattern::Ctor` arm now `unreachable!`s when the ctor's owning type is generic but doesn't match the scrutinee's User type. v1 surface can't construct this case; surfacing loudly closes a silent-miscompile path. - Comment 4318163326 #3 (test additions/strengthening): - `match_against_generic_scrutinee_rewrites_pattern_ctors`: strengthened to assert *both* `Nada$$Int` and `Just$$Int` appear, not just `>= 1`. - `recursive_generic_type_termination_one_clone_per_arg_tuple`: new test pinning that a recursive generic type (`type List[A] = | Nil | Cons(A, List[A])`) produces exactly one `List$$Int` clone given recursive use. - `self_recursive_generic_fn_terminates`: new test on a self-recursive generic fn. - `end_to_end_nested_generic_clone_name_matches_pinned_format`: new e2e variant of `mangle_fn_handles_nested_generics` that runs through the full pipeline. - `ctor_call_site_callee_ident_is_rewritten`: new test asserting the Call's callee Ident text is the mangled name post-mono (closes Comment 1 #1's gap). - `generic_fn_calling_generic_fn_resolves_var_chain`: new test exercising the `Ty::Var → Ty::Var → concrete` chain through `Substitution::by_var`. - `two_fn_clones_with_nested_generic_type_args`: new test confirming two clones for two distinct nested instantiations. - `canon_ty_disambiguates_underscore_named_user_types`: new unit test pinning the soundness fix. - Comment 4318163326 #4 (Deviation #2 reframe): effect-rows-not- monomorphized is reframed from open-ended ("v2 might revisit") to permanent v1 design choice; informational. Closure-path = none. - Comment 4318161359 #5 (apply_to_ty Ty::Fn comment): added inline comment pointing at Deviation #2 so the unchanged `effect_row_var` copy isn't mistaken for a substitution bug. - Pattern::Var nullary-ctor promotion in monomorph: typecheck's Plan A3 task 38.3 nullary-ctor promotion treats `Pattern::Var(name)` semantically as `Pattern::Ctor` when the name matches a Unit variant of the scrutinee's user type, but doesn't rewrite the AST. Monomorph now performs that rewrite explicitly (with mangled name `Nada$$Int` etc) so codegen's `ctor_index` lookup succeeds post-mono. This was the root cause of the strengthened match-pattern test failing initially. DEVIATIONS Deviation #1 updated to reflect the `$` separator and the hardening rationale. Deviation #2 reframed. Test deltas: 309 → 319 compiler lib tests (+10). - 28 → 35 monomorphize tests (+7 new). - 281 → 284 typecheck tests (+3 E0132 tests). Pod-verify green: cargo check workspace, cargo fmt --check, cargo clippy -p sigil-runtime -p sigil-compiler --all-targets -D warnings, cargo test -p sigil-runtime --lib, scripts/check-no-interior-pointers.sh.
Review fix-up — all 4 must-fix items + 3 should-fix items addressedPushed Comment 4318163326 (round-2 review) — all four must-fix
Comment 4318161359 (round-1 review) — also addressed
Plus one root-cause fix surfaced during testing
Test deltas
Pod-verify green: cargo check workspace, fmt --check, clippy -p sigil-runtime -p sigil-compiler --all-targets -D warnings, runtime tests, no-interior-pointers, discipline greps. Polling CI now. |
Re-review verdict: mergeAll four directed must-fix items landed correctly. CI green on all 4 jobs (ubuntu-24.04 + macos-14, both regular and cold-checkout). 319 compiler-lib tests pass (309 → 319, +10). Per-item verification
What the agent did beyond the directivesTwo notable bonuses, both net-positive:
Mechanical
ContinuationMerge whenever. Plan B Stage 5 review checkpoint sits between Tasks 52 and 53; this PR closes Task 49 cleanly. Tasks 50 (color inference + Task 51 is the load-bearing follow-up — it's the first time user-authored generic syntax flows through the full pipeline. Worth a fresh round of pod-side reproducibility and CI scrutiny when it lands. |
Re-review (after fix-up commit
|
| Issue | Status | Notes |
|---|---|---|
| #1 Call-site ctor span fragility | ✅ Fixed | Expr::Call arm now explicitly checks ctor_sites.get(call_span) before delegating; new test ctor_call_site_callee_ident_is_rewritten pins it |
#2 canon_ty placeholder collisions |
✅ Fixed | Both Ty::Var and Ty::Fn arms unreachable! in canon_ty and ty_to_type_expr; backstopped by new E0132 typecheck diagnostic |
| #3 Pattern sub-ctor handling | ❌ Regressed | See below |
#4 apply_to_ty Ty::Fn comment |
✅ Fixed | Pointer to deviation #2 added |
| #5 Test coverage | ✅ Mostly | 7 new mono tests + 3 E0132 tests; effect_row_var clone-preservation test still missing |
| #6–8 Nits | ⏭ Deferred | Acceptable |
New regression — Issue #3 fix breaks normal programs
The new Pattern::Ctor arm (monomorphize.rs:917) panics with unreachable! whenever a pattern's ctor belongs to a generic type that doesn't match the scrutinee's outermost type. The deviation update claims:
v1 surface cannot construct this case (every pattern ctor must match the scrutinee's type)
This is false. v1 surface fully supports nested ctor patterns — parser.rs::parse_pattern recurses into positional fields. I confirmed by running this program through the pipeline:
```sigil
type List[A] = | Nil | Cons(A, List[A])
type Option[A] = | None | Some(A)
fn main() -> Int ![] {
let opt: Option[List[Int]] = Some(Cons(1, Nil));
match opt {
None => 0,
Some(Cons(h, t)) => h,
Some(Nil) => 99,
}
}
```
Result: panic at monomorphize.rs:917 with the "v1 surface cannot construct this case" message. The trace is:
- Outer pattern
Some(Cons(h, t))—Somebelongs toOption, matches scrut → mangles toSome\$\$List\$Int✓ - Inner pattern
Cons(h, t)—Consbelongs toList(generic), butscrut_tyis stillOption[List[Int]](the outer scrut, propagated unchanged via the recursiverewrite_patterncall) - Falls into the new
(Some(_), _) if owning_decl_is_genericarm →unreachable!
The original code silently fell through to Vec::new() (no mangling) — which itself was wrong (codegen would later panic on missing Cons in ctor_index). The fix replaced silent miscompile with a hard panic, but on a legitimate program that any user pattern-matching against Option[List[T]], Result[T, E], Tree[T], etc. would write.
This is a release-blocker even though no current monomorphize test exercises it (because the test suite doesn't include nested generic pattern matches yet).
Suggested fix
The correct path is to thread per-sub-pattern type-args through the ctor lookup. Sketch:
```rust
// Inside Pattern::Ctor arm, after computing type_args for the
// CURRENT pattern, recurse with the variant's field types
// substituted under those type_args, so an inner pattern's
// scrut_ty reflects what type its parent's variant introduces.
let inner_scrut_ty: Option = compute_field_ty_for_inner_pattern(
owning_type, name, field_index, &type_args
);
self.rewrite_pattern(sp, &inner_scrut_ty)
```
This needs the type registry + variant index + field index to recover the field's declared TypeExpr, then resolve it under the per-pattern subst — same machinery as rewrite_type_expr. Not trivial, but the data is all there.
If the team wants to defer the fix, the minimum acceptable interim is:
- Drop the
unreachable!and revert to the original fall-through (still wrong, but no worse than pre-fixup). - Remove the false claim from the deviation.
- Add a typecheck-level diagnostic that rejects nested generic ctor patterns at parse time, so users get a clean error instead of a downstream codegen panic.
- Add a failing test marked
#[ignore]documenting the limitation.
But the right fix is to actually thread the types — it's not a v2-scope problem, the inputs are all present in the original program.
What's improved
- The
\$/\$\$separator change is a real soundness win, not just cosmetic —type_seenwould have collapsedList_Option[Int]andList[Option[Int]]under the underscore scheme. New testcanon_ty_disambiguates_underscore_named_user_typespins it. - E0132 is well-scoped — the outer-fn-var-id allowlist correctly distinguishes "deferred to clone time" from "genuinely unconstrained". Both positive and negative tests included.
unreachable!arms incanon_ty/ty_to_type_exprclose the silent-collision path completely, paired with E0132 catching the upstream condition.- The
Pattern::Varnullary-ctor promotion fix is a separate bug found during the fix-up — good catch (a pre-existing latent bug in mono not covered by tests, surfaced once the strengthened pattern test was added).
Still missing
effect_row_varclone-preservation test. The deviation claims clones retaineffect_row_var: Some(_)and pass the codegen walker. No test pins this. With the walker rule relaxed, an accidental future change that dropseffect_row_varon the clone wouldn't be caught.- Nested generic pattern test to anchor whatever resolution lands for the regression above.
Recommendation
Request changes — the regression is a release-blocker. Either implement the per-sub-pattern type-arg threading (the right fix), or revert to the silent fall-through and update the deviation truthfully (the interim). Approve the \$ separator change, E0132, and Issues #1/#2/#4 fixes — those are correctly resolved.
Closes the regression flagged in PR #16 review (boldfield, comment 4318208... — "Re-review (after fix-up commit 994f083) — Request changes"): The previous fix-up's `Pattern::Ctor` `unreachable!` arm panicked on a legitimate v1 program — `match opt: Option[List[Int]] { Some(Cons(h, t)) => ... }`. The deviation's claim that "v1 surface cannot construct this case" was empirically false: nested ctor patterns are fully supported by `parser.rs::parse_pattern`'s positional-field recursion. Any user pattern-matching against `Option[List[T]]`, `Result[T, E]`, `Tree[T]`, etc. would hit the panic. Proper fix: thread per-sub-pattern field types through `rewrite_pattern` so an inner pattern sees the field's resolved Ty (not the outer scrutinee's Ty) as its scrut_ty. Implementation: - New `Monomorphizer::variant_field_types(owning_type, ctor_name, type_args)` resolves the variant's field TypeExprs under the owning type's generic-param substitution. Returns a `VariantFieldTys` enum matching the variant's shape (Unit / Positional / Record). - New `ty_from_type_expr_under_subst(te, subst)` mirrors typecheck's resolution but without the error-collecting context — well-formed input is guaranteed by typecheck having accepted the program. - `Pattern::Ctor` arm computes the field types once per pattern and passes the per-sub-pattern Ty to the recursive `rewrite_pattern` call. The `unreachable!` arm is removed; the false claim is gone. Test deltas: 319 → 321 compiler lib tests (+2): - `nested_generic_ctor_pattern_threads_inner_scrut_ty`: the reviewer's exact reproducer (Option[List[Int]] with `Some(Cons(h, t))` arms). Asserts outer ctors mangle to `Some$$List$Int` / `None$$List$Int` and inner ctors mangle to `Cons$$Int` / `Nil$$Int`. Closes the regression directly. - `effect_row_var_preserved_on_clone_passes_codegen_walker`: pins the row-preservation invariant (Deviation #2). With the codegen- entry walker rule relaxed in PR #16, an accidental future change that drops `effect_row_var` on the clone wouldn't be caught without this test. Reviewer round-3 "Still missing" closure. DEVIATIONS Deviation #3 updated: - Title gains "+ per-sub-pattern field-type threading" suffix. - New section explaining the threading mechanism. - "Round-3 review correction" subsection documenting the false claim and its closure. - Closure path tightened: v1 surface fully covered; only Plan C+ GADT-style refinement would reopen.
Round-3 fix-up — regression closed via per-sub-pattern field-type threadingPushed The regressionConfirmed: the previous fix-up's Proper fix — per-sub-pattern field-type threadingImplemented exactly as the reviewer suggested. Inside Mechanism:
Trace of the reviewer's reproducer post-fix:
Tests added (Round-3 closure)
DEVIATIONS updateDeviation #3 (
Test deltas
Pod-verify green: cargo check workspace, fmt --check, clippy -p sigil-runtime -p sigil-compiler --all-targets -D warnings, runtime tests, no-interior-pointers, discipline greps. Polling CI. |
…ts (#18) * [Task 51] examples/generic_map.sigil — first user-authored generics e2e First user-authored generic syntax to flow through the full Sigil pipeline (lex → parse → resolve → typecheck → elaborate → monomorphize → color → codegen). PR #17's reviewer flagged this as the canonical reproducibility checkpoint for PR #16 (Task 49)'s `$$` mangling format: prior tests stop at the monomorph-IR level, and prior end-to-end examples declare no generic parameters. What the example exercises: - `type List[A] = | Nil | Cons(A, List[A])` — a self-recursive generic type. Monomorphization dedups via `type_seen` so each instantiation produces exactly one clone (`List$$Int`, `List$$String`). - `fn map[A](xs: List[A]) -> List[A] ![]` — structure-preserving traversal, semantically `map id`. Sigil v1 surface has no `TypeExpr::Fn`, so the design doc's canonical higher-order `map[A,B,e](xs, f) -> List[B]` cannot be expressed; this is the closest Plan B can express. - `fn length[A](xs: List[A]) -> Int ![]` — generic fold returning a concrete `Int`. Pairs with `map` to produce a verifiable scalar. - `main` instantiates each generic at `Int` and `String` in the same program, producing four monomorph clones (`map$$Int`, `map$$String`, `length$$Int`, `length$$String`) plus mangled ctor sites (`Nil$$Int`, `Cons$$Int`, `Nil$$String`, `Cons$$String`). - Pattern matching on generic ctors: scrutinee `Ty::User(_, args)` propagates per-sub-pattern field types so `Cons(h, t)` correctly types `h: A`, `t: List[A]` against the scrutinee's instantiated `A` (closes Task 49 round 3's regression reproducer). Two e2e tests: - `generic_map_example_prints_3_and_2` — full lex → parse → typecheck → elaborate → monomorphize → color → codegen → run pipeline. Asserts stdout exactly `"3\n2\n"`. Length values deliberately differ (3 vs 2) so a copy-paste error between the two list literals would surface as a length mismatch rather than pass silently. - `generic_map_dump_color_all_native` — verifies Task 50's per- monomorph color inference classifies all four monomorphs plus `main` as native (bodies have row `![]`, `main` has `![IO]`, no `perform` to a non-IO effect). Pins each expected mangled name independently so a mangling-format slip on any single one (e.g. `map$$String`) lands on a directed assertion rather than an opaque overall-string diff. Smoke + reproducibility scripts updated to include the new example so the per-host byte-stable-binary invariant covers the full generic pipeline. Plan A2's `examples/higher_order.sigil` doc comment notes that `TypeExpr::Fn` was deferred to Plan A3; A3 did not actually deliver it (deferred again to v2 per scope). The plan's "generic map" text therefore describes what Plan B's surface CAN express, not the design doc's canonical higher-order signature. * [Task 52] Validation prompts P16 + P17 — generic id, compose Adds the two remaining Stage-5-feasible entries to the prompt bank: - **P16 — generic identity at Int and String**. Fully expressible in Plan B's surface (no `TypeExpr::Fn` required). Asserts oracle output `"42\nsigil\n"`. Pins the discriminating contract that Algorithm W's fresh-var-per-call instantiation plus Task 49's reachability-bounded specialization produce exactly two monomorph clones (`id$$Int`, `id$$String`) — not one polymorphic body, not three from double-counted call sites. - **P17 — generic compose applied across types**. Requires `TypeExpr::Fn` surface syntax for the higher-order parameters, same as P09 / P10. P09/P10 deferred this to Plan A3; A3 did not deliver it. P17 follows the same deferral pattern: oracle is graded against "program compiles" until function types ship. P17's distinguishing feature versus P10 is `A != C` (the result type genuinely travels through composition rather than absorbing into the trivial endo-functor case). Plan B Task 61 will add P18 / P19 / P20 (Raise-based parser, State-threaded counter, multi-shot Choose). Bank is now 17/20; remaining three land in Stage 6. * [Task 51-52] PROGRESS: flip pending-ci entries → done; record Tasks 51 + 52 Standard PROGRESS hygiene at Stage 5 closeout. Per Plan A2 / Task 49 precedent, status flips for prior PRs go in the next task's PR. - Tasks 4.5.1-4.5.5 (Stage 4.5 scaffolding) and Plan A3 carryover items: flipped `done-pending-ci` → `done` with squash-merge hash `01e0e13` (PR #14). All four CI jobs were green at merge. - Task 47 (parser): `done-pending-ci` → `done` with `01e0e13` (also PR #14). - Task 48 (HM unifier): `done-pending-ci` → `done` with `70756de` (PR #15). - Task 50 (color inference): `done-pending-ci` → `done` with `82e0f97` (PR #17). Round 2 / round 3 fixup hashes preserved in the activity log; the squash-merge hash is the canonical post-merge anchor. - Task 49 (monomorphization): notes corrected from the round-1 `_`/`__` mangling separator to the final `$`/`$$` format pinned by round-2 fixup `994f083`. Status hash `[981ec93]` left as-is (set by prior session's `531bcfe` flip commit, recorded the branch-commit hash convention rather than the squash-merge `858b4c2`). - Task 51 entry: `done-pending-ci` with commit pointer `[HEAD]` describing the new `examples/generic_map.sigil`, two e2e tests, and smoke + reproducibility script updates. - Task 52 entry: `done-pending-ci` with commit pointer `[HEAD]` noting that P17 follows the same `TypeExpr::Fn`-deferred grading as P09 / P10 until first-class function types ship. Stage 5 review checkpoint remains pending — the next step before Stage 6 begins is Brian's review of row-unification, let- generalization, color decisions, and monomorphization naming determinism on adversarial inputs. * [Task 51 fix-up] typecheck: cross-arm body consistency uses unify_ty, not Eq CI failure on PR #18 surfaced an unprecedented Stage 5 typechecker bug: `check_match` compared arm body types via structural equality (`first != t`) instead of attempting unification. With pre-Plan-B programs (every Plan A1/A2/A3 example) arm body types are concrete primitives or already-resolved user types, so the equality check held coincidentally. With generic-fn-internal matches whose arms return a generic-typed value (e.g. `fn map[A](xs: List[A]) -> List[A] { match xs { Nil => Nil, Cons(h, t) => Cons(h, map(t)) } }`), each ctor resolution allocates a fresh `List[?N]` user instance with a distinct fresh-var id — `Nil` resolves to `List[?6]` and `Cons(...)` to `List[?5]` — and the equality check reports `?6 != ?5` even though the two trivially unify. Fix: snapshot `self.errors.len()`, call `unify_ty(first, t, &arm.span)`, and on failure truncate any internal E0044 "type mismatch" errors `unify_ty` itself pushed and emit E0065 with the arm-specific phrasing so the user-facing diagnostic surface stays unchanged. On success the unifier binds the fresh vars together (directly or transitively through the function's declared return type), and the match expression's overall type is whichever representative falls out of `deref(first)`. Two regression tests: - `generic_match_returning_generic_unifies_arms` — pins the exact reproducer from `examples/generic_map.sigil`'s `map` fn. Pre-fix this fails with E0065; post-fix the program typechecks clean. - `match_arm_type_unification_still_rejects_real_mismatch` — regression guard that the new unify-based path still emits E0065 on a genuine Int-vs-String arm-body mismatch (i.e., the fix doesn't accidentally silence legitimate type errors). The long-standing `match_arm_types_must_unify_is_e0065` covers the same shape; this new test pins the contract specifically inside a generic-fn surface so the discriminating path is tested. Plan B classification: bug fix in Task 48 surface (HM unification), discovered by Task 51's example. Not a deviation — the plan specified HM unification end-to-end, the implementation just had a coincidentally-passing structural-Eq check. No PLAN_B_DEVIATIONS entry; PROGRESS notes for Task 51 will be updated with this fix-up in the PR body. * [Task 51-52 review fixups] address PR #18 reviewer items Two review comments on PR #18: an initial code-review pass, then a revised verdict ("request changes") that supersedes the first on the contentious subst-rollback question. This commit addresses every non-superseded item. == From the revised verdict (Comment 2) == 1. **PROGRESS Task 49 SHA normalized.** `[981ec93]` (branch-tip hash) → `[858b4c2]` (squash-merge SHA on `main`). Matches the convention every other prior task entry uses; closes the bookkeeping inconsistency the reviewer flagged. 2. **3-arm generic match regression test.** New `three_arm_generic_match_propagates_subst_across_all_arms` pins that cross-arm unify propagates substitutions across MORE than two arms. Each `W(x)`-style ctor allocates a fresh `Wrap[?N]` user instance, so arm 1 → `Wrap[?A]`, arm 2 → `Wrap[?B]`, arm 3 → `Wrap[?C]`. The cross-arm check unifies sequentially: arm1↔arm2 binds `?A := ?B`; arm1↔arm3 must then unify the deref'd representative against `Wrap[?C]`. A naive 2-arm-only check would miss a propagation bug on the third arm. 3. **Subst-pollution-pinning regression-guard.** New `subst_pollution_from_partial_unify_surfaces_at_call_site` is the discriminating test against a future "fix" that adds subst snapshot/restore on `unify_ty` failure. The program: `foo[A, B]` has a match where arm 1 returns `p: Pair[A, B]` and arm 2 returns `Pair("x", 3): Pair[String, Int]`. Cross-arm unify SUCCEEDS by binding `A := String`, `B := Int` — the body itself typechecks clean, but the bindings persist in the global subst. `foo`'s scheme is now over-constrained; `caller`'s `foo(Pair(1, 2), 0)` instantiates with `A := Int, B := Int` and the over-constraint surfaces as E0044 (concrete mismatch) + E0132 (ambiguous polymorphism: scheme generalization sees A and B already bound). With rollback: arm 2's bindings discarded, foo stays generic, caller accepts `Pair[Int, Int]` cleanly with NO errors. Test pins the cascade by asserting BOTH E0044 AND E0132 appear, AND that the body's match itself has no E0065 (cross-arm unify succeeded; a body-level error would mean a regression in the opposite direction). 4. **Perf-floor instability flagged for Task 60 in PLAN_B_DEVIATIONS.md.** `fib_perf_example_prints_6765_under_50ms` and `tree_example_prints_32767_under_500ms` exceed wall-clock floors by ~4x in debug profile (~200ms each on aarch64-apple-darwin). Pre-existing on `main`, not Task 51's fault. Surfaced as a new VERIFICATION DEBT entry so Task 60 doesn't have to rediscover. Three resolution options enumerated; reviewer prefers "platform-aware tightening or release-only mode." == From the initial review (Comment 1, where not superseded) == 5. **Issue 2: selective error truncation on cross-arm unify failure.** Pre-fix `self.errors.truncate(pre_unify_errors)` removed every error `unify_ty` pushed, including E0126 (occurs check) and E0127 (row occurs) — which name a real soundness problem the generic E0065 wouldn't capture. Now drains errors past baseline, keeps non-E0044 codes (occurs-check kinds), drops E0044 (replaced by arm-specific E0065). User-facing diagnostic surface unchanged on the common path; correctness improved on the edge. 6. **Issue 3: P16 e2e test pins the prompt-bank claim.** New `p16_generic_id_at_int_and_string_oracle` runs P16's source through the full pipeline. Asserts (a) stdout exactly `"42\nsigil\n"` (P16 oracle), (b) `--dump-color` produces exactly 3 monomorph lines `{id$$Int, id$$String, main}` — a regression that double-counts call sites would surface as a 4th line; an unmonomorphized polymorphic body would surface as a bare `id native`. Makes the PR description's "exactly two clones" claim substantive instead of aspirational. 7. **Issue 4: P17 surface-syntax-pending pin.** New `p17_compose_source_rejects_until_typeexpr_fn_ships` writes the exact P17 prompt source and asserts the front end rejects it (specific error code is implementation detail and could shift between parser- and typecheck-level rejections; the test just asserts non-success). Once `TypeExpr::Fn` ships, the test should be inverted to assert success against the prompt's stdout oracle. 8. **Issue 6: unused `h` binding.** `Cons(h, t) => 1 + length(t)` in `length`'s body → `Cons(_, t) => 1 + length(t)`. `h` was never used; `_` matches Sigil idiom and avoids relying on the unused-binding policy. `map`'s body still uses `h`, which is fine because it's used in the result. == Skipped == - **Issue 1 (subst rollback):** superseded by Comment 2's verdict that current no-rollback behavior is HM-correct. Item 7 above pins that contract via the regression-guard test. - **Issue 5 (helper extraction):** minor style, single call site, not worth extracting per repo convention against speculative abstractions. - **Issue 7 (reproducibility note):** informational only; no action requested. == Test counts == 348 → 351 compiler lib tests (+3 new typecheck tests). E2E gains 2 new tests (P16 oracle + dump-color, P17 surface-pending). Pod-verify green; full test suite deferred to CI.
Addresses all critical/important/minor items from PR #21 review across the structural review pass and the context-aware must-fix pass. ## Critical / GC rooting (M1, Critical #1, #2) - Add `GC_add_roots`, `GC_remove_roots`, `GC_gcollect`, `GC_register_my_thread`, `GC_unregister_my_thread`, `GC_allow_register_threads` to gc.rs externs. - `sigil_gc_init` registers the calling thread's `HANDLER_STACK` cell and `ARENA` storage range with `GC_add_roots`. `cfg(not(test))` so test threads don't auto-register (auto-registration leaks ranges across cargo test's per-test thread teardowns). - `register_*_for_calling_thread` / `unregister_*_for_calling_thread` pairs in handlers.rs and arena.rs return the registered range so test infrastructure can symmetrically remove it. - `GcThreadEnrolment` in test_support.rs is now an RAII guard that enrols the thread, registers both TLS roots on Acquire, and removes both roots + unregisters the thread on Drop. - gc.rs comment at the GC_malloc/atomic selector clarifies the bitmap's v1 effect (binary signal) vs the per-bit precision being v2-forward-compat metadata. - arena.rs and handlers.rs module docs gain "GC reachability" sections documenting the rooting model + the conservative-scan pinning tradeoff for arena bytes. ## Critical #3: Vec::reserve panic across FFI - arena.rs `ensure_capacity_or_abort` uses `try_reserve_exact` and aborts on Err rather than panicking. Backing storage type changed to `Vec<u64>` for guaranteed 8-byte alignment (Critical #5). ## Important fixes - #4: `round_up_to_align` uses `checked_add`; aborts cleanly on overflow. - #5: `Vec<u64>` backing — natural u64 alignment, no system-allocator dependency. Test asserts absolute 8-byte alignment of every return. - #6: `sigil_handler_frame_new` explicitly zeros the variable-length arms region with `ptr::write_bytes` rather than relying on the Boehm-allocator-zero contract. Comment updated. - #7 / M5: `sigil_perform` bound-checks `args_len + 2` against `MAX_INLINE_ARGS` at entry, naming `effect_id` / `op_id` in the abort message. `sigil_next_step_call` does the same against `arg_count`. Trampoline-side check kept as defense-in-depth. - #8: `MAX_INLINE_ARGS = 32` promoted to `pub const` at the `handlers` module top. The trampoline's stack-resident `args_buf` uses the same constant. - #9: arena overflow has a `#[ignore]`-d test (`arena_overflow_aborts`) + a test-only `force_capacity_for_test` hook for manual verification via `cargo test -- --ignored`. - #10: new `perform_walks_three_deep_prev_chain_to_match` test — 3 frames pushed, `sigil_perform` walks past 2 unrelated outer frames to reach the deepest matching frame, depth-counter delta = 3. - #11: marked `sigil_arena_reset` and `sigil_handle_pop` `unsafe extern "C"` for FFI consistency. Test-side callers updated. - #14: `sigil_handle_push` debug-asserts `(*frame).prev.is_null()` to catch double-push at the push site, not later. `sigil_handle_pop` clears `prev` so repush of the same frame in a loop is supported. - #15: `payload_words` cast in `sigil_handler_frame_new` uses `try_into` with abort fallback documenting the invariant. - #16: INVARIANT comment near `Header::new` call about Boehm consuming bitmap-only. ## M2: GC stress tests Three new tests verifying the rooting contract holds under forced GC: - `handler_frame_survives_forced_gc_while_pushed` — push frame, alloc-spam to overwrite stack aliases, GC_gcollect, perform succeeds. - `closure_in_handler_arm_slot_survives_gc` — closure in arm 0's closure_ptr slot survives GC because the frame is rooted via HANDLER_STACK and the bitmap selects GC_malloc (conservative scan). - `closure_in_next_step_survives_gc_via_arena_root` — closure stored in arena via NextStep::Call survives GC because the arena range is rooted. All three are `#[ignore]`-gated with explanatory text — Boehm thread enrolment composes poorly with cargo test's per-test thread teardowns even with explicit `GC_unregister_my_thread`. Each passes in isolation; manual verification via `cargo test -- --ignored survives_gc`. ## M3: MAX_HANDLER_ARMS bumped to 14 Off-by-one in the original doc-comment at `handlers.rs:79-83` (claimed "bit 31 corresponds to arm 13"). Bumped cap to 14 so `i ∈ [0, 13]` fully utilises the 32-bit bitmap with bit 31 set at i=13. Updated the doc-comment and the deviation entry to match. ## M4: Boundary-arity test `handler_frame_dispatch_at_max_arm_count` allocates a frame with `MAX_HANDLER_ARMS` op-arms, sets every arm to a real handler fn, pushes, performs against the LAST arm (op = MAX_HANDLER_ARMS - 1), and verifies dispatch succeeds. Exercises the full alloc + bitmap + perform path against the cap. ## M6: Counter semantics documented Module-level docs clarify `HANDLER_WALK_COUNT` increments per perform *attempt* (regardless of match), `HANDLER_WALK_DEPTH_SUM` sums frames inspected including matching frame on a hit OR full stack depth on unhandled-effect abort. Average walk depth = SUM / COUNT. ## M7: Arena reentrancy contract documented Module-level docs spell out the `RefCell` no-reentrancy invariant: the trampoline upholds it by reading dispatch info into stack locals and calling `sigil_arena_reset` BEFORE invoking the carried `cps_fn`. Plan B v1 has no path that nests `sigil_arena_alloc` calls within a single trampoline iteration; codegen (Task 55) preserves this by emitting a single `NextStep` allocation per cps_fn return. ## M8: Deviation entries updated - New `[DEVIATION Task 56] Runtime TLS roots: register/unregister via Boehm GC_add_roots` covering the rooting contract + test-mode caveat + conservative-scan pinning tradeoff. - New `[DEVIATION Task 56] MAX_INLINE_ARGS = 32 cap with bound-check at perform site` documenting the cap, where it's checked, and the Task 55 codegen impact. - New `[DEVIATION Task 56] Vec::reserve panic-on-OOM does NOT cross the FFI boundary` documenting the try_reserve_exact swap. - New `[DEVIATION Task 56] Arena alignment via Vec<u64> backing storage` documenting the alignment guarantee. ## M9: Tagged-vs-raw closure point updated The `[VERIFICATION DEBT] Tagged-vs-raw ABI contract enforcement` entry's closure point updated from "Task 56" to "Task 55 (when codegen lowers the first Int-typed user arg into args_buf)" with a one-line rationale: Task 56's runtime structs hold `*mut u8` pointers and raw `u64` slots, not Int-typed slots; the newtype contract lands when codegen does. ## Arena reset zero-fill Added: `sigil_arena_reset` zeros the `[start, start + len*8)` region before clearing `len`. Required because `GC_add_roots` covers the full arena capacity range — without zeroing, stale pointers from prior iterations or initial garbage from `try_reserve_exact` alias freed Boehm blocks during conservative scan, segfaulting collections. Cost is a `len*8`-byte memset per reset, typically tens of bytes per trampoline iteration. ## Test deltas 61 passing + 4 ignored (arena_overflow_aborts + 3 GC stress tests) prior pass-rate stays green. Pod-verify clean.
Plan B Task 49. Replaces Plan A1's identity stub in
compiler/src/monomorphize.rswith a full reachability-bounded whole-program specialisation pass.Landed in this PR
Three-step pipeline
1. Instantiation capture during typecheck. New
pending_call_instantiationsandpending_ctor_instantiationsonTctrack fresh-var ids perExpr::Identuse of a top-level fn (via newinstantiate_with_vars) and per ctor use of a generic user type (via newfresh_user_instance_with_subst_and_ids). End-of-typecheck resolves them throughsubstto produce concreteVec<Ty>per use site, exposed asCheckedProgram::call_site_instantiationsandCheckedProgram::ctor_site_instantiationskeyed by use-site span.CheckedProgram::fn_schemesis also exposed publicly so monomorphization can map fresh-var ids back to declared generic-parameter names when descending into a generic body.2. Post-elaborate worklist BFS rooted at
main.program_has_genericsearly-returns for non-generic programs (every Plan A1/A2/A3 program), preserving zero overhead. OtherwiseMonomorphizer::rundrains fn and type worklists to fixpoint, cloning each generic decl per (name, type-args) tuple. Unreachable generics are dropped — keeping binary size tight per the plan body's reachability-bounded clause. Dedup by mangled name (Ty doesn't impl Ord and adding it requires choosing one among many for FnSig shapes, which is out of scope here).3. AST rewrite pass. Every cloned body has its
TypeExpr::Applyand generic-paramTypeExpr::Namedresolved to concrete primitives or mangled user-type names. Every fn-call callee Ident rewritten tomangle_fn(name, args). Every ctor site rewritten tomangle_ctor(name, type_args). Pattern ctors rewritten usingmatch_scrut_tys+ a newctor_to_typereverse index built from the original program's TypeDecls.Canonical mangling
Formalised in module docs and pinned by tests. Plan example pinned:
list_map__List_Option_Int__List_Option_Int.canon(Int)="Int", same for other primitivescanon(User(name, []))=namecanon(User(name, [a1, ...]))=name_<canon(a1)>_<canon(a2)>_...(single underscore between parts within a type-arg)f__<canon(T1)>__<canon(T2)>__...(double underscore between top-level type-args)Foo__<canon(T1)>__<canon(T2)>__...C__<canon(T1)>__<canon(T2)>__...(suffix matches owning type so global ctor namespace stays unique post-mono)Type arguments are emitted in the callee's declared generic-parameter order, not lex-sorted (the plan body's literal "lexicographically-sorted" reading would lose the positional binding between a declared generic param's name and its concrete arg — see DEVIATIONS).
Effect rows preserved
Per Plan B v1: effect rows are not monomorphized; they pass through this phase and are erased at codegen (effect dispatch is runtime-indirect). The codegen-entry walker at
compiler/src/codegen.rs:128no longer rejectsf.effect_row_var.is_some()orExpr::Lambda { effect_row_var: Some(_), .. }. The rejection is narrowed to surface generic syntax only (generic_paramsnon-empty,TypeExpr::Apply, generic-paramNamedreferences).Test plan
compiler/src/monomorphize.rs::tests. All pass.lex → parse → resolve → typecheck → elaborate → monomorphize:id__Intclone with substituted Param.ty / return_typeid__Int,id__String)Empty__Int,Hold__Int)Wrap__IntvsWrap__String)Nada__Int,Just__Int)Memory discipline
compiler/src/monomorphize.rsuses BTreeMap / BTreeSet only (no HashMap / HashSet), no unwrap/expect (worklist invariants surface viaunreachable!per the project pattern), nopanic!, no interior pointers introduced. The new public API surface is purely additive:GenericInstantiation, three newCheckedProgramfields,instantiate_with_vars,fresh_user_instance_with_subst_and_ids. Oldinstantiateandfresh_user_instance/fresh_user_instance_with_substwrappers removed; their last callers now use the_with_vars/_with_subst_and_idsvariants.Deviations
Three new entries in
PLAN_B_DEVIATIONS.md:_/__separator).Handoff for review
The plan's "Every change in this plan should be reviewed by a human before merging" applies. Key things to scrutinise:
non_generic_program_passes_through_unchangedpins this.effect_row_var: Some(_)→ no-op at lowering).match_scrut_tysbeing populated for every match expression that reaches mono. Testmatch_against_generic_scrutinee_rewrites_pattern_ctorsexercises this.Carry-forward to Task 50+
--dump-colorflag (Task 50). Will consume the same monomorphic IR.examples/generic_map.sigil(Task 51) — first end-to-end test exercising mono via codegen.Plan stays in
in-progress/Per the
/implementskill workflow: plan moves todone/only when the entire Plan B completes (Tasks 49–61 + Stage 5/6 review checkpoints), not per-task. PR #15 (Task 48) followed this pattern.