[Task 50] Color inference: per-monomorph, SCC-aware, --dump-color by boldfield · Pull Request #17 · boldfield/sigil

boldfield · 2026-04-25T06:26:29Z

Plan B Stage 5 Task 50. Replaces the Plan A1 always-Native stub in
compiler/src/color.rs with real per-monomorph color analysis +
SCC-aware propagation + --dump-color CLI flag.

Landed in this PR

Per-monomorph local analysis in compiler/src/color.rs. Each
top-level fn is classified Native or Cps:
- Native iff closed row of ![] or ![IO], no perform of a
  non-IO effect anywhere in the body or nested lambdas, AND (after
  propagation) no transitive call into a Cps-color monomorph.
- Cps otherwise.
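The local half of this rule (before propagation) can be sketched as follows. This is a minimal illustration, not the code in compiler/src/color.rs: the `Row` struct, the `performs` slice, and the `local_color` signature are simplified stand-ins assumed for the example; only the reason strings come from this PR.

```rust
/// Hypothetical, simplified effect row: the listed effect names plus an
/// optional row variable (the `e` in `![IO | e]`).
struct Row {
    effects: Vec<String>,
    row_var: Option<String>,
}

#[derive(Debug, PartialEq)]
enum Color {
    Native,
    Cps,
}

/// Local (pre-propagation) classification. `performs` is assumed to hold
/// the `(effect, op)` pairs found by walking the body and nested lambdas.
fn local_color(row: &Row, performs: &[(String, String)]) -> (Color, String) {
    // An open row can be instantiated with anything, so it is never Native.
    if row.row_var.is_some() {
        return (Color::Cps, "cps: open effect row".to_string());
    }
    // A closed row is Native only if it is `![]` or `![IO]`.
    if let Some(e) = row.effects.iter().find(|e| e.as_str() != "IO") {
        return (Color::Cps, format!("cps: row contains effect {e}"));
    }
    // Body walk: any perform of a non-IO effect taints the fn.
    if let Some((e, op)) = performs.iter().find(|(e, _)| e.as_str() != "IO") {
        return (Color::Cps, format!("cps: performs {e}.{op}"));
    }
    if row.effects.is_empty() {
        (Color::Native, "native: pure row".to_string())
    } else {
        (Color::Native, "native: row is ![IO]".to_string())
    }
}
```

The checks run from the cheapest and most specific signal (the row) to the body walk, so the reported reason is the first one that applies.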
Tarjan SCC over the monomorph call graph. Within an SCC, if any
member is locally Cps the whole SCC is Cps; otherwise if any member
calls into a downstream SCC that's Cps, the SCC is Cps; otherwise
Native. Matches Plan B's "SCC-aware" guidance — avoids
over-pessimizing one fn because of an unrelated cycle.
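As a rough sketch of the propagation rule just described, assuming the SCCs have already been computed and are listed callee-first: the function name `propagate_colors` and its slice-based signature are hypothetical, chosen only to keep the example self-contained.

```rust
/// Propagates CPS color over an already-computed SCC decomposition.
///
/// Assumptions (hypothetical signature, not the PR's API):
/// - `sccs` lists components callee-first (reverse-topological order);
/// - `edges[v]` holds the callees of node `v`;
/// - `local_cps[v]` is the result of the local, per-fn analysis.
fn propagate_colors(
    sccs: &[Vec<usize>],
    edges: &[Vec<usize>],
    local_cps: &[bool],
) -> Vec<bool> {
    let n = local_cps.len();
    // Map every node to the id of the SCC that contains it.
    let mut scc_of = vec![0usize; n];
    for (id, scc) in sccs.iter().enumerate() {
        for &v in scc {
            scc_of[v] = id;
        }
    }
    let mut scc_cps = vec![false; sccs.len()];
    for (id, scc) in sccs.iter().enumerate() {
        // Rule 1: any locally-CPS member taints the whole SCC.
        let intrinsic = scc.iter().any(|&v| local_cps[v]);
        // Rule 2: an edge into a CPS SCC taints it too. Callee SCCs come
        // earlier in `sccs`, so their color is already final here.
        let bridged = scc.iter().any(|&v| {
            edges[v]
                .iter()
                .any(|&w| scc_of[w] != id && scc_cps[scc_of[w]])
        });
        scc_cps[id] = intrinsic || bridged;
    }
    (0..n).map(|v| scc_cps[scc_of[v]]).collect()
}
```

Because color is decided per SCC rather than per reachable set, a pure cycle that never reaches a Cps fn stays Native even when another cycle in the same program is tainted.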
Stable reason strings for --dump-color and adversarial review:
- cps: open effect row
- cps: row contains effect <E>
- cps: performs <E>.<op>
- cps: transitively calls <callee> which is cps
- cps: in SCC with cps member <name>
- native: pure row / native: row is ![IO]
--dump-color flag in cli.rs + main.rs + new
pipeline::dump_color helper. Runs the front end through color
inference and prints <name> native|cps <reason> per fn (one line
per monomorph in program order) to stdout. No codegen. -o is
silently accepted but ignored under --dump-color;
--print-runtime-stats conflicts and emits a usage error.

Test plan

15 color::tests unit tests cover: open-row Cps, non-IO-row
Cps, perform-non-IO-with-IO-row Cps via body walk, caller-callee
both Native, mutual recursion Native, mutual recursion with one
Cps member tainting whole SCC, transitive Cps taint, unrelated
cycle non-pessimization, lambda body perform tainting parent,
stable dump_color program-order output.
4 cli::tests pin --dump-color parsing (default + human
formats), -o silent ignore, and the conflict with
--print-runtime-stats.
2 e2e tests exercise sigil <input> --dump-color end-to-end:
dump_color_hello_is_native_row_io, dump_color_multi_fn_pure_program.
Pod-verify green (cargo check, fmt, clippy on both crates,
runtime lib tests, no-interior-pointers, discipline greps).
CI green on both hosts (ubuntu-24.04 + macos-14, build+test +
cold-checkout).

Conservative-soundness notes

Calls to bare Ident references (not just calls in callee
position) count as outgoing edges. This is needed because Plan A2's
closure model treats top-level fn names as values (e.g., let f = some_fn), and lower_call's direct-Ident branch resolves them to
their target later. Counting both keeps the call graph sound when
future passes propagate fn-as-value forms.
Lambdas inside fn bodies are part of their parent fn's color at
this stage — closure conversion runs after color in the pipeline.
Their bodies are walked for performs and outgoing calls; the parent
picks up any Cps-tainting from inside.
Open effect rows ![IO | e] classify as Cps. We can't statically
prove the row var is never instantiated with anything beyond IO, and
Plan B Stage 5 keeps row vars in the IR. Stage 6's effect runtime is
the only mechanism that could discharge them, so treating them as
Cps is the safe v1 choice.

Out of scope (deferred to later tasks)

Color of synthetic lambda fns hoisted by closure conversion: those
don't exist yet at color time. When CC lands per-monomorph closures,
Task 55 (CPS transform) will need to revisit.
Effect-row monomorphization (the plan reserves this for v2 — rows
are erased at codegen-entry, not specialized).
The --debug-counters instrumentation flag from Task 56 (Stage 6).

Plan B Task 50 acceptance

Plan B Task 50 description:

Color inference (compiler/src/color.rs). After monomorphization, tag
each monomorph: Native if row is ![] or ![IO] AND transitively calls
only native monomorphs AND contains no perform to a user-handled
effect; CPS otherwise. SCC-aware propagation. --dump-color compiler
flag dumps one line per monomorph: <mangled_name> native|cps <reason>.

All six spec points are met:

After monomorphization: yes, runs in pipeline immediately after
monomorphize::monomorphize.
Native row check (![] or ![IO]): yes, local_color rejects any
row var or non-IO effect name.
Transitively calls only native: yes, propagated via SCC pass.
No perform to user-handled effect: yes, find_non_io_perform_in_*
walks the entire body and lambda bodies for any non-IO perform.
SCC-aware: yes, Tarjan SCC; within-SCC color is the disjunction of
member colors; cross-SCC propagation honors reverse-topological
order.
--dump-color flag: yes, plumbed through cli.rs, main.rs, and
pipeline::dump_color; one stable line per monomorph in program
order.

Replaces the Plan A1 always-Native stub in compiler/src/color.rs with real per-monomorph analysis. Each top-level fn (each post-mono "monomorph") is classified Native or Cps:
- Native: closed effect row of [] or [IO], no `perform` of a non-IO effect anywhere in the body or nested lambdas, and (after propagation) no transitive call into a Cps-color monomorph.
- Cps: anything else.

The propagation pass uses a Tarjan SCC over the monomorph call graph. Within an SCC, if any member is locally Cps the whole SCC is Cps; otherwise if any member calls into a downstream SCC that's Cps, the SCC is Cps; otherwise Native. This matches Plan B's "SCC-aware" guidance and avoids over-pessimizing one fn because of an unrelated cycle.

Reasons are stable, machine-readable, and human-readable, covering:
- `cps: open effect row`
- `cps: row contains effect <E>`
- `cps: performs <E>.<op>`
- `cps: transitively calls <callee> which is cps`
- `cps: in SCC with cps member <name>`
- `native: pure row` / `native: row is ![IO]`

`--dump-color` lands in cli.rs + main.rs + new pipeline::dump_color helper. The flag runs the front end through color inference and prints `<name> native|cps <reason>` per fn (one line per monomorph in program order) to stdout, no codegen. `-o` is silently accepted but ignored; `--print-runtime-stats` conflicts with `--dump-color` and emits a usage error.

Tests:
- 15 color::tests cover open-row Cps, non-IO-row Cps, perform-non-IO-with-IO-row Cps (via the body-walk fallback), caller-callee both Native, mutual recursion Native, mutual recursion with one Cps member tainting whole SCC, transitive Cps taint, unrelated cycle non-pessimization, and lambda body perform tainting parent.
- 4 cli::tests pin --dump-color parsing, default and human formats, -o silent ignore, and the conflict with --print-runtime-stats.
- 2 e2e tests exercise `sigil <input> --dump-color` end-to-end.

Pod-verify green; full lib + e2e tests deferred to CI.

boldfield · 2026-04-25T06:32:04Z

Code review — Task 50 color inference

Solid implementation overall. Plan B color spec is met, SCC propagation is correct (Tarjan emits in reverse-topological order, you consume callee SCCs before caller SCCs), and the test surface is genuinely thorough. That said, there are several issues — one possible soundness/precision gap, one robustness concern, and a handful of UX/diagnostic nits — I'd like fixed before this lands.

Bugs / correctness

1. Bare-`Ident` edge insertion over-approximates due to shadowing — falsely taints native fns

compiler/src/color.rs:555-567 (collect_calls_in_expr for Expr::Ident) treats any bare identifier whose name matches a top-level fn as an outgoing call edge:

Expr::Ident(name, _) => {
    if let Some(&idx) = fn_index.get(name) {
        out.insert(idx);
    }
}

The doc comment defends this as the conservative-sound choice. But Sigil's resolver only forbids shadowing within a scope (compiler/src/resolve.rs:38) — it does not disallow a let or parameter sharing a name with a top-level fn. Typecheck handles shadowing via env-precedence at compiler/src/typecheck.rs:1968 (local env wins over fn_schemes). The post-typecheck AST keeps the same Expr::Ident("name", span) either way; there is no NodeId-based disambiguation yet.

Concrete repro:

fn dangerous() -> Int ![Raise] { 0 }     // CPS

fn caller(dangerous: Int) -> Int ![] {   // legal — `dangerous` is the param
    dangerous + 1                         // Expr::Ident("dangerous"), refers to param
}

Color analysis adds a caller -> dangerous edge, classifies caller as CPS via "transitively calls dangerous which is cps". caller never invokes the top-level fn at runtime. The classification is sound (no missed CPS taint) but unnecessarily pessimizes a fn that should be native.

This isn't theoretical — it'll bite the moment any user writes a local that happens to shadow a stdlib fn name. The fix exploits info you already have: CheckedProgram::call_site_instantiations is keyed by use-site span and only contains real top-level-fn references (typecheck only inserts when fn_schemes.get(name) resolves — see typecheck.rs:1971-1981). Two viable paths:

1. (preferred) In infer_colors, use call_site_instantiations to drive edge insertion: an Ident is a fn reference iff its span is in that map. This is precise and uses already-computed data.
2. Or, narrow the bare-Ident edge to call-position only: drop the value-position branch and rely on Expr::Call's direct-Ident sub-arm. You'd lose soundness on let f = some_fn; ... f(), but typecheck records that as a call site too, so option 1 dominates.

Document the chosen invariant in the body of collect_calls_in_expr.
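A minimal sketch of the preferred option, under simplifying assumptions: `Expr`, `Span`, and the `resolved_fn_spans` set are hypothetical stand-ins for the compiler's AST and for the key set of call_site_instantiations, whose real value type is not shown in this thread.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical span: (start, end) byte offsets of a use site.
type Span = (usize, usize);

/// Hypothetical, heavily reduced expression type.
enum Expr {
    Ident(String, Span),
    Int(i64),
    Add(Box<Expr>, Box<Expr>),
    Call(Box<Expr>, Vec<Expr>),
}

/// Invariant: an `Ident` contributes an edge iff typecheck resolved that
/// exact use site to a top-level fn, i.e. its span is in `resolved_fn_spans`
/// (assumed here to be the key set of `call_site_instantiations`).
fn collect_calls_in_expr(
    expr: &Expr,
    fn_index: &BTreeMap<String, usize>,
    resolved_fn_spans: &BTreeSet<Span>,
    out: &mut BTreeSet<usize>,
) {
    match expr {
        Expr::Ident(name, span) => {
            // A param or `let` that shadows a fn name has no entry, so no edge.
            if resolved_fn_spans.contains(span) {
                if let Some(&idx) = fn_index.get(name) {
                    out.insert(idx);
                }
            }
        }
        Expr::Int(_) => {}
        Expr::Add(lhs, rhs) => {
            collect_calls_in_expr(lhs, fn_index, resolved_fn_spans, out);
            collect_calls_in_expr(rhs, fn_index, resolved_fn_spans, out);
        }
        Expr::Call(callee, args) => {
            collect_calls_in_expr(callee, fn_index, resolved_fn_spans, out);
            for arg in args {
                collect_calls_in_expr(arg, fn_index, resolved_fn_spans, out);
            }
        }
    }
}
```

With this shape, the `dangerous + 1` body from the repro yields no edge as long as the parameter's use-site span is absent from the set, while `let f = some_fn` still yields one because typecheck records that span.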

2. Recursive Tarjan — stack overflow risk on heavy monomorph workloads

compiler/src/color.rs:641 (strongconnect) recurses on st.edges[v]. Each frame is non-trivial (locals + the snapshotted neighbors: Vec<usize>). On a generic-heavy program (Plan B Task 47 explicitly mentions list_map__List_Option_Int__List_Option_Int-style monomorphs) a long linear call chain can blow the stack. The main thread typically gets 8 MB on Linux and macOS; threads spawned through std::thread default to 2 MiB. Default Rust release-build frame ≈ a few hundred bytes; against an 8 MB stack you'd hit overflow somewhere in the 20–40 k call-depth range.

This is straightforward to mitigate. Either:

Convert to iterative Tarjan (single explicit work stack, plus a "post-visit" stack to compute lowlink updates after children finish).
Wrap each recursion site in stacker::maybe_grow(64*1024, 1024*1024, || ...) if you want to keep the recursive structure.

The compiler wraps user input — a bug report from a user with a deep call chain is the kind of thing that's annoying to track down later. Fix it now.
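For reference, the iterative option can be sketched roughly as below. This is an illustrative version over a plain adjacency list, not the PR's code: the `Frame` layout and the `tarjan_scc` signature are assumptions made for the example.

```rust
/// One suspended `strongconnect(v)` call: the node and how far we have
/// advanced through its adjacency list.
struct Frame {
    v: usize,
    next: usize,
}

/// Iterative Tarjan SCC. `edges[v]` lists the successors of `v`.
/// SCCs are returned in reverse-topological order (callees first).
fn tarjan_scc(edges: &[Vec<usize>]) -> Vec<Vec<usize>> {
    const UNVISITED: usize = usize::MAX;
    let n = edges.len();
    let mut index = vec![UNVISITED; n];
    let mut lowlink = vec![0usize; n];
    let mut on_stack = vec![false; n];
    let mut scc_stack: Vec<usize> = Vec::new();
    let mut work: Vec<Frame> = Vec::new();
    let mut next_index = 0usize;
    let mut sccs = Vec::new();

    for start in 0..n {
        if index[start] != UNVISITED {
            continue;
        }
        index[start] = next_index;
        lowlink[start] = next_index;
        next_index += 1;
        scc_stack.push(start);
        on_stack[start] = true;
        work.push(Frame { v: start, next: 0 });

        while let Some(frame) = work.last_mut() {
            let v = frame.v;
            if frame.next < edges[v].len() {
                let w = edges[v][frame.next];
                frame.next += 1;
                if index[w] == UNVISITED {
                    // Replaces the recursive call strongconnect(w).
                    index[w] = next_index;
                    lowlink[w] = next_index;
                    next_index += 1;
                    scc_stack.push(w);
                    on_stack[w] = true;
                    work.push(Frame { v: w, next: 0 });
                } else if on_stack[w] {
                    // Back edge: use index[w], not lowlink[w].
                    lowlink[v] = lowlink[v].min(index[w]);
                }
            } else {
                // All successors handled: this is the "return" from v.
                work.pop();
                if let Some(parent) = work.last() {
                    let p = parent.v;
                    lowlink[p] = lowlink[p].min(lowlink[v]);
                }
                if lowlink[v] == index[v] {
                    // v is an SCC root: pop its members off the SCC stack.
                    let mut scc = Vec::new();
                    loop {
                        let w = scc_stack.pop().expect("root is on the stack");
                        on_stack[w] = false;
                        scc.push(w);
                        if w == v {
                            break;
                        }
                    }
                    sccs.push(scc);
                }
            }
        }
    }
    sccs
}
```

The heap-allocated `work` vector takes the place of the call stack, so depth is bounded by available memory instead of the thread's stack size. The `stacker::maybe_grow` route keeps the recursive shape but adds a dependency.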

Diagnostic / API issues

3. SCC reason text is misleading when multiple members are intrinsically CPS

compiler/src/color.rs:288-353. The intrinsic-CPS branch attributes the SCC's color to the first CPS member in program order; every other member, including ones that have their own intrinsic CPS reason, gets "cps: in SCC with cps member <name>". Two issues:

A user looking at --dump-color for fn B would see "in SCC with cps member A" even though B itself has, say, a non-IO row. The more salient reason for B is B's own row.
In the (None, Some(caller, callee)) branch, the "caller" is the first SCC member with an outgoing edge to a CPS SCC. Every other SCC member gets "in SCC with cps member <caller>", but the caller itself isn't intrinsically CPS, it's transitively CPS. Calling it a "cps member" is technically true but the reason chain is unclear.

Fix: report each node's own most-proximate reason — its own intrinsic local-CPS reason if present; otherwise its own outgoing edge into a CPS SCC if present; otherwise the SCC-membership fallback. Keep the SCC fallback only for nodes that have neither.

4. `pipeline::dump_color` returns `Result<String, usize>` with two semantically distinct usize meanings

compiler/src/pipeline.rs:104, 130. Err(1) from a file-read failure and Err(all_errs.len()) from a typecheck failure go through the same channel. The caller in main.rs:75 ignores the value entirely and returns exit code 1 unconditionally. Either:

Drop the usize and return Result<String, ()>.
Or use a proper error enum (DumpColorError::ReadFailed(io::Error) / ::FrontEndErrors(usize)) so the caller can produce useful exit codes / diagnostics later.

Right now it's neither informative nor type-safe.
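A sketch of what the enum option might look like. The variant names follow the suggestion above; the `Display` messages, the `From<io::Error>` impl, and the `read_source` helper are hypothetical additions for illustration.

```rust
use std::fmt;
use std::fs;
use std::io;

/// Typed failure channel for a dump-color style helper.
#[derive(Debug)]
enum DumpColorError {
    /// The input file could not be read.
    ReadFailed(io::Error),
    /// The front end reported this many errors.
    FrontEndErrors(usize),
}

impl fmt::Display for DumpColorError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DumpColorError::ReadFailed(e) => write!(f, "could not read input: {e}"),
            DumpColorError::FrontEndErrors(n) => write!(f, "{n} front-end error(s)"),
        }
    }
}

impl From<io::Error> for DumpColorError {
    fn from(e: io::Error) -> Self {
        DumpColorError::ReadFailed(e)
    }
}

/// Hypothetical helper: `?` converts the io::Error through the From impl.
fn read_source(path: &str) -> Result<String, DumpColorError> {
    Ok(fs::read_to_string(path)?)
}
```

A caller can still collapse both variants to exit code 1, but it can now also `match` on the variant to print a targeted message or choose a different code later.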

5. `--dump-color` silently accepts and ignores `-o`

compiler/src/cli.rs:101-104. The PR description and the dump_color_ignores_dash_o test confirm this is intentional. But silent acceptance hides user errors: someone running sigil foo.sigil -o /tmp/x --dump-color reasonably expects an executable at /tmp/x and will be confused when nothing's produced. The help text says "No codegen" but doesn't call out that -o is ignored.

Either emit a stderr warning ("-o ignored under --dump-color") or extend the help text to make this explicit. Don't leave it silent.

Test issues

6. `unused_param_warning_silenced` — name doesn't match body

compiler/src/color.rs:1144-1158. The test body just verifies that a synth fn with one param classifies native. There's no warning being silenced. Rename to something accurate (synth_fn_with_param_classifies_native) or delete; the case is already covered implicitly by the e2e tests.

Style / nits (skip if not interested)

color.rs is now ~1010 lines, ~70% of which is tests. Consider splitting into color/mod.rs + color/tests.rs for navigability.
BTreeMap<String, usize> for fn_index has unnecessary log-n cost on a hot path. HashMap would do; not a perf bottleneck but cheap to fix.
The State + nested-fn pattern in tarjan_scc works but is hard to read. If you go iterative for issue #2, this disappears anyway.

Summary

Block on #1 and #2 — both are real concerns the moment the compiler sees real user code. #3–#5 are quality issues I'd want addressed in this PR rather than carried forward. #6 is trivial.

The core algorithm and the test coverage of that algorithm are good; the gaps are at the integration boundaries and on robustness assumptions.

boldfield · 2026-04-25T06:37:49Z

Review verdict: request changes

Implementation is fundamentally sound. Tarjan is textbook-correct, conservative-soundness holds, output is deterministic, 340 compiler-lib tests pass (319 → 340, +21 across the directed coverage), CI green on both hosts. But there are six issues — one structural risk and five coverage/wording gaps — that should land before merge per the project's "don't put it off" discipline.

Must-fix before merge

1. Convert `tarjan_scc` from recursive to explicit-stack iterative form

color.rs:467-547 recurses through strongconnect. A linear call chain of N fns produces N stack frames; each frame snapshots neighbors plus the recursion machinery, call it ~250B/frame conservatively. The main thread's stack is typically 8 MiB on Linux and macOS (threads spawned through std::thread default to 2 MiB), so the practical bound on the main thread is ~30K-40K monomorphs in a chain. Today's example corpus has single-digit fn counts, so the risk is theoretical, but Plan B will continue to grow monomorph counts (typeclass dictionary specialization in particular). Once a real program hits the limit, color analysis aborts the entire compile with a stack overflow: no diagnostic, no recovery.

Convert now. The standard rewrite is straightforward: maintain an explicit Vec<(node, neighbor_iter, lowlink_state)> and Vec<work_state_enum> to drive the loop. Tests stay unchanged.

2. Add test for single-node self-loop SCC

fib calling itself directly should produce a single-node SCC with a self-loop, and color should propagate correctly within it. examples/fibonacci.sigil exercises this path indirectly via smoke, but there's no targeted unit test pinning the algorithm's behavior on this case. Add self_loop_single_node_scc_native (and a Cps variant for completeness).

3. Add test for the bare-Ident-as-value soundness claim

The PR body explicitly states: "calls to bare Ident references (not just calls in callee position) count as outgoing edges." This is a load-bearing soundness claim — without it, a let f = some_cps_fn in an otherwise-Native parent would falsely classify the parent as Native. The implementation at color.rs:392-405 adds the edge correctly, but no test pins the contract. Add let_bound_fn_value_taints_parent_via_outgoing_edge (or equivalent name) — construct a parent that binds a Cps fn as a value but never invokes it; assert the parent is Cps.

4. Add test for 3+ node cross-SCC transitive chain

Current cross-SCC propagation tests are 2-node chains (one SCC calls into another Cps SCC). No test exercises a 3+ node chain (SCC A calls SCC B which calls SCC C where C is the Cps origin). Tarjan + reverse-topological propagation should handle arbitrary depth, but there's no test pinning that depth-N propagation works. Add cps_propagates_through_three_scc_chain or similar.

5. Add test for the transitive-only SCC reason branch

All current SCC-Cps tests fire the intrinsic branch (an SCC member is locally Cps). The transitive branch (no member intrinsically Cps, but one bridges to a downstream Cps SCC, taking the whole SCC down with it) has no test coverage. The reason-string for this branch is cps: transitively calls <callee> which is cps for the bridge member and cps: in SCC with cps member <bridge> for the rest. Without a test, future refactors could swap the branches and CI wouldn't catch it. Add scc_taint_via_transitive_only_branch or similar.

6. Tighten the propagation reason wording for the transitive-bridge case

At color.rs:163 and :178, when an SCC becomes Cps because a member transitively calls a Cps callee outside the SCC, non-bridge members get the reason cps: in SCC with cps member <bridge>. The bridge isn't itself intrinsically cps — it's cps because of its outgoing edge. Calling it a "cps member" is technically accurate (it's been colored cps) but obscures the causal chain in --dump-color output, which is a debugging tool.

Pick one of:

Distinguish the two phrasings: cps: in SCC with intrinsically-cps member <name> vs cps: in SCC bridging to cps callee via <name>. Two reason variants.
Document the conflation as deliberate in a comment at the propagation site, citing the readability tradeoff.

The two-variant form is more informative for debugging Stage 6 selective-CPS issues.
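One way the two-variant wording could be selected per node is sketched below. The `NodeFacts` and `SccCause` types are hypothetical; only the reason strings are taken from this review.

```rust
/// Hypothetical per-node facts gathered during propagation.
struct NodeFacts<'a> {
    /// The node's own intrinsic reason, e.g. "cps: open effect row".
    local_reason: Option<&'a str>,
    /// A callee in a different, CPS-colored SCC that this node calls directly.
    bridge_callee: Option<&'a str>,
}

/// Hypothetical summary of what tainted the node's SCC.
enum SccCause<'a> {
    /// Some member is locally CPS.
    IntrinsicMember(&'a str),
    /// No member is locally CPS; this member calls out to a CPS SCC.
    BridgeMember(&'a str),
}

/// Picks the most proximate reason for a node in a CPS-colored SCC.
fn cps_reason(node: &NodeFacts<'_>, cause: &SccCause<'_>) -> String {
    // 1. The node's own intrinsic reason wins.
    if let Some(reason) = node.local_reason {
        return reason.to_string();
    }
    // 2. Otherwise its own outgoing edge into a CPS SCC.
    if let Some(callee) = node.bridge_callee {
        return format!("cps: transitively calls {callee} which is cps");
    }
    // 3. Otherwise the SCC-membership fallback, in one of two wordings.
    match cause {
        SccCause::IntrinsicMember(name) => {
            format!("cps: in SCC with intrinsically-cps member {name}")
        }
        SccCause::BridgeMember(name) => {
            format!("cps: in SCC bridging to cps callee via {name}")
        }
    }
}
```

Keeping the two fallback strings distinct lets a `--dump-color` reader tell whether to look inside the SCC or follow the named bridge out of it.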

Out of scope (genuine future-task work; PR body is correct to defer)

Color of synthetic lambda fns hoisted by closure conversion — CC runs after color in the pipeline.
Effect-row monomorphization — deferred to v2 by design doc.
--debug-counters flag — Task 56 (Stage 6).

What's good (preserve through the rewrite)

Tarjan is correctly implemented — index, lowlink, on-stack, back-edge uses index (not lowlink) per the original paper, root condition correct, pop-until-v loop correct, reverse-topological emission order falls out naturally. The conversion to iterative form should preserve the algorithm; only the driving loop changes.
Conservative-soundness audit clean — every path I checked rejects to Cps when uncertain. Open rows, non-IO effects in closed rows, body-walk performs of non-IO ops, transitive Cps callees, and bare-Ident-as-value edges all classify correctly.
Output determinism verified — BTreeMap/BTreeSet and Vec source-order; no HashMap anywhere in the pass. Three consecutive runs produce byte-identical --dump-color output.
CLI integration is clean — --dump-color short-circuits before -o consumption, conflicts with --print-runtime-stats, plays nicely with --human-errors. Pipeline goes lex → parse → resolve → typecheck → elaborate → monomorphize → color → print, never invoking codegen.
Lambda body walking at color.rs:330 and :447 correctly bubbles performs and call edges to the enclosing fn (per the soundness note that closures haven't been hoisted yet at color time).

Continuation

After items 1–6 land and CI is green:

Re-verify Tarjan still produces the same SCCs / colors on the existing 15 unit tests post-iterative-rewrite.
Spot-check the four new tests pin the right cases.
Confirm --dump-color output reads cleanly under the new reason wording (smoke run on examples/higher_order.sigil and examples/fibonacci.sigil).

Expect ~10 minutes for the spot-check re-review. Plan B Stage 5 review checkpoint sits between Tasks 52 and 53; this PR closes Task 50 once items 1–6 land. Tasks 51 (generic_map.sigil) and 52 (P16/P17 prompts) come next; Task 51 is the load-bearing follow-up where the new mangling format from PR #16 first sees user-authored generic syntax.

…de reasons

Addresses both review comments on PR #17. Six structural changes plus five new tests; existing 32 unit tests + 2 e2e tests still pass.

Bugs / correctness:

1. Edge insertion now drives off CheckedProgram::call_site_instantiations instead of the bare-Ident-name heuristic. Typecheck's env-precedence rules win over fn_schemes lookup at every Ident span, so a parameter or `let` binding shadowing a top-level fn name does NOT produce a spurious outgoing edge. The bare-Ident heuristic over-approximated here; the new code uses information typecheck already computed. Pinned by parameter_shadowing_top_level_fn_does_not_taint_caller, a real-front-end test that runs through typecheck.
2. tarjan_scc is now iterative with an explicit work stack, which eliminates the recursive-frame stack-overflow risk on long monomorph chains. Reverse-topological emission order preserved; behavior on existing tests unchanged (verified by re-running all 15 prior color tests plus the 5 new ones).

Diagnostic / API:

3. Per-node reason text is now the node's *own* most-proximate cause: intrinsic-CPS members keep their specific local reason; bridge members (with an outgoing edge to a CPS SCC in a different SCC) get `cps: transitively calls <callee> which is cps`; non-bridge members propagated through SCC membership get either `cps: in SCC with intrinsically-cps member <name>` or `cps: in SCC bridging to cps callee via <name>` depending on which peer caused the taint. Two-variant SCC-fallback wording lets a --dump-color reader follow the causal chain in adversarial review.
4. pipeline::dump_color returns Result<String, DumpColorError> with typed ReadFailed / FrontEndErrors(usize) variants instead of conflating both as `usize`. Caller still maps to exit code 1 today, but downstream tooling can branch on the variant.
5. main.rs now warns on stderr when `-o <path>` is supplied alongside --dump-color: "warning: `-o <path>` ignored under --dump-color (no executable produced)". The flag is still accepted (shell-history ergonomics) but the misuse is no longer silent.
6. unused_param_warning_silenced renamed to synth_fn_with_param_classifies_native; the old name didn't match the test body.

New tests (in addition to the rename):
- self_loop_single_node_scc_native + self_loop_single_node_scc_cps: pin Tarjan correctness on a single-node-with-self-loop SCC.
- let_bound_fn_value_taints_parent_via_outgoing_edge: pins the load-bearing soundness claim that `let f = some_cps_fn` taints the parent even though f is never invoked.
- cps_propagates_through_three_scc_chain: pins reverse-topological propagation across depth-N hops, not just single-hop.
- scc_taint_via_transitive_only_branch: exercises the transitive-only SCC reason branch (no intrinsic member, only bridge), pins the new "in SCC bridging to cps callee via <name>" reason wording.
- parameter_shadowing_top_level_fn_does_not_taint_caller: the Review #1 reproducer; real front-end run; pins precision.

Pod-verify green. Full lib tests + new e2e behavior (the -o warning) deferred to CI.

boldfield · 2026-04-25T06:49:34Z

Addressed both review comments in 8cf59c4. Summary of changes:

Bugs / correctness:

1. Shadowing precision: edge insertion now drives off CheckedProgram::call_site_instantiations (typecheck's span-keyed map of every Ident that resolved to a top-level fn under env-precedence rules). A parameter or let binding sharing a name with a top-level fn no longer produces a spurious outgoing edge. New parameter_shadowing_top_level_fn_does_not_taint_caller test runs through the real front end and pins this.
2. Iterative Tarjan: tarjan_scc rewritten with an explicit Vec<Frame> work stack. Lowlink propagation to parent frame happens on frame pop. SCC emission order (reverse-topological) preserved; all 15 prior tests pass unchanged.

Diagnostic / API:
3. Per-node reasons — each node now reports its own most-proximate cause. Intrinsic-CPS members keep their specific local reason; bridge members get cps: transitively calls <callee> which is cps; non-bridge SCC members get either cps: in SCC with intrinsically-cps member <name> or cps: in SCC bridging to cps callee via <name> depending on which peer caused the taint. Two-variant SCC fallback as you suggested.
4. Result<String, DumpColorError> — typed enum with ReadFailed / FrontEndErrors(usize) variants replacing the conflated usize channel.
5. -o warning: --dump-color -o <path> now prints "sigil: warning: `-o <path>` ignored under --dump-color (no executable produced)" to stderr. Still accepted (shell-history ergonomics) but no longer silent.

Tests:
6. Renamed unused_param_warning_silenced → synth_fn_with_param_classifies_native.
7. +5 new tests:

self_loop_single_node_scc_native + self_loop_single_node_scc_cps
let_bound_fn_value_taints_parent_via_outgoing_edge (pins the load-bearing fn-as-value soundness claim)
cps_propagates_through_three_scc_chain (depth-N propagation)
scc_taint_via_transitive_only_branch (transitive-only SCC reason branch)
parameter_shadowing_top_level_fn_does_not_taint_caller (Review #1 reproducer, real front end)

Existing test updated: mutual_recursion_with_one_cps_member_taints_whole_scc now asserts the explicit cps: in SCC with intrinsically-cps member `pong` wording.

Pod-verify green; CI running.

boldfield · 2026-04-25T14:08:59Z

Re-review verdict: merge

All six directed must-fix items landed correctly. CI green on all 4 jobs (ubuntu-24.04 + macos-14, both regular and cold-checkout). 346 compiler-lib tests pass (340 → 346, +6).

Per-item verification

| # | Item | Status | Evidence |
|---|------|--------|----------|
| 1 | Iterative Tarjan | PASS | `color.rs:505-634`: explicit `Vec<Frame>` work stack with `{v, neighbors, next}`. No recursive helper. Back-edge correctly uses `w_index` (lines 585-590), not lowlink. `for start in 0..n` outer loop covers disjoint subgraphs. Self-loop hits the `on_stack[w]` branch correctly. |
| 2 | Self-loop single-node SCC test | PASS | `self_loop_single_node_scc_native` at `:1216` + `self_loop_single_node_scc_cps` at `:1236`. Agent shipped both native and cps variants, which exceeds the directive. |
| 3 | Bare-Ident-as-value soundness test | PASS | `let_bound_fn_value_taints_parent_via_outgoing_edge` at `:1259`. `let f = cps_fn` in a pure parent; asserts parent gets bridge reason `transitively calls cps_fn which is cps`. |
| 4 | 3+ node cross-SCC chain test | PASS | `cps_propagates_through_three_scc_chain` at `:1295`. a→b→c chain with c intrinsically CPS; asserts all three CPS with correct cascading reasons. |
| 5 | Transitive-only SCC reason branch test | PASS | `scc_taint_via_transitive_only_branch` at `:1340`. Mutually-recursive `{pure_a, pure_b}` SCC bridges to `cps_x`; pure_b (non-bridge member) gets the new `in SCC bridging to cps callee via pure_a` phrasing. |
| 6 | Distinguished propagation reason wording | PASS | `color.rs:209, :214`: two distinct strings, `cps: in SCC with intrinsically-cps member <name>` vs `cps: in SCC bridging to cps callee via <name>`. Branched on `intrinsic_member.is_some()`. Agent picked the preferred option (distinguish), not the comment-only fallback. |

What the agent did beyond the directives

Split item 2 into native + cps variants (self_loop_single_node_scc_native + _cps). The cps variant pins that intrinsic-CPS reason wins over any SCC fallback for a single-node SCC — strictly better coverage than the directive asked for.
Migrated existing propagation_reason_wording_intrinsically_cps test at :1074 to assert the new intrinsically-cps member phrasing — correct test-suite update for the wording change in item 6.

Mechanical

cargo fmt --check: clean
cargo clippy --workspace --all-targets -- -D warnings: clean
cargo test -p sigil-compiler --lib: 346 passed, 0 failed (was 340; +6 instead of expected +4 because of the native/cps self-loop split)
scripts/pod-verify.sh: OK
2 commits on the branch (057d461 original + 8cf59c4 fixup)
CI: 4/4 SUCCESS

One non-blocking observation

The iterative Tarjan's push_frame closure takes seven &mut params. Functionally fine and clippy-clean. If borrow noise becomes a maintenance concern down the road, lifting the bookkeeping into a struct would tighten it. Not worth gating on — the implementation is correct and well-tested.

Continuation

Merge whenever. Plan B Stage 5 review checkpoint sits between Tasks 52 and 53; this PR closes Task 50 cleanly. Tasks 51 (generic_map.sigil) and 52 (P16/P17 prompts) come next. Task 51 is the load-bearing follow-up where PR #16's new $ mangling format first sees user-authored generic syntax flow through the full pipeline — worth a fresh round of reproducibility scrutiny when it lands.

boldfield · 2026-04-25T14:09:47Z

Re-review — fix-up commit `8cf59c41`

All six items from the prior review are addressed. The work is in good shape.

What's fixed and verified

Edge insertion now driven by call_site_instantiations (color.rs:386-394). Span-keyed lookup means typecheck's env-precedence rules carry through to the call graph; locals shadowing top-level fn names produce no edge. Structurally correct.
Iterative Tarjan (color.rs:505-630). Back-edge update uses w_index (not lowlink[w]) — correctly preserves SCC root identification per the original algorithm. Parent-frame lowlink propagation on pop is the right reconstruction of the post-recursion lowlink[v] = lowlink[v].min(lowlink[w]) step. Self-loop case is well-behaved (verified against the new self_loop_single_node_scc_* tests). No more recursion depth concerns.
Per-node reason text (color.rs:189-220). Each node reports its own most-proximate cause; the two-variant SCC fallback (intrinsically-cps member vs bridging to cps callee via) gives a --dump-color reader a clear causal chain. cps_propagates_through_three_scc_chain and scc_taint_via_transitive_only_branch pin the new wording.
Result<String, DumpColorError> (pipeline.rs). Typed variants; readable.
-o warning under --dump-color (main.rs:67-74). Misuse is no longer silent.
Test rename. synth_fn_with_param_classifies_native matches its body.

The new SCC corner-case coverage (self-loop singletons, bridge-only SCC, depth-3 transitive chain) is solid algorithmic confidence. let_bound_fn_value_taints_parent_via_outgoing_edge correctly pins the load-bearing soundness claim that value-position fn references count as edges.

One residual concern

`parameter_shadowing_top_level_fn_does_not_taint_caller` is a tautology under Stage 5 constraints

color.rs:1393-1419. The test sets up the shadowing scenario but uses ![IO] for dangerous because Stage 5's typechecker rejects non-IO effect names (E0042). That makes dangerous itself Native — which means caller would classify Native either way: precise edge logic gives no caller -> dangerous edge; the old heuristic gives the edge but dangerous is Native so SCC propagation still yields Native for caller. The reason text ("native: pure row") is identical under both regimes.

The test compiles and runs, but it does not differentiate the precision fix from the old heuristic. The comment at color.rs:1411-1419 acknowledges this ("switching dangerous to a Cps row in a future test would expose the bug") but the assertion itself is non-discriminating.

Suggested follow-ups (pick one — none block this PR landing):

1. Forward-looking guard via inspection: expose infer_colors internals (e.g., a #[cfg(test)] accessor on ColoredProgram or a separate test entry that returns the constructed edge map). Then the test asserts edges_of("caller") == empty, which discriminates regardless of dangerous's color.
2. Defer: leave the test as-is, add a // TODO: strengthen once Stage 6 lands non-IO effects comment, and revisit when the first non-IO effect is plumbed.
3. Synthetic-test span uniqueness: the synth_program test scaffolding uses a constant span() fixture so build_synthetic_calls_map collapses to a single map entry, meaning all synthetic tests effectively run the old heuristic via the calls map. Switching the fixture to a counter-based unique span generator would let synthetic tests genuinely model env-precedence (by simply not inserting param/let spans into the synthetic calls map).

I'd lean toward option 3 as the most durable: it makes the synth_program tests match real typecheck semantics and unlocks proper shadowing tests now rather than waiting for Stage 6.

Nits (truly optional)

bridge_callee_of is computed for every SCC (color.rs:170-188), including ones that turn out to be Native. Move the computation under if intrinsic_member.is_none() or skip it for SCCs with no outgoing cross-SCC edges. Trivial perf, mostly cleanliness.
The closure-as-helper pattern for push_frame works but is verbose. If you ever revisit, an explicit TarjanState struct with fn push(&mut self, v: usize) reads cleaner.
Commit message says "5 new tests" but the diff adds 6 (counting parameter_shadowing_top_level_fn_does_not_taint_caller). Minor.
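To illustrate the TarjanState nit, here is one way an explicit-state iterative Tarjan could look. It is a sketch under assumptions: the field names, the `visit` method, and the adjacency-list graph are invented for the example and do not mirror color.rs; only the `TarjanState` name and `fn push(&mut self, v: usize)` come from the suggestion above.

```rust
/// Hypothetical explicit-state Tarjan; names are illustrative only.
struct TarjanState<'g> {
    graph: &'g [Vec<usize>],
    next_index: usize,
    index: Vec<Option<usize>>,
    lowlink: Vec<usize>,
    on_stack: Vec<bool>,
    stack: Vec<usize>,
    sccs: Vec<Vec<usize>>,
}

impl<'g> TarjanState<'g> {
    fn new(graph: &'g [Vec<usize>]) -> Self {
        let n = graph.len();
        TarjanState {
            graph,
            next_index: 0,
            index: vec![None; n],
            lowlink: vec![0; n],
            on_stack: vec![false; n],
            stack: Vec::new(),
            sccs: Vec::new(),
        }
    }

    /// Assigns a discovery index to `v` and pushes it on the SCC stack.
    fn push(&mut self, v: usize) {
        self.index[v] = Some(self.next_index);
        self.lowlink[v] = self.next_index;
        self.next_index += 1;
        self.stack.push(v);
        self.on_stack[v] = true;
    }

    /// Visits everything reachable from `root` using a heap-allocated list
    /// of (node, next-edge-position) frames instead of recursion.
    fn visit(&mut self, root: usize) {
        let graph = self.graph;
        self.push(root);
        let mut work: Vec<(usize, usize)> = vec![(root, 0)];
        while let Some(&(v, pos)) = work.last() {
            if let Some(&w) = graph[v].get(pos) {
                work.last_mut().unwrap().1 += 1;
                match self.index[w] {
                    None => {
                        self.push(w);
                        work.push((w, 0));
                    }
                    Some(w_index) if self.on_stack[w] => {
                        self.lowlink[v] = self.lowlink[v].min(w_index);
                    }
                    Some(_) => {} // w already belongs to a finished SCC
                }
            } else {
                work.pop();
                if let Some(&(parent, _)) = work.last() {
                    self.lowlink[parent] = self.lowlink[parent].min(self.lowlink[v]);
                }
                if Some(self.lowlink[v]) == self.index[v] {
                    // v is the root of an SCC: pop its members off the stack.
                    let mut scc = Vec::new();
                    loop {
                        let w = self.stack.pop().unwrap();
                        self.on_stack[w] = false;
                        scc.push(w);
                        if w == v {
                            break;
                        }
                    }
                    self.sccs.push(scc);
                }
            }
        }
    }
}

/// Returns SCCs in reverse topological order (callees before callers).
fn tarjan_scc(graph: &[Vec<usize>]) -> Vec<Vec<usize>> {
    let mut state = TarjanState::new(graph);
    for v in 0..graph.len() {
        if state.index[v].is_none() {
            state.visit(v);
        }
    }
    state.sccs
}
```

Because the frames live in a `Vec`, a long call chain grows the heap rather than the native stack, which is the property the recursive-to-iterative conversion was after.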

Verdict

Approve from my side once the residual shadowing-test concern has a path forward (any of the three options above is fine, including "defer with TODO"). The structural fix is correct; only the test coverage of that specific fix is currently weak.

boldfield · 2026-04-25T14:12:15Z

Review note — status

Re-review complete on 8cf59c41. All six items from the initial review are addressed and structurally correct:

✅ Edge insertion driven by call_site_instantiations (shadowing-precise)
✅ Iterative Tarjan (no stack-overflow risk)
✅ Per-node reason text with two-variant SCC fallback
✅ Typed DumpColorError enum
✅ Stderr warning when -o supplied with --dump-color
✅ Test rename

One non-blocking item before merge: parameter_shadowing_top_level_fn_does_not_taint_caller is non-discriminating under Stage 5 (typecheck rejects non-IO effects, so dangerous ends up Native and the test passes under either edge regime). Pick a path forward — strengthen via edge-map inspection, defer with a TODO, or fix the synth-test span fixture. Author's call.

No further blocking concerns from my side. Ready to merge once the shadowing-test path forward is decided.

Addresses the residual non-blocking concern on PR #17: `parameter_shadowing_top_level_fn_does_not_taint_caller` was a tautology under Stage 5 (typecheck rejects non-IO effects, so `dangerous` ended up Native and the test passed under either the precise edge logic or the old name-only heuristic). Reviewer suggested three options; this commit applies option 3 (synthetic-test span uniqueness):

- New `unique_span()` test helper: atomic-counter span generator that produces a fresh span per call. Used only by tests that need to disambiguate individual `Expr::Ident` occurrences via their span. The constant `span()` helper is unchanged for the rest of the synth tests.
- New `synth_program_with_calls(items, calls)` test helper that accepts a caller-supplied `call_site_instantiations` map. The auto-built map produced by `build_synthetic_calls_map` is a name-only heuristic, not env-precedence-aware; tests modeling shadowing supply their own map manually.
- Rewrote `parameter_shadowing_top_level_fn_does_not_taint_caller` as a synthetic program: `dangerous` is intrinsically CPS (Raise effect), `unrelated` is Native, `caller(dangerous: Int)` body has a param-ref `dangerous` Ident (unique span deliberately omitted from the calls map) AND a real call to `unrelated()` (Ident span recorded in the calls map). Asserts `caller` Native. This is now genuinely discriminating: under the old name-only heuristic, caller would acquire a spurious edge to `dangerous` (CPS) and falsely classify CPS; under the precise calls-map-driven edge logic, the shadow Ident produces no edge and caller stays Native.

Also addresses the non-blocking nit on `bridge_callee_of`: replaced the misleading "skip when already-decided" hint with an honest comment explaining we always compute the map (the per-node reason loop needs it for both intrinsic-present and intrinsic-absent SCCs) and that pure leaf SCCs naturally hit the empty case without an explicit skip.

Pod-verify green; existing 32+1 unit tests pass.

boldfield · 2026-04-25T14:18:29Z

Picked option 3 from the residual-concern menu in c770c31 — synthetic-test span uniqueness, with the shadowing test rewritten to be discriminating.

Changes:

New unique_span() test-only helper: atomic-counter span generator producing a fresh span per call. Used only by tests that need to disambiguate individual Expr::Ident occurrences via their span. The constant span() helper unchanged for all other synth tests.
New synth_program_with_calls(items, calls) test helper that accepts a caller-supplied call_site_instantiations map (the auto-built map is a name-only heuristic, not env-precedence-aware).
Rewrote parameter_shadowing_top_level_fn_does_not_taint_caller as a synthetic program:
- dangerous is intrinsically CPS (Raise effect).
- unrelated is Native.
- caller(dangerous: Int) body has a param-ref dangerous Ident (unique span deliberately omitted from the calls map) AND a real call to unrelated() (Ident span recorded in the calls map).
- Asserts caller Native.

This is now genuinely discriminating: under the old name-only heuristic, caller would acquire a spurious edge to dangerous (CPS) and falsely classify CPS; under the precise calls-map-driven edge logic, the shadow Ident produces no edge and caller stays Native. The two regimes give different answers — exactly what your option 3 description called for.
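The mechanism can be sketched in isolation. Everything here is a simplified assumption rather than the actual test scaffolding: `Span` is a bare newtype, `Ident` stands in for `Expr::Ident`, and the calls map holds callee names instead of instantiations. Only the idea (an atomic counter for spans, and edges keyed by span rather than by name) is taken from the description above.

```rust
use std::collections::{BTreeSet, HashMap};
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical stand-in for the compiler's span type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Span(u32);

static NEXT_SPAN: AtomicU32 = AtomicU32::new(0);

/// Returns a span that no earlier call has returned.
fn unique_span() -> Span {
    Span(NEXT_SPAN.fetch_add(1, Ordering::Relaxed))
}

/// One identifier occurrence in a function body.
struct Ident {
    name: &'static str,
    span: Span,
}

/// Precise regime: an occurrence is an edge only if its span is in the calls map.
fn edges_from_calls_map(
    body: &[Ident],
    calls: &HashMap<Span, &'static str>,
) -> BTreeSet<&'static str> {
    body.iter().filter_map(|id| calls.get(&id.span).copied()).collect()
}

/// Old regime: any occurrence whose name matches a top-level fn is an edge.
fn edges_by_name(body: &[Ident], top_level: &[&'static str]) -> BTreeSet<&'static str> {
    body.iter().map(|id| id.name).filter(|n| top_level.contains(n)).collect()
}

/// Builds the body of `caller(dangerous: Int)` and returns both edge sets
/// as (precise, name-only).
fn shadowing_edges() -> (BTreeSet<&'static str>, BTreeSet<&'static str>) {
    let call = Ident { name: "unrelated", span: unique_span() };
    let shadow = Ident { name: "dangerous", span: unique_span() };
    // Only the real call is recorded; the parameter reference is omitted.
    let calls = HashMap::from([(call.span, "unrelated")]);
    let body = [call, shadow];
    (
        edges_from_calls_map(&body, &calls),
        edges_by_name(&body, &["dangerous", "unrelated"]),
    )
}
```

With a constant span fixture both identifiers would share one key, so omitting the parameter reference from the map would be impossible. Unique spans are what make the omission expressible.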

Also addressed the non-blocking nit on bridge_callee_of: the comment is now honest about why we always compute it (the per-node reason loop needs it regardless of whether the SCC has an intrinsic member) instead of suggesting an unimplemented skip.

CI running.

boldfield · 2026-04-25T14:22:31Z

Re-review — `c770c31e`

The shadowing-precision test is now genuinely discriminating, and the bridge_callee_of comment is honest about why the map is always computed. Both points from the prior re-review are resolved.

Verified

unique_span() + synth_program_with_calls (color.rs:786, 961). The atomic-counter span generator produces distinct keys per Ident; the manual calls-map constructor lets the test author model env-precedence by including/excluding specific spans.
Rewritten parameter_shadowing_top_level_fn_does_not_taint_caller (color.rs:1448). The construction:
- dangerous synth fn with vec!["Raise"] → local_color returns LocalColor::Cps (intrinsic).
- unrelated synth fn pure → Native sink.
- caller body has Stmt::Expr(Call(Ident("unrelated", S_unrelated))) plus tail Binary { lhs: Ident("dangerous", S_shadow), rhs: IntLit(1) } with both S_unrelated and S_shadow from unique_span().
- calls map contains only S_unrelated.
Under the precise drive (current code): caller's edges = {unrelated} (Native sink); caller stays Native. Under the old name-only heuristic: edges would be {unrelated, dangerous}; dangerous is intrinsic CPS; caller would propagate to CPS. Asserting caller == Native while dangerous == Cps discriminates the two regimes — exactly what was missing in the previous version.
bridge_callee_of comment (color.rs:146-152). The honest framing — "always computed because the per-node reason loop uses it for non-intrinsic members regardless of whether the SCC has an intrinsic CPS member" — correctly defends the design. My prior nit was wrong; the bridge map is needed even when an intrinsic member exists, to give bridge-form reasons to non-intrinsic peers in the same SCC.
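A rough sketch of why the bridge map is needed even when an intrinsic member exists. The `scc_reasons` function, its inputs, and the precedence order (own intrinsic reason, then own bridge callee, then the in-SCC fallback) are assumptions for illustration; the reason strings follow the stable formats listed in the PR description, though their exact spacing is also assumed.

```rust
use std::collections::HashMap;

/// Hypothetical per-node reason loop for one SCC.
/// `intrinsic` maps a member to its local Cps reason; `bridge_callee_of`
/// maps a member to a Cps callee in a downstream SCC.
/// Returns None when nothing makes the SCC Cps.
fn scc_reasons(
    members: &[&str],
    intrinsic: &HashMap<&str, String>,
    bridge_callee_of: &HashMap<&str, &str>,
) -> Option<Vec<String>> {
    // Fallback name for members with no reason of their own (assumed order:
    // an intrinsic member if one exists, otherwise a bridging member).
    let taint_source = members
        .iter()
        .find(|m| intrinsic.contains_key(**m))
        .or_else(|| members.iter().find(|m| bridge_callee_of.contains_key(**m)));
    let src = taint_source?;
    let reasons = members
        .iter()
        .map(|m| {
            if let Some(reason) = intrinsic.get(*m) {
                reason.clone()
            } else if let Some(callee) = bridge_callee_of.get(*m) {
                // Needed even when the SCC also has an intrinsic member.
                format!("cps: transitively calls {callee} which is cps")
            } else {
                format!("cps: in SCC with cps member {src}")
            }
        })
        .collect();
    Some(reasons)
}
```

If the bridge map were skipped whenever an intrinsic member is present, the second branch could never fire for that SCC, and a peer with its own Cps callee would get the less specific in-SCC reason.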

Verdict

LGTM. No further blocking concerns. Ready to merge once CI is green.

…ts (#18)

* [Task 51] examples/generic_map.sigil — first user-authored generics e2e

First user-authored generic syntax to flow through the full Sigil pipeline (lex → parse → resolve → typecheck → elaborate → monomorphize → color → codegen). PR #17's reviewer flagged this as the canonical reproducibility checkpoint for PR #16 (Task 49)'s `$$` mangling format: prior tests stop at the monomorph-IR level, and prior end-to-end examples declare no generic parameters.

What the example exercises:

- `type List[A] = | Nil | Cons(A, List[A])` — a self-recursive generic type. Monomorphization dedups via `type_seen` so each instantiation produces exactly one clone (`List$$Int`, `List$$String`).
- `fn map[A](xs: List[A]) -> List[A] ![]` — structure-preserving traversal, semantically `map id`. Sigil v1 surface has no `TypeExpr::Fn`, so the design doc's canonical higher-order `map[A,B,e](xs, f) -> List[B]` cannot be expressed; this is the closest Plan B can express.
- `fn length[A](xs: List[A]) -> Int ![]` — generic fold returning a concrete `Int`. Pairs with `map` to produce a verifiable scalar.
- `main` instantiates each generic at `Int` and `String` in the same program, producing four monomorph clones (`map$$Int`, `map$$String`, `length$$Int`, `length$$String`) plus mangled ctor sites (`Nil$$Int`, `Cons$$Int`, `Nil$$String`, `Cons$$String`).
- Pattern matching on generic ctors: scrutinee `Ty::User(_, args)` propagates per-sub-pattern field types so `Cons(h, t)` correctly types `h: A`, `t: List[A]` against the scrutinee's instantiated `A` (closes Task 49 round 3's regression reproducer).

Two e2e tests:

- `generic_map_example_prints_3_and_2` — full lex → parse → typecheck → elaborate → monomorphize → color → codegen → run pipeline. Asserts stdout exactly `"3\n2\n"`. Length values deliberately differ (3 vs 2) so a copy-paste error between the two list literals would surface as a length mismatch rather than pass silently.
- `generic_map_dump_color_all_native` — verifies Task 50's per-monomorph color inference classifies all four monomorphs plus `main` as native (bodies have row `![]`, `main` has `![IO]`, no `perform` to a non-IO effect). Pins each expected mangled name independently so a mangling-format slip on any single one (e.g. `map$$String`) lands on a directed assertion rather than an opaque overall-string diff.

Smoke + reproducibility scripts updated to include the new example so the per-host byte-stable-binary invariant covers the full generic pipeline.

Plan A2's `examples/higher_order.sigil` doc comment notes that `TypeExpr::Fn` was deferred to Plan A3; A3 did not actually deliver it (deferred again to v2 per scope). The plan's "generic map" text therefore describes what Plan B's surface CAN express, not the design doc's canonical higher-order signature.

* [Task 52] Validation prompts P16 + P17 — generic id, compose

Adds the two remaining Stage-5-feasible entries to the prompt bank:

- **P16 — generic identity at Int and String**. Fully expressible in Plan B's surface (no `TypeExpr::Fn` required). Asserts oracle output `"42\nsigil\n"`. Pins the discriminating contract that Algorithm W's fresh-var-per-call instantiation plus Task 49's reachability-bounded specialization produce exactly two monomorph clones (`id$$Int`, `id$$String`) — not one polymorphic body, not three from double-counted call sites.
- **P17 — generic compose applied across types**. Requires `TypeExpr::Fn` surface syntax for the higher-order parameters, same as P09 / P10. P09/P10 deferred this to Plan A3; A3 did not deliver it. P17 follows the same deferral pattern: oracle is graded against "program compiles" until function types ship. P17's distinguishing feature versus P10 is `A != C` (the result type genuinely travels through composition rather than absorbing into the trivial endo-functor case).

Plan B Task 61 will add P18 / P19 / P20 (Raise-based parser, State-threaded counter, multi-shot Choose).
Bank is now 17/20; remaining three land in Stage 6.

* [Task 51-52] PROGRESS: flip pending-ci entries → done; record Tasks 51 + 52

Standard PROGRESS hygiene at Stage 5 closeout. Per Plan A2 / Task 49 precedent, status flips for prior PRs go in the next task's PR.

- Tasks 4.5.1-4.5.5 (Stage 4.5 scaffolding) and Plan A3 carryover items: flipped `done-pending-ci` → `done` with squash-merge hash `01e0e13` (PR #14). All four CI jobs were green at merge.
- Task 47 (parser): `done-pending-ci` → `done` with `01e0e13` (also PR #14).
- Task 48 (HM unifier): `done-pending-ci` → `done` with `70756de` (PR #15).
- Task 50 (color inference): `done-pending-ci` → `done` with `82e0f97` (PR #17). Round 2 / round 3 fixup hashes preserved in the activity log; the squash-merge hash is the canonical post-merge anchor.
- Task 49 (monomorphization): notes corrected from the round-1 `_`/`__` mangling separator to the final `$`/`$$` format pinned by round-2 fixup `994f083`. Status hash `[981ec93]` left as-is (set by prior session's `531bcfe` flip commit, recorded the branch-commit hash convention rather than the squash-merge `858b4c2`).
- Task 51 entry: `done-pending-ci` with commit pointer `[HEAD]` describing the new `examples/generic_map.sigil`, two e2e tests, and smoke + reproducibility script updates.
- Task 52 entry: `done-pending-ci` with commit pointer `[HEAD]` noting that P17 follows the same `TypeExpr::Fn`-deferred grading as P09 / P10 until first-class function types ship.

Stage 5 review checkpoint remains pending — the next step before Stage 6 begins is Brian's review of row-unification, let-generalization, color decisions, and monomorphization naming determinism on adversarial inputs.

* [Task 51 fix-up] typecheck: cross-arm body consistency uses unify_ty, not Eq

CI failure on PR #18 surfaced an unprecedented Stage 5 typechecker bug: `check_match` compared arm body types via structural equality (`first != t`) instead of attempting unification.
With pre-Plan-B programs (every Plan A1/A2/A3 example) arm body types are concrete primitives or already-resolved user types, so the equality check held coincidentally. With generic-fn-internal matches whose arms return a generic-typed value (e.g. `fn map[A](xs: List[A]) -> List[A] { match xs { Nil => Nil, Cons(h, t) => Cons(h, map(t)) } }`), each ctor resolution allocates a fresh `List[?N]` user instance with a distinct fresh-var id — `Nil` resolves to `List[?6]` and `Cons(...)` to `List[?5]` — and the equality check reports `?6 != ?5` even though the two trivially unify.

Fix: snapshot `self.errors.len()`, call `unify_ty(first, t, &arm.span)`, and on failure truncate any internal E0044 "type mismatch" errors `unify_ty` itself pushed and emit E0065 with the arm-specific phrasing so the user-facing diagnostic surface stays unchanged. On success the unifier binds the fresh vars together (directly or transitively through the function's declared return type), and the match expression's overall type is whichever representative falls out of `deref(first)`.

Two regression tests:

- `generic_match_returning_generic_unifies_arms` — pins the exact reproducer from `examples/generic_map.sigil`'s `map` fn. Pre-fix this fails with E0065; post-fix the program typechecks clean.
- `match_arm_type_unification_still_rejects_real_mismatch` — regression guard that the new unify-based path still emits E0065 on a genuine Int-vs-String arm-body mismatch (i.e., the fix doesn't accidentally silence legitimate type errors). The long-standing `match_arm_types_must_unify_is_e0065` covers the same shape; this new test pins the contract specifically inside a generic-fn surface so the discriminating path is tested.

Plan B classification: bug fix in Task 48 surface (HM unification), discovered by Task 51's example. Not a deviation — the plan specified HM unification end-to-end, the implementation just had a coincidentally-passing structural-Eq check.
No PLAN_B_DEVIATIONS entry; PROGRESS notes for Task 51 will be updated with this fix-up in the PR body.

* [Task 51-52 review fixups] address PR #18 reviewer items

Two review comments on PR #18: an initial code-review pass, then a revised verdict ("request changes") that supersedes the first on the contentious subst-rollback question. This commit addresses every non-superseded item.

== From the revised verdict (Comment 2) ==

1. **PROGRESS Task 49 SHA normalized.** `[981ec93]` (branch-tip hash) → `[858b4c2]` (squash-merge SHA on `main`). Matches the convention every other prior task entry uses; closes the bookkeeping inconsistency the reviewer flagged.

2. **3-arm generic match regression test.** New `three_arm_generic_match_propagates_subst_across_all_arms` pins that cross-arm unify propagates substitutions across MORE than two arms. Each `W(x)`-style ctor allocates a fresh `Wrap[?N]` user instance, so arm 1 → `Wrap[?A]`, arm 2 → `Wrap[?B]`, arm 3 → `Wrap[?C]`. The cross-arm check unifies sequentially: arm1↔arm2 binds `?A := ?B`; arm1↔arm3 must then unify the deref'd representative against `Wrap[?C]`. A naive 2-arm-only check would miss a propagation bug on the third arm.

3. **Subst-pollution-pinning regression-guard.** New `subst_pollution_from_partial_unify_surfaces_at_call_site` is the discriminating test against a future "fix" that adds subst snapshot/restore on `unify_ty` failure. The program: `foo[A, B]` has a match where arm 1 returns `p: Pair[A, B]` and arm 2 returns `Pair("x", 3): Pair[String, Int]`. Cross-arm unify SUCCEEDS by binding `A := String`, `B := Int` — the body itself typechecks clean, but the bindings persist in the global subst. `foo`'s scheme is now over-constrained; `caller`'s `foo(Pair(1, 2), 0)` instantiates with `A := Int, B := Int` and the over-constraint surfaces as E0044 (concrete mismatch) + E0132 (ambiguous polymorphism: scheme generalization sees A and B already bound).
With rollback: arm 2's bindings discarded, foo stays generic, caller accepts `Pair[Int, Int]` cleanly with NO errors. Test pins the cascade by asserting BOTH E0044 AND E0132 appear, AND that the body's match itself has no E0065 (cross-arm unify succeeded; a body-level error would mean a regression in the opposite direction).

4. **Perf-floor instability flagged for Task 60 in PLAN_B_DEVIATIONS.md.** `fib_perf_example_prints_6765_under_50ms` and `tree_example_prints_32767_under_500ms` exceed wall-clock floors by ~4x in debug profile (~200ms each on aarch64-apple-darwin). Pre-existing on `main`, not Task 51's fault. Surfaced as a new VERIFICATION DEBT entry so Task 60 doesn't have to rediscover. Three resolution options enumerated; reviewer prefers "platform-aware tightening or release-only mode."

== From the initial review (Comment 1, where not superseded) ==

5. **Issue 2: selective error truncation on cross-arm unify failure.** Pre-fix `self.errors.truncate(pre_unify_errors)` removed every error `unify_ty` pushed, including E0126 (occurs check) and E0127 (row occurs) — which name a real soundness problem the generic E0065 wouldn't capture. Now drains errors past baseline, keeps non-E0044 codes (occurs-check kinds), drops E0044 (replaced by arm-specific E0065). User-facing diagnostic surface unchanged on the common path; correctness improved on the edge.

6. **Issue 3: P16 e2e test pins the prompt-bank claim.** New `p16_generic_id_at_int_and_string_oracle` runs P16's source through the full pipeline. Asserts (a) stdout exactly `"42\nsigil\n"` (P16 oracle), (b) `--dump-color` produces exactly 3 monomorph lines `{id$$Int, id$$String, main}` — a regression that double-counts call sites would surface as a 4th line; an unmonomorphized polymorphic body would surface as a bare `id native`. Makes the PR description's "exactly two clones" claim substantive instead of aspirational.

7. **Issue 4: P17 surface-syntax-pending pin.** New `p17_compose_source_rejects_until_typeexpr_fn_ships` writes the exact P17 prompt source and asserts the front end rejects it (specific error code is implementation detail and could shift between parser- and typecheck-level rejections; the test just asserts non-success). Once `TypeExpr::Fn` ships, the test should be inverted to assert success against the prompt's stdout oracle.

8. **Issue 6: unused `h` binding.** `Cons(h, t) => 1 + length(t)` in `length`'s body → `Cons(_, t) => 1 + length(t)`. `h` was never used; `_` matches Sigil idiom and avoids relying on the unused-binding policy. `map`'s body still uses `h`, which is fine because it's used in the result.

== Skipped ==

- **Issue 1 (subst rollback):** superseded by Comment 2's verdict that current no-rollback behavior is HM-correct. Item 3 above pins that contract via the regression-guard test.
- **Issue 5 (helper extraction):** minor style, single call site, not worth extracting per repo convention against speculative abstractions.
- **Issue 7 (reproducibility note):** informational only; no action requested.

== Test counts ==

348 → 351 compiler lib tests (+3 new typecheck tests). E2E gains 2 new tests (P16 oracle + dump-color, P17 surface-pending). Pod-verify green; full test suite deferred to CI.
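The cross-arm fix and the selective error drain described in the commit message above can be modeled on a toy checker. This is a sketch under assumptions, not the Sigil typechecker: `Ty`, `Checker`, and `check_arms` are invented for the example, error codes are plain strings, spans are dropped from `unify_ty`, and there is no occurs check, so the E0126/E0127 path is not exercised.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Int,
    Str,
    Var(u32),
    User(&'static str, Vec<Ty>),
}

#[derive(Default)]
struct Checker {
    subst: HashMap<u32, Ty>,
    errors: Vec<&'static str>,
}

impl Checker {
    /// Follows variable bindings until reaching an unbound var or a non-var.
    fn deref(&self, t: &Ty) -> Ty {
        match t {
            Ty::Var(v) => match self.subst.get(v) {
                Some(bound) => self.deref(bound),
                None => t.clone(),
            },
            _ => t.clone(),
        }
    }

    /// Unifies two types, pushing E0044 on mismatch. Toy only: no occurs check.
    fn unify_ty(&mut self, a: &Ty, b: &Ty) -> bool {
        match (self.deref(a), self.deref(b)) {
            (Ty::Var(x), Ty::Var(y)) if x == y => true,
            (Ty::Var(x), other) | (other, Ty::Var(x)) => {
                self.subst.insert(x, other);
                true
            }
            (Ty::User(n, xs), Ty::User(m, ys)) if n == m && xs.len() == ys.len() => {
                xs.iter().zip(&ys).all(|(x, y)| self.unify_ty(x, y))
            }
            (x, y) if x == y => true,
            _ => {
                self.errors.push("E0044");
                false
            }
        }
    }

    /// Cross-arm check: unify every arm with the first. On failure, drop the
    /// E0044s that unify_ty pushed, keep any other codes, and report E0065.
    fn check_arms(&mut self, arms: &[Ty]) -> Option<Ty> {
        let first = arms.first()?.clone();
        for arm in &arms[1..] {
            let baseline = self.errors.len();
            if !self.unify_ty(&first, arm) {
                let pushed: Vec<_> = self.errors.drain(baseline..).collect();
                self.errors
                    .extend(pushed.into_iter().filter(|code| *code != "E0044"));
                self.errors.push("E0065");
            }
        }
        Some(self.deref(&first))
    }
}
```

In this model `List[?6]` and `List[?5]` are structurally unequal, yet `check_arms` accepts them by binding one variable to the other, while an Int-versus-Str pair still ends with a single E0065.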

boldfield marked this pull request as ready for review April 25, 2026 06:31

boldfield merged commit 82e0f97 into main Apr 25, 2026
4 checks passed

boldfield mentioned this pull request Apr 25, 2026

[Tasks 51 + 52] Stage 5 closeout: generic_map example + P16/P17 prompts #18

Merged

8 tasks

boldfield merged 3 commits into main from plan-b-task-50
