Skip to content

Phase 2: collect-all-errors in ExprVisitor + StatementVisitor#965

Merged
ankushdesai merged 8 commits into
masterfrom
ankush/multi-error-typecheck-phase2
May 27, 2026
Merged

Phase 2: collect-all-errors in ExprVisitor + StatementVisitor#965
ankushdesai merged 8 commits into
masterfrom
ankush/multi-error-typecheck-phase2

Conversation

@ankushdesai
Copy link
Copy Markdown
Member

@ankushdesai ankushdesai commented May 26, 2026

Summary

Phase 2 of the multi-error type checker initiative. Builds on Phase 1 (#957) by lighting up the diagnostic collector across the two largest visitor files.

  • ~67 throw sites in ExprVisitor and ~44 in StatementVisitor converted to handler.Diagnostics.Report(...) + recovery value
  • New TypeCheckingUtils.CheckAssignable helper implements the cascade-suppression rule (an ErrorType operand silently passes compatibility checks, so one upstream error doesn't generate a chain of downstream noise)
  • Strict mode (default) invariants (see "Strict-mode contract" below for precision)
  • Collecting mode (P_COMPILER_COLLECT_ERRORS=1) reports all independent errors in one pass

Recovery conventions

ExprVisitor: every method visits children first, short-circuits on ErrorType, and converts each throw to Report + return new ErrorExpr(context). VisitBinExpr notably collapses to a single guard at the top of the switch covering all 14 inner throws.

StatementVisitor: statements either continue building the node with what they have (most type-mismatch cases) or return new NoStmt(context) when a critical declaration is missing (interface / function / state / event).

Helpers: the new CheckAssignable is the single place where ErrorType cascade suppression for IsAssignableFrom happens. PLanguageType.IsSameTypeAs already had the symmetric short-circuit from Phase 1.

Tests

Updated: Phase1DormancyTest split

The Phase 1 fixture asserted bit-identical stderr in both modes. That's no longer the right invariant for StaticError/ inputs in Phase 2 since collecting mode can legitimately report MORE errors. Split into:

  • StrictAndCollectingAgreeOnValidPrograms (Correct/ only): both modes empty stderr + exit 0
  • CollectingReportsAtLeastAsManyErrorsAsStrict (StaticError/ only): both exit 1, collecting count ≥ strict count. Catches the regression that matters most — collecting silently suppressing a strict error.

New: MultipleErrors.p + MultiErrorAcceptanceTest.cs

A curated P file with 4 deliberately-independent errors (bool→int, missing decl, int+string, send-arity mismatch). Pinned counts:

  • Strict mode: exactly 1 error (aborts on first throw — preserves historical behavior)
  • Collecting mode: exactly 4 errors (no spurious cascade diagnostics)

A change in either count signals a specific regression — see the test's class doc for diagnostic guidance.

Phase 3 (next, deferred)

Pass-level tolerance classification in Analyzer.cs — decide per pass whether it can run on a partially-broken AST (capability check and module-system passes need a clean AST; control-flow check can tolerate per-function errors). Estimated 2–3 days. Will be a separate PR on top of this one.

Test plan

  • CI green on Build-And-Test-{MacOS,Ubuntu,Windows} (existing static-error suite passes unchanged in strict mode)
  • CI green on Phase1Dormancy category (split fixture covers Correct + StaticError invariants)
  • CI green on MultiErrorAcceptanceTest (pinned 1 vs 4 error counts)
  • Local smoke: P_COMPILER_COLLECT_ERRORS=1 p compile on MultipleErrors.p prints 4 errors

🤖 Generated with Claude Code

Strict-mode contract — precise invariants

Following Copilot's review of round 2, the original "strict mode is bit-for-bit unchanged" claim was over-broad. The precise invariants that hold (and that we test):

  1. Valid programs: compile to bit-identical output. No behavior change at all.
  2. Invalid programs: still exit 1; total error count is the same as pre-Phase 2 (strict still aborts on first error).
  3. First-error IDENTITY may differ for invalid programs that contain BOTH an argument error AND a callee/format/interface error. Specifically: VisitCtorExpr, VisitFunCallExpr, VisitFormatedString, VisitCtorStmt now visit child arguments BEFORE the lookup that uses them. If both layers have errors, strict mode now reports the child's first.

The reordering is deliberate — visiting children before parents is the natural compiler convention (rustc/swiftc/tsc) and is what lets collecting mode surface internal arg errors when the parent lookup fails (which is the whole point of Phase 2). Gating the reordering behind ContinueOnError would require 4× duplicated code paths in the affected visitors; the maintenance burden isn't worth the marginal benefit of identical first-error wording in invalid programs.

Existing StaticError/ regression tests (the broadest baseline) pass unchanged because StaticErrorValidator only checks exit code == 1, not the diagnostic text.

Copilot AI review requested due to automatic review settings May 26, 2026 04:13
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Phase 2 of the multi-error type-checker work: converts large portions of expression/statement type-checking from throw-on-first-error to diagnostic collection with recovery, enabling P_COMPILER_COLLECT_ERRORS=1 to report multiple independent errors in one compilation pass while preserving strict-mode behavior.

Changes:

  • Convert many throw handler.X(...) sites in ExprVisitor and StatementVisitor to handler.Diagnostics.Report(...) with recovery (ErrorExpr / NoStmt / best-effort statement construction).
  • Add TypeCheckingUtils.CheckAssignable(...) to centralize cascade suppression when either side is ErrorType.
  • Update/add tests and regression inputs to validate strict vs collecting invariants and pinned multi-error counts.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
Tst/UnitTests/TypeChecker/Phase1DormancyTest.cs Splits strict/collecting invariants between Correct/ and StaticError/ suites; adds coarse error counting.
Tst/UnitTests/TypeChecker/MultiErrorAcceptanceTest.cs New acceptance test pinning strict vs collecting diagnostic counts for a curated multi-error program.
Tst/RegressionTests/Feature3Exprs/StaticError/MultipleErrors/MultipleErrors.p New regression input containing 4 independent errors used by the acceptance test.
Src/PCompiler/CompilerCore/TypeChecker/TypeCheckingUtils.cs Introduces CheckAssignable and adjusts payload validation to report via diagnostics.
Src/PCompiler/CompilerCore/TypeChecker/StatementVisitor.cs Converts many statement type-check errors to Report + recover, with placeholder statements where needed.
Src/PCompiler/CompilerCore/TypeChecker/ExprVisitor.cs Converts many expression type-check errors to Report + return ErrorExpr, adds top-level binop guard, and visits args earlier in some constructs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +71 to +73
// If any argument failed to type-check, an upstream diagnostic was
// already reported. Skip payload validation to avoid cascade noise.
if (arguments.Any(a => a.Type is ErrorType) || payloadType is ErrorType) return;
handler.Diagnostics.Report(handler.MissingDeclaration(context.item, "foreach iterator variable", varName));
// Still visit children so their internal errors surface.
exprVisitor.Visit(context.collection);
Visit(context.statement());
Comment on lines 548 to +551
if (machine?.IsSpec == true)
{
throw handler.IllegalMonitorOperation(context, context.RECEIVE().Symbol, machine);
handler.Diagnostics.Report(handler.IllegalMonitorOperation(context, context.RECEIVE().Symbol, machine));
return new NoStmt(context);
Comment on lines 457 to +461
public override IPExpr VisitTestExpr(PParser.TestExprContext context)
{
var instance = Visit(context.instance);
string name = context.kind.GetText();

Comment on lines 481 to +485
public override IPExpr VisitTargetsExpr(PParser.TargetsExprContext context)
{
var instance = Visit(context.instance);
var target = Visit(context.target);

ankushdesai added a commit that referenced this pull request May 26, 2026
All five comments from Copilot's review of #965 were valid recovery
holes — places where the converted code suppressed a diagnostic too
aggressively or failed to propagate ErrorType. Fixes:

## 1. ValidatePayloadTypes: don't swallow the arity check

Previous: any argument with ErrorType caused an early return from the
whole helper, hiding the independent IncorrectArgumentCount diagnostic
when arity was wrong AND one arg also errored upstream.

Fix: remove the blanket early-return on `arguments.Any(a => a.Type is
ErrorType)`. Per-argument cascade suppression is already handled by
CheckAssignable's own ErrorType short-circuit, so the arity check (a
property of the call-site, not the arg types) now fires independently.
Added an explicit arity check inside the tuple branch since the
previous code relied on Zip silently truncating mismatched counts.

## 2. VisitForeachStmt: visit invariants when iterator var is missing

Previous: missing-iterator recovery visited the collection and body
but not `context._invariants`. Type errors inside loop invariants
were silently suppressed in collecting mode.

Fix: visit each invariant via exprVisitor.Visit before returning NoStmt.

## 3. VisitReceiveStmt: visit handler bodies on spec-illegal monitor op

Previous: when receive was illegal (spec machine), the method reported
IllegalMonitorOperation and returned NoStmt immediately, suppressing
all diagnostics inside the handler bodies.

Fix: mirror what Send/Announce/Raise do for their child expressions —
construct each recvCase handler (with its scope and parameter) and run
FunctionBodyVisitor.PopulateMethod to surface body errors before
returning NoStmt. Event-id lookups and duplicate-case checks are
intentionally skipped since the receive itself is already illegal.

## 4. VisitTestExpr: propagate ErrorType from instance

Previous: an instance with ErrorType still produced a bool-typed
TestExpr, masking the cascade for downstream consumers.

Fix: do the kind-identifier lookup first (so missing-decl on the kind
is still reported even when instance errored), then bail to ErrorExpr
if instance.Type is ErrorType. Preserves both diagnostics' independence.

## 5. VisitTargetsExpr / VisitFlyingExpr / VisitSentExpr: propagate ErrorType

Previous: these three constructed bool-typed nodes regardless of operand
state, breaking the ErrorType propagation convention documented in
ExprVisitor's class header.

Fix: add the same early-guard pattern used elsewhere — if any visited
child has ErrorType, return new ErrorExpr(context) before constructing
the typed node.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ankushdesai ankushdesai requested a review from Copilot May 26, 2026 08:10
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

{
throw handler.TypeMismatch(context.rvalue(), valueType, expectedValueType);
}
TypeCheckingUtils.CheckAssignable(handler, context.rvalue(), expectedKeyType, index);
Comment on lines 283 to +291
var interfaceName = context.interfaceName.GetText();
// Always visit the argument list so internal arg errors surface,
// even if the interface itself is unknown.
var arguments = TypeCheckingUtils.VisitRvalueList(context.rvalueList(), this).ToArray();

if (!table.Lookup(interfaceName, out Interface @interface))
{
throw handler.MissingDeclaration(context.interfaceName, "interface", interfaceName);
handler.Diagnostics.Report(handler.MissingDeclaration(context.interfaceName, "interface", interfaceName));
return new ErrorExpr(context);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged — you're right that the eager argument visit changes WHICH error fires first in strict mode when an invalid program has both an argument error and a callee/format error. The PR description over-claimed "bit-for-bit unchanged" — the precise invariants that hold are:

  1. Valid programs compile to bit-identical output (no behavior change at all).
  2. Invalid programs still exit 1 with the same total error count in strict mode (the test harness's StaticErrorValidator only checks exit code).
  3. The IDENTITY of the first error reported can differ for invalid programs containing both an argument error AND a callee/format/interface error.

This was a deliberate tradeoff captured by the prior strict-mode safety audit: visiting children before parents is the more natural compiler convention (it's what rustc/swiftc/tsc do), and crucially the new order is what lets collecting mode surface child errors when the parent lookup fails — which is the whole point of Phase 2.

Gating the reordering behind ContinueOnError would require 4 duplicated code paths in VisitCtorExpr / VisitFunCallExpr / VisitFormatedString / VisitCtorStmt. The tradeoff isn't worth the maintenance burden for "the first of several diagnostics on an invalid program looks slightly different".

I'll update the PR description to be precise about which strict-mode invariants are preserved and which are intentionally relaxed. Existing static-error regression tests pass unchanged because they only check exit code, not the first-error message.

Comment on lines 309 to +315
var funName = context.fun.GetText();
// Visit arguments first so their internal errors surface regardless
// of whether the callee resolves. Argument errors do not block
// dispatch — we still try to bind a function/pure with the args we
// got, treating ErrorExpr args as compatible with anything.
var arguments = TypeCheckingUtils.VisitRvalueList(context.rvalueList(), this).ToArray();

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged — you're right that the eager argument visit changes WHICH error fires first in strict mode when an invalid program has both an argument error and a callee/format error. The PR description over-claimed "bit-for-bit unchanged" — the precise invariants that hold are:

  1. Valid programs compile to bit-identical output (no behavior change at all).
  2. Invalid programs still exit 1 with the same total error count in strict mode (the test harness's StaticErrorValidator only checks exit code).
  3. The IDENTITY of the first error reported can differ for invalid programs containing both an argument error AND a callee/format/interface error.

This was a deliberate tradeoff captured by the prior strict-mode safety audit: visiting children before parents is the more natural compiler convention (it's what rustc/swiftc/tsc do), and crucially the new order is what lets collecting mode surface child errors when the parent lookup fails — which is the whole point of Phase 2.

Gating the reordering behind ContinueOnError would require 4 duplicated code paths in VisitCtorExpr / VisitFunCallExpr / VisitFormatedString / VisitCtorStmt. The tradeoff isn't worth the maintenance burden for "the first of several diagnostics on an invalid program looks slightly different".

I'll update the PR description to be precise about which strict-mode invariants are preserved and which are intentionally relaxed. Existing static-error regression tests pass unchanged because they only check exit code, not the first-error message.

Comment on lines 993 to +1002
var baseString = context.StringLiteral().GetText();
baseString = baseString.Substring(1, baseString.Length - 2); // strip beginning / end double quote
var numNecessaryArgs = TypeCheckingUtils.PrintStmtNumArgs(baseString);
// Visit args regardless of format validity so their internal errors surface.
var args = TypeCheckingUtils.VisitRvalueList(context.rvalueList(), this).ToList();

if (numNecessaryArgs == -1)
{
throw handler.InvalidStringExprFormat(context, context.StringLiteral().Symbol);
handler.Diagnostics.Report(handler.InvalidStringExprFormat(context, context.StringLiteral().Symbol));
return new ErrorExpr(context);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged — you're right that the eager argument visit changes WHICH error fires first in strict mode when an invalid program has both an argument error and a callee/format error. The PR description over-claimed "bit-for-bit unchanged" — the precise invariants that hold are:

  1. Valid programs compile to bit-identical output (no behavior change at all).
  2. Invalid programs still exit 1 with the same total error count in strict mode (the test harness's StaticErrorValidator only checks exit code).
  3. The IDENTITY of the first error reported can differ for invalid programs containing both an argument error AND a callee/format/interface error.

This was a deliberate tradeoff captured by the prior strict-mode safety audit: visiting children before parents is the more natural compiler convention (it's what rustc/swiftc/tsc do), and crucially the new order is what lets collecting mode surface child errors when the parent lookup fails — which is the whole point of Phase 2.

Gating the reordering behind ContinueOnError would require 4 duplicated code paths in VisitCtorExpr / VisitFunCallExpr / VisitFormatedString / VisitCtorStmt. The tradeoff isn't worth the maintenance burden for "the first of several diagnostics on an invalid program looks slightly different".

I'll update the PR description to be precise about which strict-mode invariants are preserved and which are intentionally relaxed. Existing static-error regression tests pass unchanged because they only check exit code, not the first-error message.

Comment on lines 333 to +342
public override IPStmt VisitCtorStmt(PParser.CtorStmtContext context)
{
var interfaceName = context.iden().GetText();
// Always visit arguments so their internal errors surface.
var args = TypeCheckingUtils.VisitRvalueList(context.rvalueList(), exprVisitor).ToList();

if (!table.Lookup(interfaceName, out Interface targetInterface))
{
throw handler.MissingDeclaration(context.iden(), "interface", interfaceName);
handler.Diagnostics.Report(handler.MissingDeclaration(context.iden(), "interface", interfaceName));
return new NoStmt(context);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged — you're right that the eager argument visit changes WHICH error fires first in strict mode when an invalid program has both an argument error and a callee/format error. The PR description over-claimed "bit-for-bit unchanged" — the precise invariants that hold are:

  1. Valid programs compile to bit-identical output (no behavior change at all).
  2. Invalid programs still exit 1 with the same total error count in strict mode (the test harness's StaticErrorValidator only checks exit code).
  3. The IDENTITY of the first error reported can differ for invalid programs containing both an argument error AND a callee/format/interface error.

This was a deliberate tradeoff captured by the prior strict-mode safety audit: visiting children before parents is the more natural compiler convention (it's what rustc/swiftc/tsc do), and crucially the new order is what lets collecting mode surface child errors when the parent lookup fails — which is the whole point of Phase 2.

Gating the reordering behind ContinueOnError would require 4 duplicated code paths in VisitCtorExpr / VisitFunCallExpr / VisitFormatedString / VisitCtorStmt. The tradeoff isn't worth the maintenance burden for "the first of several diagnostics on an invalid program looks slightly different".

I'll update the PR description to be precise about which strict-mode invariants are preserved and which are intentionally relaxed. Existing static-error regression tests pass unchanged because they only check exit code, not the first-error message.

ankushdesai added a commit that referenced this pull request May 27, 2026
Multi-agent review of PRs #957/#965/#967 identified 3 HIGH-severity
harness fragilities (could silently mask real bugs) and ~10 coverage
gaps. This commit applies the harness fixes + the 4 highest-priority
new pinned tests + 1 doc fix.

## Harness fixes

### 1. Count ALL diagnostic markers, not just `[Error:]`

Previous: `CountOccurrences(stderr, "[Error:]") + CountOccurrences(stderr, "[Parser Error:]")`.

Problem: Compiler.cs also emits `[NotSupportedError:]`, `[NotImplementedError:]`,
and per-backend `[<entry> Compiling Generated Code:]` markers. A regression
that converts a real diagnostic into a `NotSupportedException` would be
counted as zero errors, making `CollectingReportsAtLeastAsManyErrorsAsStrict`'s
`>=` invariant trivially pass with both sides zero — masking the regression
entirely. We already hit this once when the ForeachInvariantError.p test
tripped a `NotSupportedException` on the PVerifier `invariant` keyword.

Fix: new `CountErrorMarkers(stderr)` uses two regexes — one for
`[<text>Error<text>:]` (catches all five marker variants) and one for the
explicit `Compiling Generated Code` marker. Applied to both
Phase1DormancyTest.cs and MultiErrorAcceptanceTest.cs.

### 2. Catch ONLY TranslationException

Previous: `catch (Exception e)` in both fixture's RunOnce helpers.

Problem: Phase 2's MultiAgent audit explicitly avoided this pattern. NREs,
`Debug.Assert` failures, and other runtime exceptions get folded into
`exitCode = -1` and identical stderr text — two parallel NREs produce
equal output, equal exit codes, equal error counts (0), and the
`StrictAndCollectingAgreeOnValidPrograms` equality assertion passes
green despite both runs crashing.

Fix: catch only `TranslationException`. Other exceptions propagate so
genuine bugs surface as test failures with full stack traces.

### 3. Mark env-var test [NonParallelizable]

Previous: `DiagnosticCollectorTest.ContinueOnError_ReadsEnvVar` mutates
the process-global `P_COMPILER_COLLECT_ERRORS` env var.

Problem: `RegressionTests.CompileOnlyRegressionTests` is
`[Parallelizable(ParallelScope.Children)]`. With `dotnet test` running
fixtures concurrently, a parallel CompilerConfiguration construction
could read the temporarily-set env var and run a strict-mode regression
test in collecting mode — silently masking a strict-mode regression as
a flaky-pass.

Fix: add `[NonParallelizable]` to the env-var test.

## Doc fix

`MultipleErrors.p` line 11 said the 4th error was "arg count mismatch".
The actual mechanism is "payload TYPE mismatch via single-arg
CheckArgument branch" because `(1,2,3)` parses as a single unnamed-tuple
argument. Count is 1 either way, but the prose was actively misleading.

## 4 new pinned coverage tests

The audit identified ~12 throw sites in ExprVisitor/StatementVisitor with
NO multi-error coverage at all. The four highest-priority additions:

1. **CtorArgErrors.p (1/3)** — VisitCtorExpr arg-pre-visit recovery.
   Validates that constructor args are visited BEFORE interface lookup,
   so missing-decl errors in args surface even when the interface
   itself is unknown.

2. **FunCallArgErrors.p (1/4)** — VisitFunCallExpr Function branch.
   Two calls combining argument errors with call-shape errors (arity +
   type mismatch). Validates cascade-suppression of arg[0] while arg[1]
   mismatch still surfaces.

3. **MultiFunctionPerMachine.p (1/3)** — Phase 3 per-FUNCTION isolation
   distinct from per-machine. Three functions in ONE machine, one
   error each. Catches a regression where Phase 3's per-function
   TolerantStep wrapper stops iterating after the first failing
   function.

4. **GotoRecoveryErrors.p (1/4)** — VisitGotoStmt rvalue-pre-visit
   recovery in both the missing-state branch and the arity-mismatch
   branch. Validates that argument errors surface even when the
   goto itself is malformed.

## Deferred (lower-priority audit findings)

- 6 more coverage gaps (CastExpr, ChooseExpr, FormatString, dup tuple
  field, RaiseStmt non-void, etc.). Less critical because each is
  covered by Phase1DormancyTest's weak `>=` baseline.
- Harness #5/#7 (scratch-dir GetHashCode collisions, parallel dotnet
  build). The Ubuntu flake from earlier may relate but hasn't recurred
  since the projectRoot fix. Tracked separately.
- Harness #8 (test timeouts). Nice-to-have; current CI runs reliably.
- Identity-aware assertions (stderr substring matches) — would lock in
  which error fires first, but adds maintenance cost. Defer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Base automatically changed from ankush/multi-error-typecheck-phase1 to master May 27, 2026 22:21
ankushdesai added a commit that referenced this pull request May 27, 2026
Real bug + 3 doc accuracy improvements from Copilot's second-round
review of PR #965 (1 code fix) and first-round review of PR #967
(2 doc updates):

## 1. VisitInsertStmt key-check pointed at the wrong AST node

`StatementVisitor.VisitInsertStmt` reported the key/index TypeMismatch
at `context.rvalue()` (the VALUE token of `insert lvalue[index] = value`)
when the key actually lives in `context.expr()`. Diagnostics like
"got type: string, expected: int" would point the user at the value
expression instead of the index expression — confusingly off-target.

The bug pre-existed the Phase 2 conversion (master had the same wrong
context), but Copilot caught it because Phase 2 made the diff visible
when converting `throw → CheckAssignable`. Fixed at line 215 to use
context.expr() for the key check; context.rvalue() remains correct for
the value check.

## 2. TolerantStep XML doc no longer claims cascade-suppression

The previous doc said TolerantStep "runs an analysis step under cascade-
suppression rules" — but it's used for both gathering AND analysis
passes, and it doesn't implement cascade suppression at all (cascade
suppression lives in CheckAssignable + the ErrorType IsSameTypeAs
short-circuit; TolerantStep is the per-pass-item analog: same goal,
different mechanism). Rewritten to:
- Acknowledge dual-use across gathering and analysis passes
- Distinguish from the expression-level cascade-suppression machinery
- Make the strict vs collecting behavior contracts explicit

## 3. MultiMachineErrors test claim narrowed to what it actually tests

The .p file header and the MultiErrorAcceptanceTest description both
claimed the test validates "TolerantStep wrappers around pass 2a
(MachineChecker) and pass 3". In reality, each machine in the file has
a valid start state and payload type, so pass 2a never throws — the
test only exercises pass 3 (function body type-checking) errors flowing
through Phase 2's Report-and-continue plus Phase 3's per-function
TolerantStep.

Both descriptions updated to accurately call out:
- What the test DOES exercise (per-machine isolation across pass 3)
- What it DOESN'T cover (per-machine pass-2a throws like
  MissingStartState — would need a separate fixture)

## 4. PR description: precise strict-mode invariants

Also updated PR #965's description to be precise about which strict-mode
invariants the PR preserves vs. relaxes. The original claim "strict mode
bit-for-bit unchanged" was over-broad — valid programs ARE bit-identical
and invalid programs still exit 1, but the first-error IDENTITY can
differ for invalid programs with both an argument error AND a
callee/format/interface error (because args are now visited BEFORE
lookup, which is the compiler-convention order and what lets collecting
mode surface child errors).

Replied on the 4 strict-mode reorder Copilot threads with the same
rationale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ankushdesai and others added 8 commits May 27, 2026 15:42
Lights up the diagnostic collector scaffolded in #957. In strict mode
(default), behavior is unchanged because handler.Diagnostics.Report
re-throws immediately. In collecting mode (P_COMPILER_COLLECT_ERRORS=1)
the type checker now reports all independent expression / statement
errors in one pass instead of aborting at the first.

- CheckAssignable(handler, ctx, expected, actual): report-and-return-false
  variant of IsAssignableFrom. ErrorType on either side is silently
  treated as compatible, preventing one upstream error from generating
  a chain of "incompatible operand" diagnostics downstream.
- CheckArgument and ValidatePayloadTypes route through CheckAssignable
  so call-site / send-site argument checks inherit the same suppression.

Every visit method follows the same convention:
  1. Visit children first (so their internal errors surface).
  2. If any child has ErrorType, return new ErrorExpr(context) without
     further checks (combiner rule).
  3. Each `throw handler.X(...)` becomes
     `handler.Diagnostics.Report(handler.X(...)); return new ErrorExpr(...);`
     with remaining compatibility checks routed via CheckAssignable.

Notable per-site recoveries:
  - VisitBinExpr: a single ErrorType guard at the top covers all 14
    throws inside the switch.
  - VisitFunCallExpr / VisitCtorExpr: arguments are visited *before*
    callee/interface resolution so their internal errors surface even
    when the callee is unknown.
  - VisitCastExpr: resolve target type even when sub-expression errored,
    so a malformed type expression in a cast also gets reported.

Statements either continue building the AST node with the values they
have (most cases) or return new NoStmt(context) when there's nothing
meaningful to construct (missing variable / interface / function /
state / event):
  - Missing-declaration sites in CtorStmt, FunCallStmt, ForeachStmt,
    GotoStmt, ReceiveStmt → NoStmt placeholder.
  - Type-mismatch sites in AssertStmt, AssumeStmt, PrintStmt, ReturnStmt,
    AssignStmt, IfStmt, WhileStmt etc. → Report and continue with the
    typed expression as-is (downstream sees ErrorType if upstream).
  - SendStmt / RaiseStmt / AnnounceStmt: ErrorType guard around the
    null-event + assignability checks so an upstream-errored event
    expression doesn't generate spurious diagnostics.

The Phase 1 fixture asserted bit-identical stderr in both modes. That's
no longer the right invariant for StaticError/ inputs — collecting mode
can now legitimately report MORE errors per file. Split into:
  - StrictAndCollectingAgreeOnValidPrograms: Correct/ only. Both modes
    must produce empty stderr and exit 0.
  - CollectingReportsAtLeastAsManyErrorsAsStrict: StaticError/ only.
    Both exit 1; collecting count >= strict count. Catches the only
    regression that matters: collecting silently suppressing a strict
    error.

A curated P file with 4 deliberately-independent errors:
  - bool assigned to int
  - missing-declaration (must not cascade into "wrong type")
  - int + string
  - send arg-count mismatch

Strict mode reports exactly 1 (aborts on first). Collecting mode
reports exactly 4 with no spurious cascade diagnostics. Pinning both
counts catches regressions to either the throw-site conversion or
the cascade-suppression rules.

handler.Diagnostics.Report re-throws when ContinueOnError=false. Every
existing static-error test passes unchanged — the throw-site conversion
preserves the historical "first error wins, abort" behavior bit-for-bit
in strict mode.

Pass-level tolerance classification in Analyzer.cs — decide per pass
whether it can run on a partially-broken AST (e.g. capability check
and module-system passes need a clean AST; control-flow check can
tolerate per-function errors). Estimated 2-3 days.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two fixes for the Phase 2 CI failures:

## 1. NamedTupleBody dict collision (real bug)

In collecting mode, the recovery for a duplicate field name added an
entry with the *original* name to the entries[] array. Downstream
NamedTupleType construction builds a Dictionary keyed by entry name,
which throws InvalidOperationException on the collision. Test
`Feature4DataTypes | StaticError | namedDuplicateField2` failed with
"An item with the same key has already been added. Key: a".

Fix: mangle the duplicate's name with a unique suffix (`$dup$N`). The
duplicate-field diagnostic is already reported to the user via
handler.Diagnostics.Report; mangling just keeps the AST well-formed for
any subsequent type checks that happen to inspect the tuple.

## 2. Stack traces in test harness (diagnostic improvement)

Phase1DormancyTest + MultiErrorAcceptanceTest both swallow uncaught
exceptions from new Compiler().Compile(...) and report only e.Message.
Several CI failures showed NREs on valid programs in strict mode with
no way to identify which compiler pass NRE'd. Now we print
`e.GetType().Name` and the full stack trace into stderr so the next CI
run pinpoints the responsible code site.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PCheckerCodeGenerator.Compile dereferences job.ProjectRootPath.FullName
at the dotnet-build stage. My fixtures used the 5-arg CompilerConfiguration
constructor without supplying projectRoot, so the PChecker backend NRE'd
during the second stage of Compile on every Correct program.

This was masked in Phase 1 because both modes (strict and collecting)
NRE'd symmetrically, and the Phase1DormancyTest's identical-stderr
assertion still held. The Phase 2 split asserts exit code == 0 on
valid programs, which exposed the bug.

Mirror PCheckerRunner.cs by passing scratchDir as the projectRoot. The
MultiErrorAcceptanceTest fixture also gets the same fix as defence in
depth — its tests stop at type-check (HasErrors causes Compiler.cs to
return early before reaching the backend), so the bug wouldn't fire
there in practice, but consistency keeps the next person from re-
discovering this.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All five comments from Copilot's review of #965 were valid recovery
holes — places where the converted code suppressed a diagnostic too
aggressively or failed to propagate ErrorType. Fixes:

## 1. ValidatePayloadTypes: don't swallow the arity check

Previous: any argument with ErrorType caused an early return from the
whole helper, hiding the independent IncorrectArgumentCount diagnostic
when arity was wrong AND one arg also errored upstream.

Fix: remove the blanket early-return on `arguments.Any(a => a.Type is
ErrorType)`. Per-argument cascade suppression is already handled by
CheckAssignable's own ErrorType short-circuit, so the arity check (a
property of the call-site, not the arg types) now fires independently.
Added an explicit arity check inside the tuple branch since the
previous code relied on Zip silently truncating mismatched counts.

## 2. VisitForeachStmt: visit invariants when iterator var is missing

Previous: missing-iterator recovery visited the collection and body
but not `context._invariants`. Type errors inside loop invariants
were silently suppressed in collecting mode.

Fix: visit each invariant via exprVisitor.Visit before returning NoStmt.

## 3. VisitReceiveStmt: visit handler bodies on spec-illegal monitor op

Previous: when receive was illegal (spec machine), the method reported
IllegalMonitorOperation and returned NoStmt immediately, suppressing
all diagnostics inside the handler bodies.

Fix: mirror what Send/Announce/Raise do for their child expressions —
construct each recvCase handler (with its scope and parameter) and run
FunctionBodyVisitor.PopulateMethod to surface body errors before
returning NoStmt. Event-id lookups and duplicate-case checks are
intentionally skipped since the receive itself is already illegal.

## 4. VisitTestExpr: propagate ErrorType from instance

Previous: an instance with ErrorType still produced a bool-typed
TestExpr, masking the cascade for downstream consumers.

Fix: do the kind-identifier lookup first (so missing-decl on the kind
is still reported even when instance errored), then bail to ErrorExpr
if instance.Type is ErrorType. Preserves both diagnostics' independence.

## 5. VisitTargetsExpr / VisitFlyingExpr / VisitSentExpr: propagate ErrorType

Previous: these three constructed bool-typed nodes regardless of operand
state, breaking the ErrorType propagation convention documented in
ExprVisitor's class header.

Fix: add the same early-guard pattern used elsewhere — if any visited
child has ErrorType, return new ErrorExpr(context) before constructing
the typed node.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After 3 rounds of Copilot findings, ran a 5-agent parallel audit covering
cascade-suppression conventions, AST-recovery downstream NRE risks, strict-
mode preservation, test coverage gaps, and Phase 3 forward compatibility.
This commit applies the HIGH and MEDIUM severity findings and adds three
new pinned-count acceptance tests.

## Why Copilot kept finding things

Cascade-suppression has ~5 rules (producer, propagator, combiner, lvalue,
call) and the conversion touched ~30 visit methods. Verifying every method
follows every rule is combinatorially hard for a single reviewer. The
multi-agent audit caught a third set of bugs that a single read missed —
this time we also added programmatic acceptance tests that pin error
counts so the next regression fires loudly.

## Tier 1: cascade-suppression fixes (HIGH severity)

**1. `VisitReceiveStmt` spec-machine recovery now visits event-id lookups.**
Copilot's first round added handler body visiting; the audit caught that
undeclared events in the `case ...` lists were still being swallowed.

**2. `VisitQuantExpr` diff-bound recovery now visits the body.**
The two early-returns (arity mismatch, bound-type mismatch) skipped the
body, so nested type errors inside `forall ... :: <body>` quantifiers
got lost when the bound was malformed.

**3. `VisitQuantExpr` bound-ErrorType short-circuit.**
When TypeResolver reported a missing declaration on the bound type, the
diff switch defaulted to a spurious "expected Event" cascade. New
explicit `if (bound[0].Type is ErrorType)` guard bails silently and
visits the body anyway.

## Tier 2: defense-in-depth for Phase 3

**4. Analyzer.cs HasErrors gate before pass 8.**
ModuleSystemDeclarations / ModuleSystemTypeChecker (passes 8-10) contain
hard casts (`(NamedTupleExpr)`, `(NamedTupleType)`, `(SeqLiteralExpr)`)
that ErrorExpr / ErrorType would trip with InvalidCastException in
collecting mode on a partial AST. Today this is masked by Compiler.cs's
HasErrors gate at the END of compilation — but those passes still RAN
on the broken AST. Skip them when diagnostics are present. Phase 3 will
replace this single gate with per-pass tolerance classification.

**5. `FunctionBodyVisitor.PopulateMethod` backfills empty Body.**
`Function.IsForeign = Body == null`, so a function that errored partway
through population would silently be misclassified as foreign. Backfill
`new CompoundStmt(...)` on failure. Belt-and-braces — Compiler.cs's
HasErrors gate already ensures these don't reach IR or backends, but
preserving the "non-foreign ⇒ has body" invariant simplifies Phase 3.

**6. `ValidatePayloadTypes` null payload safety.**
Pass 2a (MachineChecker) may leave PayloadType null when an event/
interface declaration itself errored. Treat null as ErrorType bail so
ValidatePayloadTypes doesn't NRE in collecting mode when called via
pass-3 recovery paths.

## Tier 3: pinned-count acceptance tests

Refactored `MultiErrorAcceptanceTest` from one hard-coded file to a
`[TestCaseSource]`-driven fixture. Each row pins (strict count,
collecting count) for a curated `.p` file. When counts change
intentionally, update the row AND the file's header comment.

Three new pinned cases:

- **`NestedExprErrors.p`** (1 / 2): single expression with two nested
  missing-declarations. Validates combiner rule — `undeclaredA.foo +
  undeclaredB.bar * "str"` must NOT produce extra cascades on `+` or
  `*`.
- **`ForeachInvariantError.p`** (1 / 2): foreach with valid iterator,
  body error + invariant error. Validates the happy-path branch of
  VisitForeachStmt visits both body and each invariant.
- **`SpecReceiveBodyError.p`** (1 / 3): `receive` inside a spec
  machine: IllegalMonitorOperation + undeclared event + body type
  mismatch. Validates the new event-id lookup + the existing handler-
  body recovery in the spec-machine branch.

Diagnostic guidance for future failures is in MultiErrorAcceptanceTest's
class doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CompoundStmt's constructor takes IEnumerable<IPStmt>; the empty-body
fail-safe I just added references IPStmt without the matching using.
Build broke on all 3 platforms with CS0246. Trivial import fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ForeachInvariantError.p triggered a NotSupportedException because
`invariant` is gated to PVerifier mode — the regression suite runs
in default PChecker mode which rejects PVerifier tokens at the lexer.
The MultiErrorAcceptanceTest pinned 1/2 errors but the harness got
0/0 because [NotSupportedError:] isn't counted as [Error:].

Rename ForeachInvariantError → ForeachBodyErrors and rework to exercise
two body errors (MissingDeclaration + TypeMismatch) without any PVerifier
syntax. Still validates the audit-flagged "happy path visits the body"
invariant; the parallel "happy path visits invariants" fix in
StatementVisitor.VisitForeachStmt remains in code and will get its own
regression once a PVerifier-mode test harness exists.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… precheck

Subtle bug noticed while resolving the rebase conflict: the hasKind
precheck loop used `table.Lookup(name, out State _)` (cross-machine
lookup) for the State branch, but the actual dispatch below used
`TryResolveStateForInstance` (PR #963's narrowed lookup).

Failure path: `myA is S2` where MachineA has no S2 but MachineB does.
hasKind succeeds via the cross-machine lookup (S2 found in B). The
ErrorType check passes. Then dispatch falls through Machine/Event,
and `TryResolveStateForInstance` correctly returns false (S2 not in
A's scope) — but the trailing "Unreachable" comment becomes REACHABLE
and silently returns ErrorExpr without firing the MissingDeclaration
diagnostic. The user gets exit code 1 with NO error message.

Fix: change the hasKind precheck's State branch to use
`TryResolveStateForInstance` so `hasKind` reflects whether a USABLE
binding exists, not just any binding anywhere in the program. Now
the cross-machine S2 case correctly fails the precheck and emits
the "could not find machine, event, or state 'S2'" diagnostic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ankushdesai ankushdesai force-pushed the ankush/multi-error-typecheck-phase2 branch from bd1cafe to dd3f4bf Compare May 27, 2026 22:45
ankushdesai added a commit that referenced this pull request May 27, 2026
Stacks on Phase 2 (#965). Closes the multi-error type-checker initiative
by adding per-item try/catch isolation to the analyzer's gathering passes
and a HasErrors gate before the analysis passes that assume a clean AST.

## What changed

### `TolerantStep` helper

New private helper in `Analyzer.cs`:

    private static void TolerantStep(ITranslationErrorHandler handler, Action body)
    {
        try { body(); }
        catch (TranslationException e) when (handler.Diagnostics.ContinueOnError)
        {
            handler.Diagnostics.Report(e);
        }
    }

In strict mode (ContinueOnError == false) the `when` filter is false so
the exception propagates exactly as before — bit-for-bit preservation of
today's throw-on-first-error semantics. In collecting mode the catch
captures the TranslationException into the collector and the loop
continues with the next item. Only TranslationException is caught; NREs
and other runtime exceptions still propagate so genuine bugs surface
loudly.

### Tolerant gathering passes (2a, 2b, 3, 3b, 7)

Each iteration wrapped in `TolerantStep`:

- **2a MachineChecker.Validate** — per-machine. One bad machine no
  longer aborts MachineChecker for siblings.
- **3 FunctionBody + Validator** — per-function. One bad function
  (TypeResolver throw on a bad var decl, missing-paths-return, etc.)
  no longer aborts pass 3 for the rest of the program.
- **3b Invariants / Axioms / AssumeOnStarts / Pures / Foreign** —
  per-item across all five subgroups. One bad invariant no longer
  blocks axiom checking.
- **2b ValidateNoStaticHandlers** — per-machine.
- **7 InferMachineCreates** — per-machine. Already gated by HasErrors
  in practice, but the wrapper is cheap insurance.

### Gathering→analysis boundary

New HasErrors gate placed after pass 2b (the last gathering pass) and
before pass 4 (ApplyPropagations):

    if (handler.Diagnostics.HasErrors) return globalScope;

If any gathering pass collected an error, the analysis passes (4-7) and
the module-system passes (8-10) are skipped. The duplicate gate before
pass 8 (added in Phase 2 robustness) is kept as defense-in-depth — once
Phase 4+ converts pass-5 capability checks to Report, that second gate
will earn its keep.

## Why this matters

Without Phase 3, a single machine with a bad start state aborts MachineChecker
mid-pass, so a sibling machine's MoreThanOneParameterForHandlers error never
gets reported. A single function with a malformed var decl aborts pass 3, so
none of the OTHER functions' bodies get type-checked at all. Combined with
Phase 2's expression-level recovery, Phase 3 means a user with N independent
errors (in any combination of expressions, statements, machines, or functions)
sees ALL of them in one compile.

## Tests

### New: `MultiMachineErrors.p` + pinned counts

Three machines, each with one independent error (bool->int, missing
declaration, int+string binop). Pinned in MultiErrorAcceptanceTest:
strict=1, collecting=3. Validates that one machine's error doesn't
clobber its siblings' diagnostics in collecting mode.

### Existing: All Phase 2 acceptance tests unchanged

MultipleErrors (1/4), NestedExprErrors (1/2), ForeachBodyErrors (1/2),
SpecReceiveBodyError (1/3). Same counts as Phase 2 — Phase 3 doesn't
change behavior within a single machine.

### Existing: Phase1DormancyTest baseline

The "collecting >= strict count" invariant on every Correct/StaticError
test still holds. Phase 3 only *adds* errors (when multi-machine /
multi-function scenarios exist); it never removes.

## Phase 4 (deferred, NOT in this PR)

Convert pass 5 (capability checks) and pass 6 (ControlFlowChecker)
throw sites to Report so capability and control-flow violations also
accumulate across machines instead of aborting on first. The current
HasErrors gate already prevents these passes from running on a broken
AST, so they remain RequiresClean for now. Estimated: ~1 day if
warranted by user feedback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ankushdesai added a commit that referenced this pull request May 27, 2026
Multi-agent review of PRs #957/#965/#967 identified 3 HIGH-severity
harness fragilities (could silently mask real bugs) and ~10 coverage
gaps. This commit applies the harness fixes + the 4 highest-priority
new pinned tests + 1 doc fix.

## Harness fixes

### 1. Count ALL diagnostic markers, not just `[Error:]`

Previous: `CountOccurrences(stderr, "[Error:]") + CountOccurrences(stderr, "[Parser Error:]")`.

Problem: Compiler.cs also emits `[NotSupportedError:]`, `[NotImplementedError:]`,
and per-backend `[<entry> Compiling Generated Code:]` markers. A regression
that converts a real diagnostic into a `NotSupportedException` would be
counted as zero errors, making `CollectingReportsAtLeastAsManyErrorsAsStrict`'s
`>=` invariant trivially pass with both sides zero — masking the regression
entirely. We already hit this once when the ForeachInvariantError.p test
tripped a `NotSupportedException` on the PVerifier `invariant` keyword.

Fix: new `CountErrorMarkers(stderr)` uses two regexes — one for
`[<text>Error<text>:]` (catches all five marker variants) and one for the
explicit `Compiling Generated Code` marker. Applied to both
Phase1DormancyTest.cs and MultiErrorAcceptanceTest.cs.

### 2. Catch ONLY TranslationException

Previous: `catch (Exception e)` in both fixture's RunOnce helpers.

Problem: Phase 2's MultiAgent audit explicitly avoided this pattern. NREs,
`Debug.Assert` failures, and other runtime exceptions get folded into
`exitCode = -1` and identical stderr text — two parallel NREs produce
equal output, equal exit codes, equal error counts (0), and the
`StrictAndCollectingAgreeOnValidPrograms` equality assertion passes
green despite both runs crashing.

Fix: catch only `TranslationException`. Other exceptions propagate so
genuine bugs surface as test failures with full stack traces.

### 3. Mark env-var test [NonParallelizable]

Previous: `DiagnosticCollectorTest.ContinueOnError_ReadsEnvVar` mutates
the process-global `P_COMPILER_COLLECT_ERRORS` env var.

Problem: `RegressionTests.CompileOnlyRegressionTests` is
`[Parallelizable(ParallelScope.Children)]`. With `dotnet test` running
fixtures concurrently, a parallel CompilerConfiguration construction
could read the temporarily-set env var and run a strict-mode regression
test in collecting mode — silently masking a strict-mode regression as
a flaky-pass.

Fix: add `[NonParallelizable]` to the env-var test.

## Doc fix

`MultipleErrors.p` line 11 said the 4th error was "arg count mismatch".
The actual mechanism is "payload TYPE mismatch via single-arg
CheckArgument branch" because `(1,2,3)` parses as a single unnamed-tuple
argument. Count is 1 either way, but the prose was actively misleading.

## 4 new pinned coverage tests

The audit identified ~12 throw sites in ExprVisitor/StatementVisitor with
NO multi-error coverage at all. The four highest-priority additions:

1. **CtorArgErrors.p (1/3)** — VisitCtorExpr arg-pre-visit recovery.
   Validates that constructor args are visited BEFORE interface lookup,
   so missing-decl errors in args surface even when the interface
   itself is unknown.

2. **FunCallArgErrors.p (1/4)** — VisitFunCallExpr Function branch.
   Two calls combining argument errors with call-shape errors (arity +
   type mismatch). Validates cascade-suppression of arg[0] while arg[1]
   mismatch still surfaces.

3. **MultiFunctionPerMachine.p (1/3)** — Phase 3 per-FUNCTION isolation
   distinct from per-machine. Three functions in ONE machine, one
   error each. Catches a regression where Phase 3's per-function
   TolerantStep wrapper stops iterating after the first failing
   function.

4. **GotoRecoveryErrors.p (1/4)** — VisitGotoStmt rvalue-pre-visit
   recovery in both the missing-state branch and the arity-mismatch
   branch. Validates that argument errors surface even when the
   goto itself is malformed.

## Deferred (lower-priority audit findings)

- 6 more coverage gaps (CastExpr, ChooseExpr, FormatString, dup tuple
  field, RaiseStmt non-void, etc.). Less critical because each is
  covered by Phase1DormancyTest's weak `>=` baseline.
- Harness #5/#7 (scratch-dir GetHashCode collisions, parallel dotnet
  build). The Ubuntu flake from earlier may relate but hasn't recurred
  since the projectRoot fix. Tracked separately.
- Harness #8 (test timeouts). Nice-to-have; current CI runs reliably.
- Identity-aware assertions (stderr substring matches) — would lock in
  which error fires first, but adds maintenance cost. Defer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ankushdesai added a commit that referenced this pull request May 27, 2026
Real bug + 3 doc accuracy improvements from Copilot's second-round
review of PR #965 (1 code fix) and first-round review of PR #967
(2 doc updates):

## 1. VisitInsertStmt key-check pointed at the wrong AST node

`StatementVisitor.VisitInsertStmt` reported the key/index TypeMismatch
at `context.rvalue()` (the VALUE token of `insert lvalue[index] = value`)
when the key actually lives in `context.expr()`. Diagnostics like
"got type: string, expected: int" would point the user at the value
expression instead of the index expression — confusingly off-target.

The bug pre-existed the Phase 2 conversion (master had the same wrong
context), but Copilot caught it because Phase 2 made the diff visible
when converting `throw → CheckAssignable`. Fixed at line 215 to use
context.expr() for the key check; context.rvalue() remains correct for
the value check.

## 2. TolerantStep XML doc no longer claims cascade-suppression

The previous doc said TolerantStep "runs an analysis step under cascade-
suppression rules" — but it's used for both gathering AND analysis
passes, and it doesn't implement cascade suppression at all (cascade
suppression lives in CheckAssignable + the ErrorType IsSameTypeAs
short-circuit; TolerantStep is the per-pass-item analog: same goal,
different mechanism). Rewritten to:
- Acknowledge dual-use across gathering and analysis passes
- Distinguish from the expression-level cascade-suppression machinery
- Make the strict vs collecting behavior contracts explicit

## 3. MultiMachineErrors test claim narrowed to what it actually tests

The .p file header and the MultiErrorAcceptanceTest description both
claimed the test validates "TolerantStep wrappers around pass 2a
(MachineChecker) and pass 3". In reality, each machine in the file has
a valid start state and payload type, so pass 2a never throws — the
test only exercises pass 3 (function body type-checking) errors flowing
through Phase 2's Report-and-continue plus Phase 3's per-function
TolerantStep.

Both descriptions updated to accurately call out:
- What the test DOES exercise (per-machine isolation across pass 3)
- What it DOESN'T cover (per-machine pass-2a throws like
  MissingStartState — would need a separate fixture)

## 4. PR description: precise strict-mode invariants

Also updated PR #965's description to be precise about which strict-mode
invariants the PR preserves vs. relaxes. The original claim "strict mode
bit-for-bit unchanged" was over-broad — valid programs ARE bit-identical
and invalid programs still exit 1, but the first-error IDENTITY can
differ for invalid programs with both an argument error AND a
callee/format/interface error (because args are now visited BEFORE
lookup, which is the compiler-convention order and what lets collecting
mode surface child errors).

Replied on the 4 strict-mode reorder Copilot threads with the same
rationale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ankushdesai ankushdesai merged commit c0ef9d4 into master May 27, 2026
10 checks passed
@ankushdesai ankushdesai deleted the ankush/multi-error-typecheck-phase2 branch May 27, 2026 23:22
ankushdesai added a commit that referenced this pull request May 27, 2026
Stacks on Phase 2 (#965). Closes the multi-error type-checker initiative
by adding per-item try/catch isolation to the analyzer's gathering passes
and a HasErrors gate before the analysis passes that assume a clean AST.

## What changed

### `TolerantStep` helper

New private helper in `Analyzer.cs`:

    private static void TolerantStep(ITranslationErrorHandler handler, Action body)
    {
        try { body(); }
        catch (TranslationException e) when (handler.Diagnostics.ContinueOnError)
        {
            handler.Diagnostics.Report(e);
        }
    }

In strict mode (ContinueOnError == false) the `when` filter is false so
the exception propagates exactly as before — bit-for-bit preservation of
today's throw-on-first-error semantics. In collecting mode the catch
captures the TranslationException into the collector and the loop
continues with the next item. Only TranslationException is caught; NREs
and other runtime exceptions still propagate so genuine bugs surface
loudly.

### Tolerant gathering passes (2a, 2b, 3, 3b, 7)

Each iteration wrapped in `TolerantStep`:

- **2a MachineChecker.Validate** — per-machine. One bad machine no
  longer aborts MachineChecker for siblings.
- **3 FunctionBody + Validator** — per-function. One bad function
  (TypeResolver throw on a bad var decl, missing-paths-return, etc.)
  no longer aborts pass 3 for the rest of the program.
- **3b Invariants / Axioms / AssumeOnStarts / Pures / Foreign** —
  per-item across all five subgroups. One bad invariant no longer
  blocks axiom checking.
- **2b ValidateNoStaticHandlers** — per-machine.
- **7 InferMachineCreates** — per-machine. Already gated by HasErrors
  in practice, but the wrapper is cheap insurance.

### Gathering→analysis boundary

New HasErrors gate placed after pass 2b (the last gathering pass) and
before pass 4 (ApplyPropagations):

    if (handler.Diagnostics.HasErrors) return globalScope;

If any gathering pass collected an error, the analysis passes (4-7) and
the module-system passes (8-10) are skipped. The duplicate gate before
pass 8 (added in Phase 2 robustness) is kept as defense-in-depth — once
Phase 4+ converts pass-5 capability checks to Report, that second gate
will earn its keep.

## Why this matters

Without Phase 3, a single machine with a bad start state aborts MachineChecker
mid-pass, so a sibling machine's MoreThanOneParameterForHandlers error never
gets reported. A single function with a malformed var decl aborts pass 3, so
none of the OTHER functions' bodies get type-checked at all. Combined with
Phase 2's expression-level recovery, Phase 3 means a user with N independent
errors (in any combination of expressions, statements, machines, or functions)
sees ALL of them in one compile.

## Tests

### New: `MultiMachineErrors.p` + pinned counts

Three machines, each with one independent error (bool->int, missing
declaration, int+string binop). Pinned in MultiErrorAcceptanceTest:
strict=1, collecting=3. Validates that one machine's error doesn't
clobber its siblings' diagnostics in collecting mode.

### Existing: All Phase 2 acceptance tests unchanged

MultipleErrors (1/4), NestedExprErrors (1/2), ForeachBodyErrors (1/2),
SpecReceiveBodyError (1/3). Same counts as Phase 2 — Phase 3 doesn't
change behavior within a single machine.

### Existing: Phase1DormancyTest baseline

The "collecting >= strict count" invariant on every Correct/StaticError
test still holds. Phase 3 only *adds* errors (when multi-machine /
multi-function scenarios exist); it never removes.

## Phase 4 (deferred, NOT in this PR)

Convert pass 5 (capability checks) and pass 6 (ControlFlowChecker)
throw sites to Report so capability and control-flow violations also
accumulate across machines instead of aborting on first. The current
HasErrors gate already prevents these passes from running on a broken
AST, so they remain RequiresClean for now. Estimated: ~1 day if
warranted by user feedback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ankushdesai added a commit that referenced this pull request May 27, 2026
Multi-agent review of PRs #957/#965/#967 identified 3 HIGH-severity
harness fragilities (could silently mask real bugs) and ~10 coverage
gaps. This commit applies the harness fixes + the 4 highest-priority
new pinned tests + 1 doc fix.

## Harness fixes

### 1. Count ALL diagnostic markers, not just `[Error:]`

Previous: `CountOccurrences(stderr, "[Error:]") + CountOccurrences(stderr, "[Parser Error:]")`.

Problem: Compiler.cs also emits `[NotSupportedError:]`, `[NotImplementedError:]`,
and per-backend `[<entry> Compiling Generated Code:]` markers. A regression
that converts a real diagnostic into a `NotSupportedException` would be
counted as zero errors, making `CollectingReportsAtLeastAsManyErrorsAsStrict`'s
`>=` invariant trivially pass with both sides zero — masking the regression
entirely. We already hit this once when the ForeachInvariantError.p test
tripped a `NotSupportedException` on the PVerifier `invariant` keyword.

Fix: new `CountErrorMarkers(stderr)` uses two regexes — one for
`[<text>Error<text>:]` (catches all five marker variants) and one for the
explicit `Compiling Generated Code` marker. Applied to both
Phase1DormancyTest.cs and MultiErrorAcceptanceTest.cs.

### 2. Catch ONLY TranslationException

Previous: `catch (Exception e)` in both fixture's RunOnce helpers.

Problem: Phase 2's MultiAgent audit explicitly avoided this pattern. NREs,
`Debug.Assert` failures, and other runtime exceptions get folded into
`exitCode = -1` and identical stderr text — two parallel NREs produce
equal output, equal exit codes, equal error counts (0), and the
`StrictAndCollectingAgreeOnValidPrograms` equality assertion passes
green despite both runs crashing.

Fix: catch only `TranslationException`. Other exceptions propagate so
genuine bugs surface as test failures with full stack traces.

### 3. Mark env-var test [NonParallelizable]

Previous: `DiagnosticCollectorTest.ContinueOnError_ReadsEnvVar` mutates
the process-global `P_COMPILER_COLLECT_ERRORS` env var.

Problem: `RegressionTests.CompileOnlyRegressionTests` is
`[Parallelizable(ParallelScope.Children)]`. With `dotnet test` running
fixtures concurrently, a parallel CompilerConfiguration construction
could read the temporarily-set env var and run a strict-mode regression
test in collecting mode — silently masking a strict-mode regression as
a flaky-pass.

Fix: add `[NonParallelizable]` to the env-var test.

## Doc fix

`MultipleErrors.p` line 11 said the 4th error was "arg count mismatch".
The actual mechanism is "payload TYPE mismatch via single-arg
CheckArgument branch" because `(1,2,3)` parses as a single unnamed-tuple
argument. Count is 1 either way, but the prose was actively misleading.

## 4 new pinned coverage tests

The audit identified ~12 throw sites in ExprVisitor/StatementVisitor with
NO multi-error coverage at all. The four highest-priority additions:

1. **CtorArgErrors.p (1/3)** — VisitCtorExpr arg-pre-visit recovery.
   Validates that constructor args are visited BEFORE interface lookup,
   so missing-decl errors in args surface even when the interface
   itself is unknown.

2. **FunCallArgErrors.p (1/4)** — VisitFunCallExpr Function branch.
   Two calls combining argument errors with call-shape errors (arity +
   type mismatch). Validates cascade-suppression of arg[0] while arg[1]
   mismatch still surfaces.

3. **MultiFunctionPerMachine.p (1/3)** — Phase 3 per-FUNCTION isolation
   distinct from per-machine. Three functions in ONE machine, one
   error each. Catches a regression where Phase 3's per-function
   TolerantStep wrapper stops iterating after the first failing
   function.

4. **GotoRecoveryErrors.p (1/4)** — VisitGotoStmt rvalue-pre-visit
   recovery in both the missing-state branch and the arity-mismatch
   branch. Validates that argument errors surface even when the
   goto itself is malformed.

## Deferred (lower-priority audit findings)

- 6 more coverage gaps (CastExpr, ChooseExpr, FormatString, dup tuple
  field, RaiseStmt non-void, etc.). Less critical because each is
  covered by Phase1DormancyTest's weak `>=` baseline.
- Harness #5/#7 (scratch-dir GetHashCode collisions, parallel dotnet
  build). The Ubuntu flake from earlier may relate but hasn't recurred
  since the projectRoot fix. Tracked separately.
- Harness #8 (test timeouts). Nice-to-have; current CI runs reliably.
- Identity-aware assertions (stderr substring matches) — would lock in
  which error fires first, but adds maintenance cost. Defer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ankushdesai added a commit that referenced this pull request May 27, 2026
Real bug + 3 doc accuracy improvements from Copilot's second-round
review of PR #965 (1 code fix) and first-round review of PR #967
(2 doc updates):

## 1. VisitInsertStmt key-check pointed at the wrong AST node

`StatementVisitor.VisitInsertStmt` reported the key/index TypeMismatch
at `context.rvalue()` (the VALUE token of `insert lvalue[index] = value`)
when the key actually lives in `context.expr()`. Diagnostics like
"got type: string, expected: int" would point the user at the value
expression instead of the index expression — confusingly off-target.

The bug pre-existed the Phase 2 conversion (master had the same wrong
context), but Copilot caught it because Phase 2 made the diff visible
when converting `throw → CheckAssignable`. Fixed at line 215 to use
context.expr() for the key check; context.rvalue() remains correct for
the value check.

## 2. TolerantStep XML doc no longer claims cascade-suppression

The previous doc said TolerantStep "runs an analysis step under cascade-
suppression rules" — but it's used for both gathering AND analysis
passes, and it doesn't implement cascade suppression at all (cascade
suppression lives in CheckAssignable + the ErrorType IsSameTypeAs
short-circuit; TolerantStep is the per-pass-item analog: same goal,
different mechanism). Rewritten to:
- Acknowledge dual-use across gathering and analysis passes
- Distinguish from the expression-level cascade-suppression machinery
- Make the strict vs collecting behavior contracts explicit

## 3. MultiMachineErrors test claim narrowed to what it actually tests

The .p file header and the MultiErrorAcceptanceTest description both
claimed the test validates "TolerantStep wrappers around pass 2a
(MachineChecker) and pass 3". In reality, each machine in the file has
a valid start state and payload type, so pass 2a never throws — the
test only exercises pass 3 (function body type-checking) errors flowing
through Phase 2's Report-and-continue plus Phase 3's per-function
TolerantStep.

Both descriptions updated to accurately call out:
- What the test DOES exercise (per-machine isolation across pass 3)
- What it DOESN'T cover (per-machine pass-2a throws like
  MissingStartState — would need a separate fixture)

## 4. PR description: precise strict-mode invariants

Also updated PR #965's description to be precise about which strict-mode
invariants the PR preserves vs. relaxes. The original claim "strict mode
bit-for-bit unchanged" was over-broad — valid programs ARE bit-identical
and invalid programs still exit 1, but the first-error IDENTITY can
differ for invalid programs with both an argument error AND a
callee/format/interface error (because args are now visited BEFORE
lookup, which is the compiler-convention order and what lets collecting
mode surface child errors).

Replied on the 4 strict-mode reorder Copilot threads with the same
rationale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ankushdesai added a commit that referenced this pull request May 27, 2026
* Phase 3: pass-level tolerance classification in Analyzer.cs

Stacks on Phase 2 (#965). Closes the multi-error type-checker initiative
by adding per-item try/catch isolation to the analyzer's gathering passes
and a HasErrors gate before the analysis passes that assume a clean AST.

## What changed

### `TolerantStep` helper

New private helper in `Analyzer.cs`:

    private static void TolerantStep(ITranslationErrorHandler handler, Action body)
    {
        try { body(); }
        catch (TranslationException e) when (handler.Diagnostics.ContinueOnError)
        {
            handler.Diagnostics.Report(e);
        }
    }

In strict mode (ContinueOnError == false) the `when` filter is false so
the exception propagates exactly as before — bit-for-bit preservation of
today's throw-on-first-error semantics. In collecting mode the catch
captures the TranslationException into the collector and the loop
continues with the next item. Only TranslationException is caught; NREs
and other runtime exceptions still propagate so genuine bugs surface
loudly.

### Tolerant gathering passes (2a, 2b, 3, 3b, 7)

Each iteration wrapped in `TolerantStep`:

- **2a MachineChecker.Validate** — per-machine. One bad machine no
  longer aborts MachineChecker for siblings.
- **3 FunctionBody + Validator** — per-function. One bad function
  (TypeResolver throw on a bad var decl, missing-paths-return, etc.)
  no longer aborts pass 3 for the rest of the program.
- **3b Invariants / Axioms / AssumeOnStarts / Pures / Foreign** —
  per-item across all five subgroups. One bad invariant no longer
  blocks axiom checking.
- **2b ValidateNoStaticHandlers** — per-machine.
- **7 InferMachineCreates** — per-machine. Already gated by HasErrors
  in practice, but the wrapper is cheap insurance.

### Gathering→analysis boundary

New HasErrors gate placed after pass 2b (the last gathering pass) and
before pass 4 (ApplyPropagations):

    if (handler.Diagnostics.HasErrors) return globalScope;

If any gathering pass collected an error, the analysis passes (4-7) and
the module-system passes (8-10) are skipped. The duplicate gate before
pass 8 (added in Phase 2 robustness) is kept as defense-in-depth — once
Phase 4+ converts pass-5 capability checks to Report, that second gate
will earn its keep.

## Why this matters

Without Phase 3, a single machine with a bad start state aborts MachineChecker
mid-pass, so a sibling machine's MoreThanOneParameterForHandlers error never
gets reported. A single function with a malformed var decl aborts pass 3, so
none of the OTHER functions' bodies get type-checked at all. Combined with
Phase 2's expression-level recovery, Phase 3 means a user with N independent
errors (in any combination of expressions, statements, machines, or functions)
sees ALL of them in one compile.

## Tests

### New: `MultiMachineErrors.p` + pinned counts

Three machines, each with one independent error (bool->int, missing
declaration, int+string binop). Pinned in MultiErrorAcceptanceTest:
strict=1, collecting=3. Validates that one machine's error doesn't
clobber its siblings' diagnostics in collecting mode.

### Existing: All Phase 2 acceptance tests unchanged

MultipleErrors (1/4), NestedExprErrors (1/2), ForeachBodyErrors (1/2),
SpecReceiveBodyError (1/3). Same counts as Phase 2 — Phase 3 doesn't
change behavior within a single machine.

### Existing: Phase1DormancyTest baseline

The "collecting >= strict count" invariant on every Correct/StaticError
test still holds. Phase 3 only *adds* errors (when multi-machine /
multi-function scenarios exist); it never removes.

## Phase 4 (deferred, NOT in this PR)

Convert pass 5 (capability checks) and pass 6 (ControlFlowChecker)
throw sites to Report so capability and control-flow violations also
accumulate across machines instead of aborting on first. The current
HasErrors gate already prevents these passes from running on a broken
AST, so they remain RequiresClean for now. Estimated: ~1 day if
warranted by user feedback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Multi-agent review: harness fixes + 4 new pinned coverage tests

Multi-agent review of PRs #957/#965/#967 identified 3 HIGH-severity
harness fragilities (could silently mask real bugs) and ~10 coverage
gaps. This commit applies the harness fixes + the 4 highest-priority
new pinned tests + 1 doc fix.

## Harness fixes

### 1. Count ALL diagnostic markers, not just `[Error:]`

Previous: `CountOccurrences(stderr, "[Error:]") + CountOccurrences(stderr, "[Parser Error:]")`.

Problem: Compiler.cs also emits `[NotSupportedError:]`, `[NotImplementedError:]`,
and per-backend `[<entry> Compiling Generated Code:]` markers. A regression
that converts a real diagnostic into a `NotSupportedException` would be
counted as zero errors, making `CollectingReportsAtLeastAsManyErrorsAsStrict`'s
`>=` invariant trivially pass with both sides zero — masking the regression
entirely. We already hit this once when the ForeachInvariantError.p test
tripped a `NotSupportedException` on the PVerifier `invariant` keyword.

Fix: new `CountErrorMarkers(stderr)` uses two regexes — one for
`[<text>Error<text>:]` (catches all five marker variants) and one for the
explicit `Compiling Generated Code` marker. Applied to both
Phase1DormancyTest.cs and MultiErrorAcceptanceTest.cs.

### 2. Catch ONLY TranslationException

Previous: `catch (Exception e)` in both fixture's RunOnce helpers.

Problem: Phase 2's MultiAgent audit explicitly avoided this pattern. NREs,
`Debug.Assert` failures, and other runtime exceptions get folded into
`exitCode = -1` and identical stderr text — two parallel NREs produce
equal output, equal exit codes, equal error counts (0), and the
`StrictAndCollectingAgreeOnValidPrograms` equality assertion passes
green despite both runs crashing.

Fix: catch only `TranslationException`. Other exceptions propagate so
genuine bugs surface as test failures with full stack traces.

### 3. Mark env-var test [NonParallelizable]

Previous: `DiagnosticCollectorTest.ContinueOnError_ReadsEnvVar` mutates
the process-global `P_COMPILER_COLLECT_ERRORS` env var.

Problem: `RegressionTests.CompileOnlyRegressionTests` is
`[Parallelizable(ParallelScope.Children)]`. With `dotnet test` running
fixtures concurrently, a parallel CompilerConfiguration construction
could read the temporarily-set env var and run a strict-mode regression
test in collecting mode — silently masking a strict-mode regression as
a flaky-pass.

Fix: add `[NonParallelizable]` to the env-var test.

## Doc fix

`MultipleErrors.p` line 11 said the 4th error was "arg count mismatch".
The actual mechanism is "payload TYPE mismatch via single-arg
CheckArgument branch" because `(1,2,3)` parses as a single unnamed-tuple
argument. Count is 1 either way, but the prose was actively misleading.

## 4 new pinned coverage tests

The audit identified ~12 throw sites in ExprVisitor/StatementVisitor with
NO multi-error coverage at all. The four highest-priority additions:

1. **CtorArgErrors.p (1/3)** — VisitCtorExpr arg-pre-visit recovery.
   Validates that constructor args are visited BEFORE interface lookup,
   so missing-decl errors in args surface even when the interface
   itself is unknown.

2. **FunCallArgErrors.p (1/4)** — VisitFunCallExpr Function branch.
   Two calls combining argument errors with call-shape errors (arity +
   type mismatch). Validates cascade-suppression of arg[0] while arg[1]
   mismatch still surfaces.

3. **MultiFunctionPerMachine.p (1/3)** — Phase 3 per-FUNCTION isolation
   distinct from per-machine. Three functions in ONE machine, one
   error each. Catches a regression where Phase 3's per-function
   TolerantStep wrapper stops iterating after the first failing
   function.

4. **GotoRecoveryErrors.p (1/4)** — VisitGotoStmt rvalue-pre-visit
   recovery in both the missing-state branch and the arity-mismatch
   branch. Validates that argument errors surface even when the
   goto itself is malformed.

## Deferred (lower-priority audit findings)

- 6 more coverage gaps (CastExpr, ChooseExpr, FormatString, dup tuple
  field, RaiseStmt non-void, etc.). Less critical because each is
  covered by Phase1DormancyTest's weak `>=` baseline.
- Harness #5/#7 (scratch-dir GetHashCode collisions, parallel dotnet
  build). The Ubuntu flake from earlier may relate but hasn't recurred
  since the projectRoot fix. Tracked separately.
- Harness #8 (test timeouts). Nice-to-have; current CI runs reliably.
- Identity-aware assertions (stderr substring matches) — would lock in
  which error fires first, but adds maintenance cost. Defer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Round-2 multi-agent: typo fixes + source-order diagnostics + null-safety

Five fixes from a second multi-agent audit covering 5 new dimensions
(backwards-compat, PVerifier integration, error message quality,
performance/scale, cross-PR consistency, PeasyAI integration). The
audit produced ~30 findings; this commit applies the 5 quick-wins.
The rest are tracked as deferred follow-ups (see PR description).

## 1. Error message typo: "is not undeclared" reverses meaning

`DefaultTranslationErrorHandler.UndeclaredGlobalParam` previously emitted
`'global param X' is not undeclared` — literally says the param IS
declared, confusing the user reading the diagnostic. Fixed to
`global param 'X' is undeclared` (also cleaned up the inner quoting).

## 2. Error message typo: "Value M has no start state" → "machine M..."

`DefaultTranslationErrorHandler.MissingStartState` said
`Value {machine.Name} has no start state`. The word "Value" is residual
from a long-ago refactor; the entity is a Machine. Fixed.

## 3. Source-order diagnostic flushing

Previously `FlushCollectedDiagnostics` iterated `Diagnostics` in
insertion order (visitor traversal order). With multi-error collecting
mode the user expects errors in source reading order so IDE jump-to-line
flows naturally. Now sorts by `(file, line, column)` with a stable
tiebreaker. Diagnostics without a parseable location prefix sort last.

Cached compiled regex (`LocationPrefixPattern`) to avoid per-diagnostic
regex compilation on the flush path.

## 4. SourceLocation.ToString() null-safety

`SourceLocation.ToString()` previously did
`return File == null ? throw new ArgumentException() : ...`. The throw
fires partway through `FlushCollectedDiagnostics`'s loop if any
diagnostic has File=null (reachable for `EmptyContext` per
`DefaultLocationResolver.GetLocation`), partial-flushing the user's
diagnostics with a confusing ArgumentException stack trace.

Fixed: fall back to `<no source>:line:col` instead of throwing.

## 5. Cached compiled regexes in test harness CountErrorMarkers

`Phase1DormancyTest.CountErrorMarkers` and the parallel helper in
`MultiErrorAcceptanceTest` previously constructed two `new Regex(...)`
per call. With ~280 test dirs × 2 modes × 2 fixtures, that's ~1000+
regex compilations per suite. Negligible in absolute terms (~100-200ms
total) but mechanical to fix — moved to static fields with
`RegexOptions.Compiled`. Also documented why the harness's own
`[Test harness caught uncaught TranslationException:]` marker is
substring-free of "Error" (so it deliberately doesn't inflate counts).

## Deferred (multi-agent findings tracked but not applied)

- **PR #963 ↔ Phase 2 conflict resolution** — needs explicit merge-order
  decision (cherry-pick #963 into Phase 2 stack vs. land #963 first
  then rebase Phase 2's VisitTestExpr rework on top).
- **Wrap pass 5 capability + pass 6 ControlFlowChecker in TolerantStep** —
  semantic change; deserves its own PR with new pinned tests.
  Could be a "Phase 4".
- **PVerifier end-to-end CI smoke** — multi-agent flagged ZERO CI
  coverage of PVerifier-mode compile. PR #963 noted the existing
  `Tutorial/Advanced/4_Paxos/PVerifiedPaxos` already crashes on bare
  `p compile`. Needs separate fix + test addition.
- **PeasyAI collecting-mode integration** — multi-agent produced
  concrete 5-file integration plan (compilation.py, fixer.py,
  generation.py, server.py, fixing.py). The parser already supports
  multiple errors; PeasyAI just needs to set `P_COMPILER_COLLECT_ERRORS=1`
  and call the existing `get_all_errors` instead of `parse_error`.
  Separate PR scope.
- **Parallelize Phase1DormancyTest** — biggest CI speedup (~9min → ~2-3min)
  but needs scratch-dir race-condition validation.
- **"N errors suppressed" footer** — UX improvement; new infrastructure.
- **Truncation cap** — needed for pathological N>50 cases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Address Copilot round-2 review: InsertStmt location + doc accuracy

Real bug + 3 doc accuracy improvements from Copilot's second-round
review of PR #965 (1 code fix) and first-round review of PR #967
(2 doc updates):

## 1. VisitInsertStmt key-check pointed at the wrong AST node

`StatementVisitor.VisitInsertStmt` reported the key/index TypeMismatch
at `context.rvalue()` (the VALUE token of `insert lvalue[index] = value`)
when the key actually lives in `context.expr()`. Diagnostics like
"got type: string, expected: int" would point the user at the value
expression instead of the index expression — confusingly off-target.

The bug pre-existed the Phase 2 conversion (master had the same wrong
context), but Copilot caught it because Phase 2 made the diff visible
when converting `throw → CheckAssignable`. Fixed at line 215 to use
context.expr() for the key check; context.rvalue() remains correct for
the value check.

## 2. TolerantStep XML doc no longer claims cascade-suppression

The previous doc said TolerantStep "runs an analysis step under cascade-
suppression rules" — but it's used for both gathering AND analysis
passes, and it doesn't implement cascade suppression at all (cascade
suppression lives in CheckAssignable + the ErrorType IsSameTypeAs
short-circuit; TolerantStep is the per-pass-item analog: same goal,
different mechanism). Rewritten to:
- Acknowledge dual-use across gathering and analysis passes
- Distinguish from the expression-level cascade-suppression machinery
- Make the strict vs collecting behavior contracts explicit

## 3. MultiMachineErrors test claim narrowed to what it actually tests

The .p file header and the MultiErrorAcceptanceTest description both
claimed the test validates "TolerantStep wrappers around pass 2a
(MachineChecker) and pass 3". In reality, each machine in the file has
a valid start state and payload type, so pass 2a never throws — the
test only exercises pass 3 (function body type-checking) errors flowing
through Phase 2's Report-and-continue plus Phase 3's per-function
TolerantStep.

Both descriptions updated to accurately call out:
- What the test DOES exercise (per-machine isolation across pass 3)
- What it DOESN'T cover (per-machine pass-2a throws like
  MissingStartState — would need a separate fixture)

## 4. PR description: precise strict-mode invariants

Also updated PR #965's description to be precise about which strict-mode
invariants the PR preserves vs. relaxes. The original claim "strict mode
bit-for-bit unchanged" was over-broad — valid programs ARE bit-identical
and invalid programs still exit 1, but the first-error IDENTITY can
differ for invalid programs with both an argument error AND a
callee/format/interface error (because args are now visited BEFORE
lookup, which is the compiler-convention order and what lets collecting
mode surface child errors).

Replied on the 4 strict-mode reorder Copilot threads with the same
rationale.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ankushdesai added a commit that referenced this pull request May 28, 2026
…omments (#968)

Final multi-agent audit on master found that the multi-error compilation
feature (PRs #957, #963, #965) shipped without any user-facing docs and
with stale "Phase 1 wiring only" / "Phase 2 will..." comments in the
public XML API. Users can't discover the env var; contributors have no
convention to follow when adding new throw sites.

## User-facing docs

- **README.md** — new "Multi-Error Compilation" section in "What's New"
  with a worked example showing strict vs collecting output and pointing
  at the AI-fix-loop use case.
- **ChangeList.md** — entry under "P 3.0 Changes" listing the env var,
  cascade-suppression semantics, the #963 state-lookup narrowing, and
  explicitly noting strict-mode bit-for-bit preservation.

## Contributor docs

- **CLAUDE.md** — new "Multi-Error Type Checking (Compiler)" top-level
  section. Covers architecture (IDiagnosticCollector, ErrorType/ErrorExpr,
  CheckAssignable, TolerantStep), the per-visitor convention checklist
  (visit children first → short-circuit on ErrorType → Report+ErrorExpr),
  the analyzer pass-tolerance pattern, and the test-fixture layout.
  Mentions the env var in the "Working with P Programs" command examples.

## MCP tool descriptions

- **peasy-ai-compile** — append a Tip about setting
  P_COMPILER_COLLECT_ERRORS=1 to receive all errors in one response.
- **peasy-ai-fix-all** — append the same Tip with a note that batched
  errors converge in N/k iterations instead of N.

(The PeasyAI subprocess invocations themselves still need to actually
PROPAGATE the env var — separate PR per the audit's recommendation.)

## Stale XML-doc refresh

- **IDiagnosticCollector** — strike "Phase 1 wiring only — no visitor
  currently calls Report" (Phase 2 lit it up). Add a Lifecycle list
  pointing at ReadContinueOnErrorEnvVar, the visitors, TolerantStep,
  and FlushCollectedDiagnostics. Cross-reference the three test files
  that exercise the contract.
- **ErrorType** — strike the "Phase 2 will additionally add a
  CheckAssignable helper" forward reference (it landed). Promote the
  helper to a peer of the IsAssignableFrom override + IsSameTypeAs
  short-circuit so the three-piece cascade-suppression story is told
  once and accurately.
- **ErrorExpr** — strike "Phase 1 introduces this class; no visitor
  produces it yet." Rephrase the IExprTerm-leak protection clause to
  cite Compiler.Compile (not the generic <see cref="Compiler"/>).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants