Phase 2: collect-all-errors in ExprVisitor + StatementVisitor#965
Conversation
There was a problem hiding this comment.
Pull request overview
Phase 2 of the multi-error type-checker work: converts large portions of expression/statement type-checking from throw-on-first-error to diagnostic collection with recovery, enabling P_COMPILER_COLLECT_ERRORS=1 to report multiple independent errors in one compilation pass while preserving strict-mode behavior.
Changes:
- Convert many
throw handler.X(...)sites inExprVisitorandStatementVisitortohandler.Diagnostics.Report(...)with recovery (ErrorExpr/NoStmt/ best-effort statement construction). - Add
TypeCheckingUtils.CheckAssignable(...)to centralize cascade suppression when either side isErrorType. - Update/add tests and regression inputs to validate strict vs collecting invariants and pinned multi-error counts.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| Tst/UnitTests/TypeChecker/Phase1DormancyTest.cs | Splits strict/collecting invariants between Correct/ and StaticError/ suites; adds coarse error counting. |
| Tst/UnitTests/TypeChecker/MultiErrorAcceptanceTest.cs | New acceptance test pinning strict vs collecting diagnostic counts for a curated multi-error program. |
| Tst/RegressionTests/Feature3Exprs/StaticError/MultipleErrors/MultipleErrors.p | New regression input containing 4 independent errors used by the acceptance test. |
| Src/PCompiler/CompilerCore/TypeChecker/TypeCheckingUtils.cs | Introduces CheckAssignable and adjusts payload validation to report via diagnostics. |
| Src/PCompiler/CompilerCore/TypeChecker/StatementVisitor.cs | Converts many statement type-check errors to Report + recover, with placeholder statements where needed. |
| Src/PCompiler/CompilerCore/TypeChecker/ExprVisitor.cs | Converts many expression type-check errors to Report + return ErrorExpr, adds top-level binop guard, and visits args earlier in some constructs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // If any argument failed to type-check, an upstream diagnostic was | ||
| // already reported. Skip payload validation to avoid cascade noise. | ||
| if (arguments.Any(a => a.Type is ErrorType) || payloadType is ErrorType) return; |
| handler.Diagnostics.Report(handler.MissingDeclaration(context.item, "foreach iterator variable", varName)); | ||
| // Still visit children so their internal errors surface. | ||
| exprVisitor.Visit(context.collection); | ||
| Visit(context.statement()); |
| if (machine?.IsSpec == true) | ||
| { | ||
| throw handler.IllegalMonitorOperation(context, context.RECEIVE().Symbol, machine); | ||
| handler.Diagnostics.Report(handler.IllegalMonitorOperation(context, context.RECEIVE().Symbol, machine)); | ||
| return new NoStmt(context); |
| public override IPExpr VisitTestExpr(PParser.TestExprContext context) | ||
| { | ||
| var instance = Visit(context.instance); | ||
| string name = context.kind.GetText(); | ||
|
|
| public override IPExpr VisitTargetsExpr(PParser.TargetsExprContext context) | ||
| { | ||
| var instance = Visit(context.instance); | ||
| var target = Visit(context.target); | ||
|
|
All five comments from Copilot's review of #965 were valid recovery holes — places where the converted code suppressed a diagnostic too aggressively or failed to propagate ErrorType. Fixes: ## 1. ValidatePayloadTypes: don't swallow the arity check Previous: any argument with ErrorType caused an early return from the whole helper, hiding the independent IncorrectArgumentCount diagnostic when arity was wrong AND one arg also errored upstream. Fix: remove the blanket early-return on `arguments.Any(a => a.Type is ErrorType)`. Per-argument cascade suppression is already handled by CheckAssignable's own ErrorType short-circuit, so the arity check (a property of the call-site, not the arg types) now fires independently. Added an explicit arity check inside the tuple branch since the previous code relied on Zip silently truncating mismatched counts. ## 2. VisitForeachStmt: visit invariants when iterator var is missing Previous: missing-iterator recovery visited the collection and body but not `context._invariants`. Type errors inside loop invariants were silently suppressed in collecting mode. Fix: visit each invariant via exprVisitor.Visit before returning NoStmt. ## 3. VisitReceiveStmt: visit handler bodies on spec-illegal monitor op Previous: when receive was illegal (spec machine), the method reported IllegalMonitorOperation and returned NoStmt immediately, suppressing all diagnostics inside the handler bodies. Fix: mirror what Send/Announce/Raise do for their child expressions — construct each recvCase handler (with its scope and parameter) and run FunctionBodyVisitor.PopulateMethod to surface body errors before returning NoStmt. Event-id lookups and duplicate-case checks are intentionally skipped since the receive itself is already illegal. ## 4. VisitTestExpr: propagate ErrorType from instance Previous: an instance with ErrorType still produced a bool-typed TestExpr, masking the cascade for downstream consumers. Fix: do the kind-identifier lookup first (so missing-decl on the kind is still reported even when instance errored), then bail to ErrorExpr if instance.Type is ErrorType. Preserves both diagnostics' independence. ## 5. VisitTargetsExpr / VisitFlyingExpr / VisitSentExpr: propagate ErrorType Previous: these three constructed bool-typed nodes regardless of operand state, breaking the ErrorType propagation convention documented in ExprVisitor's class header. Fix: add the same early-guard pattern used elsewhere — if any visited child has ErrorType, return new ErrorExpr(context) before constructing the typed node. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| { | ||
| throw handler.TypeMismatch(context.rvalue(), valueType, expectedValueType); | ||
| } | ||
| TypeCheckingUtils.CheckAssignable(handler, context.rvalue(), expectedKeyType, index); |
| var interfaceName = context.interfaceName.GetText(); | ||
| // Always visit the argument list so internal arg errors surface, | ||
| // even if the interface itself is unknown. | ||
| var arguments = TypeCheckingUtils.VisitRvalueList(context.rvalueList(), this).ToArray(); | ||
|
|
||
| if (!table.Lookup(interfaceName, out Interface @interface)) | ||
| { | ||
| throw handler.MissingDeclaration(context.interfaceName, "interface", interfaceName); | ||
| handler.Diagnostics.Report(handler.MissingDeclaration(context.interfaceName, "interface", interfaceName)); | ||
| return new ErrorExpr(context); |
There was a problem hiding this comment.
Acknowledged — you're right that the eager argument visit changes WHICH error fires first in strict mode when an invalid program has both an argument error and a callee/format error. The PR description over-claimed "bit-for-bit unchanged" — the precise invariants that hold are:
- Valid programs compile to bit-identical output (no behavior change at all).
- Invalid programs still exit 1 with the same total error count in strict mode (the test harness's StaticErrorValidator only checks exit code).
- The IDENTITY of the first error reported can differ for invalid programs containing both an argument error AND a callee/format/interface error.
This was a deliberate tradeoff captured by the prior strict-mode safety audit: visiting children before parents is the more natural compiler convention (it's what rustc/swiftc/tsc do), and crucially the new order is what lets collecting mode surface child errors when the parent lookup fails — which is the whole point of Phase 2.
Gating the reordering behind ContinueOnError would require 4 duplicated code paths in VisitCtorExpr / VisitFunCallExpr / VisitFormatedString / VisitCtorStmt. The tradeoff isn't worth the maintenance burden for "the first of several diagnostics on an invalid program looks slightly different".
I'll update the PR description to be precise about which strict-mode invariants are preserved and which are intentionally relaxed. Existing static-error regression tests pass unchanged because they only check exit code, not the first-error message.
| var funName = context.fun.GetText(); | ||
| // Visit arguments first so their internal errors surface regardless | ||
| // of whether the callee resolves. Argument errors do not block | ||
| // dispatch — we still try to bind a function/pure with the args we | ||
| // got, treating ErrorExpr args as compatible with anything. | ||
| var arguments = TypeCheckingUtils.VisitRvalueList(context.rvalueList(), this).ToArray(); | ||
|
|
There was a problem hiding this comment.
Acknowledged — you're right that the eager argument visit changes WHICH error fires first in strict mode when an invalid program has both an argument error and a callee/format error. The PR description over-claimed "bit-for-bit unchanged" — the precise invariants that hold are:
- Valid programs compile to bit-identical output (no behavior change at all).
- Invalid programs still exit 1 with the same total error count in strict mode (the test harness's StaticErrorValidator only checks exit code).
- The IDENTITY of the first error reported can differ for invalid programs containing both an argument error AND a callee/format/interface error.
This was a deliberate tradeoff captured by the prior strict-mode safety audit: visiting children before parents is the more natural compiler convention (it's what rustc/swiftc/tsc do), and crucially the new order is what lets collecting mode surface child errors when the parent lookup fails — which is the whole point of Phase 2.
Gating the reordering behind ContinueOnError would require 4 duplicated code paths in VisitCtorExpr / VisitFunCallExpr / VisitFormatedString / VisitCtorStmt. The tradeoff isn't worth the maintenance burden for "the first of several diagnostics on an invalid program looks slightly different".
I'll update the PR description to be precise about which strict-mode invariants are preserved and which are intentionally relaxed. Existing static-error regression tests pass unchanged because they only check exit code, not the first-error message.
| var baseString = context.StringLiteral().GetText(); | ||
| baseString = baseString.Substring(1, baseString.Length - 2); // strip beginning / end double quote | ||
| var numNecessaryArgs = TypeCheckingUtils.PrintStmtNumArgs(baseString); | ||
| // Visit args regardless of format validity so their internal errors surface. | ||
| var args = TypeCheckingUtils.VisitRvalueList(context.rvalueList(), this).ToList(); | ||
|
|
||
| if (numNecessaryArgs == -1) | ||
| { | ||
| throw handler.InvalidStringExprFormat(context, context.StringLiteral().Symbol); | ||
| handler.Diagnostics.Report(handler.InvalidStringExprFormat(context, context.StringLiteral().Symbol)); | ||
| return new ErrorExpr(context); |
There was a problem hiding this comment.
Acknowledged — you're right that the eager argument visit changes WHICH error fires first in strict mode when an invalid program has both an argument error and a callee/format error. The PR description over-claimed "bit-for-bit unchanged" — the precise invariants that hold are:
- Valid programs compile to bit-identical output (no behavior change at all).
- Invalid programs still exit 1 with the same total error count in strict mode (the test harness's StaticErrorValidator only checks exit code).
- The IDENTITY of the first error reported can differ for invalid programs containing both an argument error AND a callee/format/interface error.
This was a deliberate tradeoff captured by the prior strict-mode safety audit: visiting children before parents is the more natural compiler convention (it's what rustc/swiftc/tsc do), and crucially the new order is what lets collecting mode surface child errors when the parent lookup fails — which is the whole point of Phase 2.
Gating the reordering behind ContinueOnError would require 4 duplicated code paths in VisitCtorExpr / VisitFunCallExpr / VisitFormatedString / VisitCtorStmt. The tradeoff isn't worth the maintenance burden for "the first of several diagnostics on an invalid program looks slightly different".
I'll update the PR description to be precise about which strict-mode invariants are preserved and which are intentionally relaxed. Existing static-error regression tests pass unchanged because they only check exit code, not the first-error message.
| public override IPStmt VisitCtorStmt(PParser.CtorStmtContext context) | ||
| { | ||
| var interfaceName = context.iden().GetText(); | ||
| // Always visit arguments so their internal errors surface. | ||
| var args = TypeCheckingUtils.VisitRvalueList(context.rvalueList(), exprVisitor).ToList(); | ||
|
|
||
| if (!table.Lookup(interfaceName, out Interface targetInterface)) | ||
| { | ||
| throw handler.MissingDeclaration(context.iden(), "interface", interfaceName); | ||
| handler.Diagnostics.Report(handler.MissingDeclaration(context.iden(), "interface", interfaceName)); | ||
| return new NoStmt(context); |
There was a problem hiding this comment.
Acknowledged — you're right that the eager argument visit changes WHICH error fires first in strict mode when an invalid program has both an argument error and a callee/format error. The PR description over-claimed "bit-for-bit unchanged" — the precise invariants that hold are:
- Valid programs compile to bit-identical output (no behavior change at all).
- Invalid programs still exit 1 with the same total error count in strict mode (the test harness's StaticErrorValidator only checks exit code).
- The IDENTITY of the first error reported can differ for invalid programs containing both an argument error AND a callee/format/interface error.
This was a deliberate tradeoff captured by the prior strict-mode safety audit: visiting children before parents is the more natural compiler convention (it's what rustc/swiftc/tsc do), and crucially the new order is what lets collecting mode surface child errors when the parent lookup fails — which is the whole point of Phase 2.
Gating the reordering behind ContinueOnError would require 4 duplicated code paths in VisitCtorExpr / VisitFunCallExpr / VisitFormatedString / VisitCtorStmt. The tradeoff isn't worth the maintenance burden for "the first of several diagnostics on an invalid program looks slightly different".
I'll update the PR description to be precise about which strict-mode invariants are preserved and which are intentionally relaxed. Existing static-error regression tests pass unchanged because they only check exit code, not the first-error message.
Multi-agent review of PRs #957/#965/#967 identified 3 HIGH-severity harness fragilities (could silently mask real bugs) and ~10 coverage gaps. This commit applies the harness fixes + the 4 highest-priority new pinned tests + 1 doc fix. ## Harness fixes ### 1. Count ALL diagnostic markers, not just `[Error:]` Previous: `CountOccurrences(stderr, "[Error:]") + CountOccurrences(stderr, "[Parser Error:]")`. Problem: Compiler.cs also emits `[NotSupportedError:]`, `[NotImplementedError:]`, and per-backend `[<entry> Compiling Generated Code:]` markers. A regression that converts a real diagnostic into a `NotSupportedException` would be counted as zero errors, making `CollectingReportsAtLeastAsManyErrorsAsStrict`'s `>=` invariant trivially pass with both sides zero — masking the regression entirely. We already hit this once when the ForeachInvariantError.p test tripped a `NotSupportedException` on the PVerifier `invariant` keyword. Fix: new `CountErrorMarkers(stderr)` uses two regexes — one for `[<text>Error<text>:]` (catches all five marker variants) and one for the explicit `Compiling Generated Code` marker. Applied to both Phase1DormancyTest.cs and MultiErrorAcceptanceTest.cs. ### 2. Catch ONLY TranslationException Previous: `catch (Exception e)` in both fixture's RunOnce helpers. Problem: Phase 2's MultiAgent audit explicitly avoided this pattern. NREs, `Debug.Assert` failures, and other runtime exceptions get folded into `exitCode = -1` and identical stderr text — two parallel NREs produce equal output, equal exit codes, equal error counts (0), and the `StrictAndCollectingAgreeOnValidPrograms` equality assertion passes green despite both runs crashing. Fix: catch only `TranslationException`. Other exceptions propagate so genuine bugs surface as test failures with full stack traces. ### 3. Mark env-var test [NonParallelizable] Previous: `DiagnosticCollectorTest.ContinueOnError_ReadsEnvVar` mutates the process-global `P_COMPILER_COLLECT_ERRORS` env var. Problem: `RegressionTests.CompileOnlyRegressionTests` is `[Parallelizable(ParallelScope.Children)]`. With `dotnet test` running fixtures concurrently, a parallel CompilerConfiguration construction could read the temporarily-set env var and run a strict-mode regression test in collecting mode — silently masking a strict-mode regression as a flaky-pass. Fix: add `[NonParallelizable]` to the env-var test. ## Doc fix `MultipleErrors.p` line 11 said the 4th error was "arg count mismatch". The actual mechanism is "payload TYPE mismatch via single-arg CheckArgument branch" because `(1,2,3)` parses as a single unnamed-tuple argument. Count is 1 either way, but the prose was actively misleading. ## 4 new pinned coverage tests The audit identified ~12 throw sites in ExprVisitor/StatementVisitor with NO multi-error coverage at all. The four highest-priority additions: 1. **CtorArgErrors.p (1/3)** — VisitCtorExpr arg-pre-visit recovery. Validates that constructor args are visited BEFORE interface lookup, so missing-decl errors in args surface even when the interface itself is unknown. 2. **FunCallArgErrors.p (1/4)** — VisitFunCallExpr Function branch. Two calls combining argument errors with call-shape errors (arity + type mismatch). Validates cascade-suppression of arg[0] while arg[1] mismatch still surfaces. 3. **MultiFunctionPerMachine.p (1/3)** — Phase 3 per-FUNCTION isolation distinct from per-machine. Three functions in ONE machine, one error each. Catches a regression where Phase 3's per-function TolerantStep wrapper stops iterating after the first failing function. 4. **GotoRecoveryErrors.p (1/4)** — VisitGotoStmt rvalue-pre-visit recovery in both the missing-state branch and the arity-mismatch branch. Validates that argument errors surface even when the goto itself is malformed. ## Deferred (lower-priority audit findings) - 6 more coverage gaps (CastExpr, ChooseExpr, FormatString, dup tuple field, RaiseStmt non-void, etc.). Less critical because each is covered by Phase1DormancyTest's weak `>=` baseline. - Harness #5/#7 (scratch-dir GetHashCode collisions, parallel dotnet build). The Ubuntu flake from earlier may relate but hasn't recurred since the projectRoot fix. Tracked separately. - Harness #8 (test timeouts). Nice-to-have; current CI runs reliably. - Identity-aware assertions (stderr substring matches) — would lock in which error fires first, but adds maintenance cost. Defer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real bug + 3 doc accuracy improvements from Copilot's second-round review of PR #965 (1 code fix) and first-round review of PR #967 (2 doc updates): ## 1. VisitInsertStmt key-check pointed at the wrong AST node `StatementVisitor.VisitInsertStmt` reported the key/index TypeMismatch at `context.rvalue()` (the VALUE token of `insert lvalue[index] = value`) when the key actually lives in `context.expr()`. Diagnostics like "got type: string, expected: int" would point the user at the value expression instead of the index expression — confusingly off-target. The bug pre-existed the Phase 2 conversion (master had the same wrong context), but Copilot caught it because Phase 2 made the diff visible when converting `throw → CheckAssignable`. Fixed at line 215 to use context.expr() for the key check; context.rvalue() remains correct for the value check. ## 2. TolerantStep XML doc no longer claims cascade-suppression The previous doc said TolerantStep "runs an analysis step under cascade- suppression rules" — but it's used for both gathering AND analysis passes, and it doesn't implement cascade suppression at all (cascade suppression lives in CheckAssignable + the ErrorType IsSameTypeAs short-circuit; TolerantStep is the per-pass-item analog: same goal, different mechanism). Rewritten to: - Acknowledge dual-use across gathering and analysis passes - Distinguish from the expression-level cascade-suppression machinery - Make the strict vs collecting behavior contracts explicit ## 3. MultiMachineErrors test claim narrowed to what it actually tests The .p file header and the MultiErrorAcceptanceTest description both claimed the test validates "TolerantStep wrappers around pass 2a (MachineChecker) and pass 3". In reality, each machine in the file has a valid start state and payload type, so pass 2a never throws — the test only exercises pass 3 (function body type-checking) errors flowing through Phase 2's Report-and-continue plus Phase 3's per-function TolerantStep. Both descriptions updated to accurately call out: - What the test DOES exercise (per-machine isolation across pass 3) - What it DOESN'T cover (per-machine pass-2a throws like MissingStartState — would need a separate fixture) ## 4. PR description: precise strict-mode invariants Also updated PR #965's description to be precise about which strict-mode invariants the PR preserves vs. relaxes. The original claim "strict mode bit-for-bit unchanged" was over-broad — valid programs ARE bit-identical and invalid programs still exit 1, but the first-error IDENTITY can differ for invalid programs with both an argument error AND a callee/format/interface error (because args are now visited BEFORE lookup, which is the compiler-convention order and what lets collecting mode surface child errors). Replied on the 4 strict-mode reorder Copilot threads with the same rationale. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lights up the diagnostic collector scaffolded in #957. In strict mode (default), behavior is unchanged because handler.Diagnostics.Report re-throws immediately. In collecting mode (P_COMPILER_COLLECT_ERRORS=1) the type checker now reports all independent expression / statement errors in one pass instead of aborting at the first. - CheckAssignable(handler, ctx, expected, actual): report-and-return-false variant of IsAssignableFrom. ErrorType on either side is silently treated as compatible, preventing one upstream error from generating a chain of "incompatible operand" diagnostics downstream. - CheckArgument and ValidatePayloadTypes route through CheckAssignable so call-site / send-site argument checks inherit the same suppression. Every visit method follows the same convention: 1. Visit children first (so their internal errors surface). 2. If any child has ErrorType, return new ErrorExpr(context) without further checks (combiner rule). 3. Each `throw handler.X(...)` becomes `handler.Diagnostics.Report(handler.X(...)); return new ErrorExpr(...);` with remaining compatibility checks routed via CheckAssignable. Notable per-site recoveries: - VisitBinExpr: a single ErrorType guard at the top covers all 14 throws inside the switch. - VisitFunCallExpr / VisitCtorExpr: arguments are visited *before* callee/interface resolution so their internal errors surface even when the callee is unknown. - VisitCastExpr: resolve target type even when sub-expression errored, so a malformed type expression in a cast also gets reported. Statements either continue building the AST node with the values they have (most cases) or return new NoStmt(context) when there's nothing meaningful to construct (missing variable / interface / function / state / event): - Missing-declaration sites in CtorStmt, FunCallStmt, ForeachStmt, GotoStmt, ReceiveStmt → NoStmt placeholder. - Type-mismatch sites in AssertStmt, AssumeStmt, PrintStmt, ReturnStmt, AssignStmt, IfStmt, WhileStmt etc. → Report and continue with the typed expression as-is (downstream sees ErrorType if upstream). - SendStmt / RaiseStmt / AnnounceStmt: ErrorType guard around the null-event + assignability checks so an upstream-errored event expression doesn't generate spurious diagnostics. The Phase 1 fixture asserted bit-identical stderr in both modes. That's no longer the right invariant for StaticError/ inputs — collecting mode can now legitimately report MORE errors per file. Split into: - StrictAndCollectingAgreeOnValidPrograms: Correct/ only. Both modes must produce empty stderr and exit 0. - CollectingReportsAtLeastAsManyErrorsAsStrict: StaticError/ only. Both exit 1; collecting count >= strict count. Catches the only regression that matters: collecting silently suppressing a strict error. A curated P file with 4 deliberately-independent errors: - bool assigned to int - missing-declaration (must not cascade into "wrong type") - int + string - send arg-count mismatch Strict mode reports exactly 1 (aborts on first). Collecting mode reports exactly 4 with no spurious cascade diagnostics. Pinning both counts catches regressions to either the throw-site conversion or the cascade-suppression rules. handler.Diagnostics.Report re-throws when ContinueOnError=false. Every existing static-error test passes unchanged — the throw-site conversion preserves the historical "first error wins, abort" behavior bit-for-bit in strict mode. Pass-level tolerance classification in Analyzer.cs — decide per pass whether it can run on a partially-broken AST (e.g. capability check and module-system passes need a clean AST; control-flow check can tolerate per-function errors). Estimated 2-3 days. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two fixes for the Phase 2 CI failures: ## 1. NamedTupleBody dict collision (real bug) In collecting mode, the recovery for a duplicate field name added an entry with the *original* name to the entries[] array. Downstream NamedTupleType construction builds a Dictionary keyed by entry name, which throws InvalidOperationException on the collision. Test `Feature4DataTypes | StaticError | namedDuplicateField2` failed with "An item with the same key has already been added. Key: a". Fix: mangle the duplicate's name with a unique suffix (`$dup$N`). The duplicate-field diagnostic is already reported to the user via handler.Diagnostics.Report; mangling just keeps the AST well-formed for any subsequent type checks that happen to inspect the tuple. ## 2. Stack traces in test harness (diagnostic improvement) Phase1DormancyTest + MultiErrorAcceptanceTest both swallow uncaught exceptions from new Compiler().Compile(...) and report only e.Message. Several CI failures showed NREs on valid programs in strict mode with no way to identify which compiler pass NRE'd. Now we print `e.GetType().Name` and the full stack trace into stderr so the next CI run pinpoints the responsible code site. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PCheckerCodeGenerator.Compile dereferences job.ProjectRootPath.FullName at the dotnet-build stage. My fixtures used the 5-arg CompilerConfiguration constructor without supplying projectRoot, so the PChecker backend NRE'd during the second stage of Compile on every Correct program. This was masked in Phase 1 because both modes (strict and collecting) NRE'd symmetrically, and the Phase1DormancyTest's identical-stderr assertion still held. The Phase 2 split asserts exit code == 0 on valid programs, which exposed the bug. Mirror PCheckerRunner.cs by passing scratchDir as the projectRoot. The MultiErrorAcceptanceTest fixture also gets the same fix as defence in depth — its tests stop at type-check (HasErrors causes Compiler.cs to return early before reaching the backend), so the bug wouldn't fire there in practice, but consistency keeps the next person from re- discovering this. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All five comments from Copilot's review of #965 were valid recovery holes — places where the converted code suppressed a diagnostic too aggressively or failed to propagate ErrorType. Fixes: ## 1. ValidatePayloadTypes: don't swallow the arity check Previous: any argument with ErrorType caused an early return from the whole helper, hiding the independent IncorrectArgumentCount diagnostic when arity was wrong AND one arg also errored upstream. Fix: remove the blanket early-return on `arguments.Any(a => a.Type is ErrorType)`. Per-argument cascade suppression is already handled by CheckAssignable's own ErrorType short-circuit, so the arity check (a property of the call-site, not the arg types) now fires independently. Added an explicit arity check inside the tuple branch since the previous code relied on Zip silently truncating mismatched counts. ## 2. VisitForeachStmt: visit invariants when iterator var is missing Previous: missing-iterator recovery visited the collection and body but not `context._invariants`. Type errors inside loop invariants were silently suppressed in collecting mode. Fix: visit each invariant via exprVisitor.Visit before returning NoStmt. ## 3. VisitReceiveStmt: visit handler bodies on spec-illegal monitor op Previous: when receive was illegal (spec machine), the method reported IllegalMonitorOperation and returned NoStmt immediately, suppressing all diagnostics inside the handler bodies. Fix: mirror what Send/Announce/Raise do for their child expressions — construct each recvCase handler (with its scope and parameter) and run FunctionBodyVisitor.PopulateMethod to surface body errors before returning NoStmt. Event-id lookups and duplicate-case checks are intentionally skipped since the receive itself is already illegal. ## 4. VisitTestExpr: propagate ErrorType from instance Previous: an instance with ErrorType still produced a bool-typed TestExpr, masking the cascade for downstream consumers. Fix: do the kind-identifier lookup first (so missing-decl on the kind is still reported even when instance errored), then bail to ErrorExpr if instance.Type is ErrorType. Preserves both diagnostics' independence. ## 5. VisitTargetsExpr / VisitFlyingExpr / VisitSentExpr: propagate ErrorType Previous: these three constructed bool-typed nodes regardless of operand state, breaking the ErrorType propagation convention documented in ExprVisitor's class header. Fix: add the same early-guard pattern used elsewhere — if any visited child has ErrorType, return new ErrorExpr(context) before constructing the typed node. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After 3 rounds of Copilot findings, ran a 5-agent parallel audit covering cascade-suppression conventions, AST-recovery downstream NRE risks, strict- mode preservation, test coverage gaps, and Phase 3 forward compatibility. This commit applies the HIGH and MEDIUM severity findings and adds three new pinned-count acceptance tests. ## Why Copilot kept finding things Cascade-suppression has ~5 rules (producer, propagator, combiner, lvalue, call) and the conversion touched ~30 visit methods. Verifying every method follows every rule is combinatorially hard for a single reviewer. The multi-agent audit caught a third set of bugs that a single read missed — this time we also added programmatic acceptance tests that pin error counts so the next regression fires loudly. ## Tier 1: cascade-suppression fixes (HIGH severity) **1. `VisitReceiveStmt` spec-machine recovery now visits event-id lookups.** Copilot's first round added handler body visiting; the audit caught that undeclared events in the `case ...` lists were still being swallowed. **2. `VisitQuantExpr` diff-bound recovery now visits the body.** The two early-returns (arity mismatch, bound-type mismatch) skipped the body, so nested type errors inside `forall ... :: <body>` quantifiers got lost when the bound was malformed. **3. `VisitQuantExpr` bound-ErrorType short-circuit.** When TypeResolver reported a missing declaration on the bound type, the diff switch defaulted to a spurious "expected Event" cascade. New explicit `if (bound[0].Type is ErrorType)` guard bails silently and visits the body anyway. ## Tier 2: defense-in-depth for Phase 3 **4. Analyzer.cs HasErrors gate before pass 8.** ModuleSystemDeclarations / ModuleSystemTypeChecker (passes 8-10) contain hard casts (`(NamedTupleExpr)`, `(NamedTupleType)`, `(SeqLiteralExpr)`) that ErrorExpr / ErrorType would trip with InvalidCastException in collecting mode on a partial AST. Today this is masked by Compiler.cs's HasErrors gate at the END of compilation — but those passes still RAN on the broken AST. Skip them when diagnostics are present. Phase 3 will replace this single gate with per-pass tolerance classification. **5. `FunctionBodyVisitor.PopulateMethod` backfills empty Body.** `Function.IsForeign = Body == null`, so a function that errored partway through population would silently be misclassified as foreign. Backfill `new CompoundStmt(...)` on failure. Belt-and-braces — Compiler.cs's HasErrors gate already ensures these don't reach IR or backends, but preserving the "non-foreign ⇒ has body" invariant simplifies Phase 3. **6. `ValidatePayloadTypes` null payload safety.** Pass 2a (MachineChecker) may leave PayloadType null when an event/ interface declaration itself errored. Treat null as ErrorType bail so ValidatePayloadTypes doesn't NRE in collecting mode when called via pass-3 recovery paths. ## Tier 3: pinned-count acceptance tests Refactored `MultiErrorAcceptanceTest` from one hard-coded file to a `[TestCaseSource]`-driven fixture. Each row pins (strict count, collecting count) for a curated `.p` file. When counts change intentionally, update the row AND the file's header comment. Three new pinned cases: - **`NestedExprErrors.p`** (1 / 2): single expression with two nested missing-declarations. Validates combiner rule — `undeclaredA.foo + undeclaredB.bar * "str"` must NOT produce extra cascades on `+` or `*`. - **`ForeachInvariantError.p`** (1 / 2): foreach with valid iterator, body error + invariant error. Validates the happy-path branch of VisitForeachStmt visits both body and each invariant. - **`SpecReceiveBodyError.p`** (1 / 3): `receive` inside a spec machine: IllegalMonitorOperation + undeclared event + body type mismatch. Validates the new event-id lookup + the existing handler- body recovery in the spec-machine branch. Diagnostic guidance for future failures is in MultiErrorAcceptanceTest's class doc. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CompoundStmt's constructor takes IEnumerable<IPStmt>; the empty-body fail-safe I just added references IPStmt without the matching using. Build broke on all 3 platforms with CS0246. Trivial import fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ForeachInvariantError.p triggered a NotSupportedException because `invariant` is gated to PVerifier mode — the regression suite runs in default PChecker mode which rejects PVerifier tokens at the lexer. The MultiErrorAcceptanceTest pinned 1/2 errors but the harness got 0/0 because [NotSupportedError:] isn't counted as [Error:]. Rename ForeachInvariantError → ForeachBodyErrors and rework to exercise two body errors (MissingDeclaration + TypeMismatch) without any PVerifier syntax. Still validates the audit-flagged "happy path visits the body" invariant; the parallel "happy path visits invariants" fix in StatementVisitor.VisitForeachStmt remains in code and will get its own regression once a PVerifier-mode test harness exists. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… precheck Subtle bug noticed while resolving the rebase conflict: the hasKind precheck loop used `table.Lookup(name, out State _)` (cross-machine lookup) for the State branch, but the actual dispatch below used `TryResolveStateForInstance` (PR #963's narrowed lookup). Failure path: `myA is S2` where MachineA has no S2 but MachineB does. hasKind succeeds via the cross-machine lookup (S2 found in B). The ErrorType check passes. Then dispatch falls through Machine/Event, and `TryResolveStateForInstance` correctly returns false (S2 not in A's scope) — but the trailing "Unreachable" comment becomes REACHABLE and silently returns ErrorExpr without firing the MissingDeclaration diagnostic. The user gets exit code 1 with NO error message. Fix: change the hasKind precheck's State branch to use `TryResolveStateForInstance` so `hasKind` reflects whether a USABLE binding exists, not just any binding anywhere in the program. Now the cross-machine S2 case correctly fails the precheck and emits the "could not find machine, event, or state 'S2'" diagnostic. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bd1cafe to
dd3f4bf
Compare
Stacks on Phase 2 (#965). Closes the multi-error type-checker initiative by adding per-item try/catch isolation to the analyzer's gathering passes and a HasErrors gate before the analysis passes that assume a clean AST. ## What changed ### `TolerantStep` helper New private helper in `Analyzer.cs`: private static void TolerantStep(ITranslationErrorHandler handler, Action body) { try { body(); } catch (TranslationException e) when (handler.Diagnostics.ContinueOnError) { handler.Diagnostics.Report(e); } } In strict mode (ContinueOnError == false) the `when` filter is false so the exception propagates exactly as before — bit-for-bit preservation of today's throw-on-first-error semantics. In collecting mode the catch captures the TranslationException into the collector and the loop continues with the next item. Only TranslationException is caught; NREs and other runtime exceptions still propagate so genuine bugs surface loudly. ### Tolerant gathering passes (2a, 2b, 3, 3b, 7) Each iteration wrapped in `TolerantStep`: - **2a MachineChecker.Validate** — per-machine. One bad machine no longer aborts MachineChecker for siblings. - **3 FunctionBody + Validator** — per-function. One bad function (TypeResolver throw on a bad var decl, missing-paths-return, etc.) no longer aborts pass 3 for the rest of the program. - **3b Invariants / Axioms / AssumeOnStarts / Pures / Foreign** — per-item across all five subgroups. One bad invariant no longer blocks axiom checking. - **2b ValidateNoStaticHandlers** — per-machine. - **7 InferMachineCreates** — per-machine. Already gated by HasErrors in practice, but the wrapper is cheap insurance. ### Gathering→analysis boundary New HasErrors gate placed after pass 2b (the last gathering pass) and before pass 4 (ApplyPropagations): if (handler.Diagnostics.HasErrors) return globalScope; If any gathering pass collected an error, the analysis passes (4-7) and the module-system passes (8-10) are skipped. The duplicate gate before pass 8 (added in Phase 2 robustness) is kept as defense-in-depth — once Phase 4+ converts pass-5 capability checks to Report, that second gate will earn its keep. ## Why this matters Without Phase 3, a single machine with a bad start state aborts MachineChecker mid-pass, so a sibling machine's MoreThanOneParameterForHandlers error never gets reported. A single function with a malformed var decl aborts pass 3, so none of the OTHER functions' bodies get type-checked at all. Combined with Phase 2's expression-level recovery, Phase 3 means a user with N independent errors (in any combination of expressions, statements, machines, or functions) sees ALL of them in one compile. ## Tests ### New: `MultiMachineErrors.p` + pinned counts Three machines, each with one independent error (bool->int, missing declaration, int+string binop). Pinned in MultiErrorAcceptanceTest: strict=1, collecting=3. Validates that one machine's error doesn't clobber its siblings' diagnostics in collecting mode. ### Existing: All Phase 2 acceptance tests unchanged MultipleErrors (1/4), NestedExprErrors (1/2), ForeachBodyErrors (1/2), SpecReceiveBodyError (1/3). Same counts as Phase 2 — Phase 3 doesn't change behavior within a single machine. ### Existing: Phase1DormancyTest baseline The "collecting >= strict count" invariant on every Correct/StaticError test still holds. Phase 3 only *adds* errors (when multi-machine / multi-function scenarios exist); it never removes. ## Phase 4 (deferred, NOT in this PR) Convert pass 5 (capability checks) and pass 6 (ControlFlowChecker) throw sites to Report so capability and control-flow violations also accumulate across machines instead of aborting on first. The current HasErrors gate already prevents these passes from running on a broken AST, so they remain RequiresClean for now. Estimated: ~1 day if warranted by user feedback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Multi-agent review of PRs #957/#965/#967 identified 3 HIGH-severity harness fragilities (could silently mask real bugs) and ~10 coverage gaps. This commit applies the harness fixes + the 4 highest-priority new pinned tests + 1 doc fix. ## Harness fixes ### 1. Count ALL diagnostic markers, not just `[Error:]` Previous: `CountOccurrences(stderr, "[Error:]") + CountOccurrences(stderr, "[Parser Error:]")`. Problem: Compiler.cs also emits `[NotSupportedError:]`, `[NotImplementedError:]`, and per-backend `[<entry> Compiling Generated Code:]` markers. A regression that converts a real diagnostic into a `NotSupportedException` would be counted as zero errors, making `CollectingReportsAtLeastAsManyErrorsAsStrict`'s `>=` invariant trivially pass with both sides zero — masking the regression entirely. We already hit this once when the ForeachInvariantError.p test tripped a `NotSupportedException` on the PVerifier `invariant` keyword. Fix: new `CountErrorMarkers(stderr)` uses two regexes — one for `[<text>Error<text>:]` (catches all five marker variants) and one for the explicit `Compiling Generated Code` marker. Applied to both Phase1DormancyTest.cs and MultiErrorAcceptanceTest.cs. ### 2. Catch ONLY TranslationException Previous: `catch (Exception e)` in both fixture's RunOnce helpers. Problem: Phase 2's MultiAgent audit explicitly avoided this pattern. NREs, `Debug.Assert` failures, and other runtime exceptions get folded into `exitCode = -1` and identical stderr text — two parallel NREs produce equal output, equal exit codes, equal error counts (0), and the `StrictAndCollectingAgreeOnValidPrograms` equality assertion passes green despite both runs crashing. Fix: catch only `TranslationException`. Other exceptions propagate so genuine bugs surface as test failures with full stack traces. ### 3. Mark env-var test [NonParallelizable] Previous: `DiagnosticCollectorTest.ContinueOnError_ReadsEnvVar` mutates the process-global `P_COMPILER_COLLECT_ERRORS` env var. Problem: `RegressionTests.CompileOnlyRegressionTests` is `[Parallelizable(ParallelScope.Children)]`. With `dotnet test` running fixtures concurrently, a parallel CompilerConfiguration construction could read the temporarily-set env var and run a strict-mode regression test in collecting mode — silently masking a strict-mode regression as a flaky-pass. Fix: add `[NonParallelizable]` to the env-var test. ## Doc fix `MultipleErrors.p` line 11 said the 4th error was "arg count mismatch". The actual mechanism is "payload TYPE mismatch via single-arg CheckArgument branch" because `(1,2,3)` parses as a single unnamed-tuple argument. Count is 1 either way, but the prose was actively misleading. ## 4 new pinned coverage tests The audit identified ~12 throw sites in ExprVisitor/StatementVisitor with NO multi-error coverage at all. The four highest-priority additions: 1. **CtorArgErrors.p (1/3)** — VisitCtorExpr arg-pre-visit recovery. Validates that constructor args are visited BEFORE interface lookup, so missing-decl errors in args surface even when the interface itself is unknown. 2. **FunCallArgErrors.p (1/4)** — VisitFunCallExpr Function branch. Two calls combining argument errors with call-shape errors (arity + type mismatch). Validates cascade-suppression of arg[0] while arg[1] mismatch still surfaces. 3. **MultiFunctionPerMachine.p (1/3)** — Phase 3 per-FUNCTION isolation distinct from per-machine. Three functions in ONE machine, one error each. Catches a regression where Phase 3's per-function TolerantStep wrapper stops iterating after the first failing function. 4. **GotoRecoveryErrors.p (1/4)** — VisitGotoStmt rvalue-pre-visit recovery in both the missing-state branch and the arity-mismatch branch. Validates that argument errors surface even when the goto itself is malformed. ## Deferred (lower-priority audit findings) - 6 more coverage gaps (CastExpr, ChooseExpr, FormatString, dup tuple field, RaiseStmt non-void, etc.). Less critical because each is covered by Phase1DormancyTest's weak `>=` baseline. - Harness #5/#7 (scratch-dir GetHashCode collisions, parallel dotnet build). The Ubuntu flake from earlier may relate but hasn't recurred since the projectRoot fix. Tracked separately. - Harness #8 (test timeouts). Nice-to-have; current CI runs reliably. - Identity-aware assertions (stderr substring matches) — would lock in which error fires first, but adds maintenance cost. Defer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real bug + 3 doc accuracy improvements from Copilot's second-round review of PR #965 (1 code fix) and first-round review of PR #967 (2 doc updates): ## 1. VisitInsertStmt key-check pointed at the wrong AST node `StatementVisitor.VisitInsertStmt` reported the key/index TypeMismatch at `context.rvalue()` (the VALUE token of `insert lvalue[index] = value`) when the key actually lives in `context.expr()`. Diagnostics like "got type: string, expected: int" would point the user at the value expression instead of the index expression — confusingly off-target. The bug pre-existed the Phase 2 conversion (master had the same wrong context), but Copilot caught it because Phase 2 made the diff visible when converting `throw → CheckAssignable`. Fixed at line 215 to use context.expr() for the key check; context.rvalue() remains correct for the value check. ## 2. TolerantStep XML doc no longer claims cascade-suppression The previous doc said TolerantStep "runs an analysis step under cascade- suppression rules" — but it's used for both gathering AND analysis passes, and it doesn't implement cascade suppression at all (cascade suppression lives in CheckAssignable + the ErrorType IsSameTypeAs short-circuit; TolerantStep is the per-pass-item analog: same goal, different mechanism). Rewritten to: - Acknowledge dual-use across gathering and analysis passes - Distinguish from the expression-level cascade-suppression machinery - Make the strict vs collecting behavior contracts explicit ## 3. MultiMachineErrors test claim narrowed to what it actually tests The .p file header and the MultiErrorAcceptanceTest description both claimed the test validates "TolerantStep wrappers around pass 2a (MachineChecker) and pass 3". In reality, each machine in the file has a valid start state and payload type, so pass 2a never throws — the test only exercises pass 3 (function body type-checking) errors flowing through Phase 2's Report-and-continue plus Phase 3's per-function TolerantStep. Both descriptions updated to accurately call out: - What the test DOES exercise (per-machine isolation across pass 3) - What it DOESN'T cover (per-machine pass-2a throws like MissingStartState — would need a separate fixture) ## 4. PR description: precise strict-mode invariants Also updated PR #965's description to be precise about which strict-mode invariants the PR preserves vs. relaxes. The original claim "strict mode bit-for-bit unchanged" was over-broad — valid programs ARE bit-identical and invalid programs still exit 1, but the first-error IDENTITY can differ for invalid programs with both an argument error AND a callee/format/interface error (because args are now visited BEFORE lookup, which is the compiler-convention order and what lets collecting mode surface child errors). Replied on the 4 strict-mode reorder Copilot threads with the same rationale. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stacks on Phase 2 (#965). Closes the multi-error type-checker initiative by adding per-item try/catch isolation to the analyzer's gathering passes and a HasErrors gate before the analysis passes that assume a clean AST. ## What changed ### `TolerantStep` helper New private helper in `Analyzer.cs`: private static void TolerantStep(ITranslationErrorHandler handler, Action body) { try { body(); } catch (TranslationException e) when (handler.Diagnostics.ContinueOnError) { handler.Diagnostics.Report(e); } } In strict mode (ContinueOnError == false) the `when` filter is false so the exception propagates exactly as before — bit-for-bit preservation of today's throw-on-first-error semantics. In collecting mode the catch captures the TranslationException into the collector and the loop continues with the next item. Only TranslationException is caught; NREs and other runtime exceptions still propagate so genuine bugs surface loudly. ### Tolerant gathering passes (2a, 2b, 3, 3b, 7) Each iteration wrapped in `TolerantStep`: - **2a MachineChecker.Validate** — per-machine. One bad machine no longer aborts MachineChecker for siblings. - **3 FunctionBody + Validator** — per-function. One bad function (TypeResolver throw on a bad var decl, missing-paths-return, etc.) no longer aborts pass 3 for the rest of the program. - **3b Invariants / Axioms / AssumeOnStarts / Pures / Foreign** — per-item across all five subgroups. One bad invariant no longer blocks axiom checking. - **2b ValidateNoStaticHandlers** — per-machine. - **7 InferMachineCreates** — per-machine. Already gated by HasErrors in practice, but the wrapper is cheap insurance. ### Gathering→analysis boundary New HasErrors gate placed after pass 2b (the last gathering pass) and before pass 4 (ApplyPropagations): if (handler.Diagnostics.HasErrors) return globalScope; If any gathering pass collected an error, the analysis passes (4-7) and the module-system passes (8-10) are skipped. The duplicate gate before pass 8 (added in Phase 2 robustness) is kept as defense-in-depth — once Phase 4+ converts pass-5 capability checks to Report, that second gate will earn its keep. ## Why this matters Without Phase 3, a single machine with a bad start state aborts MachineChecker mid-pass, so a sibling machine's MoreThanOneParameterForHandlers error never gets reported. A single function with a malformed var decl aborts pass 3, so none of the OTHER functions' bodies get type-checked at all. Combined with Phase 2's expression-level recovery, Phase 3 means a user with N independent errors (in any combination of expressions, statements, machines, or functions) sees ALL of them in one compile. ## Tests ### New: `MultiMachineErrors.p` + pinned counts Three machines, each with one independent error (bool->int, missing declaration, int+string binop). Pinned in MultiErrorAcceptanceTest: strict=1, collecting=3. Validates that one machine's error doesn't clobber its siblings' diagnostics in collecting mode. ### Existing: All Phase 2 acceptance tests unchanged MultipleErrors (1/4), NestedExprErrors (1/2), ForeachBodyErrors (1/2), SpecReceiveBodyError (1/3). Same counts as Phase 2 — Phase 3 doesn't change behavior within a single machine. ### Existing: Phase1DormancyTest baseline The "collecting >= strict count" invariant on every Correct/StaticError test still holds. Phase 3 only *adds* errors (when multi-machine / multi-function scenarios exist); it never removes. ## Phase 4 (deferred, NOT in this PR) Convert pass 5 (capability checks) and pass 6 (ControlFlowChecker) throw sites to Report so capability and control-flow violations also accumulate across machines instead of aborting on first. The current HasErrors gate already prevents these passes from running on a broken AST, so they remain RequiresClean for now. Estimated: ~1 day if warranted by user feedback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Multi-agent review of PRs #957/#965/#967 identified 3 HIGH-severity harness fragilities (could silently mask real bugs) and ~10 coverage gaps. This commit applies the harness fixes + the 4 highest-priority new pinned tests + 1 doc fix. ## Harness fixes ### 1. Count ALL diagnostic markers, not just `[Error:]` Previous: `CountOccurrences(stderr, "[Error:]") + CountOccurrences(stderr, "[Parser Error:]")`. Problem: Compiler.cs also emits `[NotSupportedError:]`, `[NotImplementedError:]`, and per-backend `[<entry> Compiling Generated Code:]` markers. A regression that converts a real diagnostic into a `NotSupportedException` would be counted as zero errors, making `CollectingReportsAtLeastAsManyErrorsAsStrict`'s `>=` invariant trivially pass with both sides zero — masking the regression entirely. We already hit this once when the ForeachInvariantError.p test tripped a `NotSupportedException` on the PVerifier `invariant` keyword. Fix: new `CountErrorMarkers(stderr)` uses two regexes — one for `[<text>Error<text>:]` (catches all five marker variants) and one for the explicit `Compiling Generated Code` marker. Applied to both Phase1DormancyTest.cs and MultiErrorAcceptanceTest.cs. ### 2. Catch ONLY TranslationException Previous: `catch (Exception e)` in both fixture's RunOnce helpers. Problem: Phase 2's MultiAgent audit explicitly avoided this pattern. NREs, `Debug.Assert` failures, and other runtime exceptions get folded into `exitCode = -1` and identical stderr text — two parallel NREs produce equal output, equal exit codes, equal error counts (0), and the `StrictAndCollectingAgreeOnValidPrograms` equality assertion passes green despite both runs crashing. Fix: catch only `TranslationException`. Other exceptions propagate so genuine bugs surface as test failures with full stack traces. ### 3. Mark env-var test [NonParallelizable] Previous: `DiagnosticCollectorTest.ContinueOnError_ReadsEnvVar` mutates the process-global `P_COMPILER_COLLECT_ERRORS` env var. Problem: `RegressionTests.CompileOnlyRegressionTests` is `[Parallelizable(ParallelScope.Children)]`. With `dotnet test` running fixtures concurrently, a parallel CompilerConfiguration construction could read the temporarily-set env var and run a strict-mode regression test in collecting mode — silently masking a strict-mode regression as a flaky-pass. Fix: add `[NonParallelizable]` to the env-var test. ## Doc fix `MultipleErrors.p` line 11 said the 4th error was "arg count mismatch". The actual mechanism is "payload TYPE mismatch via single-arg CheckArgument branch" because `(1,2,3)` parses as a single unnamed-tuple argument. Count is 1 either way, but the prose was actively misleading. ## 4 new pinned coverage tests The audit identified ~12 throw sites in ExprVisitor/StatementVisitor with NO multi-error coverage at all. The four highest-priority additions: 1. **CtorArgErrors.p (1/3)** — VisitCtorExpr arg-pre-visit recovery. Validates that constructor args are visited BEFORE interface lookup, so missing-decl errors in args surface even when the interface itself is unknown. 2. **FunCallArgErrors.p (1/4)** — VisitFunCallExpr Function branch. Two calls combining argument errors with call-shape errors (arity + type mismatch). Validates cascade-suppression of arg[0] while arg[1] mismatch still surfaces. 3. **MultiFunctionPerMachine.p (1/3)** — Phase 3 per-FUNCTION isolation distinct from per-machine. Three functions in ONE machine, one error each. Catches a regression where Phase 3's per-function TolerantStep wrapper stops iterating after the first failing function. 4. **GotoRecoveryErrors.p (1/4)** — VisitGotoStmt rvalue-pre-visit recovery in both the missing-state branch and the arity-mismatch branch. Validates that argument errors surface even when the goto itself is malformed. ## Deferred (lower-priority audit findings) - 6 more coverage gaps (CastExpr, ChooseExpr, FormatString, dup tuple field, RaiseStmt non-void, etc.). Less critical because each is covered by Phase1DormancyTest's weak `>=` baseline. - Harness #5/#7 (scratch-dir GetHashCode collisions, parallel dotnet build). The Ubuntu flake from earlier may relate but hasn't recurred since the projectRoot fix. Tracked separately. - Harness #8 (test timeouts). Nice-to-have; current CI runs reliably. - Identity-aware assertions (stderr substring matches) — would lock in which error fires first, but adds maintenance cost. Defer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Real bug + 3 doc accuracy improvements from Copilot's second-round review of PR #965 (1 code fix) and first-round review of PR #967 (2 doc updates): ## 1. VisitInsertStmt key-check pointed at the wrong AST node `StatementVisitor.VisitInsertStmt` reported the key/index TypeMismatch at `context.rvalue()` (the VALUE token of `insert lvalue[index] = value`) when the key actually lives in `context.expr()`. Diagnostics like "got type: string, expected: int" would point the user at the value expression instead of the index expression — confusingly off-target. The bug pre-existed the Phase 2 conversion (master had the same wrong context), but Copilot caught it because Phase 2 made the diff visible when converting `throw → CheckAssignable`. Fixed at line 215 to use context.expr() for the key check; context.rvalue() remains correct for the value check. ## 2. TolerantStep XML doc no longer claims cascade-suppression The previous doc said TolerantStep "runs an analysis step under cascade- suppression rules" — but it's used for both gathering AND analysis passes, and it doesn't implement cascade suppression at all (cascade suppression lives in CheckAssignable + the ErrorType IsSameTypeAs short-circuit; TolerantStep is the per-pass-item analog: same goal, different mechanism). Rewritten to: - Acknowledge dual-use across gathering and analysis passes - Distinguish from the expression-level cascade-suppression machinery - Make the strict vs collecting behavior contracts explicit ## 3. MultiMachineErrors test claim narrowed to what it actually tests The .p file header and the MultiErrorAcceptanceTest description both claimed the test validates "TolerantStep wrappers around pass 2a (MachineChecker) and pass 3". In reality, each machine in the file has a valid start state and payload type, so pass 2a never throws — the test only exercises pass 3 (function body type-checking) errors flowing through Phase 2's Report-and-continue plus Phase 3's per-function TolerantStep. Both descriptions updated to accurately call out: - What the test DOES exercise (per-machine isolation across pass 3) - What it DOESN'T cover (per-machine pass-2a throws like MissingStartState — would need a separate fixture) ## 4. PR description: precise strict-mode invariants Also updated PR #965's description to be precise about which strict-mode invariants the PR preserves vs. relaxes. The original claim "strict mode bit-for-bit unchanged" was over-broad — valid programs ARE bit-identical and invalid programs still exit 1, but the first-error IDENTITY can differ for invalid programs with both an argument error AND a callee/format/interface error (because args are now visited BEFORE lookup, which is the compiler-convention order and what lets collecting mode surface child errors). Replied on the 4 strict-mode reorder Copilot threads with the same rationale. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Phase 3: pass-level tolerance classification in Analyzer.cs Stacks on Phase 2 (#965). Closes the multi-error type-checker initiative by adding per-item try/catch isolation to the analyzer's gathering passes and a HasErrors gate before the analysis passes that assume a clean AST. ## What changed ### `TolerantStep` helper New private helper in `Analyzer.cs`: private static void TolerantStep(ITranslationErrorHandler handler, Action body) { try { body(); } catch (TranslationException e) when (handler.Diagnostics.ContinueOnError) { handler.Diagnostics.Report(e); } } In strict mode (ContinueOnError == false) the `when` filter is false so the exception propagates exactly as before — bit-for-bit preservation of today's throw-on-first-error semantics. In collecting mode the catch captures the TranslationException into the collector and the loop continues with the next item. Only TranslationException is caught; NREs and other runtime exceptions still propagate so genuine bugs surface loudly. ### Tolerant gathering passes (2a, 2b, 3, 3b, 7) Each iteration wrapped in `TolerantStep`: - **2a MachineChecker.Validate** — per-machine. One bad machine no longer aborts MachineChecker for siblings. - **3 FunctionBody + Validator** — per-function. One bad function (TypeResolver throw on a bad var decl, missing-paths-return, etc.) no longer aborts pass 3 for the rest of the program. - **3b Invariants / Axioms / AssumeOnStarts / Pures / Foreign** — per-item across all five subgroups. One bad invariant no longer blocks axiom checking. - **2b ValidateNoStaticHandlers** — per-machine. - **7 InferMachineCreates** — per-machine. Already gated by HasErrors in practice, but the wrapper is cheap insurance. ### Gathering→analysis boundary New HasErrors gate placed after pass 2b (the last gathering pass) and before pass 4 (ApplyPropagations): if (handler.Diagnostics.HasErrors) return globalScope; If any gathering pass collected an error, the analysis passes (4-7) and the module-system passes (8-10) are skipped. The duplicate gate before pass 8 (added in Phase 2 robustness) is kept as defense-in-depth — once Phase 4+ converts pass-5 capability checks to Report, that second gate will earn its keep. ## Why this matters Without Phase 3, a single machine with a bad start state aborts MachineChecker mid-pass, so a sibling machine's MoreThanOneParameterForHandlers error never gets reported. A single function with a malformed var decl aborts pass 3, so none of the OTHER functions' bodies get type-checked at all. Combined with Phase 2's expression-level recovery, Phase 3 means a user with N independent errors (in any combination of expressions, statements, machines, or functions) sees ALL of them in one compile. ## Tests ### New: `MultiMachineErrors.p` + pinned counts Three machines, each with one independent error (bool->int, missing declaration, int+string binop). Pinned in MultiErrorAcceptanceTest: strict=1, collecting=3. Validates that one machine's error doesn't clobber its siblings' diagnostics in collecting mode. ### Existing: All Phase 2 acceptance tests unchanged MultipleErrors (1/4), NestedExprErrors (1/2), ForeachBodyErrors (1/2), SpecReceiveBodyError (1/3). Same counts as Phase 2 — Phase 3 doesn't change behavior within a single machine. ### Existing: Phase1DormancyTest baseline The "collecting >= strict count" invariant on every Correct/StaticError test still holds. Phase 3 only *adds* errors (when multi-machine / multi-function scenarios exist); it never removes. ## Phase 4 (deferred, NOT in this PR) Convert pass 5 (capability checks) and pass 6 (ControlFlowChecker) throw sites to Report so capability and control-flow violations also accumulate across machines instead of aborting on first. The current HasErrors gate already prevents these passes from running on a broken AST, so they remain RequiresClean for now. Estimated: ~1 day if warranted by user feedback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Multi-agent review: harness fixes + 4 new pinned coverage tests Multi-agent review of PRs #957/#965/#967 identified 3 HIGH-severity harness fragilities (could silently mask real bugs) and ~10 coverage gaps. This commit applies the harness fixes + the 4 highest-priority new pinned tests + 1 doc fix. ## Harness fixes ### 1. Count ALL diagnostic markers, not just `[Error:]` Previous: `CountOccurrences(stderr, "[Error:]") + CountOccurrences(stderr, "[Parser Error:]")`. Problem: Compiler.cs also emits `[NotSupportedError:]`, `[NotImplementedError:]`, and per-backend `[<entry> Compiling Generated Code:]` markers. A regression that converts a real diagnostic into a `NotSupportedException` would be counted as zero errors, making `CollectingReportsAtLeastAsManyErrorsAsStrict`'s `>=` invariant trivially pass with both sides zero — masking the regression entirely. We already hit this once when the ForeachInvariantError.p test tripped a `NotSupportedException` on the PVerifier `invariant` keyword. Fix: new `CountErrorMarkers(stderr)` uses two regexes — one for `[<text>Error<text>:]` (catches all five marker variants) and one for the explicit `Compiling Generated Code` marker. Applied to both Phase1DormancyTest.cs and MultiErrorAcceptanceTest.cs. ### 2. Catch ONLY TranslationException Previous: `catch (Exception e)` in both fixture's RunOnce helpers. Problem: Phase 2's MultiAgent audit explicitly avoided this pattern. NREs, `Debug.Assert` failures, and other runtime exceptions get folded into `exitCode = -1` and identical stderr text — two parallel NREs produce equal output, equal exit codes, equal error counts (0), and the `StrictAndCollectingAgreeOnValidPrograms` equality assertion passes green despite both runs crashing. Fix: catch only `TranslationException`. Other exceptions propagate so genuine bugs surface as test failures with full stack traces. ### 3. Mark env-var test [NonParallelizable] Previous: `DiagnosticCollectorTest.ContinueOnError_ReadsEnvVar` mutates the process-global `P_COMPILER_COLLECT_ERRORS` env var. Problem: `RegressionTests.CompileOnlyRegressionTests` is `[Parallelizable(ParallelScope.Children)]`. With `dotnet test` running fixtures concurrently, a parallel CompilerConfiguration construction could read the temporarily-set env var and run a strict-mode regression test in collecting mode — silently masking a strict-mode regression as a flaky-pass. Fix: add `[NonParallelizable]` to the env-var test. ## Doc fix `MultipleErrors.p` line 11 said the 4th error was "arg count mismatch". The actual mechanism is "payload TYPE mismatch via single-arg CheckArgument branch" because `(1,2,3)` parses as a single unnamed-tuple argument. Count is 1 either way, but the prose was actively misleading. ## 4 new pinned coverage tests The audit identified ~12 throw sites in ExprVisitor/StatementVisitor with NO multi-error coverage at all. The four highest-priority additions: 1. **CtorArgErrors.p (1/3)** — VisitCtorExpr arg-pre-visit recovery. Validates that constructor args are visited BEFORE interface lookup, so missing-decl errors in args surface even when the interface itself is unknown. 2. **FunCallArgErrors.p (1/4)** — VisitFunCallExpr Function branch. Two calls combining argument errors with call-shape errors (arity + type mismatch). Validates cascade-suppression of arg[0] while arg[1] mismatch still surfaces. 3. **MultiFunctionPerMachine.p (1/3)** — Phase 3 per-FUNCTION isolation distinct from per-machine. Three functions in ONE machine, one error each. Catches a regression where Phase 3's per-function TolerantStep wrapper stops iterating after the first failing function. 4. **GotoRecoveryErrors.p (1/4)** — VisitGotoStmt rvalue-pre-visit recovery in both the missing-state branch and the arity-mismatch branch. Validates that argument errors surface even when the goto itself is malformed. ## Deferred (lower-priority audit findings) - 6 more coverage gaps (CastExpr, ChooseExpr, FormatString, dup tuple field, RaiseStmt non-void, etc.). Less critical because each is covered by Phase1DormancyTest's weak `>=` baseline. - Harness #5/#7 (scratch-dir GetHashCode collisions, parallel dotnet build). The Ubuntu flake from earlier may relate but hasn't recurred since the projectRoot fix. Tracked separately. - Harness #8 (test timeouts). Nice-to-have; current CI runs reliably. - Identity-aware assertions (stderr substring matches) — would lock in which error fires first, but adds maintenance cost. Defer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Round-2 multi-agent: typo fixes + source-order diagnostics + null-safety Five fixes from a second multi-agent audit covering 5 new dimensions (backwards-compat, PVerifier integration, error message quality, performance/scale, cross-PR consistency, PeasyAI integration). The audit produced ~30 findings; this commit applies the 5 quick-wins. The rest are tracked as deferred follow-ups (see PR description). ## 1. Error message typo: "is not undeclared" reverses meaning `DefaultTranslationErrorHandler.UndeclaredGlobalParam` previously emitted `'global param X' is not undeclared` — literally says the param IS declared, confusing the user reading the diagnostic. Fixed to `global param 'X' is undeclared` (also cleaned up the inner quoting). ## 2. Error message typo: "Value M has no start state" → "machine M..." `DefaultTranslationErrorHandler.MissingStartState` said `Value {machine.Name} has no start state`. The word "Value" is residual from a long-ago refactor; the entity is a Machine. Fixed. ## 3. Source-order diagnostic flushing Previously `FlushCollectedDiagnostics` iterated `Diagnostics` in insertion order (visitor traversal order). With multi-error collecting mode the user expects errors in source reading order so IDE jump-to-line flows naturally. Now sorts by `(file, line, column)` with a stable tiebreaker. Diagnostics without a parseable location prefix sort last. Cached compiled regex (`LocationPrefixPattern`) to avoid per-diagnostic regex compilation on the flush path. ## 4. SourceLocation.ToString() null-safety `SourceLocation.ToString()` previously did `return File == null ? throw new ArgumentException() : ...`. The throw fires partway through `FlushCollectedDiagnostics`'s loop if any diagnostic has File=null (reachable for `EmptyContext` per `DefaultLocationResolver.GetLocation`), partial-flushing the user's diagnostics with a confusing ArgumentException stack trace. Fixed: fall back to `<no source>:line:col` instead of throwing. ## 5. Cached compiled regexes in test harness CountErrorMarkers `Phase1DormancyTest.CountErrorMarkers` and the parallel helper in `MultiErrorAcceptanceTest` previously constructed two `new Regex(...)` per call. With ~280 test dirs × 2 modes × 2 fixtures, that's ~1000+ regex compilations per suite. Negligible in absolute terms (~100-200ms total) but mechanical to fix — moved to static fields with `RegexOptions.Compiled`. Also documented why the harness's own `[Test harness caught uncaught TranslationException:]` marker is substring-free of "Error" (so it deliberately doesn't inflate counts). ## Deferred (multi-agent findings tracked but not applied) - **PR #963 ↔ Phase 2 conflict resolution** — needs explicit merge-order decision (cherry-pick #963 into Phase 2 stack vs. land #963 first then rebase Phase 2's VisitTestExpr rework on top). - **Wrap pass 5 capability + pass 6 ControlFlowChecker in TolerantStep** — semantic change; deserves its own PR with new pinned tests. Could be a "Phase 4". - **PVerifier end-to-end CI smoke** — multi-agent flagged ZERO CI coverage of PVerifier-mode compile. PR #963 noted the existing `Tutorial/Advanced/4_Paxos/PVerifiedPaxos` already crashes on bare `p compile`. Needs separate fix + test addition. - **PeasyAI collecting-mode integration** — multi-agent produced concrete 5-file integration plan (compilation.py, fixer.py, generation.py, server.py, fixing.py). The parser already supports multiple errors; PeasyAI just needs to set `P_COMPILER_COLLECT_ERRORS=1` and call the existing `get_all_errors` instead of `parse_error`. Separate PR scope. - **Parallelize Phase1DormancyTest** — biggest CI speedup (~9min → ~2-3min) but needs scratch-dir race-condition validation. - **"N errors suppressed" footer** — UX improvement; new infrastructure. - **Truncation cap** — needed for pathological N>50 cases. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Address Copilot round-2 review: InsertStmt location + doc accuracy Real bug + 3 doc accuracy improvements from Copilot's second-round review of PR #965 (1 code fix) and first-round review of PR #967 (2 doc updates): ## 1. VisitInsertStmt key-check pointed at the wrong AST node `StatementVisitor.VisitInsertStmt` reported the key/index TypeMismatch at `context.rvalue()` (the VALUE token of `insert lvalue[index] = value`) when the key actually lives in `context.expr()`. Diagnostics like "got type: string, expected: int" would point the user at the value expression instead of the index expression — confusingly off-target. The bug pre-existed the Phase 2 conversion (master had the same wrong context), but Copilot caught it because Phase 2 made the diff visible when converting `throw → CheckAssignable`. Fixed at line 215 to use context.expr() for the key check; context.rvalue() remains correct for the value check. ## 2. TolerantStep XML doc no longer claims cascade-suppression The previous doc said TolerantStep "runs an analysis step under cascade- suppression rules" — but it's used for both gathering AND analysis passes, and it doesn't implement cascade suppression at all (cascade suppression lives in CheckAssignable + the ErrorType IsSameTypeAs short-circuit; TolerantStep is the per-pass-item analog: same goal, different mechanism). Rewritten to: - Acknowledge dual-use across gathering and analysis passes - Distinguish from the expression-level cascade-suppression machinery - Make the strict vs collecting behavior contracts explicit ## 3. MultiMachineErrors test claim narrowed to what it actually tests The .p file header and the MultiErrorAcceptanceTest description both claimed the test validates "TolerantStep wrappers around pass 2a (MachineChecker) and pass 3". In reality, each machine in the file has a valid start state and payload type, so pass 2a never throws — the test only exercises pass 3 (function body type-checking) errors flowing through Phase 2's Report-and-continue plus Phase 3's per-function TolerantStep. Both descriptions updated to accurately call out: - What the test DOES exercise (per-machine isolation across pass 3) - What it DOESN'T cover (per-machine pass-2a throws like MissingStartState — would need a separate fixture) ## 4. PR description: precise strict-mode invariants Also updated PR #965's description to be precise about which strict-mode invariants the PR preserves vs. relaxes. The original claim "strict mode bit-for-bit unchanged" was over-broad — valid programs ARE bit-identical and invalid programs still exit 1, but the first-error IDENTITY can differ for invalid programs with both an argument error AND a callee/format/interface error (because args are now visited BEFORE lookup, which is the compiler-convention order and what lets collecting mode surface child errors). Replied on the 4 strict-mode reorder Copilot threads with the same rationale. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…omments (#968) Final multi-agent audit on master found that the multi-error compilation feature (PRs #957, #963, #965) shipped without any user-facing docs and with stale "Phase 1 wiring only" / "Phase 2 will..." comments in the public XML API. Users can't discover the env var; contributors have no convention to follow when adding new throw sites. ## User-facing docs - **README.md** — new "Multi-Error Compilation" section in "What's New" with a worked example showing strict vs collecting output and pointing at the AI-fix-loop use case. - **ChangeList.md** — entry under "P 3.0 Changes" listing the env var, cascade-suppression semantics, the #963 state-lookup narrowing, and explicitly noting strict-mode bit-for-bit preservation. ## Contributor docs - **CLAUDE.md** — new "Multi-Error Type Checking (Compiler)" top-level section. Covers architecture (IDiagnosticCollector, ErrorType/ErrorExpr, CheckAssignable, TolerantStep), the per-visitor convention checklist (visit children first → short-circuit on ErrorType → Report+ErrorExpr), the analyzer pass-tolerance pattern, and the test-fixture layout. Mentions the env var in the "Working with P Programs" command examples. ## MCP tool descriptions - **peasy-ai-compile** — append a Tip about setting P_COMPILER_COLLECT_ERRORS=1 to receive all errors in one response. - **peasy-ai-fix-all** — append the same Tip with a note that batched errors converge in N/k iterations instead of N. (The PeasyAI subprocess invocations themselves still need to actually PROPAGATE the env var — separate PR per the audit's recommendation.) ## Stale XML-doc refresh - **IDiagnosticCollector** — strike "Phase 1 wiring only — no visitor currently calls Report" (Phase 2 lit it up). Add a Lifecycle list pointing at ReadContinueOnErrorEnvVar, the visitors, TolerantStep, and FlushCollectedDiagnostics. Cross-reference the three test files that exercise the contract. - **ErrorType** — strike the "Phase 2 will additionally add a CheckAssignable helper" forward reference (it landed). Promote the helper to a peer of the IsAssignableFrom override + IsSameTypeAs short-circuit so the three-piece cascade-suppression story is told once and accurately. - **ErrorExpr** — strike "Phase 1 introduces this class; no visitor produces it yet." Rephrase the IExprTerm-leak protection clause to cite Compiler.Compile (not the generic <see cref="Compiler"/>). Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Phase 2 of the multi-error type checker initiative. Builds on Phase 1 (#957) by lighting up the diagnostic collector across the two largest visitor files.
ExprVisitorand ~44 inStatementVisitorconverted tohandler.Diagnostics.Report(...)+ recovery valueTypeCheckingUtils.CheckAssignablehelper implements the cascade-suppression rule (anErrorTypeoperand silently passes compatibility checks, so one upstream error doesn't generate a chain of downstream noise)P_COMPILER_COLLECT_ERRORS=1) reports all independent errors in one passRecovery conventions
ExprVisitor: every method visits children first, short-circuits on
ErrorType, and converts each throw toReport + return new ErrorExpr(context).VisitBinExprnotably collapses to a single guard at the top of the switch covering all 14 inner throws.StatementVisitor: statements either continue building the node with what they have (most type-mismatch cases) or return
new NoStmt(context)when a critical declaration is missing (interface / function / state / event).Helpers: the new
CheckAssignableis the single place whereErrorTypecascade suppression forIsAssignableFromhappens.PLanguageType.IsSameTypeAsalready had the symmetric short-circuit from Phase 1.Tests
Updated: Phase1DormancyTest split
The Phase 1 fixture asserted bit-identical stderr in both modes. That's no longer the right invariant for
StaticError/inputs in Phase 2 since collecting mode can legitimately report MORE errors. Split into:StrictAndCollectingAgreeOnValidPrograms(Correct/ only): both modes empty stderr + exit 0CollectingReportsAtLeastAsManyErrorsAsStrict(StaticError/ only): both exit 1, collecting count ≥ strict count. Catches the regression that matters most — collecting silently suppressing a strict error.New: MultipleErrors.p + MultiErrorAcceptanceTest.cs
A curated P file with 4 deliberately-independent errors (bool→int, missing decl, int+string, send-arity mismatch). Pinned counts:
A change in either count signals a specific regression — see the test's class doc for diagnostic guidance.
Phase 3 (next, deferred)
Pass-level tolerance classification in
Analyzer.cs— decide per pass whether it can run on a partially-broken AST (capability check and module-system passes need a clean AST; control-flow check can tolerate per-function errors). Estimated 2–3 days. Will be a separate PR on top of this one.Test plan
Phase1Dormancycategory (split fixture covers Correct + StaticError invariants)MultiErrorAcceptanceTest(pinned 1 vs 4 error counts)P_COMPILER_COLLECT_ERRORS=1 p compileon MultipleErrors.p prints 4 errors🤖 Generated with Claude Code
Strict-mode contract — precise invariants
Following Copilot's review of round 2, the original "strict mode is bit-for-bit unchanged" claim was over-broad. The precise invariants that hold (and that we test):
VisitCtorExpr,VisitFunCallExpr,VisitFormatedString,VisitCtorStmtnow visit child arguments BEFORE the lookup that uses them. If both layers have errors, strict mode now reports the child's first.The reordering is deliberate — visiting children before parents is the natural compiler convention (rustc/swiftc/tsc) and is what lets collecting mode surface internal arg errors when the parent lookup fails (which is the whole point of Phase 2). Gating the reordering behind
ContinueOnErrorwould require 4× duplicated code paths in the affected visitors; the maintenance burden isn't worth the marginal benefit of identical first-error wording in invalid programs.Existing
StaticError/regression tests (the broadest baseline) pass unchanged becauseStaticErrorValidatoronly checksexit code == 1, not the diagnostic text.