github
diff --git a/‎.github/workflows/daily-security-observability.lock.yml‎
Lines changed: 1 addition & 1 deletion b/‎.github/workflows/daily-security-observability.lock.yml‎
Lines changed: 1 addition & 1 deletion
diff --git a/‎docs/adr/40126-formal-compliance-test-suite-for-forecast-engine.md‎
Lines changed: 39 additions & 0 deletions b/‎docs/adr/40126-formal-compliance-test-suite-for-forecast-engine.md‎
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+# ADR-40126: Formal Compliance Test Suite for the Monte Carlo Forecast Engine
+
+**Date**: 2026-06-19
+**Status**: Draft
+
+## Context
+
+The Monte Carlo forecast engine (`pkg/cli/forecast.go`, `forecast_montecarlo.go`) is governed by a formal specification under `specs/forecast-compliance-fixtures/README.md`, with numbered requirements (R-MC-*, R-FC-*, R-CLI-*) and acceptance tests T-FC-031–T-FC-040. The canonical fixture `run_summary_minimal.json` set `total_effective_tokens` but omitted `total_aic`; because `forecast.go:593` gates on `runAIC ≤ 0 → continue`, every fixture-backed run was silently skipped and produced no Monte Carlo input. This field-name gap between what the fixture published and what the engine reads (`TokenUsage.TotalAIC`) went undetected because no test bound the fixture schema to the engine's read path. The engine's numeric invariants (Bernoulli success collapse, observed-rate and yield formulas, the λ=15 Poisson crossover, the n≥10 reliability threshold, nil-on-degenerate-λ behavior) were likewise unguarded by executable tests mapped to the spec.
+
+## Decision
+
+We will add a formal compliance test suite, `pkg/cli/forecast_compliance_fixtures_formal_test.go`, that maps thirteen predicates (P1–P13) one-to-one onto the formal model in `specs/forecast-compliance-fixtures/README.md`. Each predicate is expressed as a table-driven Go test that exercises the real engine functions (`runMonteCarlo`, `useNormalApproximationForPoisson`, `RunForecast`) and the real fixture files, with the formal cross-reference (TLA+ invariant, F* pre/post condition, or Z3-SMT schema gap) recorded in the test doc comment. We will also fix the fixture by adding `"total_aic": 0.0054`, and P12/P13 explicitly guard against regression of that schema gap. This follows the existing repository convention for formal conformance suites (see [ADR-38166](38166-formal-conformance-test-suite-for-compiler-threat-detection.md)).
+
+## Alternatives Considered
+
+### Alternative 1: Fix the fixture only, no test suite
+Add `total_aic` to the fixture and stop. This restores correct behavior but leaves the schema-mapping gap unguarded — the same field could be dropped or renamed again with no failing test. Rejected because the original defect was caused precisely by the absence of a test binding fixture keys to the engine's read path.
+
+### Alternative 2: Generic unit tests without spec mapping
+Write conventional unit tests covering the engine's numeric behavior without the explicit P1–P13 predicate-to-specification mapping. This would catch regressions but would not make conformance to the formal model auditable, and would diverge from the established formal-conformance pattern already used elsewhere in the codebase. Rejected for weaker traceability between requirement and test.
+
+## Consequences
+
+### Positive
+- The fixture-to-engine field mapping (`total_aic`) is now enforced by P1/P12/P13; the silent-skip defect cannot regress unnoticed.
+- Each formal requirement has a named, executable test, giving auditable traceability between `specs/forecast-compliance-fixtures/README.md` and `pkg/cli`.
+- Numeric invariants (Poisson crossover at λ=15, n≥10 reliability, degenerate-λ nil handling) are now pinned to exact boundary values.
+
+### Negative
+- The test file replicates a small amount of engine logic (e.g. the duration derivation in P10 mirrors `forecast.go:573-574`); the replica must be kept in sync if the engine changes.
+- Thirteen predicates add maintenance surface that must track future edits to the formal specification.
+
+### Neutral
+- Tests are gated behind `//go:build !integration` and run against in-repo fixtures, so they require no GitHub authentication.
+- P11 tolerates non-validation errors from `RunForecast` (e.g. missing auth) to remain runnable in CI sandboxes, asserting only on the days-validation message.
+
+---
+
+*This is a DRAFT ADR generated by the [Design Decision Gate](https://github.com/github/gh-aw/actions/runs/27797333197) workflow. The PR author must review, complete, and finalize this document before the PR can merge.*