feat(spark): axiom_tracker SPARK companion — run #1 (closes #40)#47
Conversation
Proves two key invariants of AxiomTracker::enforce_policy: - 4-level DangerLevel total order (definitional via Ada enum ordering) - cap-at-Reject monotonicity: any Reject usage always produces Rejected Full post-condition covers all four determinism branches (a-d). gnatprove --mode=prove --level=1 summary: 18 checks, 0 unproved, 0 justified Run-time checks: 9 proved (CVC5/Z3/altergo) Assertions: 2 proved Contracts: 3 proved Termination: 4 proved (flow) Files added: spark/README.adoc — module index + how-to-run spark/axiom_tracker/axiom_tracker.ads — spec + contracts spark/axiom_tracker/axiom_tracker.adb — body + loop invariant spark/axiom_tracker/axiom_tracker_proof.adb — PO catalogue (9 obligations) Closes: #40 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
- rsr-antipattern.yml: remove orphaned second Python heredoc block that leaked into shell after the TS-check PYEOF, causing BUILTIN_GLOBS: command not found (exit 127) - quality.yml: update TruffleHog pin to current HEAD of main (7ee2e0fd was a stale snapshot; ba0a524d is current) Same class of bugs already fixed on boj-server; pre-existing on main, not introduced by this PR.
Local
|
| Category | Total | Flow | Provers | Justified | Unproved |
|---|---|---|---|---|---|
| Run-time Checks | 9 | . | 9 (CVC5 46%, Z3 54%) | . | . |
| Assertions | 2 | . | 2 (CVC5 17%, Z3 75%, altergo 8%) | . | . |
| Functional Contracts | 3 | . | 3 (CVC5 43%, Trivial 29%, altergo 29%) | . | . |
| Termination | 4 | 4 (flow) | . | . | . |
| Total | 18 | 4 (22%) | 14 (78%) | 0 | 0 |
Max steps used for successful proof: 69. Analyzed 5 subprograms in Axiom_Tracker (Enforce_Policy: 12 checks, Is_Acceptable: 1, Worst_Danger: 1, package: 0, Policy_Dominates_Usage ghost: 0). axiom_tracker_proof.adb correctly skipped under SPARK_Mode => Off — it's a documentation catalogue, not a SPARK unit.
Drift from PR body
Total counts (9/2/3/4 = 18) match exactly. Prover-percentage split differs by a few points (solver heuristics aren't perfectly deterministic) but Run-time and Assertions remain CVC5+Z3-dominated with Alt-Ergo making minor contributions — the topology is the same. The substantive claim — 18 checks, 0 unproved, 0 justified — holds.
Test plan box 1 (gnatprove --mode=prove --level=1 clean) is now verified.
Test plan box 2 (toolchain prereqs: gnat_native ^14, gprbuild ^22, gnatprove 14.1.1) is satisfied on the runner via #40's PATH wiring in ~/.profile.
CI triage — all 7 failures pre-existing, none introduced by this PRCross-referenced the failing checks against the most recent
What this PR actually changesThe Greens this PR cares about
RecommendationMerge as admin / with required-status overrides. The 7 reds are all repo-level pre-existing issues that should be filed/fixed separately (and several already have natural owners: bump MSRV with |
## Summary SPARK 2014/2022 formal companion for `src/rust/integrity/solver_integrity.rs`, the SHAKE3-512 + BLAKE3 binary integrity layer. Advances the Rust→Rust/SPARK migration roadmap to run #2 (axiom_tracker was run #1, #47). - `spark/solver_integrity/solver_integrity.ads` — Ada spec mirroring `IntegrityStatus` + `IntegrityChecker::verify_solver`'s decision tree, with case-arm `Post`-conditions and named ghost lemmas - `spark/solver_integrity/solver_integrity.adb` — body as a 5-arm if-chain mirroring the Rust source line-for-line - `spark/solver_integrity/solver_integrity_proof.adb` — catalogue of 11 proof obligations with discharge notes - `spark/solver_integrity/solver_integrity.gpr` — GPRbuild project (gnat2022, --level=1) - `spark/README.adoc` — module index updated ## Proved properties | # | Property | Method | |---|---|---| | PO-1 | `Integrity_Status` enumeration totality | Definitional (Ada enum) | | PO-2 | `Classify_Verification` → Verified case-arm | CVC5/Z3 path substitution | | PO-3 | **Hash-mismatch ⇒ Tampered** (issue #40 load-bearing) | CVC5/Z3 path substitution | | PO-4 | `Classify_Verification` → Uninitialized case-arm | CVC5/Z3 path substitution | | PO-5 | `Classify_Verification` → Missing case-arm (disjunction) | CVC5/Z3 path substitution | | PO-6 | `Classify_Verification` never returns Unchecked | Unsat path; SMT immediate | | PO-7 | `Is_Safe` post-condition | Definitional (expression function) | | PO-8 | `Quick_Reverify` Unchanged ⇔ Cached_Match | Trivial | | PO-9 | `Aggregate_Safe` post-condition | Definitional (expression function) | | PO-10 | `Cap_At_Tampered` ghost lemma (Has_Tampered ⇒ ¬Aggregate_Safe) | Quantifier instantiation, --level=1 | | PO-11 | `Hash_Mismatch_Means_Tampered` named ghost lemma | Post-condition application, --level=1 | ## gnatprove summary ``` Total 11 checks, 0 unproved, 0 justified Run-time Checks 2 (CVC5 25%, altergo 75%) Functional Contracts 4 (CVC5 44%, Z3 6%, altergo 50%) Termination 5 (flow analysis) ``` Max steps used: **53**. Toolchain on the agent runner: gnatprove 14.1.1 (Why3 1.6.0, Alt-Ergo 2.4.0, CVC5 1.1.2, Z3), wired via #40. ## SPARK modeling note SPARK doesn't model file IO, async, or cryptographic primitives, so the companion is intentionally a *pure-decision mirror*. The Rust `find_solver_binary` / `compute_shake3_512` / `compute_blake3` / `RwLock` layer is abstracted into four boolean inputs of `Classify_Verification` (`Binary_Found`, `Expected_Is_Placeholder`, `Compute_Succeeded`, `Hashes_Match`). The crypto guarantees come from the `tiny_keccak` (SHAKE3) and `blake3` crates. SPARK's contribution is proving the **decision tree above the crypto layer** is correct and that hash mismatch unambiguously yields `Tampered`. For the multi-solver path (`verify_all`), the SPARK model is `Aggregate_Safe` / `Has_Tampered`: a fleet is safe iff every member is safe, and `Cap_At_Tampered` proves any single Tampered member forces the aggregate unsafe. ## Test plan - [x] `gnatprove --mode=prove --level=1 -P spark/solver_integrity/solver_integrity.gpr` — 11/0/0 (run locally on the wired toolchain at branch tip) - [x] Toolchain prereqs: `gnat_native ^14`, `gprbuild ^22`, `gnatprove 14.1.1` — satisfied via #40 ## CI note Repo-level CI failures on `main` (cargo fmt diffs in `tests/stage1_integration_test.rs`, rustdoc intra-doc-link breaks in `src/rust/provers/pvs.rs`, `Cargo.lock` v4 vs MSRV, missing `just` on MVP Smoke runner, missing `.clusterfuzzlite/Dockerfile`, ubuntu24 vs `erlef/setup-beam`) are pre-existing and unrelated to this PR — see the triage table on #47. This PR adds only `spark/solver_integrity/` and one line to `spark/README.adoc`; touches no Rust, Julia, Elixir, Dockerfile, or Cargo.lock. Expecting admin-merge with the same justification as #47. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
… roadmap) (#50) ## Summary SPARK 2014/2022 formal companion for `src/rust/verification/certificates.rs`, the Alethe / DRAT / TSTP / OpenTheory cert parsing layer. **Completes the initial 3-module Rust→Rust/SPARK roadmap from #40** (run #1 was axiom_tracker #47, run #2 was solver_integrity #49). - `spark/certificates/certificates.ads` — Ada spec for `Certificate_Format`, `Format_Extension`, `Alethe_Step`, `Verify_Alethe` + 3 ghost lemmas - `spark/certificates/certificates.adb` — body (`Get_Format_Extension` six-arm case; ghost lemmas have null bodies) - `spark/certificates/certificates_proof.adb` — 8 proof obligations - `spark/certificates/certificates.gpr`, `alire.toml`, `.gitignore` - `spark/README.adoc` — module index updated ## Proved properties | # | Property | Method | |---|---|---| | PO-1 | `Certificate_Format` / `Format_Extension` enumeration totality | Definitional | | PO-2 | `Get_Format_Extension` exhaustive case-arm Post | CVC5 path substitution | | PO-3 | `Check_Alethe_Step` exact characterisation (Assume ⇒ raw nonempty; Step ⇒ has `:rule`) | Expression function, definitional | | PO-4 | `Verify_Alethe` = ∀ step. valid | Expression function, definitional | | PO-5 | `Has_Malformed` ghost predicate | Expression function, definitional | | PO-6 | **`Cap_At_Malformed`** — any malformed step ⇒ `¬Verify_Alethe` | Quantifier duality, --level=1 | | PO-7 | `Empty_Verifies` — empty input vacuously valid | Empty-range axiom, --level=0 | | PO-8 | **`Format_Extension_Is_Injective`** — distinct formats ⇒ distinct extensions | Six-arm case-split (30 unsat subcases), --level=1 | ## gnatprove summary ``` Total 10 checks, 0 unproved, 0 justified Run-time Checks 2 (CVC5 75%, altergo 25%) Functional Contracts 4 (CVC5 56%, altergo 44%) Termination 4 (flow analysis) ``` Max steps: **230** (the case-split injectivity proof is the work-heaviest VC, as expected). Toolchain on the agent runner: gnatprove 14.1.1 (Why3 1.6.0, Alt-Ergo 2.4.0, CVC5 1.1.2, Z3) — wired via #40. ## Scope and out-of-scope **In scope (mirrored):** - `CertificateFormat` enum (6 variants, same positional order as Rust) - `format_extension` six-arm match (Rust line 253-262) - `check_alethe_step` two-arm match on `AletheStepKind` (Rust line 230-233) - `verify_alethe`'s scan-all-and-reject-on-any-failure decision (Rust line 103-109) **Out of scope (cannot be modeled in SPARK):** - `verify_drat` (Rust line 127-179): shells out to `drat-trim`, depends on filesystem + subprocess + stdout substring parse - Lean4 / Coq kernel checks: subprocess-based - `parse_alethe_steps`: string-level `line.starts_with` predicates; SPARK abstracts the parse result into `Alethe_Step` records with `Raw_Is_Nonempty` / `Has_Rule_Marker` set - BLAKE3 hashing in `ProofCertificate::new`: opaque crypto Equivalence to Rust `verify_alethe` is **strong over the structural validator**; the Rust-side parser-adapter must compute `Raw_Is_Nonempty` and `Has_Rule_Marker` faithfully at the boundary (a constructor obligation, not a SPARK obligation). ## Test plan - [x] `gnatprove --mode=prove --level=1 -P spark/certificates/certificates.gpr` — 10/0/0 locally - [x] Toolchain prereqs satisfied via #40 ## CI note Same pre-existing repo-level failures as #47 / #49 (cargo fmt diffs in untouched test files, rustdoc intra-doc-link breaks in `pvs.rs`, `Cargo.lock` v4 vs MSRV, missing `just` on MVP Smoke, missing `.clusterfuzzlite/Dockerfile`, ubuntu24 vs `erlef/setup-beam`). This PR adds only `spark/certificates/` and one line to `spark/README.adoc` — touches no Rust, Julia, Elixir, Dockerfile, or Cargo.lock. Expecting admin-merge with the same justification as the prior two SPARK runs. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
… 3 new candidates) (#52) ## Summary Refresh the `[migration-roadmap.rust-to-rust-spark]` entry in `STATE.a2ml` so the next monthly routine run has work to do (rather than no-op'ing against the completed initial slate). ### Marks done | Module | PR | Commit | Result | |---|---|---|---| | `axiom_tracker.rs` | #47 | `f18cd5c` | 18 checks, 0 unproved | | `solver_integrity.rs` | #49 | `1f92de7` | 11 checks, 0 unproved | | `certificates.rs` | #50 | `54c03aa` | 10 checks, 0 unproved | ### Adds new candidates (in suggested order) 1. **`src/rust/verification/confidence.rs`** — TrustLevel 5-level hierarchy + `compute_trust_level` priority chain. Same pattern as `axiom_tracker` and *composes* with it via the `DangerLevel` dependency, so the proof chain is now layered, not isolated. 2. **`src/rust/verification/portfolio.rs`** — PortfolioConfidence classification (CrossChecked / SingleSolver / Inconclusive / AllTimedOut). Load-bearing obligation: solver-disagreement ⇒ Inconclusive (never silently passed as CrossChecked). 3. **`src/rust/verification/pareto.rs`** — Pareto dominance order properties + frontier dominance-free invariant. ### Revisions to the note + estimate - **Modeling pattern** documented explicitly: abstract IO / async / strings / crypto as boolean inputs at the boundary, prove only the decision tree above that layer. Focus contracts on totality + cap-at-bad-state monotonicity. - **Per-module estimate revised** from "3-7 days each" to "1-2 hours each" based on actual experience with the first slate. - **Non-candidates expanded** to record what got ruled out during scouting: `mutation.rs` (string-heavy), `statistics.rs` (floats + Bayes), `proof.rs` (data + JSON), `integrity/io.rs` (IO-bound). 🤖 Generated with [Claude Code](https://claude.com/claude-code)
## Summary Three-commit cleanup of pre-existing CI infrastructure debt that surfaced while merging #47. None of these touch product logic. - `ci(rust)`: bump MSRV `1.75 → 1.85` (Cargo.lock v4 needs ≥1.78) and `julia-actions/cache v2 → v3.1.0` (v2 pulled in the deprecated `actions/cache@0c45773b` and GitHub auto-fails such runs). - `docs(rustdoc)`: fix 20 `unresolved-link` errors that `RUSTDOCFLAGS=-D warnings` turns hard. Two patterns — math notation `k[vars]` / `w[n]` / `set[T]` etc. escaped to `\[...\]`; bracketed code spans like `` [`prover_kind`] `` whose target was not in scope had the brackets dropped (kept the code span). Files in `coprocessor/`, `provers/`, `disciplines/mod.rs`, `gnn/{client,guided_search}.rs`. - `style`: `cargo fmt --all` (150 files). Pure formatting; reproduce locally with `cargo fmt --all`. ## Test plan - [ ] CI green on this branch (Rust CI Test Suite / Documentation / Julia Integration / MSRV) - [ ] No diff vs `cargo fmt --all` after merge - [ ] `cargo doc --no-deps --all-features` clean with `RUSTDOCFLAGS=-D warnings` - [ ] Clippy still red — `cargo clippy --all-targets --all-features -- -D warnings` reports 29 errors (unused imports, dead code, idiomatic suggestions). Left for a follow-up; surface area is wider than this PR's scope.
Summary
SPARK 2014/2022 formal companion for
src/rust/verification/axiom_tracker.rs, advancing the Rust→Rust/SPARK migration roadmap to run #1.spark/axiom_tracker/axiom_tracker.ads— Ada spec mirroring the Rust public surface, withPre/Postcontractsspark/axiom_tracker/axiom_tracker.adb— body withLoop_Invariantannotations enabling automatic proofspark/axiom_tracker/axiom_tracker_proof.adb— catalogue of 9 proof obligations with discharge notesspark/axiom_tracker/axiom_tracker.gpr— GPRbuild project (gnat2022, --level=1)spark/README.adoc— module index and toolchain how-toProved properties
DangerLeveltotal order (Safe < Noted < Warning < Reject)Rejectusage →RejectedpolicyWarningpriority branch correctNotedpriority branch correctCleanIs_Acceptableexact boolean characterisationWorst_Dangerexhaustive case equationgnatprove summary
Test plan
gnatprove --mode=prove --level=1 -P spark/axiom_tracker/axiom_tracker.gpr— all 18 checks proved, 0 unprovedgnat_native ^14,gprbuild ^22,gnatprove 14.1.1(via Alire)Closes #40
🤖 Generated with Claude Code