Skip to content

feat(spark): axiom_tracker SPARK companion — run #1 (closes #40)#47

Merged
hyperpolymath merged 2 commits into
mainfrom
spark/axiom-tracker-run-1
May 11, 2026
Merged

feat(spark): axiom_tracker SPARK companion — run #1 (closes #40)#47
hyperpolymath merged 2 commits into
mainfrom
spark/axiom-tracker-run-1

Conversation

@hyperpolymath
Copy link
Copy Markdown
Owner

Summary

SPARK 2014/2022 formal companion for src/rust/verification/axiom_tracker.rs, advancing the Rust→Rust/SPARK migration roadmap to run #1.

  • spark/axiom_tracker/axiom_tracker.ads — Ada spec mirroring the Rust public surface, with Pre/Post contracts
  • spark/axiom_tracker/axiom_tracker.adb — body with Loop_Invariant annotations enabling automatic proof
  • spark/axiom_tracker/axiom_tracker_proof.adb — catalogue of 9 proof obligations with discharge notes
  • spark/axiom_tracker/axiom_tracker.gpr — GPRbuild project (gnat2022, --level=1)
  • spark/README.adoc — module index and toolchain how-to

Proved properties

# Property Method
PO-1 DangerLevel total order (Safe < Noted < Warning < Reject) Definitional (Ada enum ordering)
PO-2 Cap-at-Reject monotonicity: any Reject usage → Rejected policy Loop invariant + CVC5/Z3
PO-3 Warning priority branch correct Loop invariant + CVC5/Z3
PO-4 Noted priority branch correct Loop invariant + CVC5/Z3
PO-5 Empty input → Clean SMT empty-range axiom
PO-6/7 Loop invariant establishment + preservation CVC5/Z3
PO-8 Is_Acceptable exact boolean characterisation Trivial
PO-9 Worst_Danger exhaustive case equation Trivial

gnatprove summary

Total  18 checks, 0 unproved, 0 justified
  Run-time Checks      9  (CVC5 42%, Z3 54%, altergo 4%)
  Assertions           2  (CVC5 25%, Z3 67%, altergo 8%)
  Functional Contracts 3  (CVC5 29%, Trivial 29%, Z3 7%, altergo 36%)
  Termination          4  (flow analysis)

Test plan

  • gnatprove --mode=prove --level=1 -P spark/axiom_tracker/axiom_tracker.gpr — all 18 checks proved, 0 unproved
  • Toolchain prereqs: gnat_native ^14, gprbuild ^22, gnatprove 14.1.1 (via Alire)

Closes #40

🤖 Generated with Claude Code

Proves two key invariants of AxiomTracker::enforce_policy:

  - 4-level DangerLevel total order (definitional via Ada enum ordering)

  - cap-at-Reject monotonicity: any Reject usage always produces Rejected
Full post-condition covers all four determinism branches (a-d).

gnatprove --mode=prove --level=1 summary:

  18 checks, 0 unproved, 0 justified

  Run-time checks: 9 proved (CVC5/Z3/altergo)

  Assertions:      2 proved

  Contracts:       3 proved

  Termination:     4 proved (flow)

Files added:
  spark/README.adoc                         — module index + how-to-run
  spark/axiom_tracker/axiom_tracker.ads     — spec + contracts
  spark/axiom_tracker/axiom_tracker.adb     — body + loop invariant
  spark/axiom_tracker/axiom_tracker_proof.adb — PO catalogue (9 obligations)

Closes: #40

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

- rsr-antipattern.yml: remove orphaned second Python heredoc block
  that leaked into shell after the TS-check PYEOF, causing
  BUILTIN_GLOBS: command not found (exit 127)
- quality.yml: update TruffleHog pin to current HEAD of main
  (7ee2e0fd was a stale snapshot; ba0a524d is current)

Same class of bugs already fixed on boj-server; pre-existing on main,
not introduced by this PR.
@hyperpolymath
Copy link
Copy Markdown
Owner Author

Local gnatprove reproduction — confirms 18/0/0

Reproduced the proof claim against this branch (spark/axiom-tracker-run-1 @ d043d6e) with the toolchain newly wired on the runner (gnatprove 14.1.1, Why3 1.6.0, Alt-Ergo 2.4.0, CVC5 1.1.2, Z3 — all via Alire). Toolchain unblock tracked in #40.

$ gnatprove -P spark/axiom_tracker/axiom_tracker.gpr --mode=prove --level=1 --report=all -j0
Phase 1 of 3: generation of data representation information ...
Phase 2 of 3: generation of Global contracts ...
Phase 3 of 3: flow analysis and proof ...
axiom_tracker.adb:36:13: info: loop invariant preservation proved
axiom_tracker.adb:36:13: info: loop invariant initialization proved
axiom_tracker.adb:37:38: info: index check proved
axiom_tracker.adb:40:38: info: index check proved
axiom_tracker.adb:43:38: info: index check proved
axiom_tracker.ads:61:13: info: implicit aspect Always_Terminates on "Enforce_Policy" has been proved, subprogram will terminate
axiom_tracker.ads:64:08: info: postcondition proved
axiom_tracker.ads:64:51: info: index check proved
axiom_tracker.ads:68:50: info: index check proved
axiom_tracker.ads:69:54: info: index check proved
axiom_tracker.ads:73:50: info: index check proved
axiom_tracker.ads:74:53: info: index check proved
axiom_tracker.ads:75:54: info: index check proved
axiom_tracker.ads:85:13: info: implicit aspect Always_Terminates on "Is_Acceptable" has been proved, subprogram will terminate
axiom_tracker.ads:86:19: info: postcondition proved
axiom_tracker.ads:92:13: info: implicit aspect Always_Terminates on "Worst_Danger" has been proved, subprogram will terminate
axiom_tracker.ads:94:08: info: postcondition proved
axiom_tracker.ads:107:13: info: implicit aspect Always_Terminates on "Policy_Dominates_Usage" has been proved, subprogram will terminate
$ echo $?
0

.obj/gnatprove/gnatprove.out summary

Category Total Flow Provers Justified Unproved
Run-time Checks 9 . 9 (CVC5 46%, Z3 54%) . .
Assertions 2 . 2 (CVC5 17%, Z3 75%, altergo 8%) . .
Functional Contracts 3 . 3 (CVC5 43%, Trivial 29%, altergo 29%) . .
Termination 4 4 (flow) . . .
Total 18 4 (22%) 14 (78%) 0 0

Max steps used for successful proof: 69. Analyzed 5 subprograms in Axiom_Tracker (Enforce_Policy: 12 checks, Is_Acceptable: 1, Worst_Danger: 1, package: 0, Policy_Dominates_Usage ghost: 0). axiom_tracker_proof.adb correctly skipped under SPARK_Mode => Off — it's a documentation catalogue, not a SPARK unit.

Drift from PR body

Total counts (9/2/3/4 = 18) match exactly. Prover-percentage split differs by a few points (solver heuristics aren't perfectly deterministic) but Run-time and Assertions remain CVC5+Z3-dominated with Alt-Ergo making minor contributions — the topology is the same. The substantive claim — 18 checks, 0 unproved, 0 justified — holds.

Test plan box 1 (gnatprove --mode=prove --level=1 clean) is now verified.
Test plan box 2 (toolchain prereqs: gnat_native ^14, gprbuild ^22, gnatprove 14.1.1) is satisfied on the runner via #40's PATH wiring in ~/.profile.

@hyperpolymath
Copy link
Copy Markdown
Owner Author

CI triage — all 7 failures pre-existing, none introduced by this PR

Cross-referenced the failing checks against the most recent Rust CI run on main (run 25478566351, 2026-05-07) and against the repo state at main. The four Rust CI failures reproduce identically on main; the remaining three are runner-image / missing-file issues that this PR doesn't touch.

# Check Root cause Where it lives Pre-exists on main?
1 Rust CI / Test Suite cargo fmt --check diffs: use ordering + match-arm trailing commas in tests/stage1_integration_test.rs and the proptest strategy file. tests/stage1_integration_test.rs:18, 50, 57 (file untouched by this PR) Yes — same fmt diffs on main's copy
2 Rust CI / Documentation rustdoc parses set[T] / sequence[T] in doc comments as intra-doc-links → unresolved link to T. src/rust/provers/pvs.rs:165, 167 (file untouched by this PR) Yes
3 Rust CI / Julia Integration GitHub auto-fails workflows using actions/cache@0c45773b… — that SHA is on the v1/v2 deprecation list. Job uses julia-actions/cache which pulls deprecated actions/cache transitively Yes — pure infra
4 Rust CI / Minimum Supported Rust Version Cargo.lock is version 4 on main; the MSRV-pinned cargo can't parse it: "lock file version 4 was found, but this version of Cargo does not understand this lock file". Cargo.lock head says version = 4 (untouched by this PR) Yes — MSRV needs bumping
5 Hypatia Neurosymbolic Analysis erlef/setup-beam only knows ubuntu18/20/22; runner image is ubuntu-24.04. Action errors before any project code runs. .github/workflows/ Hypatia job Yes — runner-image vs action incompat
6 MVP Smoke just: command not found — workflow runs just --justfile Justfile mvp-env but just is not installed on the runner (the Rust crate builds fine first). MVP Smoke workflow setup step missing just install Yes
7 ClusterFuzzLite PR (address) Docker build fails: failed to read dockerfile: open Dockerfile: no such file or directory. Repo has .clusterfuzzlite/Containerfile, build.sh, project.yaml but no Dockerfile. .clusterfuzzlite/ Yes — missing file on main

What this PR actually changes

M  .github/workflows/quality.yml         (1 line: trufflehog SHA bump)
M  .github/workflows/rsr-antipattern.yml (-77 lines: removed duplicate TS-allowlist check block)
A  spark/README.adoc
A  spark/axiom_tracker/.gitignore
A  spark/axiom_tracker/alire.toml
A  spark/axiom_tracker/axiom_tracker.adb
A  spark/axiom_tracker/axiom_tracker.ads
A  spark/axiom_tracker/axiom_tracker.gpr
A  spark/axiom_tracker/axiom_tracker_proof.adb

The rsr-antipattern.yml reduction is dedup — the file had two copies of the "TypeScript files outside allowlist" check (count went from 2 → 1, canonical one earlier in the file survives). antipattern-check job passes on this PR.

Greens this PR cares about

  • Live Provers (all 9): z3, cvc5, vampire, eprover, spass, alt-ergo, glpk, minizinc, chuffed — all pass
  • CodeQL (rust + js/ts) — pass
  • Security Audit — pass
  • lint (Code Quality) — pass
  • docs (Code Quality) — pass
  • Security Policy, Guix/Nix Package Policy, NPM/Bun Blocker, TypeScript/JavaScript Blocker, antipattern-check — all pass
  • Code Coverage — pass

Recommendation

Merge as admin / with required-status overrides. The 7 reds are all repo-level pre-existing issues that should be filed/fixed separately (and several already have natural owners: bump MSRV with Cargo.lock v4, install just on MVP Smoke runner, swap erlef/setup-beam for an ubuntu24-aware version, add .clusterfuzzlite/Dockerfile or repoint to Containerfile, run cargo fmt, fix the pvs.rs doc-link escapes, bump julia-actions/cache). None of them gate the correctness of an Ada/SPARK module that gnatprove has already discharged at 18/0/0.

@hyperpolymath hyperpolymath merged commit f18cd5c into main May 11, 2026
32 of 39 checks passed
@hyperpolymath hyperpolymath deleted the spark/axiom-tracker-run-1 branch May 11, 2026 06:43
hyperpolymath added a commit that referenced this pull request May 11, 2026
## Summary

SPARK 2014/2022 formal companion for
`src/rust/integrity/solver_integrity.rs`, the SHAKE3-512 + BLAKE3 binary
integrity layer. Advances the Rust→Rust/SPARK migration roadmap to run
#2 (axiom_tracker was run #1, #47).

- `spark/solver_integrity/solver_integrity.ads` — Ada spec mirroring
`IntegrityStatus` + `IntegrityChecker::verify_solver`'s decision tree,
with case-arm `Post`-conditions and named ghost lemmas
- `spark/solver_integrity/solver_integrity.adb` — body as a 5-arm
if-chain mirroring the Rust source line-for-line
- `spark/solver_integrity/solver_integrity_proof.adb` — catalogue of 11
proof obligations with discharge notes
- `spark/solver_integrity/solver_integrity.gpr` — GPRbuild project
(gnat2022, --level=1)
- `spark/README.adoc` — module index updated

## Proved properties

| # | Property | Method |
|---|---|---|
| PO-1 | `Integrity_Status` enumeration totality | Definitional (Ada
enum) |
| PO-2 | `Classify_Verification` → Verified case-arm | CVC5/Z3 path
substitution |
| PO-3 | **Hash-mismatch ⇒ Tampered** (issue #40 load-bearing) | CVC5/Z3
path substitution |
| PO-4 | `Classify_Verification` → Uninitialized case-arm | CVC5/Z3 path
substitution |
| PO-5 | `Classify_Verification` → Missing case-arm (disjunction) |
CVC5/Z3 path substitution |
| PO-6 | `Classify_Verification` never returns Unchecked | Unsat path;
SMT immediate |
| PO-7 | `Is_Safe` post-condition | Definitional (expression function) |
| PO-8 | `Quick_Reverify` Unchanged ⇔ Cached_Match | Trivial |
| PO-9 | `Aggregate_Safe` post-condition | Definitional (expression
function) |
| PO-10 | `Cap_At_Tampered` ghost lemma (Has_Tampered ⇒ ¬Aggregate_Safe)
| Quantifier instantiation, --level=1 |
| PO-11 | `Hash_Mismatch_Means_Tampered` named ghost lemma |
Post-condition application, --level=1 |

## gnatprove summary

```
Total  11 checks, 0 unproved, 0 justified
  Run-time Checks      2  (CVC5 25%, altergo 75%)
  Functional Contracts 4  (CVC5 44%, Z3 6%, altergo 50%)
  Termination          5  (flow analysis)
```

Max steps used: **53**. Toolchain on the agent runner: gnatprove 14.1.1
(Why3 1.6.0, Alt-Ergo 2.4.0, CVC5 1.1.2, Z3), wired via #40.

## SPARK modeling note

SPARK doesn't model file IO, async, or cryptographic primitives, so the
companion is intentionally a *pure-decision mirror*. The Rust
`find_solver_binary` / `compute_shake3_512` / `compute_blake3` /
`RwLock` layer is abstracted into four boolean inputs of
`Classify_Verification` (`Binary_Found`, `Expected_Is_Placeholder`,
`Compute_Succeeded`, `Hashes_Match`). The crypto guarantees come from
the `tiny_keccak` (SHAKE3) and `blake3` crates. SPARK's contribution is
proving the **decision tree above the crypto layer** is correct and that
hash mismatch unambiguously yields `Tampered`.

For the multi-solver path (`verify_all`), the SPARK model is
`Aggregate_Safe` / `Has_Tampered`: a fleet is safe iff every member is
safe, and `Cap_At_Tampered` proves any single Tampered member forces the
aggregate unsafe.

## Test plan

- [x] `gnatprove --mode=prove --level=1 -P
spark/solver_integrity/solver_integrity.gpr` — 11/0/0 (run locally on
the wired toolchain at branch tip)
- [x] Toolchain prereqs: `gnat_native ^14`, `gprbuild ^22`, `gnatprove
14.1.1` — satisfied via #40

## CI note

Repo-level CI failures on `main` (cargo fmt diffs in
`tests/stage1_integration_test.rs`, rustdoc intra-doc-link breaks in
`src/rust/provers/pvs.rs`, `Cargo.lock` v4 vs MSRV, missing `just` on
MVP Smoke runner, missing `.clusterfuzzlite/Dockerfile`, ubuntu24 vs
`erlef/setup-beam`) are pre-existing and unrelated to this PR — see the
triage table on #47. This PR adds only `spark/solver_integrity/` and one
line to `spark/README.adoc`; touches no Rust, Julia, Elixir, Dockerfile,
or Cargo.lock. Expecting admin-merge with the same justification as #47.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
hyperpolymath added a commit that referenced this pull request May 11, 2026
… roadmap) (#50)

## Summary

SPARK 2014/2022 formal companion for
`src/rust/verification/certificates.rs`, the Alethe / DRAT / TSTP /
OpenTheory cert parsing layer. **Completes the initial 3-module
Rust→Rust/SPARK roadmap from #40** (run #1 was axiom_tracker #47, run #2
was solver_integrity #49).

- `spark/certificates/certificates.ads` — Ada spec for
`Certificate_Format`, `Format_Extension`, `Alethe_Step`, `Verify_Alethe`
+ 3 ghost lemmas
- `spark/certificates/certificates.adb` — body (`Get_Format_Extension`
six-arm case; ghost lemmas have null bodies)
- `spark/certificates/certificates_proof.adb` — 8 proof obligations
- `spark/certificates/certificates.gpr`, `alire.toml`, `.gitignore`
- `spark/README.adoc` — module index updated

## Proved properties

| # | Property | Method |
|---|---|---|
| PO-1 | `Certificate_Format` / `Format_Extension` enumeration totality
| Definitional |
| PO-2 | `Get_Format_Extension` exhaustive case-arm Post | CVC5 path
substitution |
| PO-3 | `Check_Alethe_Step` exact characterisation (Assume ⇒ raw
nonempty; Step ⇒ has `:rule`) | Expression function, definitional |
| PO-4 | `Verify_Alethe` = ∀ step. valid | Expression function,
definitional |
| PO-5 | `Has_Malformed` ghost predicate | Expression function,
definitional |
| PO-6 | **`Cap_At_Malformed`** — any malformed step ⇒ `¬Verify_Alethe`
| Quantifier duality, --level=1 |
| PO-7 | `Empty_Verifies` — empty input vacuously valid | Empty-range
axiom, --level=0 |
| PO-8 | **`Format_Extension_Is_Injective`** — distinct formats ⇒
distinct extensions | Six-arm case-split (30 unsat subcases), --level=1
|

## gnatprove summary

```
Total  10 checks, 0 unproved, 0 justified
  Run-time Checks      2  (CVC5 75%, altergo 25%)
  Functional Contracts 4  (CVC5 56%, altergo 44%)
  Termination          4  (flow analysis)
```

Max steps: **230** (the case-split injectivity proof is the
work-heaviest VC, as expected). Toolchain on the agent runner: gnatprove
14.1.1 (Why3 1.6.0, Alt-Ergo 2.4.0, CVC5 1.1.2, Z3) — wired via #40.

## Scope and out-of-scope

**In scope (mirrored):**
- `CertificateFormat` enum (6 variants, same positional order as Rust)
- `format_extension` six-arm match (Rust line 253-262)
- `check_alethe_step` two-arm match on `AletheStepKind` (Rust line
230-233)
- `verify_alethe`'s scan-all-and-reject-on-any-failure decision (Rust
line 103-109)

**Out of scope (cannot be modeled in SPARK):**
- `verify_drat` (Rust line 127-179): shells out to `drat-trim`, depends
on filesystem + subprocess + stdout substring parse
- Lean4 / Coq kernel checks: subprocess-based
- `parse_alethe_steps`: string-level `line.starts_with` predicates;
SPARK abstracts the parse result into `Alethe_Step` records with
`Raw_Is_Nonempty` / `Has_Rule_Marker` set
- BLAKE3 hashing in `ProofCertificate::new`: opaque crypto

Equivalence to Rust `verify_alethe` is **strong over the structural
validator**; the Rust-side parser-adapter must compute `Raw_Is_Nonempty`
and `Has_Rule_Marker` faithfully at the boundary (a constructor
obligation, not a SPARK obligation).

## Test plan

- [x] `gnatprove --mode=prove --level=1 -P
spark/certificates/certificates.gpr` — 10/0/0 locally
- [x] Toolchain prereqs satisfied via #40

## CI note

Same pre-existing repo-level failures as #47 / #49 (cargo fmt diffs in
untouched test files, rustdoc intra-doc-link breaks in `pvs.rs`,
`Cargo.lock` v4 vs MSRV, missing `just` on MVP Smoke, missing
`.clusterfuzzlite/Dockerfile`, ubuntu24 vs `erlef/setup-beam`). This PR
adds only `spark/certificates/` and one line to `spark/README.adoc` —
touches no Rust, Julia, Elixir, Dockerfile, or Cargo.lock. Expecting
admin-merge with the same justification as the prior two SPARK runs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
hyperpolymath added a commit that referenced this pull request May 11, 2026
… 3 new candidates) (#52)

## Summary

Refresh the `[migration-roadmap.rust-to-rust-spark]` entry in
`STATE.a2ml` so the next monthly routine run has work to do (rather than
no-op'ing against the completed initial slate).

### Marks done

| Module | PR | Commit | Result |
|---|---|---|---|
| `axiom_tracker.rs` | #47 | `f18cd5c` | 18 checks, 0 unproved |
| `solver_integrity.rs` | #49 | `1f92de7` | 11 checks, 0 unproved |
| `certificates.rs` | #50 | `54c03aa` | 10 checks, 0 unproved |

### Adds new candidates (in suggested order)

1. **`src/rust/verification/confidence.rs`** — TrustLevel 5-level
hierarchy + `compute_trust_level` priority chain. Same pattern as
`axiom_tracker` and *composes* with it via the `DangerLevel` dependency,
so the proof chain is now layered, not isolated.
2. **`src/rust/verification/portfolio.rs`** — PortfolioConfidence
classification (CrossChecked / SingleSolver / Inconclusive /
AllTimedOut). Load-bearing obligation: solver-disagreement ⇒
Inconclusive (never silently passed as CrossChecked).
3. **`src/rust/verification/pareto.rs`** — Pareto dominance order
properties + frontier dominance-free invariant.

### Revisions to the note + estimate

- **Modeling pattern** documented explicitly: abstract IO / async /
strings / crypto as boolean inputs at the boundary, prove only the
decision tree above that layer. Focus contracts on totality +
cap-at-bad-state monotonicity.
- **Per-module estimate revised** from "3-7 days each" to "1-2 hours
each" based on actual experience with the first slate.
- **Non-candidates expanded** to record what got ruled out during
scouting: `mutation.rs` (string-heavy), `statistics.rs` (floats +
Bayes), `proof.rs` (data + JSON), `integrity/io.rs` (IO-bound).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
hyperpolymath added a commit that referenced this pull request May 11, 2026
## Summary

Three-commit cleanup of pre-existing CI infrastructure debt that
surfaced while merging #47. None of these touch product logic.

- `ci(rust)`: bump MSRV `1.75 → 1.85` (Cargo.lock v4 needs ≥1.78) and
`julia-actions/cache v2 → v3.1.0` (v2 pulled in the deprecated
`actions/cache@0c45773b` and GitHub auto-fails such runs).
- `docs(rustdoc)`: fix 20 `unresolved-link` errors that `RUSTDOCFLAGS=-D
warnings` turns hard. Two patterns — math notation `k[vars]` / `w[n]` /
`set[T]` etc. escaped to `\[...\]`; bracketed code spans like ``
[`prover_kind`] `` whose target was not in scope had the brackets
dropped (kept the code span). Files in `coprocessor/`, `provers/`,
`disciplines/mod.rs`, `gnn/{client,guided_search}.rs`.
- `style`: `cargo fmt --all` (150 files). Pure formatting; reproduce
locally with `cargo fmt --all`.

## Test plan

- [ ] CI green on this branch (Rust CI Test Suite / Documentation /
Julia Integration / MSRV)
- [ ] No diff vs `cargo fmt --all` after merge
- [ ] `cargo doc --no-deps --all-features` clean with `RUSTDOCFLAGS=-D
warnings`
- [ ] Clippy still red — `cargo clippy --all-targets --all-features --
-D warnings` reports 29 errors (unused imports, dead code, idiomatic
suggestions). Left for a follow-up; surface area is wider than this PR's
scope.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

infra: install GNAT/SPARK toolchain on Claude Code runner (blocks Rust→Rust/SPARK migration routine)

1 participant