Create STATE.scm project documentation file #2
Merged
hyperpolymath merged 1 commit on Dec 8, 2025
Conversation
Create stateful context tracking file following the hyperpolymath/STATE.scm format to maintain continuity across Claude sessions.

Includes:
- Current position: 9/12 provers (75%) complete
- Route to MVP v1: GitLab deployment, prover installation
- Issues: 5 tracked blockers with severity
- Questions: 4 decisions for user input
- 12-month roadmap with milestones
- History snapshots for velocity tracking
hyperpolymath added a commit that referenced this pull request on Apr 18, 2026
ACL2 (277 → 170 455, 615×):
- Sparse clone acl2/acl2 at external_corpora/acl2_full/ with `books/` checked out (~13 478 .lisp files, 1.5 GB).
- Walk recursively alongside the legacy curated cache.
- Accept .lisp and .acl2 files.
- Fix UTF-8 byte-index slice crash (switch to first(s, n)).
- ACL2 vaults to the #2 corpus after Lean — Community Books are dense with named defthm declarations across hundreds of books.

MiniZinc (29 → 327, 11×):
- Clone MiniZinc/minizinc-benchmarks at external_corpora/minizinc_full/ (~300 .mzn models plus many .dzn data files).
- Walk recursively.
- Wrap three eachmatch calls in try/catch — large amalgamated data files caused PCRE catastrophic backtracking.

Count growth is limited by the extractor's gate (it requires at least one variable or constraint); further uplift needs per-file lowering or data-file handling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
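The byte-index slice crash is a general UTF-8 hazard, not a Julia quirk. A minimal Python analogue (hypothetical helper names; the actual extractor fix is Julia's `first(s, n)`, which counts characters rather than bytes): slicing a string by code points is always safe, while truncating the underlying UTF-8 bytes can split a multibyte character and fail to decode.

```python
def safe_truncate(s: str, n: int) -> str:
    # Character-count truncation, analogous to Julia's first(s, n):
    # Python str slicing is code-point indexed, so it can never split
    # a character.
    return s[:n]

def unsafe_byte_truncate(s: str, n: int) -> str:
    # Byte-index truncation: raises UnicodeDecodeError whenever byte n
    # lands inside a multibyte character -- the class of crash the
    # commit fixes.
    return s.encode("utf-8")[:n].decode("utf-8")

text = "défthm"  # 'é' occupies two bytes (0xC3 0xA9) in UTF-8
print(safe_truncate(text, 2))  # "dé"
try:
    unsafe_byte_truncate(text, 2)  # byte 2 splits 'é'
except UnicodeDecodeError:
    print("byte-index slice split a multibyte character")
```

The same reasoning explains why the fix is a one-liner: switching from a byte offset to a character count sidesteps the boundary problem entirely.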
hyperpolymath added a commit that referenced this pull request on Apr 18, 2026
Tier C (6 provers) lifted from 0 records each — TLAPS 103, Alloy 136, Boogie 1 319, Frama-C 1 317, Viper 3 541, Tamarin 4 595 = +11 011 records total. Extractors already existed; they just needed upstream corpora vendored at the expected paths.

Tier A boost: ACL2 277 → 170 455 (#2 corpus after Lean) from acl2/acl2 books/; MiniZinc 29 → 327. Both committed in 5348c71.

Note: training_data/*.jsonl outputs are large (some > 100 MB) and cannot be pushed to GitHub under the current on-repo storage policy. The extractors plus the cloned corpora are the reproducible source of truth — regenerate with `just extract-<prover>` when needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request on Apr 18, 2026
End-of-session summary. The top-3 corpus ranking flipped: ACL2 (170 K) is now #2 after Lean (131 K), ahead of the previous #2 (Isabelle 50 K).

Tier breakdown of session work:
- Tier A (6 provers): +211 188
- Tier B (4 provers): +41 096
- ACL2 + MiniZinc (2): +170 476
- Tier C wave 1 (6 — TLAPS/Alloy/Boogie/Frama-C/Viper/Tamarin): +11 011
- Tier C wave 2 (4 — Spin/SeaHorn/KeY/Prism): +1 920
- Tier C wave 3 (5 — dReal/Cameleer/Abella/Dedukti/Arend): +6 532

Still missing: ProVerif (clone empty), NuSMV/UPPAAL (binary distros), CBMC (1.5 GB), ABC/CaDiCaL/Kissat/MiniSat (SAT), Isabelle_ZF/Nitpick/Nunchaku/Mercury/Athena.

Low yield, needs extractor tuning: Naproche (19 files → 0), Matita (238 → 17), Arend (1 → 7).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
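The ACL2 + MiniZinc line in the breakdown can be cross-checked against the before/after counts and multipliers quoted in the earlier commit message. A few lines of arithmetic (all numbers taken from this conversation):

```python
# Before/after record counts from the session's commit messages.
acl2_before, acl2_after = 277, 170_455
mzn_before, mzn_after = 29, 327

acl2_delta = acl2_after - acl2_before  # 170 178 new ACL2 records
mzn_delta = mzn_after - mzn_before     # 298 new MiniZinc records

# Matches the "+170 476" ACL2 + MiniZinc line in the tier breakdown.
print(acl2_delta + mzn_delta)           # 170476

# Matches the 615x and 11x multipliers quoted for the two corpora.
print(round(acl2_after / acl2_before))  # 615
print(round(mzn_after / mzn_before))    # 11
```

The deltas, totals, and multipliers are mutually consistent, which is a useful sanity check when tallying a multi-wave session like this one.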
hyperpolymath added a commit that referenced this pull request on Apr 20, 2026
Create stateful context tracking file following the hyperpolymath/STATE.scm format to maintain continuity across Claude sessions.

Includes:
- Current position: 9/12 provers (75%) complete
- Route to MVP v1: GitLab deployment, prover installation
- Issues: 5 tracked blockers with severity
- Questions: 4 decisions for user input
- 12-month roadmap with milestones
- History snapshots for velocity tracking

Co-authored-by: Claude <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request on Apr 20, 2026
ACL2 (277 → 170 455, 615×):
- Sparse clone acl2/acl2 at external_corpora/acl2_full/ with `books/` checked out (~13 478 .lisp files, 1.5 GB).
- Walk recursively alongside the legacy curated cache.
- Accept .lisp and .acl2 files.
- Fix UTF-8 byte-index slice crash (switch to first(s, n)).
- ACL2 vaults to the #2 corpus after Lean — Community Books are dense with named defthm declarations across hundreds of books.

MiniZinc (29 → 327, 11×):
- Clone MiniZinc/minizinc-benchmarks at external_corpora/minizinc_full/ (~300 .mzn models plus many .dzn data files).
- Walk recursively.
- Wrap three eachmatch calls in try/catch — large amalgamated data files caused PCRE catastrophic backtracking.

Count growth is limited by the extractor's gate (it requires at least one variable or constraint); further uplift needs per-file lowering or data-file handling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request on Apr 20, 2026
Tier C (6 provers) lifted from 0 records each — TLAPS 103, Alloy 136, Boogie 1 319, Frama-C 1 317, Viper 3 541, Tamarin 4 595 = +11 011 records total. Extractors already existed; they just needed upstream corpora vendored at the expected paths.

Tier A boost: ACL2 277 → 170 455 (#2 corpus after Lean) from acl2/acl2 books/; MiniZinc 29 → 327. Both committed in 618d6b9.

Note: training_data/*.jsonl outputs are large (some > 100 MB) and cannot be pushed to GitHub under the current on-repo storage policy. The extractors plus the cloned corpora are the reproducible source of truth — regenerate with `just extract-<prover>` when needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request on Apr 20, 2026
End-of-session summary. The top-3 corpus ranking flipped: ACL2 (170 K) is now #2 after Lean (131 K), ahead of the previous #2 (Isabelle 50 K).

Tier breakdown of session work:
- Tier A (6 provers): +211 188
- Tier B (4 provers): +41 096
- ACL2 + MiniZinc (2): +170 476
- Tier C wave 1 (6 — TLAPS/Alloy/Boogie/Frama-C/Viper/Tamarin): +11 011
- Tier C wave 2 (4 — Spin/SeaHorn/KeY/Prism): +1 920
- Tier C wave 3 (5 — dReal/Cameleer/Abella/Dedukti/Arend): +6 532

Still missing: ProVerif (clone empty), NuSMV/UPPAAL (binary distros), CBMC (1.5 GB), ABC/CaDiCaL/Kissat/MiniSat (SAT), Isabelle_ZF/Nitpick/Nunchaku/Mercury/Athena.

Low yield, needs extractor tuning: Naproche (19 files → 0), Matita (238 → 17), Arend (1 → 7).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request on May 11, 2026
## Summary

SPARK 2014/2022 formal companion for `src/rust/integrity/solver_integrity.rs`, the SHAKE3-512 + BLAKE3 binary integrity layer. Advances the Rust→Rust/SPARK migration roadmap to run #2 (axiom_tracker was run #1, #47).

- `spark/solver_integrity/solver_integrity.ads` — Ada spec mirroring `IntegrityStatus` + `IntegrityChecker::verify_solver`'s decision tree, with case-arm `Post`-conditions and named ghost lemmas
- `spark/solver_integrity/solver_integrity.adb` — body as a 5-arm if-chain mirroring the Rust source line-for-line
- `spark/solver_integrity/solver_integrity_proof.adb` — catalogue of 11 proof obligations with discharge notes
- `spark/solver_integrity/solver_integrity.gpr` — GPRbuild project (gnat2022, --level=1)
- `spark/README.adoc` — module index updated

## Proved properties

| # | Property | Method |
|---|---|---|
| PO-1 | `Integrity_Status` enumeration totality | Definitional (Ada enum) |
| PO-2 | `Classify_Verification` → Verified case-arm | CVC5/Z3 path substitution |
| PO-3 | **Hash-mismatch ⇒ Tampered** (issue #40 load-bearing) | CVC5/Z3 path substitution |
| PO-4 | `Classify_Verification` → Uninitialized case-arm | CVC5/Z3 path substitution |
| PO-5 | `Classify_Verification` → Missing case-arm (disjunction) | CVC5/Z3 path substitution |
| PO-6 | `Classify_Verification` never returns Unchecked | Unsat path; SMT immediate |
| PO-7 | `Is_Safe` post-condition | Definitional (expression function) |
| PO-8 | `Quick_Reverify` Unchanged ⇔ Cached_Match | Trivial |
| PO-9 | `Aggregate_Safe` post-condition | Definitional (expression function) |
| PO-10 | `Cap_At_Tampered` ghost lemma (Has_Tampered ⇒ ¬Aggregate_Safe) | Quantifier instantiation, --level=1 |
| PO-11 | `Hash_Mismatch_Means_Tampered` named ghost lemma | Post-condition application, --level=1 |

## gnatprove summary

```
Total                 11 checks, 0 unproved, 0 justified
Run-time Checks       2 (CVC5 25%, altergo 75%)
Functional Contracts  4 (CVC5 44%, Z3 6%, altergo 50%)
Termination           5 (flow analysis)
```

Max steps used: **53**. Toolchain on the agent runner: gnatprove 14.1.1 (Why3 1.6.0, Alt-Ergo 2.4.0, CVC5 1.1.2, Z3), wired via #40.

## SPARK modeling note

SPARK doesn't model file IO, async, or cryptographic primitives, so the companion is intentionally a *pure-decision mirror*. The Rust `find_solver_binary` / `compute_shake3_512` / `compute_blake3` / `RwLock` layer is abstracted into four boolean inputs of `Classify_Verification` (`Binary_Found`, `Expected_Is_Placeholder`, `Compute_Succeeded`, `Hashes_Match`). The crypto guarantees come from the `tiny_keccak` (SHAKE3) and `blake3` crates. SPARK's contribution is proving that the **decision tree above the crypto layer** is correct and that a hash mismatch unambiguously yields `Tampered`.

For the multi-solver path (`verify_all`), the SPARK model is `Aggregate_Safe` / `Has_Tampered`: a fleet is safe iff every member is safe, and `Cap_At_Tampered` proves that any single Tampered member forces the aggregate unsafe.

## Test plan

- [x] `gnatprove --mode=prove --level=1 -P spark/solver_integrity/solver_integrity.gpr` — 11/0/0 (run locally on the wired toolchain at branch tip)
- [x] Toolchain prereqs: `gnat_native ^14`, `gprbuild ^22`, `gnatprove 14.1.1` — satisfied via #40

## CI note

Repo-level CI failures on `main` (cargo fmt diffs in `tests/stage1_integration_test.rs`, rustdoc intra-doc-link breaks in `src/rust/provers/pvs.rs`, `Cargo.lock` v4 vs MSRV, missing `just` on the MVP Smoke runner, missing `.clusterfuzzlite/Dockerfile`, ubuntu24 vs `erlef/setup-beam`) are pre-existing and unrelated to this PR — see the triage table on #47. This PR adds only `spark/solver_integrity/` and one line to `spark/README.adoc`; it touches no Rust, Julia, Elixir, Dockerfile, or Cargo.lock. Expecting admin-merge with the same justification as #47.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
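The pure-decision mirror described in the modeling note can be sketched in a few lines. This is a hedged Python illustration, not the Ada source: the arm order is an assumption (the PR states only that the body is a 5-arm if-chain over the four booleans), but the proved properties PO-2 through PO-6 and PO-10 pin down the behavior shown here.

```python
from enum import Enum, auto

class IntegrityStatus(Enum):
    VERIFIED = auto()
    TAMPERED = auto()
    UNINITIALIZED = auto()
    MISSING = auto()
    UNCHECKED = auto()

def classify_verification(binary_found: bool,
                          expected_is_placeholder: bool,
                          compute_succeeded: bool,
                          hashes_match: bool) -> IntegrityStatus:
    # Hypothetical arm order mirroring the abstracted decision tree.
    if not binary_found:
        return IntegrityStatus.MISSING        # PO-5, first disjunct
    if expected_is_placeholder:
        return IntegrityStatus.UNINITIALIZED  # PO-4
    if not compute_succeeded:
        return IntegrityStatus.MISSING        # PO-5, second disjunct
    if not hashes_match:
        return IntegrityStatus.TAMPERED       # PO-3: mismatch => Tampered
    return IntegrityStatus.VERIFIED           # PO-2
    # Note: no path returns UNCHECKED (PO-6).

def is_safe(status: IntegrityStatus) -> bool:
    return status is IntegrityStatus.VERIFIED

def aggregate_safe(statuses) -> bool:
    # Fleet safe iff every member is safe; any Tampered member forces
    # the aggregate unsafe (Cap_At_Tampered, PO-10).
    return all(is_safe(s) for s in statuses)
```

Under this sketch, the load-bearing PO-3 property is visible by inspection: whenever the binary is found, the expected hash is real, and computation succeeds, a mismatch has exactly one exit, `TAMPERED`.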
hyperpolymath added a commit that referenced this pull request on May 11, 2026
… roadmap) (#50)

## Summary

SPARK 2014/2022 formal companion for `src/rust/verification/certificates.rs`, the Alethe / DRAT / TSTP / OpenTheory certificate parsing layer. **Completes the initial 3-module Rust→Rust/SPARK roadmap from #40** (run #1 was axiom_tracker #47, run #2 was solver_integrity #49).

- `spark/certificates/certificates.ads` — Ada spec for `Certificate_Format`, `Format_Extension`, `Alethe_Step`, `Verify_Alethe` + 3 ghost lemmas
- `spark/certificates/certificates.adb` — body (`Get_Format_Extension` six-arm case; ghost lemmas have null bodies)
- `spark/certificates/certificates_proof.adb` — 8 proof obligations
- `spark/certificates/certificates.gpr`, `alire.toml`, `.gitignore`
- `spark/README.adoc` — module index updated

## Proved properties

| # | Property | Method |
|---|---|---|
| PO-1 | `Certificate_Format` / `Format_Extension` enumeration totality | Definitional |
| PO-2 | `Get_Format_Extension` exhaustive case-arm Post | CVC5 path substitution |
| PO-3 | `Check_Alethe_Step` exact characterisation (Assume ⇒ raw nonempty; Step ⇒ has `:rule`) | Expression function, definitional |
| PO-4 | `Verify_Alethe` = ∀ step. valid | Expression function, definitional |
| PO-5 | `Has_Malformed` ghost predicate | Expression function, definitional |
| PO-6 | **`Cap_At_Malformed`** — any malformed step ⇒ `¬Verify_Alethe` | Quantifier duality, --level=1 |
| PO-7 | `Empty_Verifies` — empty input vacuously valid | Empty-range axiom, --level=0 |
| PO-8 | **`Format_Extension_Is_Injective`** — distinct formats ⇒ distinct extensions | Six-arm case-split (30 unsat subcases), --level=1 |

## gnatprove summary

```
Total                 10 checks, 0 unproved, 0 justified
Run-time Checks       2 (CVC5 75%, altergo 25%)
Functional Contracts  4 (CVC5 56%, altergo 44%)
Termination           4 (flow analysis)
```

Max steps: **230** (the case-split injectivity proof is the work-heaviest VC, as expected). Toolchain on the agent runner: gnatprove 14.1.1 (Why3 1.6.0, Alt-Ergo 2.4.0, CVC5 1.1.2, Z3) — wired via #40.

## Scope and out-of-scope

**In scope (mirrored):**
- `CertificateFormat` enum (6 variants, same positional order as Rust)
- `format_extension` six-arm match (Rust lines 253-262)
- `check_alethe_step` two-arm match on `AletheStepKind` (Rust lines 230-233)
- `verify_alethe`'s scan-all-and-reject-on-any-failure decision (Rust lines 103-109)

**Out of scope (cannot be modeled in SPARK):**
- `verify_drat` (Rust lines 127-179): shells out to `drat-trim`, depends on filesystem + subprocess + stdout substring parse
- Lean4 / Coq kernel checks: subprocess-based
- `parse_alethe_steps`: string-level `line.starts_with` predicates; SPARK abstracts the parse result into `Alethe_Step` records with `Raw_Is_Nonempty` / `Has_Rule_Marker` set
- BLAKE3 hashing in `ProofCertificate::new`: opaque crypto

Equivalence to Rust `verify_alethe` is **strong over the structural validator**; the Rust-side parser-adapter must compute `Raw_Is_Nonempty` and `Has_Rule_Marker` faithfully at the boundary (a constructor obligation, not a SPARK obligation).

## Test plan

- [x] `gnatprove --mode=prove --level=1 -P spark/certificates/certificates.gpr` — 10/0/0 locally
- [x] Toolchain prereqs satisfied via #40

## CI note

Same pre-existing repo-level failures as #47 / #49 (cargo fmt diffs in untouched test files, rustdoc intra-doc-link breaks in `pvs.rs`, `Cargo.lock` v4 vs MSRV, missing `just` on MVP Smoke, missing `.clusterfuzzlite/Dockerfile`, ubuntu24 vs `erlef/setup-beam`). This PR adds only `spark/certificates/` and one line to `spark/README.adoc` — touches no Rust, Julia, Elixir, Dockerfile, or Cargo.lock. Expecting admin-merge with the same justification as the prior two SPARK runs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
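The structural validator that PO-3 through PO-7 characterise is small enough to sketch. A hedged Python illustration (record and function names follow the PR's abstraction; the real artifact is the Ada spec): an Assume step is valid iff its raw text is nonempty, a Step is valid iff it carries a `:rule` marker, and `verify_alethe` is the conjunction over all steps, so the empty certificate verifies vacuously.

```python
from dataclasses import dataclass
from enum import Enum, auto

class AletheStepKind(Enum):
    ASSUME = auto()
    STEP = auto()

@dataclass
class AletheStep:
    # The boundary abstraction: the Rust-side parser-adapter computes
    # these two booleans; SPARK reasons only over them.
    kind: AletheStepKind
    raw_is_nonempty: bool
    has_rule_marker: bool

def check_alethe_step(step: AletheStep) -> bool:
    # PO-3: Assume => raw nonempty; Step => has ":rule".
    if step.kind is AletheStepKind.ASSUME:
        return step.raw_is_nonempty
    return step.has_rule_marker

def verify_alethe(steps) -> bool:
    # PO-4: valid iff every step checks; PO-7: all([]) is True, so an
    # empty input verifies vacuously. One malformed step is enough to
    # reject the whole certificate (PO-6, Cap_At_Malformed).
    return all(check_alethe_step(s) for s in steps)
```

The scan-all-and-reject-on-any-failure shape makes PO-6 and PO-7 immediate: `all` over an empty iterable is true, and any false element forces the conjunction false.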