feat(corpus): resolved & expanded training corpus (700K+ proofs) by hyperpolymath · Pull Request #25 · hyperpolymath/echidna

hyperpolymath · 2026-04-20T07:05:44Z

This PR resolves the issues from the closed #23.

Key changes:

Fixed syntax errors in Mizar and Isabelle extraction scripts.
Resolved ID conflicts in the Dafny corpus.
Expanded UNIFIED training corpus to 702,099 proofs.
Expanded model vocabulary to 141,517 terms.
Verified with full test suite (613/613 passing).
Cleaned git history of accidentally tracked large stale files.

Convert Guile Scheme state files to A2ML format, preserving project-specific content as structured sections. Part of global TODO cleanup (2026-03-16). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Added from rsr-template-repo: standardizing CI/CD across all repos. Part of global TODO cleanup (2026-03-16). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…files - Convert Trustfile.hs to standardized Trustfile.a2ml format - Remove Trustfile example files - Remove legacy AI.djot files (migrated to 0-AI-MANIFEST.a2ml) - Remove duplicate root AI.a2ml files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Part of gv-clade-index Phase 1: every repo declares its identity, primary clade, and forge mappings for the VeriSimDB central registry. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Pin all GitHub Actions to specific commit SHAs instead of mutable tags to prevent supply chain attacks. Tags preserved as comments. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Adds the forbid(unsafe_code) crate-level attribute to all Rust crates that do not use unsafe code, hardening the safety guarantee at compile time. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ooling Refine workflow permissions from read-all to granular contents:read across all 21 CI/CD workflows. Add FFI wrapper modules and integration test for GraphQL/gRPC/REST interfaces. Expand training corpus with Metamath, CoqGym, and mathlib4 extraction scripts (Julia + Python tooling). Add algebra proof examples for Coq, Isabelle, and Lean. Update .gitignore for external corpora and large intermediate training artifacts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

# Conflicts: # .github/workflows/generator-generic-ossf-slsa3-publish.yml # .github/workflows/guix-nix-policy.yml # .github/workflows/jekyll-gh-pages.yml # .github/workflows/quality.yml # .github/workflows/rsr-antipattern.yml # .github/workflows/rust-ci.yml # .github/workflows/rust.yml # .github/workflows/scorecard.yml # .github/workflows/security-policy.yml # .github/workflows/ts-blocker.yml # .github/workflows/wellknown-enforcement.yml # STATE.scm

Wire up the dormant Chapel parallel proof search into the build system: - Add `chapel` feature flag to Cargo.toml - Add build.rs to link Zig FFI library when feature enabled - Update build.zig for Zig 0.14+/0.15+ API compatibility - Add C header (chapel_ffi_exports.h) defining Chapel ↔ Zig ABI - Add stub implementations for building/testing without Chapel runtime - Add Justfile recipes (build-chapel-ffi, test-chapel, chapel-all, etc.) - Add chapel-ci.yml workflow (Zig FFI build + Rust feature gate CI) - Fix var→const in chapel_bridge.zig for Zig 0.15 compliance The Chapel accelerator remains optional (default off). Enable with: just build-chapel # or: cargo build --features chapel Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Integrate frontier language models (Claude, GPT-4, etc.) as proof advisors through BoJ Server's cartridge system. The LLM suggests tactics, ranks provers, decomposes goals, and generates auxiliary lemmas — but CANNOT influence trust levels. Architecture: ProofState → LlmAdvisor → BoJ REST API → Frontier LLM LLM suggestions → ProverDispatcher (trust pipeline unchanged) New module: src/rust/llm.rs - LlmAdvisor: connects to BoJ (localhost:7700) for LLM guidance - TacticSuggestionRequest/Response: structured JSON protocol - ProverRecommendation: LLM-ranked prover selection - GoalDecomposition: LLM-suggested subgoal splitting - DispatchOptimisation: LLM-guided dispatch configuration - In-context learning via proof attempt history - Cost-aware model routing via BoJ model-router cartridge - 8 tests (all passing) Integration into dispatch.rs: - ProverDispatcher.with_llm_advisor(): attach LLM advisor - verify_proof_llm_guided(): LLM-optimised dispatch path - Falls back to standard dispatch if LLM unavailable Trust invariant: LLM optimises WHICH PATH to take, never WHAT THE ANSWER IS. Trust levels computed solely by dispatch pipeline. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Integrates ECHIDNA with VeriSimDB for persistent, queryable, federated proof storage across all 8 modalities (semantic, temporal, provenance, document, graph, vector, tensor, spatial). New modules (behind `verisimdb` feature flag): - proof_encoding.rs: CBOR encode/decode + SHA-256 proof identity hashing - verisimdb_bridge.rs: ProofOctadBuilder maps ProofState → 8-modality octad, VeriSimDBClient for async HTTP, provenance hash-chain builder Updated: - agent/memory.rs: VeriSimDBProofStore implements ProofMemory trait, stores proofs as octads with graceful fallback to in-memory cache - Cargo.toml: optional serde_cbor + sha2 deps behind verisimdb feature - lib.rs: conditional module exports ECHIDNA Priority 3 (VeriSimDB integration) — Rust side complete. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Formal interface proofs and C-ABI implementations for all 30 ECHIDNA prover backends, organized by protocol category: Idris2 ABI (dependent type proofs): - InteractiveAssistants.idr: 10 provers (Agda, Coq, Lean, Isabelle, Idris2, F*, HOL4, HOL Light, Nuprl, Minlog) — session state machine, logic foundation proofs, tactic typing - SmtSolvers.idr: 3 provers (Z3, CVC5, Alt-Ergo) — SMT-LIB2 protocol, push/pop stack discipline, certificate trust proofs, portfolio safety - FirstOrderAtp.idr: 3 provers (Vampire, E Prover, SPASS) — TPTP protocol, SZS ontology, TSTP proof validity - DeclarativeProvers.idr: 7 provers (Metamath, Mizar, PVS, ACL2, TLAPS, Twelf, Imandra) — file verification, kernel trust, axiom transparency - AutoActive.idr: 2 provers (Dafny, Why3) — VC pipeline, backend delegation proofs, annotation tracking - ConstraintSolvers.idr: 5 provers (GLPK, SCIP, MiniZinc, Chuffed, OR-Tools) — solver state machine, problem class capabilities - Provers.idr: top-level re-export with universal dispatcher Zig FFI (C-ABI, thread-safe, mutex-guarded): - interactive.zig, smt.zig, atp.zig, declarative.zig, autoactive.zig, constraint.zig — matching session management, state validation, tests 2,693 lines. Every state transition proven in Idris2, enforced in Zig. ECHIDNA Priority 4 (per-prover ABI + FFI) — complete. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Implements ECHIDNA's VQL-UT integration for type-safe federated proof queries across all 30 prover backends via VeriSimDB octads. New files: - VqlUt.idr: Idris2 ABI with proofs for all 10 type safety levels (parse, schema, type-compat, null, injection, result-type, cardinality, effect-tracking, temporal, linearity) - vql_ut.zig: Zig FFI query session engine (parse → typecheck → plan → execute → collect pipeline, thread-safe) - vql_ut.rs: Rust CrossProverQueryBuilder (fluent API enforcing progressive type safety), QueryExecutor routing to VeriSimDB, 8 query operations, 9 tests Query types: FindProof, FindSimilar, CrossProverSearch, ProvenanceTrace, TemporalHistory, DependencyGraph, AxiomUsage, TacticStats. ECHIDNA Priority 5 (VQL-UT cross-prover queries) — complete. All 5 Prover Wars priorities now done. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Closes all integration gaps in the ECHIDNA Prover Wars stack: - vql_ut.rs: executor methods now call VeriSimDB REST endpoints (/api/v1/search/text, /api/v1/search/vector, /api/v1/search/related) instead of returning empty stubs. Includes URL encoding, search result parsing, and Julia embedding fetch for vector similarity. - agent/memory.rs: VeriSimDBProofStore.store_success() now fetches a 512-dim goal embedding from the Julia inference service (port 8090) and passes it to ProofOctadBuilder.with_embedding() before storing the octad. Graceful fallback if Julia is unreachable. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

QueryExecutor now calls TypeLL /api/v1/vql-ut/check before executing queries, validating that the query meets its declared safety level. Falls back gracefully if TypeLL is unreachable (3s timeout). Configurable via TYPELL_URL env var. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add AI manifest, Trustfile, Dustfile, and assail recipe as part of the Floor Raise campaign to establish baseline tooling across all repos. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…rates Three unrelated CI failures triaged: 1. `security-scan.yml` referenced `hyperpolymath/panic-attacker` — the repo is `panic-attack`. GitHub follows the repo rename for API calls, but `uses:` does not, so the workflow failed to resolve. 2. `hypatia-scan.yml` pinned `actions/upload-artifact` at a SHA that no longer resolves; replace with the canonical v4 SHA already used by chapel-ci.yml in this same repo. 3. `scorecard.yml` pinned `github/codeql-action/upload-sarif` at a SHA that doesn't exist on the upstream repo. Replace with the v3.28.1 SHA used by codeql.yml and scorecard-enforcer.yml for consistency. Also runs `cargo update -p rustls-webpki -p rand` to pull in the security-patched versions flagged by the six low-severity dependabot alerts (rustls-webpki 0.103.10 → 0.103.12, rand 0.8.5 → 0.8.6, rand 0.9.2 → 0.9.4). Remaining CI failures are secrets/infra issues, not code issues: - `Mirror to Git Forges`: Radicle SSH key secret not provisioned in this repo; mirror-gitlab job succeeds, mirror-radicle fails at `~/.radicle/keys/radicle: No such file or directory`. - `Instant Sync`: GH_PAT-equivalent token for the propagation workflow returns "Bad credentials" when dispatching to `.git-private-farm`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…+ 6 dependabot alerts resolved

…docs

…t bust adjust dust) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Fifth AI-WORK §1.6 residual item: cargo check --lib --features verisim compiled with 22 errors; default (gate off) build was unaffected. This commit closes all 22 with a small surgical patch. Breakdown: - 7× E0616 private-field access (VeriSimDBClient.base_url / .http): vcl_ut composes queries directly against these two fields in six execute_* methods (find_proof, find_similar, cross_prover, provenance, dependency, fetch_goal_embedding). Exposing them to the rest of the crate via pub(crate) is cheaper and more honest than narrowing each call site through an accessor pair that would do exactly the same thing. - 3× E0422 Goal-type not found + 1× E0425 theorem-value not found: vcl_ut was missing `use crate::core::Goal;` under the verisim gate, and execute_find_proof bound the theorem name as `_theorem` (unused marker) but then referenced `theorem` at proof_identity — rename to `theorem`. - 4× cannot-find-macro `warn`: Two files imported `tracing::{debug, info}` but called `warn!` in the body. Add `warn` to the `use tracing::{...}` line in vcl_ut.rs and agent/memory.rs. - 7× E0282 type-annotations needed: Downstream fallout from the E0616 errors — once the field accesses type-check, reqwest's Result<Response, Error> inference completes without hints. No annotations actually needed. Verified: cargo check --lib clean cargo check --lib --features verisim clean (was 22 errors) Three unrelated pre-existing warnings remain (1 unreachable pattern in provers/mod.rs:1365, 2 unused-import). Not touched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

train_models.jl walks each training_data/*.jsonl line with a hand- rolled regex because the script avoids JSON.jl as a dependency. The patterns for "goal", "prover", and "tactic" required zero whitespace after the colon — i.e. `"goal":"..."` — but the generated corpus files have `"goal": "..."` (one space). Result: the pre-fix run loaded zero proof states, zero tactics, and saved a zero-word vocabulary while reporting "complete". Fix: allow optional `\s*` after each colon in all three patterns. Equivalent to using a real JSON parser but keeps the zero-dep property the script values. Post-fix run (COMPLETE corpus, PID 3994289 at commit time): premises_COMPLETE.jsonl 300 lines proof_states_COMPLETE.jsonl 66,611 lines tactics_COMPLETE.jsonl 179,933 lines Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Julia's default ConsoleLogger wraps stderr in a stream that only flushes on tty writes. Under nohup/systemd/Claude-background runs the whole training session stays silent — no @info output, no epoch progress, no "saved models" — which hid the earlier 0-data-loaded failure (the regex parser bug fixed in e93ab25) for 17 minutes before it was caught. This commit swaps the default logger for `SimpleLogger` writing directly to stderr, and adds a defensive 1-second flush loop as belt-and-braces. Startup now prints @info lines within ~30-60s (Julia JIT precompile) instead of never. No behavioural change to the training itself — only observability. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds two env-var knobs to train_models.jl so multiple training runs can execute concurrently against the same source without stomping on each other's output: ECHIDNA_MODELS_DIR default "models" (output artefact root) ECHIDNA_TRAIN_EPOCHS default 50 (logistic epochs) Use case: epoch-variant sweep where a quick-baseline (5 epochs), intermediate (20 epochs), and full (50 epochs) run launch in parallel and the shortest gives a usable MRR-comparable model within minutes, while the longer runs refine behind it. Example: ECHIDNA_TRAIN_EPOCHS=5 ECHIDNA_MODELS_DIR=models/e5 \ julia src/julia/train_models.jl & ECHIDNA_TRAIN_EPOCHS=20 ECHIDNA_MODELS_DIR=models/e20 \ julia src/julia/train_models.jl & ECHIDNA_TRAIN_EPOCHS=50 ECHIDNA_MODELS_DIR=models/e50 \ julia src/julia/train_models.jl & No behavioural change when both env vars are unset. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The `.-` in the weight-update line created a new matrix and discarded it; only the bias was being mutated. Training loss plateaued at 0.7906 across all epochs because the weights never moved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Promotes Echidna from mock-only CI to real subprocess validation against canonical micro-goals. Wave-1 covers Tier-1 backends (apt-installable) and seeds Tier-2 (build-from-source) entries that subsequent waves will fill in. - manifests/live-provers.scm Guix manifest, primary reproducible path - flake.nix liveProverDeps + .#live-provers devShell, fallback - .github/workflows/live-provers.yml Tiered workflow: T1/PR, T2/nightly, T3/weekly, T4/quarterly + Guix-resolution allow-fail check - tests/live_prover_suite.rs Cargo-feature-gated suite; auto-skip when binary absent; version() smoke test per backend - Cargo.toml new live-provers feature flag - .machine_readable/6a2/STATE.a2ml session-2026-04-19-production-wiring-kickoff logged with locked decisions D1..D6 Decisions locked this session (full plan in ~/Desktop/ECHIDNA-PRODUCTION-WIRING-PLAN.md): D1 Cap'n Proto chosen over Bebop3 (dependability + zero-copy + maturity) D2 Chapel POC promoted to first-class layer across 7 sub-waves (L2) D3 Guix primary, Nix fallback (per project CLAUDE.md) D4 Order: L3 (this) -> L1 (capnp) -> L2 (chapel) D5 Live-prover CI tiered: T1/PR, T2/nightly, T3/weekly, T4/quarterly D6 No-JSON rule enforced; capnp kills HTTP+JSON on Rust<->Julia hot path Continuation prompts on Desktop: ECHIDNA-PRODUCTION-WIRING-PLAN.md master plan + session log ECHIDNA-L3-LIVE-PROVER-CI-PROMPT.md continuation for L3 Wave-2..4 ECHIDNA-L1-CAPNPROTO-PROMPT.md continuation for protocol swap ECHIDNA-L2-CHAPEL-PROMPT.md continuation for Chapel maximisation Note: Wave-1 is scaffolding. CI will surface any naming drift between the test harness ProverKind variants and the backend default_executable strings; those drifts are exactly the mock-vs-reality gaps this suite exists to expose. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…tandalone Cargo build --features chapel was broken end-to-end even though symbol names and struct layouts across Chapel→Zig→Rust were already correct. The static lib shipped by src/zig_ffi had unresolved chapel_* externs (chapel_stubs.c was only compiled into the test binary), so Rust's link step failed whenever no real libechidna_chapel was on the system. - src/zig_ffi/build.zig: add -Dstubs=<bool> (default true); when true, bundle chapel_stubs.c into both the shared and static echidna_chapel_ffi with -fno-sanitize=undefined (Zig's default UBSan on C pulled in __ubsan_* that Rust's linker couldn't resolve). Pass -Dstubs=false when linking against real Chapel built from chapel_poc/. - src/rust/proof_search.rs: add `use anyhow::Context;` inside the chapel_ffi submodule — pre-existing compile error exposed once the lib linked. - .machine_readable/6a2/STATE.a2ml: new session entry correcting the stale gap list — Tamarin (tamarin.rs, 592 LoC) and ProVerif (proverif.rs, 799 LoC) bridges are fully wired through ProverFactory and kind_from_u8/kind_to_u8, not "planned"; 0 TODO/FIXME in src/rust/. Verified: cargo build --features chapel OK; cargo test --features chapel --lib proof_search → 6/6 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…tputs Follow-up to 2392923 — human/machine parity. - QUICKSTART-DEV.adoc: new section describing the zig build -Dstubs=true default and the cargo build --features chapel two-step; plus the -Dstubs=false path for linking against real libechidna_chapel. - chapel_poc/README.md: mark "Add FFI bindings to call from Rust" as DONE, pointing to the Chapel → Zig → Rust chain that's now wired. - .gitignore: add models/e*/, /models_e*/ (per-epoch training checkpoints, regenerable) and src/zig_ffi/zig-out/, .zig-cache/ (regenerated by `zig build`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…tlaps tests Tier-2 (nightly) live-prover CI now has real provisioning, not TODO placeholders, for every backend except hol-light (which defers to Wave-3 container provisioning — no prebuilt binary, opam build is ~20 min and pulls camlp5). - idris2: bootstrap from source tarball against Chez Scheme. - isabelle: Isabelle2024 tarball, symlink /usr/local/bin/isabelle. - dafny: dotnet tool install --global Dafny (ubuntu-latest has sdk-8). - fstar: GitHub release tarball; binary is fstar.exe even on Linux. - tlaps: self-extracting installer from tlapm releases; binary is tlapm. - hol-light: deferred with explicit exit 0 + comment. Also adds live_fstar_version and live_tlaps_version to the Rust test suite (matches provers/mod.rs executable names: fstar.exe, tlapm) and fills in the missing library_paths field on ProverConfig that had made the test file fail to compile. Local verification: `cargo test --test live_prover_suite --features live-provers` runs all 18 tests; 13 real binaries (Z3, CVC5, EProver, Alt-Ergo, Why3, Vampire, Dafny, Isabelle, Idris2, F*, Coq, Agda, Lean4) return real versions; 5 auto-skip (GLPK/SPASS/MiniZinc/TLAPS/Chuffed not on this host). No mock tests were modified. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Session entry session-2026-04-19-l3-wave-2 captures what was wired in 8bbde25 (Tier-2 provisioning commands + 2 new Rust tests) and lists the remaining work per wave so the next agent doesn't have to re-derive it: - Wave-1: DONE (9 Tier-1 backends, 45a7a95). - Wave-2: DONE for 9; hol-light deferred. - Wave-3: scaffold only — 9 Tier-3 backends need per-backend Containerfiles. Handover hints added for each. - Wave-4: scaffold only — 19 Tier-4 backends retained as mock-only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Makes the production-wiring plan + L1/L2/L3 continuation prompts part of the repo so a fresh clone has the full handover context without needing Desktop access. - docs/handover/PRODUCTION-WIRING-PLAN.md (← ECHIDNA-PRODUCTION-WIRING-PLAN.md) - docs/handover/L1-CAPNPROTO-PROMPT.md - docs/handover/L2-CHAPEL-PROMPT.md - docs/handover/L3-LIVE-PROVER-CI-PROMPT.md - docs/handover/README.md — drift-handling policy (both copies updated in same commit; in-repo copy wins on conflict). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Same treatment as the coord-MCP synthesis: merges the four Desktop echidna continuation prompts (L1/L2/L3 + master plan) into two tight summaries while keeping the full prompts as deeper reference. Added: - docs/handover/TODO.md — P0→P4 backlog across all three phases. P0: watch next nightly + real-Chapel CI job. P1: L3 Wave-3 + Wave-4 + Dafny deepen + VeriSimDB. P2: L1 Cap'n Proto (gated on L3 hand-off). P3: L2 Chapel 7 sub-waves (gated on L1). P4: adjacent/deferred. Includes 3 open questions (Imandra licence, CapnProto.jl vs shim, Chapel default-on threshold). - docs/handover/STATE.md — per-phase status honest: L3 Wave-1+2 DONE, Wave-3+4 scaffold only; L1 not started; L2 Zig self-link fix landed but no src/chapel/ yet. Locks decisions D1-D6 from the master plan. Corrects stale-listed Tamarin/ProVerif as fully wired. Updated: - docs/handover/README.md — index now points to TODO + STATE first; the full L1/L2/L3 prompts remain as deeper reference. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Grow the per-prover corpora for the four provers with no public downloadable training data (tiny communities, small libraries): Nuprl: 50 -> 152 (+102, +204%) Minlog: 46 -> 148 (+102, +221%) Imandra: 47 -> 133 (+86, +183%) Twelf: 33 -> 118 (+85, +258%) New categories added to scripts/generate_synthetic_provers.jl: Nuprl: sets, relations, number_theory, real_analysis, computability, category_theory, topology, algebra, probability, bishop_analysis, proof_theory, information, quotient_types, w_types Minlog: lists, trees, classical_logic, sequent_calculus, induction_advanced, higher_order, program_verification, modal_logic, intuitionistic, games, realizers_advanced, bounded_arithmetic, finite_logic Twelf: binary_trees, substitution, session_types, linear_lf, metatheory, polymorphism, modules, records, traces, bigstep, hoas, parsing, automata, effects, numerics Imandra: trees, numerics, records, state_machines, crypto, blockchain, algebra, concurrency, databases, networking, security, ml_correctness Regenerated proof_states_{nuprl,minlog,twelf,imandra}.jsonl match the script. https://claude.ai/code/session_0173ntsBsELMiXaTWvtjXdN8

Second pass: grow per-prover corpora for provers using extract scripts with synthetic fallback sections: MiniZinc: 29 -> 52 (+23, timetabling/network/logistics/puzzle/production) Dafny: 48 -> 88 (+40, maps/loops/induction/ghost/types/concurrency) Mizar: 66 -> 128 (+62, sequences/lattices/metric/finseq/ordinals/measure) F*: 76 -> 92 (+16, sequences/buffers/monotonic/tactics/dependent) Idris2: 87 -> 95 (+8, maybe/either/nat_order/stream/universe/type_level) Isabelle: 97 -> 105 (+8, sets/analysis/algebra/logic/lists/matrix/induction/auto) Combined with pass 1 (Nuprl/Minlog/Twelf/Imandra), total corpus grows from 10,645 to 11,177 entries (+532, +5%). All 17 JSONL files validate as well-formed JSON. https://claude.ai/code/session_0173ntsBsELMiXaTWvtjXdN8

Corpus expansion (pass 3): F*: 92 -> 136 (+44, machine_integers/parsers/state_machine/protocol/ arrays/monad_laws/string_processing) Idris2: 95 -> 132 (+37, bool/pair/snoclist/proof_search/with_views/cong) MiniZinc: 52 -> 82 (+30, energy/healthcare/education/telecom/agriculture/ manufacturing) Vocabulary expansion: tactic_vocab.txt: 422 -> 6,130 (14.5× growth) premise_vocab.txt: 422 -> 60,046 (142× growth) Both vocabs extracted from all 17 corpus JSONL files (context arrays for tactics, theorem names for premises). All JSONL files validate as well-formed. https://claude.ai/code/session_0173ntsBsELMiXaTWvtjXdN8

…to be regenerated)

…ontent)

chatgpt-codex-connector · 2026-04-20T07:05:53Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

hyperpolymath and others added 30 commits March 16, 2026 02:06

migrate: .scm → .a2ml state files in .machine_readable/

12b6437

Convert Guile Scheme state files to A2ML format, preserving project-specific content as structured sections. Part of global TODO cleanup (2026-03-16). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ci: deploy missing standard workflows (2 added)

25cae21

Added from rsr-template-repo: standardizing CI/CD across all repos. Part of global TODO cleanup (2026-03-16). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chore: restructure 6a2ml files into .machine_readable/6a2/

4aa717f

chore: add SPDX headers to source files

6041c4a

chore: restructure nested 6a2ml files into .machine_readable/6a2/

1277311

feat: add CLADE.a2ml — clade taxonomy declaration

78a38fa

Part of gv-clade-index Phase 1: every repo declares its identity, primary clade, and forge mappings for the VeriSimDB central registry. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chore: update manifest paths to .machine_readable/6a2/

aae7907

chore: SHA-pin GitHub Actions for supply chain security

9e8bfeb

Pin all GitHub Actions to specific commit SHAs instead of mutable tags to prevent supply chain attacks. Tags preserved as comments. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chore: add #![forbid(unsafe_code)] to safe Rust crates

f40d67a

Adds the forbid(unsafe_code) crate-level attribute to all Rust crates that do not use unsafe code, hardening the safety guarantee at compile time. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chore: RSR sync and mass repository update

a7f7d99

chore: Big Unification — attach to BoJ Server / Casket architecture

e28695b

chore: Big Unification — attach to BoJ Server / Casket architecture

00ead12

chore(ci): maximize ci/cd values via dependabot and permissions

25ecb8b

chore(security): harden ML API URL handling to mitigate SSRF

4308063

fix(ci): Resolve workflow-linter self-matching and metadata issues

0d127eb

chore(ci): maximize ci/cd values via dependabot and permissions

1c8f877

fix(ci): Resolve workflow-linter self-matching and metadata issues

cf8468c

Chore/cicd optimizations (#8)

c55390b

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chore(floor-raise): add foundational tool integrations

23e001a

Add AI manifest, Trustfile, Dustfile, and assail recipe as part of the Floor Raise campaign to establish baseline tooling across all repos. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

hyperpolymath and others added 27 commits April 18, 2026 13:42

chore(state): record CI hygiene close (e62bf25) — code-side failures …

abc1d23

…+ 6 dependabot alerts resolved

Standardize workspace: Justfile migration and A2ML directive cleanup

ef8f959

Standardize workspace: Capitalize Justfile references in scripts and …

46fb5b7

…docs

chore(contractiles): remove lust (not in 6-verb set: intend trust mus…

4d79298

…t bust adjust dust) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chore: checkpoint local changes

eec182c

Standardize workspace: Justfile migration and A2ML directive cleanup

06af550

chore: resolve merge conflicts by favoring PR branch for data files (…

97617dd

…to be regenerated)

fix: resolve merge conflicts and fix syntax errors in extraction scripts

1b5f6d3

style: apply cargo fmt

0ac1cbe

chore: resolve accidental conflicts from stash pop

d63b4eb

chore: reconnect branch history with main (favoring verified branch c…

f1f92ae

…ontent)

hyperpolymath merged commit 03ae8ba into main Apr 20, 2026
26 of 40 checks passed

hyperpolymath deleted the claude/expand-vocabulary-corpus-1SFwy branch April 20, 2026 07:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(corpus): resolved & expanded training corpus (700K+ proofs)#25

feat(corpus): resolved & expanded training corpus (700K+ proofs)#25
hyperpolymath merged 449 commits intomainfrom
claude/expand-vocabulary-corpus-1SFwy

hyperpolymath commented Apr 20, 2026

Uh oh!

chatgpt-codex-connector Bot commented Apr 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

hyperpolymath commented Apr 20, 2026

Uh oh!

chatgpt-codex-connector Bot commented Apr 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants