Skip to content

v1.3.0 — Swarm operations hardening

Choose a tag to compare

@cgbarlow cgbarlow released this 11 Jun 23:23
· 2386 commits to main since this release
44f3c89

The release between the chain and the compounding: everything the first real decomposition runs taught us, shipped as machinery.

Proof-surface policy

  • ADR-013 — model/effort policy: prove and decompose calls default to the most capable model (fable), env-overridable, fail-soft on older CLIs.
  • ADR-015 — progressive effort escalation: attempts climb highxhighmax (default budget 3, one per rung; decomposition always at the top rung). Deep reasoning is spent only on statements that resisted a cheaper pass — in production this release's lemmas closed overwhelmingly at the first rung.

Compounding

  • ADR-014 — cross-goal dependency reuse: a goal's proved dependencies (declared deps + its own decomposition's subs) surface in the prove prompt as importable Unsorry.* modules. Merged lemmas now make later proofs cheaper by mechanism, not aspiration.

Resilience (born from three real quota outages in one day)

  • ADR-016 — infrastructure-failure guard: a claude call that dies fast and fails a cheap health probe is infrastructure, not goal evidence — claim released, no demote, no decompose, agent exits 3. Queue state now only ever encodes model evidence about goals.
  • ADR-017 — swarm supervisor: swarm/supervise.sh drives a goal tree to closure across outages (exponential backoff on exit 3), cycle failures, and merge latency; PR hygiene closes duplicate prove PRs and loudly flags CONFLICTING ones (GitHub runs no checks on a conflicted PR — an armed auto-merge would otherwise wait forever in silence). Selection skips goals with an open prove PR.

Integrity fixes

  • Decomposition and index records reference statements by content address — raw Lean statements contain braces the record grammar reserves; Gate B recomputes shas from the goal files (strictly stronger integrity).

Repo hygiene

  • 11-label PR taxonomy, deterministically classified from machine-generated titles, auto-applied in CI; all historical PRs retro-labelled.
  • Mathlib upstream plan (thread C): machine-prepared packets, human-sponsored PRs, autonomous mathlib PRs an explicit non-goal.

32 hermetic agent self-tests + 3 supervisor self-tests; shellcheck-clean; 258 pytest.

Next: v1.4.0 ships Thread A — the forced decompose → prove-subs → recompose chain carrying a real proof end-to-end (11/13 of the target tree already merged). v1.5.0 ships Thread B — merged work reused as an importable dependency.