Headline: the complete automatic template-optimization loop for SkillOpt, plus runtime-reliability and orchestration improvements. Every new SkillOpt capability is off by default, additive, and keeps promotion manual.
SkillOpt — automatic template optimization (new, off by default)
Agent templates can now self-improve from real usage, feeding the existing gitmoot-skillopt optimizer — no weight training, manual promotion preserved.
- Mode A — outcome harvest (#465/#468): verifiable job outcomes (merge/CI/review/revert) become auto-trace feedback; automatic revert detection flips earlier positives (#467); deterministic tool checkers (dupl/jscpd/golangci-lint/gocyclo/diff-size) add objective dimensions (#485).
- Cross-family review signal (#469/#470): a different-model-family reviewer scores quality + scope-fidelity as a soft, secondary signal.
- Mode B — champion/challenger (#473): Thompson-sampling bandit + gitmoot skillopt ab, live-traffic A/B interception for ask agents (#482), and a cross-family LLM-judge auto-pairwise (#483).
- Promotion (#471/#472): configurable auto-promote with a confidence guardrail + candidate.* notifications; canary + auto-rollback (#484) routes a sampled fraction of traffic to a canary and rolls back on regression.
- Judge hardening (epic #344): cross-family judge jury with median/majority/minority-veto + disagreement flag (#349); live pairwise evaluation — a blinded paired review packet (gitmoot-skillopt v0.4.0) ingested into canonical feedback (#508). (Trajectory digest for the judge ships in gitmoot-skillopt v0.4.0.)
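
For readers unfamiliar with the Thompson-sampling bandit behind Mode B, here is a minimal sketch of the general technique, not gitmoot's actual implementation. It assumes each template variant has a binary success/failure outcome and a Beta posterior; the names BetaArm, choose, and simulate, and the success rates in the usage note, are hypothetical.

```python
import random


class BetaArm:
    """One template variant with a Beta(wins + 1, losses + 1) posterior (assumed model)."""

    def __init__(self, name):
        self.name = name
        self.wins = 0
        self.losses = 0

    def sample(self, rng):
        # Draw a plausible success rate from the current posterior.
        return rng.betavariate(self.wins + 1, self.losses + 1)

    def update(self, success):
        if success:
            self.wins += 1
        else:
            self.losses += 1


def choose(arms, rng):
    """Thompson sampling: pick the arm whose sampled success rate is highest."""
    return max(arms, key=lambda arm: arm.sample(rng))


def simulate(true_rates, rounds, seed=0):
    """Route `rounds` jobs between variants; return how many jobs each received.

    `true_rates` maps a hypothetical variant name to its real success probability.
    """
    rng = random.Random(seed)
    arms = [BetaArm(name) for name in true_rates]
    pulls = {name: 0 for name in true_rates}
    for _ in range(rounds):
        arm = choose(arms, rng)
        pulls[arm.name] += 1
        arm.update(rng.random() < true_rates[arm.name])
    return pulls
```

Calling simulate({"champion": 0.3, "challenger": 0.8}, 2000) should send most traffic to the stronger variant while still occasionally sampling the weaker one, which is why a bandit suits live champion/challenger comparisons.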
Runtime reliability
- Claude transient-401 ("socket connection closed") retry with exponential backoff (#487/#509).
- Bounded re-ask loop on malformed agent output instead of a hard failure (#495).
- More reliable read-only agent ask (no implement-guard block, clear timeout) (#496).
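
The retry behavior above follows the usual exponential-backoff pattern. The sketch below illustrates that pattern under stated assumptions and is not the code shipped in #487/#509: TransientError, backoff_delays, retry, and every default (attempt count, base delay, factor, cap, jitter range) are hypothetical.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for a retryable failure such as a transient 401 (hypothetical)."""


def backoff_delays(attempts, base=0.5, factor=2.0, cap=8.0):
    """Return the un-jittered delay before each retry: base * factor**i, capped."""
    return [min(cap, base * factor**i) for i in range(attempts)]


def retry(call, attempts=4, base=0.5, factor=2.0, cap=8.0,
          sleep=time.sleep, jitter=random.random):
    """Run `call`, retrying on TransientError up to `attempts` total tries.

    The loop is bounded: the final failure is re-raised instead of retried.
    `sleep` and `jitter` are injectable so the behavior can be tested quickly.
    """
    delays = backoff_delays(attempts - 1, base, factor, cap)
    for i in range(attempts):
        try:
            return call()
        except TransientError:
            if i == attempts - 1:
                raise
            # Scale each delay into [50%, 100%] of its nominal value.
            sleep(delays[i] * (0.5 + 0.5 * jitter()))
```

The same bounded-loop shape applies to the re-ask on malformed output: retry a fixed number of times, then surface the failure rather than looping indefinitely.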
Orchestration & dashboard
- --recipe flag for deterministic fan-out (#477).
- Delegation worktree + branch cleanup (#492); job-kill releases stranded locks and terminalizes queued children (#491).
- Dashboard surfaces in-flight jobs as an active_jobs view (#505).
Tests
- Full-chain E2Es: canary lifecycle (#504), Mode A harvest (#465), and a cross-repo pairwise round-trip (#514).
Pairs with gitmoot-skillopt v0.4.0. Upgrade: gitmoot update.