ultraswarm v2.3.0 — Claude-Model Token Optimization
Token-optimizes ultraswarm's internal Claude-model usage — the part you actually pay for — without losing quality. Informed by a deep analysis of the skill + router against the state of the art in LLM model routing (RouteLLM, NotDiamond, FrugalGPT cascades, GPT-5 router, Claude effort); the design already matched the dominant patterns, and this release sharpens it.
Changed
- Per-phase routing is now real, not aspirational. Phases 3 (merge) and 4 (report) delegate mechanical work to `Agent({ model: 'haiku' })` subagents (merge escalates to `sonnet` only on conflict). The old "Use Haiku for merge/report" note was inert: inline phases run on the session model (typically Opus) and a skill can't downshift its own main loop, so mechanical work was billed at Opus rates. This is the dominant share of a routine run's ~70–80k tokens.
- High-risk adversarial QA → cost-aware cascade (FrugalGPT-style). Security lens always Opus (asymmetric risk); correctness/regression run Sonnet-first and escalate to Opus only on refute/borderline (<75). Quorum (≥2), score (≥60), and zero-critical-refutation guarantees unchanged. Cuts most of the ~250–550k high-risk path on clean work.
- Trimmed `enhancedImplPrompt` ~in half; the Bash-only wrapper never needed the intelligence scaffolding.
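
The cascade in the second bullet can be sketched roughly as follows. This is an illustration, not ultraswarm's actual code: `reviewLens`, `qaCascade`, the `runLens` callback, and the verdict shape `{ score, refuted, critical }` are all hypothetical names. It also assumes that the score gate (≥60) applies per lens and that quorum counts passing lenses.

```javascript
// Hypothetical sketch of a Sonnet-first QA cascade; none of these names are ultraswarm's API.
const ESCALATE_BELOW = 75; // Sonnet verdicts scoring under this are re-run on Opus
const PASS_SCORE = 60;     // assumed per-lens minimum score to count toward quorum
const QUORUM = 2;          // number of lenses that must pass

// `runLens(lens, model)` is an assumed callback returning { score, refuted, critical }.
function reviewLens(lens, runLens) {
  // Security is never downshifted: a missed vulnerability costs more than the tokens saved.
  if (lens === 'security') {
    return { lens, model: 'opus', ...runLens(lens, 'opus') };
  }
  // Cheap pass first; keep the verdict only if it is a clear, non-refuting pass.
  const first = runLens(lens, 'sonnet');
  if (!first.refuted && first.score >= ESCALATE_BELOW) {
    return { lens, model: 'sonnet', ...first };
  }
  // Refuted or borderline: escalate and let the Opus verdict stand.
  return { lens, model: 'opus', ...runLens(lens, 'opus') };
}

function qaCascade(runLens, lenses = ['security', 'correctness', 'regression']) {
  const verdicts = lenses.map((lens) => reviewLens(lens, runLens));
  const passing = verdicts.filter((v) => !v.refuted && v.score >= PASS_SCORE).length;
  const criticalRefutations = verdicts.filter((v) => v.refuted && v.critical).length;
  return { verdicts, approved: passing >= QUORUM && criticalRefutations === 0 };
}
```

On clean work only the security lens reaches Opus, which is where the saving comes from. Any refutation or borderline score costs one extra Opus call for that lens.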
Added
- Fable 5 as an opt-in ceiling via `intelligence.maxIntelligence` (default off). Flips only the security lens + expert-escalation Opus→Fable. Out of the hot path by default (Fable ≈ +30% tokens + premium price). `fable` is now a valid `claudeModels` value.
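
As a rough illustration of the opt-in ceiling, the sketch below upgrades a model only when the flag is on, the role is one of the two affected, and the baseline is already Opus. The function name `applyCeiling` and the role strings are hypothetical. Only `intelligence.maxIntelligence` and the `fable` model name come from these notes.

```javascript
// Hypothetical sketch; applyCeiling and the role strings are illustrative, not ultraswarm's API.
const CEILING_ROLES = new Set(['security-lens', 'expert-escalation']);

function applyCeiling(role, baseModel, config = {}) {
  // Missing config or a missing flag behaves as "off", matching the documented default.
  const enabled = config.intelligence?.maxIntelligence === true;
  if (enabled && baseModel === 'opus' && CEILING_ROLES.has(role)) {
    return 'fable';
  }
  // Everything else (Haiku merge/report, Sonnet-first lenses, ...) is left untouched.
  return baseModel;
}
```

For example, under these assumptions `applyCeiling('security-lens', 'opus', { intelligence: { maxIntelligence: true } })` yields `'fable'`, while the same call with an empty config yields `'opus'`.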
Fixed
- `router.mjs`: clarified that `complexityThresholds.expert` is a validation ordering anchor only; `getTier` never reads it. Validation message now lists `fable`.
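
To make "ordering anchor" concrete, here is a minimal sketch of the pattern, not the contents of `router.mjs`. The lower tier names (`standard`, `complex`), the numeric values, `validateConfig`, `getTierSketch`, and the assumption that `claudeModels` maps roles to model names are all hypothetical.

```javascript
// Hypothetical sketch of an "ordering anchor" threshold; not the real router.mjs.
const VALID_CLAUDE_MODELS = ['haiku', 'sonnet', 'opus', 'fable'];

function validateConfig({ complexityThresholds: t, claudeModels }) {
  const errors = [];
  // `expert` is read here only, as the upper bound the other thresholds must stay below.
  if (!(t.standard < t.complex && t.complex < t.expert)) {
    errors.push('complexityThresholds must satisfy standard < complex < expert');
  }
  for (const model of Object.values(claudeModels)) {
    if (!VALID_CLAUDE_MODELS.includes(model)) {
      errors.push(`invalid claudeModels value "${model}"; expected one of: ${VALID_CLAUDE_MODELS.join(', ')}`);
    }
  }
  return errors;
}

// Stand-in for a getTier-style lookup: it compares against the lower thresholds only,
// so changing `expert` cannot change the tier that comes back.
function getTierSketch(score, t) {
  if (score < t.standard) return 'simple';
  if (score < t.complex) return 'standard';
  return 'complex';
}
```

In a layout like this, editing `expert` can make validation fail but never changes routing.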
Verification: router 18/18, harness 17/17, validate.sh 11/11.