ultraswarm v2.3.0 — Claude-Model Token Optimization

@fubak fubak released this 12 Jun 14:59
· 69 commits to main since this release
fc65762

This release token-optimizes ultraswarm's internal Claude-model usage, the part you actually pay for, without losing quality. It is informed by a deep analysis of the skill and router against the state of the art in LLM model routing (RouteLLM, NotDiamond, FrugalGPT cascades, the GPT-5 router, Claude effort). The design already matched the dominant patterns; this release sharpens it.

Changed

  • Per-phase routing is now real, not aspirational. Phases 3 (merge) and 4 (report) delegate mechanical work to Agent({ model: 'haiku' }) subagents (merge escalates to sonnet only on conflict). The old "Use Haiku for merge/report" note was inert: inline phases run on the session model (typically Opus) and a skill can't downshift its own main loop, so mechanical work was billed at Opus rates. That mechanical work is the dominant share of a routine run's ~70–80k tokens.
  • High-risk adversarial QA → cost-aware cascade (FrugalGPT-style). The security lens always runs on Opus (asymmetric risk); the correctness and regression lenses run Sonnet-first and escalate to Opus only on a refutation or a borderline score (<75). The quorum (≥2), score (≥60), and zero-critical-refutation guarantees are unchanged. This cuts most of the ~250–550k-token high-risk path on clean work.
  • Trimmed enhancedImplPrompt roughly in half; the Bash-only wrapper never needed the intelligence scaffolding.
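
The cascade described in the adversarial-QA item can be sketched roughly as follows. This is a minimal illustration, not ultraswarm's code: `reviewLens`, `adversarialQA`, the `runLens` callback, and the verdict shape `{ score, refuted, critical }` are all hypothetical names, and the way quorum, score, and critical refutations combine into a pass/fail result is an assumption. Only the thresholds (75, 60, 2) and the model choices come from the notes above.

```javascript
// Hypothetical sketch of a FrugalGPT-style QA cascade. All names are illustrative.
const BORDERLINE = 75; // a Sonnet score below this escalates to Opus
const MIN_SCORE = 60;  // minimum score for a lens to count toward quorum
const QUORUM = 2;      // minimum number of passing lenses

// runLens(lens, model) is caller-supplied and returns { score, refuted, critical }.
// It is synchronous here for brevity; a real reviewer call would likely be async.
function reviewLens(lens, runLens) {
  if (lens === 'security') {
    // Asymmetric risk: the security lens never takes the cheap path.
    return { lens, model: 'opus', ...runLens(lens, 'opus') };
  }
  const first = runLens(lens, 'sonnet');
  if (!first.refuted && first.score >= BORDERLINE) {
    return { lens, model: 'sonnet', ...first };
  }
  // Refuted or borderline: re-run the same lens on the stronger model.
  return { lens, model: 'opus', ...runLens(lens, 'opus') };
}

function adversarialQA(runLens, lenses = ['security', 'correctness', 'regression']) {
  const verdicts = lenses.map((lens) => reviewLens(lens, runLens));
  const passing = verdicts.filter((v) => !v.refuted && v.score >= MIN_SCORE).length;
  const criticalRefutation = verdicts.some((v) => v.refuted && v.critical);
  // Assumed combination rule: quorum met and no critical refutation.
  return { verdicts, passed: passing >= QUORUM && !criticalRefutation };
}
```

On clean work, where every Sonnet verdict clears the borderline threshold, this shape makes one Opus call (security) and two Sonnet calls, which is where the savings on the high-risk path would come from.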

Added

  • Fable 5 as an opt-in ceiling via intelligence.maxIntelligence (default off). When enabled, it flips only the security lens and expert escalation from Opus to Fable. Fable stays out of the hot path by default (Fable ≈ +30% tokens plus premium pricing). fable is now a valid claudeModels value.
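
As a rough illustration of the opt-in ceiling, the sketch below promotes Opus to Fable only for the two roles named above. The function `resolveModel`, the role strings, and the assumption that intelligence.maxIntelligence is a boolean flag are hypothetical; the release notes do not specify the setting's type or how the router applies it.

```javascript
// Hypothetical sketch: role names and the boolean flag shape are assumptions.
const FABLE_ELIGIBLE = new Set(['security-lens', 'expert-escalation']);

function resolveModel(role, baseModel, config = {}) {
  // Default off: a missing or false flag leaves every model untouched.
  const ceilingOn = config.intelligence?.maxIntelligence === true;
  if (ceilingOn && baseModel === 'opus' && FABLE_ELIGIBLE.has(role)) {
    return 'fable';
  }
  return baseModel;
}

// Example usage:
// resolveModel('security-lens', 'opus', { intelligence: { maxIntelligence: true } })
// resolveModel('merge', 'haiku', { intelligence: { maxIntelligence: true } })
```

Keeping the promotion keyed on both role and base model means cheaper phases such as merge and report are never affected by the flag.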

Fixed

  • router.mjs: clarified that complexityThresholds.expert is a validation ordering anchor only — getTier never reads it. Validation message now lists fable.
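
To show what "validation ordering anchor only" can mean in practice, here is a minimal sketch under stated assumptions. The threshold keys `simple` and `moderate`, the tier names, and both function bodies are hypothetical; only complexityThresholds.expert and getTier are named in the notes, and this is not the actual router.mjs code.

```javascript
// Hypothetical sketch: keys other than `expert` and all tier names are assumptions.
function validateThresholds(t) {
  const order = ['simple', 'moderate', 'expert'];
  for (let i = 1; i < order.length; i++) {
    const lower = order[i - 1];
    const upper = order[i];
    if (!(t[lower] < t[upper])) {
      throw new Error(
        'complexityThresholds.' + lower + ' must be below complexityThresholds.' + upper
      );
    }
  }
  return true;
}

function getTier(score, t) {
  // `t.expert` is never read here: the top tier is simply "everything else".
  if (score < t.simple) return 'simple';
  if (score < t.moderate) return 'moderate';
  return 'expert';
}
```

In this shape, changing `expert` can make validation fail but can never change which tier a score lands in, which matches the clarified behavior.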

Verification: router 18/18, harness 17/17, validate.sh 11/11.