Stop waiting for Opus on every grep.
93.8% of Claude Code tokens go to Opus unnecessarily. better-model routes tasks to the right model — up to 40% faster AI responses, same code quality.
```shell
npx better-model init
```

You pay for Max or Team Premium. You get Opus on every task. Sounds great — until you notice:
- File search? Opus. 3–5 seconds wait.
- Grep for a function name? Opus. 3–5 seconds wait.
- Write a single test? Opus. 10+ seconds wait.
- Rename a variable? Opus. 10+ seconds wait.
Sonnet handles all of these just as well — in half the time.
| Metric | Opus 4.6 | Sonnet 4.6 | Difference |
|---|---|---|---|
| SWE-bench (coding) | 80.9% | 79.6% | -1.3 points |
| Response speed | baseline | ~1.4x faster | you feel this |
| Rate limit (TPM) | 30K | 90K | 3x headroom |
| GPQA Diamond (reasoning) | 91.3% | 74.1% | Opus wins here |
The gap only matters for architecture, security audits, multi-file refactoring, and novel problem-solving. That's ~20% of tasks. better-model routes the other 80% to where they belong.
1. Run `npx better-model init` in your project.
2. It creates two optimized agents (`sonnet-coder` and `haiku-explorer`), drops a decision matrix into `docs/BETTER-MODEL.md`, adds a CRITICAL routing block to `CLAUDE.md`, and injects `model:` frontmatter into any existing `.claude/agents/` and `.claude/skills/`.
3. Claude Code reads the routing block at session start and dispatches subagent tasks to the right model — Sonnet for coding, Haiku for search, Opus for architecture and code review.
That's it. No dependencies, no proxies, no hooks. Two agents, one decision matrix, correct frontmatter.
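For illustration, an injected agent file might look like this — a minimal sketch, assuming the standard Claude Code agent frontmatter fields; run `init` and inspect `.claude/agents/` to see the actual output:

```yaml
# .claude/agents/sonnet-coder.md (frontmatter only)
---
name: sonnet-coder
description: Default coding subagent   # hypothetical description text
model: sonnet                          # the field better-model injects
---
```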
| Mode | Command | What it does |
|---|---|---|
| Enforcement (default) | `npx better-model init` | Agents + routing block + inject `model:` into agents/skills |
| Soft | `npx better-model init --soft` | Matrix as reference only — no agents, no frontmatter changes |
> **Tip:** In a field test, a Claude Code session read the decision matrix in soft mode and proactively updated agent configs on its own — applying the correct model to all 8 agents and skills without `audit --fix` being run.
| Command | Description |
|---|---|
| `npx better-model init` | Install with enforcement (default) |
| `npx better-model init --soft` | Install soft mode — reference only |
| `npx better-model audit` | Report agents/skills missing model settings |
| `npx better-model audit --fix` | Auto-inject `model:` frontmatter |
| `npx better-model reset` | Remove better-model and restore defaults |
| `npx better-model status` | Check installation status |
The decision matrix organizes tasks into three tiers based on published benchmarks:
**Tier 1 — Haiku (~20% of tasks)**
Codebase exploration, file search, pattern matching. Short, focused subagent tasks that require no reasoning.
Limitation: unreliable beyond ~15 turns. Use only for quick subagent bursts.
**Tier 2 — Sonnet (~60% of tasks)**
The default for most coding: code generation, feature implementation, test writing, simple refactoring (1–2 files), single-file debugging.
Sonnet delivers 98% of Opus's coding quality at 1.4x the speed.
**Tier 3 — Opus (~20% of tasks)**
Reserved for tasks where Sonnet has documented failure modes: multi-file refactoring (3+ files with behavioral dependencies), cross-file debugging, architecture design, security audits, code review, novel algorithm design, large-context analysis (>200K tokens).
The GPQA gap (17.2 points) and ARC-AGI-2 gap (10.5 points) are real — Opus earns its place here.
- Default to Sonnet + medium effort — covers ~60% of tasks.
- Escalate to Opus when the task spans 3+ files, requires expert reasoning, or has security implications.
- Downgrade to Haiku for search and pattern-matching subagents.
- On Sonnet failure, escalate to Opus — don't retry at higher effort. A stronger model at lower effort outperforms a weaker model at higher effort.
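The rules above can be sketched as a small routing function. This is an illustration of the decision matrix, not better-model's actual implementation; the `Task` shape and the 3-file threshold are assumptions drawn from the tiers described here:

```typescript
type Model = "haiku" | "sonnet" | "opus";

interface Task {
  kind: "search" | "coding" | "review" | "architecture";
  filesTouched: number;       // files the task will modify
  securitySensitive: boolean; // e.g. auth, crypto, input handling
  previousModel?: Model;      // set when retrying after a failure
}

function pickModel(task: Task): Model {
  // On Sonnet failure, escalate to Opus rather than retrying harder.
  if (task.previousModel === "sonnet") return "opus";
  // Tier 3: documented Sonnet failure modes go straight to Opus.
  if (
    task.kind === "review" ||
    task.kind === "architecture" ||
    task.securitySensitive ||
    task.filesTouched >= 3
  ) {
    return "opus";
  }
  // Tier 1: short search/pattern-matching bursts go to Haiku.
  if (task.kind === "search") return "haiku";
  // Tier 2: Sonnet is the default for everything else.
  return "sonnet";
}
```

Note the escalation path is one-way: a failed Sonnet task jumps to Opus at default effort, never back down.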
See the full decision matrix for complete details and evidence.
You can absolutely write your own routing rules; better-model is just a well-researched starting point:
- **Evidence-based**: every routing rule cites published benchmarks, not vibes
- **Ships ready-to-use agents**: `sonnet-coder` and `haiku-explorer` with `model:` frontmatter — 100% compliance vs ~70% from CLAUDE.md alone
- **Inference engine**: maps agent names to the right tier automatically (review → Opus, migrate → Opus, scan → Haiku)
- **Maintained**: as models and benchmarks evolve, `npx better-model@latest init` gets you the updated matrix
- **Reversible**: `npx better-model reset` removes everything cleanly
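Name-based tier inference could work roughly like this — a sketch under assumed keyword lists (only `review`, `migrate`, and `scan` are confirmed above; the rest are hypothetical, and the real mapping lives in the installed decision matrix):

```typescript
type Tier = "haiku" | "sonnet" | "opus";

// Assumed keyword lists; the shipped tool may use different ones.
const OPUS_HINTS = ["review", "migrate", "architect", "security", "audit"];
const HAIKU_HINTS = ["scan", "search", "grep", "explore", "find"];

function inferTier(agentName: string): Tier {
  const name = agentName.toLowerCase();
  if (OPUS_HINTS.some((k) => name.includes(k))) return "opus";
  if (HAIKU_HINTS.some((k) => name.includes(k))) return "haiku";
  return "sonnet"; // default tier for general coding agents
}
```

So an agent named `code-reviewer` would land on Opus, `repo-scanner` on Haiku, and anything unrecognized on the safe Sonnet default.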
- SWE-bench — Opus 80.9% vs Sonnet 79.6%
- GPQA Diamond — Opus 91.3% vs Sonnet 74.1%
- ARC-AGI-2 — Opus 28.7% vs Sonnet 18.2%
- SonarSource — AI code security analysis
- CodeRabbit — LLM code review quality
- RouteLLM — model routing research (ICLR)
- Claude Code #27665 — real token usage data
- Anthropic docs — official model specs
```shell
npx better-model init
```

Then start a Claude Code session. Watch it pick Sonnet for your next grep.
```shell
npx better-model reset && npx better-model@latest init
```

This replaces the old single-line reference with the new routing block and installs the agents.
Found it useful? Star the repo — it helps others find it.
Found a bug? Open an issue.
Want to improve the matrix? See CONTRIBUTING.md.
- Node.js 18+
- A project using Claude Code