Stop waiting for Opus on every grep.
93.8% of Claude Code tokens go to Opus unnecessarily. better-model routes tasks to the right model — up to 40% faster AI responses, same code quality.
```shell
npx better-model init
```

You pay for Max or Team Premium. You get Opus on every task. Sounds great — until you notice:
- File search? Opus. 3–5 seconds wait.
- Grep for a function name? Opus. 3–5 seconds wait.
- Write a single test? Opus. 10+ seconds wait.
- Rename a variable? Opus. 10+ seconds wait.
Sonnet handles all of these just as well — in half the time.
| Metric | Opus 4.6 | Sonnet 4.6 | Difference |
|---|---|---|---|
| SWE-bench (coding) | 80.9% | 79.6% | -1.3 points |
| Response speed | baseline | ~1.4x faster | you feel this |
| Rate limit (TPM) | 30K | 90K | 3x headroom |
| GPQA Diamond (reasoning) | 91.3% | 74.1% | Opus wins here |
The gap only matters for architecture, security audits, multi-file refactoring, and novel problem-solving. That's ~20% of tasks. better-model routes the other 80% to where they belong.
1. Run `npx better-model init` in your project.
2. It creates two optimized agents (`sonnet-coder` and `haiku-explorer`), drops a decision matrix into `docs/BETTER-MODEL.md`, adds a CRITICAL routing block to `CLAUDE.md`, and injects `model:` frontmatter into any existing `.claude/agents/` and `.claude/skills/`.
3. Claude Code reads the routing block at session start and dispatches subagent tasks to the right model — Sonnet for coding, Haiku for search, Opus for architecture and code review.
That's it. No dependencies, no proxies, no hooks. Two agents, one decision matrix, correct frontmatter.
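For illustration, an injected agent file might look like this — a minimal sketch, assuming the standard Claude Code agent frontmatter fields; run `init` and inspect `.claude/agents/` to see the actual output:

```yaml
# .claude/agents/sonnet-coder.md (frontmatter only)
---
name: sonnet-coder
description: Default coding subagent   # hypothetical description text
model: sonnet                          # the field better-model injects
---
```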
| Mode | Command | What it does |
|---|---|---|
| Enforcement (default) | `npx better-model init` | Agents + routing block + inject `model:` into agents/skills |
| Soft | `npx better-model init --soft` | Matrix as reference only — no agents, no frontmatter changes |
> **Tip:** In a field test, a Claude Code session read the decision matrix in soft mode and proactively updated agent configs on its own — applying the correct model to all 8 agents and skills without `audit --fix` being run.
| Command | Description |
|---|---|
| `npx better-model init` | Install with enforcement (default) |
| `npx better-model init --soft` | Install soft mode — reference only |
| `npx better-model audit` | Report agents/skills missing model settings |
| `npx better-model audit --fix` | Auto-inject `model:` frontmatter |
| `npx better-model reset` | Remove better-model and restore defaults |
| `npx better-model status` | Check installation status |
The decision matrix organizes tasks into three tiers based on published benchmarks:
**Tier 1 — Haiku (~20% of tasks)**
Codebase exploration, file search, pattern matching. Short, focused subagent tasks that require no reasoning.
Limitation: unreliable beyond ~15 turns. Use only for quick subagent bursts.
**Tier 2 — Sonnet (~60% of tasks)**
The default for most coding: code generation, feature implementation, test writing, simple refactoring (1–2 files), single-file debugging.
Sonnet delivers 98% of Opus's coding quality at 1.4x the speed.
**Tier 3 — Opus (~20% of tasks)**
Reserved for tasks where Sonnet has documented failure modes: multi-file refactoring (3+ files with behavioral dependencies), cross-file debugging, architecture design, security audits, code review, novel algorithm design, large-context analysis (>200K tokens).
The GPQA gap (17.2 points) and ARC-AGI-2 gap (10.5 points) are real — Opus earns its place here.
- Default to Sonnet + medium effort — covers ~60% of tasks.
- Escalate to Opus when the task spans 3+ files, requires expert reasoning, or has security implications.
- Downgrade to Haiku for search and pattern-matching subagents.
- On Sonnet failure, escalate to Opus — don't retry at higher effort. A stronger model at lower effort outperforms a weaker model at higher effort.
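The rules above can be sketched as a small routing function. This is an illustration of the decision matrix, not better-model's actual implementation; the `Task` shape and the 3-file threshold are assumptions drawn from the tiers described here:

```typescript
type Model = "haiku" | "sonnet" | "opus";

interface Task {
  kind: "search" | "coding" | "review" | "architecture";
  filesTouched: number;       // files the task will modify
  securitySensitive: boolean; // e.g. auth, crypto, input handling
  previousModel?: Model;      // set when retrying after a failure
}

function pickModel(task: Task): Model {
  // On Sonnet failure, escalate to Opus rather than retrying harder.
  if (task.previousModel === "sonnet") return "opus";
  // Tier 3: documented Sonnet failure modes go straight to Opus.
  if (
    task.kind === "review" ||
    task.kind === "architecture" ||
    task.securitySensitive ||
    task.filesTouched >= 3
  ) {
    return "opus";
  }
  // Tier 1: short search/pattern-matching bursts go to Haiku.
  if (task.kind === "search") return "haiku";
  // Tier 2: Sonnet is the default for everything else.
  return "sonnet";
}
```

Note the escalation path is one-way: a failed Sonnet task jumps to Opus at default effort, never back down.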
See the full decision matrix for complete details and evidence.
You can absolutely write your own routing rules; better-model is just a well-researched starting point:
- **Evidence-based**: every routing rule cites published benchmarks, not vibes
- **Ships ready-to-use agents**: `sonnet-coder` and `haiku-explorer` with `model:` frontmatter — 100% compliance vs ~70% from CLAUDE.md alone
- **Inference engine**: maps agent names to the right tier automatically (review → Opus, migrate → Opus, scan → Haiku)
- **Maintained**: as models and benchmarks evolve, `npx better-model@latest init` gets you the updated matrix
- **Reversible**: `npx better-model reset` removes everything cleanly
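Name-based tier inference could work roughly like this — a sketch under assumed keyword lists (only `review`, `migrate`, and `scan` are confirmed above; the rest are hypothetical, and the real mapping lives in the installed decision matrix):

```typescript
type Tier = "haiku" | "sonnet" | "opus";

// Assumed keyword lists; the shipped tool may use different ones.
const OPUS_HINTS = ["review", "migrate", "architect", "security", "audit"];
const HAIKU_HINTS = ["scan", "search", "grep", "explore", "find"];

function inferTier(agentName: string): Tier {
  const name = agentName.toLowerCase();
  if (OPUS_HINTS.some((k) => name.includes(k))) return "opus";
  if (HAIKU_HINTS.some((k) => name.includes(k))) return "haiku";
  return "sonnet"; // default tier for general coding agents
}
```

So an agent named `code-reviewer` would land on Opus, `repo-scanner` on Haiku, and anything unrecognized on the safe Sonnet default.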
- SWE-bench — Opus 80.9% vs Sonnet 79.6%
- GPQA Diamond — Opus 91.3% vs Sonnet 74.1%
- ARC-AGI-2 — Opus 28.7% vs Sonnet 18.2%
- SonarSource — AI code security analysis
- CodeRabbit — LLM code review quality
- RouteLLM — model routing research (ICLR)
- Claude Code #27665 — real token usage data
- Anthropic docs — official model specs
```shell
npx better-model init
```

Then start a Claude Code session. Watch it pick Sonnet for your next grep.
```shell
npx better-model reset && npx better-model@latest init
```

This replaces the old single-line reference with the new routing block and installs the agents.
Found it useful? Star the repo — it helps others find it.
Found a bug? Open an issue.
Want to improve the matrix? See CONTRIBUTING.md.
- Node.js 18+
- A project using Claude Code