talkstream/better-model

better-model

Stop waiting for Opus on every grep.

93.8% of Claude Code tokens go to Opus unnecessarily. better-model routes tasks to the right model — up to 40% faster AI responses, same code quality.

npx better-model init

The problem

You pay for Max or Team Premium. You get Opus on every task. Sounds great — until you notice:

  • File search? Opus. 3–5 seconds wait.
  • Grep for a function name? Opus. 3–5 seconds wait.
  • Write a single test? Opus. 10+ seconds wait.
  • Rename a variable? Opus. 10+ seconds wait.

Sonnet handles all of these just as well, at roughly 1.4x the speed (about 70% of the wait).

| Metric | Opus 4.6 | Sonnet 4.6 | Difference |
|---|---|---|---|
| SWE-bench (coding) | 80.9% | 79.6% | -1.3 points |
| Response speed | baseline | ~1.4x faster | you feel this |
| Rate limit (TPM) | 30K | 90K | 3x headroom |
| GPQA Diamond (reasoning) | 91.3% | 74.1% | Opus wins here |

The gap only matters for architecture, security audits, multi-file refactoring, and novel problem-solving. That's ~20% of tasks. better-model routes the other 80% to where they belong.
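
The derived figures quoted in this README can be rechecked with a few lines of arithmetic. The sketch below only restates the table's numbers; the variable names are illustrative and are not part of better-model.

```python
# Figures copied from the comparison table above.
opus = {"swe_bench": 80.9, "gpqa_diamond": 91.3, "tpm": 30_000}
sonnet = {"swe_bench": 79.6, "gpqa_diamond": 74.1, "tpm": 90_000}
SPEEDUP = 1.4  # Sonnet's approximate speed relative to Opus

# Gaps in percentage points (Opus minus Sonnet).
swe_gap = round(opus["swe_bench"] - sonnet["swe_bench"], 1)
gpqa_gap = round(opus["gpqa_diamond"] - sonnet["gpqa_diamond"], 1)

# Sonnet's SWE-bench score as a percentage of Opus's score.
coding_ratio = round(100 * sonnet["swe_bench"] / opus["swe_bench"], 1)

# Rate-limit headroom and Sonnet's wait as a percentage of Opus's wait.
tpm_headroom = sonnet["tpm"] / opus["tpm"]
relative_wait = round(100 / SPEEDUP)

print(swe_gap, gpqa_gap, coding_ratio, tpm_headroom, relative_wait)
```

The coding ratio is where the "98% of Opus's coding quality" figure comes from, and the GPQA gap is the 17.2 points cited for Tier 3.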

How it works

Step 1. Run npx better-model init in your project.

Step 2. It creates two optimized agents (sonnet-coder and haiku-explorer), drops a decision matrix into docs/BETTER-MODEL.md, adds a CRITICAL routing block to CLAUDE.md, and injects model: frontmatter into any existing .claude/agents/ and .claude/skills/.

Step 3. Claude Code reads the routing block at session start and dispatches subagent tasks to the right model — Sonnet for coding, Haiku for search, Opus for architecture and code review.

That's it. No dependencies, no proxies, no hooks. Two agents, one decision matrix, correct frontmatter.
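
To make the frontmatter step concrete, here is a minimal Python sketch of adding a model: key to an agent file's frontmatter. It is an illustration under assumptions, not better-model's actual implementation: the function name inject_model is hypothetical, and the sketch assumes frontmatter is delimited by --- lines at the top of the file.

```python
def inject_model(markdown: str, model: str) -> str:
    """Return `markdown` with a `model:` key in its frontmatter.

    An existing `model:` line is left untouched. A file without a
    closed frontmatter block gets a new block holding only the key.
    (Hypothetical helper; not taken from better-model's source.)
    """
    lines = markdown.split("\n")
    if lines and lines[0].strip() == "---":
        # Look for the closing delimiter of the frontmatter block.
        end = None
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                end = i
                break
        if end is not None:
            header = lines[1:end]
            if any(line.startswith("model:") for line in header):
                return markdown  # already configured, leave as-is
            # Insert the key just before the closing delimiter.
            return "\n".join(lines[:end] + [f"model: {model}"] + lines[end:])
    # No usable frontmatter: prepend a new block.
    return f"---\nmodel: {model}\n---\n{markdown}"


agent = "---\nname: reviewer\ndescription: Reviews code\n---\nYou review code.\n"
print(inject_model(agent, "opus"))
```

Leaving an existing model: line alone keeps the operation idempotent, so running it twice does not change the file again.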

Two modes

| Mode | Command | What it does |
|---|---|---|
| Enforcement (default) | npx better-model init | Agents + routing block + inject model: into agents/skills |
| Soft | npx better-model init --soft | Matrix as reference only: no agents, no frontmatter changes |

Tip

In a field test, a Claude Code session read the decision matrix in soft mode and proactively updated agent configs on its own — applying the correct model to all 8 agents and skills without audit --fix being run.

Commands

| Command | Description |
|---|---|
| npx better-model init | Install with enforcement (default) |
| npx better-model init --soft | Install soft mode (reference only) |
| npx better-model audit | Report agents/skills missing model settings |
| npx better-model audit --fix | Auto-inject model frontmatter |
| npx better-model reset | Remove better-model and restore defaults |
| npx better-model status | Check installation status |
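
As a rough picture of what an audit pass involves, the Python sketch below lists Markdown files under .claude/agents/ and .claude/skills/ whose frontmatter has no model: key. This is an assumption-laden illustration, not the tool's code: the helper names are hypothetical, and it assumes every agent or skill definition is a .md file with ----delimited frontmatter.

```python
from pathlib import Path


def has_model_key(markdown: str) -> bool:
    """True if the leading frontmatter block contains a `model:` line."""
    lines = markdown.split("\n")
    if not lines or lines[0].strip() != "---":
        return False  # no frontmatter at all
    for line in lines[1:]:
        if line.strip() == "---":
            return False  # block closed without a model key
        if line.startswith("model:"):
            return True
    return False


def files_missing_model(project: Path) -> list[str]:
    """List agent/skill files (relative paths) that lack a model setting."""
    missing = []
    for sub in (".claude/agents", ".claude/skills"):
        folder = project / sub
        if not folder.is_dir():
            continue
        for path in sorted(folder.rglob("*.md")):
            if not has_model_key(path.read_text(encoding="utf-8")):
                missing.append(path.relative_to(project).as_posix())
    return missing


if __name__ == "__main__":
    for name in files_missing_model(Path(".")):
        print("missing model:", name)
```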

The algorithm

The decision matrix organizes tasks into three tiers based on published benchmarks:

Tier 1 — Haiku (~20% of tasks)

Codebase exploration, file search, pattern matching. Short, focused subagent tasks that require no reasoning.

Limitation: unreliable beyond ~15 turns. Use only for quick subagent bursts.

Tier 2 — Sonnet (~60% of tasks)

The default for most coding: code generation, feature implementation, test writing, simple refactoring (1–2 files), single-file debugging.

Sonnet delivers 98% of Opus's coding quality at 1.4x the speed.

Tier 3 — Opus (~20% of tasks)

Reserved for tasks where Sonnet has documented failure modes: multi-file refactoring (3+ files with behavioral dependencies), cross-file debugging, architecture design, security audits, code review, novel algorithm design, large-context analysis (>200K tokens).

The GPQA gap (17.2 points) and ARC-AGI-2 gap (10.5 points) are real — Opus earns its place here.

Key rules

  1. Default to Sonnet + medium effort — covers ~60% of tasks.
  2. Escalate to Opus when the task spans 3+ files, requires expert reasoning, or has security implications.
  3. Downgrade to Haiku for search and pattern-matching subagents.
  4. On Sonnet failure, escalate to Opus — don't retry at higher effort. A stronger model at lower effort outperforms a weaker model at higher effort.
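
The four rules above can be sketched as a small routing function. This is a hedged Python illustration: the Task fields, the SEARCH_KINDS set, and the choice to check escalation before everything else are assumptions made for the example, and in practice the routing is carried out by Claude Code reading the routing block, not by code like this.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task description used only for this sketch."""
    kind: str  # e.g. "search", "codegen", "architecture"
    files_touched: int = 1
    needs_expert_reasoning: bool = False
    security_sensitive: bool = False


# Assumed labels for search and pattern-matching work.
SEARCH_KINDS = {"search", "explore", "pattern-match"}


def pick_model(task: Task) -> str:
    # Rule 2: escalate for 3+ files, expert reasoning, or security.
    if (task.files_touched >= 3
            or task.needs_expert_reasoning
            or task.security_sensitive):
        return "opus"
    # Rule 3: downgrade search and pattern-matching subagents.
    if task.kind in SEARCH_KINDS:
        return "haiku"
    # Rule 1: everything else defaults to Sonnet.
    return "sonnet"


def model_after_failure(failed_model: str) -> str:
    # Rule 4: after a Sonnet failure, switch to Opus rather than
    # retrying Sonnet at higher effort. The rule only covers Sonnet,
    # so other models are returned unchanged in this sketch.
    return "opus" if failed_model == "sonnet" else failed_model


print(pick_model(Task("codegen")))
print(pick_model(Task("refactor", files_touched=4)))
print(model_after_failure("sonnet"))
```

Checking the escalation conditions first means a security-sensitive search still goes to Opus; that ordering is a design choice of this sketch, not something the rules spell out.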

See the full decision matrix for complete details and evidence.

Why not just write CLAUDE.md rules yourself?

You can! better-model is just a well-researched starting point:

  • Evidence-based: every routing rule cites published benchmarks, not vibes
  • Ships ready-to-use agents: sonnet-coder and haiku-explorer with model: frontmatter — 100% compliance vs ~70% from CLAUDE.md alone
  • Inference engine: maps agent names to the right tier automatically (review → Opus, migrate → Opus, scan → Haiku)
  • Maintained: as models and benchmarks evolve, npx better-model@latest init gets you the updated matrix
  • Reversible: npx better-model reset removes everything cleanly
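
The name-based inference mentioned in the list above might look roughly like the following Python sketch. Only the three keyword mappings (review, migrate, scan) come from this README; the substring matching, the Sonnet fallback, and the function name infer_model are assumptions for illustration.

```python
# Keyword-to-model pairs taken from the examples in this README.
KEYWORD_MODELS = [
    ("review", "opus"),
    ("migrate", "opus"),
    ("scan", "haiku"),
]

# Assumed fallback, in line with "Default to Sonnet" in the key rules.
DEFAULT_MODEL = "sonnet"


def infer_model(agent_name: str) -> str:
    """Guess a model for an agent from keywords in its name."""
    name = agent_name.lower()
    for keyword, model in KEYWORD_MODELS:
        if keyword in name:
            return model
    return DEFAULT_MODEL


for agent in ("code-reviewer", "db-migrate", "dependency-scanner", "test-writer"):
    print(agent, "->", infer_model(agent))
```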

Evidence base

Get started

npx better-model init

Then start a Claude Code session. Watch it pick Sonnet for your next grep.

Upgrading from v0.4.x

npx better-model reset && npx better-model@latest init

This replaces the old single-line reference with the new routing block and installs the agents.


Found it useful? Star the repo — it helps others find it.

Found a bug? Open an issue.

Want to improve the matrix? See CONTRIBUTING.md.

Requirements

License

MIT
