
feat(presets): add 'optimal' preset (multi-provider speed-first routing)#295

Merged
Destynova2 merged 1 commit into main from feat/optimal-preset on Apr 27, 2026
Conversation

@Destynova2
Contributor

Summary

  • Adds a new shipped preset, optimal.toml, alongside the existing cheap/fast/perf/medium/local presets.
  • Targets the best price / perf / quality ratio for Claude Code users with Anthropic Max, validated by real benchmarks (2026-04-26).
  • Wires 6 providers (Anthropic OAuth Max, OpenRouter, DeepSeek direct, Groq, Mercury, xAI), 8 virtual models, and 9 routing tiers.
  • xAI Grok provider stub is included (enabled = false) with commented [[models.mappings]] blocks ready to uncomment after grob secrets add xai.
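As a rough sketch of what the disabled stub could look like (table and key names here are illustrative, not copied from presets/optimal.toml):

```toml
# Hypothetical shape of the xAI provider stub; only the facts that it is
# shipped disabled and that the [[models.mappings]] blocks are commented
# out come from this PR.
[providers.xai]
enabled = false            # flip to true after `grob secrets add xai`
base_url = "https://api.x.ai/v1"

# [[models.mappings]]      # uncomment once the key is configured
# virtual = "grok-code"
# provider = "xai"
# model = "grok-code-fast-1"
```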

Routing strategy

| Slot | Primary | Fallbacks |
|---|---|---|
| trivial / background | Groq gpt-oss-20b (960 t/s measured) | Anthropic Haiku → OR Haiku → Mercury 2 |
| default | DeepSeek V4-Flash (81% SWE-V, 1M ctx) | V3.2 → Groq gpt-oss-120b → Sonnet 4.6 |
| think | Claude Opus 4.7 (87.6% SWE-V, 1549 LMArena coding) | V4-Pro (80.6%) → GPT-5.5 → Sonnet 4.6 |
| search | Claude Sonnet 4.6 (tool-use, OAuth free) | OR Sonnet → Gemini 3.1 Pro |
| Rust / k8s / Terraform | Anthropic only (quality on unsafe/HCL/YAML) | |
| long context (>150k) | OR / DeepSeek (1M+ ctx) | |
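A routing slot with an ordered fallback chain might be expressed roughly like this (key names are hypothetical; the model order is taken from the think row above):

```toml
# Illustrative sketch only -- the real slot/key names in
# presets/optimal.toml may differ.
[router.think]
primary = "anthropic/claude-opus-4.7"
fallbacks = [
  "deepseek/v4-pro",
  "openai/gpt-5.5",
  "anthropic/claude-sonnet-4.6",
]
```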

Tier matchers cover max_tokens_below = 500 (trivial), min_input_tokens thresholds (30k / 150k), file-pattern + keyword combos for Rust, Terraform, Kubernetes, CI/CD YAML, and min_messages = 50 for big-history conversations.
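The matcher keys named above (max_tokens_below, min_input_tokens, min_messages) could combine into tiers along these lines; tier names and table structure here are illustrative, not quoted from the preset:

```toml
# Sketch of tier matchers, assuming one table per tier.
[[tiers]]
name = "trivial"
max_tokens_below = 500     # tiny completions go to the fastest slot

[[tiers]]
name = "long-context"
min_input_tokens = 150000  # route huge prompts to 1M-ctx providers

[[tiers]]
name = "big-history"
min_messages = 50          # long-running conversations
```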

Bench data baked into the preset (2026-04-26)

| Model | t/s (effective) | RTT, 200 tok |
|---|---|---|
| Groq openai/gpt-oss-20b | 960 | 364 ms |
| Groq openai/gpt-oss-120b | 472 | 643 ms |
| Groq llama-3.1-8b-instant | 793 | 377 ms |
| Mercury 2 (Inception Labs) | 155 (reasoning overhead) | 1250 ms |
| DeepSeek V4-Flash (OR) | ~83 | 743 ms |

Mercury's vendor-claimed 1109 t/s drops to ~155 t/s effective because the model is reasoning-heavy. It stays as a last-resort fallback.

Test plan

  • python3 -c "import tomllib; tomllib.loads(open('presets/optimal.toml').read())" — TOML parses
  • grob preset info optimal — preset lists 6 providers, 8 models, all router slots wired
  • grob preset apply optimal --dry-run — emits valid normalized TOML
  • No hardcoded version strings (grep -E '\bv[0-9]+\.[0-9]+\.[0-9]+\b' returns no matches)
  • Docs lint passes in CI (markdownlint + lychee)
  • Manual smoke after merge: grob preset apply optimal --reload then a small request hits Groq via tier:trivial

🤖 Generated with Claude Code

New shipped preset alongside cheap/fast/perf/medium/local. Targets the
best price/perf/quality ratio for Claude Code users with Anthropic Max
subscription, validated by real benchmarks.

Strategy:
- Trivial / background : Groq gpt-oss-20b (LPU, 960 t/s measured)
- Default              : DeepSeek V4-Flash (81% SWE-V, 1M ctx)
- Think                : Claude Opus 4.7 (87.6% SWE-V, OAuth Max free)
- Search               : Claude Sonnet 4.6 (tool-use, OAuth free)
- Long context (>150k) : V4-Pro / OpenRouter (1M+ ctx providers)
- Rust / k8s / TF      : Anthropic only (quality on unsafe/HCL/YAML)

Includes commented blocks for xAI Grok activation (grok-4-fast,
grok-code-fast-1, grok-4) once a key is added. Mercury 2 included as
last-resort fallback with a note about the post-2026-02-24 model gating.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Destynova2 Destynova2 enabled auto-merge April 27, 2026 06:43
@Destynova2 Destynova2 merged commit 4e869d9 into main Apr 27, 2026
28 checks passed
@Destynova2 Destynova2 deleted the feat/optimal-preset branch April 27, 2026 06:44
