feat(routing): wire Cerebras + drop byok-paid + deprecate localFirstAI #53
Merged
The 'byok-paid' option in the routing policy union was never wired to a distinct selection path; it fell through to the same scoring code as 'auto-cheapest'. Removing it tightens the type, simplifies the Settings UI, and avoids advertising a feature that does not exist.

Users with the value persisted (set via the Settings select before this change) are silently coerced to 'auto-cheapest' on load - the score-based selector already handles the "prefer paid BYOK keys" use case via its existing capability tier weights.

If a use case for an explicit paid-only filter surfaces, it can be reintroduced behind a clearer name.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
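The coercion on load described above could look roughly like the sketch below. The type and function names are hypothetical (the commit does not show the actual loader), and the union lists only the policy values named in this PR.

```typescript
// Hypothetical types: only the policy values mentioned in this PR are listed.
type RoutingPolicy = 'auto-cheapest' | 'local-only';

// What may still be on disk from before 'byok-paid' was removed.
type PersistedRoutingPolicy = RoutingPolicy | 'byok-paid';

// Assumed helper, run while loading settings: the removed value is mapped to
// 'auto-cheapest'; every other value passes through unchanged.
function coerceRoutingPolicy(persisted: PersistedRoutingPolicy): RoutingPolicy {
  return persisted === 'byok-paid' ? 'auto-cheapest' : persisted;
}
```

Keeping the legacy value in a separate persisted-input type means the live `RoutingPolicy` union no longer advertises 'byok-paid' to the rest of the code.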
`localFirstAI` and `routingPolicy === 'local-only'` describe the same
intent ("never route to the cloud"). Maintaining two switches is a
source of drift; this collapses them onto `routingPolicy` and keeps
`localFirstAI` as a read-only backward-compat input.
Migration runs once on settings load: when `routingPolicy` is unset
(typical for installs predating this work), `localFirstAI: true` is
translated to `routingPolicy: 'local-only'` and `localFirstAI: false`
to `'auto-cheapest'`. An explicitly-set `routingPolicy` always wins.
Routing call sites that previously read `localFirstAI` now read both
signals (`routingPolicy === 'local-only' || localFirstAI`) so installs
that haven't migrated yet keep working. The `localFirstAI` field is
marked `@deprecated` for removal in a future release.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
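A minimal sketch of the migration and the dual-signal check described in this commit message. `PersistedSettings`, `migrateRoutingPolicy`, and `isLocalOnly` are illustrative names, not the identifiers used in the codebase, and leaving `routingPolicy` unset when both fields are missing is an assumption.

```typescript
type RoutingPolicy = 'auto-cheapest' | 'local-only';

// Hypothetical shape of the persisted settings relevant to routing.
interface PersistedSettings {
  routingPolicy?: RoutingPolicy;
  /** @deprecated Read-only backward-compat input; use routingPolicy. */
  localFirstAI?: boolean;
}

// Runs once on settings load.
function migrateRoutingPolicy(settings: PersistedSettings): PersistedSettings {
  // An explicitly set routingPolicy always wins.
  if (settings.routingPolicy !== undefined) return settings;
  // Assumption: with neither field present there is nothing to translate.
  if (settings.localFirstAI === undefined) return settings;
  return {
    ...settings,
    routingPolicy: settings.localFirstAI ? 'local-only' : 'auto-cheapest',
  };
}

// Call sites read both signals so installs that have not migrated keep working.
function isLocalOnly(settings: PersistedSettings): boolean {
  return settings.routingPolicy === 'local-only' || settings.localFirstAI === true;
}
```

For example, `migrateRoutingPolicy({ localFirstAI: true })` yields a settings object whose `routingPolicy` is `'local-only'`, while an install that already stores a `routingPolicy` is returned unchanged.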
Cerebras Cloud is the highest-quality free-tier provider on the router's
quality ladder (rank 100, ahead of Groq at 80) but the existing entry in
freeTierConstants.ts was a dead slot: cortexProviderName was null because
no underlying provider plumbing existed, so the router could never pick
it. This adds the missing wiring end to end:
- new 'cerebras' ProviderName with apiKey setting and OpenAI-compatible
  endpoint (https://api.cerebras.ai/v1)
- four default models matching the Cerebras free-tier catalogue
  (llama-4-scout, qwen-3-32b, deepseek-r1-distill-llama-70b, llama-3.3-70b)
  with the 8K context cap baked into model options
- cerebras entry in modelSettingsOfProvider, settings display info,
  apiKey placeholder, and subText link to https://cloud.cerebras.ai
- dispatch table entry routing chat through the OpenAI-compatible path
- freeTierConstants.ts: set cortexProviderName: 'cerebras' so the
  ladder will rank a configured Cerebras key ahead of Groq/Gemini
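As a rough illustration of the wiring listed above, the sketch below pairs the default model table with a request builder for the OpenAI-compatible chat path. Every identifier is hypothetical, the model ids are copied as written in this commit message, `8192` is an assumed reading of the "8K" cap, and the `/chat/completions` path with a Bearer token follows the usual OpenAI-compatible convention rather than anything shown in the diff.

```typescript
const CEREBRAS_BASE_URL = 'https://api.cerebras.ai/v1';

interface CerebrasModelOptions {
  contextWindow: number; // assumed to be measured in tokens
}

// Assumption: "8K" is read as 8192 tokens.
const CEREBRAS_CONTEXT_CAP = 8192;

// Default models, ids as named in the commit message.
const CEREBRAS_DEFAULT_MODELS: Record<string, CerebrasModelOptions> = {
  'llama-4-scout': { contextWindow: CEREBRAS_CONTEXT_CAP },
  'qwen-3-32b': { contextWindow: CEREBRAS_CONTEXT_CAP },
  'deepseek-r1-distill-llama-70b': { contextWindow: CEREBRAS_CONTEXT_CAP },
  'llama-3.3-70b': { contextWindow: CEREBRAS_CONTEXT_CAP },
};

interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Builds (but does not send) an OpenAI-style chat completion request.
function buildCerebrasChatRequest(apiKey: string, model: string, messages: ChatMessage[]) {
  return {
    url: `${CEREBRAS_BASE_URL}/chat/completions`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}
```

The returned `url` and `init` can be passed to `fetch(url, init)`; separating construction from sending keeps the builder easy to unit test.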
Tests: extend freeTierLadder.test.ts with two Cerebras-specific cases
(top-of-ladder when both have quota; failover to Groq when exhausted).
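The two test cases can be pictured with a small ladder selector like the one below. The ranks (Cerebras 100, Groq 80) and the `cortexProviderName` field come from this commit message; `LadderEntry`, `ProviderState`, `qualityRank`, and `pickFreeTierProvider` are hypothetical stand-ins for whatever freeTierConstants.ts and the router actually define.

```typescript
interface LadderEntry {
  qualityRank: number;
  // null marks a dead slot with no provider plumbing behind it.
  cortexProviderName: string | null;
}

// Hypothetical per-provider runtime state.
interface ProviderState {
  hasKey: boolean;
  quotaRemaining: number;
}

function pickFreeTierProvider(
  ladder: readonly LadderEntry[],
  state: Record<string, ProviderState | undefined>,
): string | null {
  // Keep only wired providers that have a key and remaining quota.
  const usable = ladder.filter((entry) => {
    const name = entry.cortexProviderName;
    if (name === null) return false;
    const s = state[name];
    return s !== undefined && s.hasKey && s.quotaRemaining > 0;
  });
  // Highest quality rank first.
  usable.sort((a, b) => b.qualityRank - a.qualityRank);
  return usable[0]?.cortexProviderName ?? null;
}

const ladder: LadderEntry[] = [
  { qualityRank: 80, cortexProviderName: 'groq' },
  { qualityRank: 100, cortexProviderName: 'cerebras' },
];
```

With quota on both providers the selector returns `'cerebras'`; once the Cerebras quota reaches zero it falls back to `'groq'`. An entry whose `cortexProviderName` is `null` is never returned, which matches the "dead slot" behaviour this PR fixes.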
Reasoning models (qwen-3, deepseek-r1-distill) declare reasoning
capabilities with <think> tag parsing; the other models declare
reasoningCapabilities: false. Tool calling uses the standard OpenAI-style
format documented in Cerebras's API reference.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
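To make the `<think>` tag parsing concrete, here is a minimal, non-streaming sketch that separates a reasoning block from the visible answer. `splitThinkTags` is a hypothetical helper, not the parser used in the codebase, and treating an unclosed tag as "everything after it is reasoning" is an assumption.

```typescript
interface SplitOutput {
  reasoning: string;
  answer: string;
}

function splitThinkTags(text: string): SplitOutput {
  const open = '<think>';
  const close = '</think>';

  const start = text.indexOf(open);
  // No tag at all: the whole text is the answer.
  if (start === -1) return { reasoning: '', answer: text };

  const end = text.indexOf(close, start + open.length);
  if (end === -1) {
    // Assumption: an unclosed tag means the rest of the text is reasoning.
    return {
      reasoning: text.slice(start + open.length).trim(),
      answer: text.slice(0, start).trim(),
    };
  }

  return {
    reasoning: text.slice(start + open.length, end).trim(),
    // Stitch together whatever surrounds the reasoning block.
    answer: (text.slice(0, start) + text.slice(end + close.length)).trim(),
  };
}
```

Calling `splitThinkTags('<think>plan</think>Hello')` returns `{ reasoning: 'plan', answer: 'Hello' }`. A production parser would also need to handle tags split across streamed chunks, which this sketch ignores.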
Summary
Follow-up to the merged free-tier router (PR #51). Addresses three of the five items punted in that PR's body.
Changes
Still punted (separate work)
Verification