Wayfinder Router 2026.6.6 - Claude Code, and a budget that bends
Two ways the gateway goes further. It now speaks Anthropic's Messages API, so Claude Code - and anything else pointed at ANTHROPIC_BASE_URL - routes through Wayfinder with a single environment variable. And it grows a spend budget that, when you hit your cap, degrades to the cheaper tier instead of cutting you off. Both are pure additions around the same offline scorer: the routing decision is unchanged, still computed with no model call.
Claude Code, through Wayfinder
Claude Code speaks Anthropic's Messages API, not OpenAI Chat Completions, so a base_url swap never worked - until now. A new POST /v1/messages adapter translates Messages ⇄ Chat Completions in both directions (buffered and streaming, tool use included) and hands the request to the same router everything else uses. Point Claude Code at the gateway:
pip install -U "wayfinder-router[gateway]"
wayfinder-router serve
export ANTHROPIC_BASE_URL="http://localhost:8088" # the client appends /v1/messages
export ANTHROPIC_API_KEY="unused" # the gateway uses each upstream's own key
claudeWayfinder scores each turn and routes it to the configured tier; the Claude model id rides along but the decision is Wayfinder's (send a configured endpoint name to pin one call). Every x-wayfinder-router-* header you get on the OpenAI endpoint - model, score, mode, served-by, budget - rides along here too, because it is the same decision. Streaming and tool calls are translated end to end; image/vision blocks and extended thinking are not translated yet. Full recipe in docs/integrations.md.
A budget that bends instead of breaks
A spend cap that is true to a router. Set a limit, and when the period's realized cost reaches it Wayfinder doesn't fail your requests - it degrades them to the cheapest tier (the same degrade primitive failover uses, which never raises cost). Want a hard stop instead? on_breach = "block" returns an HTTP 402.
[routing]
threshold = 0.55 # below → local, at/above → cloud
[gateway.budget]
limit = 5.0 # in your price table's unit (USD when cost_per_1k is set)
window = "day" # day | month | all
on_breach = "degrade" # degrade to the cheapest tier (default) | block (HTTP 402)
[gateway.models.local]
base_url = "http://localhost:11434/v1"
model = "llama3.1:8b"
cost_per_1k = 0.0
[gateway.models.cloud]
base_url = "https://api.openai.com/v1"
model = "gpt-4o"
api_key_env = "OPENAI_API_KEY"
cost_per_1k = 0.0125With that config, once the day's spend hits $5 a prompt that would route to cloud is served by local instead - and the response says so: x-wayfinder-router-budget: degraded, mode: budget-degraded, with the true complexity score still in x-wayfinder-router-score. A breach is never silent. Budgets enforce only when real cost_per_1k prices are configured; a relative-units demo has nothing to cap.
The boundary still holds
Both features live in the invocation layer. The adapter scores nothing; the budget recomputes nothing - the complexity score and tier choice are the same deterministic, offline decision as always. A budget changes which tier a request is delivered to; the adapter changes only the request's shape. No model call enters the decision path.
Upgrading
pip install -U "wayfinder-router[gateway]"Fully additive: existing configs, clients, and the JSON/header contract are unchanged. [gateway.budget] is opt-in (no cap unless you set one), and /v1/messages is a new endpoint that leaves the OpenAI path untouched.
Full changelog: https://github.com/itsthelore/wayfinder-router/blob/main/CHANGELOG.md