The assistant runs Claude Opus/Sonnet through OpenRouter but never requests Claude's 1M-context beta, so the usable context window is Anthropic's default 200k rather than 1M. The token gauge now reports 200k honestly (it previously fell back to 128k for the concrete served model id — fixed in 870df25), and app/src/lib/model-capabilities.ts caps the Anthropic aliases at 200k with a note to bump them back once this is wired up.
To actually get the 1M window we need to:
- Send the 1M-context beta header on requests to Claude models via the OpenRouter provider. The AI SDK OpenRouter provider is created in app/src/server/agent.ts (getModel → createOpenRouter({ apiKey }) → openrouter(model)); the beta needs to be attached there (or via providerOptions in streamText in app/src/routes/api.chat.ts), and only for Claude models. Before implementing, confirm the exact beta identifier (e.g. an anthropic-beta/context-1m value) and how OpenRouter expects it to be forwarded against the current OpenRouter + Anthropic docs; don't hardcode a stale header string.
- Verify OpenRouter actually routes the request to a 1M-capable Opus/Sonnet variant when the beta is set, and check the pricing implications (1M-context requests are typically billed at a higher rate above the 200k threshold).
- Once confirmed working, restore the Anthropic entries in model-capabilities.ts to 1_000_000 (and consider gating the displayed window on whether the beta is actually enabled, so the gauge stays honest if the beta is toggled off).
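
A minimal sketch of the "Claude-only header" and "gate the gauge on the beta" ideas from the steps above. Every name here (LongContextBeta, isClaudeModel, betaHeadersFor, contextWindowFor) is hypothetical and does not exist in the repo. The "anthropic/" id prefix is an assumption about how OpenRouter names Claude models, and the header name and value are deliberately passed in by the caller rather than hardcoded, since both still need to be confirmed against the docs.

```typescript
// Hypothetical helpers; nothing here is existing repo code.
const DEFAULT_CLAUDE_WINDOW = 200_000;
const EXTENDED_CLAUDE_WINDOW = 1_000_000;

// Caller-supplied beta settings. Both strings must come from the current
// OpenRouter + Anthropic docs; this sketch does not assume their values.
interface LongContextBeta {
  enabled: boolean;
  headerName: string;
  headerValue: string;
}

// Assumption: Claude models served via OpenRouter use ids prefixed "anthropic/".
function isClaudeModel(modelId: string): boolean {
  return modelId.startsWith("anthropic/");
}

// Returns the extra headers to attach for this model.
// Non-Claude models, or a disabled beta, get an empty object.
function betaHeadersFor(
  modelId: string,
  beta: LongContextBeta,
): Record<string, string> {
  if (!beta.enabled || !isClaudeModel(modelId)) return {};
  return { [beta.headerName]: beta.headerValue };
}

// Returns the window the gauge should display for Claude models, or
// undefined so the caller falls back to its existing capability lookup.
function contextWindowFor(
  modelId: string,
  beta: LongContextBeta,
): number | undefined {
  if (!isClaudeModel(modelId)) return undefined;
  return beta.enabled ? EXTENDED_CLAUDE_WINDOW : DEFAULT_CLAUDE_WINDOW;
}
```

Deriving both the headers and the displayed window from the same beta object keeps them from drifting apart: if the beta is toggled off, the request stops carrying the header and the gauge drops back to 200k in the same change. Where the returned headers get plugged in (provider creation in agent.ts or providerOptions in api.chat.ts) is left open, since that depends on what OpenRouter actually forwards.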
Acceptance: a long assistant conversation can exceed 200k tokens without truncation/compaction kicking in at 200k, and the context gauge shows ~1M for Claude models.