Advertise [1m] so configured 1M-capable Claude routes get the 1M window #11

Open
payne0420 wants to merge 1 commit into OnlyTerp:main from payne0420:feat/advertise-1m

@payne0420
Contributor

Why

Builds on the just-merged #8 (launcher appends [1m]) and #10 (proxy strips [1m] before routing). Those give the 1M context window to a launch-time pick of a stock id — but an in-session /model switch to a configured real-Claude route (e.g. the shipped claude-opus route → claude-opus-4-8) uses the bare gateway id, so /context shows 200k and auto-compaction is mis-keyed.

Claude Code sizes its meter to 1M only when the model id it holds carries the [1m] suffix — and (now verified live) it honors that suffix on a custom gateway id, not just native ids.

What

The proxy now advertises the [1m] suffix on /v1/models + /healthz for real-Claude passthrough routes whose upstream model is 1M-capable. The /model picker id then carries [1m], so the 1M window engages even on an in-session switch.

  • The suffix is stripped before routing (the inline strip from #10 is refactored into a shared _strip_1m helper) and normalized off the sticky orchestrator/worker selection, so internal route ids stay clean (claude-opus[1m] → claude-opus → claude-opus-4-8).
  • Scope: real-Claude passthrough routes only. Worker → … entries and non-passthrough routes (openai_compat / codex / cursor) are never suffixed.
  • Launcher: add the shipped claude-opus route to the UC_1M_MODELS default so a launch/selector pick of it matches the /model behavior.

Toggles

UC_ADVERTISE_1M=0 disables advertising. UC_1M_UPSTREAM overrides which upstream model ids count as 1M-capable (default: the Opus 4.6–4.8 and Sonnet 4.6 family).

Testing

  • Full self-test suite + scripts/doctor.py pass (added unit coverage for _advertise_id, _strip_1m, and that a [1m]-suffixed pick routes to its clean route).
  • Live: picking claude-opus via /model now shows /context = claude-opus[1m] · / 1M; /healthz + /v1/models advertise claude-opus[1m] (with claude-worker-opus left unsuffixed).
  • docs/TROUBLESHOOTING.md updated.

🤖 Generated with Claude Code

Follow-up to OnlyTerp#8 (launcher appends [1m]) and OnlyTerp#10 (proxy strips [1m] before
routing). Those give the 1M context window to a launch-time pick of a stock id,
but an in-session /model switch to a CONFIGURED real-Claude route -- e.g. the
shipped `claude-opus` route, which maps to claude-opus-4-8 -- used the bare
gateway id, so /context showed 200k and auto-compaction was mis-keyed.

Claude Code sizes its context meter to 1M only when the model id it holds carries
the [1m] suffix (verified: it honors the suffix on a custom gateway id, not just
native ids). So the proxy now ADVERTISES the suffix on /v1/models + /healthz for
real-Claude PASSTHROUGH routes whose upstream model is 1M-capable. The /model
picker id then carries [1m] and the 1M window engages even on in-session switches.

- The suffix is stripped before routing (the inline strip from OnlyTerp#10 is refactored
  into a shared _strip_1m helper) and normalized off the sticky orchestrator/worker
  selection, so internal route ids stay clean (claude-opus[1m] -> claude-opus).
- Scope: real-Claude passthrough routes only. Worker ("Worker -> ...") entries and
  non-passthrough routes (openai_compat / codex / cursor) are never suffixed.
- Launcher: add `claude-opus` (the shipped route) to the UC_1M_MODELS default so a
  launch/selector pick of it matches the /model behavior.

Toggles: UC_ADVERTISE_1M=0 (off), UC_1M_UPSTREAM (the 1M-capable upstream set).
Verified live: picking claude-opus via /model now shows /context = / 1M. Tests +
doctor pass; TROUBLESHOOTING updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>