Advertise [1m] so configured 1M-capable Claude routes get the 1M window #11
Open
payne0420 wants to merge 1 commit into
Follow-up to OnlyTerp#8 (launcher appends [1m]) and OnlyTerp#10 (proxy strips [1m] before routing). Those give the 1M context window to a launch-time pick of a stock id, but an in-session /model switch to a CONFIGURED real-Claude route -- e.g. the shipped `claude-opus` route, which maps to claude-opus-4-8 -- used the bare gateway id, so /context showed 200k and auto-compaction was mis-keyed.

Claude Code sizes its context meter to 1M only when the model id it holds carries the [1m] suffix (verified: it honors the suffix on a custom gateway id, not just native ids).

So the proxy now ADVERTISES the suffix on /v1/models + /healthz for real-Claude PASSTHROUGH routes whose upstream model is 1M-capable. The /model picker id then carries [1m] and the 1M window engages even on in-session switches.

- The suffix is stripped before routing (the inline strip from OnlyTerp#10 is refactored into a shared _strip_1m helper) and normalized off the sticky orchestrator/worker selection, so internal route ids stay clean (claude-opus[1m] -> claude-opus).
- Scope: real-Claude passthrough routes only. Worker ("Worker -> ...") entries and non-passthrough routes (openai_compat / codex / cursor) are never suffixed.
- Launcher: add `claude-opus` (the shipped route) to the UC_1M_MODELS default so a launch/selector pick of it matches the /model behavior.

Toggles: UC_ADVERTISE_1M=0 (off), UC_1M_UPSTREAM (the 1M-capable upstream set).

Verified live: picking claude-opus via /model now shows /context = / 1M. Tests + doctor pass; TROUBLESHOOTING updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Why
Builds on the just-merged #8 (launcher appends `[1m]`) and #10 (proxy strips `[1m]` before routing). Those give the 1M context window to a launch-time pick of a stock id, but an in-session `/model` switch to a configured real-Claude route (e.g. the shipped `claude-opus` route → `claude-opus-4-8`) uses the bare gateway id, so `/context` shows 200k and auto-compaction is mis-keyed.

Claude Code sizes its meter to 1M only when the model id it holds carries the `[1m]` suffix, and (now verified live) it honors that suffix on a custom gateway id, not just native ids.

What
The proxy now advertises the `[1m]` suffix on `/v1/models` + `/healthz` for real-Claude passthrough routes whose upstream `model` is 1M-capable. The `/model` picker id then carries `[1m]`, so the 1M window engages even on an in-session switch.

- The suffix is stripped before routing (the inline strip from #10 is refactored into a shared `_strip_1m` helper) and normalized off the sticky orchestrator/worker selection, so internal route ids stay clean (`claude-opus[1m]` → `claude-opus` → `claude-opus-4-8`).
- Scope: real-Claude passthrough routes only. `Worker → …` entries and non-passthrough routes (`openai_compat` / `codex` / `cursor`) are never suffixed.
- Launcher: adds the shipped `claude-opus` route to the `UC_1M_MODELS` default so a launch/selector pick of it matches the `/model` behavior.

Toggles
- `UC_ADVERTISE_1M=0` disables advertising.
- `UC_1M_UPSTREAM` overrides which upstream model ids count as 1M-capable (default: the Opus 4.6–4.8 + Sonnet 4.6 family).

Testing
- Tests + `scripts/doctor.py` pass (added unit coverage for `_advertise_id`, `_strip_1m`, and that a `[1m]`-suffixed pick routes to its clean route).
- Verified live: picking `claude-opus` via `/model` now shows `/context` = `claude-opus[1m]` · / 1M; `/healthz` + `/v1/models` advertise `claude-opus[1m]` (with `claude-worker-opus` left unsuffixed).
- `docs/TROUBLESHOOTING.md` updated.

🤖 Generated with Claude Code
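
For reviewers who want the advertise/strip round trip in one place, here is a minimal sketch of the behavior described above. It is not the code in this PR: the function names `strip_1m` and `advertise_id`, the route dict shape (`kind`, `model`, `worker`), the comma-separated format of `UC_1M_UPSTREAM`, and the exact default model ids are all assumptions made for illustration. Only the `[1m]` suffix, the two environment variable names, and the `claude-opus` → `claude-opus-4-8` mapping come from the description.

```python
import os

SUFFIX = "[1m]"

# Assumed ids: the PR only says "Opus 4.6-4.8 + Sonnet 4.6 family".
DEFAULT_1M_UPSTREAM = {
    "claude-opus-4-6",
    "claude-opus-4-7",
    "claude-opus-4-8",
    "claude-sonnet-4-6",
}

# Hypothetical route table; the real config schema may differ.
ROUTES = {
    "claude-opus": {"kind": "passthrough", "model": "claude-opus-4-8"},
    "claude-worker-opus": {
        "kind": "passthrough",
        "model": "claude-opus-4-8",
        "worker": True,
    },
    "gpt": {"kind": "openai_compat", "model": "some-upstream-model"},
}


def strip_1m(model_id):
    """Return the id without a trailing [1m] suffix, if one is present."""
    if model_id.endswith(SUFFIX):
        return model_id[: -len(SUFFIX)]
    return model_id


def one_m_upstreams(env=os.environ):
    """Upstream ids treated as 1M-capable (assumes a comma-separated override)."""
    raw = env.get("UC_1M_UPSTREAM", "")
    ids = {part.strip() for part in raw.split(",") if part.strip()}
    return ids or set(DEFAULT_1M_UPSTREAM)


def advertise_id(route_id, route, env=os.environ):
    """Return the id to list on /v1/models and /healthz for one route."""
    if env.get("UC_ADVERTISE_1M", "1") == "0":
        return route_id  # advertising switched off
    if route.get("kind") != "passthrough" or route.get("worker"):
        return route_id  # worker and non-passthrough routes stay bare
    if route.get("model") not in one_m_upstreams(env):
        return route_id  # upstream is not 1M-capable
    return route_id if route_id.endswith(SUFFIX) else route_id + SUFFIX


def resolve_upstream(picked_id, routes=ROUTES):
    """Map a picked (possibly suffixed) id to its clean route's upstream model."""
    return routes[strip_1m(picked_id)]["model"]
```

Under these assumptions, `advertise_id("claude-opus", ROUTES["claude-opus"], env={})` yields the suffixed picker id, and `resolve_upstream` on that id lands back on `claude-opus-4-8`, so the suffix never reaches internal route lookups.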