🤖 fix: enable xhigh reasoning for gpt-5.2 #1117
Merged
Enable `xhigh` thinking for `openai:gpt-5.2` by updating the per-model thinking policy. `gpt-5.2` was falling back to the default policy (off/low/medium/high), so any `xhigh` selection got clamped before building OpenAI provider options. Allow `xhigh` in `getThinkingPolicyForModel()` for `gpt-5.2` (including version-suffixed and mux-gateway forms) and add tests.

Validation:

- `bun test src/browser/utils/thinking/policy.test.ts`
- `make typecheck`
- `make static-check`

## 📋 Implementation Plan
### Enable xhigh reasoning for `openai:gpt-5.2`

#### Context / Problem

The newly released `openai:gpt-5.2` model supports OpenAI’s `reasoningEffort: "xhigh"`, but mux currently cannot actually request xhigh for this model.

#### Root cause (code-level)
Mux clamps the requested “thinking level” to a per-model capability subset via:

- `src/browser/utils/thinking/policy.ts` → `getThinkingPolicyForModel()` / `enforceThinkingPolicy()`
- `buildProviderOptions()`, which calls `enforceThinkingPolicy()`

Right now `gpt-5.2` is not special-cased, so it falls into the default policy `["off", "low", "medium", "high"]`, and `xhigh` gets clamped (typically to `"medium"`), so OpenAI never receives `reasoningEffort: "xhigh"`.
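A minimal sketch of that clamping behavior, with simplified hypothetical signatures (the real `policy.ts` may differ in shape):

```ts
type ThinkingLevel = "off" | "low" | "medium" | "high" | "xhigh";

// Default policy applied to models without a dedicated branch.
const DEFAULT_POLICY: ThinkingLevel[] = ["off", "low", "medium", "high"];

function getThinkingPolicyForModel(modelString: string): ThinkingLevel[] {
  // gpt-5.2 has no branch yet, so it receives DEFAULT_POLICY,
  // which does not include "xhigh".
  return DEFAULT_POLICY;
}

function enforceThinkingPolicy(
  modelString: string,
  requested: ThinkingLevel
): ThinkingLevel {
  const allowed = getThinkingPolicyForModel(modelString);
  // An out-of-policy request is clamped; per the description above,
  // "xhigh" typically lands on "medium" for gpt-5.2 today.
  return allowed.includes(requested) ? requested : "medium";
}
```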
OpenAI request construction is already correct once `xhigh` is allowed:

- `src/common/utils/ai/providerOptions.ts` maps `ThinkingLevel` → OpenAI `reasoningEffort`.
- `src/common/types/thinking.ts` includes `xhigh: "xhigh"` in `OPENAI_REASONING_EFFORT`.

#### Recommended approach (minimal change): update thinking policy for `gpt-5.2`

Net LoC estimate (product code only): ~+10–25 LoC (policy + comments; tests separate)
#### What to change

Allow `xhigh` for `gpt-5.2` in `getThinkingPolicyForModel()` (`src/browser/utils/thinking/policy.ts`), alongside the existing special cases for `gpt-5.2-pro` and `gpt-5.1-codex-max`.

Suggested policy:

- `openai:gpt-5.2` → `["off", "low", "medium", "high", "xhigh"]`

Notes:

- Keep the existing `gpt-5.2-pro` branch above the new `gpt-5.2` branch.
- Match with a pattern like `^gpt-5\.2(?!-[a-z])` so it matches `gpt-5.2` and `gpt-5.2-2025-12-11` but not `gpt-5.2-pro` (see the quick check below).
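A quick check of that pattern (illustrative; runnable in any TypeScript/JavaScript context):

```ts
// Negative lookahead rejects a letter suffix ("-pro") but still accepts
// bare and date-suffixed model ids.
const GPT_5_2_PATTERN = /^gpt-5\.2(?!-[a-z])/;

console.log(GPT_5_2_PATTERN.test("gpt-5.2"));            // true
console.log(GPT_5_2_PATTERN.test("gpt-5.2-2025-12-11")); // true ("-2" is not "-[a-z]")
console.log(GPT_5_2_PATTERN.test("gpt-5.2-pro"));        // false ("-p" matches "-[a-z]")
```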
#### Update comments to match reality

- `src/browser/utils/thinking/policy.ts`: update the comment that says `xhigh` is only for codex-max.
- `src/common/types/thinking.ts`: update the `OPENAI_REASONING_EFFORT.xhigh` comment (currently says only `gpt-5.1-codex-max`).
- `src/common/utils/tokens/models-extra.ts`: the `gpt-5.2` comment block doesn’t mention xhigh. Add a short note for consistency.
#### Why this works

Once the policy allows `xhigh`, the normal request path already does the right thing:

- The selected thinking level can now be `xhigh`.
- The request path calls `buildProviderOptions(modelString, thinkingLevel, ...)`.
- `buildProviderOptions()` will keep `xhigh` (no clamping), set `openai.reasoningEffort = "xhigh"`, and include `reasoning.encrypted_content` so tool-use works correctly for reasoning models.
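Assuming that mapping, the provider options for an `xhigh` request would come out roughly like this (a sketch only; the exact object shape lives in `providerOptions.ts`):

```ts
// Illustrative output of buildProviderOptions("openai:gpt-5.2", "xhigh", ...)
// after the policy change; the field layout is an assumption, not the real shape.
const providerOptions = {
  openai: {
    reasoningEffort: "xhigh", // previously clamped to "medium" by policy
    // Encrypted reasoning content keeps multi-turn tool use working
    // for reasoning models, per the plan above.
    include: ["reasoning.encrypted_content"],
  },
};
```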
#### Tests / Validation

Update/add unit tests for the policy in `src/browser/utils/thinking/policy.test.ts` (see the sketch below):

- `getThinkingPolicyForModel("openai:gpt-5.2")` returns 5 levels including `xhigh`.
- `getThinkingPolicyForModel("mux-gateway:openai/gpt-5.2")` returns the same.
- `getThinkingPolicyForModel("openai:gpt-5.2-2025-12-11")` returns the same.
- `enforceThinkingPolicy("openai:gpt-5.2", "xhigh") === "xhigh"`.
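A sketch of those assertions in bun:test style (the import path and surrounding structure are assumptions; adapt to the file’s existing describe blocks):

```ts
import { describe, expect, test } from "bun:test";
import { enforceThinkingPolicy, getThinkingPolicyForModel } from "./policy";

describe("gpt-5.2 xhigh policy", () => {
  const expected = ["off", "low", "medium", "high", "xhigh"];

  // Bare, mux-gateway, and version-suffixed forms should all allow xhigh.
  test.each([
    "openai:gpt-5.2",
    "mux-gateway:openai/gpt-5.2",
    "openai:gpt-5.2-2025-12-11",
  ])("getThinkingPolicyForModel(%s) includes xhigh", (model) => {
    expect(getThinkingPolicyForModel(model)).toEqual(expected);
  });

  test("enforceThinkingPolicy passes xhigh through unclamped", () => {
    expect(enforceThinkingPolicy("openai:gpt-5.2", "xhigh")).toBe("xhigh");
  });
});
```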
Run targeted tests:

- `bun test src/browser/utils/thinking/policy.test.ts`

Run repo-wide correctness gates (expected in CI):

- `make typecheck`
- `make lint` (or `make lint-fix` if needed)
#### Rollout notes / UX impact

- The thinking picker will offer `xhigh` for `gpt-5.2` once the model is selected.
- The UI already exposes `xhigh`; after this change, choosing `xhigh` on `gpt-5.2` will no longer silently clamp back to `medium`.
- No model-list changes are needed (`knownModels.ts` etc.).

#### Alternative approach (more scalable, higher scope)
Drive thinking policy from model metadata (e.g., a single authoritative model capabilities table that includes supported thinking levels).
Net LoC estimate (product code only): ~+80–200 LoC
This would reduce future “forgot to special-case model X” issues, but requires designing a shared model-capabilities schema and updating multiple call sites (policy derivation, UI, provider options clamping).
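One possible shape for such a table (purely illustrative; the real schema and integration points would need design):

```ts
type ThinkingLevel = "off" | "low" | "medium" | "high" | "xhigh";

// Hypothetical single source of truth for per-model capabilities that
// policy derivation, the UI, and provider-option clamping could all share.
interface ModelCapabilities {
  thinkingLevels: readonly ThinkingLevel[];
}

const MODEL_CAPABILITIES: Record<string, ModelCapabilities> = {
  "openai:gpt-5.2": { thinkingLevels: ["off", "low", "medium", "high", "xhigh"] },
  // ...one entry per known model, replacing the per-model branches in policy.ts
};

function getThinkingPolicyForModel(modelString: string): readonly ThinkingLevel[] {
  // Fall back to the current default policy for unknown models.
  return MODEL_CAPABILITIES[modelString]?.thinkingLevels ?? ["off", "low", "medium", "high"];
}
```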
#### Execution checklist (when switching to Exec mode)

- [ ] Update `src/browser/utils/thinking/policy.ts` to add `gpt-5.2` xhigh support.
- [ ] Update `src/browser/utils/thinking/policy.test.ts` with new assertions.
- [ ] Run `bun test ...policy.test.ts`.
- [ ] Run `make typecheck`.

Generated with mux