🤖 feat: add per-model minimum thinking level by ammar-agent · Pull Request #3427 · coder/mux

ammar-agent · 2026-05-29T16:29:03Z

Summary

Adds a per-model Minimum Thinking level setting, configured on the Models settings page. It defaults to Medium for reasoning-capable models, hiding off/low so the thinking slider toggle cycles only through levels people actually use. Users can override the floor per model (including lowering it back to off/low).

Background

Most users never want off/low thinking. Showing those levels on the thinking slider makes cycling less efficient and surfaces choices that are rarely desirable. A per-model minimum lets us default to a sensible floor (Medium) while keeping full control for power users.

Implementation

The minimum is a floor applied on top of each model's existing capability policy — capability detection (getThinkingPolicyForModel) is unchanged.

common/utils/thinking/policy.ts
- getDefaultMinimumThinkingLevel(model) → medium for reasoning-capable models, else off.
- resolveMinimumThinkingLevel(model, override?) → override ?? default.
- getAvailableThinkingLevels(model, minimum?) → capability filtered to levels ≥ floor (by canonical ordering), with a non-empty invariant (locks to the highest level if the floor exceeds the model's max).
- enforceThinkingPolicy(model, requested, minimum?) gains an optional floor; omitting it preserves the legacy capability-only behavior (so advisor/compaction/tasks/ACP are unaffected).
Config (minThinkingLevelByModel, keyed by canonical model id): schema (appConfigOnDisk, ProjectsConfig), Config load/save normalization (self-healing — invalid entries are dropped on load), and IPC (config.getConfig output + new config.updateMinThinkingLevels mutation).
Frontend: useMinThinkingLevels (mirrors useRouting) floors the ThinkingSlider, the keybind cycle (ThinkingContext), and the command-palette "Set Thinking Effort" picker.
Backend: the agentSession send path applies the floor to the effective thinking level, so all clients (desktop, mobile, ACP) honor the per-model minimum even though the persisted level stays untouched.
Models page: a per-model Min Thinking dropdown (ModelRow + ModelsSection) listing the model's full capability plus a Default (…) option that resolves the built-in floor (e.g. Default (medium), or Default (high) for gemini-3).

The floor is enforced at the provider boundary (backend) and mirrored in the UI, so a stored/below-floor off displays and runs as medium without mutating the saved value — set a model's minimum to off to restore the old behavior.

Validation

New policy.test.ts cases: default floor, ordering-based clamp (gemini-3 ["low","high"] + medium → ["high"]), non-empty invariant, below-floor clamp-up, and legacy no-floor compatibility.
Updated the creation-flow keybind test in ThinkingContext.test.tsx to reflect floored cycling (off → floored medium → high).
make typecheck and make static-check pass; full unit suite green for touched areas (remaining unrelated failures are pre-existing parallel-run flakes that pass in isolation).

Risks

Behavior change: reasoning-capable models now default to a Medium floor, so untouched workspaces that previously ran at off will think at medium until a user lowers that model's minimum. This is the intended default. Capability detection and the advisor/tasks/compaction/ACP thinking paths are untouched (they call enforceThinkingPolicy without a floor).
Mobile has no Models page to configure the floor, but the backend send-path enforcement keeps mobile model behavior consistent with the default.

Generated with mux • Model: anthropic:claude-opus-4-8 • Thinking: xhigh • Cost: $n/a

Add a per-model "Minimum Thinking level" setting configured on the Models settings page. It defaults to Medium for reasoning-capable models, hiding off/low so the thinking slider toggle is more efficient. Users can override the floor per model (including back down to off/low). - policy.ts: getDefaultMinimumThinkingLevel / resolveMinimumThinkingLevel / getAvailableThinkingLevels; enforceThinkingPolicy gains an optional minimum. - config: minThinkingLevelByModel map (schema, Config load/save, IPC getConfig output + updateMinThinkingLevels mutation). - frontend: useMinThinkingLevels hook floors the ThinkingSlider, keybind cycle, and command-palette thinking picker. - backend: agentSession send path floors the effective thinking level so all clients honor the per-model minimum. - Models page: per-model Min Thinking dropdown (ModelRow + ModelsSection).

ammar-agent · 2026-05-29T16:29:39Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bbc1b182e5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Address Codex P1: the shared fallback thinking policy is used by non-reasoning models (gpt-4o, claude-3.5), so defaulting them to a medium floor would send unsupported reasoning params. Split the policy table into explicit rules vs the shared default; only explicitly-recognized reasoning models default to medium and expose the Models-page floor selector — others keep the legacy off behavior. Also simplify the Models-page picker per review: drop the "Default (…)" option and show the effective floor as a plain level, and hide the selector entirely for non-reasoning / unrecognized / single-level models.

ammar-agent · 2026-05-29T16:43:40Z

@codex Addressed the P1 in 7ac57fb (policy.ts:146).

Root cause was real: the shared fallback policy is used by non-reasoning models, so defaulting them to a medium floor would send unsupported reasoningEffort.

Fix: split getThinkingPolicyForModel into getExplicitThinkingPolicy (specific reasoning rules) vs the shared DEFAULT_THINKING_POLICY. getDefaultMinimumThinkingLevel now returns medium only when hasExplicitThinkingPolicy(model) is true; fallback-policy models (e.g. openai:gpt-4o, anthropic:claude-3.5) keep an off default, so agentSession does not floor them and buildProviderOptions still omits reasoning params. The Models-page selector is also gated on hasExplicitThinkingPolicy, so only recognized reasoning models expose the floor.

ammar-agent · 2026-05-29T16:43:46Z

@codex review
Please take another look.

ammar-agent · 2026-05-29T16:44:47Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7ac57fb174

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

- useMinThinkingLevels now reads an optional API context (new useOptionalAPI), so ThinkingProvider no longer crashes when mounted outside an APIProvider (e.g. the isolated useDraftWorkspaceSettings test harness); it degrades to default floors when no api is available. - settings.spec.ts: scope the add-form combobox assertion, since the model tables now render per-model "Min Thinking" comboboxes.

ammar-agent · 2026-05-29T16:58:48Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c545e8b686

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

ammar-agent · 2026-05-29T17:06:06Z

@codex Re: P2 "Detect thinking support through mapped models" (ModelRow.tsx:256) — intentionally keeping this gated on the raw fullId.

mappedToModel ("Treat as") is metadata-only (pricing, tokenizer, context window, capability flags). The thinking pipeline never resolves it:

agentSession send path clamps with the raw modelString (enforceThinkingPolicy(modelString, …, minThinkingLevelByModel[normalizeToCanonical(modelString)])) — no mapped lookup.
The provider request itself is sent to the raw custom id (providerModelFactory parses acme:fast → provider acme, model fast), so a custom model "treated as" gpt-5.5 still runs under the default policy ["off","low","medium","high"].
The runtime ThinkingSlider/keybind also key off the raw baseModel.

So detecting capability via mappedToModel in the settings UI only would make the selector show a medium default and levels (e.g. xhigh) that the runtime would never honor — a worse UI/runtime mismatch. Gating on the raw fullId keeps mapped custom models consistent end-to-end (no selector → legacy off behavior), matching the intended design.

Fully honoring mappedToModel for reasoning would require threading the mapping through the entire thinking pipeline (slider, keybind, backend), which is a separate, larger change and a pre-existing limitation. Resolving this thread as a deliberate design decision.

Mobile clients previously listed raw capability levels and sent enforceThinkingPolicy(model, level) without the floor, so a default-floored model (e.g. gpt-5.5 → medium) let mobile users pick Off/Low that the backend silently clamped up — the UI lied. - New mobile useMinThinkingLevels hook fetches minThinkingLevelByModel from api.config.getConfig (react-query, best-effort; falls back to default floor). - WorkspaceScreen threads the resolved floor through all enforceThinkingPolicy calls and passes it to RunSettingsSheet. - RunSettingsSheet lists getAvailableThinkingLevels(model, floor) so the picker hides levels below the floor, matching desktop + backend.

ammar-agent · 2026-05-29T17:15:14Z

@codex Fixed the mobile floor sync in 2f3d259 (P2 "Synchronize mobile thinking controls with the backend floor").

New mobile/src/hooks/useMinThinkingLevels.ts fetches minThinkingLevelByModel via api.config.getConfig() (react-query, best-effort; falls back to the built-in default floor when config is unavailable).
WorkspaceScreen threads the resolved floor through all four enforceThinkingPolicy(model, level, getMinimum(model)) call sites and passes it to RunSettingsSheet.
RunSettingsSheet now lists getAvailableThinkingLevels(model, minThinkingLevel), so the picker hides Off/Low below the floor — matching desktop and the backend clamp. A default-floored model (e.g. gpt-5.5 → medium) no longer lets mobile users select levels that get silently raised.

ammar-agent · 2026-05-29T17:15:34Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2f3d259435

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

If updateMinThinkingLevels fails, re-fetch config so the UI reverts to the backend's actual floor instead of showing an override the send path never applied. The optimistic update itself mirrors the existing routeOverrides pattern; the onConfigChanged subscription reconciles on success.

ammar-agent · 2026-05-29T17:25:04Z

@codex Addressed in 64ae2fb (P2 "Persist minimum thinking changes before exposing them").

Failed save: setMinThinkingLevel's .catch now re-fetches config, so a failed write reverts the UI to the backend's actual floor instead of showing an override that was never applied.
Optimistic window: the optimistic setMap mirrors the existing per-model routeOverrides pattern (useRouting), and on success the onConfigChanged subscription reconciles. The only race is "change a model's floor on the Models settings page, then send within the config-write round-trip" — the Models page is a separate surface from the composer, so this window is negligible and consistent with how route overrides already behave. Making it fully synchronous would diverge from that established pattern and add latency to the settings dropdown.

ammar-agent · 2026-05-29T17:25:30Z

@codex review

chatgpt-codex-connector · 2026-05-29T17:31:02Z

Codex Review: Didn't find any major issues. Keep it up!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector Bot reviewed May 29, 2026

View reviewed changes

Comment thread src/common/utils/thinking/policy.ts Outdated

This comment has been minimized.

Sign in to view

chatgpt-codex-connector Bot reviewed May 29, 2026

View reviewed changes

Comment thread src/browser/features/Settings/Sections/ModelRow.tsx

chatgpt-codex-connector Bot reviewed May 29, 2026

View reviewed changes

Comment thread src/node/services/agentSession.ts

This comment has been minimized.

Sign in to view

chatgpt-codex-connector Bot reviewed May 29, 2026

View reviewed changes

Comment thread src/browser/hooks/useMinThinkingLevels.ts

This comment has been minimized.

Sign in to view

ammario merged commit 276e05a into main May 29, 2026
40 of 42 checks passed

ammario deleted the models-j9mc branch May 29, 2026 22:40

mux-bot Bot mentioned this pull request May 30, 2026

🤖 refactor: auto-cleanup #3291

Open

Conversation

ammar-agent commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Background

Implementation

Validation

Risks

Uh oh!

ammar-agent commented May 29, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

ammar-agent commented May 29, 2026

Uh oh!

ammar-agent commented May 29, 2026

Uh oh!

This comment has been minimized.

ammar-agent commented May 29, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

ammar-agent commented May 29, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

ammar-agent commented May 29, 2026

Uh oh!

This comment has been minimized.

ammar-agent commented May 29, 2026

Uh oh!

This comment has been minimized.

ammar-agent commented May 29, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

ammar-agent commented May 29, 2026

Uh oh!

This comment has been minimized.

ammar-agent commented May 29, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ammar-agent commented May 29, 2026 •

edited

Loading