Skip to content

🤖 feat: add per-model minimum thinking level#3427

Merged
ammario merged 5 commits into
mainfrom
models-j9mc
May 29, 2026
Merged

🤖 feat: add per-model minimum thinking level#3427
ammario merged 5 commits into
mainfrom
models-j9mc

Conversation

@ammar-agent
Copy link
Copy Markdown
Collaborator

@ammar-agent ammar-agent commented May 29, 2026

Summary

Adds a per-model Minimum Thinking level setting, configured on the Models settings page. It defaults to Medium for reasoning-capable models, hiding off/low so the thinking slider toggle cycles only through levels people actually use. Users can override the floor per model (including lowering it back to off/low).

Background

Most users never want off/low thinking. Showing those levels on the thinking slider makes cycling less efficient and surfaces choices that are rarely desirable. A per-model minimum lets us default to a sensible floor (Medium) while keeping full control for power users.

Implementation

The minimum is a floor applied on top of each model's existing capability policy — capability detection (getThinkingPolicyForModel) is unchanged.

  • common/utils/thinking/policy.ts
    • getDefaultMinimumThinkingLevel(model)medium for reasoning-capable models, else off.
    • resolveMinimumThinkingLevel(model, override?)override ?? default.
    • getAvailableThinkingLevels(model, minimum?) → capability filtered to levels ≥ floor (by canonical ordering), with a non-empty invariant (locks to the highest level if the floor exceeds the model's max).
    • enforceThinkingPolicy(model, requested, minimum?) gains an optional floor; omitting it preserves the legacy capability-only behavior (so advisor/compaction/tasks/ACP are unaffected).
  • Config (minThinkingLevelByModel, keyed by canonical model id): schema (appConfigOnDisk, ProjectsConfig), Config load/save normalization (self-healing — invalid entries are dropped on load), and IPC (config.getConfig output + new config.updateMinThinkingLevels mutation).
  • Frontend: useMinThinkingLevels (mirrors useRouting) floors the ThinkingSlider, the keybind cycle (ThinkingContext), and the command-palette "Set Thinking Effort" picker.
  • Backend: the agentSession send path applies the floor to the effective thinking level, so all clients (desktop, mobile, ACP) honor the per-model minimum even though the persisted level stays untouched.
  • Models page: a per-model Min Thinking dropdown (ModelRow + ModelsSection) listing the model's full capability plus a Default (…) option that resolves the built-in floor (e.g. Default (medium), or Default (high) for gemini-3).

The floor is enforced at the provider boundary (backend) and mirrored in the UI, so a stored/below-floor off displays and runs as medium without mutating the saved value — set a model's minimum to off to restore the old behavior.

Validation

  • New policy.test.ts cases: default floor, ordering-based clamp (gemini-3 ["low","high"] + medium → ["high"]), non-empty invariant, below-floor clamp-up, and legacy no-floor compatibility.
  • Updated the creation-flow keybind test in ThinkingContext.test.tsx to reflect floored cycling (off → floored mediumhigh).
  • make typecheck and make static-check pass; full unit suite green for touched areas (remaining unrelated failures are pre-existing parallel-run flakes that pass in isolation).

Risks

  • Behavior change: reasoning-capable models now default to a Medium floor, so untouched workspaces that previously ran at off will think at medium until a user lowers that model's minimum. This is the intended default. Capability detection and the advisor/tasks/compaction/ACP thinking paths are untouched (they call enforceThinkingPolicy without a floor).
  • Mobile has no Models page to configure the floor, but the backend send-path enforcement keeps mobile model behavior consistent with the default.

Generated with mux • Model: anthropic:claude-opus-4-8 • Thinking: xhigh • Cost: $n/a

Add a per-model "Minimum Thinking level" setting configured on the Models
settings page. It defaults to Medium for reasoning-capable models, hiding
off/low so the thinking slider toggle is more efficient. Users can override
the floor per model (including back down to off/low).

- policy.ts: getDefaultMinimumThinkingLevel / resolveMinimumThinkingLevel /
  getAvailableThinkingLevels; enforceThinkingPolicy gains an optional minimum.
- config: minThinkingLevelByModel map (schema, Config load/save, IPC getConfig
  output + updateMinThinkingLevels mutation).
- frontend: useMinThinkingLevels hook floors the ThinkingSlider, keybind cycle,
  and command-palette thinking picker.
- backend: agentSession send path floors the effective thinking level so all
  clients honor the per-model minimum.
- Models page: per-model Min Thinking dropdown (ModelRow + ModelsSection).
@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bbc1b182e5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/common/utils/thinking/policy.ts Outdated
Address Codex P1: the shared fallback thinking policy is used by non-reasoning
models (gpt-4o, claude-3.5), so defaulting them to a medium floor would send
unsupported reasoning params. Split the policy table into explicit rules vs the
shared default; only explicitly-recognized reasoning models default to medium
and expose the Models-page floor selector — others keep the legacy off behavior.

Also simplify the Models-page picker per review: drop the "Default (…)" option
and show the effective floor as a plain level, and hide the selector entirely
for non-reasoning / unrecognized / single-level models.
@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex Addressed the P1 in 7ac57fb (policy.ts:146).

Root cause was real: the shared fallback policy is used by non-reasoning models, so defaulting them to a medium floor would send unsupported reasoningEffort.

Fix: split getThinkingPolicyForModel into getExplicitThinkingPolicy (specific reasoning rules) vs the shared DEFAULT_THINKING_POLICY. getDefaultMinimumThinkingLevel now returns medium only when hasExplicitThinkingPolicy(model) is true; fallback-policy models (e.g. openai:gpt-4o, anthropic:claude-3.5) keep an off default, so agentSession does not floor them and buildProviderOptions still omits reasoning params. The Models-page selector is also gated on hasExplicitThinkingPolicy, so only recognized reasoning models expose the floor.

@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review
Please take another look.

@chatgpt-codex-connector

This comment has been minimized.

@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7ac57fb174

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/browser/features/Settings/Sections/ModelRow.tsx
- useMinThinkingLevels now reads an optional API context (new useOptionalAPI),
  so ThinkingProvider no longer crashes when mounted outside an APIProvider
  (e.g. the isolated useDraftWorkspaceSettings test harness); it degrades to
  default floors when no api is available.
- settings.spec.ts: scope the add-form combobox assertion, since the model
  tables now render per-model "Min Thinking" comboboxes.
@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c545e8b686

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/agentSession.ts
@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex Re: P2 "Detect thinking support through mapped models" (ModelRow.tsx:256) — intentionally keeping this gated on the raw fullId.

mappedToModel ("Treat as") is metadata-only (pricing, tokenizer, context window, capability flags). The thinking pipeline never resolves it:

  • agentSession send path clamps with the raw modelString (enforceThinkingPolicy(modelString, …, minThinkingLevelByModel[normalizeToCanonical(modelString)])) — no mapped lookup.
  • The provider request itself is sent to the raw custom id (providerModelFactory parses acme:fast → provider acme, model fast), so a custom model "treated as" gpt-5.5 still runs under the default policy ["off","low","medium","high"].
  • The runtime ThinkingSlider/keybind also key off the raw baseModel.

So detecting capability via mappedToModel in the settings UI only would make the selector show a medium default and levels (e.g. xhigh) that the runtime would never honor — a worse UI/runtime mismatch. Gating on the raw fullId keeps mapped custom models consistent end-to-end (no selector → legacy off behavior), matching the intended design.

Fully honoring mappedToModel for reasoning would require threading the mapping through the entire thinking pipeline (slider, keybind, backend), which is a separate, larger change and a pre-existing limitation. Resolving this thread as a deliberate design decision.

@chatgpt-codex-connector

This comment has been minimized.

Mobile clients previously listed raw capability levels and sent
enforceThinkingPolicy(model, level) without the floor, so a default-floored
model (e.g. gpt-5.5 → medium) let mobile users pick Off/Low that the backend
silently clamped up — the UI lied.

- New mobile useMinThinkingLevels hook fetches minThinkingLevelByModel from
  api.config.getConfig (react-query, best-effort; falls back to default floor).
- WorkspaceScreen threads the resolved floor through all enforceThinkingPolicy
  calls and passes it to RunSettingsSheet.
- RunSettingsSheet lists getAvailableThinkingLevels(model, floor) so the picker
  hides levels below the floor, matching desktop + backend.
@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex Fixed the mobile floor sync in 2f3d259 (P2 "Synchronize mobile thinking controls with the backend floor").

  • New mobile/src/hooks/useMinThinkingLevels.ts fetches minThinkingLevelByModel via api.config.getConfig() (react-query, best-effort; falls back to the built-in default floor when config is unavailable).
  • WorkspaceScreen threads the resolved floor through all four enforceThinkingPolicy(model, level, getMinimum(model)) call sites and passes it to RunSettingsSheet.
  • RunSettingsSheet now lists getAvailableThinkingLevels(model, minThinkingLevel), so the picker hides Off/Low below the floor — matching desktop and the backend clamp. A default-floored model (e.g. gpt-5.5 → medium) no longer lets mobile users select levels that get silently raised.

@chatgpt-codex-connector

This comment has been minimized.

@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2f3d259435

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/browser/hooks/useMinThinkingLevels.ts
If updateMinThinkingLevels fails, re-fetch config so the UI reverts to the
backend's actual floor instead of showing an override the send path never
applied. The optimistic update itself mirrors the existing routeOverrides
pattern; the onConfigChanged subscription reconciles on success.
@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex Addressed in 64ae2fb (P2 "Persist minimum thinking changes before exposing them").

  • Failed save: setMinThinkingLevel's .catch now re-fetches config, so a failed write reverts the UI to the backend's actual floor instead of showing an override that was never applied.
  • Optimistic window: the optimistic setMap mirrors the existing per-model routeOverrides pattern (useRouting), and on success the onConfigChanged subscription reconciles. The only race is "change a model's floor on the Models settings page, then send within the config-write round-trip" — the Models page is a separate surface from the composer, so this window is negligible and consistent with how route overrides already behave. Making it fully synchronous would diverge from that established pattern and add latency to the settings dropdown.

@chatgpt-codex-connector

This comment has been minimized.

@ammar-agent
Copy link
Copy Markdown
Collaborator Author

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Keep it up!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ammario ammario merged commit 276e05a into main May 29, 2026
40 of 42 checks passed
@ammario ammario deleted the models-j9mc branch May 29, 2026 22:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants