Commit 27ef4af

🤖 fix: enable reasoning for Claude Opus 4.5 (#754)
## Summary

Fixes missing reasoning traces in the UI when using Claude Opus 4.5.

## Problem

Opus 4.5 supports two separate but complementary parameters for reasoning:

1. **`effort`** (new in Opus 4.5): Controls how eagerly Claude spends tokens across ALL output (text, tool calls, and thinking). Values: `low`, `medium`, `high`.
2. **`thinking`** (extended thinking): Enables visible reasoning traces with a token budget. This is what makes the "Thinking..." UI appear.

We were only passing `effort`, which controls token spend but **doesn't enable the reasoning traces to be returned from the API**. The `thinking` parameter must also be set for reasoning to be visible.

## Solution

For Opus 4.5, now pass both parameters:

```typescript
{
  anthropic: {
    thinking: {
      type: "enabled",
      budgetTokens: 20000, // Enables visible reasoning traces
    },
    effort: "high", // Controls token spend eagerness
  }
}
```

## How these parameters interact

From [Anthropic's docs](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#effort-with-extended-thinking):

> The effort parameter works alongside the thinking token budget when extended thinking is enabled. These two controls serve different purposes:
> - **Effort parameter**: Controls how Claude spends all tokens—including thinking tokens, text responses, and tool calls
> - **Thinking token budget**: Sets a maximum limit on thinking tokens specifically
>
> The effort parameter can be used with or without extended thinking enabled.

| Configuration | Behavior |
|---------------|----------|
| `effort` only | Controls token spend, no visible reasoning |
| `thinking` only | Shows reasoning with default budget |
| `effort` + `thinking` | Shows reasoning WITH token spend control |

## Testing

- Verified reasoning traces now appear in UI when using Opus 4.5
- `make typecheck` passes

_Generated with `mux`_

Signed-off-by: Thomas Kosiewski <tk@coder.com>
1 parent 08e02a6 commit 27ef4af

6 files changed: +50 -40 lines changed


src/browser/components/ThinkingSlider.stories.tsx

Lines changed: 7 additions & 8 deletions
@@ -48,9 +48,7 @@ export const DifferentModels: Story = {
         </div>

         <div>
-          <div className="text-muted-light font-primary mb-2 text-xs">
-            Claude Opus 4.5 (3 levels: low/medium/high)
-          </div>
+          <div className="text-muted-light font-primary mb-2 text-xs">Claude Opus 4.5 (4 levels)</div>
           <ThinkingSliderComponent modelString="anthropic:claude-opus-4-5" />
         </div>

@@ -116,18 +114,19 @@ export const InteractiveDemo: Story = {
   },
 };

-export const Opus45ThreeLevels: Story = {
+export const Opus45AllLevels: Story = {
   args: { modelString: "anthropic:claude-opus-4-5" },
   render: (args) => (
     <div className="bg-dark flex min-w-80 flex-col gap-[30px] p-10">
       <div className="text-bright font-primary mb-2.5 text-[13px]">
-        Claude Opus 4.5 uses the effort parameter (low/medium/high only, no &ldquo;off&rdquo;):
+        Claude Opus 4.5 uses the effort parameter with optional extended thinking:
       </div>
       <ThinkingSliderComponent modelString={args.modelString} />
       <div className="text-muted-light font-primary mt-2.5 text-[11px]">
-        <strong>Low</strong>: Conservative token usage
-        <br /><strong>Medium</strong>: Balanced usage (default)
-        <br /><strong>High</strong>: Best results, more tokens
+        <strong>Off</strong>: effort=&ldquo;low&rdquo;, no visible reasoning
+        <br /><strong>Low</strong>: effort=&ldquo;low&rdquo;, visible reasoning
+        <br /><strong>Medium</strong>: effort=&ldquo;medium&rdquo;, visible reasoning
+        <br /><strong>High</strong>: effort=&ldquo;high&rdquo;, visible reasoning
       </div>
     </div>
   ),

src/browser/utils/thinking/policy.test.ts

Lines changed: 10 additions & 9 deletions
@@ -33,13 +33,17 @@ describe("getThinkingPolicyForModel", () => {
     ]);
   });

-  test("returns low/medium/high for Opus 4.5", () => {
+  test("returns all levels for Opus 4.5 (uses default policy)", () => {
+    // Opus 4.5 uses the default policy - no special case needed
+    // The effort parameter handles the "off" case by setting effort="low"
     expect(getThinkingPolicyForModel("anthropic:claude-opus-4-5")).toEqual([
+      "off",
       "low",
       "medium",
       "high",
     ]);
     expect(getThinkingPolicyForModel("anthropic:claude-opus-4-5-20251101")).toEqual([
+      "off",
       "low",
       "medium",
       "high",
@@ -95,19 +99,16 @@ describe("enforceThinkingPolicy", () => {
     });
   });

-  describe("Opus 4.5 (no off option)", () => {
-    test("allows low/medium/high levels", () => {
+  describe("Opus 4.5 (all levels supported)", () => {
+    test("allows all levels including off", () => {
+      expect(enforceThinkingPolicy("anthropic:claude-opus-4-5", "off")).toBe("off");
       expect(enforceThinkingPolicy("anthropic:claude-opus-4-5", "low")).toBe("low");
       expect(enforceThinkingPolicy("anthropic:claude-opus-4-5", "medium")).toBe("medium");
       expect(enforceThinkingPolicy("anthropic:claude-opus-4-5", "high")).toBe("high");
     });

-    test("falls back to high when off is requested", () => {
-      expect(enforceThinkingPolicy("anthropic:claude-opus-4-5", "off")).toBe("high");
-    });
-
-    test("falls back to high when off is requested (versioned model)", () => {
-      expect(enforceThinkingPolicy("anthropic:claude-opus-4-5-20251101", "off")).toBe("high");
+    test("allows off for versioned model", () => {
+      expect(enforceThinkingPolicy("anthropic:claude-opus-4-5-20251101", "off")).toBe("off");
     });
   });
 });

src/browser/utils/thinking/policy.ts

Lines changed: 1 addition & 14 deletions
@@ -25,7 +25,6 @@ export type ThinkingPolicy = readonly ThinkingLevel[];
  *
  * Rules:
  * - openai:gpt-5-pro → ["high"] (only supported level)
- * - anthropic:claude-opus-4-5 → ["low", "medium", "high"] (effort parameter only)
  * - gemini-3 → ["low", "high"] (thinking level only)
  * - default → ["off", "low", "medium", "high"] (all levels selectable)
  *
@@ -39,12 +38,6 @@ export function getThinkingPolicyForModel(modelString: string): ThinkingPolicy {
     return ["high"];
   }

-  // Claude Opus 4.5 only supports effort parameter: low, medium, high (no "off")
-  // Match "anthropic:" followed by "claude-opus-4-5" with optional version suffix
-  if (modelString.includes("opus-4-5")) {
-    return ["low", "medium", "high"];
-  }
-
   // Gemini 3 Pro only supports "low" and "high" reasoning levels
   if (modelString.includes("gemini-3")) {
     return ["low", "high"];
@@ -59,8 +52,7 @@ export function getThinkingPolicyForModel(modelString: string): ThinkingPolicy {
  *
  * Fallback strategy:
  * 1. If requested level is allowed, use it
- * 2. For Opus 4.5: prefer "high" (best experience for reasoning model)
- * 3. Otherwise: prefer "medium" if allowed, else use first allowed level
+ * 2. Otherwise: prefer "medium" if allowed, else use first allowed level
  */
 export function enforceThinkingPolicy(
   modelString: string,
@@ -72,11 +64,6 @@ export function enforceThinkingPolicy(
     return requested;
   }

-  // Special case: Opus 4.5 defaults to "high" for best experience
-  if (modelString.includes("opus-4-5") && allowed.includes("high")) {
-    return "high";
-  }
-
   // Fallback: prefer "medium" if allowed, else use first allowed level
   return allowed.includes("medium") ? "medium" : allowed[0];
 }
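
For readers who want the post-change behavior in one place, the following is a minimal, self-contained sketch of the two policy functions as they read after this diff. The `gpt-5-pro` check is shown as a plain substring match, which is an assumption (its real condition lies outside the hunks above), and the `google:gemini-3-pro` model string in the usage note is purely illustrative.

```typescript
// Sketch only: approximates policy.ts after this commit; not the actual file.
type ThinkingLevel = "off" | "low" | "medium" | "high";
type ThinkingPolicy = readonly ThinkingLevel[];

function getThinkingPolicyForModel(modelString: string): ThinkingPolicy {
  // Assumption: the real gpt-5-pro check may be stricter than a substring match.
  if (modelString.includes("gpt-5-pro")) {
    return ["high"];
  }

  // Gemini 3 only supports "low" and "high".
  if (modelString.includes("gemini-3")) {
    return ["low", "high"];
  }

  // Opus 4.5 no longer has a special case, so it falls through to the default.
  return ["off", "low", "medium", "high"];
}

function enforceThinkingPolicy(modelString: string, requested: ThinkingLevel): ThinkingLevel {
  const allowed = getThinkingPolicyForModel(modelString);

  // 1. If the requested level is allowed, use it.
  if (allowed.includes(requested)) {
    return requested;
  }

  // 2. Otherwise prefer "medium" if allowed, else the first allowed level.
  return allowed.includes("medium") ? "medium" : allowed[0];
}
```

With this sketch, `enforceThinkingPolicy("anthropic:claude-opus-4-5", "off")` returns `"off"` instead of the previous `"high"` fallback, while a Gemini 3 model asked for `"medium"` still resolves to `"low"`, the first level it allows.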

src/common/types/thinking.ts

Lines changed: 2 additions & 2 deletions
@@ -42,8 +42,8 @@ export const ANTHROPIC_THINKING_BUDGETS: Record<ThinkingLevel, number> = {
  *
  * @see https://www.anthropic.com/news/claude-opus-4-5
  */
-export const ANTHROPIC_EFFORT: Record<ThinkingLevel, "low" | "medium" | "high" | undefined> = {
-  off: undefined,
+export const ANTHROPIC_EFFORT: Record<ThinkingLevel, "low" | "medium" | "high"> = {
+  off: "low",
   low: "low",
   medium: "medium",
   high: "high",
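
One consequence of dropping `undefined` from the value type is that callers no longer need a guard before passing `effort` along. The sketch below illustrates that under stated assumptions: `AnthropicEffort` and `effortFor` are hypothetical names introduced here, not identifiers from the repository.

```typescript
// Sketch only: a standalone copy of the mapping after this change.
type ThinkingLevel = "off" | "low" | "medium" | "high";
type AnthropicEffort = "low" | "medium" | "high"; // hypothetical alias

const ANTHROPIC_EFFORT: Record<ThinkingLevel, AnthropicEffort> = {
  off: "low", // "off" still sends an effort value, the cheapest one
  low: "low",
  medium: "medium",
  high: "high",
};

// Hypothetical helper: the return type has no `undefined`, so the result can be
// assigned to a required `effort` field without a conditional spread.
function effortFor(level: ThinkingLevel): AnthropicEffort {
  return ANTHROPIC_EFFORT[level];
}
```

Note that `off` and `low` map to the same effort; what separates them is whether `thinking` is enabled, which is decided elsewhere.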

src/common/utils/ai/providerOptions.test.ts

Lines changed: 12 additions & 3 deletions
@@ -23,37 +23,46 @@ void mock.module("@/browser/utils/thinking/policy", () => ({

 describe("buildProviderOptions - Anthropic", () => {
   describe("Opus 4.5 (effort parameter)", () => {
-    test("should use effort parameter for claude-opus-4-5", () => {
+    test("should use effort and thinking parameters for claude-opus-4-5", () => {
       const result = buildProviderOptions("anthropic:claude-opus-4-5", "medium");

       expect(result).toEqual({
         anthropic: {
           disableParallelToolUse: false,
           sendReasoning: true,
+          thinking: {
+            type: "enabled",
+            budgetTokens: 10000, // ANTHROPIC_THINKING_BUDGETS.medium
+          },
           effort: "medium",
         },
       });
     });

-    test("should use effort parameter for claude-opus-4-5-20251101", () => {
+    test("should use effort and thinking parameters for claude-opus-4-5-20251101", () => {
       const result = buildProviderOptions("anthropic:claude-opus-4-5-20251101", "high");

       expect(result).toEqual({
         anthropic: {
           disableParallelToolUse: false,
           sendReasoning: true,
+          thinking: {
+            type: "enabled",
+            budgetTokens: 20000, // ANTHROPIC_THINKING_BUDGETS.high
+          },
           effort: "high",
         },
       });
     });

-    test("should omit effort when thinking is off for Opus 4.5", () => {
+    test("should use effort 'low' with no thinking when off for Opus 4.5", () => {
       const result = buildProviderOptions("anthropic:claude-opus-4-5", "off");

       expect(result).toEqual({
         anthropic: {
           disableParallelToolUse: false,
           sendReasoning: true,
+          effort: "low", // "off" maps to effort: "low" for efficiency
         },
       });
     });

src/common/utils/ai/providerOptions.ts

Lines changed: 18 additions & 4 deletions
@@ -93,20 +93,34 @@ export function buildProviderOptions(
     const isOpus45 = modelName?.includes("opus-4-5") ?? false;

     if (isOpus45) {
-      // Opus 4.5: Use effort parameter for reasoning control
-      const effort = ANTHROPIC_EFFORT[effectiveThinking];
+      // Opus 4.5: Use effort parameter AND optionally thinking for visible reasoning
+      // - "off" → effort: "low", no thinking (fast, no visible reasoning)
+      // - "low" → effort: "low", thinking enabled (visible reasoning)
+      // - "medium" → effort: "medium", thinking enabled
+      // - "high" → effort: "high", thinking enabled
+      const effortLevel = ANTHROPIC_EFFORT[effectiveThinking];
+      const budgetTokens = ANTHROPIC_THINKING_BUDGETS[effectiveThinking];
       log.debug("buildProviderOptions: Anthropic Opus 4.5 config", {
-        effort,
+        effort: effortLevel,
+        budgetTokens,
         thinkingLevel: effectiveThinking,
       });

       const options: ProviderOptions = {
         anthropic: {
           disableParallelToolUse: false, // Always enable concurrent tool execution
           sendReasoning: true, // Include reasoning traces in requests sent to the model
+          // Enable thinking to get visible reasoning traces (only when not "off")
+          // budgetTokens sets the ceiling; effort controls how eagerly tokens are spent
+          ...(budgetTokens > 0 && {
+            thinking: {
+              type: "enabled",
+              budgetTokens,
+            },
+          }),
           // Use effort parameter (Opus 4.5 only) to control token spend
           // SDK auto-adds beta header "effort-2025-11-24" when effort is set
-          ...(effort && { effort }),
+          effort: effortLevel,
         },
       };
       log.debug("buildProviderOptions: Returning Anthropic Opus 4.5 options", options);
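
The Opus 4.5 branch above can be read in isolation as the following sketch. It is a simplified stand-in rather than the real `buildProviderOptions`: `buildOpus45Options`, `Opus45Options`, `EFFORT`, and `BUDGETS` are hypothetical names, logging is omitted, and the budget table is partly assumed. The `medium` and `high` budgets match the values asserted in `providerOptions.test.ts`, `off: 0` is implied by the `budgetTokens > 0` guard, and the `low` value is a placeholder because the diff does not show it.

```typescript
// Sketch only: isolates the Opus 4.5 option-building logic from this commit.
type ThinkingLevel = "off" | "low" | "medium" | "high";
type Effort = "low" | "medium" | "high";

interface Opus45Options {
  anthropic: {
    disableParallelToolUse: boolean;
    sendReasoning: boolean;
    thinking?: { type: "enabled"; budgetTokens: number };
    effort: Effort;
  };
}

const EFFORT: Record<ThinkingLevel, Effort> = {
  off: "low",
  low: "low",
  medium: "medium",
  high: "high",
};

// Assumption: `low` is a placeholder value; only medium and high appear in the diff.
const BUDGETS: Record<ThinkingLevel, number> = {
  off: 0,
  low: 4000,
  medium: 10000,
  high: 20000,
};

function buildOpus45Options(level: ThinkingLevel): Opus45Options {
  const budgetTokens = BUDGETS[level];
  return {
    anthropic: {
      disableParallelToolUse: false,
      sendReasoning: true,
      // Only attach `thinking` when there is a positive budget, i.e. not "off".
      ...(budgetTokens > 0
        ? { thinking: { type: "enabled" as const, budgetTokens } }
        : {}),
      // `effort` is always sent; "off" maps to the cheapest setting.
      effort: EFFORT[level],
    },
  };
}
```

Calling `buildOpus45Options("off")` yields an object with `effort: "low"` and no `thinking` key, which corresponds to the `effort`-only row of the configuration table in the commit message; every other level produces both parameters.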
