feat(anthropic): support Claude Opus 4.8 effort levels and adaptive thinking #360
…hinking
Opus 4.8 and 4.7 require adaptive thinking ({type: "adaptive"}); manual
budget_tokens thinking returns a 400. Add both model prefixes to the
adaptive-thinking set so reasoning requests route correctly.
Extend reasoning effort to the full Anthropic enum (low/medium/high/xhigh/max).
Opus 4.8 introduced xhigh and max; previously these were downgraded to low.
Map the new levels to thinking budgets on legacy models.
Document the Anthropic provider's reasoning effort mapping in a new provider
page, including the adaptive-vs-legacy split and model-specific effort gating.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
📝 Walkthrough
This PR extends the Anthropic provider to support higher reasoning effort levels.
Changes: Anthropic Extended Reasoning Effort Support
Estimated code review effort: 2 (Simple) | ~12 minutes
Pre-merge checks: 4 passed, 1 failed (1 warning)
Greptile Summary
This PR adds fuller Anthropic reasoning-effort support for newer Claude models.
Confidence Score: 5/5. This looks safe to merge.
Reviews (2): Last reviewed commit: "test(anthropic): add legacy xhigh Respon..."
Codecov Report
All modified and coverable lines are covered by tests.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/providers/anthropic.mdx`:
- Line 43: The docs line implying "Opus 4.6+, Sonnet 4.6" is misleading because
the code uses an explicit prefix list (adaptiveThinkingPrefixes) checked by
isAdaptiveThinkingModel; update the wording in the anthropic.mdx row for
Adaptive to explicitly list the concrete, supported versions (e.g., "Adaptive
(currently: Opus 4.6, 4.7, 4.8; Sonnet 4.6)" or similar) or remove the "4.6+"
shorthand so it matches the actual behavior of isAdaptiveThinkingModel.
- Around line 58-62: Update the anthropic.mdx wording to clarify that when
`reasoning.effort` is omitted GoModel does not set `thinking: {type:
"adaptive"}` (so extended thinking is not explicitly enabled), but many adaptive
models may still auto-apply adaptive thinking with a default `effort` of `high`;
rephrase the sentence that currently says "requests run without thinking" to a
conditional statement that mentions some models auto-apply adaptive thinking
when `thinking` is unset, and remove the implication that `effort`
deterministically controls total token spend—instead note that `effort` is soft
guidance for depth/likelihood and that final token usage is variable and bounded
by `max_tokens`.
In `@internal/providers/anthropic/anthropic_test.go`:
- Around line 3655-3704: Add matching ResponsesRequest test cases in
TestConvertResponsesRequestToAnthropic_ReasoningEffort to mirror the new
ChatRequest cases (e.g., "opus 4.8 - adaptive thinking with xhigh effort", "opus
4.8 - adaptive thinking with max effort", "opus 4.7 - adaptive thinking with
high effort", and the legacy mapping cases) so the /v1/responses conversion is
validated; for each case use the same model and reasoning values but set
maxOutputTokens (instead of maxTokens), and include expectedThinkType,
expectedEffort, expectedMaxTokens, expectNilTemp and expectedBudget where
applicable to match the ChatRequest tests and ensure parity across conversion
paths.
- Around line 3939-3942: Add an explicit negative test entry asserting Sonnet
4.8 does NOT accept adaptive thinking by adding entries like
{"claude-sonnet-4-8", false} (and optionally the dated variant
{"claude-sonnet-4-8-20260301", false}) to the same test cases map/array used
alongside {"claude-opus-4-8", true} / {"claude-opus-4-7", true}, and ensure the
loop/assertion that checks adaptive support uses those boolean expectations to
fail if Sonnet 4.8 is treated as adaptive.
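The allowlist behavior these comments describe can be sketched as follows. This is a minimal illustration, not the repository's code: the names `adaptiveThinkingPrefixes` and `isAdaptiveThinkingModel` come from the review above, the Opus 4.8/4.7 prefixes come from the suggested test cases, and the two 4.6 prefixes are assumed from the docs wording under discussion.

```go
package main

import (
	"fmt"
	"strings"
)

// adaptiveThinkingPrefixes is an explicit allowlist, not a version comparison.
// The Opus 4.8/4.7 entries appear in the review; the 4.6 entries are assumed.
var adaptiveThinkingPrefixes = []string{
	"claude-opus-4-8",
	"claude-opus-4-7",
	"claude-opus-4-6",   // assumed prefix
	"claude-sonnet-4-6", // assumed prefix
}

// isAdaptiveThinkingModel reports whether the model ID starts with one of the
// allowlisted prefixes, so dated variants of a listed model also match.
func isAdaptiveThinkingModel(model string) bool {
	for _, prefix := range adaptiveThinkingPrefixes {
		if strings.HasPrefix(model, prefix) {
			return true
		}
	}
	return false
}

func main() {
	models := []string{
		"claude-opus-4-8",
		"claude-opus-4-7",
		"claude-sonnet-4-8",
		"claude-sonnet-4-8-20260301",
		"claude-3-5-sonnet-20241022",
	}
	for _, m := range models {
		fmt.Printf("%s adaptive=%v\n", m, isAdaptiveThinkingModel(m))
	}
}
```

Because matching is by prefix rather than by version number, a hypothetical `claude-sonnet-4-8` stays on the legacy path until its prefix is added, which is the property the negative test case pins down.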
📒 Files selected for processing (8)
- docs/docs.json
- docs/providers/anthropic.mdx
- docs/providers/overview.mdx
- internal/core/types.go
- internal/providers/anthropic/anthropic.go
- internal/providers/anthropic/anthropic_test.go
- internal/providers/anthropic/request_translation.go
- internal/providers/anthropic/types.go
Address PR review feedback:
- Cap the legacy (manual-thinking) budget at the "high" level (20000) for the xhigh/max effort levels. These are adaptive-thinking features (Opus 4.6+) that legacy models do not support, and inflating budget_tokens (and max_tokens with it) to 32000/48000 could exceed older models' output limits and return a 400.
- Clarify the docs: adaptive routing is an explicit allowlist (not a version comparison), and effort is a behavioral signal bounded by max_tokens.
- Add Responses-path test parity for the new effort levels and a negative TestIsAdaptiveThinkingModel case asserting a hypothetical Sonnet 4.8 is not treated as adaptive.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Actionable comments posted: 1
♻️ Duplicate comments (1)
internal/providers/anthropic/anthropic_test.go (1)
3874-3913: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Add missing legacy `xhigh` mapping case in Responses-path reasoning tests.
`TestConvertResponsesRequestToAnthropic_ReasoningEffort` still lacks the legacy `xhigh` case, so lines 3874-3913 don't fully mirror the chat-path mapping matrix for the same behavior change.
Suggested minimal addition:
     {
    +	name:              "legacy model - xhigh effort caps at high budget",
    +	model:             "claude-3-5-sonnet-20241022",
    +	reasoning:         &core.Reasoning{Effort: "xhigh"},
    +	maxOutputTokens:   new(25000),
    +	expectedThinkType: "enabled",
    +	expectedBudget:    20000,
    +	expectedMaxTokens: 25000,
    +	expectNilTemp:     true,
    +},
    +{
     	name:      "legacy model - max effort caps at high budget",
     	model:     "claude-3-5-sonnet-20241022",
     	reasoning: &core.Reasoning{Effort: "max"},
As per coding guidelines, "Add or update tests for behavior changes to cover request translation … and provider-specific parameter mapping."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/providers/anthropic/anthropic_test.go` around lines 3874 - 3913, Add a test case to TestConvertResponsesRequestToAnthropic_ReasoningEffort mirroring the legacy model xhigh mapping: create a table entry using model "claude-3-5-sonnet-20241022" with reasoning &core.Reasoning{Effort: "xhigh"}, maxOutputTokens new(25000), and set expectedThinkType "enabled", expectedBudget 20000, expectedMaxTokens 25000, expectNilTemp true so the Responses-path tests match the chat-path mapping matrix.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/providers/anthropic.mdx`:
- Around line 64-69: Update the paragraph describing omission of `reasoning` to
remove the contradiction: clarify that when you omit `reasoning`, GoModel does
not set the `thinking` field (so GoModel does not explicitly enable extended
thinking), but some Claude models may still apply server-side adaptive thinking
when `thinking` is not present and default that adaptive effort to `high`;
reference the `reasoning` and `thinking` fields and the GoModel/Claude behavior
so the text states that absence means GoModel didn't request thinking, while
Claude may independently default to `high` effort.
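The distinction this comment draws, between what GoModel sends and what Claude may do on its own, can be sketched as a small translation step. The struct and function names below are hypothetical and do not mirror the repository's types; only the `thinking`, `budget_tokens`, and `output_config.effort` field names are taken from the PR text.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Reasoning stands in for the OpenAI-style reasoning option (hypothetical type).
type Reasoning struct {
	Effort string
}

type thinking struct {
	Type         string `json:"type"`
	BudgetTokens int    `json:"budget_tokens,omitempty"`
}

type outputConfig struct {
	Effort string `json:"effort"`
}

type anthropicRequest struct {
	Model        string        `json:"model"`
	MaxTokens    int           `json:"max_tokens"`
	Thinking     *thinking     `json:"thinking,omitempty"`
	OutputConfig *outputConfig `json:"output_config,omitempty"`
}

// applyReasoning sets thinking fields only when an effort was requested.
// With no effort, both fields stay unset and any default is Anthropic's.
func applyReasoning(req *anthropicRequest, r *Reasoning, adaptive bool, legacyBudget int) {
	if r == nil || r.Effort == "" {
		return // nothing requested: do not configure thinking at all
	}
	if adaptive {
		req.Thinking = &thinking{Type: "adaptive"}
		req.OutputConfig = &outputConfig{Effort: r.Effort}
		return
	}
	// Legacy path: manual thinking with an explicit token budget.
	req.Thinking = &thinking{Type: "enabled", BudgetTokens: legacyBudget}
}

func main() {
	omitted := anthropicRequest{Model: "claude-opus-4-8", MaxTokens: 1024}
	applyReasoning(&omitted, nil, true, 0)

	xhigh := anthropicRequest{Model: "claude-opus-4-8", MaxTokens: 1024}
	applyReasoning(&xhigh, &Reasoning{Effort: "xhigh"}, true, 0)

	for _, req := range []anthropicRequest{omitted, xhigh} {
		body, err := json.Marshal(req)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(body))
	}
}
```

In the first request neither `thinking` nor `output_config` appears in the JSON, so whatever thinking behavior follows is decided server-side rather than by the gateway.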
📒 Files selected for processing (3)
- docs/providers/anthropic.mdx
- internal/providers/anthropic/anthropic_test.go
- internal/providers/anthropic/request_translation.go
Address follow-up review feedback:
- Add the legacy "xhigh effort caps at high budget" case to the Responses-path reasoning test so it fully mirrors the chat-path mapping matrix.
- Reword the effort doc note to remove the apparent contradiction: effort is a separate control that governs overall token spend whether or not extended thinking is engaged, defaulting to high when unset.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
♻️ Duplicate comments (1)
docs/providers/anthropic.mdx (1)
64-70: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Clarify omission behavior to avoid implying deterministic server-side effort application.
Lines 66-70 mix two different behaviors: GoModel not setting thinking/output_config, and Anthropic's possible server defaults. Since GoModel only applies reasoning when `reasoning.effort` is present, this should be phrased conditionally (e.g., some models may auto-apply adaptive thinking when `thinking` is unset), not as guaranteed "effort governs overall token spend whether or not thinking is engaged."
Suggested doc tweak:
    - without it those models do not engage extended thinking. Effort is a separate
    - control that governs overall token spend (text and tool calls) whether or not
    - thinking is engaged, and Anthropic defaults it to `high` when unset. It is a
    + without it GoModel does not configure thinking. Some Claude models may still
    + auto-apply adaptive thinking server-side when `thinking` is unset (with a
    + default effort of `high`). Effort is a
      behavioral signal for depth and verbosity, not a hard budget — actual usage
      varies per request and is bounded by `max_tokens`.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/providers/anthropic.mdx` around lines 64 - 70, Reword the paragraph to separate GoModel behavior from Anthropic server defaults: state that GoModel only sets thinking: {type: "adaptive"} when reasoning.effort is provided (i.e., GoModel will not enable adaptive thinking unless reasoning.effort is present), and then add a conditional note that some Anthropic models or the Anthropic service may apply their own default thinking/output_config or effort heuristics when thinking is unset; keep references to reasoning.effort, thinking, effort, adaptive, and max_tokens so readers know which controls are local (GoModel) versus potential server-side defaults (Anthropic).
📒 Files selected for processing (2)
- docs/providers/anthropic.mdx
- internal/providers/anthropic/anthropic_test.go
Summary
Adds full GoModel support for Claude Opus 4.8 (and fixes the same gap for 4.7) on the OpenAI-compatible surface.
- Opus 4.8 and 4.7 require `thinking: {type: "adaptive"}`; manual `budget_tokens` thinking returns a 400. Added both prefixes to the adaptive-thinking set so reasoning requests route to `output_config.effort` instead of a rejected `budget_tokens` payload.
- Extended `reasoning.effort` to the full Anthropic enum `low`/`medium`/`high`/`xhigh`/`max`. `xhigh` and `max` (introduced with Opus 4.8) were previously downgraded to `low`. Legacy budget-token models map the new levels to higher thinking budgets.
- Added `docs/providers/anthropic.mdx` documenting the effort→thinking mapping (adaptive vs legacy), the budget-token table, and model-specific effort gating; wired into the providers overview and nav.

User-visible impact
Clients can now send `"reasoning": {"effort": "xhigh"}` (or `max`) to Opus 4.8/4.7 and have it forwarded as `output_config.effort` with adaptive thinking, instead of being silently weakened to `low`. Omitting `reasoning` is unchanged (Anthropic applies its server-side default of `high`).

Provider-specific behavior
- [...] `xhigh` on 4.8/4.7 only; `max` on 4.8/4.7/4.6/Sonnet 4.6). GoModel forwards the requested level and lets Anthropic validate, rather than encoding a capability matrix that would drift against the model registry.
- [...] `system` entries) remain available through the `/p/anthropic/messages` passthrough route.

Testing
- [...] `xhigh`/`max`, Opus 4.7 adaptive routing, legacy budget mappings, and `isAdaptiveThinkingModel` coverage for 4.8/4.7.
- `make test-race`, `make lint`, and `mint validate` pass via pre-commit hooks.

🤖 Generated with Claude Code