feat(ai): route structured output through native combined mode where supported (closes #605) #609
…upported (closes #605)
When an adapter declares `supportsCombinedToolsAndSchema()`, the engine wires `outputSchema` into the regular `chatStream` call and harvests the schema-constrained JSON from the agent loop's final-turn text, skipping the separate finalization round-trip introduced in #600 (which remains the fallback for adapters that can't combine tools + schema in one call).
- Opted in: modern OpenAI Chat Completions, OpenAI Responses, Claude 4.5+.
- Opted out explicitly: Groq (API-rejected), Grok (pending per-model gate).
- Unchanged (legacy path): Claude 4.4 and earlier, Gemini, Ollama, OpenRouter.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
🚀 Changeset Version Preview: 21 package(s) bumped directly, 9 bumped as dependents.
@tanstack/ai
@tanstack/ai-anthropic
@tanstack/ai-client
@tanstack/ai-code-mode
@tanstack/ai-code-mode-skills
@tanstack/ai-devtools-core
@tanstack/ai-elevenlabs
@tanstack/ai-event-client
@tanstack/ai-fal
@tanstack/ai-gemini
@tanstack/ai-grok
@tanstack/ai-groq
@tanstack/ai-isolate-cloudflare
@tanstack/ai-isolate-node
@tanstack/ai-isolate-quickjs
@tanstack/ai-ollama
@tanstack/ai-openai
@tanstack/ai-openrouter
@tanstack/ai-preact
@tanstack/ai-react
@tanstack/ai-react-ui
@tanstack/ai-solid
@tanstack/ai-solid-ui
@tanstack/ai-svelte
@tanstack/ai-utils
@tanstack/ai-vue
@tanstack/ai-vue-ui
@tanstack/openai-base
@tanstack/preact-ai-devtools
@tanstack/react-ai-devtools
@tanstack/solid-ai-devtools
…mple
Adds 'anthropic' as a selectable provider in the structured-output generation demo so users can see Claude 4.5+ streaming the schema-constrained JSON natively via the #605 combined-mode path (`output_format` + `tools` in one beta Messages call) alongside the existing OpenAI / Grok / Groq / OpenRouter options.
Only Claude 4.5+ models are listed because older Claude models still fall back to the non-streaming forced-tool-use workaround.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
…rmat) for native combined mode
Anthropic deprecated the top-level `output_format` field in favor of `output_config.format`; the API now returns:
> "output_format: This field is deprecated. Use 'output_config.format'
> instead."
Wire the schema under `output_config.format` instead, merging with any existing `output_config` from `modelOptions` so callers can keep tuning `output_config.effort` alongside the schema. The SDK's `BetaOutputConfig` type currently exposes only `effort`; we type `format` explicitly on `InternalTextProviderOptions.output_config` so the adapter call site doesn't need a cast.
Updates the matching native-combined-mode unit test.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
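The merge described in this commit could look roughly like the sketch below. It is an illustration under assumptions, not the adapter's actual code: `OutputConfigLike` and `mergeOutputConfig` are hypothetical names, and letting the engine-supplied `format` win over a caller-supplied one is a guess at the intended precedence.

```typescript
// Hypothetical, simplified option shape for illustration only.
interface OutputConfigLike {
  effort?: string
  format?: { type: 'json_schema'; schema: Record<string, unknown> }
}

// Combine the caller's output_config (e.g. { effort }) with the schema the
// engine wants to send. Assumption: the engine-set format takes precedence.
function mergeOutputConfig(
  userConfig: OutputConfigLike | undefined,
  outputSchema: Record<string, unknown> | undefined,
): OutputConfigLike | undefined {
  // No schema requested: pass the caller's config through untouched.
  if (!outputSchema) return userConfig
  return {
    ...userConfig,
    format: { type: 'json_schema', schema: outputSchema },
  }
}

const merged = mergeOutputConfig(
  { effort: 'high' },
  { type: 'object', properties: {} },
)
```

Here `merged` keeps the caller's `effort` and gains the `format` entry, which is the property the commit message relies on.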
…ing via combined-mode set
Two related bugs surfaced when picking "Claude Opus 4.7" from the dropdown
and running structured output streaming:
1. **Wrong model id in the dropdown.** The 4.7 line (and the 4.6-fast
variant) use a dot separator in ai-anthropic/model-meta —
`claude-opus-4.7`, `claude-opus-4.7-fast`, `claude-opus-4.6-fast` —
while 4.5/4.6 base releases use a dash. The dropdown previously sent
`claude-opus-4-7`, which doesn't match any model id, so
`AnthropicTextAdapter.supportsCombinedToolsAndSchema()` returned false
(set membership miss). Engine fell through to the legacy forced-tool
finalization path, which rejects `thinking` → API 400.
2. **Reasoning gate drifted from combined-mode gate.** The example's
`reasoningOptionsFor()` enabled `thinking` based on a `claude-{family}-4-`
prefix check that admitted Claude 4.0 / 4.1 models — which are NOT in
the combined-mode set. Same forced-tool + thinking → 400 trap. Now
imports the canonical `ANTHROPIC_COMBINED_TOOLS_AND_SCHEMA_MODELS` set
from `@tanstack/ai-anthropic` and gates strictly to its membership so
the two checks can't drift again.
Also adds `claude-opus-4.6-fast` and `claude-opus-4.7-fast` to the dropdown
now that the ids match.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
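To make the drift in item 2 concrete, here is a small sketch comparing a prefix check with a set-membership check. The set contents and the 4.1 model id are illustrative stand-ins, not the real `ANTHROPIC_COMBINED_TOOLS_AND_SCHEMA_MODELS` members, and the regular expression is only an approximation of the example's old `claude-{family}-4-` check.

```typescript
// Stand-in for the set exported by @tanstack/ai-anthropic; members are illustrative.
const COMBINED_MODELS: ReadonlySet<string> = new Set([
  'claude-opus-4-5',
  'claude-opus-4-6',
])

// Old-style gate: any id with a claude-{family}-4- prefix passes,
// which also admits 4.0 / 4.1 ids.
const prefixGate = (model: string): boolean =>
  /^claude-[a-z]+-4-/.test(model)

// New-style gate: only ids present in the canonical set pass.
const membershipGate = (model: string): boolean => COMBINED_MODELS.has(model)

// Illustrative 4.1-style id that is not in the combined-mode set.
const legacyModel = 'claude-opus-4-1'

const prefixSaysThink = prefixGate(legacyModel)
const membershipSaysThink = membershipGate(legacyModel)
```

With these stand-ins the prefix gate would enable `thinking` for the legacy id while the membership gate would not, which is the mismatch that led to the 400.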
…-7-fast (use dashes, not dots)
The Anthropic API itself surfaced the bug:
> "model: claude-opus-4.7 was not found. Did you mean claude-opus-4-7?"
ai-anthropic/model-meta.ts had `claude-opus-4.6-fast`, `claude-opus-4.7`,
and `claude-opus-4.7-fast` defined with dot separators — but the actual
Anthropic API uses dashes (`claude-opus-4-7`), matching the convention
already used for `claude-opus-4-5` and `claude-opus-4-6`. The dotted ids
in this repo's model-meta have never resolved against the API; any
caller selecting one of these models was getting a 404 from Anthropic.
Now that the ids are right:
- Calls to `anthropicText('claude-opus-4-7')` etc. reach the real model.
- `ANTHROPIC_COMBINED_TOOLS_AND_SCHEMA_MODELS` (which references
`CLAUDE_OPUS_4_7.id` and friends) picks up the dash form
automatically, so the engine's #605 native-combined-mode routing
matches on the same string the dev server actually sends.
Also reverts the ts-react-chat dropdown to use dashes, which is now both
internally consistent with model-meta AND correct against the API.
OpenRouter catalog ids (`anthropic/claude-opus-4.7` etc.) are untouched —
OpenRouter uses dot separators in its own naming and that's a separate
mapping table.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
…e thinking on Claude 4.7
Claude 4.7 deprecated `thinking: { type: 'enabled', budget_tokens }`;
the API rejects it with:
> "thinking.type.enabled is not supported for this model. Use
> thinking.type.adaptive and output_config.effort to control thinking
> behavior."
Adapter (ai-anthropic):
- Add `output_config` to the public `ExternalTextProviderOptions` so
  callers can pass `{ effort }` alongside the engine's internally-set
  `{ format }` (#605).
- Add `output_config` to the adapter's `validKeys` allowlist so
  user-supplied effort actually reaches the wire. Without this it was
  silently dropped with a "dropped unknown modelOptions key" warning.
- The existing merge in `mapCommonOptionsToAnthropic` already preserves
  user `output_config` when the engine adds `format`, so no further
  changes needed.
Example (ts-react-chat):
- Branch `reasoningOptionsFor('anthropic', model)`:
  - Claude 4.7 / 4.7-fast:
    `thinking: { type: 'adaptive' }, output_config: { effort: 'medium' }`
  - Claude 4.5 / 4.6 / 4.6-fast / haiku 4.5:
    `thinking: { type: 'enabled', budget_tokens: 1024 }` (legacy shape
    still supported).
- Gating stays strictly to `ANTHROPIC_COMBINED_TOOLS_AND_SCHEMA_MODELS`
  membership so the reasoning gate can't drift from combined-mode gating.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
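The branch described in this commit might be sketched as follows. This is not the example app's actual `reasoningOptionsFor`: the function name, the return type, the `startsWith` test for the 4.7 line, and the set contents are all assumptions made for illustration.

```typescript
// Hypothetical return shape covering the two thinking styles in the commit.
type AnthropicReasoningOptions =
  | { thinking: { type: 'adaptive' }; output_config: { effort: 'medium' } }
  | { thinking: { type: 'enabled'; budget_tokens: number } }

function reasoningOptionsForAnthropic(
  model: string,
  combinedModels: ReadonlySet<string>,
): AnthropicReasoningOptions | undefined {
  // Gate strictly on set membership so this cannot drift from combined-mode gating.
  if (!combinedModels.has(model)) return undefined
  // Assumption: the 4.7 line is recognisable by its id prefix.
  if (model.startsWith('claude-opus-4-7')) {
    return { thinking: { type: 'adaptive' }, output_config: { effort: 'medium' } }
  }
  // 4.5 / 4.6 keep the legacy shape.
  return { thinking: { type: 'enabled', budget_tokens: 1024 } }
}

// Illustrative stand-in for ANTHROPIC_COMBINED_TOOLS_AND_SCHEMA_MODELS.
const demoSet: ReadonlySet<string> = new Set(['claude-opus-4-5', 'claude-opus-4-7'])

const for47 = reasoningOptionsForAnthropic('claude-opus-4-7', demoSet)
const for45 = reasoningOptionsForAnthropic('claude-opus-4-5', demoSet)
const forOther = reasoningOptionsForAnthropic('claude-opus-4-1', demoSet)
```

Returning `undefined` for non-members keeps `thinking` off entirely on the legacy forced-tool path, where the commit says it is rejected.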
… against Messages API)
`claude-opus-4-6-fast` and `claude-opus-4-7-fast` 404 against the Messages API:
> "model: claude-opus-4-7-fast" → not_found_error
Looking at the model-meta entries, the "fast" variants are priced ~6× their non-fast siblings (input 30 / output 150 vs input 5 / output 25 on 4.7). That cost shape matches Anthropic's *priority tier*, which is selected via `service_tier: 'priority'` on the request, not a separate model id. Most likely the meta entries were added speculatively and have never resolved against the real API.
Pulling them from the dropdown until the canonical ids (or the correct service_tier flow) are confirmed. The meta entries themselves are unchanged in this PR; that's a follow-up question for whoever added them.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
`effort: 'medium'` on the adaptive-thinking path may skip thinking entirely for simpler prompts (per Anthropic's docs: "Balanced cost-quality" vs `'high'` = "Default - Claude will almost always think"). The demo's guitar-recommendation prompt is light enough that the model was skipping, leaving the reasoning panel empty on Opus 4.7 specifically.
Bumping to `'high'` matches the practical behavior of the 4.5 / 4.6 path's `budget_tokens: 1024`: thinking shows up on every run, which is the point of having a reasoning surface in this demo.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
…ally streams on Claude 4.7
Per the Anthropic docs:
> On Claude Opus 4.7 and Claude Mythos Preview, `display` defaults to
> `"omitted"` instead [of `"summarized"`], so you must set
> `display: "summarized"` explicitly to receive summarized thinking.
Without this flag, requests with `thinking: { type: 'adaptive' }` on 4.7
do stream a thinking content block, but the block only emits
`signature_delta` events, never `thinking_delta`. The adapter's
REASONING_MESSAGE_CONTENT handler never sees text, the example's
reasoning panel stays empty, and it looks like the model just didn't
think.
Adapter: widens `AnthropicAdaptiveThinkingOptions.thinking` (when
`type === 'adaptive'`) to accept the new `display` field, documenting
the 4.6→4.7 default flip.
Example: passes `display: 'summarized'` for any `claude-opus-4-7*`
model so the demo's reasoning surface stays populated.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
…nd accept it in the type
Debug logs (since stripped) confirmed adaptive thinking on Claude
Opus 4.7 with `effort: 'high'` was silently skipped by the model for
short prompts: Anthropic streamed only a single text block, no
`content_block_start` with thinking type at all. Per docs `'high'` is
"Claude will almost always think" but the model still ultimately
decides. `'max'` ("absolute highest capability") is the strongest
signal available.
Changes:
- Widen `AnthropicOutputConfigOptions.output_config.effort` to include
  `'max'` (the existing `AnthropicEffortOptions` top-level surface
  already accepted it; this aligns the new `output_config` shape).
- Cast the SDK `beta.messages.create` argument to
  `BetaMessageCreateParamsStreaming` so both `output_config.format`
  (not declared in SDK type) AND `output_config.effort: 'max'` (SDK
  types `effort` more narrowly than the runtime API accepts) pass
  TypeScript at the boundary. Comment explains the SDK-type-lag.
- Example: bump 4.7 from `effort: 'high'` to `'max'` and document
  the three 4.7-specific gotchas (rejects `type: 'enabled'`,
  `display` defaults to `'omitted'`, adaptive is non-deterministic).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
…x-thinking variants
Adaptive thinking on Opus 4.7 can chew through several thousand tokens before the schema-constrained JSON starts emitting, blowing past the adapter's default `max_tokens` (1024) and surfacing as "response was cut off" with a truncated answer. Enabling it by default meant the demo often failed; disabling it by default meant nobody could see the streaming reasoning surface work.
Compromise: keep regular "Claude Opus 4.7" fast and direct (no thinking, no bumped budget), and add a dedicated "Claude Opus 4.7 (Max Thinking)" entry that opts into adaptive thinking + `effort: 'max'` + `maxTokens: 16_000` so both reasoning and the JSON fit.
Mechanism is a `:thinking-max` synthetic suffix on the dropdown `value`. `adapterFor` and `reasoningOptionsFor` strip it before constructing the adapter / building modelOptions; the route also bumps `maxTokens` only when that variant is selected.
Other Anthropic models (4.5 / 4.6 / haiku 4.5) keep their existing `type: 'enabled', budget_tokens: 1024` thinking-on default since they don't have the same context-blowing failure mode at this budget.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
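A minimal sketch of the synthetic-suffix mechanism from this commit, assuming the suffix is always at the end of the dropdown value. `parseDropdownValue` and `maxTokensFor` are hypothetical helper names; the commit only says that `adapterFor` and `reasoningOptionsFor` strip the suffix and that the route bumps `maxTokens`.

```typescript
const THINKING_MAX_SUFFIX = ':thinking-max'

// Split a dropdown value into the real model id and the variant flag.
function parseDropdownValue(value: string): { model: string; maxThinking: boolean } {
  const maxThinking = value.endsWith(THINKING_MAX_SUFFIX)
  return {
    // Strip the suffix so the adapter only ever sees a real model id.
    model: maxThinking ? value.slice(0, -THINKING_MAX_SUFFIX.length) : value,
    maxThinking,
  }
}

// Only the Max Thinking variant gets the larger budget; otherwise fall back
// to whatever default the adapter applies (undefined here).
function maxTokensFor(value: string): number | undefined {
  return parseDropdownValue(value).maxThinking ? 16_000 : undefined
}

const plain = parseDropdownValue('claude-opus-4-7')
const maxed = parseDropdownValue('claude-opus-4-7:thinking-max')
```

Keeping the variant in the dropdown value avoids a second UI control, at the cost of every consumer having to remember to strip the suffix before talking to the API.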
… from prior commits
Closes the per-provider gap left by the initial #605 landing. The engine plumbing is unchanged; this just adds two more adapters to the `supportsCombinedToolsAndSchema()` opt-in and wires their schema-field-name into `mapOptionsToRequest`.
**ai-gemini**
- New `GEMINI_COMBINED_TOOLS_AND_SCHEMA_MODELS` set covering Gemini 3.x (3-pro, 3-pro-preview, 3-flash, 3.1-pro-preview, 3.1-flash-lite). Gemini 2.x is documented as brittle for the combination and stays on the legacy finalization path.
- `supportsCombinedToolsAndSchema()` returns true for set members.
- `mapCommonOptionsToGemini` attaches `config.responseSchema` + `responseMimeType: 'application/json'` when `options.outputSchema` is set, alongside any tools.
- Unit tests verify wire shape for Gemini 3 and gate enforcement for Gemini 2.5.
**ai-grok**
- New `GROK_COMBINED_TOOLS_AND_SCHEMA_MODELS` set covering the Grok 4 family (`grok-4`, `grok-4-1-fast-*`, `grok-4-fast-*`, `grok-4-20*`, `grok-4-3`, `grok-code-fast-1`). Grok 2 / 3 reject the combination per xAI docs.
- Override flips `supportsCombinedToolsAndSchema()` from blanket-false to a model-meta-set check. The actual wire wiring is already correct (inherited from `openai-base` chat-completions); this just narrows the capability claim.
- Unit test verifies per-model gate enforcement.
**E2E fixes**
- `structured-output-middleware.spec.ts`: my new "native combined mode (openai)" assertion was checking `expect(phases).toContain('beforeModel')`, but the phase-recorder middleware records `ctx.phase` from `onChunk` and chunks during streaming are tagged `'modelStream'`. Fixed to `'modelStream'`.
- `multi-turn-structured`: temporarily exclude anthropic from the matrix. Tracking via #613: 2nd turn's structured-output-part shows 1st turn's content under native combined path for some reason. All other providers (including openai, also on native combined path) pass. Single-turn anthropic structured-output continues to pass.
**Docs + skills**
- `docs/structured-outputs/overview.md` and `docs/advanced/middleware.md`: expanded the native combined providers list to include Gemini 3.x + Grok 4.x family.
- `structured-outputs` SKILL: replaced the streaming coverage table with a richer per-adapter status that distinguishes native combined mode from legacy `structuredOutputStream` from fallback. Added an explanation of how the capability flag drives the choice.
- `adapter-configuration` SKILL: new Pattern 5 documenting the `supportsCombinedToolsAndSchema` method, with the current per-adapter status table.
**Changeset** updated to bump ai-gemini and ai-grok to minor, document the expanded provider list, and note OpenRouter's per-call lookup is a follow-up (tracked in #612).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
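For the Gemini wiring in this commit, the mapping step might be sketched as below. `GeminiConfigLike` and `withOutputSchema` are hypothetical names and the config shape is heavily simplified; only the `responseSchema` and `responseMimeType: 'application/json'` fields come from the commit message.

```typescript
// Hypothetical, trimmed-down view of the request config the adapter builds.
interface GeminiConfigLike {
  tools?: unknown[]
  responseSchema?: Record<string, unknown>
  responseMimeType?: string
}

// Attach the schema fields only when the engine supplied an outputSchema,
// leaving any tools already on the config in place.
function withOutputSchema(
  config: GeminiConfigLike,
  outputSchema?: Record<string, unknown>,
): GeminiConfigLike {
  if (!outputSchema) return config
  return {
    ...config,
    responseSchema: outputSchema,
    responseMimeType: 'application/json',
  }
}

const baseConfig: GeminiConfigLike = { tools: [{ name: 'lookupGuitar' }] }
const combinedConfig = withOutputSchema(baseConfig, { type: 'object' })
```

Because the engine only populates `outputSchema` for adapters that declared the capability, a Gemini 2.x model would take the early-return branch and its request would be unchanged.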
Summary
When an adapter declares `supportsCombinedToolsAndSchema()`, the engine now wires `outputSchema` into the regular `chatStream` call and harvests the schema-constrained JSON from the agent loop's final-turn text, skipping the separate finalization round-trip introduced in #600 (which remains the fallback for adapters that can't combine tools + schema in one call).
This is the dual-path proposal from #605: modern providers get a faster single-call path; older / capability-limited providers (Groq, Ollama, Gemini 2.x, Grok 2/3, Claude 4.4 and earlier) keep the existing finalization path. No call-site changes required.
Closes #605.
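The dual-path choice in the summary can be illustrated with a small sketch. The `TextAdapterLike` shape and `chooseStructuredOutputPath` are hypothetical simplifications; the real `TextAdapter` interface and engine routing carry more state than shown here.

```typescript
// Hypothetical, minimal view of an adapter for routing purposes.
interface TextAdapterLike {
  name: string
  supportsCombinedToolsAndSchema?: (modelOptions?: Record<string, unknown>) => boolean
}

type StructuredOutputPath = 'native-combined' | 'finalization-fallback'

function chooseStructuredOutputPath(
  adapter: TextAdapterLike,
  modelOptions?: Record<string, unknown>,
): StructuredOutputPath {
  // An adapter that omits the method is treated as not supporting combined mode.
  const combined = adapter.supportsCombinedToolsAndSchema?.(modelOptions) ?? false
  return combined ? 'native-combined' : 'finalization-fallback'
}

// Illustrative adapters: one opts in, one never declares the capability.
const optedIn: TextAdapterLike = {
  name: 'openai',
  supportsCombinedToolsAndSchema: () => true,
}
const undeclared: TextAdapterLike = { name: 'ollama' }

const fastPath = chooseStructuredOutputPath(optedIn)
const slowPath = chooseStructuredOutputPath(undeclared)
```

Making the method optional is what keeps existing adapters on the finalization path without any code change on their side.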
Per-adapter status
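- openai-base chat-completions: `response_format: json_schema` alongside `tools`
- openai-base responses: `text.format: json_schema` alongside `tools`
- Anthropic (Claude 4.5+): `output_format: { type: 'json_schema', schema }` on beta Messages
- … `structuredOutput` … `BaseTextAdapter` directly, not the OpenAI base)
Engine changes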
- `TextAdapter.supportsCombinedToolsAndSchema?(modelOptions?)`: new optional capability method. Default-false when omitted; the OpenAI base classes override to `true`.
- `TextOptions.outputSchema`: the engine populates with a pre-converted JSON Schema only when the adapter declared the capability. Adapters wire it into the upstream request.
- `finalStructuredOutput.nativeCombined: boolean` (internal): when set, `runStructuredFinalization` is replaced by `harvestCombinedStructuredOutput`, which parses `accumulatedContent` as JSON, runs Standard-Schema validation, and emits synthetic `structured-output.start` / `structured-output.complete` events for the client-side StreamProcessor.
- `'structuredOutput'` middleware phase is intentionally not fired on the native-combined path; middleware sees the run through `beforeModel` / `modelStream` as usual. Backward-compatible for fallback-path adapters.
Test plan
- `chat-native-combined-structured-output.test.ts`: 7 unit tests pinning engine routing, synthetic event ordering, no-double-lifecycle, Promise validation success + failure, JSON parse failure, and fallback-path opt-out.
- … `response_format`), openai-base responses (`text.format`), Anthropic (`output_format` for Claude 4.5+, capability=false for 3.7-sonnet).
- `structured-output-middleware.spec.ts` E2E: now exercises both paths, legacy via claude-3-7-sonnet (still asserts `structuredOutput` phase fires) and native combined via openai (asserts the phase does NOT fire). Required threading `provider` / `model` through the `/middleware-test` harness route.
- `pnpm test:lib`: 906 core tests, all packages green
- `pnpm test:types`: all 30 projects green
- `pnpm test:eslint`: clean
- `pnpm test:build`: clean
Out of scope (follow-ups)
- … `responseSchema` into the regular chatStream request
🤖 Generated with [Claude Code](https://claude.com/claude-code)
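The harvest step listed under "Engine changes" (parse the accumulated text as JSON, validate, emit synthetic events) could be sketched along these lines. This is not the engine's `harvestCombinedStructuredOutput`: the function name, event payloads, and error messages are assumptions, the schema interface is a trimmed structural subset of Standard Schema, and emitting both events only after validation succeeds is a guess at the ordering.

```typescript
// Trimmed structural subset of the Standard Schema validate contract.
type ValidationResult<T> =
  | { value: T; issues?: undefined }
  | { issues: ReadonlyArray<{ message: string }> }

interface StandardSchemaLike<T> {
  '~standard': {
    validate: (value: unknown) => ValidationResult<T> | Promise<ValidationResult<T>>
  }
}

// Hypothetical event shapes; the real payloads are not described in the PR text.
type HarvestEvent =
  | { type: 'structured-output.start' }
  | { type: 'structured-output.complete'; value: unknown }

async function harvestCombined<T>(
  accumulatedContent: string,
  schema: StandardSchemaLike<T>,
  emit: (event: HarvestEvent) => void,
): Promise<T> {
  let parsed: unknown
  try {
    parsed = JSON.parse(accumulatedContent)
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err)
    throw new Error(`final-turn text is not valid JSON: ${reason}`)
  }
  // validate may be sync or async; await handles both.
  const result = await schema['~standard'].validate(parsed)
  if (result.issues) {
    const detail = result.issues.map((issue) => issue.message).join('; ')
    throw new Error(`structured output failed validation: ${detail}`)
  }
  // Assumption: both synthetic events are emitted once validation has passed.
  emit({ type: 'structured-output.start' })
  emit({ type: 'structured-output.complete', value: result.value })
  return result.value
}

// Tiny hand-written schema for the demo; a real caller would pass a library schema.
const nameSchema: StandardSchemaLike<{ name: string }> = {
  '~standard': {
    validate: (value) => {
      const name = (value as { name?: unknown } | null)?.name
      return typeof name === 'string'
        ? { value: { name } }
        : { issues: [{ message: 'name must be a string' }] }
    },
  },
}
```

A caller would pass the agent loop's final-turn text as `accumulatedContent`; a parse or validation failure rejects the promise instead of emitting the complete event.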