OpenAI#7
Merged
tombeckenham added a commit that referenced this pull request on May 21, 2026
- Skip agent loop when finalStructuredOutput is set and tools.length === 0 to avoid a wasted chatStream round-trip before finalization (Critical #1).
- Omit `tools` structurally from StructuredOutputMiddlewareConfig; the field was inherited but silently discarded at the provider boundary (Important #2).
- Preserve Standard Schema `issues[]` on validation failures via a new exported StandardSchemaValidationError carried as `error.cause` (Important #4).
- Preserve the original adapter error (stack, cause, provider properties) on the fallbackStructuredOutputStream path via an onAdapterError callback (Important #5).
- Add tests for messages-transform via onStructuredOutputConfig and for mid-finalization abort routing through onAbort (Important #6, #7).
- Strip transitional source comments ("in Task 7", "closes issue #390").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
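To illustrate the `issues[]`-on-`error.cause` item above, here is a minimal sketch. It assumes a simplified, synchronous stand-in for the Standard Schema result shape and re-creates `StandardSchemaValidationError` locally; `SchemaLike`, `validateOrThrow`, and `personSchema` are hypothetical names for illustration, not the library's exports.

```typescript
// Simplified local stand-ins for the Standard Schema result shape (sync only).
interface Issue {
  message: string
  path?: ReadonlyArray<PropertyKey>
}
type ValidationResult<T> =
  | { value: T; issues?: undefined }
  | { issues: ReadonlyArray<Issue> }

interface SchemaLike<T> {
  '~standard': { validate: (value: unknown) => ValidationResult<T> }
}

// Hypothetical re-creation of the error class named in the commit message.
class StandardSchemaValidationError extends Error {
  readonly issues: ReadonlyArray<Issue>
  constructor(issues: ReadonlyArray<Issue>) {
    super(issues.map((issue) => issue.message).join('; '))
    // Keep instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, StandardSchemaValidationError.prototype)
    this.name = 'StandardSchemaValidationError'
    this.issues = issues
  }
}

// Validate a parsed value; on failure, throw an outer error whose
// `cause` keeps the structured issues instead of flattening them.
function validateOrThrow<T>(schema: SchemaLike<T>, value: unknown): T {
  const result = schema['~standard'].validate(value)
  if (result.issues) {
    const outer = new Error('Structured output validation failed')
    Object.defineProperty(outer, 'cause', {
      value: new StandardSchemaValidationError(result.issues),
      enumerable: false,
      writable: true,
      configurable: true,
    })
    throw outer
  }
  return result.value
}

// Hand-written schema used only for this illustration.
const personSchema: SchemaLike<{ name: string }> = {
  '~standard': {
    validate: (value) => {
      const name = (value as { name?: unknown } | null)?.name
      return typeof name === 'string'
        ? { value: { name } }
        : { issues: [{ message: 'name must be a string', path: ['name'] }] }
    },
  },
}
```

A caller can then inspect `error.cause` and, if it is a `StandardSchemaValidationError`, walk `cause.issues` to report each failing path.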
AlemTuzlak added a commit that referenced this pull request on May 21, 2026
* feat(ai): structured-output middleware coverage (closes #390)

Middleware now wraps the final structured-output provider call in `chat({ outputSchema })` for both Promise<T> and streaming variants.

- Add `'structuredOutput'` to `ChatMiddlewarePhase` and set it on `ChatMiddlewareContext` for the duration of the final structured-output adapter call.
- Add optional `ChatMiddleware.onStructuredOutputConfig` hook receiving a `StructuredOutputMiddlewareConfig` (with the JSON Schema) which may return a partial to transform the config before the final call.
- Export new `StructuredOutputMiddlewareConfig` type extending `ChatMiddlewareConfig` with `outputSchema: JSONSchema`.
- `onChunk` now observes chunks from the final structured-output call; `onFinish` fires once at the end of the whole `chat()` invocation after finalization completes.
- Remove the previous `RUN_STARTED`/`RUN_FINISHED` suppression hack in `runStreamingStructuredOutput`; engine now emits exactly one outer pair around the whole run.

* test(ai-e2e): add structured-output x middleware spec

Adds a Playwright spec that exercises the structured-output finalization path end-to-end with middleware attached, plus a route + fixture for the mocked LLM call and a phase-capture helper for asserting middleware phase transitions.

* docs(ai): document onStructuredOutputConfig hook and structuredOutput phase

Updates middleware and structured-outputs skills, public docs, and regenerated TypeDoc reference for the new `structuredOutput` middleware phase and the `onStructuredOutputConfig` hook.
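As a rough illustration of the `structuredOutput` phase and the `onStructuredOutputConfig` hook described above, the sketch below counts chunks per phase and returns a partial config. All types here (`Phase`, `Ctx`, `Chunk`, `StructuredOutputConfig`, `Middleware`) are simplified local stand-ins, the phase names other than `structuredOutput` are assumptions, and `runFinalCall` is a hypothetical driver, not the engine.

```typescript
// Simplified stand-ins; the library's real types carry more fields.
type Phase = 'init' | 'beforeModel' | 'structuredOutput'
interface Ctx {
  phase: Phase
}
interface Chunk {
  type: string
}
interface StructuredOutputConfig {
  messages: Array<{ role: string; content: string }>
  outputSchema: Record<string, unknown> // JSON Schema
}
interface Middleware {
  onChunk?: (ctx: Ctx, chunk: Chunk) => void
  onStructuredOutputConfig?: (
    ctx: Ctx,
    config: StructuredOutputConfig,
  ) => Partial<StructuredOutputConfig> | undefined
}

// Middleware that counts chunks per phase and appends a system hint
// to the messages before the final structured-output call.
function createPhaseCounter() {
  const counts: Record<Phase, number> = {
    init: 0,
    beforeModel: 0,
    structuredOutput: 0,
  }
  const middleware: Middleware = {
    onChunk: (ctx) => {
      counts[ctx.phase] += 1
    },
    onStructuredOutputConfig: (_ctx, config) => ({
      messages: [
        ...config.messages,
        { role: 'system', content: 'Reply with JSON only.' },
      ],
    }),
  }
  return { counts, middleware }
}

// Hypothetical driver standing in for the engine's final call:
// apply the hook's partial, then feed chunks through onChunk.
function runFinalCall(
  mw: Middleware,
  config: StructuredOutputConfig,
  chunks: Chunk[],
) {
  const ctx: Ctx = { phase: 'structuredOutput' }
  const patch = mw.onStructuredOutputConfig?.(ctx, config)
  const finalConfig = { ...config, ...patch }
  for (const chunk of chunks) mw.onChunk?.(ctx, chunk)
  return finalConfig
}

const { counts, middleware } = createPhaseCounter()
const finalConfig = runFinalCall(
  middleware,
  {
    messages: [{ role: 'user', content: 'Describe Ada Lovelace.' }],
    outputSchema: { type: 'object' },
  },
  [{ type: 'TEXT_MESSAGE_CONTENT' }, { type: 'TEXT_MESSAGE_CONTENT' }],
)
```

Returning only the changed fields keeps the hook small; the driver merges the partial over the original config, so `outputSchema` is left untouched here.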
* fix(ai): address CR findings on structured-output middleware coverage

- Type: onConfig / onStructuredOutputConfig Promise return allows null
- Error diagnostics: preserve cause + code on validation failures, with smarter message extraction for plain-object errors (Standard Schema)
- Streaming consumers see RUN_ERROR on finalization failure (missing result or validation), guarded against double-emission
- Synth structured-output.start carries threadId
- Abort signal checked inside runStructuredFinalization for-await loop
- runAgenticStructuredOutput rethrow preserves cause + code
- Skill examples use (ctx, info) signature for onFinish/onError
- Docs Mermaid diagram includes structuredOutput phase branch
- Docs prose acknowledges 3 onConfig firings (init + beforeModel + structuredOutput boundary)
- Docs add FinishInfo table marking info.usage explicitly optional
- Changeset reflects suppression-hack relocation (not removal)
- Test comment corrected (synthesis still happens)
- E2E spec title honest about stream:true; docblock notes scope
- kind=phase GET no longer gated behind OTEL_TEST_ENABLED

Call-site enumeration (Procedure 2.8):

- finalizationError gained cause?: unknown. Readers updated:
  * TextEngine terminal hook chooser: propagates cause via Error({ cause }) and surfaces code as a non-enumerable Object.defineProperty.
  * runAgenticStructuredOutput: same treatment when re-throwing.
  * getFinalizationError return type widened to include cause.
- runStructuredFinalization gained a post-loop synthetic RUN_ERROR yield path, gated on yieldChunks and a new runErrorYielded flag. The streaming consumer in runStreamingStructuredOutputImpl iterates engine.run() and propagates the new chunk transparently; no consumer-side changes.
- Synthesized structured-output.start gained threadId; passive readers, no behavioral impact.
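The cause-and-code treatment mentioned in the call-site enumeration can be sketched as follows. This is an assumption-laden illustration: `FinalizationError` is a simplified local shape, `toThrowable` is a hypothetical helper, and `'provider_error'` is a made-up code value. The sketch sets `cause` through `Object.defineProperty` with the same attributes that `new Error(message, { cause })` produces, so that it also compiles on targets older than ES2022.

```typescript
// Simplified local shape of the recorded finalization failure.
interface FinalizationError {
  message: string
  code?: string
  cause?: unknown
}

// Attributes shared by both properties: present on the error object,
// but hidden from Object.keys, for...in, and JSON.stringify.
function hiddenProperty(value: unknown): PropertyDescriptor {
  return { value, enumerable: false, writable: true, configurable: true }
}

// Build a throwable Error that carries `cause` and `code`
// without changing how the error serializes or enumerates.
function toThrowable(info: FinalizationError): Error {
  const error = new Error(info.message)
  if (info.cause !== undefined) {
    Object.defineProperty(error, 'cause', hiddenProperty(info.cause))
  }
  if (info.code !== undefined) {
    Object.defineProperty(error, 'code', hiddenProperty(info.code))
  }
  return error
}

const providerFailure = new Error('upstream 500')
const thrown = toThrowable({
  message: 'Structured output finalization failed',
  code: 'provider_error', // hypothetical code value
  cause: providerFailure,
})
```

Keeping `code` non-enumerable means existing consumers that log or serialize the error see the same output as before, while callers that know about `code` can still read it directly.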
* fix(ai): address Round 2 CR findings on structured-output middleware

- Mid-finalization abort routes through onAbort (not onError): skip missing-result attribution when isCancelled()
- Finalization chunks no longer pollute agent-loop state: removed handleStreamChunk(chunk) call in runStructuredFinalization; targeted updates only for structured-output.complete + RUN_ERROR + RUN_FINISHED.usage
- Synth RUN_ERROR for empty-stream case is preceded by a synth structured-output.start so client-side StructuredOutputPart routing works

Call-site enumeration:

- handleStreamChunk removal: accumulatedContent (now agent-loop only; info.content stays clean of JSON deltas), finishedEvent / lastFinishReason (finalization no longer overwrites the agent loop's real finish reason), currentMessageId (unchanged path), currentThinking* (no thinking pollution), earlyTermination (irrelevant: finalization is already terminating). Explicit branches still capture structured-output.complete, RUN_ERROR (finalizationError), and RUN_FINISHED.usage (runOnUsage).
- isCancelled() early-return + chooser gating: run()'s finally block fires onAbort when !terminalHookCalled && isCancelled(). The terminal-hook chooser at the end of the try-block now additionally skips when isCancelled() so it can't pre-empt the finally with a stray onFinish.
- Pre-synth-start before synth-RUN_ERROR: uses the same buildSynthesizedStart() + pipeThroughMiddleware path as the in-loop synth, gated on !startEmitted so we never double-emit.

* fix(ai): align onFinish info docs with implementation

- Docs/skill no longer claim onFinish.info.usage reflects the full run including finalization tokens. The Round 2 fix correctly segregated finalization state; info.* reflects the agent loop's terminal state only.
- Add unit test pinning the documented semantics: tools-less structured-output run gives info.usage=undefined, finishReason=null, content='', while onUsage fires once for finalization tokens.
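The chooser gating described in the call-site enumeration can be sketched in isolation. This is a simplified, synchronous model under stated assumptions: `Hooks`, `runWithTerminalHooks`, and `record` are hypothetical names, and the real engine is asynchronous and tracks far more state.

```typescript
interface Hooks {
  onFinish?: () => void
  onError?: (error: unknown) => void
  onAbort?: () => void
}

// Hypothetical, simplified shape of the terminal-hook chooser:
// exactly one of onFinish / onError / onAbort fires per run.
function runWithTerminalHooks(
  work: () => { error?: unknown },
  isCancelled: () => boolean,
  hooks: Hooks,
): void {
  let terminalHookCalled = false
  try {
    const outcome = work()
    // Skip the chooser on cancellation so it cannot pre-empt onAbort
    // with a stray onFinish or a misattributed onError.
    if (!isCancelled()) {
      terminalHookCalled = true
      if (outcome.error !== undefined) hooks.onError?.(outcome.error)
      else hooks.onFinish?.()
    }
  } finally {
    if (!terminalHookCalled && isCancelled()) {
      terminalHookCalled = true
      hooks.onAbort?.()
    }
  }
}

// Records which hooks fire, optionally cancelling in the middle of the work.
function record(cancelDuringWork: boolean): string[] {
  const calls: string[] = []
  let cancelled = false
  runWithTerminalHooks(
    () => {
      cancelled = cancelDuringWork
      // An aborted stream also yields no result, which would otherwise
      // be attributed as a "missing result" error.
      return { error: cancelDuringWork ? new Error('missing result') : undefined }
    },
    () => cancelled,
    {
      onFinish: () => calls.push('onFinish'),
      onError: () => calls.push('onError'),
      onAbort: () => calls.push('onAbort'),
    },
  )
  return calls
}
```

In this model a cancelled run reports only `onAbort`, even though the work also produced an error, which is the attribution the bullet above describes.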
* fix(ai): clarify tools on StructuredOutputMiddlewareConfig is not forwarded to the structured-output adapter call

Round 4 CR finding: tools is structurally inherited from ChatMiddlewareConfig but the engine omits tools from structuredCallOptions.chatOptions. Document the caveat in the type JSDoc and the public middleware reference so middleware authors don't expect tools transformation at this boundary to take effect.

* fix(ai): align runStreamingStructuredOutput JSDoc with implementation

Round 5 CR finding: the JSDoc claimed "Validates the parsed object against the original Standard Schema" but the implementation explicitly defers validation to the consumer (via `void outputSchema`). Update the JSDoc to honestly describe the streaming-path validation policy and call out the deliberate asymmetry with `runAgenticStructuredOutput` (which does validate).

* fix(docs): clarify server-side validation is path-dependent (streaming vs Promise<T>)

Round 6 CR finding: docs/structured-outputs/overview.md claimed "Server-side validation against your schema is always authoritative" but the streaming path (chat({ outputSchema, stream: true })) deliberately defers validation to the consumer. Update the prose to reflect the actual path-dependent behavior: agentic Promise<T> validates server-side; streaming forwards the adapter event verbatim and consumers validate downstream.

* fix(ai): apply Procedure 3 bucket-(c) audit promotions

Bucket (c) Promotion Audit (cr-loop final step) flagged 4 items as load-bearing on the structured-output subject this PR makes authoritative:

- PROMOTE_TO_A: runAgenticStructuredOutput was calling convertSchemaToJsonSchema without forStructuredOutput: true while runStreamingStructuredOutput did. Same Zod schema produced different JSON Schema depending on stream mode. Both paths now use the strict converter, eliminating the divergence.
- PROMOTE_TO_B (3 trivial fixes on PR-adjacent surfaces):
  - fallbackStructuredOutputStream's IDs prefixed 'mock-' in production code; renamed to 'fallback-' to stop leaking test-style identifiers into user-visible run/thread/message IDs for Anthropic/Gemini/Ollama structured-output runs.
  - fallbackStructuredOutputStream's RUN_ERROR chunk was missing threadId while sibling RUN_STARTED and RUN_FINISHED carried it; added for consumer correlation.
  - chat() JSDoc example used chunk.type === 'content' (wrong); changed to 'TEXT_MESSAGE_CONTENT'.

26 other bucket-(c) items confirmed STAY_IN_C (pre-existing, not subject-load-bearing) and are reported to the loop-exit follow-up list. 1 item (gpt-5.2 model existence) REFUTED.

* ci: apply automated fixes

* fix(ai): address PR #600 review: critical + important findings

- Skip agent loop when finalStructuredOutput is set and tools.length === 0 to avoid a wasted chatStream round-trip before finalization (Critical #1).
- Omit `tools` structurally from StructuredOutputMiddlewareConfig; the field was inherited but silently discarded at the provider boundary (Important #2).
- Preserve Standard Schema `issues[]` on validation failures via a new exported StandardSchemaValidationError carried as `error.cause` (Important #4).
- Preserve the original adapter error (stack, cause, provider properties) on the fallbackStructuredOutputStream path via an onAdapterError callback (Important #5).
- Add tests for messages-transform via onStructuredOutputConfig and for mid-finalization abort routing through onAbort (Important #6, #7).
- Strip transitional source comments ("in Task 7", "closes issue #390").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(examples): add /verify-pr600 page to ts-react-chat

Adds a diagnostics page in the ts-react-chat example that exercises all four PR #600 review fixes against the real chat() engine using inline mock adapters.
No API keys required: click "Run verification" and see pass/fail per scenario with observed values.

- POST /api/verify-pr600 runs the four scenarios server-side.
- /verify-pr600 page calls the endpoint and renders results.
- Header nav gains a "Diagnostics" section linking to the page.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(examples): replace verify-pr600 with real-provider repros

The mock-based verify-pr600 page only validated my own fixes; it didn't prove anything about real provider wire formats. Replace with two real-provider verifications:

- New /issue-390-repro page runs the exact gist from the issue reporter (@imsherrill) against geminiText('gemini-2.5-flash') and surfaces the middleware logs + per-phase chunk counts. Fix is verified iff the middleware observed any chunks with ctx.phase === 'structuredOutput'.
- Existing /generations/structured-output page now instruments every request with a counter middleware. Counts surface via a JSON field (non-streaming) or a trailing CUSTOM `phase-counts` event (streaming). Works across all configured providers (OpenAI/Anthropic via OpenRouter, Gemini via OpenRouter, Grok, Groq).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(examples): drop /issue-390-repro page

The counter middleware on /generations/structured-output already demonstrates the PR #600 fix against real providers; the dedicated single-shot repro page is redundant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ai): drop unused ChatMiddlewarePhase import after merge

Surfaced by tsc after merging origin/main (the eslint-config 0.4.0 bump in #607 strengthens unused-import detection). The type is re-exported elsewhere and not referenced inside this file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: apply automated fixes

* fix: address CI lint + openrouter test failures

- Remove unnecessary `as object` cast in compose.ts; the upgraded typescript-eslint (via #607) flags it as unnecessary because `Object.keys` already accepts the original type.
- Update the openrouter `chat() entrypoint with strict transformation` test: with Critical #1 (skip agent loop when tools.length === 0), the engine no longer consumes the streaming mock for an empty agent pass; the structured-output payload now arrives via the same streaming mock that previously held the placeholder 'ok' delta. Move the JSON payload into the streaming mock and assert via `responseFormat` presence instead of `stream === false`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Tom Beckenham <34339192+tombeckenham@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This PR adds the meta information on the:
It also adds very detailed information on all the model providers and uses it to generate types. This is not useful yet: I first need to add Anthropic and maybe Gemini models to figure out the common interfaces, so that we can make the type system per-model.