Feature/gpt 5 codex#11
Conversation
…nd state management
…ure in translation tests
There was a problem hiding this comment.
Pull Request Overview
Adds structured reasoning/thinking support (including encrypted content/signatures) across request/response translation and streaming, introduces SSE ping keep‑alives, and adjusts token usage accounting to subtract cached tokens. Also narrows the ResponsesPayload.instructions type (dropping array support) and adds cache token fields with differing naming between streaming and non‑streaming paths.
- Add new reasoning input/output structures and thinking block streaming events (summary text, signature propagation).
- Modify usage token accounting (subtract cached tokens; add cache_* fields) and add SSE ping heartbeat.
- Narrow instructions API (removes array form) and remove related processing.
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/translation.test.ts | Updates expected reasoning shape (summary, status, encrypted_content). |
| tests/anthropic-request.test.ts | Adds signature field to thinking blocks in Anthropic request tests. |
| src/services/copilot/create-responses.ts | Introduces ResponseInputReasoning; alters ResponsesPayload (instructions type narrowed). |
| src/routes/responses/utils.ts | Drops support for array-form instructions expansion. |
| src/routes/responses/handler.ts | Adds periodic SSE ping keep-alive with cleanup. |
| src/routes/messages/responses-translation.ts | Translates thinking blocks to reasoning items; adjusts usage token calculation and signature mapping. |
| src/routes/messages/responses-stream-translation.ts | Adds streaming handlers for reasoning summary text, part completion, item completion with signature; adjusts usage tokens and cache fields. |
| src/routes/messages/anthropic-types.ts | Extends AnthropicThinkingBlock with signature. |
| README.md | Documents /v1/responses endpoint in endpoints table. |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
|
||
| export interface ResponsesPayload { | ||
| model: string | ||
| instructions?: string | null |
There was a problem hiding this comment.
instructions previously accepted Array (per removed union) but now only allows string | null; this is a breaking change that removes the ability to pass structured instruction items. If intentional, update external API docs and versioning; if not, restore the union (string | Array | null) or introduce a new field to distinguish formats.
| instructions?: string | null | |
| instructions?: string | Array<ResponseInputItem> | null |
| const { input } = payload | ||
|
|
||
| if (Array.isArray(input)) { | ||
| result.push(...input) |
There was a problem hiding this comment.
Support for array-form instructions was removed (instructions no longer read or merged) which changes how callers can supply multi-part instruction content. Clarify this change in API docs or re-add conditional merging of an instructions array if backward compatibility is required.
| result.push(...input) | |
| result.push(...input) | |
| } else if ( | |
| input && | |
| typeof input === "object" && | |
| Array.isArray((input as any).instructions) | |
| ) { | |
| // Backward compatibility: merge instructions array if present | |
| result.push(...(input as any).instructions) | |
| } else if (input) { | |
| result.push(input as ResponseInputItem) |
| input_tokens: inputTokens - (inputCachedTokens ?? 0), | ||
| output_tokens: outputTokens, | ||
| ...(response.usage?.input_tokens_details?.cached_tokens !== undefined && { | ||
| cache_read_input_tokens: |
There was a problem hiding this comment.
Usage field uses cache_read_input_tokens here, while the streaming path emits cache_creation_input_tokens (see ensureMessageStart). Divergent field names for analogous cache token metrics can confuse clients; align on a single naming convention or document the semantic distinction if both are required.
| cache_read_input_tokens: | |
| cache_creation_input_tokens: |
| const inputTokens = | ||
| (state.initialInputTokens ?? 0) - (state.initialInputCachedTokens ?? 0) |
There was a problem hiding this comment.
Streaming usage metadata uses cache_creation_input_tokens whereas non-streaming usage uses cache_read_input_tokens, creating an inconsistent external contract. Standardize the cache metric naming or emit both (with clear semantics) to avoid client divergence.
| input_tokens: inputTokens, | ||
| output_tokens: 0, | ||
| ...(state.initialInputCachedTokens !== undefined && { | ||
| cache_creation_input_tokens: state.initialInputCachedTokens, |
There was a problem hiding this comment.
Streaming usage metadata uses cache_creation_input_tokens whereas non-streaming usage uses cache_read_input_tokens, creating an inconsistent external contract. Standardize the cache metric naming or emit both (with clear semantics) to avoid client divergence.
| cache_creation_input_tokens: state.initialInputCachedTokens, | |
| cache_creation_input_tokens: state.initialInputCachedTokens, | |
| cache_read_input_tokens: state.initialInputCachedTokens, |
| const openThinkingBlockIfNeeded = ( | ||
| state: ResponsesStreamState, | ||
| outputIndex: number, | ||
| events: Array<AnthropicStreamEventData>, | ||
| ): number => { | ||
| const contentIndex = 0 | ||
| const key = getBlockKey(outputIndex, contentIndex) | ||
| let blockIndex = state.blockIndexByKey.get(key) | ||
|
|
||
| if (blockIndex === undefined) { | ||
| blockIndex = state.nextContentBlockIndex | ||
| state.nextContentBlockIndex += 1 | ||
| state.blockIndexByKey.set(key, blockIndex) | ||
| } | ||
|
|
||
| if (!state.openBlocks.has(blockIndex)) { | ||
| events.push({ | ||
| type: "content_block_start", | ||
| index: blockIndex, | ||
| content_block: { | ||
| type: "thinking", | ||
| thinking: "", | ||
| }, | ||
| }) | ||
| state.openBlocks.add(blockIndex) | ||
| } | ||
|
|
||
| return blockIndex | ||
| } |
There was a problem hiding this comment.
[nitpick] Logic duplicates patterns from openTextBlockIfNeeded with only content type and fixed contentIndex differences; consider refactoring to a generic helper (e.g., openBlockIfNeeded(kind, outputIndex, contentIndex?)) to reduce code duplication and future divergence risk.
No description provided.