feat(gateway): add OpenAI Responses API wire protocol support #263

Merged: BYK merged 2 commits into main from feat/openai-responses-protocol on May 12, 2026

BYK commented May 12, 2026

Summary

Closes #239

Adds POST /v1/responses route to handle the OpenAI Responses API wire protocol, enabling Lore memory features (LTM injection, gradient transforms, temporal capture, recall) for providers that use this protocol. Also fixes protocol-aware upstream response accumulation and adds OpenAI cached token tracking.

Wire Protocol Support

  • New translator (translate/openai-responses.ts): Ingress parser normalizes Responses API input items (message, function_call, function_call_output) to GatewayRequest; egress builder converts GatewayResponse back to Responses API format (both streaming SSE and non-streaming JSON)
  • New stream accumulator (stream/openai-responses.ts): Parses Responses API SSE events (response.output_text.delta, response.function_call_arguments.delta, response.completed, etc.) into GatewayResponse, reusing parseSSEStream from the Anthropic module
  • Third protocol branch in forwardToUpstream() that preserves "openai-responses" from ingress (preventing model-prefix routing from downgrading it to "openai") and routes to the correct upstream URL
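
To illustrate the ingress side of the translator, here is a minimal sketch of how the three input item types could be normalized. It is not the code in translate/openai-responses.ts: `SketchRequest`, `SketchMessage`, `parseArguments`, and `normalizeInput` are hypothetical names standing in for the real GatewayRequest types, and the sketch assumes string message content only (the Responses API also allows arrays of content parts).

```typescript
// Sketch only: SketchRequest and SketchMessage are hypothetical stand-ins for
// the gateway's GatewayRequest types, which this PR description does not show.
type ResponsesInputItem =
  | { type: "message"; role: "system" | "developer" | "user" | "assistant"; content: string }
  | { type: "function_call"; call_id: string; name: string; arguments: string }
  | { type: "function_call_output"; call_id: string; output: string };

type SketchMessage =
  | { kind: "text"; role: "user" | "assistant"; text: string }
  | { kind: "tool_use"; id: string; name: string; input: unknown }
  | { kind: "tool_result"; toolUseId: string; content: string };

interface SketchRequest {
  system: string[];
  messages: SketchMessage[];
}

// Tool-call arguments arrive as a JSON string; keep the raw string if it does not parse.
function parseArguments(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    return raw;
  }
}

function normalizeInput(items: ResponsesInputItem[]): SketchRequest {
  const request: SketchRequest = { system: [], messages: [] };
  for (const item of items) {
    switch (item.type) {
      case "message": {
        const role = item.role;
        if (role === "system" || role === "developer") {
          // Instruction-style messages are collected separately from the dialogue.
          request.system.push(item.content);
        } else {
          request.messages.push({ kind: "text", role, text: item.content });
        }
        break;
      }
      case "function_call":
        request.messages.push({
          kind: "tool_use",
          id: item.call_id,
          name: item.name,
          input: parseArguments(item.arguments),
        });
        break;
      case "function_call_output":
        // call_id links the tool result back to the matching function_call.
        request.messages.push({ kind: "tool_result", toolUseId: item.call_id, content: item.output });
        break;
    }
  }
  return request;
}
```

Keeping the call_id on both the call and its output is what would let an egress builder pair them again when converting back to Responses API format.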

Protocol-Aware Response Accumulation

Previously, accumulateNonStreamResponse parsed only Anthropic-format responses, silently losing all content and usage data when the upstream was OpenAI. Fixed by:

  • Adding effectiveProtocol to UpstreamResult so the pipeline dispatches to the correct accumulator
  • New accumulators: accumulateOpenAINonStreamJSON, accumulateResponsesNonStreamJSON, accumulateNonStreamOpenAIStream (for OpenAI Chat Completions SSE)
  • Streaming openai-responses routed through accumulateResponsesSSEStream
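
The failure mode and the dispatch fix can be sketched for the non-streaming case as follows. This is an illustration under assumptions, not the pipeline's code: `accumulateNonStream` and `SketchResponse` are hypothetical names, and the body shapes are the publicly documented non-streaming JSON layouts of the three protocols.

```typescript
type WireProtocol = "anthropic" | "openai" | "openai-responses";

// Hypothetical, trimmed-down stand-in for GatewayResponse.
interface SketchResponse {
  text: string;
  inputTokens: number;
  outputTokens: number;
}

// `any` is used deliberately: each branch reads a different upstream JSON shape.
function accumulateNonStream(protocol: WireProtocol, body: any): SketchResponse {
  switch (protocol) {
    case "anthropic":
      // Anthropic Messages: content blocks plus usage.input_tokens / output_tokens.
      return {
        text: (body.content ?? [])
          .filter((block: any) => block.type === "text")
          .map((block: any) => block.text)
          .join(""),
        inputTokens: body.usage?.input_tokens ?? 0,
        outputTokens: body.usage?.output_tokens ?? 0,
      };
    case "openai":
      // Chat Completions: choices[0].message.content plus prompt/completion tokens.
      return {
        text: body.choices?.[0]?.message?.content ?? "",
        inputTokens: body.usage?.prompt_tokens ?? 0,
        outputTokens: body.usage?.completion_tokens ?? 0,
      };
    case "openai-responses":
      // Responses API: output items of type "message" holding output_text parts.
      return {
        text: (body.output ?? [])
          .filter((item: any) => item.type === "message")
          .flatMap((item: any) => item.content ?? [])
          .filter((part: any) => part.type === "output_text")
          .map((part: any) => part.text)
          .join(""),
        inputTokens: body.usage?.input_tokens ?? 0,
        outputTokens: body.usage?.output_tokens ?? 0,
      };
  }
}
```

Feeding a Chat Completions body through the "anthropic" branch yields empty text and zero token counts without throwing, which matches the silent data loss described above.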

OpenAI Cached Token Tracking

  • Parse prompt_tokens_details.cached_tokens from OpenAI usage responses (all three paths: non-stream JSON, stream SSE, Responses API SSE) and map to existing cacheReadInputTokens
  • Emit prompt_tokens_details.cached_tokens in Chat Completions and Responses API egress responses
  • Prompt ordering already optimal for OpenAI automatic prefix caching (system/tools before messages)
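
A rough sketch of the cached-token mapping in both directions, using the Chat Completions usage shape. `SketchUsage`, `fromOpenAIUsage`, and `toOpenAIUsage` are hypothetical names; only prompt_tokens_details.cached_tokens and cacheReadInputTokens come from this PR. The sketch also assumes prompt_tokens is passed through unchanged, which may not be how the gateway accounts for cached input.

```typescript
// Usage block as returned by OpenAI Chat Completions (fields trimmed).
interface OpenAIUsage {
  prompt_tokens: number;
  completion_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

// Hypothetical internal usage record; only cacheReadInputTokens is named in the PR.
interface SketchUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadInputTokens: number;
}

// Ingest: a missing details object or missing cached_tokens counts as zero.
function fromOpenAIUsage(usage: OpenAIUsage): SketchUsage {
  return {
    inputTokens: usage.prompt_tokens,
    outputTokens: usage.completion_tokens,
    cacheReadInputTokens: usage.prompt_tokens_details?.cached_tokens ?? 0,
  };
}

// Egress: emit prompt_tokens_details only when cache data is available.
function toOpenAIUsage(usage: SketchUsage): OpenAIUsage {
  const out: OpenAIUsage = {
    prompt_tokens: usage.inputTokens,
    completion_tokens: usage.outputTokens,
  };
  if (usage.cacheReadInputTokens > 0) {
    out.prompt_tokens_details = { cached_tokens: usage.cacheReadInputTokens };
  }
  return out;
}
```

Omitting the details object when nothing was cached keeps egress responses identical to what clients saw before cache tracking was added.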

Pi Plugin

  • Added "openai" + 5 newly discovered compatible providers (zai, minimax, minimax-cn, kimi-coding, vercel-ai-gateway) to GATEWAY_PROVIDERS

Verification

  • Typecheck: All 4 packages pass
  • Tests: 1252 pass, 0 fail (34 new tests)
  • Build: All packages build successfully

Deferred

  • openai-codex and azure-openai-responses Pi provider redirection (need URL pattern investigation)
  • Google / Bedrock (native SDK — not HTTP-proxiable)
  • OpenAI Batch API for worker calls (separate issue)

BYK added 2 commits May 12, 2026 17:45
Add POST /v1/responses route to handle the OpenAI Responses API, enabling
Lore memory features for openai, openai-codex, and azure-openai-responses
providers that use this protocol instead of /v1/chat/completions.

- New ingress/egress translator (translate/openai-responses.ts) that
  normalizes Responses API input items to GatewayRequest and converts
  GatewayResponse back to Responses API format (streaming + non-streaming)
- New SSE stream accumulator (stream/openai-responses.ts) that parses
  Responses API events into GatewayResponse
- Third protocol branch in forwardToUpstream() preserving openai-responses
  wire protocol through the pipeline
- Pi plugin: add openai + 5 newly-discovered compatible providers
- 27 new tests covering translator and stream accumulator
…d_tokens tracking

Fix pipeline to correctly accumulate upstream responses for all three
wire protocols (Anthropic, OpenAI Chat Completions, OpenAI Responses API).
Previously, all responses were parsed as Anthropic format, silently losing
all content and usage data when the upstream was OpenAI.

- Add effectiveProtocol to UpstreamResult so pipeline can dispatch to the
  correct accumulator (Anthropic, OpenAI Chat Completions, or Responses API)
- Add accumulateOpenAINonStreamJSON and accumulateResponsesNonStreamJSON
  for non-streaming upstream responses
- Add accumulateNonStreamOpenAIStream for OpenAI Chat Completions SSE
- Route streaming openai-responses through accumulateResponsesSSEStream
- Parse prompt_tokens_details.cached_tokens from OpenAI usage responses
  and map to existing cacheReadInputTokens (semantic equivalent)
- Emit prompt_tokens_details.cached_tokens in Chat Completions and
  Responses API egress responses when cache data is available
- Prompt ordering already optimal for OpenAI automatic prefix caching
  (system/tools before messages in both builders)

BYK enabled auto-merge (squash) May 12, 2026 18:02
BYK merged commit b8a4836 into main May 12, 2026
7 checks passed
BYK deleted the feat/openai-responses-protocol branch May 12, 2026 18:02
This was referenced May 13, 2026

Development

Successfully merging this pull request may close these issues.

feat(gateway): support additional wire protocols
