feat(gateway): add OpenAI Responses API wire protocol support by BYK · Pull Request #263 · BYK/loreai

BYK · 2026-05-12T17:46:16Z

Summary

Closes #239

Adds POST /v1/responses route to handle the OpenAI Responses API wire protocol, enabling Lore memory features (LTM injection, gradient transforms, temporal capture, recall) for providers that use this protocol. Also fixes protocol-aware upstream response accumulation and adds OpenAI cached token tracking.

Wire Protocol Support

New translator (translate/openai-responses.ts): Ingress parser normalizes Responses API input items (message, function_call, function_call_output) to GatewayRequest; egress builder converts GatewayResponse back to Responses API format (both streaming SSE and non-streaming JSON)
New stream accumulator (stream/openai-responses.ts): Parses Responses API SSE events (response.output_text.delta, response.function_call_arguments.delta, response.completed, etc.) into GatewayResponse, reusing parseSSEStream from the Anthropic module
Third protocol branch in forwardToUpstream() that preserves "openai-responses" from ingress (preventing model-prefix routing from downgrading it to "openai") and routes to the correct upstream URL
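
To make the ingress side more concrete, here is a minimal sketch of how the three Responses API input item types named above might be normalized into one internal request. The real GatewayRequest shape is not shown in this PR description, so SketchGatewayRequest, SketchMessage, SketchBlock, and normalizeResponsesInput are hypothetical stand-ins, and routing developer/system messages into a separate system list is an assumption.

```typescript
// Hypothetical, simplified shapes. The actual GatewayRequest type may differ.
type ContentPart = { type: string; text?: string };

type ResponsesInputItem =
  | { type: "message"; role: "user" | "assistant" | "system" | "developer"; content: string | ContentPart[] }
  | { type: "function_call"; call_id: string; name: string; arguments: string }
  | { type: "function_call_output"; call_id: string; output: string };

type SketchBlock =
  | { kind: "text"; text: string }
  | { kind: "tool_use"; id: string; name: string; input: unknown }
  | { kind: "tool_result"; toolUseId: string; content: string };

interface SketchMessage {
  role: "user" | "assistant";
  blocks: SketchBlock[];
}

interface SketchGatewayRequest {
  system: string[];
  messages: SketchMessage[];
}

// Message content is either a plain string or a list of typed parts.
function flattenContent(content: string | ContentPart[]): string {
  if (typeof content === "string") return content;
  return content.map((part) => part.text ?? "").join("");
}

// Tool arguments arrive as a JSON string; keep the raw string if it does not parse.
function safeParse(json: string): unknown {
  try {
    return JSON.parse(json);
  } catch {
    return json;
  }
}

function normalizeResponsesInput(items: ResponsesInputItem[]): SketchGatewayRequest {
  const req: SketchGatewayRequest = { system: [], messages: [] };
  for (const item of items) {
    switch (item.type) {
      case "message": {
        const role = item.role;
        const text = flattenContent(item.content);
        if (role === "system" || role === "developer") {
          req.system.push(text); // assumption: instructions live outside the message list
        } else {
          req.messages.push({ role, blocks: [{ kind: "text", text }] });
        }
        break;
      }
      case "function_call":
        // The model asked for a tool call, so it belongs to the assistant turn.
        req.messages.push({
          role: "assistant",
          blocks: [{ kind: "tool_use", id: item.call_id, name: item.name, input: safeParse(item.arguments) }],
        });
        break;
      case "function_call_output":
        // The caller supplies the tool result, linked back by call_id.
        req.messages.push({
          role: "user",
          blocks: [{ kind: "tool_result", toolUseId: item.call_id, content: item.output }],
        });
        break;
    }
  }
  return req;
}
```

The egress builder would walk the same mapping in reverse; that direction is omitted here.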

Protocol-Aware Response Accumulation

Previously accumulateNonStreamResponse only parsed Anthropic-format responses, silently losing all content and usage data when the upstream was OpenAI. Fixed by:

Adding effectiveProtocol to UpstreamResult so the pipeline dispatches to the correct accumulator
New accumulators: accumulateOpenAINonStreamJSON, accumulateResponsesNonStreamJSON, accumulateNonStreamOpenAIStream (for OpenAI Chat Completions SSE)
Streaming openai-responses routed through accumulateResponsesSSEStream
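
The failure mode and the fix can be illustrated with a small sketch that dispatches on the effective protocol. The protocol strings "openai" and "openai-responses" appear in this PR; "anthropic" as a protocol name, the SketchResult type, and the function names below are assumptions for illustration, not the PR's accumulators. Bodies are typed as any to keep the sketch short.

```typescript
type WireProtocol = "anthropic" | "openai" | "openai-responses";

// Hypothetical reduced result; the real GatewayResponse carries more fields.
interface SketchResult {
  text: string;
  inputTokens: number;
  outputTokens: number;
}

// Anthropic Messages: text lives in content[] blocks of type "text".
function fromAnthropic(body: any): SketchResult {
  let text = "";
  for (const block of body.content ?? []) {
    if (block.type === "text") text += block.text;
  }
  return {
    text,
    inputTokens: body.usage?.input_tokens ?? 0,
    outputTokens: body.usage?.output_tokens ?? 0,
  };
}

// OpenAI Chat Completions: text lives in choices[0].message.content.
function fromChatCompletions(body: any): SketchResult {
  return {
    text: body.choices?.[0]?.message?.content ?? "",
    inputTokens: body.usage?.prompt_tokens ?? 0,
    outputTokens: body.usage?.completion_tokens ?? 0,
  };
}

// OpenAI Responses API: text lives in output[] message items, in output_text parts.
function fromResponses(body: any): SketchResult {
  let text = "";
  for (const item of body.output ?? []) {
    if (item.type !== "message") continue;
    for (const part of item.content ?? []) {
      if (part.type === "output_text") text += part.text;
    }
  }
  return {
    text,
    inputTokens: body.usage?.input_tokens ?? 0,
    outputTokens: body.usage?.output_tokens ?? 0,
  };
}

// Pick the parser that matches the protocol actually used upstream.
function accumulateByProtocol(protocol: WireProtocol, body: any): SketchResult {
  switch (protocol) {
    case "anthropic":
      return fromAnthropic(body);
    case "openai":
      return fromChatCompletions(body);
    case "openai-responses":
      return fromResponses(body);
  }
}
```

Feeding a Chat Completions body to the Anthropic parser does not throw in this sketch; it returns empty text and zero usage, which matches the silent data loss described above.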

OpenAI Cached Token Tracking

Parse prompt_tokens_details.cached_tokens from OpenAI usage responses (all three paths: non-stream JSON, stream SSE, Responses API SSE) and map to existing cacheReadInputTokens
Emit prompt_tokens_details.cached_tokens in Chat Completions and Responses API egress responses
Prompt ordering already optimal for OpenAI automatic prefix caching (system/tools before messages)
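
A rough sketch of the round trip for the Chat Completions usage object follows. Only prompt_tokens_details.cached_tokens and cacheReadInputTokens come from this PR; the SketchUsage type and both function names are hypothetical. Emitting the details object only when the cached count is positive is one possible reading of "when cache data is available".

```typescript
// Subset of the Chat Completions usage object relevant to caching.
interface ChatCompletionsUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens?: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

// Hypothetical internal usage record; only cacheReadInputTokens is named in the PR.
interface SketchUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadInputTokens: number;
}

// Ingress direction: upstream usage -> internal record.
function parseChatCompletionsUsage(usage: ChatCompletionsUsage): SketchUsage {
  return {
    inputTokens: usage.prompt_tokens,
    outputTokens: usage.completion_tokens,
    // Missing details mean the upstream reported no cache information.
    cacheReadInputTokens: usage.prompt_tokens_details?.cached_tokens ?? 0,
  };
}

// Egress direction: internal record -> usage object sent to the client.
function emitChatCompletionsUsage(usage: SketchUsage): ChatCompletionsUsage {
  const out: ChatCompletionsUsage = {
    prompt_tokens: usage.inputTokens,
    completion_tokens: usage.outputTokens,
    total_tokens: usage.inputTokens + usage.outputTokens,
  };
  if (usage.cacheReadInputTokens > 0) {
    out.prompt_tokens_details = { cached_tokens: usage.cacheReadInputTokens };
  }
  return out;
}
```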

Pi Plugin

Added "openai" + 5 newly-discovered compatible providers (zai, minimax, minimax-cn, kimi-coding, vercel-ai-gateway) to GATEWAY_PROVIDERS

Verification

Typecheck: All 4 packages pass
Tests: 1252 pass, 0 fail (34 new tests)
Build: All packages build successfully

Deferred

openai-codex and azure-openai-responses Pi provider redirection (need URL pattern investigation)
Google / Bedrock (native SDK — not HTTP-proxiable)
OpenAI Batch API for worker calls (separate issue)

Add POST /v1/responses route to handle the OpenAI Responses API, enabling Lore memory features for openai, openai-codex, and azure-openai-responses providers that use this protocol instead of /v1/chat/completions.

- New ingress/egress translator (translate/openai-responses.ts) that normalizes Responses API input items to GatewayRequest and converts GatewayResponse back to Responses API format (streaming + non-streaming)
- New SSE stream accumulator (stream/openai-responses.ts) that parses Responses API events into GatewayResponse
- Third protocol branch in forwardToUpstream() preserving openai-responses wire protocol through the pipeline
- Pi plugin: add openai + 5 newly-discovered compatible providers
- 27 new tests covering translator and stream accumulator
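
As an illustration of what the SSE stream accumulator has to do, the sketch below folds a buffered Responses API event stream into one result. It is not the PR's stream/openai-responses.ts: parseSSEText is a simplified stand-in for the parseSSEStream helper (it works on a complete string rather than a live stream), and SketchAccumulated and accumulateResponsesEvents are hypothetical names.

```typescript
interface SketchSSEEvent {
  event: string;
  data: string;
}

// Simplified SSE framing: frames are separated by a blank line, and only
// "event:" and "data:" fields are read. Values are whitespace-trimmed,
// which is good enough for JSON payloads.
function parseSSEText(raw: string): SketchSSEEvent[] {
  const events: SketchSSEEvent[] = [];
  for (const frame of raw.split(/\r?\n\r?\n/)) {
    let event = "message";
    const dataLines: string[] = [];
    for (const line of frame.split(/\r?\n/)) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) dataLines.push(line.slice(5).trim());
    }
    if (dataLines.length > 0) events.push({ event, data: dataLines.join("\n") });
  }
  return events;
}

// Hypothetical reduced result; the real GatewayResponse carries more fields.
interface SketchAccumulated {
  text: string;
  toolArguments: Record<string, string>; // keyed by output item id
  inputTokens: number;
  outputTokens: number;
  completed: boolean;
}

function accumulateResponsesEvents(events: SketchSSEEvent[]): SketchAccumulated {
  const acc: SketchAccumulated = {
    text: "",
    toolArguments: {},
    inputTokens: 0,
    outputTokens: 0,
    completed: false,
  };
  for (const { event, data } of events) {
    const payload = JSON.parse(data);
    switch (event) {
      case "response.output_text.delta":
        acc.text += payload.delta ?? "";
        break;
      case "response.function_call_arguments.delta": {
        // Argument JSON arrives in fragments; concatenate per item.
        const id = String(payload.item_id);
        acc.toolArguments[id] = (acc.toolArguments[id] ?? "") + (payload.delta ?? "");
        break;
      }
      case "response.completed":
        // The final event carries the full response object, including usage.
        acc.inputTokens = payload.response?.usage?.input_tokens ?? 0;
        acc.outputTokens = payload.response?.usage?.output_tokens ?? 0;
        acc.completed = true;
        break;
      default:
        break; // other event types are ignored in this sketch
    }
  }
  return acc;
}
```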

…d_tokens tracking

Fix pipeline to correctly accumulate upstream responses for all three wire protocols (Anthropic, OpenAI Chat Completions, OpenAI Responses API). Previously, all responses were parsed as Anthropic format, silently losing all content and usage data when the upstream was OpenAI.

- Add effectiveProtocol to UpstreamResult so pipeline can dispatch to the correct accumulator (Anthropic, OpenAI Chat Completions, or Responses API)
- Add accumulateOpenAINonStreamJSON and accumulateResponsesNonStreamJSON for non-streaming upstream responses
- Add accumulateNonStreamOpenAIStream for OpenAI Chat Completions SSE
- Route streaming openai-responses through accumulateResponsesSSEStream
- Parse prompt_tokens_details.cached_tokens from OpenAI usage responses and map to existing cacheReadInputTokens (semantic equivalent)
- Emit prompt_tokens_details.cached_tokens in Chat Completions and Responses API egress responses when cache data is available
- Prompt ordering already optimal for OpenAI automatic prefix caching (system/tools before messages in both builders)

BYK added 2 commits May 12, 2026 17:45

BYK enabled auto-merge (squash) May 12, 2026 18:02

BYK mentioned this pull request May 12, 2026

feat(gateway): OpenAI Batch API support for worker calls #264

Closed

BYK merged commit b8a4836 into main May 12, 2026
7 checks passed

BYK deleted the feat/openai-responses-protocol branch May 12, 2026 18:02

This was referenced May 13, 2026

publish: BYK/loreai@0.18.0 #294

Closed

publish: BYK/loreai@0.18.0 #296

Closed
