Context
The model-access API (#510) ships an OpenAI-compatible /v1/chat/completions endpoint so that LangChain.js, the OpenAI SDK, Vercel AI SDK, MCP sampling clients, etc. work against Harper unmodified. For non-streaming responses the shape is straightforward; for streaming, OpenAI uses a very specific SSE delta envelope with a [DONE] sentinel.
Rather than have every implementer hand-roll that envelope, ship a single formatter helper in core that wraps a token iterator into the right shape.
Proposed API
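import { Resource } from 'harperdb';
import { openaiStream } from 'harperdb/streaming'; // proposed export path

class ChatCompletions extends Resource {
  async *post(target, body, request) {
    // `scope.models` is assumed to be provided by the model-access API (#510).
    const tokens = scope.models.generateStream(body, {
      model: body.model,
      signal: request.signal, // lets generation stop if the request is aborted
    });
    // Wrap the raw tokens in OpenAI chunk-delta SSE messages, ending with [DONE].
    yield* openaiStream(tokens, { model: body.model });
  }
}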
openaiStream(tokens, { model, id? }) yields { event, data, id } SSE messages whose data is a JSON-stringified OpenAI chunk-delta object:
{
  "id": "chatcmpl-...",
  "object": "chat.completion.chunk",
  "created": 1731000000,
  "model": "llama-3.3-70b",
  "choices": [{ "index": 0, "delta": { "content": "..." }, "finish_reason": null }]
}
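As a rough illustration of the chunk shape above, the following TypeScript sketch types the object and builds one content delta. The interface name `ChatCompletionChunk` and the helper `makeContentChunk` are hypothetical and not part of the proposed API; only the field names come from the example chunk.

```typescript
// Hypothetical type for illustration; field names follow the example chunk.
interface ChatCompletionChunk {
  id: string;
  object: 'chat.completion.chunk';
  created: number; // Unix time in seconds
  model: string;
  choices: Array<{
    index: number;
    delta: { role?: 'assistant'; content?: string };
    finish_reason: string | null;
  }>;
}

// Build a single content-delta chunk (assumed helper name).
function makeContentChunk(
  id: string,
  model: string,
  created: number,
  content: string,
): ChatCompletionChunk {
  return {
    id,
    object: 'chat.completion.chunk',
    created,
    model,
    choices: [{ index: 0, delta: { content }, finish_reason: null }],
  };
}

const exampleChunk = makeContentChunk('chatcmpl-example', 'llama-3.3-70b', 1731000000, 'Hel');
// This JSON string is what would be carried in the SSE message's `data` field.
const exampleData = JSON.stringify(exampleChunk);
```

Keeping `id`, `created`, and `model` constant across all chunks of one response matches what OpenAI-compatible clients typically expect; only `choices[0].delta` changes from chunk to chunk.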
Terminal sentinel: { data: '[DONE]' } (raw — the existing SSE serializer in contentTypes.ts handles this since data: [DONE]\n\n is exactly what's wanted).
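To make the wire format concrete, here is a minimal sketch of how a message could be framed as SSE. This is not the serializer in contentTypes.ts; `SseMessage` and `toSseFrame` are hypothetical names used only to show why a raw `[DONE]` string needs no special casing.

```typescript
// Hypothetical message shape, mirroring the { event, data, id } fields above.
interface SseMessage {
  event?: string;
  data: string;
  id?: string;
}

// Frame one message in SSE wire format: optional `event:` and `id:` lines,
// one `data:` line per line of payload, then a blank line as terminator.
function toSseFrame(msg: SseMessage): string {
  const lines: string[] = [];
  if (msg.event !== undefined) lines.push(`event: ${msg.event}`);
  if (msg.id !== undefined) lines.push(`id: ${msg.id}`);
  for (const part of msg.data.split('\n')) lines.push(`data: ${part}`);
  return lines.join('\n') + '\n\n';
}

// The sentinel is a plain string, not JSON, so it is framed like any other data.
const doneFrame = toSseFrame({ data: '[DONE]' });
const chunkFrame = toSseFrame({ data: JSON.stringify({ object: 'chat.completion.chunk' }) });
```

Because the sentinel travels as an ordinary `data` value, the formatter only has to yield it last; the framing step stays unaware of OpenAI specifics.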
Tool-call deltas (choices[0].delta.tool_calls) are handled symmetrically when the upstream iterator yields tool-call chunks.
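One possible shape for the helper, sketched in TypeScript under several assumptions: the upstream iterator yields either plain strings (text tokens) or objects carrying OpenAI-style `tool_calls` deltas, the stream closes with an empty-delta chunk whose `finish_reason` is `"stop"` or `"tool_calls"` (as OpenAI's own streams do), and only the `data` field is populated. The name `openaiStreamSketch` and the `UpstreamItem` type are hypothetical; the real iterator shape is not specified in this proposal.

```typescript
// Assumed upstream item shapes; not defined by the proposal.
interface ToolCallDelta {
  index: number;
  id?: string;
  type?: 'function';
  function?: { name?: string; arguments?: string };
}
type UpstreamItem = string | { tool_calls: ToolCallDelta[] };

interface SseDataMessage {
  data: string;
}

async function* openaiStreamSketch(
  tokens: AsyncIterable<UpstreamItem>,
  opts: { model: string; id?: string },
): AsyncGenerator<SseDataMessage> {
  // One id and timestamp are shared by every chunk of the response.
  const id = opts.id ?? `chatcmpl-${Date.now().toString(36)}`;
  const created = Math.floor(Date.now() / 1000);
  let sawToolCall = false;

  // Wrap a delta in the chunk envelope and JSON-stringify it.
  const wrap = (delta: object, finishReason: string | null): SseDataMessage => ({
    data: JSON.stringify({
      id,
      object: 'chat.completion.chunk',
      created,
      model: opts.model,
      choices: [{ index: 0, delta, finish_reason: finishReason }],
    }),
  });

  for await (const item of tokens) {
    if (typeof item === 'string') {
      yield wrap({ content: item }, null);
    } else {
      sawToolCall = true;
      yield wrap({ tool_calls: item.tool_calls }, null);
    }
  }

  // Closing chunk with an empty delta, then the raw sentinel.
  yield wrap({}, sawToolCall ? 'tool_calls' : 'stop');
  yield { data: '[DONE]' };
}
```

Text and tool-call items go through the same `wrap` envelope, which is one reading of "handled symmetrically": the only difference is which key appears inside `delta`.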
Acceptance
- openaiStream(iter, opts) is exported from a stable core path.
- ... serializeStream without modification.
- [DONE] sentinel emitted.
- ... openaiStream().
Related
Out of scope
- Non-chat-completions OpenAI endpoints (Assistants, Files, etc.): separate work if/when needed.
- Anthropic / Bedrock streaming shapes: those clients accept their native protocols, which the respective backends already speak.