Emit a usage chunk on streamed completions#3

Merged: Defilan merged 1 commit into main from feat/streaming-usage on May 15, 2026

@Defilan Defilan commented May 15, 2026

What

Add a token-usage block to streamed chat completions.

Why

Streaming responses ended with `[DONE]` and no token counts, so clients (opencode) could not report context-window consumption for a streamed turn. Only non-streaming responses carried `usage`.

How

The `.finished` stream event already carried the `Usage`, but the streaming handler discarded it. This change adds an optional `usage` field to `ChatCompletionChunk` and emits an OpenAI-style trailing usage chunk (empty `choices`, populated `usage`) just before `[DONE]`. `usage` is omitted from every other chunk.
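
The shape of this change can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: only `ChatCompletionChunk`, `Usage`, `.finished`, `usage`, `choices`, and `prompt_tokens` come from the description, while `Delta`, `Choice`, `StreamEvent`, the `.token` case, `sseLines(for:id:)`, and the remaining field names are hypothetical stand-ins modeled on the OpenAI wire format.

```swift
import Foundation

// Hypothetical types for illustration; the repository's real definitions may differ.
struct Usage: Codable, Equatable {
    let promptTokens: Int
    let completionTokens: Int
    let totalTokens: Int

    enum CodingKeys: String, CodingKey {
        case promptTokens = "prompt_tokens"
        case completionTokens = "completion_tokens"
        case totalTokens = "total_tokens"
    }
}

struct Delta: Codable {
    var content: String?
}

struct Choice: Codable {
    let index: Int
    let delta: Delta
}

struct ChatCompletionChunk: Codable {
    let id: String
    let choices: [Choice]
    // Optional: the synthesized encoder skips nil optionals, so ordinary
    // content chunks carry no "usage" key at all.
    var usage: Usage? = nil
}

// Assumed event model: content tokens followed by one .finished event with usage.
enum StreamEvent {
    case token(String)
    case finished(Usage)
}

/// Turns stream events into SSE data lines, ending with the [DONE] sentinel.
func sseLines(for events: [StreamEvent], id: String) throws -> [String] {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys] // stable key order for readability
    var lines: [String] = []
    for event in events {
        let chunk: ChatCompletionChunk
        switch event {
        case .token(let text):
            let choice = Choice(index: 0, delta: Delta(content: text))
            chunk = ChatCompletionChunk(id: id, choices: [choice])
        case .finished(let usage):
            // Trailing usage chunk: empty choices, populated usage.
            chunk = ChatCompletionChunk(id: id, choices: [], usage: usage)
        }
        let json = String(decoding: try encoder.encode(chunk), as: UTF8.self)
        lines.append("data: \(json)")
    }
    lines.append("data: [DONE]")
    return lines
}
```

Making `usage` optional keeps existing content chunks byte-for-byte compatible, since the key only appears on the final chunk before `[DONE]`.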

Verified

`swift build` + 28 tests pass; the routes streaming test now asserts the stream carries `prompt_tokens`.
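
A check of this kind could look roughly like the sketch below. It is not the repository's test: `UsageProbe` and `promptTokens(inSSEBody:)` are hypothetical helpers, and the body format (`data: ` lines terminated by `[DONE]`) is assumed from the OpenAI streaming convention.

```swift
import Foundation

// Hypothetical minimal decoding type: only the fields this check needs.
struct UsageProbe: Decodable {
    struct Usage: Decodable {
        let promptTokens: Int
        enum CodingKeys: String, CodingKey {
            case promptTokens = "prompt_tokens"
        }
    }
    let usage: Usage?
}

/// Returns prompt_tokens from the first chunk that carries usage,
/// or nil if the stream reaches [DONE] (or ends) without one.
func promptTokens(inSSEBody body: String) -> Int? {
    let decoder = JSONDecoder()
    let prefix = "data: "
    // isNewline also matches "\r\n", which Swift treats as a single Character.
    for line in body.split(whereSeparator: \.isNewline) {
        guard line.hasPrefix(prefix) else { continue }
        let payload = line.dropFirst(prefix.count)
        if payload == "[DONE]" { break }
        guard let chunk = try? decoder.decode(UsageProbe.self, from: Data(payload.utf8)),
              let usage = chunk.usage else { continue }
        return usage.promptTokens
    }
    return nil
}
```

A client could use the same scan to report context-window consumption for a streamed turn, treating a `nil` result as "usage not provided".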

Streaming chat completions ended with `[DONE]` and no token counts, so
clients (opencode) could not report context-window consumption for a
streamed turn — only non-streaming responses carried `usage`.

The `.finished` stream event already carried the `Usage`; the streaming
handler discarded it. Add `usage` to `ChatCompletionChunk` and emit an
OpenAI-style trailing usage chunk (empty `choices`, populated `usage`)
before `[DONE]`.
@Defilan Defilan merged commit 18775f8 into main May 15, 2026
1 check passed