Summary
On the default OpenAI Chat Completions streaming path, token usage is never reported. OpenAI ships the usage payload as a separate final SSE chunk (choices: []) that arrives after the finish_reason chunk, but the agent loop breaks out of the stream the moment it sees Finish, so the later Usage chunk is never consumed.
Details
The provider sets stream_options.include_usage: true unconditionally for streaming, and the accumulator emits Usage and Finish from whichever chunk carries each.
In crates/harness-llm/src/openai.rs:796-861, ingest() pushes LlmChunk::Usage when a chunk has usage, and LlmChunk::Finish (via finalise()) when a chunk has finish_reason.
OpenAI delivers these in separate chunks: the finish_reason chunk first, then a usage-only chunk (choices: []) last. So Finish is emitted during an earlier ingest(). The agent consumer then stops at crates/harness-core/src/agent.rs:636-638.
The subsequently yielded Usage chunk is dropped. The other three providers emit Usage before Finish (Anthropic anthropic.rs:933, Responses responses.rs:1164, Google google.rs:644), so only OpenAI Chat Completions is affected.
Impact
No functional break, but billing/telemetry Usage events are silently lost on the default provider's streaming path, so Usage accounting reads zero. Severity: medium.
Suggested fix
Either keep draining the stream after Finish until it closes (consuming a trailing Usage), or have the OpenAI accumulator buffer the Finish until the usage-only terminal chunk has been ingested so Usage is emitted before Finish like the other providers.
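For illustration, the second option (buffering Finish inside the accumulator) could look roughly like the sketch below. It is a simplified model, not the code in openai.rs: RawChunk, BufferingAccumulator, flush(), consume_until_finish() and demo() are hypothetical names, and LlmChunk is reduced to three variants with assumed field shapes. The flush() step is an added assumption so that Finish is still delivered if the stream closes without a usage chunk.

```rust
// Hypothetical, simplified stand-ins for the real harness types.
#[derive(Debug, Clone, PartialEq)]
enum LlmChunk {
    Text(String),
    Usage { prompt_tokens: u32, completion_tokens: u32 },
    Finish(String),
}

/// One decoded SSE chunk, reduced to the fields that matter here.
#[derive(Default)]
struct RawChunk {
    text: Option<String>,
    finish_reason: Option<String>,
    usage: Option<(u32, u32)>, // (prompt_tokens, completion_tokens)
}

/// Accumulator that holds Finish back until Usage has been seen.
#[derive(Default)]
struct BufferingAccumulator {
    pending_finish: Option<LlmChunk>,
}

impl BufferingAccumulator {
    fn ingest(&mut self, raw: RawChunk) -> Vec<LlmChunk> {
        let mut out = Vec::new();
        if let Some(text) = raw.text {
            out.push(LlmChunk::Text(text));
        }
        if let Some(reason) = raw.finish_reason {
            // Do not emit yet: the usage-only chunk may still follow.
            self.pending_finish = Some(LlmChunk::Finish(reason));
        }
        if let Some((prompt_tokens, completion_tokens)) = raw.usage {
            out.push(LlmChunk::Usage { prompt_tokens, completion_tokens });
            // Usage is out, so a buffered Finish can now be released after it.
            if let Some(finish) = self.pending_finish.take() {
                out.push(finish);
            }
        }
        out
    }

    /// Call when the stream closes, so a buffered Finish is not lost
    /// if no usage chunk ever arrived.
    fn flush(&mut self) -> Vec<LlmChunk> {
        self.pending_finish.take().into_iter().collect()
    }
}

/// Models the consumer behaviour described in Details: stop at Finish.
fn consume_until_finish(chunks: Vec<LlmChunk>) -> Vec<LlmChunk> {
    let mut seen = Vec::new();
    for chunk in chunks {
        let done = matches!(chunk, LlmChunk::Finish(_));
        seen.push(chunk);
        if done {
            break;
        }
    }
    seen
}

/// Feeds chunks in the order described in Details: content, then
/// finish_reason, then the usage-only chunk.
fn demo() -> Vec<LlmChunk> {
    let mut acc = BufferingAccumulator::default();
    let raw = vec![
        RawChunk { text: Some("hi".into()), ..Default::default() },
        RawChunk { finish_reason: Some("stop".into()), ..Default::default() },
        RawChunk { usage: Some((12, 3)), ..Default::default() },
    ];
    let mut emitted = Vec::new();
    for r in raw {
        emitted.extend(acc.ingest(r));
    }
    emitted.extend(acc.flush());
    consume_until_finish(emitted)
}
```

In this model, demo() returns Text, Usage, Finish in that order, so a consumer that breaks on Finish still sees Usage. The trade-off is that Finish is delayed until either the usage chunk or stream close. The first option (draining after Finish) avoids that delay but changes the consumer rather than the provider.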