Fix token usage extraction for OpenAI Responses API (response.usage) in SSE/WS paths #3178
Pull request overview
Fixes token usage extraction for OpenAI Responses API, where usage data is nested at json.response.usage in response.completed/response.done events rather than at top-level json.usage. Without this fix, Codex /v1/responses traffic completed with zero token accounting.
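To make the two payload shapes concrete, here is a minimal sketch of the non-streaming fallback. The helper name `pickUsageAndModel` and the sample objects are hypothetical and are not taken from `token-tracker.js`; the only assumption carried over from the description is that usage and model may live either at the top level or under `json.response`.

```javascript
// Hypothetical helper for illustration; the real extractUsageFromJson may differ.
function pickUsageAndModel(json) {
  if (json === null || typeof json !== 'object') return null;
  const nested =
    json.response && typeof json.response === 'object' ? json.response : {};
  // Prefer top-level usage, then fall back to the nested Responses API location.
  const usage = json.usage ?? nested.usage;
  if (!usage || typeof usage !== 'object') return null;
  return { usage, model: json.model ?? nested.model ?? null };
}

// Made-up payloads: one with top-level usage, one with nested usage.
const topLevelStyle = {
  model: 'model-a',
  usage: { prompt_tokens: 3, completion_tokens: 5 },
};
const responsesStyle = {
  type: 'response.completed',
  response: {
    model: 'model-b',
    usage: { input_tokens: 12, output_tokens: 7, total_tokens: 19 },
  },
};

console.log(pickUsageAndModel(topLevelStyle));
console.log(pickUsageAndModel(responsesStyle));
```

Checking the top level first keeps existing behavior unchanged for payloads that already worked, which is presumably why the change is described as a fallback.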
Changes:
- Added SSE handler for response.completed/response.done events that reads tokens from json.response.usage, including reasoning and cached token enrichment.
- Added fallback to json.response.usage/json.response.model in non-streaming JSON extraction.
- Added parser, HTTP streaming, and WebSocket tests covering nested Responses API usage.
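The SSE branch described in the list above could look roughly like the following sketch. `usageFromSseLine` and the shape of the returned record are assumptions made for illustration, not the PR's actual `extractUsageFromSseLine`. The detail fields `input_tokens_details.cached_tokens` and `output_tokens_details.reasoning_tokens` follow the Responses API usage object.

```javascript
const COMPLETION_EVENTS = new Set(['response.completed', 'response.done']);

// Hypothetical stand-in for the SSE branch; returns null for lines without usage.
function usageFromSseLine(line) {
  if (!line.startsWith('data:')) return null;
  const payload = line.slice('data:'.length).trim();
  if (!payload || payload === '[DONE]') return null;

  let json;
  try {
    json = JSON.parse(payload);
  } catch {
    return null; // partial or non-JSON data line
  }
  if (!json || typeof json !== 'object') return null;
  if (!COMPLETION_EVENTS.has(json.type)) return null;

  const usage = json.response?.usage;
  if (!usage) return null;

  const input = usage.input_tokens ?? 0;
  const output = usage.output_tokens ?? 0;
  return {
    model: json.response.model ?? null,
    inputTokens: input,
    outputTokens: output,
    // Assumption: derive the total when the event omits it.
    totalTokens: usage.total_tokens ?? input + output,
    reasoningTokens: usage.output_tokens_details?.reasoning_tokens ?? 0,
    cachedTokens: usage.input_tokens_details?.cached_tokens ?? 0,
  };
}
```

Returning `null` for every line that is not a completion event lets a caller run this over each line of a stream and keep only the non-null result.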
Show a summary per file
| File | Description |
|---|---|
| containers/api-proxy/token-tracker.js | Source the usage object from json.response.usage (and model from json.response.model) as fallback in JSON path; add dedicated SSE branch for response.completed/response.done. |
| containers/api-proxy/token-tracker.parsing.test.js | New unit tests for nested response.usage in JSON and SSE paths, including reasoning/cached token extraction. |
| containers/api-proxy/token-tracker.http.test.js | Streaming /v1/responses test asserting input/output token metrics. |
| containers/api-proxy/token-tracker.websocket.test.js | WebSocket frame test for response.completed usage extraction. |
Copilot's findings
- Files reviewed: 4/4 changed files
- Comments generated: 0
🔬 Smoke Test Results
Overall: FAIL — Workflow template variables ( cc
|
Smoke Test Results
Result: FAIL (1/3 tests passed) Network connectivity to external domains appears blocked by the sandboxed firewall environment.
|
Chroot Version Comparison Results
Overall: Not all tests passed — Python and Node.js versions differ between host and chroot environments.
|
🏗️ Build Test Suite Results
Overall: 8/8 ecosystems passed — ✅ PASS
|
Smoke Test ✅
PR review: Refactor option parser utilities into domain-focused modules with compatibility wrappers; Refactor API proxy request path into focused guard and transport modules
Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "registry.npmjs.org"
See Network Configuration for more information.
|
|
Smoke Test Results: FAIL (GitHub MCP tool missing, Connectivity SSL error 35)
Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "localhost"
See Network Configuration for more information.
|
Smoke Test Results
Overall: FAIL —
|
|
@copilot fix this failing ci check https://github.com/github/gh-aw-firewall/actions/runs/25897362213/job/76114390532?pr=3178 |
I investigated this run. The failure is not from this PR’s code changes: job |
🔥 Smoke Test: Copilot BYOK (Offline) Mode
Running in BYOK offline mode ( Overall: FAIL (GitHub MCP auth unavailable in this environment)
|
Bug Fix
What was the bug?
token-tracker only read OpenAI usage from top-level json.usage, so Responses API completion events (response.completed/response.done) with usage at json.response.usage were ignored. As a result, Codex /v1/responses traffic could complete successfully while effective token accounting stayed at zero.
How did you fix it?
- containers/api-proxy/token-tracker.js
  - SSE path (extractUsageFromSseLine)
    - Handles response.completed and response.done.
    - Reads tokens from json.response.usage (input_tokens, output_tokens, total_tokens).
    - Keeps enrichment for reasoning_tokens and cached prompt tokens.
    - Reads the model from json.response.model when present.
  - JSON path (extractUsageFromJson)
    - Falls back to json.response.usage when top-level json.usage is absent.
Targeted tests
- containers/api-proxy/token-tracker.parsing.test.js
  - response.completed usage extraction.
  - Nested response.usage in the JSON path.
- containers/api-proxy/token-tracker.http.test.js
  - Streaming /v1/responses case with response.completed.
- containers/api-proxy/token-tracker.websocket.test.js
  - response.completed frame case.
Example
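As an illustration of the bug and the fix side by side, the sketch below builds a made-up `response.completed` frame and compares a top-level-only lookup with the nested fallback. The model name and token counts are invented for the example and do not come from the PR.

```javascript
// Illustrative response.completed payload; all values are made up.
const frame = JSON.stringify({
  type: 'response.completed',
  response: {
    model: 'example-model',
    usage: {
      input_tokens: 120,
      input_tokens_details: { cached_tokens: 100 },
      output_tokens: 45,
      output_tokens_details: { reasoning_tokens: 30 },
      total_tokens: 165,
    },
  },
});

const event = JSON.parse(frame);

// Old behavior: only top-level usage is consulted, so nothing is found.
const before = event.usage ?? null;

// New behavior: fall back to the nested usage object.
const after = event.usage ?? event.response?.usage ?? null;

console.log(before); // null
console.log(after.total_tokens); // 165
```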
Testing
response.usage) and related reasoning/cache fields.