
feat(openai_agents): pull cached tokens through into metrics #364

Merged
Abhijeet Prasad (AbhiPrasad) merged 2 commits into main from curtis/cached_tokens_tracking
Apr 29, 2026

Conversation

@cjgalione
Contributor

Summary

  • Walk *_tokens_details sub-objects in _usage_to_metrics so the OpenAI Agents SDK integration picks up cached / reasoning / audio token counts (e.g. input_tokens_details.cached_tokens → prompt_cached_tokens). Mirrors the JS fix in braintrust-sdk-javascript#1186.
  • Route _response_log_data through _usage_to_metrics instead of hardcoding the three total/input/output fields, so the Responses API path benefits from the same extraction.
  • _task_log_data and _turn_log_data already delegated to _usage_to_metrics, so they inherit the fix.
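A minimal sketch of the extraction the bullets above describe, assuming a plain-dict usage payload. The helper name and shape here are illustrative, not the SDK's actual _usage_to_metrics implementation:

```python
# Hypothetical sketch: flatten an OpenAI-style usage payload, including
# *_tokens_details sub-objects, into Braintrust-style metric names.
# Assumption: input/output prefixes map to prompt/completion.
_PREFIX = {"input": "prompt", "output": "completion"}

def usage_to_metrics(usage: dict) -> dict:
    metrics = {}
    for key, value in usage.items():
        if key.endswith("_tokens_details") and isinstance(value, dict):
            # e.g. input_tokens_details.cached_tokens -> prompt_cached_tokens
            base = key[: -len("_tokens_details")]
            prefix = _PREFIX.get(base, base)
            for detail, count in value.items():
                if isinstance(count, (int, float)):
                    metrics[f"{prefix}_{detail}"] = count
        elif key.endswith("_tokens") and isinstance(value, (int, float)):
            # e.g. input_tokens -> prompt_tokens, total_tokens -> total_tokens
            base = key[: -len("_tokens")]
            metrics[f"{_PREFIX.get(base, base)}_tokens"] = value
    return metrics
```

Because the details walk is generic over sub-object keys, reasoning and audio token counts fall out of the same loop rather than needing per-field handling.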

Why

A customer reported that cached tokens are not showing up in the Python BraintrustTracingProcessor. The narrow 3-field extraction in _response_log_data (Responses API) and _usage_to_metrics (chat-completions / Generation spans) drops input_tokens_details.cached_tokens even though the OpenAI wrapper (braintrust/oai.py's _parse_metrics_from_usage) already handles it correctly. The JS SDK was patched in December but the Python equivalent was never written.

Test plan

  • test_response_span_extracts_cached_tokens_from_usage — Response span sees prompt_cached_tokens
  • test_response_span_handles_zero_cached_tokens — zero is preserved, not dropped
  • test_response_span_handles_missing_cached_tokens — no prompt_cached_tokens key when details absent
  • test_generation_span_extracts_cached_tokens_from_usage — Generation span path
  • Existing non-VCR processor tests still pass
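The zero-vs-missing distinction those tests pin down can be sketched as follows (hypothetical extract_cached helper, not the SDK's code): a cached_tokens of 0 must survive as prompt_cached_tokens == 0, while an absent details object must produce no prompt_cached_tokens key at all.

```python
# Hypothetical sketch of the zero-preserved / missing-dropped behavior.
def extract_cached(usage: dict) -> dict:
    metrics = {}
    details = usage.get("input_tokens_details")
    if isinstance(details, dict) and "cached_tokens" in details:
        # Membership check, not truthiness: 0 is a valid, meaningful count.
        metrics["prompt_cached_tokens"] = details["cached_tokens"]
    return metrics
```

Checking key membership rather than truthiness is what keeps a legitimate zero from being silently dropped.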

Both _response_log_data (Responses API) and _usage_to_metrics
(chat-completions / Generation spans) only emitted total / prompt /
completion tokens. Cached / reasoning / audio token counts surfaced via
the OpenAI usage `*_tokens_details` sub-objects were dropped, so the
OpenAI Agents SDK integration never logged metrics like
prompt_cached_tokens, even though the OpenAI wrapper already handles them.

Walk *_tokens_details inside _usage_to_metrics (mapping the input/output
prefix to prompt/completion to stay consistent with Braintrust's
convention) and route _response_log_data through the same helper. Mirrors
the JS fix in #1186.

Tests cover the four cases from the JS PR: cached tokens present on a
Response span, zero is preserved, missing details produces no metric,
and Generation spans extract cached tokens too.
@cjgalione
Contributor Author

I should've done this a long time ago, when I did this one: braintrustdata/braintrust-sdk-javascript@a05dc4d

@AbhiPrasad
Member

gonna push up some commits re: testing for this! and then we can get it merged in!

@AbhiPrasad Abhijeet Prasad (AbhiPrasad) merged commit de41845 into main Apr 29, 2026
161 of 163 checks passed
@AbhiPrasad Abhijeet Prasad (AbhiPrasad) deleted the curtis/cached_tokens_tracking branch April 29, 2026 16:49
Abhijeet Prasad (AbhiPrasad) added a commit that referenced this pull request Apr 29, 2026
