feat(openai_agents): pull cached tokens through into metrics #364
Merged
Abhijeet Prasad (AbhiPrasad) merged 2 commits into main on Apr 29, 2026
Conversation
Both _response_log_data (Responses API) and _usage_to_metrics (chat-completions / Generation spans) only emitted total / prompt / completion tokens. Cached / reasoning / audio token counts surfaced via the OpenAI usage `*_tokens_details` sub-objects were dropped, so the OpenAI Agents SDK integration never logged metrics like prompt_cached_tokens — even though the OpenAI wrapper already does. Walk *_tokens_details inside _usage_to_metrics (mapping the input/output prefix to prompt/completion to stay consistent with Braintrust's convention) and route _response_log_data through the same helper. Mirrors the JS fix in #1186. Tests cover the four cases from the JS PR: cached tokens present on a Response span, zero is preserved, missing details produces no metric, and Generation spans extract cached tokens too.
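The walk described above can be sketched roughly as follows. The function name matches the PR's `_usage_to_metrics`, but the body is an illustrative assumption based on this description, not the actual braintrust implementation:

```python
def usage_to_metrics(usage):
    """Sketch: flatten an OpenAI usage payload into Braintrust-style metrics,
    including *_tokens_details sub-objects (names/layout assumed)."""
    prefix_map = {"input": "prompt", "output": "completion"}

    def canonical(name):
        # Map OpenAI's input/output prefixes to Braintrust's prompt/completion.
        for src, dst in prefix_map.items():
            if name == src or name.startswith(src + "_"):
                return dst + name[len(src):]
        return name

    metrics = {}
    for key, value in usage.items():
        if key.endswith("_tokens_details") and isinstance(value, dict):
            # e.g. input_tokens_details.cached_tokens -> prompt_cached_tokens
            base = canonical(key[: -len("_tokens_details")])
            for detail_key, detail_value in value.items():
                if isinstance(detail_value, (int, float)):
                    metrics["%s_%s" % (base, detail_key)] = detail_value
        elif isinstance(value, (int, float)):
            metrics[canonical(key)] = value
    return metrics
```

Routing `_response_log_data` through the same helper means the Responses API path picks up the detail fields for free instead of re-extracting the three top-level counts.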
Contributor (Author):

I should've done this a long time ago, when I did this one: braintrustdata/braintrust-sdk-javascript@a05dc4d
Member:

gonna push up some commits re: testing for this! and then we can get it merged in!
Abhijeet Prasad (AbhiPrasad) approved these changes on Apr 29, 2026
Abhijeet Prasad (AbhiPrasad) added a commit that referenced this pull request on Apr 29, 2026
Summary
- Walk `*_tokens_details` sub-objects in `_usage_to_metrics` so the OpenAI Agents SDK integration picks up cached / reasoning / audio token counts (e.g. `input_tokens_details.cached_tokens` → `prompt_cached_tokens`). Mirrors the JS fix in braintrust-sdk-javascript#1186.
- Route `_response_log_data` through `_usage_to_metrics` instead of hardcoding the three `total`/`input`/`output` fields, so the Responses API path benefits from the same extraction.
- `_task_log_data` and `_turn_log_data` already delegated to `_usage_to_metrics`, so they inherit the fix.

Why
A customer reported that cached tokens are not showing up in the Python BraintrustTracingProcessor. The narrow 3-field extraction in `_response_log_data` (Responses API) and `_usage_to_metrics` (chat-completions / Generation spans) drops `input_tokens_details.cached_tokens` even though the OpenAI wrapper (braintrust/oai.py's `_parse_metrics_from_usage`) already handles it correctly. The JS SDK was patched in December but the Python equivalent was never written.

Test plan
- `test_response_span_extracts_cached_tokens_from_usage` — Response span sees `prompt_cached_tokens`
- `test_response_span_handles_zero_cached_tokens` — zero is preserved, not dropped
- `test_response_span_handles_missing_cached_tokens` — no `prompt_cached_tokens` key when details absent
- `test_generation_span_extracts_cached_tokens_from_usage` — Generation span path
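The zero-vs-missing distinction above is the easy one to get wrong: a truthiness check like `if cached_tokens:` silently drops a legitimate count of 0. A minimal self-contained illustration of the behavior the tests pin down (helper name and dict shape are assumptions, not braintrust code):

```python
def extract_cached(details):
    """Emit prompt_cached_tokens only when the field is actually present.
    Presence is tested with `is not None`, not truthiness, so that a
    cached count of 0 survives while absent details produce no metric."""
    metrics = {}
    if details is not None and details.get("cached_tokens") is not None:
        metrics["prompt_cached_tokens"] = details["cached_tokens"]
    return metrics

extract_cached({"cached_tokens": 80})  # metric emitted with value 80
extract_cached({"cached_tokens": 0})   # zero preserved, not dropped
extract_cached(None)                   # no prompt_cached_tokens key at all
```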