Propagate CachedInputTokenCount in OpenTelemetry telemetry #7234
Conversation
…lient Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Pull request overview
This pull request adds support for tracking cached input tokens in OpenTelemetry telemetry following the OpenTelemetry semantic conventions. The changes propagate UsageDetails.CachedInputTokenCount through both histogram metrics and activity span tags in the OpenTelemetryChatClient.
Changes:
- Added OpenTelemetry constants for cache_read token type and corresponding semantic convention attribute name
- Updated OpenTelemetryChatClient to record cached input tokens in histogram metrics and activity span tags
- Enhanced tests to verify the new cached token telemetry functionality
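For context, `UsageDetails.CachedInputTokenCount` is the value being propagated. A minimal illustration of the kind of usage object the updated tests construct; the property values are made up, and the exact type shape is an assumption based on the PR description rather than a copy of the test code:

```csharp
// Illustrative only: a UsageDetails instance carrying cached input tokens,
// similar to what the updated tests construct. Assumes Microsoft.Extensions.AI
// exposes these nullable count properties as the PR description implies.
using Microsoft.Extensions.AI;

var usage = new UsageDetails
{
    InputTokenCount = 10,
    OutputTokenCount = 20,
    TotalTokenCount = 30,
    CachedInputTokenCount = 5, // newly surfaced in telemetry by this change
};
```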
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/Libraries/Microsoft.Extensions.AI/OpenTelemetryConsts.cs | Added TokenTypeCacheRead ("cache_read") constant and CacheReadInputTokens ("gen_ai.usage.cache_read_input_tokens") attribute name following OpenTelemetry semantic conventions |
| src/Libraries/Microsoft.Extensions.AI/ChatCompletion/OpenTelemetryChatClient.cs | Added histogram recording for cached input tokens with token type tag and activity span tag using the new semantic convention attribute, following the same pattern as input/output tokens |
| test/Libraries/Microsoft.Extensions.AI.Tests/ChatCompletion/OpenTelemetryChatClientTests.cs | Added CachedInputTokenCount = 5 to test usage details and corresponding assertion for the activity span tag in both streaming and non-streaming test scenarios |
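The review rows above describe recording cached tokens in both the histogram and the activity span. A hedged sketch of that pattern follows; the tag keys and constant values are taken from the table, but the method shape, parameter names, and `UsageTelemetrySketch` class are illustrative assumptions, not the actual `OpenTelemetryChatClient` source:

```csharp
// Hedged sketch of the recording pattern described in the review above,
// NOT the actual OpenTelemetryChatClient implementation. Attribute names
// come from the PR text; everything else is assumed for illustration.
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;

static class UsageTelemetrySketch
{
    public static void Record(
        Histogram<long> tokenUsage, Activity? activity,
        long? input, long? output, long? cachedInput)
    {
        if (input is long i)
        {
            tokenUsage.Record(i, new KeyValuePair<string, object?>("gen_ai.token.type", "input"));
            _ = activity?.SetTag("gen_ai.usage.input_tokens", i);
        }

        if (output is long o)
        {
            tokenUsage.Record(o, new KeyValuePair<string, object?>("gen_ai.token.type", "output"));
            _ = activity?.SetTag("gen_ai.usage.output_tokens", o);
        }

        if (cachedInput is long c)
        {
            // As reviewed at this point in the PR, cached tokens got a
            // "cache_read" token type in the histogram plus a span tag.
            // The PR's final summary drops the histogram entry and keeps
            // only the span attribute.
            tokenUsage.Record(c, new KeyValuePair<string, object?>("gen_ai.token.type", "cache_read"));
            _ = activity?.SetTag("gen_ai.usage.cache_read_input_tokens", c);
        }
    }
}
```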
…ic conventions Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Head branch was pushed to by a user without write access
…defined in semantic conventions Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
- Updated OpenTelemetryConsts.cs to add new constant: `CacheReadInputTokens` = "gen_ai.usage.cache_read.input_tokens" for the activity span tag
- Updated OpenTelemetryChatClient.cs to record `CachedInputTokenCount` as an activity span tag using `gen_ai.usage.cache_read.input_tokens`
- Updated OpenTelemetryChatClientTests.cs: added `CachedInputTokenCount = 5` to test usage details for both sync and streaming responses, and asserts `gen_ai.usage.cache_read.input_tokens` = 5 in activity tags

Summary
This PR adds support for propagating `CachedInputTokenCount` through OpenTelemetry telemetry as an activity span tag using `gen_ai.usage.cache_read.input_tokens`, as defined in the OpenTelemetry semantic conventions registry.

Note: The histogram metric `gen_ai.client.token.usage` only supports "input" and "output" token types per the semantic conventions, so cached tokens are only recorded as a span attribute, not as a separate histogram entry.
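Per the note above, the final design emits only a span attribute for cached tokens. A hypothetical test-style check mirroring what the updated tests are described as asserting; the tag name comes from the summary and the expected value of 5 from the test description, while the `CachedTokenTagCheck` helper and listener wiring are assumptions:

```csharp
// Hypothetical assertion mirroring the described test behavior: the activity
// carries gen_ai.usage.cache_read.input_tokens = 5. Not the actual code from
// OpenTelemetryChatClientTests.cs.
using System;
using System.Diagnostics;
using System.Linq;

static class CachedTokenTagCheck
{
    public static void AssertTag(Activity activity)
    {
        // TagObjects holds tags set via Activity.SetTag with non-string values.
        var value = activity.TagObjects
            .FirstOrDefault(t => t.Key == "gen_ai.usage.cache_read.input_tokens").Value;

        if (!Equals(value, 5L))
        {
            throw new InvalidOperationException(
                $"Expected gen_ai.usage.cache_read.input_tokens = 5, got '{value}'.");
        }
    }
}
```

In a test, this would run after invoking the instrumented client against a captured `Activity`, e.g. `CachedTokenTagCheck.AssertTag(capturedActivity);`.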