Report 1h vs 5m Anthropic cache-creation token split in telemetry by bhavyaus · Pull Request #319172 · microsoft/vscode

bhavyaus · 2026-05-30T22:22:14Z

Adds per-request 1h vs 5m cache-creation token split to telemetry, enabling exact COGS attribution for the chat.anthropic.promptCaching.extendedTtl A/B experiment without inferring rates from arm assignment.

The new fields live on a nested `anthropic_cache_creation?` object on `APIUsage.prompt_tokens_details`, namespaced to make the provider-specificity explicit at the type level. Other providers leave it undefined; telemetry uses optional chaining so missing values drop cleanly from the row.

prompt_tokens_details?: {
    cached_tokens: number;
    cache_creation_input_tokens?: number;
    anthropic_cache_creation?: {
        ephemeral_1h_input_tokens?: number;   // 2x rate
        ephemeral_5m_input_tokens?: number;   // 1.25x rate
    };
};
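
A minimal sketch of how a telemetry call site could read these optional fields. The interfaces below are trimmed, hypothetical copies of the shape shown above, and `cacheCreationMeasurements` is an illustrative helper, not the function used in the PR. It assumes that a measurement whose source value is `undefined` is simply left out of the row.

```typescript
// Hypothetical, trimmed-down copy of the shape above so the sketch is self-contained.
interface PromptTokensDetailsSketch {
	cached_tokens: number;
	cache_creation_input_tokens?: number;
	anthropic_cache_creation?: {
		ephemeral_1h_input_tokens?: number; // 2x rate
		ephemeral_5m_input_tokens?: number; // 1.25x rate
	};
}

interface UsageSketch {
	prompt_tokens_details?: PromptTokensDetailsSketch;
}

// Illustrative helper (an assumption, not the PR's code): builds the two
// measurements and omits any whose source value is undefined.
function cacheCreationMeasurements(usage: UsageSketch | undefined): Record<string, number> {
	// Optional chaining yields undefined for non-Anthropic providers.
	const split = usage?.prompt_tokens_details?.anthropic_cache_creation;
	const measurements: Record<string, number> = {};
	if (split?.ephemeral_1h_input_tokens !== undefined) {
		measurements.promptCacheCreation1hTokenCount = split.ephemeral_1h_input_tokens;
	}
	if (split?.ephemeral_5m_input_tokens !== undefined) {
		measurements.promptCacheCreation5mTokenCount = split.ephemeral_5m_input_tokens;
	}
	return measurements;
}
```

For a provider that never sets `anthropic_cache_creation`, the helper returns an empty object, so no zero-valued placeholder is written for the split.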

Parse the per-TTL breakdown from Anthropic's `usage.cache_creation` object (present in `message_start` across CAPI/Anthropic 1P, Bedrock InvokeModel, and Vertex AI) and emit two new measurements on `response.success`:

- `promptCacheCreation1hTokenCount`: 1h-TTL writes (2x base input rate)
- `promptCacheCreation5mTokenCount`: 5m-TTL writes (1.25x base input rate)
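
To illustrate why the split matters for cost attribution, here is a hedged sketch that extracts the breakdown from an Anthropic-style `usage` object and weights it by the rates quoted above (2x for 1h writes, 1.25x for 5m writes). `AnthropicUsageSketch`, `parseCacheCreationSplit`, and `cacheWriteCostInBaseTokens` are hypothetical names chosen for this example; the actual parsing lives in `messagesApi.ts` and may differ.

```typescript
// Assumed subset of Anthropic's usage payload; only fields relevant here.
interface AnthropicUsageSketch {
	input_tokens: number;
	cache_creation_input_tokens?: number;
	cache_creation?: {
		ephemeral_1h_input_tokens?: number;
		ephemeral_5m_input_tokens?: number;
	} | null;
}

interface CacheCreationSplit {
	ephemeral_1h_input_tokens?: number;
	ephemeral_5m_input_tokens?: number;
}

// Hypothetical parser: returns undefined when the provider sent no breakdown.
function parseCacheCreationSplit(usage: AnthropicUsageSketch): CacheCreationSplit | undefined {
	const raw = usage.cache_creation;
	if (!raw) {
		return undefined;
	}
	return {
		ephemeral_1h_input_tokens: raw.ephemeral_1h_input_tokens,
		ephemeral_5m_input_tokens: raw.ephemeral_5m_input_tokens,
	};
}

// Cache-write cost expressed in "base input token equivalents",
// using the multipliers stated in the PR description.
function cacheWriteCostInBaseTokens(split: CacheCreationSplit | undefined): number {
	const oneHour = split?.ephemeral_1h_input_tokens ?? 0;
	const fiveMinute = split?.ephemeral_5m_input_tokens ?? 0;
	return oneHour * 2 + fiveMinute * 1.25;
}
```

As a worked example, a request that writes 1000 tokens at the 1h TTL and 400 tokens at the 5m TTL costs 1000 × 2 + 400 × 1.25 = 2500 base-input-token equivalents. With only the aggregate `cache_creation_input_tokens` (1400), that figure would have to be inferred from the experiment arm.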

Copilot

Pull request overview

This PR extends Copilot Chat’s Anthropic Messages API usage accounting to capture (and emit in existing response.success telemetry) a per-request split of prompt cache-creation (write) tokens by TTL (1h vs 5m), enabling more accurate cost attribution for prompt caching experiments.

Changes:

Add anthropic_cache_creation to APIUsage.prompt_tokens_details to represent Anthropic’s usage.cache_creation TTL breakdown.
Parse and preserve the 1h/5m cache-creation token breakdown in both non-streaming and streaming Anthropic Messages API response handling.
Emit two new response.success telemetry measurements and add unit tests covering the new parsing/streaming semantics.
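
The streaming case can be sketched as a fold over usage-bearing events. This is an illustration only: the event shape is simplified (in the real stream, `message_start` nests usage under `message`), `foldTtlSplit` is a hypothetical name, and the rule used here (a later value overrides an earlier one, a missing value preserves it) is an assumption about the "preservation/override" semantics rather than a copy of the PR's logic.

```typescript
interface TtlSplit {
	ephemeral_1h_input_tokens?: number;
	ephemeral_5m_input_tokens?: number;
}

// Simplified, hypothetical event shape for this sketch.
interface UsageEventSketch {
	type: 'message_start' | 'message_delta';
	usage?: { cache_creation?: TtlSplit | null };
}

// Assumed semantics: a later event overrides a field only when it carries a
// value for that field; otherwise the earlier value is preserved.
function foldTtlSplit(events: UsageEventSketch[]): TtlSplit | undefined {
	let current: TtlSplit | undefined;
	for (const event of events) {
		const next = event.usage?.cache_creation;
		if (!next) {
			continue; // nothing reported on this event: keep what we have
		}
		current = {
			ephemeral_1h_input_tokens: next.ephemeral_1h_input_tokens ?? current?.ephemeral_1h_input_tokens,
			ephemeral_5m_input_tokens: next.ephemeral_5m_input_tokens ?? current?.ephemeral_5m_input_tokens,
		};
	}
	return current;
}
```

Under these assumptions, a `message_start` carrying both counts followed by a `message_delta` with no `cache_creation` leaves the split unchanged, while a later event that reports only the 1h count updates that field and keeps the 5m count.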

| File | Description |
| --- | --- |
| extensions/copilot/src/platform/networking/common/openai.ts | Extends `APIUsage` typing to include Anthropic-specific cache-creation TTL breakdown. |
| extensions/copilot/src/platform/endpoint/node/messagesApi.ts | Parses/accumulates 1h vs 5m cache-creation token counts for streaming + non-streaming responses and surfaces them in `prompt_tokens_details`. |
| extensions/copilot/src/extension/prompt/node/chatMLFetcherTelemetry.ts | Adds two new `response.success` telemetry measurements sourced from `anthropic_cache_creation`. |
| extensions/copilot/src/platform/endpoint/test/node/messagesApi.spec.ts | Adds unit tests for non-streaming + streaming preservation/override behavior of the TTL breakdown. |

Copilot's findings

Files reviewed: 4/4 changed files
Comments generated: 1

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings May 30, 2026 22:22

Copilot started reviewing on behalf of bhavyaus May 30, 2026 22:22

aiday-mar previously approved these changes May 30, 2026

bhavyaus enabled auto-merge (squash) May 30, 2026 22:23

bhavyaus force-pushed the dev/bhavyau/anthropic-cache-creation-ttl-telemetry branch from aad6ebe to d80fd65 May 30, 2026 22:26

Copilot AI reviewed May 30, 2026

Comment thread extensions/copilot/src/extension/prompt/node/chatMLFetcherTelemetry.ts Outdated

Potential fix for pull request finding

63513f3

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

bhavyaus dismissed aiday-mar’s stale review via 63513f3 May 30, 2026 22:27

bhavyaus force-pushed the dev/bhavyau/anthropic-cache-creation-ttl-telemetry branch from bd0c532 to 63513f3 May 30, 2026 22:35

dmitrivMS approved these changes May 30, 2026

bhavyaus merged commit f6d1fcf into main May 30, 2026
33 of 47 checks passed

bhavyaus deleted the dev/bhavyau/anthropic-cache-creation-ttl-telemetry branch May 30, 2026 22:56

vs-code-engineering Bot added this to the 1.123.0 milestone May 30, 2026

bhavyaus mentioned this pull request May 31, 2026

Report Anthropic thinking_tokens as reasoning tokens in telemetry #319185

Merged

bhavyaus merged 2 commits into main from dev/bhavyau/anthropic-cache-creation-ttl-telemetry

4 participants
