Report 1h vs 5m Anthropic cache-creation token split in telemetry #319172

Merged
bhavyaus merged 2 commits into main from dev/bhavyau/anthropic-cache-creation-ttl-telemetry
May 30, 2026

bhavyaus commented May 30, 2026

Adds per-request 1h vs 5m cache-creation token split to telemetry, enabling exact COGS attribution for the chat.anthropic.promptCaching.extendedTtl A/B experiment without inferring rates from arm assignment.

The new fields live on a nested `anthropic_cache_creation?` object on `APIUsage.prompt_tokens_details`, namespaced to make the provider-specificity explicit at the type level. Other providers leave it undefined; telemetry uses optional chaining so missing values drop cleanly from the row.

```ts
prompt_tokens_details?: {
    cached_tokens: number;
    cache_creation_input_tokens?: number;
    anthropic_cache_creation?: {
        ephemeral_1h_input_tokens?: number;   // 2x rate
        ephemeral_5m_input_tokens?: number;   // 1.25x rate
    };
};
```
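
As a rough illustration of the "optional chaining so missing values drop cleanly" point, the sketch below builds a measurement bag from a usage object. The `APIUsageSketch` interface and the `buildCacheCreationMeasurements` helper are hypothetical names invented for this example and are not taken from the PR's diff; only the field names and the two measurement names come from the description and commit message.

```typescript
// Hypothetical, trimmed-down copy of the shape shown in the PR description.
interface PromptTokensDetailsSketch {
    cached_tokens: number;
    cache_creation_input_tokens?: number;
    anthropic_cache_creation?: {
        ephemeral_1h_input_tokens?: number;
        ephemeral_5m_input_tokens?: number;
    };
}

// Hypothetical stand-in for APIUsage; the real type has more fields.
interface APIUsageSketch {
    prompt_tokens: number;
    completion_tokens: number;
    prompt_tokens_details?: PromptTokensDetailsSketch;
}

// Assumed helper: collects the two measurements and omits any that are
// undefined, so non-Anthropic providers contribute nothing to the row.
function buildCacheCreationMeasurements(usage: APIUsageSketch | undefined): Record<string, number> {
    const split = usage?.prompt_tokens_details?.anthropic_cache_creation;
    const candidates: Record<string, number | undefined> = {
        promptCacheCreation1hTokenCount: split?.ephemeral_1h_input_tokens,
        promptCacheCreation5mTokenCount: split?.ephemeral_5m_input_tokens,
    };

    const measurements: Record<string, number> = {};
    for (const key of Object.keys(candidates)) {
        const value = candidates[key];
        if (value !== undefined) {
            measurements[key] = value;
        }
    }
    return measurements;
}
```

With a usage object that has no `prompt_tokens_details` at all, the helper returns an empty object instead of throwing, which is the behavior the description attributes to optional chaining.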

Copilot AI review requested due to automatic review settings May 30, 2026 22:22
aiday-mar previously approved these changes May 30, 2026
bhavyaus enabled auto-merge (squash) May 30, 2026 22:23
Parse the per-TTL breakdown from Anthropic's usage.cache_creation object
(present in message_start across CAPI/Anthropic 1P, Bedrock InvokeModel,
and Vertex AI) and emit two new measurements on response.success:

- promptCacheCreation1hTokenCount: 1h-TTL writes (2x base input rate)
- promptCacheCreation5mTokenCount: 5m-TTL writes (1.25x base input rate)

Enables exact per-row COGS attribution for the
chat.anthropic.promptCaching.extendedTtl A/B experiment without
inferring rates from arm assignment.

The new fields live on a nested anthropic_cache_creation? object on
APIUsage.prompt_tokens_details, namespaced to make the
provider-specificity explicit at the type level. Other providers leave
it undefined; telemetry uses optional chaining so missing values drop
cleanly from the row.
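
The parsing step described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `messagesApi.ts`: `MessageStartSketch`, `CacheCreationSplit`, and `readCacheCreationSplit` are hypothetical names, and the event shape is reduced to the `usage.cache_creation` fields the commit message refers to.

```typescript
// Reduced, assumed shape of the usage block carried by message_start.
interface AnthropicUsageSketch {
    input_tokens?: number;
    cache_creation_input_tokens?: number;
    cache_creation?: {
        ephemeral_1h_input_tokens?: number;
        ephemeral_5m_input_tokens?: number;
    };
}

interface MessageStartSketch {
    type: 'message_start';
    message: { usage?: AnthropicUsageSketch };
}

// Hypothetical name for the nested object stored on prompt_tokens_details.
interface CacheCreationSplit {
    ephemeral_1h_input_tokens?: number;
    ephemeral_5m_input_tokens?: number;
}

// Returns the per-TTL split if the event carries one, otherwise undefined.
// Non-numeric values are ignored instead of being coerced.
function readCacheCreationSplit(event: MessageStartSketch): CacheCreationSplit | undefined {
    const raw = event.message.usage?.cache_creation;
    if (!raw) {
        return undefined;
    }
    const split: CacheCreationSplit = {};
    if (typeof raw.ephemeral_1h_input_tokens === 'number') {
        split.ephemeral_1h_input_tokens = raw.ephemeral_1h_input_tokens;
    }
    if (typeof raw.ephemeral_5m_input_tokens === 'number') {
        split.ephemeral_5m_input_tokens = raw.ephemeral_5m_input_tokens;
    }
    return split;
}
```

Returning `undefined` when `cache_creation` is absent keeps the nested object off `prompt_tokens_details` entirely, which matches the stated intent that other providers leave it undefined.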
bhavyaus force-pushed the dev/bhavyau/anthropic-cache-creation-ttl-telemetry branch from aad6ebe to d80fd65 May 30, 2026 22:26
Copilot AI left a comment

Pull request overview

This PR extends Copilot Chat’s Anthropic Messages API usage accounting to capture (and emit in existing response.success telemetry) a per-request split of prompt cache-creation (write) tokens by TTL (1h vs 5m), enabling more accurate cost attribution for prompt caching experiments.
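
To make the cost-attribution point concrete, here is a minimal sketch of how the two counts could be weighted per row. The 2x and 1.25x multipliers are the ones quoted in the PR description; the function name, the constants, and the choice to express the result in "base input token equivalents" are assumptions made for this example.

```typescript
// Multipliers relative to the base input rate, as quoted in the PR description.
const RATE_MULTIPLIER_1H = 2;
const RATE_MULTIPLIER_5M = 1.25;

// Hypothetical name for the per-TTL split carried on a telemetry row.
interface CacheCreationSplit {
    ephemeral_1h_input_tokens?: number;
    ephemeral_5m_input_tokens?: number;
}

// Converts cache-write tokens into base-input-token equivalents.
// Missing counts are treated as zero writes at that TTL.
function cacheWriteBaseTokenEquivalents(split: CacheCreationSplit | undefined): number {
    const oneHour = split?.ephemeral_1h_input_tokens ?? 0;
    const fiveMinute = split?.ephemeral_5m_input_tokens ?? 0;
    return oneHour * RATE_MULTIPLIER_1H + fiveMinute * RATE_MULTIPLIER_5M;
}

// 1000 tokens at 2x plus 400 tokens at 1.25x: 2000 + 500
console.log(cacheWriteBaseTokenEquivalents({
    ephemeral_1h_input_tokens: 1000,
    ephemeral_5m_input_tokens: 400,
})); // 2500
```

Because each row carries its own split, the weighting does not depend on knowing which experiment arm produced the request.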

Changes:

  • Add anthropic_cache_creation to APIUsage.prompt_tokens_details to represent Anthropic’s usage.cache_creation TTL breakdown.
  • Parse and preserve the 1h/5m cache-creation token breakdown in both non-streaming and streaming Anthropic Messages API response handling.
  • Emit two new response.success telemetry measurements and add unit tests covering the new parsing/streaming semantics.
Summary per file:

| File | Description |
| --- | --- |
| `extensions/copilot/src/platform/networking/common/openai.ts` | Extends APIUsage typing to include Anthropic-specific cache-creation TTL breakdown. |
| `extensions/copilot/src/platform/endpoint/node/messagesApi.ts` | Parses/accumulates 1h vs 5m cache-creation token counts for streaming + non-streaming responses and surfaces them in prompt_tokens_details. |
| `extensions/copilot/src/extension/prompt/node/chatMLFetcherTelemetry.ts` | Adds two new response.success telemetry measurements sourced from anthropic_cache_creation. |
| `extensions/copilot/src/platform/endpoint/test/node/messagesApi.spec.ts` | Adds unit tests for non-streaming + streaming preservation/override behavior of the TTL breakdown. |
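
The summary mentions "preservation/override behavior" for streaming without spelling out the rule. One plausible reading, sketched below purely as an assumption, is that a later usage update overrides a field only when it actually supplies a value, and otherwise the value seen earlier in the stream is preserved. `mergeCacheCreationSplit` and `CacheCreationSplit` are hypothetical names; the actual semantics exercised in `messagesApi.spec.ts` may differ.

```typescript
// Hypothetical name for the per-TTL split tracked while streaming.
interface CacheCreationSplit {
    ephemeral_1h_input_tokens?: number;
    ephemeral_5m_input_tokens?: number;
}

// Assumed merge rule: fields present in the update win; fields absent from
// the update keep the previously seen value. No update means no change.
function mergeCacheCreationSplit(
    previous: CacheCreationSplit | undefined,
    update: CacheCreationSplit | undefined,
): CacheCreationSplit | undefined {
    if (!update) {
        return previous;
    }
    return {
        ephemeral_1h_input_tokens: update.ephemeral_1h_input_tokens ?? previous?.ephemeral_1h_input_tokens,
        ephemeral_5m_input_tokens: update.ephemeral_5m_input_tokens ?? previous?.ephemeral_5m_input_tokens,
    };
}
```

Under this rule, an update of `{ ephemeral_5m_input_tokens: 30 }` applied to `{ ephemeral_1h_input_tokens: 100, ephemeral_5m_input_tokens: 20 }` yields 100 for the 1h count and 30 for the 5m count.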

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 1

Comment thread extensions/copilot/src/extension/prompt/node/chatMLFetcherTelemetry.ts Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
bhavyaus force-pushed the dev/bhavyau/anthropic-cache-creation-ttl-telemetry branch from bd0c532 to 63513f3 May 30, 2026 22:35
bhavyaus merged commit f6d1fcf into main May 30, 2026
33 of 47 checks passed
bhavyaus deleted the dev/bhavyau/anthropic-cache-creation-ttl-telemetry branch May 30, 2026 22:56
vs-code-engineering Bot added this to the 1.123.0 milestone May 30, 2026