
perf(observability): add prefix-cache telemetry metrics#911

Merged
wsp1911 merged 3 commits into
GCWing:mainfrom
wsp1911:fix-#907
May 28, 2026


@wsp1911
Collaborator

@wsp1911 wsp1911 commented May 28, 2026

Supersedes #907.

Credit to @harryfan1985 for the original work in #907. This replacement PR keeps the parts of that work that remain valid on top of current main and adapts them where needed. Context compression stays aligned with 5e5738f, which already implemented prompt-cache-aware context compression on main via the full-prefix compression path.

What changed:

  • add cache write token tracking for prefix-cache observability
  • refine cache hit ratio accounting to use only input tokens from requests where cache telemetry was explicitly reported
  • document the prefix-cache stability contract for contextual tool manifests
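
To illustrate the denominator change described above, here is a minimal sketch of the accounting idea. It is not the code from this PR: `RequestUsage`, `CacheStats`, `aggregate`, and all field names are hypothetical, and the sketch assumes `input_tokens` counts the full prompt, including tokens served from the cache. Requests that report no cache telemetry are skipped, so they cannot dilute the ratio.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class RequestUsage:
    """Token usage for one model request (hypothetical shape)."""
    input_tokens: int
    # None means the provider reported no cache telemetry for this request.
    cache_read_tokens: Optional[int] = None
    cache_write_tokens: Optional[int] = None


@dataclass
class CacheStats:
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0
    # Input tokens summed only over requests that reported cache telemetry.
    eligible_input_tokens: int = 0

    @property
    def hit_ratio(self) -> Optional[float]:
        # No telemetry at all: report "unknown" rather than 0.0.
        if self.eligible_input_tokens == 0:
            return None
        return self.cache_read_tokens / self.eligible_input_tokens


def aggregate(usages: Iterable[RequestUsage]) -> CacheStats:
    stats = CacheStats()
    for u in usages:
        if u.cache_read_tokens is None and u.cache_write_tokens is None:
            continue  # no telemetry: excluded from the denominator
        stats.cache_read_tokens += u.cache_read_tokens or 0
        stats.cache_write_tokens += u.cache_write_tokens or 0
        stats.eligible_input_tokens += u.input_tokens
    return stats


usages = [
    RequestUsage(input_tokens=1000, cache_read_tokens=800, cache_write_tokens=0),
    RequestUsage(input_tokens=500),  # provider reported nothing
    RequestUsage(input_tokens=1000, cache_read_tokens=0, cache_write_tokens=900),
]
stats = aggregate(usages)
```

With these sample values the ratio is 800 / 2000; dividing by all 2500 input tokens would understate it, because the 500-token request never had a chance to report a hit.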

What is intentionally not included:

  • dc992d6, which is already on main
  • the context compression implementation from 3a5033c / 3acb3fb, because 5e5738f already implemented prompt-cache-aware compression on the current main branch and this PR's compression path takes a different approach

Commits:

  • 3a312eeb perf(observability): add cache write token metrics
  • 70378754 fix(token-usage): refine cache hit ratio denominator
  • 2165bfa7 docs(agent-tools): document prefix-cache stability contract

wsp1911 and others added 3 commits May 28, 2026 11:50
Co-authored-by: harryfan1985 <harryfan1985@gmail.com>
Co-authored-by: harryfan1985 <harryfan1985@gmail.com>
Co-authored-by: harryfan1985 <harryfan1985@gmail.com>
@wsp1911 wsp1911 marked this pull request as ready for review May 28, 2026 04:25
@wsp1911 wsp1911 merged commit 9a0c7d4 into GCWing:main May 28, 2026
4 checks passed
