fix: 1h TTL cache write pricing and dynamic savings/overhead wording #445
Merged
- computeCallCost(), estimateCompactionCost(), updateShadowContext() now accept TTL param; 1h sessions apply 2x cache_write multiplier
- setCachePricing() in gradient uses 1h-adjusted rate for bust-vs-continue
- Sentry cost metric accounts for 1h TTL
- Dashboard wording switches between 'savings' and 'overhead' based on sign (table header, stat card, live bar, breakdown, historical row)
Problem
Anthropic charges 2x base input price for 1h TTL cache writes ($10/MTok for Opus vs $6.25 for 5m TTL). Three code paths in the cost tracker used the 5m rate regardless of TTL, causing:
- computeCallCost(): conversation turn costs underreported by ~6% for 1h sessions
- estimateCompactionCost(): avoided compaction savings undercounted (e.g. $10.29 → $13.44 for 28 compactions)
- setCachePricing(): gradient's bust-vs-continue gate used the wrong write cost

Additionally, several UI locations displayed "savings" language even when the net value was negative, which was confusing.
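The pricing gap above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the names CacheTTL, CACHE_WRITE_MULTIPLIER, cacheWriteRatePerMTok, and cacheWriteCost are hypothetical, and the multipliers are expressed relative to the base input price (1.25x for 5m is inferred from the $6.25 vs $10 figures quoted above, which imply a $5/MTok base).

```typescript
// Hypothetical sketch: TTL-aware cache write pricing.
// All names here are illustrative and do not come from the PR.
type CacheTTL = "5m" | "1h";

// Multipliers relative to the base input price (assumption: 1.25x for 5m, 2x for 1h).
const CACHE_WRITE_MULTIPLIER: Record<CacheTTL, number> = {
  "5m": 1.25,
  "1h": 2,
};

// Cache write rate in USD per million tokens; defaults to the 5m rate when no TTL is given.
function cacheWriteRatePerMTok(baseInputPerMTok: number, ttl: CacheTTL = "5m"): number {
  return baseInputPerMTok * CACHE_WRITE_MULTIPLIER[ttl];
}

// Cost in USD of writing `tokens` tokens to the cache at the given TTL.
function cacheWriteCost(tokens: number, baseInputPerMTok: number, ttl?: CacheTTL): number {
  return (tokens / 1_000_000) * cacheWriteRatePerMTok(baseInputPerMTok, ttl);
}
```

With a $5/MTok base, cacheWriteRatePerMTok(5, "5m") gives 6.25 and cacheWriteRatePerMTok(5, "1h") gives 10, matching the figures above. A code path that always uses the 5m rate undercounts every 1h cache write by the difference.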
Fix
1h TTL Pricing (cost-tracker.ts, pipeline.ts, sentry.ts)
- Added ttl?: "5m" | "1h" param to computeCallCost(), recordConversationCost(), updateShadowContext(), estimateCompactionCost(), and emitCostMetric()
- When ttl === "1h", applies 2x multiplier to cache_write rate
- TTL is read from sessionState.resolvedConversationTTL
- setCachePricing() now uses the 1h-adjusted rate for bust-vs-continue decisions

Dynamic Wording (ui.ts)
5 locations now switch between "savings" and "overhead" based on sign:
- "Savings" → "Net" (neutral)
- "Net Overhead" (red)
- "N% saved" → "N% overhead"
- "Net savings: -$X" → "Net overhead: $X"
- "Net estimated savings" → "Net estimated overhead"
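The sign-based wording switch can be sketched like this. It is an illustration under assumptions, not the code in ui.ts: the function names netLabel and percentLabel are hypothetical, and treating a net value of exactly zero as "savings" is a choice made here for the example.

```typescript
// Hypothetical sketch: pick "savings" or "overhead" wording from the sign of the net value.
// Function names are illustrative; zero is treated as savings (assumption).

// Dollar label: negative values render as a positive overhead amount
// instead of a confusing "Net savings: -$X".
function netLabel(netUsd: number): string {
  const amount = `$${Math.abs(netUsd).toFixed(2)}`;
  return netUsd < 0 ? `Net overhead: ${amount}` : `Net savings: ${amount}`;
}

// Percentage label: "N% saved" when non-negative, "N% overhead" when negative.
function percentLabel(pct: number): string {
  return pct < 0 ? `${Math.abs(pct)}% overhead` : `${pct}% saved`;
}
```

Formatting the absolute value keeps the sign out of the number, so the reader sees either a saving or an overhead, never a negative saving.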