fix: warmup cooldown timing + rising cost threshold + accurate TTL pricing #373

Merged: BYK merged 1 commit into main from fix-warmup-costs on May 18, 2026

@BYK BYK commented May 18, 2026

Summary

Fixes cache warming economics: warmup was costing $65.89 but only saving $4.43 (net -$61.46). The dominant issue was a cooldown timing bug that turned every continuation warmup into a cold cache write (~$1.25) instead of a cheap refresh (~$0.10).

Root Cause

The normal-path cooldown was set to ttlMs (300s for a 5m TTL) instead of ttlMs - warmupMarginMs (255s). Warmups fire in the margin window (4:15–5:00), so with a 300s cooldown the next warmup could not fire until the previous cache was already expiring, which made every warmup a cold write. The forced/tool-call path already used the correct, tighter cooldown.

Changes

Fix 1: Cooldown timing (the $65 → $6.50 fix)

  • Uses ttlMs - warmupMarginMs as cooldown for all paths, not just forced/tool-call
  • Each continuation warmup now fires while the previous cache is still alive (cache read ~$0.10 vs cold write ~$1.25)
  • Also fixes cooldownFor() (used by dashboard diagnostics) to match
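
The timing difference can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper names `oldCooldownMs`, `fixedCooldownMs`, `warmupAllowed`, and `cacheStillAlive` are hypothetical, and the 45s margin is inferred from the 300s and 255s figures above. Whether the real check is inclusive at exactly 255s is also an assumption.

```typescript
// Values taken from the PR text: 5m TTL, warmups fire in the 4:15 to 5:00 window.
const ttlMs = 300000;
const warmupMarginMs = 45000; // inferred: 300s - 255s

// Old normal-path behaviour: cooldown equal to the full TTL.
function oldCooldownMs(ttl: number): number {
  return ttl;
}

// Fixed behaviour for all paths: TTL minus the warmup margin.
function fixedCooldownMs(ttl: number, margin: number): number {
  return ttl - margin;
}

// Hypothetical gate: a warmup may fire once the cooldown has elapsed.
// (Inclusive comparison is an assumption.)
function warmupAllowed(elapsedMs: number, cooldownMs: number): boolean {
  return elapsedMs >= cooldownMs;
}

// The previous cache entry can only be refreshed before its TTL has elapsed.
function cacheStillAlive(elapsedMs: number, ttl: number): boolean {
  return elapsedMs < ttl;
}

// Earliest moment each policy lets the next warmup fire.
const earliestOld = oldCooldownMs(ttlMs);                     // 300000 ms
const earliestFixed = fixedCooldownMs(ttlMs, warmupMarginMs); // 255000 ms

console.log("old policy hits a live cache:", cacheStillAlive(earliestOld, ttlMs));
console.log("fixed policy hits a live cache:", cacheStillAlive(earliestFixed, ttlMs));
```

Under these assumptions the old policy can never refresh a live cache, while the fixed policy leaves a 45s window in which the warmup is a cache read.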

Fix 2: Rising cumulative cost threshold

  • Adds cumulativeCostThreshold(k, read, write) — after k warmup cycles, break-even requires higher P(returns)
  • Formula: P > k × read / (write - read) — rises linearly with cycles spent
  • For Opus 5m: k=1 needs 8.7%, k=3 needs 26%, k=5 needs 43%, k=6 needs 52%
  • At k=1 matches costThreshold() exactly; clamped to 1.0
  • Applied in Phase B continuation to replace the flat 8.7% threshold
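
One way to read the formula: after k warmup cycles the session has spent k × read on refreshes, and a return with a warm cache saves (write - read), so warming breaks even only when P × (write - read) > k × read. The sketch below implements that stated formula with the clamp. It is not the PR's actual implementation; the handling of `k <= 0` and `write <= read` is an assumption, since the description only mentions "edge cases" without specifying them.

```typescript
// Sketch of the formula from the PR description:
//   P > k * read / (write - read), clamped to 1.0
function cumulativeCostThreshold(k: number, read: number, write: number): number {
  // Assumption: no cycles spent means nothing to recoup yet.
  if (k <= 0) return 0;

  const savingPerHit = write - read;
  // Assumption: if a warm hit saves nothing, warming can never pay off.
  if (savingPerHit <= 0) return 1;

  return Math.min(1, (k * read) / savingPerHit);
}

// Per-warmup dollar figures quoted in the PR for Opus with a 5m TTL.
for (const k of [1, 3, 5, 6]) {
  const pct = (cumulativeCostThreshold(k, 0.10, 1.25) * 100).toFixed(1);
  console.log(`k=${k}: P(returns) must exceed ${pct}%`);
}
```

With read = 0.10 and write = 1.25 this gives roughly 8.7%, 26.1%, 43.5%, and 52.2%, in line with the values listed above.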

Fix 3: TTL-aware cost recording

  • recordWarmupCost() and recordWarmupHit() accept optional ttl parameter
  • For 1h TTL sessions, doubles cache_write to match Anthropic's actual pricing
  • Previously, displayed costs and savings were undercounted by up to 50% for 1h sessions
  • recordTTLSavings intentionally uses base rate (counterfactual is against 5m, not 1h)
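
The TTL adjustment could look roughly like the sketch below. The helper `cacheWriteCost` and the `CacheTtl` type are hypothetical and do not reflect the real signatures of recordWarmupCost() or recordWarmupHit(); the 2× multiplier for 1h is taken directly from the description above.

```typescript
// Hypothetical type: the PR only says the ttl parameter is optional.
type CacheTtl = "5m" | "1h";

// Returns the cache-write cost to record for a warmup.
// Per the PR text, 1h sessions double cache_write; 5m (or omitted) uses the base rate.
function cacheWriteCost(baseWriteCost: number, ttl: CacheTtl = "5m"): number {
  const multiplier = ttl === "1h" ? 2 : 1;
  return baseWriteCost * multiplier;
}

// Recording a 1h write at the base rate captures only half of the doubled cost,
// which is where the "up to 50%" undercount comes from.
const base = 1.25;
const actual1h = cacheWriteCost(base, "1h");
const undercount = 1 - base / actual1h;
console.log(`1h write cost: ${actual1h}, undercount at base rate: ${undercount * 100}%`);
```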

Fix 4: Session-level ROI guard

  • After 10+ total warmups, if session hit rate < 20%, warming stops entirely
  • Applies to both normal path AND tool-call fast path
  • Prevents unlimited spending across many short breaks where per-break checks pass but aggregate ROI is negative
  • Dashboard notWarmingReason updated to show ROI guard and rising threshold rejections
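
A minimal sketch of the guard, assuming the session tracks total warmups and hits. The names `SessionWarmupStats` and `roiGuardBlocksWarming` are hypothetical; only the 10-warmup minimum and the 20% hit-rate floor come from the description.

```typescript
const MIN_WARMUPS_FOR_ROI_CHECK = 10; // "After 10+ total warmups"
const MIN_SESSION_HIT_RATE = 0.2;     // "session hit rate < 20%"

// Hypothetical shape for per-session counters.
interface SessionWarmupStats {
  totalWarmups: number;
  hits: number;
}

// Returns true when warming should stop for the rest of the session.
function roiGuardBlocksWarming(stats: SessionWarmupStats): boolean {
  // Too few samples to judge the session: the guard is skipped.
  if (stats.totalWarmups < MIN_WARMUPS_FOR_ROI_CHECK) return false;
  return stats.hits / stats.totalWarmups < MIN_SESSION_HIT_RATE;
}

// The session from the impact estimate: 65 warmups, 12 hits (about 18.5%).
console.log(roiGuardBlocksWarming({ totalWarmups: 65, hits: 12 }));
```

Because the check is the same function regardless of caller, applying it on both the normal path and the tool-call fast path only requires calling it before either path schedules a warmup.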

Tests

  • Updated cooldown test to verify the fix (260s > 255s allowed, 250s < 255s blocked)
  • Added cumulativeCostThreshold() tests: formula values, k=1 matches costThreshold(), clamping, edge cases
  • Added session ROI guard tests: low hit rate rejected, high hit rate allowed, skipped below 10 warmups, applies to tool-call path

Impact Estimate

For the session that triggered investigation (410 turns, 65 warmups, 12 hits):

  • Before: $65.89 warmup cost, $4.43 savings → net -$61.46
  • After (cooldown fix alone): ~$6.50 warmup cost → net -$2.07
  • After (all fixes): Rising threshold stops warming around cycle 5-6, ROI guard prevents chronically unprofitable sessions
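
The net figures above can be reproduced with a few lines of arithmetic. The helpers `roundCents` and `netSavings` are illustrative only, and treating the post-fix cost as 65 warmups × ~$0.10 per cache read is an assumption about how the ~$6.50 estimate was derived.

```typescript
// Round to whole cents to avoid floating-point noise in dollar amounts.
function roundCents(x: number): number {
  return Math.round(x * 100) / 100;
}

// Net result of warming: savings minus what the warmups cost.
function netSavings(savings: number, warmupCost: number): number {
  return roundCents(savings - warmupCost);
}

const savings = 4.43;
const costBefore = 65.89;
// Assumption: after the cooldown fix every one of the 65 warmups is a ~$0.10 read.
const costAfter = roundCents(65 * 0.10);

const netBefore = netSavings(savings, costBefore);
const netAfter = netSavings(savings, costAfter);

console.log(`before: ${netBefore}, after cooldown fix: ${netAfter}`);
```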

@BYK BYK self-assigned this May 18, 2026
@BYK BYK force-pushed the fix-warmup-costs branch from e48e70f to 68c02e6 May 18, 2026 21:01
@BYK BYK merged commit bda1646 into main May 18, 2026
10 checks passed
@BYK BYK deleted the fix-warmup-costs branch May 18, 2026 21:05
This was referenced May 21, 2026