fix: prevent post-idle compaction from busting /keep sessions on 5m TTL #266

Merged
BYK merged 2 commits into main from fix/keep-warm-cache-bust-5m-ttl on May 12, 2026

BYK (Owner) commented on May 12, 2026

Summary

  • isCacheWarm() now returns true unconditionally for /keep sessions that have received at least one warmup, preventing post-idle compaction from firing and busting the warmed cache prefix
  • shouldWarm() double-warm guard now uses a tighter cooldown (ttlMs - warmupMarginMs) in forced-keep mode, so warmups fire every ~5m instead of ~10m on a 5m TTL — eliminates the dead zone where the conversation cache expired between warmups
  • 2 new tests covering the tighter /keep cooldown and verifying non-forced mode still uses the full TTL cooldown

Root Cause

Session 0IZRCvwuhvQcRgZW had /keep activated but was on a 5-minute conversation TTL (only ~8 turns). Two compounding bugs:

  1. isCacheWarm() didn't account for /keep — it checked (now - lastWarmupAt) < ttlMs. Last warmup was 6.5 min before the user returned, exceeding the 5m TTL window, so it returned false even though the warmer was actively maintaining the session.

  2. Warmup cadence was 2x TTL — the double-warm guard blocked for a full ttlMs (5m) after each warmup, but the margin window didn't align again until ~10 min later. The cache expired for ~5 min every cycle (confirmed by logs showing 42% hit rate on every warmup after the first).
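
The 2x cadence in item 2 can be reproduced with a small simulation. This is a sketch under stated assumptions, not the project's scheduler: it assumes the margin window recurs once per TTL period measured from the last warmup, a 30 s `warmupMarginMs`, and a 10 s timer tick, none of which are stated in this PR. `inMarginWindow` and `warmupTimes` are hypothetical names.

```typescript
interface WarmerConfig {
  ttlMs: number;          // conversation cache TTL
  warmupMarginMs: number; // how early before expiry the warmer may fire (value below is assumed)
}

// ASSUMPTION: the margin window recurs once per TTL period since the last warmup.
function inMarginWindow(elapsedMs: number, cfg: WarmerConfig): boolean {
  return elapsedMs % cfg.ttlMs >= cfg.ttlMs - cfg.warmupMarginMs;
}

// Returns the timestamps (ms) at which a warmup fires, treating t=0 as the first warmup.
function warmupTimes(
  cfg: WarmerConfig,
  cooldownMs: number,
  horizonMs: number,
  tickMs: number,
): number[] {
  const times: number[] = [];
  let lastWarmupAt = 0;
  for (let now = tickMs; now <= horizonMs; now += tickMs) {
    const elapsed = now - lastWarmupAt;
    // Double-warm guard (elapsed >= cooldownMs) combined with the margin window.
    if (elapsed >= cooldownMs && inMarginWindow(elapsed, cfg)) {
      times.push(now);
      lastWarmupAt = now;
    }
  }
  return times;
}

const simCfg: WarmerConfig = { ttlMs: 5 * 60000, warmupMarginMs: 30000 };
const simHorizonMs = 30 * 60000;

// Cooldown of a full ttlMs: warmups at 9.5, 19 and 28.5 minutes.
const fullTtlCadence = warmupTimes(simCfg, simCfg.ttlMs, simHorizonMs, 10000);

// Cooldown of ttlMs - warmupMarginMs: warmups at 4.5, 9, 13.5, 18, 22.5 and 27 minutes.
const tightCadence = warmupTimes(simCfg, simCfg.ttlMs - simCfg.warmupMarginMs, simHorizonMs, 10000);
```

Under this model the full-TTL cooldown leaves 9.5 minutes between warmups, so a 5-minute cache sits expired for about 4.5 minutes of every cycle. That is in line with the ~5 min dead zone described above.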

Result: user returned after 51 min idle → isCacheWarm returned false → skipCompact=false → post-idle compaction fired → cache busted → first turn after resume was only 40% cache hit.
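
The failure chain can be traced with the numbers from this session. This is a minimal sketch, assuming that `skipCompact` simply mirrors the result of `isCacheWarm()`. The PR only shows the two values moving together, and the function names here are hypothetical.

```typescript
const MINUTE_MS = 60000;

// Pre-fix check as described in the root cause: warm only if the
// last warmup is younger than one TTL.
function isCacheWarmBeforeFix(nowMs: number, lastWarmupAtMs: number, ttlMs: number): boolean {
  return nowMs - lastWarmupAtMs < ttlMs;
}

// ASSUMPTION: skipCompact is set directly from the warm check.
function postIdleCompactionFires(cacheWarm: boolean): boolean {
  const skipCompact = cacheWarm;
  return !skipCompact;
}

const returnAtMs = 51 * MINUTE_MS;                  // user returns after 51 min idle
const lastWarmAtMs = returnAtMs - 6.5 * MINUTE_MS;  // last warmup was 6.5 min earlier

// false: 6.5 min since the last warmup is not under the 5 min TTL
const warmBeforeFix = isCacheWarmBeforeFix(returnAtMs, lastWarmAtMs, 5 * MINUTE_MS);

// true: compaction runs and rewrites the prefix the warmer was maintaining
const compactionFired = postIdleCompactionFires(warmBeforeFix);
```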

BYK added 2 commits May 12, 2026 19:02
isCacheWarm() now returns true unconditionally for /keep sessions that
have received at least one warmup, preventing post-idle compaction from
firing and busting the warmed cache prefix.

shouldWarm() double-warm guard now uses a tighter cooldown
(ttlMs - warmupMarginMs) in forced-keep mode so warmups fire every ~5m
instead of ~10m on a 5m TTL, eliminating the dead zone where the
conversation cache expired between warmups.

Prevent isCacheWarm from returning true indefinitely if the warmer stops
(circuit breaker trip, process failure). Now checks lastWarmupAt is within
2x TTL instead of unconditional true. Also add Math.max guard on cooldownMs.
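
The bounded check and the clamped cooldown from the second commit might look roughly like the sketch below. The `KeepSession` shape and both function names are hypothetical. The lower bound of 0 in the `Math.max` guard and the unchanged 1x TTL check for non-forced sessions are assumptions, since the commit message does not spell them out.

```typescript
interface KeepSession {
  forcedKeep: boolean;           // /keep is active
  lastWarmupAtMs: number | null; // null until the first warmup lands
}

function isCacheWarmBounded(nowMs: number, s: KeepSession, ttlMs: number): boolean {
  if (s.lastWarmupAtMs === null) return false;
  const ageMs = nowMs - s.lastWarmupAtMs;
  // Forced-keep sessions count as warm only while the last warmup is within
  // 2x TTL, so a stopped warmer cannot keep the check true indefinitely.
  // ASSUMPTION: other sessions keep the original 1x TTL comparison.
  return s.forcedKeep ? ageMs < 2 * ttlMs : ageMs < ttlMs;
}

// ASSUMPTION: the guard clamps at 0 so a margin larger than the TTL
// cannot produce a negative cooldown.
function doubleWarmCooldownMs(ttlMs: number, warmupMarginMs: number, forcedKeep: boolean): number {
  return forcedKeep ? Math.max(0, ttlMs - warmupMarginMs) : ttlMs;
}
```

With a 5-minute TTL, a forced-keep session whose last warmup was 6.5 minutes ago still reads as warm, while one whose warmer has been silent for 11 minutes does not.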
BYK merged commit 974eb06 into main on May 12, 2026
7 checks passed
BYK deleted the fix/keep-warm-cache-bust-5m-ttl branch on May 12, 2026 19:12
This was referenced May 13, 2026