feat(kv-cache): optional hit-count decay for eviction score #162

Merged: antirez merged 1 commit into antirez:main from unsaltedbutter-ai:feat/kv-cache-hit-decay on May 15, 2026
@unsaltedbutter-ai
Contributor

This is one way to solve issue #161.

Problem

The disk-cache eviction score in kv_entry_eviction_score (ds4_server.c:8696-8729) is (hits + 1) * tokens / file_size for every non-protected entry. hits is monotonic: it only ever goes up, on a successful disk reuse. last_used is captured at write time but never feeds back into the score. That works fine while the workload stays the same, but it lets a once-popular file dominate forever after any prompt schema change.

Concrete case: my cache directory currently holds this top entry by score:

 score*1e5  hits   tokens     reason   size_MB           last_used  file
    50.887     7    12288       cold    184.2  2026-05-14 22:57:44  716996621553

That's a 7-hit cold snapshot of an old Hermes system prompt. I shrank that prompt yesterday by removing some tools, so the file can never match an incoming prompt again. But its score (50.9) is 7× that of the next-highest entry, and the policy will keep it pinned at the top forever. Newly stored prefixes from the new prompt start at the (0 + 1) baseline and can never accumulate a first hit while the cache is at budget, because they are evicted before they are reused.

There is no LRU sweep, no TTL, no decay. Once a file earns hits, those hits never leave the file.
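As a sanity check on the numbers in the table above, here is a minimal sketch of the undecayed score. The struct and function names are hypothetical stand-ins, not the ones in ds4_server.c, and the byte count assumes the size_MB column means MiB (184.2 × 1048576, rounded).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cache entry; only the fields the score
 * reads are modeled. The real struct in ds4_server.c may differ. */
typedef struct {
    uint32_t hits;      /* successful disk reuses; never decremented */
    uint32_t tokens;    /* tokens covered by the snapshot */
    uint64_t file_size; /* bytes on disk */
} kv_entry_sketch;

/* Undecayed score as described above: (hits + 1) * tokens / file_size. */
static double score_without_decay(const kv_entry_sketch *e) {
    assert(e->file_size > 0);
    return ((double)e->hits + 1.0) * (double)e->tokens / (double)e->file_size;
}
```

Fed hits = 7, tokens = 12288, and file_size = 193147699, this gives roughly 50.9 after scaling by 1e5, in line with the table. A zero-hit entry of the same size and token count scores exactly one eighth of that, and nothing in the formula ever closes the gap.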

Change

Add --kv-cache-hit-half-life-days N. When N > 0, the score uses

effective_hits = hits * 2 ** -((now - last_used) / half_life)

instead of hits. A matching prompt that refreshes last_used keeps the entry hot, and an entry that stops getting hits gradually falls toward the (0 + 1) * tokens / file_size baseline that protects fresh stores. With a 7-day half-life, a 10-hit file untouched for 14 days has its bonus shrunk to 10 × 2⁻² = 2.5, after 70 days to 10 / 1024 ≈ 0.01, etc.

Default is 0, which disables decay and preserves the current behavior bit-for-bit. The startup log was extended to print the configured half-life so the option is visible at boot.

Why this shape

  • Exponential, not "lose one hit per week". Continuous decay avoids cliff effects at week boundaries and needs one fewer parameter than a step schedule. The 2 ** -(elapsed / half_life) form is exp2, which is declared in <math.h> and covered by the existing -lm link line.
  • Decay multiplies hits, not score. Keeping the + 1 floor outside the decay means a brand-new entry still gets exactly tokens / file_size, identical to today. The fix is purely about reducing the bonus of stale entries, never penalizing fresh ones.
  • Composes with protected_sha. Decay applies only on the hit-weighted branch below the protected-current-store short-circuit. The brand-new file stays protected by its SHA; a stale once-popular file gets demoted by age. The two guards target different failure modes.
  • Opt-in. Default 0 is no behavior change, so existing deployments are unaffected.
  • Days as the unit. Seconds would be more parallel with last_used, but a half-life of a few seconds is unusable in practice; days give a sane scale for the workloads where this matters.

Test plan

New unit test test_kv_cache_eviction_score_decays_stale_hits exercises the score function directly with two entries:

  • stale: 1024 tokens, 10 hits, 4096 bytes, last_used = 0
  • fresh: 2048 tokens, 0 hits, 4096 bytes, last_used = 1000

evaluated at now = 1000 + 14 × 7 × 86400 (exactly 14 half-lives past fresh's last_used, and 1000 seconds more than that past stale's). Both calls pass NULL for protected_sha so they exercise the hit-weighted branch this PR modifies, not the short-circuit. The test asserts:

  1. With half_life = 0, stale outscores fresh (current behavior preserved).
  2. With half_life = 7 days, fresh outscores stale (decay flips the order).
  3. fresh's decayed score equals exactly (0 + 1) * tokens / file_size (the floor never erodes).

Run: make ds4_test && ./ds4_test --server

Existing tests test_kv_cache_eviction_values_fresh_snapshots, test_kv_cache_eviction_protects_current_store, test_kv_cache_eviction_does_not_protect_oversize_current_store, and test_kv_cache_eviction_keeps_aligned_continued_frontiers continue to pass.

Diff size

One file, +58 / -6. Two of those lines are the new #include <math.h> and one struct field; the rest is the score function update, the new CLI branch, the help-text and startup-log strings, and the test.

The disk cache eviction score is (hits + 1) * tokens / file_size.
hits is monotonic: it only ever goes up on a successful disk reuse.
That is fine while the workload stays the same, but it locks
once-popular files in place after a prompt schema change (system
prompt rewritten, tools removed, model swapped, etc.).  The stale
file can no longer match anything, so it never accumulates new hits,
but its existing bonus keeps it well above every freshly-stored entry
indefinitely.

Add an optional time-based decay so the bonus erodes when an entry
goes untouched.  When --kv-cache-hit-half-life-days N is set, the
score uses

    effective_hits = hits * 2 ** -((now - last_used) / half_life)

so a matching prompt that refreshes last_used keeps the entry hot,
and an entry that stops getting hits gradually falls toward the
(0 + 1) baseline that protects fresh stores.  Default is 0, which
disables decay and preserves the current behavior exactly.
@antirez
Owner

antirez commented May 15, 2026

Thanks! I would make this the default behavior actually... too many options are not a good idea if there is a much better single choice.

antirez added a commit that referenced this pull request May 15, 2026
PR #162 introduced the right score shape by decaying only the hit bonus while preserving the token/byte baseline. Make that behavior unconditional, so that stale once-hot checkpoints can no longer stay pinned just because the operator did not opt in. Remove the extra CLI knob, keep the half-life visible in startup logs, and add an eviction test for same-density fresh checkpoints versus stale high-hit files.

Fixes #161.
@antirez antirez merged commit 277c198 into antirez:main May 15, 2026
@unsaltedbutter-ai unsaltedbutter-ai deleted the feat/kv-cache-hit-decay branch May 16, 2026 19:19
espentrydal pushed a commit to espentrydal/ds4 that referenced this pull request May 16, 2026