feat(kv-cache): optional hit-count decay for eviction score #162
Merged
The disk cache eviction score is (hits + 1) * tokens / file_size.
hits is monotonic: it only ever goes up on a successful disk reuse.
That is fine while the workload stays the same, but it locks
once-popular files in place after a prompt schema change (system
prompt rewritten, tools removed, model swapped, etc.). The stale
file can no longer match anything, so it never accumulates new hits,
but its existing bonus keeps it well above every freshly-stored entry
indefinitely.
Add an optional time-based decay so the bonus erodes when an entry
goes untouched. When --kv-cache-hit-half-life-days N is set, the
score uses
effective_hits = hits * 2 ** -((now - last_used) / half_life)
so a matching prompt that refreshes last_used keeps the entry hot,
and an entry that stops getting hits gradually falls toward the
(0 + 1) baseline that protects fresh stores. Default is 0, which
disables decay and preserves the current behavior exactly.
Owner
Thanks! I would make this the default behavior actually... too many options are not a good idea if there is a much better single choice.
antirez added a commit that referenced this pull request May 15, 2026
PR #162 introduced the right score shape by decaying only the hit bonus while preserving the token/byte baseline. Make that behavior unconditional so stale once-hot checkpoints cannot remain pinned unless the operator opts in. Remove the extra CLI knob, keep the half-life visible in startup logs, and add an eviction test for same-density fresh checkpoints versus stale high-hit files.
Fixes #161.
espentrydal pushed a commit to espentrydal/ds4 that referenced this pull request May 16, 2026
PR antirez#162 introduced the right score shape by decaying only the hit bonus while preserving the token/byte baseline. Make that behavior unconditional so stale once-hot checkpoints cannot remain pinned unless the operator opts in. Remove the extra CLI knob, keep the half-life visible in startup logs, and add an eviction test for same-density fresh checkpoints versus stale high-hit files.
Fixes antirez#161.
This is one way to solve issue #161
Problem
The disk-cache eviction score in kv_entry_eviction_score (ds4_server.c:8696-8729) is (hits + 1) * tokens / file_size for every non-protected entry. hits is monotonic: it only ever goes up on a successful disk reuse, and last_used is captured at write time but never feeds back into the score. That works fine while the workload stays the same, but it lets a once-popular file dominate forever after any prompt schema change.
Concrete case: my cache directory currently holds this top entry by score:
That's a 7-hit cold snapshot of an old Hermes system prompt. I shrank that prompt yesterday by removing some tools, so the file can never match an incoming prompt again. But its score (50.9) is 7× the next-highest entry, and the policy will keep it pinned at the top forever. Newly stored prefixes from the new prompt start at a (0 + 1) baseline and can never accumulate a first hit while the cache is at budget.
There is no LRU sweep, no TTL, no decay. Once a file earns hits, those hits never leave the file.
Change
Add --kv-cache-hit-half-life-days N. When N > 0, the score uses
effective_hits = hits * 2 ** -((now - last_used) / half_life)
instead of hits. A matching prompt that refreshes last_used keeps the entry hot, and an entry that stops getting hits gradually falls toward the (0 + 1) * tokens / file_size baseline that protects fresh stores. With a 7-day half-life, a 10-hit file untouched for 14 days has its bonus shrunk to 10 × 2⁻² = 2.5, and after 70 days to 10 / 1024 ≈ 0.01.
Default is 0, which disables decay and preserves the current behavior bit-for-bit. The startup log was extended to print the configured half-life so the option is visible at boot.
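As a worked check of the numbers in this section, write h for the stored hit count, T for the half-life and Δt for the time since last_used. The symbols are shorthand introduced here, not names from the code.

```latex
\[
h_{\mathrm{eff}}(\Delta t) = h \cdot 2^{-\Delta t / T},
\qquad
\mathrm{score} = \bigl(h_{\mathrm{eff}}(\Delta t) + 1\bigr)\,
                 \frac{\mathrm{tokens}}{\mathrm{file\_size}}
\]
\[
h = 10,\; T = 7\,\mathrm{d}:\qquad
h_{\mathrm{eff}}(14\,\mathrm{d}) = 10 \cdot 2^{-2} = 2.5,
\qquad
h_{\mathrm{eff}}(70\,\mathrm{d}) = 10 \cdot 2^{-10} = \frac{10}{1024} \approx 0.0098
\]
```

As Δt grows, the effective hit count tends to 0 and the score tends to tokens / file_size, which is the same value a zero-hit entry receives.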
Why this shape
- The 2 ** -elapsed/half_life form is exp2, which is in <math.h> and already in the existing -lm link line.
- The + 1 floor outside the decay means a brand-new entry still gets exactly tokens / file_size, identical to today. The fix is purely about reducing the bonus of stale entries, never penalizing fresh ones.
- protected_sha. Decay applies only on the hit-weighted branch below the protected-current-store short-circuit. The brand-new file stays protected by its SHA; a stale once-popular file gets demoted by age. The two guards target different failure modes.
- [...] last_used, but a half-life this short is unusable in practice; days picks a sane scale for the workloads where this matters.
Test plan
New unit test test_kv_cache_eviction_score_decays_stale_hits exercises the score function directly with two entries:
- stale: 1024 tokens, 10 hits, 4096 bytes, last_used = 0
- fresh: 2048 tokens, 0 hits, 4096 bytes, last_used = 1000
evaluated at now = 1000 + 14 × 7 × 86400 (14 half-lives past fresh's last_used, and 1000 seconds more than that past stale's). Both calls pass NULL for protected_sha so they exercise the hit-weighted branch this PR modifies, not the short-circuit. The test asserts:
- With half_life = 0, stale outscores fresh (current behavior preserved).
- With half_life = 7 days, fresh outscores stale (decay flips the order).
- fresh's decayed score equals exactly (0 + 1) * tokens / file_size (the floor never erodes).
Run:
make ds4_test && ./ds4_test --server
Existing tests test_kv_cache_eviction_values_fresh_snapshots, test_kv_cache_eviction_protects_current_store, test_kv_cache_eviction_does_not_protect_oversize_current_store, and test_kv_cache_eviction_keeps_aligned_continued_frontiers continue to pass.
Diff size
One file, +58 / -6. Two of those lines are the new #include <math.h> and one struct field; the rest is the score function update, the new CLI branch, the help-text and startup-log strings, and the test.