Upgrade llama.cpp b8648 → b8664; expose KV cache idle-eviction API #71

Merged
bernardladenthin merged 1 commit into master from claude/review-api-b8664-NCrWa on Apr 4, 2026
@bernardladenthin
Owner

Version bump:

  • CMakeLists.txt GIT_TAG b8648 → b8664
  • README.md llama.cpp badge and link updated
  • CLAUDE.md pinned version updated
  • README.md Java badge corrected from 11+ to 8+ (pom.xml already targets 1.8)

New ModelParameters setters. The three work together to enable idle-slot eviction, a b8664 feature that saves the KV state of idle slots to RAM and restores it on a cache hit, freeing GPU memory between requests:

  • setKvUnified(boolean) → --kv-unified / --no-kv-unified
  • setCacheRamMib(int) → --cache-ram N (MiB; -1=unlimited, 0=disabled)
  • setClearIdle(boolean) → --clear-idle / --no-clear-idle (new in b8664)

Each of the two boolean setters removes the opposite flag before adding its own, so calling a setter again with the other value is safe. All three setters are covered by 13 new unit tests in ModelParametersExtendedTest.
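
For illustration, the flag-flipping behaviour described above could look roughly like the following. This is a minimal, self-contained sketch and not the project's code: `IdleEvictionParams`, `setSwitch`, and `toCommandLine` are hypothetical names invented here, and only the three setter names and the CLI flags come from this PR. It sticks to Java 8 syntax to match the pom.xml target.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for ModelParameters; NOT the project's actual class. */
final class IdleEvictionParams {
    // Flag name -> value (null for value-less switches). Insertion order is kept
    // so the rendered argument string is predictable.
    private final Map<String, String> args = new LinkedHashMap<>();

    IdleEvictionParams setKvUnified(boolean enabled) {
        return setSwitch("--kv-unified", "--no-kv-unified", enabled);
    }

    IdleEvictionParams setClearIdle(boolean enabled) {
        return setSwitch("--clear-idle", "--no-clear-idle", enabled);
    }

    /** Size in MiB; per the PR description, -1 means unlimited and 0 disables the cache. */
    IdleEvictionParams setCacheRamMib(int mib) {
        args.put("--cache-ram", Integer.toString(mib));
        return this;
    }

    // Assumed helper: drop the opposite flag first, then record the requested one,
    // so the positive and negative forms can never both be present.
    private IdleEvictionParams setSwitch(String positive, String negative, boolean enabled) {
        args.remove(enabled ? negative : positive);
        args.put(enabled ? positive : negative, null);
        return this;
    }

    /** Renders the collected flags as a space-separated argument string. */
    String toCommandLine() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : args.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(e.getKey());
            if (e.getValue() != null) {
                sb.append(' ').append(e.getValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] argv) {
        String cli = new IdleEvictionParams()
                .setKvUnified(true)
                .setCacheRamMib(4096)
                .setClearIdle(false) // records --no-clear-idle
                .setClearIdle(true)  // flips it: removes --no-clear-idle, adds --clear-idle
                .toCommandLine();
        System.out.println(cli); // prints "--kv-unified --cache-ram 4096 --clear-idle"
    }
}
```

Keeping the flags in a map keyed by flag name means a repeated `setCacheRamMib` call overwrites the earlier value rather than appending a second `--cache-ram`. How the real ModelParameters stores its arguments is not shown in this PR.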

https://claude.ai/code/session_01UqHGktHkNpEri7QLZE5cjr

@bernardladenthin merged commit 14f31b1 into master on Apr 4, 2026
16 checks passed
@bernardladenthin deleted the claude/review-api-b8664-NCrWa branch on April 4, 2026, 20:56