Add EntropyAdaptivePress: per-layer adaptive eviction via attention entropy by jagmarques · Pull Request #206 · NVIDIA/kvpress

jagmarques · 2026-04-08T09:50:39Z

Extends ObservedAttentionPress. Layers with peaked attention (structured text) get evicted more aggressively. Layers with uniform attention (creative text) keep more tokens.

No dead code, no external benchmark claims. Just the press and 4 tests.

Tests:

Smoke test: press runs on unit_test_model_output_attention
Compression: cache size < input size
Ratio bounds: min_ratio/max_ratio respected
Zero compression: all tokens kept when ratio=0

What changed:

kvpress/presses/entropy_adaptive_press.py (63 lines)
tests/presses/test_entropy_adaptive_press.py (65 lines)

Extends ObservedAttentionPress so it inherits the attention-based scoring. The entropy computation modulates scores per-layer: uniform attention layers get a score boost (keep more tokens), peaked attention layers don't (evict more).
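To make the per-layer idea concrete, here is a minimal standalone sketch in plain Python. It is not the code from this PR. The PR modulates inherited attention scores, whereas this sketch takes a simpler route for illustration and maps a layer's normalized attention entropy directly to a compression ratio. The function names, the linear interpolation, and the default `min_ratio`/`max_ratio` values are all assumptions.

```python
import math


def normalized_entropy(attn_row):
    """Shannon entropy of one attention distribution, scaled to [0, 1]."""
    n = len(attn_row)
    if n < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in attn_row if p > 0.0)
    return h / math.log(n)


def layer_entropy(attn):
    """Mean normalized entropy over all query rows of one layer."""
    return sum(normalized_entropy(row) for row in attn) / len(attn)


def adaptive_ratio(entropy, min_ratio=0.2, max_ratio=0.8):
    """Hypothetical mapping: peaked layers (entropy near 0) get max_ratio,
    uniform layers (entropy near 1) get min_ratio."""
    entropy = min(max(entropy, 0.0), 1.0)  # clamp so the bounds always hold
    return max_ratio - entropy * (max_ratio - min_ratio)


# Toy attention maps: two query rows over four keys each.
peaked = [[0.97, 0.01, 0.01, 0.01]] * 2
uniform = [[0.25, 0.25, 0.25, 0.25]] * 2

peaked_ratio = adaptive_ratio(layer_entropy(peaked))
uniform_ratio = adaptive_ratio(layer_entropy(uniform))
```

With these toy inputs the peaked layer receives a higher compression ratio than the uniform one, and both stay inside `[min_ratio, max_ratio]`, which is the behaviour the "Ratio bounds" test describes.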

Extends ObservedAttentionPress. Layers with peaked attention (structured text) get evicted more aggressively. Layers with uniform attention (creative text) keep more tokens. Includes 4 tests using the unit_test_model_output_attention fixture.

Signed-off-by: João André Gomes Marques <joaoagm90@gmail.com>

copy-pr-bot · 2026-04-08T09:50:44Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

jagmarques wants to merge 1 commit into NVIDIA:main from jagmarques:add-entropy-adaptive-press
