
refactor: optimized covariance transform in ExpectedAttentionPress #111

Merged
alessiodevoto merged 1 commit into NVIDIA:main from neuralsorcerer:patch-1
Aug 7, 2025
Conversation

@neuralsorcerer
Contributor

Changes:

  • Compute the per-head query covariance directly in the projected query space, avoiding the intermediate $O((n \cdot d)^2)$ hidden-state covariance tensor.

Why?

  • A quick benchmark with n=32, d=128, seq_len=64 showed the old method taking ~1.87 s and storing 33,554,432 elements, vs. ~0.04 s and 1,048,576 elements for this approach: about 32x less memory and ~50x faster.
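The idea above can be sketched as follows. This is a minimal NumPy illustration of the technique, not the KVPress implementation: the function name, shapes, and the query projection `W_q` are hypothetical stand-ins. Instead of forming a covariance over the full hidden dimension (an (n*d) x (n*d) tensor), the hidden states are projected to queries first, split into heads, and a small d x d covariance is computed per head.

```python
import numpy as np

def per_head_query_cov(h, W_q, n_heads, head_dim):
    """Per-head covariance of projected queries.

    h:   (seq_len, n_heads * head_dim) hidden states
    W_q: (n_heads * head_dim, n_heads * head_dim) query projection
    Returns an (n_heads, head_dim, head_dim) tensor -- n * d^2 elements,
    instead of the (n*d)^2 elements of a hidden-state-space covariance.
    """
    q = h @ W_q                           # project: (seq, n*d)
    q = q.reshape(-1, n_heads, head_dim)  # split heads: (seq, n, d)
    q = q - q.mean(axis=0, keepdims=True) # center each head's queries
    # Sum outer products over the sequence axis, one d x d block per head.
    return np.einsum("snd,sne->nde", q, q) / q.shape[0]

# Toy shapes (the PR's benchmark used n=32, d=128, seq_len=64):
rng = np.random.default_rng(0)
n_heads, head_dim, seq_len = 4, 8, 16
hidden = n_heads * head_dim
h = rng.standard_normal((seq_len, hidden))
W_q = rng.standard_normal((hidden, hidden))
cov = per_head_query_cov(h, W_q, n_heads, head_dim)
print(cov.shape)  # (4, 8, 8)
```

Each per-head covariance is symmetric positive semi-definite by construction, and the memory footprint scales as n * d^2 rather than (n * d)^2, which matches the ~32x reduction reported in the benchmark.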

Signed-off-by: Soumyadip Sarkar <soumya.papanvk18@gmail.com>
Collaborator

@alessiodevoto left a comment


Hi @neuralsorcerer! I just reviewed and tested the code and I believe we can merge. Thanks for opening this PR and contributing to KVPress!

alessiodevoto merged commit e079b22 into NVIDIA:main on Aug 7, 2025
3 checks passed
neuralsorcerer deleted the patch-1 branch on August 7, 2025 09:04
maxjeblick pushed a commit that referenced this pull request Aug 12, 2025
Signed-off-by: Max Jeblick <maximilianjeblick@gmail.com>

3 participants