Skip to content

Add Expected Attention with Stats #120

Merged
alessiodevoto merged 10 commits intomainfrom
ea
Aug 29, 2025
Merged

Add Expected Attention with Stats #120
alessiodevoto merged 10 commits intomainfrom
ea

Conversation

@alessiodevoto
Copy link
Copy Markdown
Collaborator

PR description

So far Expected Attention only supports non QK norm models (no support for Gemma and Qwen) and it requires computing mean and covariance at inference time. This PR adds the option to load precomputed stats if they are available, or compute them on the fly if not. There are two main contributions:

  • Expected attention with stats
  • Change to the base press to have it work with subpresses that require post_init_from_model.

Checklist

  • Tests are working (make test)
  • Code is formatted correctly (make style, on errors try fix with make format)
  • Copyright header is included
  • All commits are signed-off using git commit -s
  • (new press) mypress_press.py is in the presses directory
  • (new press) MyPress is in __init__.py
  • (new press) README.md is updated with a 1 liner about the new press in the Available presses section
  • (new press) New press is in the default_presses list in tests/default_presses.py
  • (new press) A docstring is provided that follows the same structure as the existing ones

Copy link
Copy Markdown
Collaborator

@maxjeblick maxjeblick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for opening the PR!

I've left some comments; the most important one is probably the one about how to compute mean query and mean covariance.

Comment thread kvpress/presses/expected_attention_press.py Outdated
Comment thread kvpress/presses/expected_attention_press.py Outdated
Comment thread kvpress/presses/base_press.py Outdated
Comment thread kvpress/presses/expected_attention_press.py Outdated
Comment thread kvpress/presses/utils.py
Comment thread kvpress/presses/utils.py Outdated
Comment thread kvpress/presses/utils.py Outdated
Comment thread kvpress/presses/expected_attention_press.py Outdated
Comment thread kvpress/presses/utils.py Outdated
Comment thread kvpress/presses/utils.py Outdated
Comment thread evaluation/evaluate_config.yaml Outdated
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Aug 26, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@maxjeblick
Copy link
Copy Markdown
Collaborator

/ok to test

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Aug 26, 2025

/ok to test

@maxjeblick, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

@maxjeblick
Copy link
Copy Markdown
Collaborator

/ok to test 24ef932

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>
Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>
Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>
@alessiodevoto
Copy link
Copy Markdown
Collaborator Author

I have refactored the code significantly (sorry 😅), before computing and uploading stats to the hub and adding the press to tests and press list I would like to hear your opinion :)

Major changes:

  • updated ea code and made it more modular (this way it is easier to subclass)
  • instead of having a separate file just for computing stats, I put everything inside the press file. This way it stays self-contained and cleaner.

Let me know wdyt @maxjeblick 🙂

@maxjeblick
Copy link
Copy Markdown
Collaborator

maxjeblick commented Aug 27, 2025

Hi!
Refactoring (moving to another file) looks good.
The code itself needs some clean up, see my other comments above.
Id' also prefer having a main function with fire.Fire instead of argsparse.

@alessiodevoto
Copy link
Copy Markdown
Collaborator Author

alessiodevoto commented Aug 27, 2025

Thanks Max!
No problem I will update later as well.

  • For fire instead of argparse -> Ok.
  • As for the comments, I think I addressed all of them in this PR. The only difference that I wanted to double check is whether we should have an external file for stats computation or it is better to keep it self contained like for duo attention.

@maxjeblick
Copy link
Copy Markdown
Collaborator

As for the comments, I think I addressed all of them in this PR. The only difference that I wanted to double check is whether we should have an external file for stats computation or it is better to keep it self contained like for duo attention.

I had one comment (it is collapsed above) about the if-else logic returning collected_queries, mean_query, cov_query. I think this can be simplified to always return those values.

The folder structure looks fine, it is nice that it is self-contained.

Comment thread kvpress/presses/expected_attention_with_stats.py
Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>
Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>
@alessiodevoto
Copy link
Copy Markdown
Collaborator Author

Got it, I simplified the if else logic. Also, there was a bug in the computation (I was computing the stats but forgot to save them), now it is fixed.

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>
Comment thread kvpress/presses/expected_attention_with_stats.py Outdated
Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>
Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>
@alessiodevoto
Copy link
Copy Markdown
Collaborator Author

/ok to test e4157b2

@alessiodevoto
Copy link
Copy Markdown
Collaborator Author

Update:

  • computed stats both for mock testing model and Qwen (for future)
  • added tests (they download stats from the hub)

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>
@alessiodevoto
Copy link
Copy Markdown
Collaborator Author

/ok to test 7f93961

@alessiodevoto
Copy link
Copy Markdown
Collaborator Author

Had a dependency error because of "fire", fixed. All tests pass now 👌

Signed-off-by: alessiodevoto <devoto.alessio@gmail.com>
@alessiodevoto
Copy link
Copy Markdown
Collaborator Author

Forgot to clean some comments, now it's should be /ok to test bc27dd8

@alessiodevoto
Copy link
Copy Markdown
Collaborator Author

/ok to test bc27dd8

Copy link
Copy Markdown
Collaborator

@maxjeblick maxjeblick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, great work!

@alessiodevoto alessiodevoto merged commit a908e99 into main Aug 29, 2025
3 checks passed
@alessiodevoto alessiodevoto deleted the ea branch August 29, 2025 11:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants