AIME dataset support? #141

### Bug
Hi there,
Thanks for the wonderful kvpress.
I'm reading the paper "Expected Attention", which reports good accuracy on AIME2025.
However, no AIME dataset is supported in evaluate_registry.py.

1. First, to confirm: is ExpectedAttn on AIME2025 supported, e.g. using DeepSeek-R1-Qwen-7b? Is there a script to reproduce Fig. 4?
   - Or should I simply run the ExpectedAttn pipeline with the AIME25 questions as input?

2. AIME25 has short prompts and long decoded outputs, so does KV eviction happen on demand during the decode stage?
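To make the second question concrete, here is a toy sketch of what I mean by on-demand eviction during decoding. It is not kvpress code: `decode_with_budget`, `toy_score`, and the `budget` parameter are hypothetical names, and the score is an arbitrary stand-in for whatever importance estimate a press would compute.

```python
from typing import Callable, List, Tuple


def toy_score(position: int) -> int:
    # Hypothetical importance score; an arbitrary deterministic stand-in,
    # not the Expected Attention score from the paper.
    return (position * 7) % 11


def decode_with_budget(
    prompt_len: int,
    new_tokens: int,
    budget: int,
    score: Callable[[int], int] = toy_score,
) -> Tuple[List[int], int]:
    """Simulate a KV cache that is pruned during decoding.

    The cache is modelled as a list of token positions. After each
    generated token is appended, the lowest-scoring entries are evicted
    until the cache fits within `budget` again.
    """
    cache = list(range(prompt_len))  # entries written during prefill
    evictions = 0
    for step in range(new_tokens):
        cache.append(prompt_len + step)  # entry for the newly decoded token
        while len(cache) > budget:
            victim = min(cache, key=score)  # least important entry
            cache.remove(victim)
            evictions += 1
    return cache, evictions


if __name__ == "__main__":
    kept, evictions = decode_with_budget(prompt_len=4, new_tokens=6, budget=8)
    print(f"kept {len(kept)} entries after {evictions} evictions")
```

With a short prompt and a long generation, almost all evictions in this sketch are triggered by decoded tokens rather than by the prompt, which is the case I am asking about.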

### To Reproduce

No AIME dataset is listed in evaluation/evaluate_registry.py.

### Repository version
master
