AIME dataset support? #141

@Zhaojp-Frank

Description

Bug

Hi there,
Thanks for the wonderful kvpress.
I'm reading the "Expected Attention" paper, which reports good accuracy on AIME2025.
However, no AIME dataset is supported in evaluate_registry.py.

  1. First, to confirm: is ExpectedAttn on AIME2025 supported, e.g. using DeepSeek-R1-Qwen-7b? Is there a script to reproduce Fig. 4?
  • Or should I simply run the ExpectedAttn pipeline with an AIME25 question as input?
  2. Since AIME25 combines a short prompt with a long decoding phase, does KV eviction happen on demand during the decode stage?

To Reproduce

No AIME dataset is listed in evaluation/evaluate_registry.py.

Repository version

master
