Open
Description
Evaluation short description
IFEval is widely used to measure the instruction-following capabilities of LLMs, but it is now largely saturated. IFBench is an improved version of IFEval that covers a broader diversity of task constraints. I think IFBench is likely to provide better signal for the next generation of models.
Evaluation metadata
Provide all available
- Paper url: https://arxiv.org/abs/2507.02833
- Github url: https://github.com/allenai/IFBench
- Dataset url: https://huggingface.co/collections/allenai/ifbench-683f590687f61b512558cdf1