[EVAL] Add IFBench

## Evaluation short description
IFEval is widely used to measure the instruction-following capabilities of LLMs, but it is now largely saturated. IFBench is an improved version of IFEval with a broader, more diverse set of task constraints. I think IFBench is likely to provide better signal for the next generation of models.
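
For anyone unfamiliar with this style of eval, here is a minimal sketch of what a "task constraint" check could look like. It assumes constraints are verified programmatically against the model response. The function names (`check_min_words`, `check_keyword_count`, `score`) and the two example constraints are hypothetical illustrations, not IFBench's actual constraints or API.

```python
import re
from typing import Callable


def check_min_words(response: str, min_words: int) -> bool:
    """Hypothetical constraint: at least `min_words` whitespace-separated words."""
    return len(response.split()) >= min_words


def check_keyword_count(response: str, keyword: str, times: int) -> bool:
    """Hypothetical constraint: `keyword` appears exactly `times` times
    as a whole word, ignoring case."""
    pattern = rf"\b{re.escape(keyword)}\b"
    return len(re.findall(pattern, response, flags=re.IGNORECASE)) == times


def score(response: str, checks: list[tuple[Callable[..., bool], dict]]) -> tuple[bool, float]:
    """Return (all constraints satisfied, fraction of constraints satisfied)."""
    results = [fn(response, **kwargs) for fn, kwargs in checks]
    return all(results), sum(results) / len(results)


if __name__ == "__main__":
    response = "The cat sat. The cat slept."
    checks = [
        (check_min_words, {"min_words": 5}),
        (check_keyword_count, {"keyword": "cat", "times": 2}),
    ]
    print(score(response, checks))
```

A broader set of constraints would mean many more checkers of this kind, which is where the extra signal over a saturated benchmark would come from.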

## Evaluation metadata
Provide all available
- Paper url: https://arxiv.org/abs/2507.02833
- Github url: https://github.com/allenai/IFBench
- Dataset url: https://huggingface.co/collections/allenai/ifbench-683f590687f61b512558cdf1


[EVAL] Add IFBench #908
