Pull requests: EleutherAI/lm-evaluation-harness

Pull requests list

Fix add_bos_token not updated for Gemma tokenizer
#3206 opened Aug 4, 2025 by DarkLight1337
Support for DDP+MP with native torch and no accelerate
#3205 opened Aug 3, 2025 by xgal
Add new task: kmmlu_pro, kmmlu_redux
#3198 opened Aug 1, 2025 by jeonghodot
Add xnli_va dataset
#3194 opened Jul 30, 2025 by FranValero97
refactor registry
#3189 opened Jul 28, 2025 by baberabb
Add LM-SynEval Benchmark
#3184 opened Jul 24, 2025 by jmichaelov
Update MMLU-ProX task
#3174 opened Jul 22, 2025 by weihao1115
3 of 6 tasks
feat: Add CLIcK task
#3173 opened Jul 22, 2025 by shing100
Add eqbench tasks in Spanish and Catalan
#3168 opened Jul 21, 2025 by priverabsc
Add EsBBQ and CaBBQ tasks
#3167 opened Jul 21, 2025 by valleruizf
Bugfix: update hellaswag ds path
#3158 opened Jul 18, 2025 by marawangamal
Add task dynamic_ifeval
#3149 opened Jul 15, 2025 by davideguidobene
Add DETAILS.md for improved documentation
#3141 opened Jul 12, 2025 by ginylil-tech
Add tasklist
#3133 opened Jul 11, 2025 by baberabb
set repeat metrics from config
#3109 opened Jul 5, 2025 by baberabb
add metric configs
#3105 opened Jul 4, 2025 by baberabb