OpenCompass v0.2.2.rc1

Pre-release
@Leymore released this 06 Feb 17:33
· 316 commits to main since this release
e257254

This release provides more parsed datasets:

OpenCompassData-core-20240207.zip
OpenCompassData-complete-20240207.zip

Important updates compared to the previous version are as follows:

  • Subjective: Add AlignBench, MTBench
  • Agent: Add T-Eval
  • Medicine: Add MedBench
  • Code: Add HumanEval-X, DS-1000
  • Finance: Add FinanceIQ
  • Law: Update LawBench Evaluation Assets

Datasets included in OpenCompassData-core-20240207.zip:

AGIEval ARC BBH ceval CLUE cmmlu
commonsenseqa drop FewCLUE flores_first100 GAOKAO-BENCH gsm8k
hellaswag humaneval lambada LCSTS math mbpp
mmlu nq openbookqa piqa race siqa
strategyqa summedits SuperGLUE TheoremQA triviaqa tydiqa
winogrande xstory_cloze Xsum

Datasets included in OpenCompassData-complete-20240207.zip:

AGIEval anli ARC BBH CDME ceval
cibench_dataset cleva clozeTest-maxmin CLUE CMB cmmlu
commonsenseqa commonsenseqa_cn crowspairs_cn drop ds1000_data FewCLUE
FinanceIQ flores200_dataset flores_first100 FunctionalMT game24 GAOKAO-BENCH
gpqa gsm8k hellaswag humaneval humaneval_cn humaneval_multipl-e
humanevalx HungarianExamMath InfiniteBench lambada lanQ lawbench
LCSTS math math401 mbpp mbpp_cn mbpp_plus
MedBench mmlu MNIST NPHardEval nq nq_cn
nq-open openbookqa piqa py150 qabench race
scibench siqa SQuAD2.0 strategyqa alignment_bench mtbench
summedits SuperGLUE svamp teval TheoremQA triviaqa
tydiqa winogrande xiezhi xlsum xstory_cloze Xsum
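
To check a downloaded archive against the dataset lists above, something like the following sketch may help. It uses only the Python standard library. The helper names `list_dataset_dirs` and `missing_datasets` are hypothetical, and the `depth=1` default assumes that each archive wraps its datasets in a single top-level folder, which these notes do not confirm; pass `depth=0` if the dataset folders sit at the archive root.

```python
import zipfile
from pathlib import PurePosixPath


def list_dataset_dirs(zip_path, depth=1):
    """Return sorted directory names found at `depth` inside the archive.

    depth=0 looks at top-level entries; depth=1 looks one level below a
    wrapper folder (an assumption about the layout, not a documented fact).
    """
    names = set()
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            parts = PurePosixPath(member).parts
            has_children = len(parts) > depth + 1
            is_dir_entry = member.endswith("/") and len(parts) == depth + 1
            # Count a name only if it is a directory, not a loose file.
            if has_children or is_dir_entry:
                names.add(parts[depth])
    return sorted(names)


def missing_datasets(zip_path, expected, depth=1):
    """Return the expected dataset names that are absent from the archive."""
    found = set(list_dataset_dirs(zip_path, depth))
    return sorted(set(expected) - found)


if __name__ == "__main__":
    # A few names copied from the core list above; extend as needed.
    expected_core = ["AGIEval", "ARC", "BBH", "ceval", "gsm8k", "mmlu"]
    print(missing_datasets("OpenCompassData-core-20240207.zip", expected_core))
```

An empty result means every expected name was found at the chosen depth. A result listing every name usually means the `depth` assumption is wrong rather than that the archive is incomplete.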