add task for mmlu evaluation in arc multiple choice format #1745

Merged
merged 2 commits into from
May 8, 2024
11 changes: 11 additions & 0 deletions lm_eval/tasks/mmlu/continuation/_continuation_template_yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
dataset_path: hails/mmlu_no_train # a copy of `cais/mmlu` with no auxiliary_train split
output_type: multiple_choice
test_split: test
fewshot_split: dev
fewshot_config:
  sampler: first_n
doc_to_text: "Question: {{question.strip()}}\nAnswer:"
doc_to_choice: "{{choices}}"
doc_to_target: "{{answer}}"
metadata:
  version: 0.0
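
As a rough illustration of what the template above asks for, the sketch below mimics the three `doc_to_*` fields in plain Python. It is not the harness's own rendering or scoring code: `build_requests`, `pick_answer`, the sample document, and the scores are all hypothetical, and the single leading space on each continuation assumes the harness's default target delimiter.

```python
# Illustrative only: mimics the template fields with plain Python,
# not the harness's Jinja2 rendering or its scoring code.


def build_requests(doc):
    """Return (context, continuations, gold_index) for one MMLU-style doc."""
    # doc_to_text: "Question: {{question.strip()}}\nAnswer:"
    context = "Question: " + doc["question"].strip() + "\nAnswer:"
    # doc_to_choice: "{{choices}}" -> the full answer strings, not letters A-D.
    # Assumption: a single space separates the context from each continuation.
    continuations = [" " + choice for choice in doc["choices"]]
    # doc_to_target: "{{answer}}" -> integer index of the correct choice.
    return context, continuations, int(doc["answer"])


def pick_answer(loglikelihoods):
    """Return the index of the continuation with the highest score."""
    return max(range(len(loglikelihoods)), key=lambda i: loglikelihoods[i])


# Hypothetical document shaped like a row of the MMLU dataset.
doc = {
    "question": " What is the order of the cyclic group Z_6? ",
    "choices": ["2", "3", "6", "12"],
    "answer": 2,
}
context, continuations, gold = build_requests(doc)

# Made-up numbers standing in for a model's per-continuation log-likelihoods.
scores = [-4.1, -3.7, -1.2, -5.0]
is_correct = pick_answer(scores) == gold
```

The point of the format is that the model scores the answer text itself as a continuation of the question, in the style of the ARC tasks, rather than choosing among lettered options listed in the prompt.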
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/_mmlu.yaml
@@ -0,0 +1,6 @@
group: mmlu_continuation
task:
- mmlu_continuation_stem
- mmlu_continuation_other
- mmlu_continuation_social_sciences
- mmlu_continuation_humanities
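
The two-level grouping (one top-level group, four category groups, and per-subject tasks that each name their category in `group`) can be pictured with the small sketch below. The `GROUPS` and `SUBJECT_GROUP` tables and the `expand` helper are hypothetical stand-ins, not the harness's task registry, and only four subjects from this PR are included.

```python
# Hypothetical, simplified picture of the group hierarchy in this PR.
GROUPS = {
    "mmlu_continuation": [
        "mmlu_continuation_stem",
        "mmlu_continuation_other",
        "mmlu_continuation_social_sciences",
        "mmlu_continuation_humanities",
    ],
}

# Each subject task declares its category through its own `group` key.
SUBJECT_GROUP = {
    "mmlu_continuation_abstract_algebra": "mmlu_continuation_stem",
    "mmlu_continuation_business_ethics": "mmlu_continuation_other",
    "mmlu_continuation_econometrics": "mmlu_continuation_social_sciences",
    "mmlu_continuation_formal_logic": "mmlu_continuation_humanities",
}


def expand(name):
    """Resolve a group or task name to a list of leaf task names."""
    if name in GROUPS:
        leaves = []
        for child in GROUPS[name]:
            leaves.extend(expand(child))
        return leaves
    # A category group: collect every subject that points at it.
    members = sorted(t for t, g in SUBJECT_GROUP.items() if g == name)
    # A name with no members is treated as a leaf task.
    return members if members else [name]


all_tasks = expand("mmlu_continuation")
```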
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_abstract_algebra.yaml
@@ -0,0 +1,6 @@
"dataset_name": "abstract_algebra"
"description": "The following are questions (with answers) about abstract\
\ algebra.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_abstract_algebra"
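
The per-subject files all share this five-key shape and differ only in the subject name and category. The PR does not show how they were produced, so the sketch below is only one possible way to generate such a file. `subject_config` and `to_yaml_text` are hypothetical helpers, and the output is written on one line per key rather than wrapped as in the diff.

```python
import json


def subject_config(subject, category):
    """Build the five keys used by each per-subject file in this PR."""
    readable = subject.replace("_", " ")
    return {
        "dataset_name": subject,
        "description": (
            f"The following are questions (with answers) about {readable}.\n\n"
        ),
        "group": f"mmlu_continuation_{category}",
        "include": "_continuation_template_yaml",
        "task": f"mmlu_continuation_{subject}",
    }


def to_yaml_text(config):
    """Emit double-quoted `key: value` lines.

    json.dumps yields double-quoted strings with \\n escapes, which are
    also valid YAML double-quoted scalars for these plain ASCII values.
    """
    return "".join(
        f"{json.dumps(key)}: {json.dumps(value)}\n"
        for key, value in config.items()
    )


cfg = subject_config("abstract_algebra", "stem")
text = to_yaml_text(cfg)
```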
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_anatomy.yaml
@@ -0,0 +1,6 @@
"dataset_name": "anatomy"
"description": "The following are questions (with answers) about anatomy.\n\
\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_anatomy"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_astronomy.yaml
@@ -0,0 +1,6 @@
"dataset_name": "astronomy"
"description": "The following are questions (with answers) about astronomy.\n\
\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_astronomy"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_business_ethics.yaml
@@ -0,0 +1,6 @@
"dataset_name": "business_ethics"
"description": "The following are questions (with answers) about business\
\ ethics.\n\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_business_ethics"
@@ -0,0 +1,6 @@
"dataset_name": "clinical_knowledge"
"description": "The following are questions (with answers) about clinical\
\ knowledge.\n\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_clinical_knowledge"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_college_biology.yaml
@@ -0,0 +1,6 @@
"dataset_name": "college_biology"
"description": "The following are questions (with answers) about college\
\ biology.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_college_biology"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_college_chemistry.yaml
@@ -0,0 +1,6 @@
"dataset_name": "college_chemistry"
"description": "The following are questions (with answers) about college\
\ chemistry.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_college_chemistry"
@@ -0,0 +1,6 @@
"dataset_name": "college_computer_science"
"description": "The following are questions (with answers) about college\
\ computer science.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_college_computer_science"
@@ -0,0 +1,6 @@
"dataset_name": "college_mathematics"
"description": "The following are questions (with answers) about college\
\ mathematics.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_college_mathematics"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_college_medicine.yaml
@@ -0,0 +1,6 @@
"dataset_name": "college_medicine"
"description": "The following are questions (with answers) about college\
\ medicine.\n\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_college_medicine"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_college_physics.yaml
@@ -0,0 +1,6 @@
"dataset_name": "college_physics"
"description": "The following are questions (with answers) about college\
\ physics.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_college_physics"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_computer_security.yaml
@@ -0,0 +1,6 @@
"dataset_name": "computer_security"
"description": "The following are questions (with answers) about computer\
\ security.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_computer_security"
@@ -0,0 +1,6 @@
"dataset_name": "conceptual_physics"
"description": "The following are questions (with answers) about conceptual\
\ physics.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_conceptual_physics"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_econometrics.yaml
@@ -0,0 +1,6 @@
"dataset_name": "econometrics"
"description": "The following are questions (with answers) about econometrics.\n\
\n"
"group": "mmlu_continuation_social_sciences"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_econometrics"
@@ -0,0 +1,6 @@
"dataset_name": "electrical_engineering"
"description": "The following are questions (with answers) about electrical\
\ engineering.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_electrical_engineering"
@@ -0,0 +1,6 @@
"dataset_name": "elementary_mathematics"
"description": "The following are questions (with answers) about elementary\
\ mathematics.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_elementary_mathematics"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_formal_logic.yaml
@@ -0,0 +1,6 @@
"dataset_name": "formal_logic"
"description": "The following are questions (with answers) about formal\
\ logic.\n\n"
"group": "mmlu_continuation_humanities"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_formal_logic"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_global_facts.yaml
@@ -0,0 +1,6 @@
"dataset_name": "global_facts"
"description": "The following are questions (with answers) about global\
\ facts.\n\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_global_facts"
@@ -0,0 +1,6 @@
"dataset_name": "high_school_biology"
"description": "The following are questions (with answers) about high\
\ school biology.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_high_school_biology"
@@ -0,0 +1,6 @@
"dataset_name": "high_school_chemistry"
"description": "The following are questions (with answers) about high\
\ school chemistry.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_high_school_chemistry"
@@ -0,0 +1,6 @@
"dataset_name": "high_school_computer_science"
"description": "The following are questions (with answers) about high\
\ school computer science.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_high_school_computer_science"
@@ -0,0 +1,6 @@
"dataset_name": "high_school_european_history"
"description": "The following are questions (with answers) about high\
\ school european history.\n\n"
"group": "mmlu_continuation_humanities"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_high_school_european_history"
@@ -0,0 +1,6 @@
"dataset_name": "high_school_geography"
"description": "The following are questions (with answers) about high\
\ school geography.\n\n"
"group": "mmlu_continuation_social_sciences"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_high_school_geography"
@@ -0,0 +1,6 @@
"dataset_name": "high_school_government_and_politics"
"description": "The following are questions (with answers) about high\
\ school government and politics.\n\n"
"group": "mmlu_continuation_social_sciences"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_high_school_government_and_politics"
@@ -0,0 +1,6 @@
"dataset_name": "high_school_macroeconomics"
"description": "The following are questions (with answers) about high\
\ school macroeconomics.\n\n"
"group": "mmlu_continuation_social_sciences"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_high_school_macroeconomics"
@@ -0,0 +1,6 @@
"dataset_name": "high_school_mathematics"
"description": "The following are questions (with answers) about high\
\ school mathematics.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_high_school_mathematics"
@@ -0,0 +1,6 @@
"dataset_name": "high_school_microeconomics"
"description": "The following are questions (with answers) about high\
\ school microeconomics.\n\n"
"group": "mmlu_continuation_social_sciences"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_high_school_microeconomics"
@@ -0,0 +1,6 @@
"dataset_name": "high_school_physics"
"description": "The following are questions (with answers) about high\
\ school physics.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_high_school_physics"
@@ -0,0 +1,6 @@
"dataset_name": "high_school_psychology"
"description": "The following are questions (with answers) about high\
\ school psychology.\n\n"
"group": "mmlu_continuation_social_sciences"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_high_school_psychology"
@@ -0,0 +1,6 @@
"dataset_name": "high_school_statistics"
"description": "The following are questions (with answers) about high\
\ school statistics.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_high_school_statistics"
@@ -0,0 +1,6 @@
"dataset_name": "high_school_us_history"
"description": "The following are questions (with answers) about high\
\ school us history.\n\n"
"group": "mmlu_continuation_humanities"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_high_school_us_history"
@@ -0,0 +1,6 @@
"dataset_name": "high_school_world_history"
"description": "The following are questions (with answers) about high\
\ school world history.\n\n"
"group": "mmlu_continuation_humanities"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_high_school_world_history"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_human_aging.yaml
@@ -0,0 +1,6 @@
"dataset_name": "human_aging"
"description": "The following are questions (with answers) about human\
\ aging.\n\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_human_aging"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_human_sexuality.yaml
@@ -0,0 +1,6 @@
"dataset_name": "human_sexuality"
"description": "The following are questions (with answers) about human\
\ sexuality.\n\n"
"group": "mmlu_continuation_social_sciences"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_human_sexuality"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_international_law.yaml
@@ -0,0 +1,6 @@
"dataset_name": "international_law"
"description": "The following are questions (with answers) about international\
\ law.\n\n"
"group": "mmlu_continuation_humanities"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_international_law"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_jurisprudence.yaml
@@ -0,0 +1,6 @@
"dataset_name": "jurisprudence"
"description": "The following are questions (with answers) about jurisprudence.\n\
\n"
"group": "mmlu_continuation_humanities"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_jurisprudence"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_logical_fallacies.yaml
@@ -0,0 +1,6 @@
"dataset_name": "logical_fallacies"
"description": "The following are questions (with answers) about logical\
\ fallacies.\n\n"
"group": "mmlu_continuation_humanities"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_logical_fallacies"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_machine_learning.yaml
@@ -0,0 +1,6 @@
"dataset_name": "machine_learning"
"description": "The following are questions (with answers) about machine\
\ learning.\n\n"
"group": "mmlu_continuation_stem"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_machine_learning"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_management.yaml
@@ -0,0 +1,6 @@
"dataset_name": "management"
"description": "The following are questions (with answers) about management.\n\
\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_management"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_marketing.yaml
@@ -0,0 +1,6 @@
"dataset_name": "marketing"
"description": "The following are questions (with answers) about marketing.\n\
\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_marketing"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_medical_genetics.yaml
@@ -0,0 +1,6 @@
"dataset_name": "medical_genetics"
"description": "The following are questions (with answers) about medical\
\ genetics.\n\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_medical_genetics"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_miscellaneous.yaml
@@ -0,0 +1,6 @@
"dataset_name": "miscellaneous"
"description": "The following are questions (with answers) about miscellaneous.\n\
\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_miscellaneous"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_moral_disputes.yaml
@@ -0,0 +1,6 @@
"dataset_name": "moral_disputes"
"description": "The following are questions (with answers) about moral\
\ disputes.\n\n"
"group": "mmlu_continuation_humanities"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_moral_disputes"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_moral_scenarios.yaml
@@ -0,0 +1,6 @@
"dataset_name": "moral_scenarios"
"description": "The following are questions (with answers) about moral\
\ scenarios.\n\n"
"group": "mmlu_continuation_humanities"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_moral_scenarios"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_nutrition.yaml
@@ -0,0 +1,6 @@
"dataset_name": "nutrition"
"description": "The following are questions (with answers) about nutrition.\n\
\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_nutrition"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_philosophy.yaml
@@ -0,0 +1,6 @@
"dataset_name": "philosophy"
"description": "The following are questions (with answers) about philosophy.\n\
\n"
"group": "mmlu_continuation_humanities"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_philosophy"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_prehistory.yaml
@@ -0,0 +1,6 @@
"dataset_name": "prehistory"
"description": "The following are questions (with answers) about prehistory.\n\
\n"
"group": "mmlu_continuation_humanities"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_prehistory"
@@ -0,0 +1,6 @@
"dataset_name": "professional_accounting"
"description": "The following are questions (with answers) about professional\
\ accounting.\n\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_professional_accounting"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_professional_law.yaml
@@ -0,0 +1,6 @@
"dataset_name": "professional_law"
"description": "The following are questions (with answers) about professional\
\ law.\n\n"
"group": "mmlu_continuation_humanities"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_professional_law"
@@ -0,0 +1,6 @@
"dataset_name": "professional_medicine"
"description": "The following are questions (with answers) about professional\
\ medicine.\n\n"
"group": "mmlu_continuation_other"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_professional_medicine"
@@ -0,0 +1,6 @@
"dataset_name": "professional_psychology"
"description": "The following are questions (with answers) about professional\
\ psychology.\n\n"
"group": "mmlu_continuation_social_sciences"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_professional_psychology"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_public_relations.yaml
@@ -0,0 +1,6 @@
"dataset_name": "public_relations"
"description": "The following are questions (with answers) about public\
\ relations.\n\n"
"group": "mmlu_continuation_social_sciences"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_public_relations"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_security_studies.yaml
@@ -0,0 +1,6 @@
"dataset_name": "security_studies"
"description": "The following are questions (with answers) about security\
\ studies.\n\n"
"group": "mmlu_continuation_social_sciences"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_security_studies"
6 changes: 6 additions & 0 deletions lm_eval/tasks/mmlu/continuation/mmlu_sociology.yaml
@@ -0,0 +1,6 @@
"dataset_name": "sociology"
"description": "The following are questions (with answers) about sociology.\n\
\n"
"group": "mmlu_continuation_social_sciences"
"include": "_continuation_template_yaml"
"task": "mmlu_continuation_sociology"