Home repository for the ACORN dataset: 3,500 explanations with aspect-wise human ratings of their quality.

a-brassard/ACORN

A cute illustration of an acorn character.

Home repository for the dataset introduced in ACORN: Aspect-wise Commonsense Reasoning Explanation Evaluation. ACORN contains 3,500 human-written and LLM-generated explanations with aspect-wise quality ratings given by humans.

Also available on 🤗 HuggingFace.

Five human raters evaluating an explanation of the answer for a commonsense reasoning question. Ratings for 3500 explanations are aggregated into a dataset.

Data

The entire dataset is contained in ACORN.jsonl. Each row consists of an explanation, related information, aggregated (majority-voted) ratings, and the full set of individual worker ratings.

Basic fields:

  • question question text
  • choices list of answer choices
  • label correct answer index
  • explanation explanation text
  • voted_ratings majority-voted ratings
  • worker_ratings all worker ratings, saved as a dictionary of dictionaries (worker id → rating dict).

→ See Additional fields for the full list of fields.
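As a minimal sketch of reading the file, assuming ACORN.jsonl is in the working directory and follows the JSON Lines convention of one JSON object per line: the helper names `load_acorn` and `majority_vote` are hypothetical and not part of the repository. The vote recomputation assumes each worker's rating dict maps aspect names to hashable values, and its tie-breaking may differ from how `voted_ratings` was actually aggregated.

```python
import json
from collections import Counter
from pathlib import Path


def load_acorn(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    rows = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                rows.append(json.loads(line))
    return rows


def majority_vote(worker_ratings):
    """Recompute per-aspect majority ratings from {worker_id: {aspect: rating}}.

    Assumption: ties go to the value seen first, which may not match the
    aggregation used to build the dataset's voted_ratings field.
    """
    per_aspect = {}
    for ratings in worker_ratings.values():
        for aspect, value in ratings.items():
            per_aspect.setdefault(aspect, []).append(value)
    return {a: Counter(v).most_common(1)[0][0] for a, v in per_aspect.items()}


if __name__ == "__main__":
    path = Path("ACORN.jsonl")  # assumed location of the dataset file
    if path.exists():
        rows = load_acorn(path)
        first = rows[0]
        print(len(rows), "rows; fields:", sorted(first))
        print("stored vote:    ", first["voted_ratings"])
        print("recomputed vote:", majority_vote(first["worker_ratings"]))
```

Comparing the stored and recomputed votes on a few rows is a quick way to check whether this assumption about the aggregation holds for your use.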

Quality aspects

Explanation quality is subjective and can depend on the intended use. ACORN therefore rates each explanation with both a general rating and fine-grained aspects of quality, assuming an ideal of fluent, sufficient, minimal, and contrastive explanations.

Rating criteria

Sources

ACORN contains a blend of explanations from several sources. See Section 2.2 in the paper for a more detailed overview.

ACORN contains samples from ECQA, CoS-E, COPA-SSE, generated explanations for Commonsense QA, generated explanations for Balanced COPA, newly collected explanations for Balanced COPA, and GPT-3.5 edited versions of CoS-E and COPA-SSE. Each group has 500 samples, totaling 3500 samples.

Additional fields

In addition to the basic fields listed under Data, the dataset contains the following information.

  • id test sample ID
  • q_id original question ID
  • e_id original explanation ID
  • q_source question source (Commonsense QA or Balanced COPA)
  • e_source explanation source (→ Sources)
  • triples triple-form explanation (COPA-SSE only)
  • positives, negatives positive and negative statements (ECQA only)
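The source fields make it easy to slice the data by origin. The sketch below assumes rows have already been loaded as a list of dicts, one per line of ACORN.jsonl; `count_by_source` and `filter_by_source` are hypothetical helper names. The exact strings stored in `e_source` are not listed here, so the example uses placeholder values and you should inspect the counts before filtering.

```python
from collections import Counter


def count_by_source(rows, key="e_source"):
    """Count rows per value of a source field (e_source or q_source)."""
    return Counter(row.get(key) for row in rows)


def filter_by_source(rows, source, key="e_source"):
    """Keep only rows whose source field equals the given value."""
    return [row for row in rows if row.get(key) == source]


if __name__ == "__main__":
    # Placeholder rows; real rows come from ACORN.jsonl and the real
    # e_source strings may be spelled differently.
    rows = [
        {"id": 1, "e_source": "source_a"},
        {"id": 2, "e_source": "source_b"},
        {"id": 3, "e_source": "source_a"},
    ]
    print(count_by_source(rows))
    print([r["id"] for r in filter_by_source(rows, "source_a")])
```

On the full dataset, the counts should reflect the 500-sample groups described under Sources.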

Citation

If you use this dataset, please consider citing the following work.

@article{brassard2024acorn,
  title   = {ACORN: Aspect-wise Commonsense Reasoning Explanation Evaluation},
  author  = {Ana Brassard and Benjamin Heinzerling and Keito Kudo and Keisuke Sakaguchi and Kentaro Inui},
  year    = {2024},
  journal = {arXiv preprint arXiv:2405.04818}
}
