
Question Answering via Sentence Composition (QASC)

QASC is a question-answering dataset with a focus on sentence composition. It consists of 9,980 8-way multiple-choice questions about grade school science (8,134 train, 926 dev, 920 test) and comes with a corpus of 17M sentences. This repository shows how to download the QASC dataset and corpus. Note that the test set does not include the answer key or fact annotations. To evaluate your model on the test set, submit your predictions (limited to once per week to prevent overfitting) to the QASC leaderboard in the CSV format described here. We also provide two sample baseline models that can be used to produce predictions in this CSV format.
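Each line of the dataset files is a standalone JSON object. As an illustration, here is a minimal loader sketch; the schema shown (`id`, `question.stem`, `question.choices` with labels A-H, `answerKey`) follows the common ARC-style multiple-choice format and should be verified against the downloaded files:

```python
import json

def load_questions(path):
    """Load a JSONL question file into a list of dicts (one per line)."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

# A synthetic example line in the assumed schema. The answerKey (and any
# fact annotations) appear only in the train/dev splits, not in test.
line = json.dumps({
    "id": "Q1",
    "question": {
        "stem": "Example question?",
        "choices": [{"label": l, "text": "choice " + l} for l in "ABCDEFGH"],
    },
    "answerKey": "A",
})
question = json.loads(line)
assert len(question["question"]["choices"]) == 8  # 8-way multiple choice
```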

Downloading Data


Download and unzip the dataset into the data/QASC_Dataset folder:

mkdir -p data
tar xvfz qasc_dataset.tar.gz  -C data/
rm qasc_dataset.tar.gz


Download and unzip the text corpus (17M sentences) into the data/QASC_Corpus folder:

mkdir -p data
tar xvfz qasc_corpus.tar.gz  -C data/
rm qasc_corpus.tar.gz


Download and unzip the baseline models into the data/QASC_Models folder:

mkdir -p data
tar xvfz qasc_models.tar.gz  -C data/
rm qasc_models.tar.gz

Setting up the Environment

We currently use a fork of AllenNLP by Oyvind Tafjord to train and evaluate our models. To use this repository, set up a Python 3.6 environment with venv or conda (to ensure a clean setup) and install the requirements. For example, to set up using conda:

conda create -n qasc python=3.6
source activate qasc
pip install -r requirements.txt

We intend to release models and scripts that directly use AllenNLP and the HuggingFace Transformers library in the near future.

Evaluating Models

We release two sample models that predict answer choices based on (1) no knowledge and (2) single-step retrieval knowledge. Both models use BertMCQAModel, which adds a linear layer on top of the [CLS] token representation.
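Schematically, the scoring head works as follows. This is a toy pure-Python sketch, not the actual BertMCQAModel code: encode_cls stands in for BERT's [CLS] representation, and the hidden size and weights here are made up for illustration:

```python
import math
import random

HIDDEN = 16  # stand-in for BERT's hidden size (768 for bert-base)

def encode_cls(text):
    # Placeholder for BERT: a deterministic fake [CLS] vector per input.
    rnd = random.Random(text)
    return [rnd.gauss(0.0, 1.0) for _ in range(HIDDEN)]

# The multiple-choice head: one linear layer mapping each (question, choice)
# [CLS] vector to a scalar score.
random.seed(0)
W = [random.gauss(0.0, 1.0) for _ in range(HIDDEN)]
b = 0.0

def score_choices(stem, choices):
    scores = [
        sum(h * w for h, w in zip(encode_cls(stem + " " + c), W)) + b
        for c in choices
    ]
    # Softmax over the 8 choices gives the predicted distribution.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = score_choices("Example question?", ["choice " + l for l in "ABCDEFGH"])
predicted = "ABCDEFGH"[probs.index(max(probs))]
```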

No Knowledge Baseline

This model can be run directly against the test set to produce predictions using AllenNLP's predict command:

allennlp predict \
     --include-package qasc \
     --predictor multiple-choice-qa-json \
     --output-file data/nokb_predictions.jsonl \
     data/QASC_Models/bertlc_nokb/model.tar.gz \
     data/QASC_Dataset/test.jsonl

To convert the predictions into the CSV format expected by the leaderboard, use jq, a command-line tool to parse JSON files:

jq -r "[.id,.answer]|@csv" data/nokb_predictions.jsonl > data/nokb_predictions.csv
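If jq is not available, the same conversion can be done with a few lines of Python (assuming, as the jq filter above does, that each prediction line carries id and answer fields):

```python
import csv
import json

def jsonl_to_csv(in_path, out_path):
    """Convert {"id": ..., "answer": ...} JSONL predictions to a 2-column CSV."""
    with open(in_path) as fin, open(out_path, "w", newline="") as fout:
        writer = csv.writer(fout, quoting=csv.QUOTE_ALL)  # jq's @csv quotes strings
        for line in fin:
            if line.strip():
                pred = json.loads(line)
                writer.writerow([pred["id"], pred["answer"]])
```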

Single Step Baseline

To run the single-step retrieval baseline, we provide the train, dev, and test files with the retrieved context here. These JSONL files contain the retrieved context for each choice as a paragraph in the question.choices[].para field. The sentences are sorted in increasing order of retrieval score before being concatenated into the paragraph, so the most relevant sentences sit closest to the end and are not removed when the context is truncated (from the front) to fit within 184 word pieces.
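The paragraph construction described above can be sketched as follows. This is a simplified stand-in: the real pipeline truncates by BERT word pieces, while the sketch below counts whitespace tokens:

```python
def build_context(retrieved, max_tokens=184):
    """retrieved: list of (sentence, retrieval_score) pairs.

    Sort sentences by increasing score so the most relevant ones end up
    at the end of the paragraph, then truncate from the front so the
    best sentences survive the token budget.
    """
    ordered = [sent for sent, _ in sorted(retrieved, key=lambda pair: pair[1])]
    tokens = " ".join(ordered).split()  # stand-in for word-piece tokenization
    return " ".join(tokens[-max_tokens:])

para = build_context([("weak hit.", 0.1), ("strong hit.", 0.9), ("medium hit.", 0.5)])
# -> "weak hit. medium hit. strong hit."  (strongest sentence last)
```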

To download the data:

tar xvfz qasc_dataset_1step.tar.gz  -C data/
rm qasc_dataset_1step.tar.gz

To produce predictions:

allennlp predict \
     --include-package qasc \
     --predictor multiple-choice-qa-json \
     --output-file data/1step_predictions.jsonl \
     data/QASC_Models/bertlc_1step/model.tar.gz \
     data/QASC_Dataset_1Step/test.jsonl


jq -r "[.id,.answer]|@csv" data/1step_predictions.jsonl > data/1step_predictions.csv