Before diving into the details of this doc, we strongly recommend familiarizing yourself with some important concepts about system analysis.
In this file we describe how to analyze open-domain QA models. We will give an example using the `natural_questions_comp_gen` dataset, but other datasets can be analyzed in a similar way.
- (1) `datalab`: if your dataset is supported by datalab, you fortunately don't need to prepare it.
- (2) `json`: basically, a list of dictionaries with two keys, `question` and `answers`:
```json
[
  {"question": "who got the first nobel prize in physics", "answers": ["Wilhelm Conrad Röntgen"]},
  {"question": "when is the next deadpool movie being released", "answers": ["May 18 , 2018"]},
  ...
]
```
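As a minimal sketch, the dataset file in this format can be written with Python's standard `json` module. The file name `qa_dataset.json` and the two example records are assumptions for illustration, not names the tool requires:

```python
import json

# Hypothetical examples in the format described above:
# each record has a "question" string and a list of acceptable "answers".
examples = [
    {"question": "who got the first nobel prize in physics",
     "answers": ["Wilhelm Conrad Röntgen"]},
    {"question": "when is the next deadpool movie being released",
     "answers": ["May 18 , 2018"]},
]

# Write the dataset file; ensure_ascii=False keeps non-ASCII answers readable.
with open("qa_dataset.json", "w", encoding="utf-8") as f:
    json.dump(examples, f, ensure_ascii=False, indent=2)

# Sanity check: reload and verify every record has the two required keys.
with open("qa_dataset.json", encoding="utf-8") as f:
    loaded = json.load(f)
assert all({"question", "answers"} <= set(record) for record in loaded)
```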
In this task, your system outputs should be as follows:
```text
william henry bragg
may 18, 2018
...
```

where each line represents one predicted answer. An example system output file is here: `test.dpr.nq.txt`
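A system output file like this can be produced with a few lines of Python. The file name `test.predictions.txt` and the prediction strings are illustrative assumptions; the only requirement from the format above is one predicted answer per line, in the same order as the dataset questions:

```python
# Hypothetical predicted answers, ordered to match the dataset's questions.
predictions = ["william henry bragg", "may 18, 2018"]

# Write one predicted answer per line, as the output format requires.
with open("test.predictions.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(predictions) + "\n")

# Sanity check: reading the file back yields the same answers, line by line.
with open("test.predictions.txt", encoding="utf-8") as f:
    lines = f.read().splitlines()
assert lines == predictions
```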
Let's say we have several such output files from different systems. To perform a basic analysis, we can run the following command:
```shell
explainaboard --task qa-open-domain --dataset natural_questions_comp_gen --system-outputs ./data/system_outputs/qa_open_domain/test.dpr.nq.txt > report.json
```
where:

- `--task`: denotes the task name; you can find all supported task names here.
- `--system-outputs`: denotes the path of the system outputs. Multiple paths should be separated by spaces, for example, `system1 system2`.
- `--dataset`: denotes the dataset name.
- `report.json`: the generated analysis file in JSON format. You can find the file here. Tip: use a JSON viewer, like this one, for better interpretation.
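Once the report is generated, a quick way to get oriented is to list its nested keys before opening it in a viewer. This is a small sketch; the `summarize` helper is our own, and the sample dictionary below is a stand-in for `report.json`, not the actual report schema:

```python
import json

def summarize(obj, prefix="", depth=2):
    """Recursively list the keys of a nested JSON object, indented by level."""
    lines = []
    if isinstance(obj, dict) and depth > 0:
        for key, value in obj.items():
            lines.append(prefix + key)
            lines.extend(summarize(value, prefix + "  ", depth - 1))
    return lines

# Stand-in for the contents of report.json (the real schema may differ);
# in practice you would load it with: report = json.load(open("report.json")).
report = json.loads('{"metrics": {"Accuracy": 0.41}, "fine_grained": {}}')
print("\n".join(summarize(report)))
```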