Source Code for "Teaching Machine Comprehension with Compositional Explanations" (Findings of EMNLP 2020)
TL;DR: We collect explanations in which humans justify their answering decisions on a QA task; we transform these explanations into executable “teacher” programs; and we use these programs to annotate unlabeled QA examples and train a “student” QA model.
Project homepage: http://inklab.usc.edu/mrc-explanation-project/
conda create -n mrc-explanation python=3.6.9
conda activate mrc-explanation
pip install torch==1.4.0 allennlp==0.9.0 nltk==3.4.5 pandas==0.25.3
Then open nltk/parse/chart.py in the installed nltk source. At line 685, in the function parse, change
for edge in self.select(start=0, end=self._num_leaves,lhs=root):
to
for edge in self.select(start=0, end=self._num_leaves):
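The location of nltk/parse/chart.py depends on the active environment. The sketch below is one way to print its path using only the standard library; the helper name locate_module_file is hypothetical and is not part of this repository or of nltk.

```python
import importlib.util
from typing import Optional


def locate_module_file(module_name: str) -> Optional[str]:
    """Return the source file path of module_name, or None if it cannot be found.

    Hypothetical helper for illustration; not part of this repository.
    """
    try:
        spec = importlib.util.find_spec(module_name)
    except ImportError:
        # Raised when a parent package (for example nltk itself) is not installed.
        return None
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Prints the path of the file to edit, or None if nltk is not installed
    # in the currently active environment.
    print(locate_module_file("nltk.parse.chart"))
```

Run it inside the mrc-explanation environment so that the path points at the nltk copy installed there.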
Please download the pre-processed data and explanations from here. Put the csv files in ./explanations and the json files in ./data/squad.
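Before running the scripts, it can help to confirm that the downloaded files landed in the expected directories. The following is a minimal sketch that only counts csv and json files in the two directories named above; the function name count_downloaded_files is hypothetical, and no specific file names are assumed.

```python
from pathlib import Path
from typing import Dict


def _count(directory: Path, pattern: str) -> int:
    """Count files matching pattern directly inside directory (0 if it is missing)."""
    if not directory.is_dir():
        return 0
    return sum(1 for p in directory.glob(pattern) if p.is_file())


def count_downloaded_files(root: str = ".") -> Dict[str, int]:
    """Count csv files in <root>/explanations and json files in <root>/data/squad.

    Hypothetical helper for illustration; not part of this repository.
    """
    base = Path(root)
    return {
        "explanations/*.csv": _count(base / "explanations", "*.csv"),
        "data/squad/*.json": _count(base / "data" / "squad", "*.json"),
    }


if __name__ == "__main__":
    for location, n in count_downloaded_files(".").items():
        print(f"{location}: {n} file(s)")
```

A count of zero for either entry suggests the files were placed in a different directory.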
This code snippet contains a minimal example that explains how an explanation is parsed, and how a constructed program is used to annotate new instances.
PYTHONPATH='.' python parser/example.py
PYTHONPATH='.' python parser/parse_squad_exps.py --verbose --save_ans_func
PYTHONPATH='.' python parser/match_squad_hard.py --nproc 32 --verbose --save_matched
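The general idea of using a program derived from an explanation to annotate new instances can be illustrated with a toy sketch. This is not the repository's parser or matching code: the rule below is hand-written rather than parsed from an explanation, and the names Program, year_after_in, and annotate are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-in for a "teacher" program: it inspects a
# (question, context) pair and returns an answer, or None to abstain.
Program = Callable[[str, str], Optional[str]]


def year_after_in(question: str, context: str) -> Optional[str]:
    """Toy rule: for a 'when' question, answer with a 4-digit number that follows 'in'."""
    if not question.lower().startswith("when"):
        return None
    tokens = context.replace(".", " ").replace(",", " ").split()
    for prev, tok in zip(tokens, tokens[1:]):
        if prev.lower() == "in" and tok.isdigit() and len(tok) == 4:
            return tok
    return None


def annotate(
    programs: List[Program], unlabeled: List[Tuple[str, str]]
) -> List[Tuple[str, str, str]]:
    """Label each (question, context) pair with the first program that does not abstain."""
    labeled = []
    for question, context in unlabeled:
        for program in programs:
            answer = program(question, context)
            if answer is not None:
                labeled.append((question, context, answer))
                break
    return labeled


if __name__ == "__main__":
    unlabeled = [
        ("When was the bridge built?", "The bridge was built in 1932."),
        ("Who built the bridge?", "It was built in 1932 by Smith."),
    ]
    # Only the first pair is labeled; the rule abstains on the 'who' question.
    print(annotate([year_after_in], unlabeled))
```

Pairs on which every program abstains stay unlabeled, so the labeled set that would be passed on to train a student model can be smaller than the unlabeled pool.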