This repository contains code and experiment results introduced in the following paper:
- What does it take to bake a cake? The RecipeRef corpus and anaphora resolution in procedural text
- Biaoyan Fang, Timothy Baldwin and Karin Verspoor
- In Findings of ACL 2022
- For the data and detailed annotation guidelines of the RecipeRef corpus, please refer to the RecipeRef dataset. We also provide the data in jsonlines format in the `data` directory; the original data is available from the RecipeRef dataset.
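The jsonlines files hold one JSON document per line, so they can be inspected with the standard library alone. The sketch below is a minimal reader and key counter. The helper names `read_jsonlines` and `summarize` are hypothetical and not part of this repository, and no particular field names are assumed.

```python
import json
from pathlib import Path


def read_jsonlines(path):
    """Return a list with one parsed JSON object per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines, e.g. a trailing newline
                records.append(json.loads(line))
    return records


def summarize(records):
    """Map each top-level key to the number of records that contain it."""
    counts = {}
    for record in records:
        for key in record:
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Calling `summarize(read_jsonlines(path))` on one of the files in `data` gives a quick overview of which fields each document carries before training.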
- Install the Python requirements (Python 3 preferred): `pip install -r requirements.txt`
- Download the GloVe embeddings, as well as the additional version `glove_50_300_2.txt`
- Download the RecipeRef dataset and put it into the `data` directory. Note that we separate the full set and partition 80.
- Run `setup_all.sh` and then `setup_training.sh`
- Install the brat evaluation tool
- We use nltk to tokenize the brat files for training and to generate the jsonlines files. Our code can be found in `convert_brat_into_training_format-clear.ipynb`
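The conversion step above can be sketched roughly as follows: read the text-bound (`T`) lines of a brat `.ann` file, tokenize the document text, and map each character span onto token indices before writing one JSON object per document. This is an illustration under stated assumptions, not the notebook's code. A regular-expression tokenizer stands in for nltk so that the sketch runs without extra downloads, and the output field names (`doc_key`, `tokens`, `mentions`) and the `INGREDIENT` label are hypothetical.

```python
import json
import re

# Stand-in for an nltk tokenizer (assumption): words or single punctuation marks.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize_with_offsets(text):
    """Return (token, start, end) triples with character offsets."""
    return [(m.group(), m.start(), m.end()) for m in TOKEN_RE.finditer(text)]


def parse_brat_mentions(ann_text):
    """Parse text-bound lines: T1<TAB>Type start end<TAB>covered text.

    Discontinuous spans (containing ';') are skipped in this sketch.
    """
    mentions = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # relations, attributes and notes are ignored here
        mention_id, type_and_span, _covered = line.split("\t", 2)
        if ";" in type_and_span:
            continue
        label, start, end = type_and_span.split()
        mentions.append((mention_id, label, int(start), int(end)))
    return mentions


def char_span_to_token_span(tokens, start, end):
    """Map a character span to inclusive token indices (None if nothing overlaps)."""
    covered = [i for i, (_, s, e) in enumerate(tokens) if s < end and e > start]
    return (covered[0], covered[-1]) if covered else None


def brat_to_record(doc_key, text, ann_text):
    """Build one jsonlines record; field names are hypothetical."""
    tokens = tokenize_with_offsets(text)
    mentions = []
    for mention_id, label, start, end in parse_brat_mentions(ann_text):
        span = char_span_to_token_span(tokens, start, end)
        if span is not None:
            mentions.append({"id": mention_id, "label": label, "span": list(span)})
    return {"doc_key": doc_key, "tokens": [t for t, _, _ in tokens], "mentions": mentions}


text = "Mix the flour and sugar. Bake it."
ann = "T1\tINGREDIENT 8 23\tflour and sugar\nT2\tINGREDIENT 30 32\tit\n"
print(json.dumps(brat_to_record("recipe_0", text, ann)))
```

Keeping character offsets alongside each token is what makes the span mapping possible; a tokenizer that only returns strings would need a separate alignment pass.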
- Experiment configurations are found in `experiments.conf`
- Choose an experiment that you would like to run, e.g. `bridging`
- Training: `python train_folds.py <experiment>`
- Checkpoints are stored in the `logs` directory
- Prediction results are stored in the `prediction` directory
- Evaluation: `python evaluate_folds.py <experiment>`
- The evaluation tool provides different settings: `exact` and `relax` mention matching. For this paper, we use `exact` mention matching.
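To illustrate how the two matching settings can change scores, here is a minimal sketch. It assumes that `exact` requires identical span boundaries and that `relax` accepts any character overlap. This is a common convention, but the evaluation tool's precise criteria may differ. The function names are hypothetical and the spans are made-up examples.

```python
def spans_match(gold, pred, mode="exact"):
    """Compare two (start, end) character spans with exclusive ends."""
    if mode == "exact":
        return gold == pred
    if mode == "relax":
        # Assumed definition: the two spans share at least one character.
        return gold[0] < pred[1] and pred[0] < gold[1]
    raise ValueError(f"unknown mode: {mode}")


def score_mentions(gold_spans, pred_spans, mode="exact"):
    """Return (precision, recall, f1), matching each gold span at most once."""
    matched_gold = set()
    true_positives = 0
    for pred in pred_spans:
        for i, gold in enumerate(gold_spans):
            if i not in matched_gold and spans_match(gold, pred, mode):
                matched_gold.add(i)
                true_positives += 1
                break
    precision = true_positives / len(pred_spans) if pred_spans else 0.0
    recall = true_positives / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = [(8, 23), (30, 32)]   # e.g. "flour and sugar", "it"
pred = [(8, 13), (30, 32)]   # first prediction covers only "flour"
print(score_mentions(gold, pred, mode="exact"))
print(score_mentions(gold, pred, mode="relax"))
```

In this example the partial prediction counts as an error under `exact` but as a hit under `relax`, which is why scores reported under the two settings are not directly comparable.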