# joint-coref-srl

Code to reproduce experiments from "Joint Semantic Analysis with Document-Level Cross-Task Coherence Rewards".

Recommended: Python 3.7 with a new conda environment.
## Installation

Install all dependencies with `pip install -r requirements.txt`. Then navigate to the `allennlp` folder and run the following commands:

$ pip install --editable .
$ pip install -r dev-requirements.txt
Install pytorch-geometric by following this post.
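As a rough guide, a pytorch-geometric installation typically proceeds as sketched below; the exact packages and wheel index depend on your PyTorch and CUDA versions, so follow the official installation guide rather than treating these commands as pinned.

```sh
# Illustrative only: install the companion packages first, then torch-geometric.
# Pass the wheel index (-f flag) recommended by the official installation guide
# for your specific PyTorch and CUDA versions.
$ pip install torch-scatter torch-sparse torch-cluster torch-spline-conv
$ pip install torch-geometric
```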
## Training

- Download OntoNotes 5.0 and run `prepare_ontonotes.sh` to convert it into the CoNLL-2012 format.
- Make sure that the concatenated files are in a folder named `data/conll-2012_single_file` with names like `test.english.gold_conll` (without the version number), and that the original split files are in a folder named `data/conll-2012`.
- Run `mkdir graphs && python combine_data.py` to combine the gold (train and development) SRL tags and coreference clusters into a single JSON object. This should create two files, one each for training and development.
- To further convert them into processable graphs, run `python graph_util/output_to_graph.py <JSON-path-of-graphs>` (once for train and again for dev).
- Finally, train the coherence classifiers by running `dgi.py`. (A consolidated sketch of these steps follows the list.)
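Putting the steps together, a full data-preparation run might look like the following sketch. The JSON paths are placeholders for the files actually written by `combine_data.py`, and invoking `prepare_ontonotes.sh` and `dgi.py` directly as scripts is an assumption; adapt the commands to your setup.

```sh
# Sketch of the data-preparation pipeline; the two JSON paths are placeholders
# for the train/dev files produced by combine_data.py.
$ bash prepare_ontonotes.sh
$ mkdir graphs && python combine_data.py
$ python graph_util/output_to_graph.py <path-to-train-JSON>
$ python graph_util/output_to_graph.py <path-to-dev-JSON>
$ python dgi.py
```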
Now you are all set to train the coreference and SRL models!
All the configuration files can be found in the `configs` folder:

- `single`: contains the configs for the single-task baselines
- `mtl`: contains the configs for the multi-task baselines
- `ft`: contains the configs to finetune the different models (each encoder type has a different config file)
To train the baseline models, run
$ python train.py --config_file_path <path-to-config> --serialization_dir <path-to-save>
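For example, a single-task baseline could be trained as follows; the config file name and output directory are placeholders, not fixed names from the repository:

```sh
# Illustrative invocation: pick a config from configs/single (or configs/mtl)
# and choose any output directory for the serialized model.
$ python train.py --config_file_path configs/single/<config-file> --serialization_dir models/<run-name>
```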
To finetune the models, run
$ python predict.py <path-to-ft-config> <reward-type> <graph-encoder-type> <ft-task-name> <ft-dataset> <path-to-model-dir>
In all our experiments, we use `reward-type=sep` and `graph-encoder-type=GCN`.
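With those two settings filled in, a fine-tuning invocation looks like the sketch below; the remaining arguments are placeholders to replace with your own config, task, dataset, and model paths.

```sh
# Illustrative invocation with reward-type=sep and graph-encoder-type=GCN;
# the bracketed arguments are placeholders.
$ python predict.py <path-to-ft-config> sep GCN <ft-task-name> <ft-dataset> <path-to-model-dir>
```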
## Evaluation
To evaluate a model on a dataset, run
$ python evaluate.py -s <path-to-model-dir> -t <task-name> -d <dataset-name> -m <type-of-model>
`type-of-model` can be `pt` or `ft`, for evaluating pre-trained and fine-tuned models respectively.
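For example, evaluating a fine-tuned model uses `-m ft`; the task, dataset, and model-directory arguments below remain placeholders:

```sh
# Illustrative invocation; replace the placeholders with your task name,
# dataset name, and the directory containing the fine-tuned model.
$ python evaluate.py -s <path-to-model-dir> -t <task-name> -d <dataset-name> -m ft
```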
For all scripts, running them with the `--help` flag prints the various options and their help strings.
## Thanks

This code is built on HMTL.

## Citation
@misc{aralikatte2020joint,
title={Joint Semantic Analysis with Document-Level Cross-Task Coherence Rewards},
author={Rahul Aralikatte and Mostafa Abdou and Heather Lent and Daniel Hershcovich and Anders Søgaard},
year={2020},
eprint={2010.05567},
archivePrefix={arXiv},
primaryClass={cs.CL}
}