Coreference Resolution with Entity Equalization

Introduction

This repository contains the code for replicating results from the paper Coreference Resolution with Entity Equalization (ACL 2019).
The baseline model is from the paper Higher-order Coreference Resolution with Coarse-to-fine Inference
Code for baseline model: https://github.com/kentonl/e2e-coref

Getting Started

Install the Python requirements (either Python 2 or 3 works): pip install -r requirements.txt
Download GloVe embeddings and build custom kernels by running setup_all.sh.
- There are 3 platform-dependent ways to build custom TensorFlow kernels. Please comment/uncomment the appropriate lines in the script.
To train your own models, run setup_training.sh and extract_bert_features.sh.
- This assumes access to OntoNotes 5.0. Please edit the ontonotes_path variable.

Training Instructions

Experiment configurations are found in experiments.conf
Choose an experiment that you would like to run, e.g. best
Training: python train.py <experiment>
Results are stored in the logs directory and can be viewed via TensorBoard.
Evaluation: python evaluate.py <experiment>

Demo Instructions

Command-line demo: python demo.py final
To run the demo with other experiments, replace final with your configuration name.

Batched Prediction Instructions

Create a file where each line is a JSON object in the following format. The example is spread over several lines for readability; strip the newlines so that each line of the file is a single well-formed JSON object:

{
  "clusters": [],
  "doc_key": "nw",
  "sentences": [["This", "is", "the", "first", "sentence", "."], ["This", "is", "the", "second", "."]],
  "speakers": [["spk1", "spk1", "spk1", "spk1", "spk1", "spk1"], ["spk2", "spk2", "spk2", "spk2", "spk2"]]
}

clusters should be left empty and is only used for evaluation purposes.
doc_key indicates the genre, which can be one of the following: "bc", "bn", "mz", "nw", "pt", "tc", "wb"
speakers indicates the speaker of each word. These can be all empty strings if there is only one known speaker.
Run python predict.py <experiment> <input_file> <output_file>, which outputs the input jsonlines with predicted clusters.
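The input format described above can be produced programmatically. The following is a minimal Python 3 sketch, not code from this repository: the helper names make_example, to_jsonlines, and cluster_texts are hypothetical. The output side rests on two assumptions that this README does not state: that predictions are stored under a key named "predicted_clusters", and that each mention is an inclusive [start, end] pair of token offsets into the flattened document. Check both against the output of predict.py before relying on them.

```python
import json

# Genre codes listed above for doc_key.
GENRES = {"bc", "bn", "mz", "nw", "pt", "tc", "wb"}


def make_example(sentences, doc_key="nw", speakers=None):
    """Build one input document; `sentences` is a list of token lists."""
    if doc_key not in GENRES:
        raise ValueError("unknown genre: %r" % doc_key)
    if speakers is None:
        # Only one known speaker: use one empty string per token.
        speakers = [[""] * len(sentence) for sentence in sentences]
    if [len(s) for s in speakers] != [len(s) for s in sentences]:
        raise ValueError("speakers must align with sentences token by token")
    return {
        "clusters": [],  # left empty; only used for evaluation
        "doc_key": doc_key,
        "sentences": sentences,
        "speakers": speakers,
    }


def to_jsonlines(examples):
    """Serialize documents as one compact JSON object per line."""
    return "".join(json.dumps(example) + "\n" for example in examples)


def cluster_texts(example, key="predicted_clusters"):
    """Map predicted mentions back to text.

    ASSUMPTION: `key` and the inclusive [start, end] token-offset layout
    are guesses about the output format, not documented in this README.
    """
    tokens = [token for sentence in example["sentences"] for token in sentence]
    return [
        [" ".join(tokens[start:end + 1]) for start, end in cluster]
        for cluster in example.get(key, [])
    ]
```

Writing the string returned by to_jsonlines to a file gives an <input_file> suitable for predict.py. Each line of the <output_file> can then be parsed with json.loads and passed to cluster_texts.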

Other Quirks

The code does not use GPUs by default. Instead, it looks for the GPU environment variable, which it treats as shorthand for CUDA_VISIBLE_DEVICES.
The training runs indefinitely and needs to be terminated manually. The model generally converges at about 400k steps.
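Both quirks can be handled from a small launcher script. The sketch below is a hypothetical Python 3 wrapper, not part of this repository: build_env, build_command, and run_with_budget are illustrative names. The wall-clock budget is an assumed stand-in for stopping by hand, because the README gives convergence in steps rather than time.

```python
import os
import subprocess
import sys


def build_env(gpu=None, base_env=None):
    """Copy the environment and set GPU only when a device is requested."""
    env = dict(os.environ if base_env is None else base_env)
    if gpu is None:
        env.pop("GPU", None)  # no GPU variable: the code runs without GPUs
    else:
        env["GPU"] = str(gpu)  # read by the code as described above
    return env


def build_command(script, experiment, python=sys.executable):
    """Return e.g. [python, "train.py", "best"]."""
    return [python, script, experiment]


def run_with_budget(command, env, seconds):
    """Run `command`, terminating it if it exceeds `seconds`.

    Returns the exit code of the process. The time budget is an
    assumption of this sketch, not a feature of train.py.
    """
    process = subprocess.Popen(command, env=env)
    try:
        return process.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        process.terminate()  # ask the process to stop, as a manual stop would
        return process.wait()


# Hypothetical usage: train the "best" experiment on GPU 0 for up to 24 hours.
# run_with_budget(build_command("train.py", "best"), build_env(gpu=0), 24 * 3600)
```

Passing gpu=None leaves the GPU variable unset, which matches the default behavior of running without GPUs.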
