End-to-end Neural Coreference Resolution
Higher-order Coreference Resolution with Coarse-to-fine Inference

Introduction

This repository contains the code for replicating results from the paper of the same name, Higher-order Coreference Resolution with Coarse-to-fine Inference (Lee et al., NAACL 2018).

Getting Started

  • Install Python (either 2 or 3) requirements: pip install -r requirements.txt
  • Download pretrained word embeddings and build custom kernels by running setup_all.sh.
    • There are 3 platform-dependent ways to build custom TensorFlow kernels. Please comment/uncomment the appropriate lines in the script.
  • Run one of the following:
    • To use the pretrained model only, run setup_pretrained.sh
    • To train your own models, run setup_training.sh
      • This assumes access to OntoNotes 5.0. Please edit the ontonotes_path variable.

Training Instructions

  • Experiment configurations are found in experiments.conf
  • Choose an experiment that you would like to run, e.g. best
  • Training: python train.py <experiment>
  • Results are stored in the logs directory and can be viewed via TensorBoard.
  • Evaluation: python evaluate.py <experiment>
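
Experiments in experiments.conf are written in HOCON, so a new experiment can inherit from an existing one and override individual settings. The entry below is a hypothetical sketch: the experiment name my_experiment and the overridden option names are illustrative, and the actual keys in the repository's experiments.conf may differ.

```
# Hypothetical experiments.conf entry (HOCON); option names are illustrative.
my_experiment = ${best} {
  ffnn_size = 150   # override a hyperparameter inherited from "best"
  log_root = logs   # where checkpoints and TensorBoard summaries go
}
```

You would then train it with python train.py my_experiment and evaluate it with python evaluate.py my_experiment.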

Demo Instructions

  • Command-line demo: python demo.py final
  • To run the demo with other experiments, replace final with your configuration name.

Batched Prediction Instructions

  • Create a file where each line is a document in the following JSON format (strip any internal newlines so each line is well-formed, single-line JSON):
{
  "clusters": [],
  "doc_key": "nw",
  "sentences": [["This", "is", "the", "first", "sentence", "."], ["This", "is", "the", "second", "."]],
  "speakers": [["spk1", "spk1", "spk1", "spk1", "spk1", "spk1"], ["spk2", "spk2", "spk2", "spk2", "spk2"]]
}
  • clusters should be left empty and is only used for evaluation purposes.
  • doc_key indicates the genre, which can be one of the following: "bc", "bn", "mz", "nw", "pt", "tc", "wb"
  • speakers indicates the speaker of each word. These may all be empty strings if there is only one known speaker.
  • Run python predict.py <experiment> <input_file> <output_file>, which outputs the input jsonlines with predicted clusters.
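
A simple way to guarantee well-formed one-line JSON is to serialize each document with json.dumps, which never emits embedded newlines. A minimal sketch (the file name input.jsonlines is just an example):

```python
import json

# One input document in the format predict.py expects.
doc = {
    "clusters": [],  # left empty; only used for evaluation
    "doc_key": "nw",  # genre identifier
    "sentences": [["This", "is", "the", "first", "sentence", "."],
                  ["This", "is", "the", "second", "."]],
    "speakers": [["spk1"] * 6, ["spk2"] * 5],
}

with open("input.jsonlines", "w") as f:
    # json.dumps produces a single line, so each document stays
    # well-formed one-line JSON; write one document per line.
    f.write(json.dumps(doc) + "\n")
```

The resulting file can be passed directly as <input_file> to predict.py.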

Other Quirks

  • It does not use GPUs by default. Instead, it looks for the GPU environment variable, which the code treats as shorthand for CUDA_VISIBLE_DEVICES.
  • The training runs indefinitely and needs to be terminated manually. The model generally converges at about 400k steps.
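
The GPU shorthand works by copying the variable's value into CUDA_VISIBLE_DEVICES before TensorFlow initializes. The helper below is an illustrative reimplementation of that behavior, not the repository's exact code (the real logic lives in util.py):

```python
import os

def set_gpus_from_env():
    # Read the GPU shorthand, e.g. GPU=0 or GPU=0,1 (empty means CPU only),
    # and export it as CUDA_VISIBLE_DEVICES so TensorFlow sees those devices.
    gpus = os.environ.get("GPU", "")
    os.environ["CUDA_VISIBLE_DEVICES"] = gpus
    return gpus
```

So a typical GPU training invocation looks like GPU=0 python train.py best.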