Rewarding Coreference Resolvers for Being Consistent with World Knowledge

To appear in EMNLP 2019.


For convenience, create a symlink: cd e2e-coref && ln -s ../wiki ./wiki
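The symlink step can be made safe to re-run. The following is a minimal sketch, not part of the repository: it assumes PROJECT_HOME points at the checkout root (the folder holding e2e-coref and wiki), and it falls back to a scratch directory so the sketch can be tried without a checkout.

```shell
#!/bin/sh
set -eu
# PROJECT_HOME is assumed to be the checkout root; the mktemp fallback is only for trying the sketch.
PROJECT_HOME="${PROJECT_HOME:-$(mktemp -d)}"
# No-op in a real checkout, where both folders already exist.
mkdir -p "$PROJECT_HOME/e2e-coref" "$PROJECT_HOME/wiki"
cd "$PROJECT_HOME/e2e-coref"
# -s: symbolic link; -f and -n: replace an existing link instead of nesting a new one inside it.
ln -sfn ../wiki ./wiki
# Confirm that the link resolves to a directory.
[ -d ./wiki/. ] && echo "wiki link OK"
```

Using -sfn rather than plain -s means a second run overwrites the link instead of failing or creating wiki/wiki.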

For pre-training the coreference resolution system, OntoNotes 5.0 is required. [Download] [Create splits]

Data for training the reward models and fine-tuning the coreference resolver (place in <PROJECT_HOME>/data):

  • 2M triples for RE-Text [Train] [Dev]
  • 12M triples for RE-KG [Train] [Dev]
  • 60k triples for RE-Joint [Train] [Dev]
  • 10k Wikipedia summaries for fine-tuning [Download]

Note: If you want to make these files from scratch, follow the instructions in the triples folder.

Pre-trained models

  • Best performing reward model (RE-Distill) [Download]
  • Best performing coreference resolver (Coref-Distill) [Download]


Unzip Coref-Distill into the e2e-coref/logs folder and run GPU=x python final


Reward models

  • Download PyTorch-BigGraph embeddings (~40 GB, place in <PROJECT_HOME>/embeddings) [Download]
  • Run wiki/ to create an index of the embeddings (you need to do this only once)
  • Run reward module training with cd wiki/reward && python <dataset-name>

Coreference resolver

  • Follow e2e-coref/ to set up the environment, create ELMo embeddings, etc.
  • Run coreference pre-training with cd e2e-coref && GPU=x python <experiment>
  • Start the SLING server with python wiki/reward/
  • Change SLING_IP in wiki/reward/ to the IP address of the SLING server
  • Run coreference fine-tuning with cd e2e-coref && GPU=x python <experiment> (see e2e-coref/experiments.conf for the different configurations)
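The GPU=x prefix in the commands above is ordinary shell syntax: it sets an environment variable for that single command only. The sketch below illustrates the mechanism with a child shell standing in for the Python script. The value 0 is a placeholder GPU index, and how the scripts consume the variable is not shown here.

```shell
#!/bin/sh
# A VAR=value prefix places the variable in the environment of that one command only.
# 0 is a placeholder GPU index chosen for illustration.
seen=$(GPU=0 sh -c 'printf "%s" "$GPU"')
echo "child process saw GPU=$seen"
```

Because the assignment is scoped to one command, different runs can target different GPUs without exporting anything in the parent shell.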


  • wiki/reward/ can be used to distill the various reward models
  • e2e-coref/ can be used to save the weights of the fine-tuned coreference models so that they can be combined by setting the distill flag in the configuration file


  author    = {Rahul Aralikatte and
               Heather Lent and
               Ana Valeria Gonz{\'{a}}lez{-}Gardu{\~{n}}o and
               Daniel Hershcovich and
               Chen Qiu and
               Anders Sandholm and
               Michael Ringgaard and
               Anders S{\o}gaard},
  title     = {Rewarding Coreference Resolvers for Being Consistent with World Knowledge},
  journal   = {CoRR},
  volume    = {abs/1909.02392},
  year      = {2019},
  url       = {},
  archivePrefix = {arXiv},
  eprint    = {1909.02392},
  timestamp = {Mon, 16 Sep 2019 17:27:14 +0200},
  biburl    = {},
  bibsource = {dblp computer science bibliography}
