Labeled Span Graph Networks
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Type Name Latest commit message Commit time
Failed to load latest commit information.

Labeled Span Graph Network (Under Construction)

This repository contains code and models for replicating results from the following publication:

Part of the codebase is extended from e2e-coref.


  • Python 2.7
  • TensorFlow 1.8.0
  • pyhocon (for parsing the configurations)
  • tensorflow_hub (for loading ELMo)

Getting Started

  • sudo apt-get install tcsh (Only required for processing CoNLL05 data)
  • GloVe embeddings and the srlconll scripts:
  • Build kernels: ./scripts/ (Please make adjustments to the script according to your OS/gcc version)
  • Download pretrained models by running ./scripts/

Setting up for ELMo (in progress)

  • Some of our models are trained with the ELMo embeddings. We use the ELMo model loaded by tensorflow_hub.
  • It is recommended to cache ELMo embeddings for training and validating efficiency. Instructions will be added soon.

Making Predictions with Pretrained Models

  • Please see data/sample.jsonlines for input format (json). Each json object can contain multiple sentences.
  • For example, run python conll2012_final data/sample.jsonlines sample.out to predict SRL structures.
  • The output will also be in json format, with an additional array storing the SRL tuples. For example, for the following input sentences:

[["John", "told", "Pat", "to", "stop", "the", "robot", "immediately", "."], ["Pat", "refused", "."]]

The following json object

"predicted_srl": [[1, 0, 0, "ARG0"], [1, 2, 2, "ARG2"], [1, 3, 7, "ARG1"], [4, 2, 2, "ARG0"], [4, 5, 6, "ARG1"], [4, 7, 7, "ARGM-TMP"], [10, 9, 9, "ARG0"]]

contains SRL predictions for the two sentences, formatted as [predicate_position, argument_span_start, argument_end, role_label]. The token ids are counted starting 0 from the beginning of the document (instead of the beginning of each sentence).

CoNLL Data

For replicating results on CoNLL-2005 and CoNLL-2012 datasets, please follow the steps below.


The data is provided by: CoNLL-2005 Shared Task, but the original words are from the Penn Treebank dataset, which is not publicly available. If you have the PTB corpus, you can run:
./scripts/ /path/to/ptb/


You have to follow the instructions below to get CoNLL-2012 data CoNLL-2012, this would result in a directory called /path/to/conll-formatted-ontonotes-5.0. Run:
./scripts/ /path/to/conll-formatted-ontonotes-5.0

Training Instructions

  • Experiment configurations are found in experiments.conf
  • Choose an experiment that you would like to run, e.g. conll2012_best
  • For a single-machine experiment, run the following two commands:
    • python <experiment>
    • python <experiment>
  • Results are stored in the logs directory and can be viewed via TensorBoard.
  • For final evaluation of the checkpoint with the maximum dev F1:
    • Run python <experiment> for the single-model evaluation. For example: python conll2012_final

Other Quirks

  • It does not use GPUs by default. Instead, it looks for the GPU environment variable, which the code treats as shorthand for CUDA_VISIBLE_DEVICES.
  • The evaluator should not be run on GPUs, since evaluating full documents does not fit within GPU memory constraints.
  • The training runs indefinitely and needs to be terminated manually. The model generally converges at about 300k steps and within 48 hours.
  • At test time, the code loads the entire GloVe 300D embedding file in the beginning, which would take a while.