Labeled Span Graph Network (Under Construction)
This repository contains code and models for replicating results from the following publication:
- Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling
- Luheng He, Kenton Lee, Omer Levy and Luke Zettlemoyer
- In ACL 2018
Part of the codebase is extended from e2e-coref.
- Python 2.7
- TensorFlow 1.8.0
- pyhocon (for parsing the configurations)
- tensorflow_hub (for loading ELMo)
- sudo apt-get install tcsh (Only required for processing CoNLL05 data)
- GloVe embeddings and the srlconll scripts:
- Build kernels:
./scripts/build_custom_kernels.sh (Please adjust the script according to your OS/gcc version.)
- Download pretrained models by running
Setting up for ELMo (in progress)
- Some of our models are trained with the ELMo embeddings. We use the ELMo model loaded by tensorflow_hub.
- It is recommended to cache the ELMo embeddings to make training and validation more efficient. Instructions will be added soon.
Making Predictions with Pretrained Models
- Please see data/sample.jsonlines for the input format (JSON). Each JSON object can contain multiple sentences.
- For example, run python decoder.py conll2012_final data/sample.jsonlines sample.out to predict SRL structures.
- The output will also be in json format, with an additional array storing the SRL tuples. For example, for the following input sentences:
[["John", "told", "Pat", "to", "stop", "the", "robot", "immediately", "."], ["Pat", "refused", "."]]
The following json object
"predicted_srl": [[1, 0, 0, "ARG0"], [1, 2, 2, "ARG2"], [1, 3, 7, "ARG1"], [4, 2, 2, "ARG0"], [4, 5, 6, "ARG1"], [4, 7, 7, "ARGM-TMP"], [10, 9, 9, "ARG0"]]
contains SRL predictions for the two sentences, formatted as [predicate_position, argument_span_start, argument_end, role_label]. Token ids are counted from 0, starting at the beginning of the document (not the beginning of each sentence), and the argument end position is inclusive.
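Because the token ids are document-level, reading the tuples back requires flattening the sentences first. The following is a minimal sketch of that lookup, using the example input and output above. The helper name decode_srl and the dictionary keys are illustrative and not part of this codebase.

```python
def decode_srl(sentences, predicted_srl):
    """Map document-level SRL tuples back to the words they point at."""
    # Flatten the document so that document-level token ids index into it directly.
    tokens = [tok for sent in sentences for tok in sent]
    decoded = []
    for predicate, start, end, role in predicted_srl:
        decoded.append({
            "predicate": tokens[predicate],
            "role": role,
            # The end position is inclusive in the example, hence end + 1.
            "argument": " ".join(tokens[start:end + 1]),
        })
    return decoded


sentences = [
    ["John", "told", "Pat", "to", "stop", "the", "robot", "immediately", "."],
    ["Pat", "refused", "."],
]
predicted_srl = [
    [1, 0, 0, "ARG0"], [1, 2, 2, "ARG2"], [1, 3, 7, "ARG1"],
    [4, 2, 2, "ARG0"], [4, 5, 6, "ARG1"], [4, 7, 7, "ARGM-TMP"],
    [10, 9, 9, "ARG0"],
]

decoded = decode_srl(sentences, predicted_srl)
for item in decoded:
    print("%s %s: %s" % (item["predicate"], item["role"], item["argument"]))
```

Note that the last tuple refers to tokens 9 and 10, which lie in the second sentence. Indexing per sentence instead of per document would fail for it.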
For replicating results on CoNLL-2005 and CoNLL-2012 datasets, please follow the steps below.
The data is provided by the CoNLL-2005 Shared Task, but the original words are from the Penn Treebank dataset, which is not publicly available.
If you have the PTB corpus, you can run:
You have to follow the instructions below to get CoNLL-2012 data
CoNLL-2012, this would result in a directory called
- Experiment configurations are found in
- Choose an experiment that you would like to run, e.g.
- For a single-machine experiment, run the following two commands:
python singleton.py <experiment>
python evaluator.py <experiment>
- Results are stored in the logs directory and can be viewed via TensorBoard.
- For final evaluation of the checkpoint with the maximum dev F1, run python test_single.py <experiment> for the single-model evaluation. For example:
python test_single.py conll2012_final
- The code does not use GPUs by default. Instead, it looks for the GPU environment variable, which the code treats as shorthand for
- The evaluator should not be run on GPUs, since evaluating full documents does not fit within GPU memory constraints.
- Training runs indefinitely and must be terminated manually. The model generally converges at about 300k steps, within 48 hours.
- At test time, the code loads the entire GloVe 300D embedding file at startup, which takes a while.
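The notes above imply that the trainer and the evaluator should see different environments: the trainer may have GPU set, while the evaluator should run without it. The following is a rough sketch of how the two jobs could be prepared. The helpers build_jobs and launch are hypothetical, and the value "0" for GPU is an assumption about the expected format, so check how the code reads the variable before relying on it.

```python
import os
import subprocess


def build_jobs(experiment, gpu=None):
    """Return (command, environment) pairs for the trainer and the evaluator."""
    # Start from the current environment with GPU removed, so nothing leaks in.
    cpu_env = dict(os.environ)
    cpu_env.pop("GPU", None)

    # Only the trainer gets the GPU variable (assumed value format, e.g. "0").
    train_env = dict(cpu_env)
    if gpu is not None:
        train_env["GPU"] = str(gpu)

    train_job = (["python", "singleton.py", experiment], train_env)
    # The evaluator keeps the CPU-only environment.
    eval_job = (["python", "evaluator.py", experiment], cpu_env)
    return train_job, eval_job


def launch(jobs):
    """Start each job as a separate process; training must be stopped manually."""
    return [subprocess.Popen(cmd, env=env) for cmd, env in jobs]


train_job, eval_job = build_jobs("conll2012_final", gpu=0)
print(" ".join(train_job[0]))
print(" ".join(eval_job[0]))
```

Calling launch((train_job, eval_job)) from the repository root would start both processes. The sketch only prints the commands so that it can be inspected without starting a training run.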