Yale Dependency Parser for CoNLL 2018

Environments

Python 3.6 is supported. Install TensorFlow (1.0.0 or higher) before running the training script.

GloVe

Our architecture uses pre-trained GloVe word embedding vectors. Download them with:

wget http://nlp.stanford.edu/data/glove.6B.zip

and save the archive to the sub-directory glovevector/.
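
The glove.6B.zip archive unpacks to plain-text files such as glove.6B.100d.txt, where each line holds a token followed by its space-separated vector components. As a minimal sketch for checking the download (not the loader the parser itself uses), the file can be read as follows. The function name `load_glove` and the choice of the 100-dimensional file in the usage comment are assumptions for illustration.

```python
def load_glove(path, vocab=None):
    """Read GloVe text vectors into a dict mapping word -> list of floats.

    If `vocab` is given, only words contained in it are kept, which keeps
    memory usage low when a small vocabulary is needed.
    """
    embeddings = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            # Each line is: token, then the vector components, space-separated.
            parts = line.rstrip("\n").split(" ")
            word, values = parts[0], parts[1:]
            if vocab is not None and word not in vocab:
                continue
            embeddings[word] = [float(value) for value in values]
    return embeddings


# Hypothetical usage, assuming the archive was unzipped into glovevector/:
# vectors = load_glove("glovevector/glove.6B.100d.txt", vocab={"the", "parser"})
```

Passing a `vocab` set is optional; without it the whole file is loaded into memory.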

Preprocessing

python3 scripts/preprocess.py sample_data/config_demo.json
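
The preprocessing command takes a json configuration file as its only argument. Before editing a copy of the sample file for your own data, it can help to list what the file contains. The sketch below only assumes that the top level of the file is a json object; it does not assume any particular option names, and `summarize_config` is a hypothetical helper rather than part of this repository.

```python
import json


def summarize_config(path):
    """Return {top-level key: type name of its value} for a json config file."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    if not isinstance(config, dict):
        raise ValueError("expected a json object at the top level")
    return {key: type(value).__name__ for key, value in config.items()}


# Hypothetical usage:
# for key, kind in summarize_config("sample_data/config_demo.json").items():
#     print(key, kind)
```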

Train a Parser

To train on your own data, create a new directory for your data in the conllustag format and a json file that holds the model configuration and data information. We provide a sample json file for the sample data directory. You can train a parser on the sample data with the following command:

python3 scripts/train_graph_parser.py sample_data/config_demo.json

After running this command, you should get the following files and directories in sample_data/:

| Directory/File | Description |
| --- | --- |
| checkpoint.txt | Contains information about the best model. |
| sents/ | Contains the words in the one-sentence-per-line format. |
| gold_pos/ | Contains the gold POS tags in the one-sentence-per-line format. |
| gold_stag/ | Contains the gold supertags in the one-sentence-per-line format. |
| arcs/ | Contains the gold arcs in the one-sentence-per-line format. |
| rels/ | Contains the gold rels in the one-sentence-per-line format. |
| predicted_arcs/ | Contains the predicted arcs in the one-sentence-per-line format. |
| predicted_rels/ | Contains the predicted rels in the one-sentence-per-line format. |
| Parsing_Models/ | Stores the best model. |
| conllu/sample.conllustag_stag | Contains the predicted supertags in the conllustag format. |
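
Because the gold and predicted arcs and rels are all stored one sentence per line, they can be compared token by token. The following is a rough sketch of computing unlabeled and labeled attachment scores from such files. It assumes that tokens on each line are separated by whitespace and that the gold and predicted files are aligned line by line; the function names are hypothetical, and this sketch counts every token, including punctuation.

```python
def read_lines(path):
    """Read a one-sentence-per-line file into a list of token lists."""
    with open(path, encoding="utf-8") as handle:
        return [line.split() for line in handle]


def attachment_scores(gold_arcs, pred_arcs, gold_rels, pred_rels):
    """Return (UAS, LAS) as fractions over all tokens.

    UAS counts tokens whose predicted head matches the gold head.
    LAS additionally requires the predicted relation label to match.
    """
    total = correct_arcs = correct_both = 0
    for g_arc, p_arc, g_rel, p_rel in zip(gold_arcs, pred_arcs, gold_rels, pred_rels):
        if not (len(g_arc) == len(p_arc) == len(g_rel) == len(p_rel)):
            raise ValueError("sentence length mismatch between files")
        for ga, pa, gr, pr in zip(g_arc, p_arc, g_rel, p_rel):
            total += 1
            if ga == pa:
                correct_arcs += 1
                if gr == pr:
                    correct_both += 1
    if total == 0:
        return 0.0, 0.0
    return correct_arcs / total, correct_both / total
```

Each argument is a list of sentences as returned by `read_lines`, so the four inputs would come from one file each under arcs/, predicted_arcs/, rels/, and predicted_rels/.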

Structure of the Code

| File | Description |
| --- | --- |
| `utils/preprocessing.py` | Contains tools for preprocessing, mainly for tokenizing and indexing words/tags. Gets imported to `utils/data_process_secsplit.py`. |
| `utils/data_process_secsplit.py` | Reads training and test data and tokenizes/indexes words, POS tags, stags, and characters. |
| `utils/parsing_model.py` | Contains the `Parsing_Model` class that constructs our LSTM computation graph. The class has the necessary methods for training and testing. Gets imported to `graph_parser_model.py`. For more details, see the README for utils. |
| `utils/lstm.py` | Contains TensorFlow LSTM equations. Gets imported to `utils/parsing_model.py`. |
| `graph_parser_model.py` | Contains functions that instantiate the `Parsing_Model` class and train/test a model. Gets imported to `graph_parser_main.py`. |
| `graph_parser_main.py` | Main file to run experiments. Reads model and data options. |
| `scripts/train_graph_parser.py` | Runs `graph_parser_main.py` in bash according to the json file that gets passed. |
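
To illustrate the general pattern of a wrapper script that turns a json configuration into a command line for a main script, here is a minimal sketch. It is not the code in `scripts/train_graph_parser.py`: the flat config layout, the `--key value` flag convention, and the option names in the usage comment are all assumptions made for illustration.

```python
import shlex


def build_command(config, entry_point="graph_parser_main.py"):
    """Turn a flat config dict into an argument list for the entry point.

    Boolean values become bare flags (emitted only when True); every other
    value becomes a "--key value" pair. Keys are sorted for reproducibility.
    """
    args = ["python3", entry_point]
    for key in sorted(config):
        value = config[key]
        if isinstance(value, bool):
            if value:
                args.append("--" + key)
        else:
            args.extend(["--" + key, str(value)])
    return args


# Hypothetical usage (these option names are placeholders, not real flags):
# command = build_command({"units": 64, "bidirectional": True})
# print(" ".join(shlex.quote(arg) for arg in command))
```

Building an argument list rather than a single string avoids quoting problems if the list is later passed to `subprocess.run`.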

Run a pre-trained TAG Parser

To Be Added.

Notes

If you use this tool for your research, please consider citing:

@InProceedings{Kasai&al.18,
  author =  {Jungo Kasai and Robert Frank and Pauli Xu and William Merrill and Owen Rambow},
  title =   {End-to-end Graph-based TAG Parsing with Neural Networks},
  year =    {2018},  
  booktitle =   {Proceedings of NAACL},  
  publisher =   {Association for Computational Linguistics},
}

