# PRPN

Parsing Reading Predict Network

This repository contains the code used for the word-level language modeling and unsupervised parsing experiments in the paper Neural Language Modeling by Jointly Learning Syntax and Lexicon, originally forked from the PyTorch word-level language modeling example. If you use this code or our results in your research, we'd appreciate it if you cite our paper as follows:

```
@inproceedings{shen2018neural,
  title={Neural Language Modeling by Jointly Learning Syntax and Lexicon},
  author={Yikang Shen and Zhouhan Lin and Chin-wei Huang and Aaron Courville},
  booktitle={International Conference on Learning Representations},
  year={2018},
  url={https://openreview.net/forum?id=rkgOLb-0W},
}
```

## Software Requirements

Python 2.7, NLTK and PyTorch 0.3 are required for the current codebase.

## Steps

1. Install PyTorch 0.3 and NLTK.

2. Download the PTB data. Note that the two tasks, i.e., language modeling and unsupervised parsing, share the same model structure but require different formats of the PTB data. For language modeling we need the standard 10,000-word Penn Treebank corpus, and for parsing we need the Penn Treebank parsed data.

3. Scripts and commands

   - Language modeling:

     `python main_LM.py --cuda --tied --hard --data /path/to/your/data`

     The default setting in `main_LM.py` achieves a test perplexity of approximately 60.97 on the PTB test set.

   - Unsupervised parsing:

     `python main_UP.py --cuda --tied --hard`

     The default setting in `main_UP.py` achieves an unlabeled F1 of approximately 0.70 on the standard test set of the PTB WSJ10 subset. To visualize the parsed sentence trees in nested bracket form and evaluate the trained model, run `test_phrase_grammar.py`.
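For reference, the test perplexity reported above is the exponential of the average per-token cross-entropy loss on the test set. A minimal illustration (the loss numbers below are made up, not values from this repository):

```python
import math

def perplexity(total_nll, n_tokens):
    """Perplexity is exp of the average negative log-likelihood per token."""
    return math.exp(total_nll / n_tokens)

# Made-up numbers for illustration: 4110.5 nats of total NLL over 1000 tokens
# gives an average loss of about 4.11, i.e. a perplexity near 61.
print(perplexity(4110.5, 1000.0))
```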
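The unlabeled F1 above is computed over unlabeled constituent spans of the predicted versus gold parse trees. A self-contained sketch of that metric (this is an illustration, not the repository's exact evaluation code, which lives in `test_phrase_grammar.py`):

```python
def unlabeled_f1(gold_spans, pred_spans):
    """F1 between two sets of (start, end) constituent spans, ignoring labels."""
    gold, pred = set(gold_spans), set(pred_spans)
    overlap = len(gold & pred)
    if overlap == 0:
        return 0.0
    precision = overlap / float(len(pred))
    recall = overlap / float(len(gold))
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: the gold tree has spans {(0, 4), (0, 2), (2, 4)};
# the predicted tree recovers two of them plus one spurious span (1, 3),
# giving precision = recall = 2/3, hence F1 = 2/3.
gold = [(0, 4), (0, 2), (2, 4)]
pred = [(0, 4), (0, 2), (1, 3)]
print(unlabeled_f1(gold, pred))
```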
