This release contains the code used for the paper "An Imitation Learning Approach to Unsupervised Parsing".
Code for PRPN and Gumbel Tree-LSTM is borrowed from the PRPN codebase and NYU's implementation, respectively.
Requirements:
- Python 2.7.5
- PyTorch 0.3.1
Data to download:
- Train PRPN using its original codebase
- Run the trained PRPN model on the All-NLI data to get predicted trees:
./python/prpn_util/generate_distance.sh
- Shuffle the generated training set, then split it into train/dev sets:
cat train.json | shuf > train_shuffled.json
head -n TRAIN_NUM train_shuffled.json > train_shuffled_train.json
tail -n VALID_NUM train_shuffled.json > train_shuffled_valid.json
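The commands above leave TRAIN_NUM and VALID_NUM for you to choose. If the two add up to more than the number of lines in the shuffled file, the head and tail slices overlap and dev examples leak into the training set. A minimal sketch of one way to avoid that is below. It assumes the file holds one example per line and that GNU shuf is available. The helper name split_shuffled is hypothetical and not part of this release. It derives TRAIN_NUM from the line count so the two splits are disjoint.

```shell
# Hypothetical helper (not part of this release): shuffle a line-oriented
# file and split it into disjoint train/dev files.
# Usage: split_shuffled INPUT VALID_NUM
# Assumes INPUT holds one example per line and its name ends in .json.
split_shuffled() {
    input=$1
    valid_num=$2
    prefix=${input%.json}

    # Count the examples so that TRAIN_NUM + VALID_NUM equals the total.
    total=$(wc -l < "$input")
    if [ "$valid_num" -ge "$total" ]; then
        echo "VALID_NUM must be smaller than the number of lines ($total)" >&2
        return 1
    fi
    train_num=$((total - valid_num))

    # Same three steps as above, with the counts filled in.
    shuf "$input" > "${prefix}_shuffled.json"
    head -n "$train_num" "${prefix}_shuffled.json" > "${prefix}_shuffled_train.json"
    tail -n "$valid_num" "${prefix}_shuffled.json" > "${prefix}_shuffled_valid.json"
}
```

For example, `split_shuffled train.json 5000` would hold out the last 5000 shuffled lines as the dev set; the value 5000 is only an illustration, not a setting taken from the paper.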
- Step-by-step supervised learning:
./python/sl_rl.sh
- Policy refinement:
./python/rl_ft.sh