Top-down Tree LSTM (NAACL 2016)

Top-down Tree Long Short-Term Memory Networks

A Torch implementation of the Top-down TreeLSTM described in the following paper.

Top-down Tree Long Short-Term Memory Networks

Xingxing Zhang, Liang Lu and Mirella Lapata. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2016).

  author    = {Zhang, Xingxing  and  Lu, Liang  and  Lapata, Mirella},
  title     = {Top-down Tree Long Short-Term Memory Networks},
  booktitle = {Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
  month     = {June},
  year      = {2016},
  address   = {San Diego, California},
  publisher = {Association for Computational Linguistics},
  pages     = {310--320},
  url       = {}

Implemented Models

  • TreeLSTM
  • TreeLSTM-NCE
  • LdTreeLSTM
  • LdTreeLSTM-NCE

TreeLSTM and LdTreeLSTM (see the paper above for details) are trained with negative log-likelihood (NLL), while TreeLSTM-NCE and LdTreeLSTM-NCE are trained with Noise Contrastive Estimation (NCE) (see this paper and also this paper for details).

Note that in the experiments, the normalization term Z of NCE is learned automatically. The implemented NCE module also supports keeping Z fixed.
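To make the role of Z concrete, here is a minimal Python sketch of the standard NCE objective for one target word and k sampled noise words. It is not the repository's Torch module: the function name, its arguments and the use of a single scalar log_z are assumptions made for illustration. Passing a constant log_z corresponds to keeping Z fixed, while treating it as a trainable parameter corresponds to learning it.

```python
import math

def nce_loss(target_score, target_noise_prob, noise_scores, noise_probs, log_z=0.0):
    """NCE loss for one target word and k noise words (hypothetical helper).

    Scores are unnormalized log-scores s(w, context); log_z is log Z,
    either a learned parameter or a fixed constant.
    """
    k = len(noise_scores)

    def prob_from_data(score, noise_prob):
        # P(D=1 | w) = p_model(w) / (p_model(w) + k * q(w)), p_model = exp(s) / Z
        p_model = math.exp(score - log_z)
        return p_model / (p_model + k * noise_prob)

    # The target word should be classified as coming from the data ...
    loss = -math.log(prob_from_data(target_score, target_noise_prob))
    # ... and every noise word as coming from the noise distribution q.
    for score, q in zip(noise_scores, noise_probs):
        loss -= math.log(1.0 - prob_from_data(score, q))
    return loss

# A model that scores the target high and the noise words low gets a small loss.
good = nce_loss(5.0, 0.01, [-5.0, -5.0], [0.1, 0.1])
bad = nce_loss(-5.0, 0.01, [5.0, 5.0], [0.1, 0.1])
```

The sketch works on plain floats and does no numerical stabilization; a real implementation would operate on batched tensors.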


Torch can be installed with the instructions here. You also need to install some torch components.

luarocks install nn
luarocks install nngraph
luarocks install cutorch
luarocks install cunn

You may find this document useful when installing torch-hdf5 (DON'T use luarocks).

Please also note that to run the code, you need to use an old version of Torch with the instructions here.

Language Modeling (MSR Sentence Completion)

Pre-trained models (TreeLSTM-400 and LdTreeLSTM-400) are available


First, parse the dataset into dependency trees using the Stanford CoreNLP toolkit. The output should look like this:

SILAP10.TXT#0	det(Etext-4, The-1) nn(Etext-4, Project-2) nn(Etext-4, Gutenberg-3) root(ROOT-0, Etext-4) prep(Etext-4, of-5) det(Rise-7, The-6) pobj(of-5, Rise-7) prep(Rise-7, of-8) nn(Lapham-10, Silas-9) pobj(of-8, Lapham-10) prep(Etext-4, by-11) nn(Howells-14, William-12) nn(Howells-14, Dean-13) pobj(by-11, Howells-14) det(RISE-16, THE-15) dep(Etext-4, RISE-16) prep(RISE-16, OF-17) nn(LAPHAM-19, SILAS-18) pobj(OF-17, LAPHAM-19) prep(RISE-16, by-20) nn(Howells-23, William-21) nn(Howells-23, Dean-22) pobj(by-20, Howells-23) npadvmod(Howells-23, I-24) punct(Etext-4, .-25)
SILAP10.TXT#1	advmod(went-4, WHEN-1) nn(Hubbard-3, Bartley-2) nsubj(went-4, Hubbard-3) advcl(received-40, went-4) aux(interview-6, to-5) xcomp(went-4, interview-6) nn(Lapham-8, Silas-7) dobj(interview-6, Lapham-8) prep(interview-6, for-9) det(Men-13, the-10) punct(Men-13, ``-11) amod(Men-13, Solid-12) pobj(for-9, Men-13) prep(Men-13, of-14) pobj(of-14, Boston-15) punct(Men-13, ''-16) dep(Men-13, series-17) punct(series-17, ,-18) dobj(undertook-21, which-19) nsubj(undertook-21, he-20) rcmod(series-17, undertook-21) aux(finish-23, to-22) xcomp(undertook-21, finish-23) prt(finish-23, up-24) prep(finish-23, in-25) det(Events-27, The-26) pobj(in-25, Events-27) punct(received-40, ,-28) mark(replaced-31, after-29) nsubj(replaced-31, he-30) advcl(received-40, replaced-31) poss(projector-34, their-32) amod(projector-34, original-33) dobj(replaced-31, projector-34) prep(replaced-31, on-35) det(newspaper-37, that-36) pobj(on-35, newspaper-37) punct(received-40, ,-38) nsubj(received-40, Lapham-39) root(ROOT-0, received-40) dobj(received-40, him-41) prep(received-40, in-42) poss(office-45, his-43) amod(office-45, private-44) pobj(in-42, office-45) prep(received-40, by-46) amod(appointment-48, previous-47) pobj(by-46, appointment-48) punct(received-40, .-49)

Each line is a sentence (format: label \t dependency tuples), where SILAP10.TXT#0 is the label for the sentence (the label can be any string; its value does not matter).
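As an illustration of this line format, the following Python sketch splits a line into its label and a list of (relation, head, head index, dependent, dependent index) tuples. It is not part of the repository's preprocessing scripts; the function name and the regular expression are assumptions based only on the sample lines above.

```python
import re

# One tuple looks like relation(head-index, dependent-index).
# Tokens may be punctuation such as "," or ".", hence the permissive \S+.
TUPLE_RE = re.compile(r"([^\s()]+)\((\S+)-(\d+), (\S+)-(\d+)\)")

def parse_line(line):
    """Split 'label<TAB>tuples' into the label and a list of dependency tuples."""
    label, _, tuples = line.rstrip("\n").partition("\t")
    deps = []
    for rel, head, head_idx, dep, dep_idx in TUPLE_RE.findall(tuples):
        deps.append((rel, head, int(head_idx), dep, int(dep_idx)))
    return label, deps

sample = "SILAP10.TXT#0\tdet(Etext-4, The-1) root(ROOT-0, Etext-4) punct(Etext-4, .-25)"
label, deps = parse_line(sample)
```

Here deps[1] is the root relation, whose head is the artificial ROOT token with index 0.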

Dataset after the preprocessing above can be downloaded here.

Then, convert the dependency tree dataset into HDF5 format and sort it so that the sentences in each batch have similar lengths. Sorting speeds up training and is a common strategy when training RNNs and other sequence-based models.
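The idea behind length-based sorting can be sketched in a few lines of Python. This is a generic illustration, not the behaviour of the repository's conversion scripts or of their --sort switch; make_batches and its arguments are hypothetical names.

```python
def make_batches(sentences, batch_size):
    """Sort sentences by length, then cut the sorted list into batches.

    Sentences of similar length end up in the same batch, so little
    computation is wasted on padding.
    """
    ordered = sorted(sentences, key=len)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]

sentences = [s.split() for s in ["a b c d e", "a", "a b c d", "a b"]]
batches = make_batches(sentences, batch_size=2)
lengths = [[len(s) for s in batch] for batch in batches]
```

With this toy input, the first batch holds the sentences of length 1 and 2 and the second batch those of length 4 and 5.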

Create Dataset for TreeLSTM

cd scripts

Note that the program will crash when running ./ You can either ignore the crash or use the --sort 0 switch instead of --sort 20.

Create Dataset for LdTreeLSTM

cd scripts

Alternatively, you can contact the first author to request the dataset after preprocessing.

Training and Evaluation

Basically it is just one command.

cd experiments/msr
# to run TreeLSTM with hidden size 400
# to run LdTreeLSTM with hidden size 400

But don't forget to specify where your code, your dataset and so on are located by modifying or

# where is your code? (you should use absolute path)
# where is your dataset (you should use absolute path)
# label for this model

# where is your testset (you should use absolute path); this will only be used in evaluation

Dependency Parsing Reranking


For TreeLSTM

cd scripts

For LdTreeLSTM

cd scripts

Train and Evaluate Dependency Reranking Models

Training a TreeLSTM and training an LdTreeLSTM are quite similar. The following describes training a TreeLSTM.

cd experiments/depparse

Then, you will get a trained TreeLSTM. We can use this TreeLSTM to rerank the top K dependency trees produced by the second-order MSTParser.

The following script will use the trained dependency model to rerank the top 20 dependency trees from MSTParser on the validation set. The script will try different values of K and choose the one that gives the best UAS.


Given the K obtained from the validation set, we can get the reranking performance on the test set by using the following script.
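The block below is not the repository's script. It is a rough Python sketch of the selection procedure described in this section, under stated assumptions: each sentence comes with a K-best list of (head indices, model score) pairs in the parser's original order, the tree with the highest model score among the first K is selected, and UAS is the fraction of tokens whose predicted head matches the gold head. All function names and data structures are hypothetical, and the repository may combine scores differently.

```python
def uas(predicted_heads, gold_heads):
    """Unlabeled attachment score over a list of sentences."""
    correct = total = 0
    for pred, gold in zip(predicted_heads, gold_heads):
        correct += sum(p == g for p, g in zip(pred, gold))
        total += len(gold)
    return correct / total

def rerank(candidates, k):
    """Return the heads of the highest-scoring tree among the first k candidates."""
    heads, _ = max(candidates[:k], key=lambda c: c[1])
    return heads

def choose_k(all_candidates, gold_heads, max_k=20):
    """Try K = 1..max_k and keep the K with the best UAS."""
    best_k, best_uas = 1, -1.0
    for k in range(1, max_k + 1):
        preds = [rerank(cands, k) for cands in all_candidates]
        score = uas(preds, gold_heads)
        if score > best_uas:
            best_k, best_uas = k, score
    return best_k, best_uas

# Toy validation set: one two-word sentence with three candidate trees.
valid_candidates = [[([0, 1], -4.0), ([2, 0], -2.0), ([1, 1], -1.0)]]
valid_gold = [[2, 0]]
best_k, best_uas = choose_k(valid_candidates, valid_gold, max_k=3)
```

On the test set, the same rerank function would then be applied once with the chosen best_k, followed by a single call to uas.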


Dependency Tree Generation

How will we generate dependency trees? (For details, see Section 3.4 of the paper.)

  • Run the Language Modeling experiment or the dependency parsing experiment to get a trained TreeLSTM or LdTreeLSTM
  • Generate training data for the four classifiers (Add-Left, Add-Right, Add-Nx-Left, Add-Nx-Right)
  • Train Add-Left, Add-Right, Add-Nx-Left and Add-Nx-Right
  • Generate dependency trees with a trained TreeLSTM (or LdTreeLSTM) and the four classifiers

Generate Training data

Go to sampler.lua and run the following code

-- model_1.0.w200.t7 is the trained TreeLSTM
-- penn_wsj.conllx.sort.h5 is the dataset for the trained TreeLSTM
-- eot.penn_wsj.conllx.sort.h5 is the output dataset for the four classifiers

Train the Four Classifiers

Use train_mlp.lua

$ th train_mlp.lua -h
Usage: /afs/ [options] 
====== MLP v 1.0 ======

  --seed        random seed [123]
  --useGPU      use gpu [false]
  --snhids      string hidden sizes for each layer [400,300,300,2]
  --activ       options: tanh, relu [tanh]
  --dropout     dropout rate (dropping) [0]
  --maxEpoch    max number of epochs [10]
  --dataset     dataset [/disk/scratch/XingxingZhang/treelstm/dataset/depparse/eot.penn_wsj.conllx.sort.h5]
  --ftype        [|x|oe|]
  --ytype        [1]
  --batchSize    [256]
  --lr           [0.01]
  --optimMethod options: SGD, AdaGrad [AdaGrad]
  --save        save path [model.t7]

Note that --ytype 1, 2, 3 and 4 correspond to the four classifiers. Here is a sample script:

ID=`./ --id-to-hog 2`
echo $ID
if [ $ID -eq -1 ]; then
    echo "no gpu is free"
    exit 1
fi

echo $curdir
echo $codedir

cd $codedir
CUDA_VISIBLE_DEVICES=$ID th train_mlp.lua --useGPU \
    --activ relu --dropout 0.5 --lr $lr --maxEpoch 10 \
    --snhids "400,300,300,2" --ftype "|x|oe|" --ytype 1 \
    --save $curdir/$model | tee $curdir/$log

cd $curdir

./ --free $ID

Generation by Sampling

Go to sampler.lua and run the following code. The code will output dependency trees in LaTeX format.

-- model_1.0.w200.t7: trained TreeLSTM
-- trained classifiers, note that model.yt1.x.oe.t7, model.yt2.x.oe.t7, model.yt3.x.oe.t7 and model.yt4.x.oe.t7 must all exist
    's100.txt', -- output dependency trees
    1,     -- rand seed
    100)   -- number of tree samples