An implementation of the mTreeLSTM architectures described in:

Nam Khanh Tran and Weiwei Cheng. Multiplicative Tree-Structured Long Short-Term Memory Networks for Semantic Representations. In Proceedings of the 7th Joint Conference on Lexical and Computational Semantics (*SEM-18), pages 276-286, ACL, New Orleans, USA, June 2018.
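For orientation, the sketch below shows the node update of a Child-Sum Tree-LSTM (Tai et al., 2015), the building block that the base variant corresponds to; the add, full, and multi variants in the paper further condition these transformations on relation (edge) embeddings. This is a minimal illustrative sketch written against a recent PyTorch API, not the code in this repository:

```python
import torch
import torch.nn as nn

class ChildSumTreeLSTMCell(nn.Module):
    """One node update of the Child-Sum Tree-LSTM (Tai et al., 2015),
    the base variant that the mTreeLSTM extensions build on."""

    def __init__(self, word_size, mem_size):
        super().__init__()
        self.ioux = nn.Linear(word_size, 3 * mem_size)  # i, o, u gates from the word input
        self.iouh = nn.Linear(mem_size, 3 * mem_size)   # i, o, u gates from summed child states
        self.fx = nn.Linear(word_size, mem_size)        # forget gate, input part
        self.fh = nn.Linear(mem_size, mem_size)         # forget gate, per-child hidden part

    def forward(self, x, child_h, child_c):
        # x: (word_size,); child_h, child_c: (num_children, mem_size).
        # Leaves pass empty (0, mem_size) tensors, so the sums below are zero.
        h_sum = child_h.sum(dim=0)
        i, o, u = (self.ioux(x) + self.iouh(h_sum)).chunk(3)
        i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
        # One forget gate per child, as in the Child-Sum formulation.
        f = torch.sigmoid(self.fx(x).unsqueeze(0) + self.fh(child_h))
        c = i * u + (f * child_c).sum(dim=0)
        h = o * torch.tanh(c)
        return h, c
```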
Requirements:
- PyTorch (0.3.0)
- Python3 (3.6.1)
- Java8 (for Stanford Parsers)
Download the following data:
- Stanford Sentiment Treebank (sentiment classification task)
- SICK dataset (semantic relatedness and NLI tasks)
- SNLI dataset (NLI task)
- GloVe word vectors (Common Crawl, 840B tokens)
Preprocess:
Run the script fetch_and_preprocess.sh, as described in https://github.com/stanfordnlp/treelstm, or use the pre-processed sentences here.
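The training scripts read the GloVe vectors from the --glove directory. If you need to build the embedding matrix yourself, loading the GloVe text file filtered to the task vocabulary is the usual approach. The helper below is a sketch; its name, the word-to-index vocab format, and the random initialisation for out-of-vocabulary words are assumptions, not this repository's API:

```python
import numpy as np
import torch

def load_glove(glove_path, vocab, dim=300):
    # `vocab` is assumed to map word -> row index; words missing from
    # GloVe keep a small random initialisation.
    emb = np.random.uniform(-0.05, 0.05, (len(vocab), dim)).astype('float32')
    with open(glove_path, encoding='utf-8') as f:
        for line in f:
            parts = line.rstrip().split(' ')
            word = ' '.join(parts[:-dim])  # robust to rare multi-token keys
            if word in vocab:
                emb[vocab[word]] = np.asarray(parts[-dim:], dtype='float32')
    return torch.from_numpy(emb)
```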
Natural language inference (NLI):
In this task, the model reads two sentences (a premise and a hypothesis) and outputs a judgement of entailment, contradiction, or neutral, reflecting the relationship between the meanings of the two sentences.
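Concretely, each sentence is encoded into the root vector of its tree, and the two root vectors are combined and fed to a small classifier. The sketch below uses the element-wise distance and product features of Tai et al. (2015); the exact combination used in this repository may differ:

```python
import torch
import torch.nn as nn

class PairClassifier(nn.Module):
    """Combines premise/hypothesis root vectors and predicts the NLI label."""

    def __init__(self, mem_size, hidden_size, num_classes):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * mem_size, hidden_size),
            nn.Sigmoid(),
            nn.Linear(hidden_size, num_classes),
        )

    def forward(self, h_premise, h_hypothesis):
        # h_*: (batch, mem_size) root representations of the two trees.
        features = torch.cat([torch.abs(h_premise - h_hypothesis),
                              h_premise * h_hypothesis], dim=1)
        return self.mlp(features)  # logits over {entailment, contradiction, neutral}
```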
To train models for the NLI task on the SICK dataset, run:
python nli.py --model <base|add|full|multi> --data data/sick --glove data/glove --word_size 300 --edge_size 100 \
  --mem_size 150 --hidden_size 50 --batch_size 25 --optim adam --epochs 10 --num_classes 3
To train models for the NLI task on the SNLI dataset, run:
python nli.py --model <base|add|full|multi> --data data/snli --glove data/glove --word_size 300 --edge_size 100 \
  --mem_size 100 --hidden_size 200 --batch_size 128 --optim adam --epochs 10 --num_classes 3
where:
- model: TreeLSTM variant to train
- data: path to the dataset
- glove: path to the pre-trained word embeddings
- word_size: dimension of the word embeddings
- edge_size: size of the relation embeddings
- mem_size: LSTM memory dimension
- hidden_size: size of the classifier hidden layer
- batch_size: batch size
- optim: optimizer to use
- epochs: number of training epochs
- num_classes: number of output classes
See the paper for more details on these experiments.