Hierarchical BiLSTM max pooling architecture for NLI

Natural Language Inference with Hierarchical BiLSTM Max Pooling Architecture (HBMP)

Aarne Talman, Anssi Yli-Jyrä and Jörg Tiedemann. 2018. Natural Language Inference with Hierarchical BiLSTM Max Pooling Architecture

Abstract: Recurrent neural networks have proven to be very effective for natural language inference tasks. We build on top of one such model, namely BiLSTM with max pooling, and show that adding a hierarchy of BiLSTM and max pooling layers yields state of the art results for the SNLI sentence encoding-based models and the SciTail dataset, as well as provides strong results for the MultiNLI dataset. We also show that our sentence embeddings can be utilized in a wide variety of transfer learning tasks, outperforming InferSent on 7 out of 10 and SkipThought on 8 out of 9 SentEval sentence embedding evaluation tasks. Furthermore, our model beats the InferSent model in 8 out of 10 recently published SentEval probing tasks designed to evaluate sentence embeddings' ability to capture some of the important linguistic properties of sentences.
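
The abstract describes a hierarchy of BiLSTM and max pooling layers. As a rough illustration of the pooling part only, the sketch below takes hidden states that some stack of BiLSTM layers has already produced, max-pools each layer over time, and concatenates the pooled vectors into one sentence embedding. The concatenation step, the function names (`max_pool_over_time`, `hierarchical_embedding`) and the toy numbers are assumptions made for this example, not code from this repository; see the paper and classifier.py for the actual wiring.

```python
from typing import List

Vector = List[float]


def max_pool_over_time(hidden_states: List[Vector]) -> Vector:
    """Element-wise maximum over the time axis of one layer's hidden states."""
    if not hidden_states:
        raise ValueError("need at least one time step")
    # zip(*...) groups the values of each hidden unit across all time steps.
    return [max(column) for column in zip(*hidden_states)]


def hierarchical_embedding(layers: List[List[Vector]]) -> Vector:
    """Concatenate the max-pooled vector of every layer (assumed combination)."""
    embedding: Vector = []
    for hidden_states in layers:
        embedding.extend(max_pool_over_time(hidden_states))
    return embedding


# Toy input: 2 layers, 3 time steps, 2 hidden units per step (made-up numbers).
toy_layers = [
    [[0.1, -0.4], [0.7, 0.2], [-0.3, 0.5]],
    [[0.0, 0.9], [0.6, -0.1], [0.2, 0.3]],
]
sentence_embedding = hierarchical_embedding(toy_layers)
print(sentence_embedding)  # [0.7, 0.5, 0.6, 0.9]
```

Under this assumption the embedding size grows linearly with the number of layers, since every layer contributes one pooled vector.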

Key Results

Key NLI results

  • SNLI: 86.6% (600D model)
  • SciTail: 86.0% (600D model)

SentEval results

Results for the SentEval sentence embedding evaluation library.

Model        MR    CR    SUBJ  MPQA  SST   TREC  MRPC       SICK-R  SICK-E  STS14
InferSent    81.1  86.3  92.4  90.2  84.6  88.2  76.2/83.1  0.884   86.3    .70/.67
SkipThought  79.4  83.1  93.7  89.3  82.9  88.4  -          0.858   79.5    .44/.45
600D HBMP    81.5  86.4  92.7  89.8  83.6  86.4  74.6/82.0  0.876   85.3    .70/.66
1200D HBMP   81.7  87.0  93.7  90.3  84.0  88.8  76.7/83.4  0.876   84.7    .71/.68

SentEval probing task results

Model       SentLen  WC    TreeDepth  TopConst  BShift  Tense  SubjNum  ObjNum  SOMO  CoordInv
InferSent   71.7     87.3  41.6       70.5      65.1    86.7   80.7     80.3    62.1  66.8
600D HBMP   75.9     84.1  42.9       76.6      64.3    86.2   83.7     79.3    58.9  68.5
1200D HBMP  75.0     85.3  43.8       77.2      65.6    88.0   87.0     81.8    59.0  70.8
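
The win counts quoted in the abstract can be checked against the two tables above. The sketch below copies the InferSent, SkipThought and 1200D HBMP rows and counts the tasks on which HBMP scores strictly higher. Two choices here are assumptions of this example rather than something the README states: for the two-number columns (MRPC, STS14) only the first number is compared, and ties or missing scores are not counted as wins.

```python
from typing import List, Optional, Tuple

Scores = List[Optional[float]]

# Column order: MR, CR, SUBJ, MPQA, SST, TREC, MRPC, SICK-R, SICK-E, STS14.
# None marks a score that the table does not report.
TRANSFER = {
    "InferSent":   [81.1, 86.3, 92.4, 90.2, 84.6, 88.2, 76.2, 0.884, 86.3, 0.70],
    "SkipThought": [79.4, 83.1, 93.7, 89.3, 82.9, 88.4, None, 0.858, 79.5, 0.44],
    "1200D HBMP":  [81.7, 87.0, 93.7, 90.3, 84.0, 88.8, 76.7, 0.876, 84.7, 0.71],
}

# Column order: SentLen, WC, TreeDepth, TopConst, BShift, Tense, SubjNum,
# ObjNum, SOMO, CoordInv.
PROBING = {
    "InferSent":  [71.7, 87.3, 41.6, 70.5, 65.1, 86.7, 80.7, 80.3, 62.1, 66.8],
    "1200D HBMP": [75.0, 85.3, 43.8, 77.2, 65.6, 88.0, 87.0, 81.8, 59.0, 70.8],
}


def count_wins(model: Scores, baseline: Scores) -> Tuple[int, int]:
    """Return (wins, comparable tasks); ties and missing scores are not wins."""
    pairs = [(m, b) for m, b in zip(model, baseline)
             if m is not None and b is not None]
    wins = sum(1 for m, b in pairs if m > b)
    return wins, len(pairs)


hbmp = "1200D HBMP"
print(count_wins(TRANSFER[hbmp], TRANSFER["InferSent"]))    # (7, 10)
print(count_wins(TRANSFER[hbmp], TRANSFER["SkipThought"]))  # (8, 9)
print(count_wins(PROBING[hbmp], PROBING["InferSent"]))      # (8, 10)
```

With these rules the 1200D HBMP row reproduces the 7 out of 10, 8 out of 9 and 8 out of 10 figures from the abstract; the SUBJ score is tied with SkipThought and therefore not counted.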

Instructions

To replicate the results of our paper, follow the steps below.

Install dependencies

The following dependencies are required (versions used in brackets):

  • Python (3.5.3)
  • PyTorch (0.3.1)
  • NumPy (1.14.3)
  • Torchtext (for preprocessing) (0.2.1)
  • spaCy (for tokenization) (2.0.11)

For spaCy you also need to download the English model:

python -m spacy download en

Download and prepare the datasets

./download_data.sh

This downloads the datasets and word embeddings needed for training.

Train and test HBMP

Run the train_hbmp.sh script to reproduce the NLI results for the HBMP model:

./train_hbmp.sh

Default settings for the SNLI dataset are as follows:

python3 train.py \
  --epochs 20 \
  --batch_size 64 \
  --corpus snli \
  --encoder_type HBMP \
  --activation leakyrelu \
  --optimizer adam \
  --word_embedding glove.840B.300d \
  --embed_dim 300 \
  --fc_dim 600 \
  --hidden_dim 600 \
  --layers 1 \
  --dropout 0.1 \
  --learning_rate 0.0005 \
  --lr_patience 1 \
  --lr_decay 0.99 \
  --lr_reduction_factor 0.2 \
  --weight_decay 0 \
  --early_stopping_patience 3 \
  --save_path results \
  --seed 1234

To reproduce the results for the other datasets, change the --corpus option to one of the following: breaking_nli, multinli_matched, multinli_mismatched, scitail, all_nli.
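
If you want to launch several corpora in a row, the commands can be generated rather than edited by hand. The following is a minimal sketch, assuming the default SNLI hyperparameters listed above are reused unchanged for every corpus (the README only says to change --corpus). The helper `build_command` and the `HBMP_DEFAULTS` dictionary are hypothetical names introduced for this example; the flags and values are copied from the command above.

```python
import shlex
from typing import Dict, List, Optional

# Default SNLI settings, copied from the train.py command shown above.
HBMP_DEFAULTS = {
    "epochs": 20, "batch_size": 64, "corpus": "snli",
    "encoder_type": "HBMP", "activation": "leakyrelu", "optimizer": "adam",
    "word_embedding": "glove.840B.300d", "embed_dim": 300, "fc_dim": 600,
    "hidden_dim": 600, "layers": 1, "dropout": 0.1, "learning_rate": 0.0005,
    "lr_patience": 1, "lr_decay": 0.99, "lr_reduction_factor": 0.2,
    "weight_decay": 0, "early_stopping_patience": 3,
    "save_path": "results", "seed": 1234,
}

# Corpus names accepted by --corpus according to this README.
CORPORA = ["snli", "breaking_nli", "multinli_matched",
           "multinli_mismatched", "scitail", "all_nli"]


def build_command(corpus: str,
                  overrides: Optional[Dict[str, object]] = None) -> List[str]:
    """Build the train.py argument list for one corpus (hypothetical helper)."""
    if corpus not in CORPORA:
        raise ValueError("unknown corpus: " + corpus)
    settings = dict(HBMP_DEFAULTS, corpus=corpus)
    settings.update(overrides or {})
    command = ["python3", "train.py"]
    for flag, value in settings.items():
        command += ["--" + flag, str(value)]
    return command


for corpus in CORPORA:
    # Print a copy-pasteable shell line; nothing is executed here.
    print(" ".join(shlex.quote(part) for part in build_command(corpus)))
```

To actually start a run, pass the returned list to `subprocess.run(command, check=True)` from the repository root.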

In our paper, some of the results for the InferSent model were obtained using our own implementation of the model. To train the InferSent model with our implementation, use the train_infersent.sh script. See the paper for more details.

python3 train.py \
  --epochs 20 \
  --batch_size 64 \
  --corpus snli \
  --encoder_type BiLSTMMaxPoolEncoder \
  --activation tanh \
  --optimizer sgd \
  --word_embedding glove.840B.300d \
  --embed_dim 300 \
  --fc_dim 512 \
  --hidden_dim 2048 \
  --layers 1 \
  --dropout 0 \
  --learning_rate 0.1 \
  --lr_patience 1 \
  --lr_decay 0.99 \
  --lr_reduction_factor 0.2 \
  --save_path results \
  --seed 1234

References

Please cite our paper if you find this code useful.

[1] Aarne Talman, Anssi Yli-Jyrä and Jörg Tiedemann. 2018. Natural Language Inference with Hierarchical BiLSTM Max Pooling Architecture

@article{talman2018hbmp,
  title={Natural Language Inference with Hierarchical BiLSTM Max Pooling Architecture},
  author={Talman, Aarne and Yli-Jyr\"a, Anssi and Tiedemann, J\"org},
  journal={arXiv preprint arXiv:1808.08762},
  year={2018}
}