PyTorch re-implementation of Distance-based Self-Attention Network for Natural Language Inference.
This is an unofficial implementation.
Dataset: SNLI
Model | Valid Acc(%) | Test Acc(%) |
---|---|---|
Baseline from the paper | - | 86.3 |
Re-implementation | 86.3 | 86.0 |
Baseline from the paper (without distance mask) | - | 86.0 |
Re-implementation (without distance mask) | 86.2 | 85.7 |
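
To make the "without distance mask" rows concrete, here is a minimal sketch of what a distance mask does to attention weights. It assumes the mask adds `-alpha * |i - j|` to the attention logits before the softmax, and that this `alpha` is what the `--alpha` option of `train.py` controls; both are assumptions, not taken from this repository's code. The function names are hypothetical, and the sketch uses plain Python lists rather than tensors so it runs without dependencies. Multi-head attention and any other masks in the model are left out.

```python
import math


def distance_mask(n, alpha):
    """n x n matrix whose (i, j) entry is -alpha * |i - j| (assumed form)."""
    return [[-alpha * abs(i - j) for j in range(n)] for i in range(n)]


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(scores, alpha=0.0):
    """Add the distance mask to raw attention scores, then normalize each row.

    alpha=0.0 corresponds to the "without distance mask" setting.
    """
    mask = distance_mask(len(scores), alpha)
    return [
        softmax([s + m for s, m in zip(score_row, mask_row)])
        for score_row, mask_row in zip(scores, mask)
    ]


if __name__ == "__main__":
    # Equal raw scores, so any difference in weights comes from the mask.
    scores = [[0.0] * 4 for _ in range(4)]
    print(attention_weights(scores, alpha=0.0)[0])
    print(attention_weights(scores, alpha=1.0)[0])
```

With equal raw scores, `alpha=0.0` gives uniform weights, while a positive `alpha` shifts weight toward nearby tokens.
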
- OS: Ubuntu 16.04 LTS (64bit)
- Language: Python 3.6.6
- PyTorch: 0.4.0
Please install the following library requirements first.
nltk==3.3
tensorboardX==1.2
torch==0.4.0
torchtext==0.2.3
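
Since the results were obtained with these exact versions, it may help to check the local environment against the pins before training. The script below is a sketch that is not part of this repository; `parse_pins`, `find_mismatches`, and `installed_versions` are hypothetical helper names. It uses `importlib.metadata` where available and falls back to `pkg_resources` on older interpreters such as Python 3.6.

```python
# Pins copied from the requirements list above.
PINNED = """
nltk==3.3
tensorboardX==1.2
torch==0.4.0
torchtext==0.2.3
"""


def parse_pins(text):
    """Return {package: version} for every 'name==version' line."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" not in line:
            continue  # skip blank or unpinned lines
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return {package: (wanted, found)}; found is None if not installed."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pins.items()
        if installed.get(name) != wanted
    }


def installed_versions(names):
    """Look up installed versions, skipping packages that are missing."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Fallback for older interpreters such as Python 3.6.
        from pkg_resources import DistributionNotFound as PackageNotFoundError
        from pkg_resources import get_distribution

        def version(name):
            return get_distribution(name).version

    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            pass
    return found


if __name__ == "__main__":
    pins = parse_pins(PINNED)
    mismatches = find_mismatches(pins, installed_versions(pins))
    for name, (wanted, found) in mismatches.items():
        print("{}: wanted {}, found {}".format(name, wanted, found))
```

The script prints one line per package that is missing or installed at a different version, and nothing if the environment matches.
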
python train.py --help
usage: train.py [-h] [--batch-size BATCH_SIZE] [--data-type DATA_TYPE]
[--dropout DROPOUT] [--epoch EPOCH] [--gpu GPU]
[--learning-rate LEARNING_RATE] [--print-freq PRINT_FREQ]
[--word-dim WORD_DIM] [--num-heads NUM_HEADS] [--d-ff D_FF]
[--alpha ALPHA]
optional arguments:
-h, --help show this help message and exit
--batch-size BATCH_SIZE
--data-type DATA_TYPE
--dropout DROPOUT
--epoch EPOCH
--gpu GPU
--learning-rate LEARNING_RATE
--print-freq PRINT_FREQ
--word-dim WORD_DIM
--num-heads NUM_HEADS
--d-ff D_FF
--alpha ALPHA
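
The help output above lists option names but not their types or defaults. For orientation, here is a sketch of an `argparse` parser that would produce the same set of options. The flag names come from the help text; the types are guesses based on the names, the defaults are deliberately left unset, and `build_parser` is a hypothetical name. See `train.py` for the actual definitions.

```python
import argparse


def build_parser():
    """Parser exposing the same flags as the help output above."""
    parser = argparse.ArgumentParser(prog="train.py")
    # Types are assumptions inferred from the flag names; no defaults are set.
    parser.add_argument("--batch-size", type=int)
    parser.add_argument("--data-type", type=str)
    parser.add_argument("--dropout", type=float)
    parser.add_argument("--epoch", type=int)
    parser.add_argument("--gpu", type=int)
    parser.add_argument("--learning-rate", type=float)
    parser.add_argument("--print-freq", type=int)
    parser.add_argument("--word-dim", type=int)
    parser.add_argument("--num-heads", type=int)
    parser.add_argument("--d-ff", type=int)
    parser.add_argument("--alpha", type=float)
    return parser


if __name__ == "__main__":
    # Example values are illustrative only, not recommended settings.
    args = build_parser().parse_args(["--batch-size", "32", "--alpha", "1.5"])
    # argparse turns dashes into underscores in attribute names.
    print(args.batch_size, args.alpha, args.learning_rate)
```

Options that are not passed come back as `None` in this sketch, because no defaults are defined.
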
Note:
- Only training on the SNLI dataset is implemented.
- Dropout and layer normalization are used in this model, but the paper does not make clear how they are applied.