This is a PyTorch implementation of an Attention-over-Attention model using a Transformer encoder. The original model was proposed by Cui et al. (paper).
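The core of the model is the attention-over-attention mechanism from Cui et al., applied here on top of Transformer-encoded sequences rather than the recurrent encoder of the original paper. Below is a minimal sketch of that computation, assuming already-encoded document and query tensors; the function name and shapes are illustrative, not this repo's exact API:

```python
import torch
import torch.nn.functional as F

def attention_over_attention(doc_enc, query_enc):
    """Attention-over-attention (Cui et al.) over encoded sequences.

    doc_enc:   (batch, doc_len, hidden)   encoded document
    query_enc: (batch, query_len, hidden) encoded query
    Returns attention scores over document positions, (batch, doc_len).
    """
    # Pairwise matching matrix: M[b, i, j] = doc_i . query_j
    M = torch.bmm(doc_enc, query_enc.transpose(1, 2))  # (batch, doc_len, query_len)

    # Column-wise softmax: a document-level attention per query word
    alpha = F.softmax(M, dim=1)

    # Row-wise softmax gives a query-level attention per document word;
    # averaging over document positions yields one weight per query word
    beta = F.softmax(M, dim=2).mean(dim=1)              # (batch, query_len)

    # "Attention over attention": weight the column attentions by beta
    return torch.bmm(alpha, beta.unsqueeze(2)).squeeze(2)  # (batch, doc_len)
```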
- PyTorch with CUDA
- Python 3.6+
- NLTK (with punkt data)
This implementation uses the CNN dataset. Make sure the data files (train.txt, dev.txt, test.txt) are present in the data/cnn directory.
To preprocess the data:
python preprocess_cnn.py
This will generate the dictionary (dict.pt) from all words appearing in the dataset and vectorize all splits (train.txt.pt, dev.txt.pt, test.txt.pt).
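Roughly, the preprocessing amounts to the following (a sketch assuming NLTK punkt tokenization and torch.save serialization; the exact token handling and special symbols in preprocess_cnn.py may differ):

```python
import torch
from nltk.tokenize import word_tokenize  # requires the punkt data

def build_dict(paths, unk="<unk>", pad="<pad>"):
    """Collect every word appearing in the dataset into an index dictionary."""
    word2idx = {pad: 0, unk: 1}
    for path in paths:
        with open(path, encoding="utf-8") as f:
            for line in f:
                for tok in word_tokenize(line.strip()):
                    word2idx.setdefault(tok, len(word2idx))
    return word2idx

def vectorize(path, word2idx, unk="<unk>"):
    """Map each line to a LongTensor of word indices."""
    data = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            idxs = [word2idx.get(t, word2idx[unk]) for t in word_tokenize(line.strip())]
            data.append(torch.tensor(idxs, dtype=torch.long))
    return data

word2idx = build_dict(["data/cnn/train.txt", "data/cnn/dev.txt", "data/cnn/test.txt"])
torch.save(word2idx, "data/cnn/dict.pt")
for split in ("train", "dev", "test"):
    torch.save(vectorize(f"data/cnn/{split}.txt", word2idx), f"data/cnn/{split}.txt.pt")
```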
Below is an example of training a model; set the parameters as you like.
python train.py -traindata data/cnn/train.txt.pt -validdata data/cnn/dev.txt.pt -dict data/cnn/dict.pt \
-save_model model_cnn -hidden_size 384 -embed_size 384 -batch_size 64 -dropout 0.1 \
-epochs 13 -learning_rate 0.001 -weight_decay 0.0001 -gpu 0 -log_interval 50
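For orientation, here is how those flags typically map onto PyTorch objects; this is a toy sketch with a stand-in model and random data, not the actual contents of train.py:

```python
import torch
import torch.nn as nn

# Toy stand-ins: sizes mirror the flags above (embed/hidden 384, dropout 0.1,
# batch_size 64, epochs 13, lr 0.001, weight_decay 0.0001); the real model
# and data loading live in this repo's train.py.
vocab_size = 1000  # illustrative
model = nn.Sequential(
    nn.Embedding(vocab_size, 384),
    nn.Dropout(0.1),
    nn.Linear(384, vocab_size),
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.0001)
criterion = nn.CrossEntropyLoss()

for epoch in range(13):
    tokens = torch.randint(0, vocab_size, (64, 20))  # fake batch of token ids
    target = torch.randint(0, vocab_size, (64,))     # fake answer ids
    logits = model(tokens).mean(dim=1)               # (batch, vocab_size)
    loss = criterion(logits, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```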
After each epoch, a checkpoint is saved. To resume training from a checkpoint:
python train.py -train_from model_cnn/xxx_model_xxx_epoch_x.pt
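Under the hood this follows the usual PyTorch checkpoint pattern, sketched below; the checkpoint field names are assumptions, so check train.py for the exact layout:

```python
import os
import torch
import torch.nn as nn

model = nn.Linear(384, 384)  # stand-in for the real model
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.0001)
os.makedirs("model_cnn", exist_ok=True)

def save_checkpoint(model, optimizer, epoch, path):
    # Persist everything needed to continue training, not just the weights
    torch.save({
        "epoch": epoch,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
    }, path)

def load_checkpoint(model, optimizer, path):
    # Restore model and optimizer state; training resumes at the next epoch
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt["epoch"] + 1

save_checkpoint(model, optimizer, 1, "model_cnn/example_checkpoint.pt")
start_epoch = load_checkpoint(model, optimizer, "model_cnn/example_checkpoint.pt")
```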
To evaluate a trained model on the test set:
python test.py -testdata data/cnn/test.txt.pt -dict data/cnn/dict.pt -out result.txt -model model_cnn/xx_checkpoint_epochxx.pt
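To score the predictions, something along these lines works, assuming result.txt contains one predicted answer per line and you have a matching gold-answer file (neither format is specified here, so treat this purely as a sketch):

```python
# Purely hypothetical helper: neither the result.txt format nor a gold-answer
# file is specified by this repo, so adapt the paths and parsing as needed.
def accuracy(pred_path, gold_path):
    with open(pred_path, encoding="utf-8") as p, open(gold_path, encoding="utf-8") as g:
        pairs = [(a.strip(), b.strip()) for a, b in zip(p, g)]
    return sum(a == b for a, b in pairs) / max(len(pairs), 1)

print(accuracy("result.txt", "data/cnn/test_answers.txt"))  # gold path is made up
```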