De-En NMT

This is transformer-based NMT, implemented by pytorch-lightning. See more detail below. https://arxiv.org/abs/1706.03762

Quick Start

python translate.py [sentence]

batch_size: the size of mini-batches (default=128)
epoch: the number of iterations training the whole batches (default=30)
embedding_dim: the dimension of embedding vector (default=64)
hidden_size: the size of hidden state (default=200)
n_layers: the number of transformer layers in both encoder and decoder (default=4)
n_head: the number of head in multi-head attention (default=8)
dropout: the ratio of nodes which will be dropped (default=.5)
lr: the learning rate (default=1e-3)
device: device (default='cuda')
early_stopping: early stopping patience, not use early stopping if 0 (default=3)
weight_sharing: whether use weightsharing or not (store true)
save_model: whether save model or not, best 3 models will be saved (store true)

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
module		module
README.md		README.md
train.py		train.py
translate.py		translate.py