Attention is all you need: A PyTorch Implementation

This is a PyTorch implementation of the Transformer model in "Attention is All You Need".

The paper reports state-of-the-art performance on the WMT 2014 English-to-German translation task. (2017/06/12)

The Transformer is a novel sequence-to-sequence framework that relies on the self-attention mechanism instead of convolution operations or recurrent structures.

To learn more about the self-attention mechanism, you could read "A Structured Self-attentive Sentence Embedding".
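
As a rough illustration of the idea, the core operation in the paper is scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V. The sketch below is not taken from this repository: it is written in plain Python so it runs without PyTorch, the function names are made up for illustration, and it leaves out the learned projections, multiple heads, and masking that a full Transformer layer uses.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for softmax(Q K^T / sqrt(d_k)) V.

    queries: list of n_q vectors of size d_k
    keys:    list of n_k vectors of size d_k
    values:  list of n_k vectors of size d_v
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        attn = softmax(scores)
        # Weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(attn, values)) for j in range(d_v)]
        weights.append(attn)
        outputs.append(out)
    return outputs, weights


# Self-attention: queries, keys and values all come from the same sequence.
# (The learned projections W_Q, W_K, W_V are omitted in this sketch.)
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = scaled_dot_product_attention(tokens, tokens, tokens)
```

Each row of `weights` sums to 1 and says how much one position attends to every position in the sequence, which is what lets the model relate distant tokens without recurrence or convolution.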

The project is still a work in progress and currently supports only training.

Translation will be available soon.

Usage

0) Prepare the data

python preprocess.py -train_src <train.src.txt> -train_tgt <train.tgt.txt> -valid_src <valid.src.txt> -valid_tgt <valid.tgt.txt> -output <output.pt>
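
The internals of preprocess.py are not described here. As a minimal sketch of what a preprocessing step of this kind usually does, assuming each text file holds one whitespace-tokenized sentence per line and that source and target files are aligned line by line, the code below builds a vocabulary and converts sentences to index lists. The special token strings and the function names are hypothetical and may not match the script.

```python
from collections import Counter

# Hypothetical special tokens; the names used by preprocess.py may differ.
PAD, UNK, BOS, EOS = "<blank>", "<unk>", "<s>", "</s>"


def build_vocab(sentences, min_count=1):
    """Map each word seen at least min_count times to an integer index."""
    counts = Counter(word for sent in sentences for word in sent)
    word2idx = {PAD: 0, UNK: 1, BOS: 2, EOS: 3}
    for word, count in sorted(counts.items()):
        if count >= min_count and word not in word2idx:
            word2idx[word] = len(word2idx)
    return word2idx


def encode(sentence, word2idx):
    """Convert a tokenized sentence to indices, wrapped in BOS/EOS."""
    body = [word2idx.get(word, word2idx[UNK]) for word in sentence]
    return [word2idx[BOS]] + body + [word2idx[EOS]]


# Stand-in for the lines of a training file: one sentence per line.
lines = ["a cat sat", "a dog sat down"]
sentences = [line.split() for line in lines]
vocab = build_vocab(sentences)
encoded = [encode(sent, vocab) for sent in sentences]
```

Words that never appeared in the training data fall back to the unknown-word index, so `encode(["zebra"], vocab)` still yields a valid sequence.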

1) Training

python train.py -data <output.pt> -embs_share_weight -proj_share_weight
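
The flag names suggest the weight sharing described in the paper, where the embedding layers and the pre-softmax linear transformation use the same weight matrix. How train.py interprets the two flags is an assumption here, not something this README states. The sketch below shows the tying idea only, in plain Python with a hypothetical class name. In PyTorch the same effect is usually obtained by letting both modules hold the same parameter tensor.

```python
import random


def make_matrix(rows, cols, seed=0):
    """Small random matrix as a list of row vectors."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TiedOutputLayer:
    """Target embedding and pre-softmax projection backed by one matrix."""

    def __init__(self, vocab_size, d_model):
        # One (vocab_size x d_model) table serves both roles.
        self.weight = make_matrix(vocab_size, d_model)

    def embed(self, token_id):
        """Embedding lookup: row token_id of the shared matrix."""
        return self.weight[token_id]

    def project(self, hidden):
        """Logits over the vocabulary: hidden . weight^T."""
        return [sum(h * w for h, w in zip(hidden, row)) for row in self.weight]


layer = TiedOutputLayer(vocab_size=5, d_model=4)
hidden = layer.embed(3)         # stand-in for a decoder output vector
logits = layer.project(hidden)  # one score per vocabulary entry
```

Because both roles read the same matrix, any update to it changes the embedding and the projection together, and the layer stores one table instead of two.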

2) Testing

TODO

Beam search

Requirement

python 3.4+
pytorch 0.1.12
tqdm
numpy

If you have any suggestions or find an error, feel free to file an issue to let me know. :)
