
Attention Is All You Need: A PyTorch Implementation

This is a PyTorch implementation of the Transformer model in "Attention Is All You Need" (Vaswani et al., 2017).

The Transformer achieves state-of-the-art performance on the WMT 2014 English-to-German translation task (as of 2017/06/12).

It is a novel sequence-to-sequence framework that relies on the self-attention mechanism instead of convolution operations or recurrent structures.

To learn more about the self-attention mechanism, you can read "A Structured Self-attentive Sentence Embedding".
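For intuition, here is a minimal sketch of the scaled dot-product attention at the heart of the Transformer. It follows the paper's equation Attention(Q, K, V) = softmax(QKᵀ / √d_k)V; this is an illustrative sketch, not code from this repository.

import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, seq_len, d_k) query, key, and value tensors.
    d_k = q.size(-1)
    # Dot-product scores, scaled by sqrt(d_k) as in the paper.
    scores = torch.matmul(q, k.transpose(-2, -1)) / d_k ** 0.5
    if mask is not None:
        # Block masked positions (e.g. padding or future tokens).
        scores = scores.masked_fill(mask == 0, float('-inf'))
    attn = F.softmax(scores, dim=-1)  # attention weights over the sequence
    return torch.matmul(attn, v), attn

# Self-attention: queries, keys, and values all come from the same sequence.
x = torch.randn(2, 5, 64)  # (batch, seq_len, d_model)
out, attn = scaled_dot_product_attention(x, x, x)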

The project is still a work in progress; only training is supported for now.

Translation will be available soon.

Usage

0) Prepare the data

python preprocess.py -train_src <train.src.txt> -train_tgt <train.tgt.txt> -valid_src <valid.src.txt> -valid_tgt <valid.tgt.txt> -output <output.pt>
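For example, with hypothetical file names for English→German parallel corpora (the file names are illustrative, not shipped with the repository):

python preprocess.py -train_src train.en -train_tgt train.de -valid_src valid.en -valid_tgt valid.de -output data.pt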

1) Training

python train.py -data <output.pt> -embs_share_weight -proj_share_weight
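The -embs_share_weight and -proj_share_weight flags presumably correspond to the weight-sharing tricks described in section 3.4 of the paper: sharing the source and target embedding tables, and tying the target embedding to the pre-softmax linear projection. A minimal sketch of the tying idea (illustrative only, not this repository's code):

import torch.nn as nn

d_model, vocab_size = 512, 32000  # illustrative sizes
tgt_emb = nn.Embedding(vocab_size, d_model)
proj = nn.Linear(d_model, vocab_size, bias=False)
# Tie the pre-softmax projection to the embedding table: both now
# share the same (vocab_size, d_model) weight matrix.
proj.weight = tgt_emb.weight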

2) Testing

TODO

  • Beam search (a rough sketch of the idea follows below)
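For reference, beam-search decoding keeps the beam_size highest-scoring partial translations at each step. A simplified sketch, assuming a hypothetical step_fn that returns a 1-D torch tensor of log-probabilities over the vocabulary for the next token (this is not the repository's planned API):

def beam_search(step_fn, bos, eos, beam_size=4, max_len=50):
    # Each beam is a (token sequence, cumulative log-probability) pair.
    beams = [([bos], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == eos:  # finished hypotheses are kept as-is
                candidates.append((seq, score))
                continue
            log_probs = step_fn(seq)  # hypothetical: (vocab_size,) tensor
            topv, topi = log_probs.topk(beam_size)
            for lp, tok in zip(topv.tolist(), topi.tolist()):
                candidates.append((seq + [tok], score + lp))
        # Keep the beam_size best hypotheses by cumulative log-probability.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        if all(seq[-1] == eos for seq, _ in beams):
            break
    return beams[0][0]  # best-scoring sequence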

Requirements

  • python 3.4+
  • pytorch 0.1.12
  • tqdm
  • numpy

If you have any suggestions or find any errors, feel free to open an issue to let me know. :)
