This is a TensorFlow 2.x implementation of the seq2seq model augmented with an attention mechanism (Luong-style or Bahdanau-style) for neural machine translation. Follow this guide for a conceptual understanding of how the seq2seq model works.
The implementation in this repo is designed to have the same command line interface as the Transformer implementation. Follow that link for detailed instructions on data preparation, training, evaluation, and attention weight visualization.
Unlike the Transformer, the seq2seq model augmented with an attention mechanism involves only target-to-source attention. Shown below are the attention weights w.r.t. each source token (English) as the target tokens (German) are translated one at a time.
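To make the two attention variants concrete, here is a minimal TF 2.x sketch of how the target-to-source weights can be computed at a single decoding step. This is an illustration, not code from this repo; `encoder_outputs`, `decoder_state`, `W1`, `W2`, and `v` are hypothetical names.

```python
import tensorflow as tf

# Illustrative shapes: batch of 2 sentences, source length 7, hidden size 16.
# `encoder_outputs` stands in for the per-source-token encoder states;
# `decoder_state` is the decoder hidden state at the current target step.
batch, src_len, hidden = 2, 7, 16
encoder_outputs = tf.random.normal([batch, src_len, hidden])
decoder_state = tf.random.normal([batch, hidden])

# Luong-style (multiplicative) score: dot product between the decoder state
# and each encoder output.
luong_scores = tf.einsum('bh,bsh->bs', decoder_state, encoder_outputs)

# Bahdanau-style (additive) score: a small feed-forward network applied to the
# projected encoder and decoder states. W1, W2, v are illustrative trainable
# layers, not names from this repo.
W1 = tf.keras.layers.Dense(hidden)
W2 = tf.keras.layers.Dense(hidden)
v = tf.keras.layers.Dense(1)
bahdanau_scores = tf.squeeze(
    v(tf.tanh(W1(encoder_outputs) + W2(decoder_state)[:, tf.newaxis, :])), -1)

# Either way, a softmax over the source axis yields the target-to-source
# attention weights visualized above; the context vector is the corresponding
# weighted sum of encoder outputs.
weights = tf.nn.softmax(luong_scores, axis=-1)        # [batch, src_len]
context = tf.einsum('bs,bsh->bh', weights, encoder_outputs)
```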
- Effective Approaches to Attention-based Neural Machine Translation, Luong et al. https://arxiv.org/abs/1508.04025
- Neural Machine Translation by Jointly Learning to Align and Translate, Bahdanau et al. https://arxiv.org/abs/1409.0473