NMT State of the art
- Sequence to Sequence Learning with Neural Networks, Sutskever et al. (2014)
- Neural Machine Translation by Jointly Learning to Align and Translate, Bahdanau, Cho and Bengio (2015)
- Effective Approaches to Attention-based Neural Machine Translation, Luong et al. (2015)
- Multi-task learning for multiple language translation, Dong et al. (2015)
- Multi-task Sequence to Sequence Learning, Luong, Le, Sutskever et al. (2015)
- Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism, Firat, Cho and Bengio (2016)
- Multi-Source Neural Translation, Zoph and Knight (2016)
- Tree-to-Sequence Attentional Neural Machine Translation, Eriguchi et al. (2016)
- Addressing the Rare Word Problem in Neural Machine Translation, Luong, Sutskever, Le et al. (2014)
- On Using Very Large Target Vocabulary for Neural Machine Translation, Jean, Cho, Memisevic and Bengio (2015)
- Pointing the Unknown Words, Gulcehre et al. (2016)
- Character-based Neural Machine Translation, Ling et al. (2015)
Representations of words are computed from their character sequences.
- Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models, Luong and Manning (2016)
Word representations are taken from an embedding matrix as usual. However, instead of mapping out-of-vocabulary words to UNK, a character-level RNN computes their representations. A character-level RNN is likewise used during decoding to generate unknown words.
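A minimal sketch of the hybrid lookup on the encoder side, assuming toy word/character vocabularies and a single-layer LSTM (all names and sizes are illustrative, not from the paper):

```python
import torch
import torch.nn as nn

class HybridWordCharEmbedding(nn.Module):
    """Embedding lookup that falls back to a character-level LSTM
    for out-of-vocabulary words (illustrative sketch)."""

    def __init__(self, word_vocab: dict, char_vocab: dict, dim: int = 256):
        super().__init__()
        self.word_vocab = word_vocab                     # word -> id
        self.char_vocab = char_vocab                     # char -> id
        self.word_emb = nn.Embedding(len(word_vocab), dim)
        self.char_emb = nn.Embedding(len(char_vocab), dim)
        self.char_rnn = nn.LSTM(dim, dim, batch_first=True)

    def forward(self, word: str) -> torch.Tensor:
        if word in self.word_vocab:
            # In-vocabulary: ordinary embedding-matrix lookup.
            return self.word_emb(torch.tensor([self.word_vocab[word]]))[0]
        # Out-of-vocabulary: compose a representation from the character
        # sequence (unknown characters map to id 0 here).
        ids = torch.tensor([[self.char_vocab.get(c, 0) for c in word]])
        _, (h, _) = self.char_rnn(self.char_emb(ids))
        return h[-1, 0]                                  # final hidden state
```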
- A Character-level Decoder without Explicit Segmentation for Neural Machine Translation, Chung, Cho and Bengio (2016)
Translation is done at the character level on the target side: the decoder emits characters directly, with no explicit segmentation into words (the source side still uses subword units).
- Improving Neural Machine Translation Models with Monolingual Data, Sennrich et al. (2015)
They include monolingual (target-side) data in the training set by pairing each target sentence with either a dummy source symbol or a synthetic source sentence obtained by back-translating the target. This produces good results, but the amount of monolingual data cannot greatly exceed the amount of bilingual data.
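A small sketch of the data construction, where `back_translate` stands in for a separately trained target-to-source model (a hypothetical placeholder, not an API from the paper):

```python
from typing import Callable, Iterable, List, Tuple

def augment_with_monolingual(
    parallel: Iterable[Tuple[str, str]],
    mono_targets: Iterable[str],
    back_translate: Callable[[str], str],
    use_dummy: bool = False,
) -> List[Tuple[str, str]]:
    """Pair each monolingual target sentence with either a dummy
    source token or a synthetic back-translated source sentence."""
    data = list(parallel)
    for tgt in mono_targets:
        src = "<null>" if use_dummy else back_translate(tgt)
        data.append((src, tgt))
    return data
```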
- On Using Monolingual Corpora in Neural Machine Translation, Gulcehre et al. (2015)
Integrates a language model trained on monolingual data into decoding, either by rescoring with the LM (shallow fusion) or by feeding the LM's hidden state into the decoder (deep fusion).
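In the shallow-fusion variant, the decoder simply adds a weighted LM log-probability to the translation model's score when ranking candidate tokens; a minimal numpy sketch (the weight `beta` is a tunable hyperparameter, chosen here arbitrarily):

```python
import numpy as np

def shallow_fusion_step(tm_log_probs: np.ndarray,
                        lm_log_probs: np.ndarray,
                        beta: float = 0.3) -> int:
    """Choose the next token from combined translation-model and
    language-model log-probabilities over the vocabulary."""
    combined = tm_log_probs + beta * lm_log_probs
    return int(np.argmax(combined))  # greedy; beam search scores hypotheses the same way
```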
- Semi-supervised Sequence Learning, Dai and Le (2015)
Pre-training with a sequence autoencoder (or a language model) before supervised training.
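The idea is to train the recurrent weights to reconstruct their own input, then reuse them to initialize the supervised model; a rough PyTorch sketch of the reconstruction objective (sizes and special-token ids are illustrative):

```python
import torch
import torch.nn as nn

VOCAB, DIM, BOS = 10_000, 256, 1   # illustrative vocabulary size / width / BOS id

embed    = nn.Embedding(VOCAB, DIM)
encoder  = nn.LSTM(DIM, DIM, batch_first=True)
decoder  = nn.LSTM(DIM, DIM, batch_first=True)
to_vocab = nn.Linear(DIM, VOCAB)

def sequence_autoencoder_loss(token_ids: torch.Tensor) -> torch.Tensor:
    """Pre-training objective: encode the sentence, then reconstruct it.
    token_ids: (batch, length) tensor of token ids."""
    _, state = encoder(embed(token_ids))               # read the input
    bos = torch.full_like(token_ids[:, :1], BOS)
    shifted = torch.cat([bos, token_ids[:, :-1]], dim=1)
    out, _ = decoder(embed(shifted), state)            # teacher-forced decode
    logits = to_vocab(out)
    return nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), token_ids.reshape(-1))

# After pre-training, the encoder and embeddings initialize the
# supervised model (e.g. a classifier or translation encoder).
```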
- Transfer Learning for Low-Resource Neural Machine Translation, Zoph et al. (2016)
Pre-training an encoder-decoder model on another, higher-resource source language, then fine-tuning on the low-resource pair. The target-side (English) embeddings are frozen during fine-tuning, and a high dropout rate is used to avoid overfitting.
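A sketch of the transfer step for PyTorch-style models; `target_embedding` and `dropout` are hypothetical attribute names used only for illustration:

```python
def init_child_from_parent(child, parent, dropout_rate=0.5):
    """Initialize the low-resource (child) model with the trained
    high-resource (parent) parameters, freeze the target-side
    embeddings, and raise dropout to limit overfitting."""
    child.load_state_dict(parent.state_dict())      # copy all parent weights
    for p in child.target_embedding.parameters():   # hypothetical attribute
        p.requires_grad = False                     # keep English embeddings fixed
    child.dropout.p = dropout_rate                  # hypothetical attribute
    return child
```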
- Grammar as a Foreign Language, Vinyals et al. (2015)
- Montreal Neural Machine Translation Systems for WMT15, Jean, Firat, Cho et al. (2015)
- Improved Neural Machine Translation with SMT Features, He et al. (2016)