
NMT State of the art


Key papers

Multi-task

Multi-source

Rare words problem

Character-level translation

Representations of words are computed from their character sequences.

Word representations are taken from an embedding matrix as usual. However, instead of mapping out-of-vocabulary words to UNK, an RNN over the word's characters is used to compute a representation. An RNN is also used on the decoder side to produce unknown words.
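A minimal sketch of the encoder-side idea, assuming PyTorch; class and parameter names are illustrative, not from the paper. In-vocabulary words use the ordinary embedding matrix, and out-of-vocabulary words fall back to the final state of a character GRU.

```python
import torch
import torch.nn as nn

class HybridWordEncoder(nn.Module):
    """Word-embedding lookup with a character-RNN fallback for OOV words (sketch)."""

    def __init__(self, word_vocab, char_vocab_size, dim=256):
        super().__init__()
        self.word_vocab = word_vocab                        # dict: word -> index
        self.word_emb = nn.Embedding(len(word_vocab), dim)
        self.char_emb = nn.Embedding(char_vocab_size, 64)
        self.char_rnn = nn.GRU(64, dim, batch_first=True)

    def forward(self, word, char_ids):
        if word in self.word_vocab:
            # In-vocabulary: ordinary embedding-matrix lookup.
            idx = torch.tensor([self.word_vocab[word]])
            return self.word_emb(idx)                       # (1, dim)
        # Out-of-vocabulary: run a GRU over the word's characters and use
        # the final hidden state as the word representation.
        chars = self.char_emb(char_ids.unsqueeze(0))        # (1, n_chars, 64)
        _, h_n = self.char_rnn(chars)                       # (1, 1, dim)
        return h_n.squeeze(0)                               # (1, dim)
```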

Translation is done entirely at the character level (no tokenization into words).
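For illustration, a small sketch of what fully character-level preprocessing means: the sentence is mapped directly to a sequence of character indices, spaces included, with no word segmentation. The symbol names and vocabulary here are assumptions.

```python
def to_char_ids(sentence, char2id, bos="<s>", eos="</s>", unk="<unk>"):
    """Map a raw sentence to a character-index sequence (sketch, no word tokenization)."""
    ids = [char2id[bos]]
    ids += [char2id.get(c, char2id[unk]) for c in sentence]  # every character, spaces included
    ids.append(char2id[eos])
    return ids

char2id = {"<s>": 0, "</s>": 1, "<unk>": 2,
           **{c: i + 3 for i, c in enumerate("abcdefghijklmnopqrstuvwxyz .")}}
print(to_char_ids("a cat sat.", char2id))
```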

Transfer-learning

They include monolingual (target-side) data in the training set by pairing each target sentence with either a dummy source symbol or a synthetic source sentence obtained by back-translating the target. This produces good results, but the amount of monolingual data cannot greatly exceed the amount of bilingual data.
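A hedged sketch of how such a mixed training set could be assembled; the `back_translate` function, dummy token, and data format are assumptions for illustration, not the paper's exact setup.

```python
DUMMY_SOURCE = "<null>"

def augment(bitext, mono_target, back_translate=None):
    """bitext: list of (src, tgt) pairs; mono_target: list of target-side sentences."""
    synthetic = []
    for tgt in mono_target:
        if back_translate is not None:
            src = back_translate(tgt)   # synthetic source from a target-to-source model
        else:
            src = DUMMY_SOURCE          # single dummy token as the source sentence
        synthetic.append((src, tgt))
    # The monolingual portion should stay comparable in size to the bitext.
    return bitext + synthetic
```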

Includes a language model in the decoding pipeline.
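One common way to do this is a log-linear combination of translation-model and language-model scores at each decoding step ("shallow fusion"); whether the paper uses exactly this scheme is not stated here, so the sketch below is only illustrative, with assumed function names.

```python
import math

def fused_scores(log_p_tm, log_p_lm, lm_weight=0.2):
    """Combine NMT and LM log-probabilities over candidate target words (sketch)."""
    return {w: log_p_tm[w] + lm_weight * log_p_lm.get(w, -math.inf)
            for w in log_p_tm}

# Usage: rank beam-search candidates by the fused score instead of the
# translation model's score alone.
best = max(fused_scores({"cat": -0.5, "dog": -1.2},
                        {"cat": -2.0, "dog": -0.3}).items(),
           key=lambda kv: kv[1])
```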

Pre-training using an auto-encoder.

An encoder-decoder model is pre-trained on another, higher-resource source language. During the final training on the low-resource pair, the output (English) embeddings are frozen and a high dropout rate is used to avoid overfitting.
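A minimal PyTorch sketch of that fine-tuning setup, assuming a model with a `decoder.output_embedding` module (an illustrative name, not from the paper): the target-side output embeddings are frozen and dropout is raised before training on the small bitext.

```python
import torch.nn as nn

def prepare_for_transfer(model, dropout=0.5):
    """Freeze output embeddings and raise dropout before fine-tuning (sketch)."""
    # Freeze the target-side (English) output embedding weights.
    for p in model.decoder.output_embedding.parameters():
        p.requires_grad = False
    # Raise dropout everywhere to limit overfitting on the small bitext.
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.p = dropout
    # The optimizer should only receive the parameters that remain trainable.
    return [p for p in model.parameters() if p.requires_grad]
```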

Misc

To Read

  • Pointing the Unknown Words
  • Tree-to-Sequence Attentional Neural Machine Translation
  • Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism
  • Improved Neural Machine Translation with SMT Features