Machine Translation Reading List

This is a machine translation reading list maintained by the Tsinghua Natural Language Processing Group.

The past three decades have witnessed the rapid development of machine translation, especially data-driven approaches such as statistical machine translation (SMT) and neural machine translation (NMT). Because NMT currently dominates the field, priority is given to collecting important, up-to-date NMT papers. The list is still incomplete, and the categorization may be imperfect. Each paper is listed with its Google Scholar citation count, which will be updated monthly.

We will keep adding papers and improving the list. Any suggestions are welcome!

10 Must Reads

Statistical Machine Translation


Word-based Models

Phrase-based Models

Syntax-based Models

Discriminative Training

System Combination


Neural Machine Translation


Model Architecture

Attention Mechanism

Open Vocabulary

Training Objective


Low-resource Language Translation

Semi-supervised Learning

Unsupervised Learning

Pivot-based Methods

Data Augmentation Methods

Transfer Learning

Meta Learning

Multi-task Learning

Prior Knowledge Integration

Word/Phrase Constraints

Syntactic/Semantic Constraints

Coverage Constraints

Document-level Translation


Visualization and Interpretability

Linguistic Interpretation

Fairness and Diversity



Speech Translation


Ensemble and Reranking

Domain Adaptation

Quality Estimation

Automatic Post-Editing

Word Translation and Bilingual Lexicon Induction

Poetry Translation