Machine Translation

alvations edited this page Dec 14, 2016 · 12 revisions

Statistical machine translation (SMT) is a rapidly growing area within computational linguistics. NLTK should provide a pathway for students who want to learn the basic algorithms. View NLTK activity on SMT.

Existing functionality

Existing functionality is mostly in the translate submodule. It includes:

  • IBM Models 1-3: translate/ibm{1,2,3}.py
  • MT evaluation metrics:
      • BLEU: translate/bleu_score.py
      • RIBES: translate/ribes_score.py
      • ChrF: translate/chrf_score.py
      • GLEU: translate/gleu_score.py
  • Gale-Church Sentence Aligner: translate/gale_church.py
  • Aligned sentence reader: corpus/reader/aligned.py
  • Grow-Diagonal-Final-And Phrase Extraction: translate/phrase_based.py
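
For readers new to the IBM models listed above, the following is a minimal pure-Python sketch of the expectation-maximization loop behind IBM Model 1. It is an independent illustration, not NLTK's implementation in translate/ibm1.py, and it does not mirror that module's API. The names train_ibm1, bitext and t are hypothetical, and the three-sentence German-English corpus is a toy example made up for this sketch.

```python
from collections import defaultdict


def train_ibm1(bitext, iterations=10):
    """Estimate lexical translation probabilities t(target | source) with EM.

    bitext is a list of (source_tokens, target_tokens) pairs. A NULL source
    token (represented here as None) is prepended to every source sentence so
    that target words can align to "nothing". Hypothetical helper, not NLTK API.
    """
    pairs = [([None] + list(src), list(tgt)) for src, tgt in bitext]
    target_vocab = {word for _, tgt in pairs for word in tgt}

    # Start from a uniform distribution over the target vocabulary.
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (target, source) links
        total = defaultdict(float)  # expected count of each source word

        # E-step: distribute each target word over the source words
        # in the same sentence, in proportion to the current t values.
        for src, tgt in pairs:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac

        # M-step: renormalize expected counts per source word.
        t = defaultdict(
            float,
            {(tw, sw): c / total[sw] for (tw, sw), c in count.items()},
        )
    return t


# Toy parallel corpus (assumed data, for illustration only).
bitext = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]
t = train_ibm1(bitext)

# Most probable English word for German "das" under the learned table.
best_for_das = max(["the", "house", "book", "a"], key=lambda w: t[(w, "das")])
print(best_for_das)
```

The table is keyed by (target_word, source_word) tuples to keep the sketch short; a real implementation would typically also handle smoothing and unseen words, which are omitted here.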

Planned functionality

We would like to add functionality in the following areas:

Third-party implementations

Existing Python implementations that could possibly be incorporated into NLTK

Useful links
