Rj7/NMT-for-morphologically-rich-languages

Improving Neural Machine Translation for Morphologically Rich Languages

Machine translation aims to provide seamless communication across human language barriers. Recently, Neural Machine Translation (NMT) approaches have been very successful and achieve state-of-the-art performance on many language pairs. NMT systems consist of millions of neurons that are optimised to learn the input-output mapping between the source and target languages. However, these systems produce poor translations under low-resource conditions and cannot handle a large vocabulary, particularly for languages with rich morphology such as Turkish, Tamil and German.

In this project, we present a source vocabulary expansion technique that addresses the problem of translating rare and unknown words by incorporating morphological information about those words. We demonstrate the effectiveness of the technique by translating from two morphologically rich languages, German and Turkish, into English. With this technique, we achieve a gain of approximately 2-3 BLEU points for both German -> English and Turkish -> English.
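
The details of the technique are in the report rather than this README. As a rough illustration of the general idea only, the sketch below shows one way morphological information could reduce unknown words on the source side: a rare word is split into a known stem plus a suffix token, so it no longer maps to `<unk>`. The suffix list, the `+suffix` token format, and the function names `build_vocab`, `segment` and `expand_source` are all assumptions made for this example, not the project's actual implementation. A real system would more likely use a proper morphological analyser.

```python
from collections import Counter

# Hypothetical suffix inventory for illustration only; a real system
# would rely on a morphological analyser rather than a fixed list.
SUFFIXES = ["lerinde", "lerde", "ler", "lar", "de", "da", "en", "er"]


def build_vocab(sentences, min_count=2):
    """Keep only words seen at least `min_count` times in the corpus."""
    counts = Counter(tok for s in sentences for tok in s.split())
    return {w for w, c in counts.items() if c >= min_count}


def segment(word, vocab, suffixes=SUFFIXES):
    """Return [word] if known, [stem, +suffix] if a known stem is found,
    otherwise ['<unk>']. Longer suffixes are tried first."""
    if word in vocab:
        return [word]
    for suf in sorted(suffixes, key=len, reverse=True):
        stem = word[: -len(suf)]
        if word.endswith(suf) and stem in vocab:
            return [stem, "+" + suf]
    return ["<unk>"]


def expand_source(sentence, vocab):
    """Rewrite a source sentence so rare words become stem + suffix tokens."""
    out = []
    for w in sentence.split():
        out.extend(segment(w, vocab))
    return " ".join(out)


if __name__ == "__main__":
    corpus = ["ev yeni", "kitap yeni", "ev kitap"]  # toy Turkish corpus
    vocab = build_vocab(corpus)
    print(expand_source("evlerde kitaplar yeni araba", vocab))
```

On this toy corpus, `evlerde` and `kitaplar` are out of vocabulary but their stems `ev` and `kitap` are not, so the sentence becomes `ev +lerde kitap +lar yeni <unk>`. Only `araba`, whose stem was never seen, remains unknown.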

Neural Machine Translation for Morphologically rich languages - Report
