Machine Translation Using Transformer

Data

The IWSLT14 German-English translation dataset is prepared as in the fairseq translation example: https://github.com/facebookresearch/fairseq/blob/main/examples/translation/README.md#iwslt14-german-to-english-transformer

# install dependencies
bash scripts/install_libs.sh
# download data
mkdir -p local/data/exp
bash examples/translation-iwslt14-en-de local/data/iwslt14

Training

# Ours de -> en, train; args are: data_dir exp_dir cuda_device src_lang tgt_lang
bash examples/translation-iwslt14-en-de/train.sh local/data/iwslt14/data-converted-en-de-raw local/data/exp/iwslt14-en-de 0 de en
# Ours de -> en, eval
bash examples/translation-iwslt14-en-de/predict.sh local/data/iwslt14/data-converted-en-de-raw local/data/exp/iwslt14-en-de 0 de en
# Ours en -> de, train
bash examples/translation-iwslt14-en-de/train.sh local/data/iwslt14/data-converted-en-de-raw local/data/exp/iwslt14-en-de 0 en de
# Ours en -> de, eval
bash examples/translation-iwslt14-en-de/predict.sh local/data/iwslt14/data-converted-en-de-raw local/data/exp/iwslt14-en-de 0 en de
# fairseq de -> en, args are: data_dir cuda_device src_lang tgt_lang
bash examples/translation-iwslt14-en-de/fairseq_run.sh local/data/iwslt14 0 de en
# fairseq en -> de
bash examples/translation-iwslt14-en-de/fairseq_run.sh local/data/iwslt14 0 en de
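
The train and eval commands above take the same arguments in the same order, so the language pair is easy to mix up between the two steps. A minimal dry-run sketch that prints both commands for one direction from a single set of arguments; the function name `print_direction_cmds` is hypothetical and not part of the repository, and nothing is executed:

```shell
# Hypothetical dry-run helper (not part of the repository).
# Prints the train and predict commands for one translation direction.
# Arguments, in the order used above:
#   $1=data_dir $2=exp_dir $3=cuda_device $4=src_lang $5=tgt_lang
print_direction_cmds() {
  for script in train.sh predict.sh; do
    echo "bash examples/translation-iwslt14-en-de/${script} $1 $2 $3 $4 $5"
  done
}

# Print the de -> en pair; pipe the output to `bash` to actually run it.
print_direction_cmds local/data/iwslt14/data-converted-en-de-raw local/data/exp/iwslt14-en-de 0 de en
```

Generating both commands from one call keeps src_lang and tgt_lang identical for training and evaluation.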

BLEU Results

| Model | de -> en | en -> de |
| --- | --- | --- |
| Transformer | 33.27 | 27.72 |
| Tied Transformers | 35.10 | 29.07 |
| fairseq | 34.54 | 28.61 |
| Ours | 34.36 | 28.33 |

Note

This example was tested at commit 96342488422662624ff16da4ba015e0ce21a40a5. If any of the paths above are no longer valid, switch to that commit:

git reset --hard 96342488422662624ff16da4ba015e0ce21a40a5
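
Because `git reset --hard` discards uncommitted changes, it may be worth checking first whether the checkout is already at the tested commit. A minimal sketch, assuming it runs inside the repository checkout; the helper name `is_tested_commit` and the variable `TESTED_COMMIT` are hypothetical:

```shell
# Commit id quoted in the note above.
TESTED_COMMIT="96342488422662624ff16da4ba015e0ce21a40a5"

# Hypothetical helper: succeeds only if the given commit id is the tested one.
is_tested_commit() {
  [ "$1" = "$TESTED_COMMIT" ]
}

# Usage inside the repository checkout (resets only when needed):
#   is_tested_commit "$(git rev-parse HEAD)" || git reset --hard "$TESTED_COMMIT"
```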