Dynamic Transfer Learning for Low-Resource Neural Machine Translation


[July, 2020] Updated repo with scripts and notes on experimental settings

This repo implements the following papers and associated features based on OpenNMT-tf1.15:

Transfer Learning in Multilingual Neural Machine Translation

Adapting Multilingual Neural Machine Translation to Unseen Languages

Experimental Settings



Experiments use the TED Talks data from Qi et al., chosen for its low-resource nature: it covers more than 50 languages paired with English, with roughly 5k to 200k parallel examples per pair.



Prepare data for one or more source-target pairs (if flag is specified, the target-language ID is appended to the source side):

./scripts/ ['src1-en en-src1 src2-en en-src2'] [flag] [exp-id]
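To make the pair specification and the language-ID flag concrete, here is a minimal sketch of what this step might do to the source side. The function names, the `<2xx>` tag format, and the `prepend` switch are all hypothetical; the repo's script may use a different token and position.

```python
def parse_pairs(spec):
    """Split a spec such as 'pt-en gl-en' into (src, tgt) tuples.

    Assumes each item is exactly two hyphen-free language codes.
    """
    pairs = []
    for item in spec.split():
        src, tgt = item.split("-")
        pairs.append((src, tgt))
    return pairs


def tag_source(sentence, tgt_lang, prepend=True):
    """Attach a target-language token to one source sentence."""
    tag = f"<2{tgt_lang}>"  # hypothetical tag format, not taken from the repo
    sentence = sentence.strip()
    return f"{tag} {sentence}" if prepend else f"{sentence} {tag}"


if __name__ == "__main__":
    for src, tgt in parse_pairs("pt-en en-pt"):
        print(src, "->", tgt, ":", tag_source("Bom dia.", tgt))
```

With the tag in place, a single model can be trained on several directions at once, because the source sentence itself states which target language is wanted.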

Preprocess (cleaning, detokenization, and subword segmentation with SentencePiece):

./scripts/ [exp-id] [subword-size]
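The cleaning part of this step is not spelled out here, so the following is only an illustration of a common parallel-corpus filter: drop empty pairs, over-long sentences, and pairs with an extreme length ratio. The function name and the thresholds `max_len` and `max_ratio` are assumptions, not the repo's settings.

```python
def clean_pairs(src_lines, tgt_lines, max_len=100, max_ratio=3.0):
    """Return (src, tgt) pairs that pass simple length-based filters."""
    kept = []
    for src, tgt in zip(src_lines, tgt_lines):
        src_tok, tgt_tok = src.split(), tgt.split()
        # Drop pairs where either side is empty.
        if not src_tok or not tgt_tok:
            continue
        # Drop pairs where either side is too long.
        if len(src_tok) > max_len or len(tgt_tok) > max_len:
            continue
        # Drop pairs whose lengths differ too much to be plausible translations.
        longer = max(len(src_tok), len(tgt_tok))
        shorter = min(len(src_tok), len(tgt_tok))
        if longer / shorter > max_ratio:
            continue
        # Re-join tokens to normalize whitespace.
        kept.append((" ".join(src_tok), " ".join(tgt_tok)))
    return kept


if __name__ == "__main__":
    src = ["Bos   días a todos .", "", "Si ."]
    tgt = ["Good morning everyone .", "Orphan line .", "Yes , of course , absolutely , without doubt ."]
    for pair in clean_pairs(src, tgt):
        print(pair)
```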

Pre-Training Parent Model

Train a parent model on a relatively high-resource language pair (e.g. Portuguese-English / Pt-En).

./ [exp-id] [gpu-device]

Progressive Adaptation (ProgAdapt) to New Translation Directions

Steps for adapting the Pt-En parent model to the low-resource child pair Galician-English (Gl-En) with ProgAdapt. Prepare the child data first:


./scripts/ 'gl-en' [child-model_exp-id]

Data Preprocessing

./scripts/ [child-model_exp-id] [subword-size]

ProgAdapt Training

Training first customizes the parent model to account for the newly generated vocabulary of the child model (Gl-En):

./ [parent-model_exp-id] [child-model_exp-id] [gpu-device]
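One way to picture this customization step is a dynamic vocabulary: entries shared between the parent and child vocabularies keep their trained parent embeddings, while entries seen only in the child are initialized from scratch. The sketch below uses plain Python lists; the function name, the uniform initialization range, and the toy vocabularies are assumptions for illustration, not the repo's implementation.

```python
import random


def adapt_embeddings(parent_vocab, parent_emb, child_vocab, dim, seed=0):
    """Build a child embedding table from a parent table.

    Returns the new table and the number of rows reused from the parent.
    """
    rng = random.Random(seed)
    parent_index = {tok: i for i, tok in enumerate(parent_vocab)}
    child_emb, reused = [], 0
    for tok in child_vocab:
        if tok in parent_index:
            # Overlapping entry: copy the trained parent vector.
            child_emb.append(list(parent_emb[parent_index[tok]]))
            reused += 1
        else:
            # New entry: random initialization (range is an assumption).
            child_emb.append([rng.uniform(-0.1, 0.1) for _ in range(dim)])
    return child_emb, reused


if __name__ == "__main__":
    parent_vocab = ["<unk>", "▁a", "▁casa"]
    parent_emb = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]]
    child_vocab = ["<unk>", "▁casa", "▁xeito"]
    table, reused = adapt_embeddings(parent_vocab, parent_emb, child_vocab, dim=2)
    print(f"reused {reused} of {len(child_vocab)} entries")
```

Closely related pairs such as Pt-En and Gl-En tend to share many subword units, so a large share of the table can be carried over rather than relearned.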

Progressive Growth (ProgGrow) with New Translation Directions

ProgGrow differs from ProgAdapt in that it keeps the Pt-En parent translation direction while learning the new low-resource Gl-En (child) direction.


./scripts/ 'pt-en gl-en' flag [child-model_exp-id]

Data Preprocessing

./scripts/ [child-model_exp-id] [subword-size]

ProgGrow Training

./ [parent-model_exp-id] [child-model_exp-id] [gpu-device]

More Options

At transfer-learning time, you can optionally:

  • Load specific components of the parent model. See load_weights in config_adapt.yml for more options:

['encoder', 'decoder', 'shared_embeddings', 'src_embs', 'tgt_embs', 'optim', 'projection'].

  • Freeze sub-networks (i.e. selectively optimize the encoder or decoder). See freeze in config_adapt.yml for options.

  • In addition to encoder-only and/or decoder-only customization, you can pre-train a parent model with a vocabulary shared between the encoder and decoder and then customize it for the child model. See the --shared_vocab and --new_shared_vocab options in ./
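The loading and freezing options above can be thought of as a small transfer plan: which parent components are restored, which start fresh, and which sub-networks remain trainable. The sketch below is hypothetical. It reuses the component names listed for load_weights, but `plan_transfer`, the returned dictionary, and the assumption that only the encoder and decoder can be frozen are illustrative and do not reflect the actual schema of config_adapt.yml.

```python
LOADABLE = ("encoder", "decoder", "shared_embeddings",
            "src_embs", "tgt_embs", "optim", "projection")
FREEZABLE = ("encoder", "decoder")  # assumption: only these can be frozen


def plan_transfer(load_weights, freeze=()):
    """Summarize which components are restored, fresh, or trainable."""
    for name in load_weights:
        if name not in LOADABLE:
            raise ValueError(f"unknown component to load: {name}")
    for name in freeze:
        if name not in FREEZABLE:
            raise ValueError(f"cannot freeze: {name}")
    return {
        # Components initialized from the parent checkpoint.
        "restored": [n for n in LOADABLE if n in load_weights],
        # Components initialized from scratch for the child model.
        "fresh": [n for n in LOADABLE if n not in load_weights],
        # Sub-networks that still receive gradient updates.
        "trainable": [n for n in FREEZABLE if n not in freeze],
    }


if __name__ == "__main__":
    plan = plan_transfer(["encoder", "decoder"], freeze=["encoder"])
    for key, names in plan.items():
        print(key, names)
```

For example, restoring both encoder and decoder but freezing the encoder limits adaptation to the decoder side.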

Note: to replicate the experiments reported in our work, please see further details in the experimental section of each paper.


Lakew, Surafel M., Erofeeva, Aliia, Negri, Matteo, Federico, Marcello, and Turchi, Marco. "Transfer learning in multilingual neural machine translation with dynamic vocabulary." arXiv preprint arXiv:1811.01137.

Lakew, Surafel M., Karakanta, Alina, Federico, Marcello, Negri, Matteo, and Turchi, Marco. "Adapting Multilingual Neural Machine Translation to Unseen Languages." arXiv preprint arXiv:1910.13998.

