Code for "Sparse MoE with Language-Guided Routing for Multilingual Machine Translation". Our implementation is based on fairseq-moe.
- Authors: Xinyu Zhao, Xuxi Chen, Yu Cheng and Tianlong Chen
- Paper: [OpenReview](https://openreview.net/forum?id=ySS7hH1smL)
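The core idea of language-guided routing can be sketched in a few lines: before top-k expert selection, each token's gating logits are biased by an embedding of its language. This is a minimal, framework-free illustration, not the repo's implementation; all names, dimensions, and the additive-bias form are assumptions for the sketch.

```python
import math
import random

random.seed(0)

NUM_EXPERTS = 4
TOP_K = 2
DIM = 8

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_k_route(token, lang_emb, expert_weights, k=TOP_K):
    """Score each expert on the token plus a language-embedding bias,
    softmax over all experts, and keep the top-k gates (renormalized)."""
    logits = [dot(token, w) + dot(lang_emb, w) for w in expert_weights]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    gates = [e / z for e in exps]
    top = sorted(range(len(expert_weights)), key=lambda i: -gates[i])[:k]
    s = sum(gates[i] for i in top)  # renormalize selected gates to sum to 1
    return [(i, gates[i] / s) for i in top]

# toy token, language embedding, and expert gating weights
token = [random.uniform(-1, 1) for _ in range(DIM)]
lang_emb = [random.uniform(-1, 1) for _ in range(DIM)]
experts = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

routes = top_k_route(token, lang_emb, experts)
print(routes)  # TOP_K (expert_id, gate) pairs; gates sum to 1
```

In the actual model the gating weights and language embeddings are learned jointly; the sketch only shows how a language signal can shift which experts a token reaches.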
```shell
conda create -n fairseq python=3.8 -y && conda activate fairseq
git clone https://github.com/pytorch/fairseq && cd fairseq
pip install --editable ./
pip install fairscale==0.4.0 hydra-core==1.0.7 omegaconf==2.0.6
pip install boto3 zmq iopath tensorboard nltk
pip install "sacrebleu[ja]" "sacrebleu[ko]" wandb
wandb login
```
- Raw data download: fetch the [OPUS-100 corpus](https://object.pouta.csc.fi/OPUS-100/v1.0/opus-100-corpus-v1.0.tar.gz), then extract it with `tar -xzf opus-100-corpus-v1.0.tar.gz`.
- Preprocessing pipeline: `num-multi`
- For the training script, see the example in `train_scripts/train.sh`.
- To load the language embeddings, first replace `lang_dict.txt` in the processed data folder with `assets/lang_dict.txt`, making sure the language indices stay consistent.
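A quick way to confirm the two dictionaries agree is to compare the language-to-index mappings. This is a hedged sketch: it assumes each line of `lang_dict.txt` is a `<lang> <index>` pair, and the toy file contents here are illustrative stand-ins, not the repo's actual files.

```python
def load_lang_dict(lines):
    """Parse '<lang> <index>' lines into a {lang: index} mapping
    (format assumed; adjust if your lang_dict.txt differs)."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        lang, idx = line.split()
        mapping[lang] = int(idx)
    return mapping

# toy stand-ins for assets/lang_dict.txt and the copy in the data folder;
# in practice, read both files with open(...).readlines()
repo_dict = ["en 0", "de 1", "fr 2"]
data_dict = ["en 0", "fr 2", "de 1"]

a = load_lang_dict(repo_dict)
b = load_lang_dict(data_dict)
mismatches = {k for k in a.keys() | b.keys() if a.get(k) != b.get(k)}
print("consistent" if not mismatches else f"mismatched: {sorted(mismatches)}")
```

If `mismatches` is non-empty, the language embeddings would be looked up at the wrong indices, so fix the dictionary before training.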
- First, generate translations (`-n 8`: 8 GPUs; `-c 1.0`: capacity factor 1.0):

  ```shell
  bash eval_scripts/generate.sh -d opus16 -s save_dir -n 8 -c 1.0
  ```

- Then compute BLEU:

  ```shell
  bash eval_scripts/eval.sh -d opus16 -s save_dir
  ```
```bibtex
@inproceedings{
  zhao2024sparse,
  title={Sparse MoE with Language Guided Routing for Multilingual Machine Translation},
  author={Xinyu Zhao and Xuxi Chen and Yu Cheng and Tianlong Chen},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/forum?id=ySS7hH1smL}
}
```