jongwooko/MUSC

[Official] Synergy with Translation Artifacts for Training and Inference in Multilingual Tasks

This repository contains the code for the paper "Synergy with Translation Artifacts for Training and Inference in Multilingual Tasks", presented at EMNLP 2022.

Reproducibility Checklist

  • We used "bert-base-multilingual-cased", which has a vocabulary of about 120,000 tokens and about 180M parameters.
  • We used a GeForce RTX 3090 GPU. Training MUSC on XNLI, the most time-consuming task, takes about 2 days.
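
The vocabulary and parameter figures above can be sanity-checked with plain arithmetic. The sketch below is not part of this repository: the architecture values (vocabulary of 119,547, hidden size 768, 12 layers, and so on) are assumed from the publicly released configuration of the "bert-base-multilingual-cased" checkpoint, and the count covers only the encoder with its pooler, not any task-specific head that MUSC may add.

```python
# Architecture values ASSUMED from the public bert-base-multilingual-cased
# config; they are not stated in this README.
VOCAB = 119_547        # WordPiece vocabulary size ("about 120,000")
HIDDEN = 768           # hidden size
LAYERS = 12            # number of Transformer layers
INTERMEDIATE = 3_072   # feed-forward inner size
MAX_POS = 512          # maximum position embeddings
TYPE_VOCAB = 2         # segment (token type) embeddings


def linear(n_in, n_out):
    """Parameters of a dense layer: weight matrix plus bias."""
    return n_in * n_out + n_out


def layer_norm(dim):
    """Parameters of a LayerNorm: scale plus shift."""
    return 2 * dim


def bert_param_count():
    """Count encoder + pooler parameters for the assumed architecture."""
    embeddings = (VOCAB + MAX_POS + TYPE_VOCAB) * HIDDEN + layer_norm(HIDDEN)
    # Query, key, value, and output projections, then a LayerNorm.
    attention = 4 * linear(HIDDEN, HIDDEN) + layer_norm(HIDDEN)
    feed_forward = (
        linear(HIDDEN, INTERMEDIATE)
        + linear(INTERMEDIATE, HIDDEN)
        + layer_norm(HIDDEN)
    )
    pooler = linear(HIDDEN, HIDDEN)
    return embeddings + LAYERS * (attention + feed_forward) + pooler


total = bert_param_count()
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
# prints "177,853,440 parameters (~178M)"
```

Under these assumptions, roughly half of the parameters sit in the word embedding table, which is why the multilingual checkpoint is much larger than a monolingual BERT-base with the same number of layers.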

How to start

All steps start from the root directory.

  1. Set conda env
       cd data
       bash install_tools.sh
  2. Download datasets
       source activate fsxlt
       conda install -c conda-forge transformers
       pip install networkx==1.11

       cd data
       bash scripts/download_data.sh
  3. MUSC (refer to exps folder)
       source activate fsxlt
       pip install -r requirements.txt
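
After the setup steps above, it can help to confirm that the active environment actually has the packages they install. The following is a minimal sketch, not a script shipped with this repository: `check_env` and `EXPECTED` are hypothetical names, and the only pins it knows about are the ones that appear in the commands above (`transformers` with no fixed version, `networkx` at 1.11).

```python
from importlib import metadata

# Pins taken from the setup commands in this README; None means "any version".
# EXPECTED and check_env are hypothetical helpers, not part of the repository.
EXPECTED = {"transformers": None, "networkx": "1.11"}


def check_env(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for name, pin in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is missing from this environment
        ok = installed is not None and (pin is None or installed == pin)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_env().items():
        status = "ok" if ok else "MISSING or wrong version"
        print(f"{name}: {installed} ({status})")
```

Run it inside the activated `fsxlt` environment; any line flagged as missing or mismatched points to a setup step that needs to be repeated.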

Contact

References
