Code for the paper "Lexical-Constraint-Aware Neural Machine Translation via Data Augmentation"
The code is built on fairseq v0.6.1. Follow the same steps as fairseq to install it and to prepare the processed dataset. The WMT processing script is here.
Step 1: Install fairseq.
# You may want to create a conda environment first.
git clone https://github.com/ghchen18/leca.git
cd leca
pip install --editable .
Step 2: Process the dataset.
Follow the steps in the fairseq repo. More datasets can be found on the WMT Translation Task page. Because the dictionaries used here differ from those in the official fairseq repo, run data preprocessing with the preprocess.py in this repo rather than the official one.
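As a rough sketch of what that preprocessing call might look like, the snippet below assembles a command using the flags that the stock fairseq v0.6.1 preprocess.py accepts. The language pair, the directory names, and the worker count are hypothetical placeholders, and the preprocess.py in this repo may expect different or additional options, so check its argument list before running anything.

```shell
# Hypothetical language pair and paths; replace with your own.
SRC=en
TGT=de
DATA_DIR=data/wmt_en_de        # assumed location of the tokenized/BPE text files
DEST_DIR=data-bin/wmt_en_de    # assumed output directory for the binarized data

# Assemble the call piece by piece (paths must not contain spaces).
# The flags are those of stock fairseq v0.6.1; this repo's script may differ.
CMD="python preprocess.py --source-lang $SRC --target-lang $TGT"
CMD="$CMD --trainpref $DATA_DIR/train --validpref $DATA_DIR/valid --testpref $DATA_DIR/test"
CMD="$CMD --destdir $DEST_DIR --workers 8"

# Print the command by default; set RUN=1 to execute it from the repo root.
if [ "${RUN:-0}" = "1" ]; then
  $CMD
else
  echo "$CMD"
fi
```

Run from the root of this repo so that `preprocess.py` resolves to the local copy rather than an installed fairseq script.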
See scripts/run.sh. You may first need to adjust the variables in the shell scripts to match your setup.
@inproceedings{chen2020leca,
title = {Lexical-Constraint-Aware Neural Machine Translation via Data Augmentation},
author = {Chen, Guanhua and Chen, Yun and Wang, Yong and Li, Victor O.K.},
booktitle = {Proceedings of {IJCAI} 2020: Main track},
pages = {3587--3593},
year = {2020},
month = {7},
}