nttcslab-nlp/mbe-nmt
Context-aware Neural Machine Translation with Mini-batch Embedding

This repository includes example scripts for the following paper:

Context-aware Neural Machine Translation with Mini-batch Embedding
Makoto Morishita, Jun Suzuki, Tomoharu Iwata, Masaaki Nagata
https://www.aclweb.org/anthology/2021.eacl-main.214/

Requirements

  • sacrebleu with Japanese support: pip install "sacrebleu[ja]"
  • NVIDIA GPU with CUDA
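
Before going further, it can help to confirm that a GPU is actually visible. The following is a minimal sketch, assuming the NVIDIA driver's `nvidia-smi` tool is how you check; `has_nvidia_gpu` is a hypothetical helper and not part of this repository.

```shell
# Hypothetical helper (not part of this repository):
# succeeds only if nvidia-smi is installed and lists at least one GPU.
has_nvidia_gpu() {
    # Fail early if the driver tool is not on PATH.
    command -v nvidia-smi >/dev/null 2>&1 || return 1
    # "nvidia-smi -L" prints one "GPU <index>: ..." line per device.
    nvidia-smi -L 2>/dev/null | grep -q '^GPU '
}
```

For example, `has_nvidia_gpu && echo "GPU found" || echo "no GPU visible"` prints which case applies. This only checks that the driver sees a device; it does not verify the CUDA toolkit version.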

Data preprocessing

The following script downloads the corpora and preprocesses them.

$ cd ./corpus
$ ./process.sh

Build fairseq

Before running fairseq, build and install the bundled copy.

$ cd ./tools/fairseq_doc
$ pip install --editable .

Training

The training scripts are available in ./en-ja/. You may need to change the PROJECT_DIR variable in the scripts.
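
To see where PROJECT_DIR is set before editing, something like the sketch below may help. It assumes the scripts assign the variable with a plain `PROJECT_DIR=...` line, which is not confirmed here; `show_project_dir` is a hypothetical helper and not part of this repository.

```shell
# Hypothetical helper (not part of this repository):
# print file name, line number, and text of every PROJECT_DIR assignment
# in the *.sh files of the given directory.
# Assumption: the scripts assign the variable as PROJECT_DIR=...
show_project_dir() {
    grep -Hn 'PROJECT_DIR=' "$1"/*.sh
}
```

Running `show_project_dir ./en-ja` from the repository root lists each assignment, which you can then edit by hand to point at your checkout.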

This is an example of training an MBE enc model.

$ cd ./en-ja
$ nohup ./train_model_mbe_enc.sh 1 &> train_model_mbe_enc.log &
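
Because the job runs in the background under nohup, its output goes only to the log file. A minimal sketch for checking on it, assuming the log file name from the command above; `last_log_lines` is a hypothetical helper and not part of this repository.

```shell
# Hypothetical helper (not part of this repository):
# print the last N lines of a log file (N defaults to 20).
last_log_lines() {
    tail -n "${2:-20}" "$1"
}
```

For example, `last_log_lines train_model_mbe_enc.log 50` shows the most recent 50 lines, and `tail -f train_model_mbe_enc.log` follows the log as it grows.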

Contact

Please open an issue on GitHub or contact us by email.

NTT Communication Science Laboratories
Makoto Morishita
makoto.morishita.gr -a- hco.ntt.co.jp
