
Minimizing the Bag-of-Ngrams Difference for Non-Autoregressive Neural Machine Translation

PyTorch implementation of the models described in the paper Minimizing the Bag-of-Ngrams Difference for Non-Autoregressive Neural Machine Translation.

Dependencies

Python

  • Python 3.6
  • PyTorch >= 0.4
  • Numpy
  • NLTK
  • torchtext 0.2.1
  • torchvision
  • revtok
  • multiset
  • ipdb

Related code

Downloading Datasets

The original translation corpora can be downloaded from (IWSLT'16 En-De, WMT'16 En-Ro, WMT'14 En-De). We recommend downloading the preprocessed corpora released in dl4mt-nonauto. Before running the code, set the correct path to the data in the data_path() function in data.py.

BoN-Joint

Combine the BoN objective and the cross-entropy loss to train NAT from scratch. This process usually takes about 5 days.

$ sh joint_wmt.sh
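
To give an intuition for what a bag-of-n-grams difference measures, here is a minimal sketch that computes the L1 distance between the n-gram bags of two token sequences. It is an illustration only, not the code in this repository: the function names `ngram_bag` and `bon_l1_distance` are hypothetical, and it works on discrete token sequences, whereas a training objective has to be computed from the model's output distributions so that it stays differentiable.

```python
from collections import Counter


def ngram_bag(tokens, n):
    """Return the bag (multiset) of n-grams in a token sequence."""
    # Sequences shorter than n yield an empty bag.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bon_l1_distance(hypothesis, reference, n=2):
    """L1 distance between the n-gram bags of two token sequences.

    Hypothetical helper for illustration; assumes discrete tokens.
    """
    hyp_bag = ngram_bag(hypothesis, n)
    ref_bag = ngram_bag(reference, n)
    # Counter returns 0 for missing keys, so iterate over the union of n-grams.
    return sum(abs(hyp_bag[g] - ref_bag[g]) for g in hyp_bag.keys() | ref_bag.keys())


reference = "the cat sat on the mat".split()
hypothesis = "the cat cat on the mat".split()
distance = bon_l1_distance(hypothesis, reference, n=2)
print(distance)
```

A repeated token such as the second "cat" changes several bigrams at once, so the distance penalizes it more than a per-position comparison would.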

Take a checkpoint and train the length prediction model. This process usually takes about 1 day.

$ sh tune_wmt.sh

Decode the test set. This process usually takes about 20 seconds.

$ sh decode_wmt.sh

BoN-FT

First, train a NAT model using the cross-entropy loss. This process usually takes about 5 days.

$ sh mle_wmt.sh

Then, take a pre-trained checkpoint and finetune the NAT model using the BoN objective. This process usually takes about 3 hours.

$ sh bontune_wmt.sh

Take a finetuned checkpoint and train the length prediction model. This process usually takes about 1 day.

$ sh tune_wmt.sh

Decode the test set. This process usually takes about 20 seconds.

$ sh decode_wmt.sh

Reinforce-NAT

We also implement Reinforce-NAT (lines 1294-1390), described in the paper Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation. See RSI-NAT for usage.

Citation

If you find the resources in this repository useful, please consider citing:

@article{Shao:19,
  author    = {Chenze Shao and Yang Feng and Jinchao Zhang and Fandong Meng and Xilin Chen and Jie Zhou},
  title     = {Minimizing the Bag-of-Ngrams Difference for Non-Autoregressive Neural Machine Translation},
  year      = {2019},
  journal   = {arXiv preprint arXiv:1911.09320},
}