Non-autoregressive neural machine translation with monolingual data

jzhou316/nar-mt-mono

nar-mt-mono

MIT License

Non-autoregressive neural machine translation with monolingual data

Paper link:
Improving Non-autoregressive Neural Machine Translation with Monolingual Data (ACL 2020)
Jiawei Zhou, Phillip Keung

Data

Paired Data

From the GitHub repo.

Download the datasets and extract them in the current directory. All corpora are tokenized, and BLEU is evaluated on the tokenized text.
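
As an illustration of what scoring on already-tokenized text involves, here is a minimal sketch of corpus-level BLEU in plain Python. It is not the evaluation script used in this repo. The helper names `ngrams` and `corpus_bleu` are hypothetical, and the sketch assumes one reference per hypothesis, whitespace-separated tokens, and no smoothing.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100) for pre-tokenized, whitespace-separated lines.

    Hypothetical helper: one reference per hypothesis, no smoothing.
    """
    matches = [0] * max_n  # clipped n-gram matches per order
    totals = [0] * max_n   # hypothesis n-gram counts per order
    hyp_len = ref_len = 0
    for hyp_line, ref_line in zip(hypotheses, references):
        # The text is already tokenized, so splitting on whitespace is enough.
        hyp, ref = hyp_line.split(), ref_line.split()
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # unsmoothed BLEU is zero if any order has no match
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    # An exact match scores 100.0.
    print(corpus_bleu(["the cat sat on the mat"], ["the cat sat on the mat"]))
```

Because no tokenizer is applied inside the function, scores computed this way are only comparable to other scores computed on the same tokenization.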

Monolingual Data

See the scripts and notes in data_procs, which cover the general pipeline for

  • data downloading
  • data processing

consistent with the NAR literature.

Results

BLEU scores

| Model | WMT16 En -> Ro | WMT16 Ro -> En | WMT14 En -> De | WMT14 De -> En |
| --- | --- | --- | --- | --- |
| Our NAR baseline | 31.21 | 32.06 | 23.57 | 29.01 |
| + monolingual data | 31.96 | 33.57 | 25.73 | 30.18 |
| + longer training till convergence | 32.30 | 33.56 | 26.54 | 30.80 |
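
To read the table as gains over the baseline, the short sketch below subtracts the baseline row from the other two rows. The numbers are copied from the table above. The variable names and the `gains` helper are hypothetical and only serve this illustration.

```python
# BLEU scores copied from the table above, one value per translation direction.
directions = ["WMT16 En->Ro", "WMT16 Ro->En", "WMT14 En->De", "WMT14 De->En"]
baseline = [31.21, 32.06, 23.57, 29.01]
with_mono = [31.96, 33.57, 25.73, 30.18]
longer_training = [32.30, 33.56, 26.54, 30.80]


def gains(scores, reference):
    """Per-direction BLEU difference, rounded to two decimals."""
    return [round(s - r, 2) for s, r in zip(scores, reference)]


mono_gain = gains(with_mono, baseline)
longer_gain = gains(longer_training, baseline)

for direction, g_mono, g_longer in zip(directions, mono_gain, longer_gain):
    print(f"{direction}: {g_mono:+.2f} with monolingual data, "
          f"{g_longer:+.2f} with longer training")
```

On these numbers, the gain from monolingual data ranges from 0.75 BLEU (WMT16 En -> Ro) to 2.16 BLEU (WMT14 En -> De).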

Citing

@article{zhou2020improving,
  title={Improving Non-autoregressive Neural Machine Translation with Monolingual Data},
  author={Zhou, Jiawei and Keung, Phillip},
  journal={arXiv preprint arXiv:2005.00932},
  year={2020}
}
