
Simple and Scalable Nearest Neighbor Machine Translation

Official Code for our paper "Simple and Scalable Nearest Neighbor Machine Translation" (ICLR 2023).

This project implements our SK-MT (short for Simple and Scalable kNN-MT) as well as vanilla kNN-MT. The implementation is built upon THUMT and heavily inspired by adaptive-knn-mt and KoK. Many thanks to the authors for making their code available.
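
For context, vanilla kNN-MT (also implemented here) interpolates the base NMT distribution with a kNN distribution computed over retrieved datastore entries. A standard formulation (following the original kNN-MT paper, where $\mathcal{N}$ is the retrieved neighbor set, $q_t$ the decoder query, $d$ a distance function, and $\tau$ the temperature) is:

$$p(y_t \mid x, y_{<t}) = \lambda\, p_{\text{kNN}}(y_t \mid x, y_{<t}) + (1-\lambda)\, p_{\text{NMT}}(y_t \mid x, y_{<t})$$

$$p_{\text{kNN}}(y_t \mid x, y_{<t}) \propto \sum_{(k_i, v_i) \in \mathcal{N}} \mathbb{1}[y_t = v_i] \exp\!\left(\frac{-d(k_i, q_t)}{\tau}\right)$$

SK-MT keeps this interpolation but replaces the full-corpus datastore with a small dynamic one, built on the fly from a handful of BM25-retrieved reference samples per input sentence.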

We also provide an implementation built upon fairseq, which can be found in the fairseq branch. The SK-MT performance reported in our paper is evaluated with the THUMT framework.

Requirements and Installation

  • PyTorch version >= 1.1.0
  • Python version >= 3.6

You need to install PyTorch according to your hardware. Taking CUDA 11.6 (cu116) as an example, you can set up the environment with:

pip install --upgrade pip

pip3 install torch==1.12.0+cu116 --extra-index-url https://download.pytorch.org/whl/cu116

pip install torch-scatter -f https://data.pyg.org/whl/torch-1.12.0+cu116.html

pip install numpy==1.23

pip install tensorboardX cffi cython dataclasses hydra-core regex sacremoses sacrebleu tqdm nltk matplotlib absl-py scikit-learn tensorboard bitarray six

pip install -U git+https://github.com/pltrdy/pyrouge
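
After installation, a quick sanity check (a minimal one-liner, assuming the packages above installed cleanly) confirms that PyTorch sees your GPU and that torch-scatter imports:

python -c "import torch, torch_scatter; print(torch.__version__, torch.cuda.is_available())"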

Instructions

Pre-trained Model

The pre-trained translation model can be downloaded from this site. We use the De-En Single Model and follow adaptive-knn-mt to evaluate the performance of kNN-MT and adaptive kNN-MT. We provide the Transformer-based models we use in our experiments under the two frameworks: fairseq Model and THUMT Model.

Data

The raw data can be downloaded from this site, and you should preprocess it with the Moses toolkit and the BPE codes provided with the pre-trained model. To implement SK-MT, we recommend following copyisallyouneed to perform text retrieval with BM25. The retrieved textual data can be used directly in the THUMT framework. If you prefer fairseq, you need to follow its instructions to preprocess and binarize the textual data. For convenience, we also provide pre-processed textual data for THUMT and binarized data for fairseq.
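
As a rough sketch of this preprocessing (the paths and file names below are placeholders, not from this repo; it assumes the Moses scripts and subword-nmt are installed, so substitute your BPE tool if the pre-trained model ships fastBPE codes instead):

# placeholders, not paths from this repo
MOSES=/path/to/mosesdecoder
perl $MOSES/scripts/tokenizer/tokenizer.perl -l de < train.de > train.tok.de
perl $MOSES/scripts/tokenizer/tokenizer.perl -l en < train.en > train.tok.en
# apply the BPE codes shipped with the pre-trained model
subword-nmt apply-bpe -c bpecodes < train.tok.de > train.bpe.de
subword-nmt apply-bpe -c bpecodes < train.tok.en > train.bpe.en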

Update: We also provide the scripts to retrieve reference samples.

Domain Adaptation

This section provides instructions to perform SK-MT based on the THUMT framework. More information about the implementation on the fairseq framework can be found in the fairseq branch.

Retrieval and Preprocessing

bash scripts/domain_adaptation/preprocess.sh

Inference with SK-MT

bash scripts/domain_adaptation/run_sk_mt.sh

The batch size and other parameters should be adjusted according to your hardware. We recommend the following hyper-parameters to replicate good SK-MT results:

Model         tm counts   $k$   $\tau$
SK-MT$_{1}$   2           1     100
SK-MT$_{2}$   16          2     100

Here "tm counts" is the number of retrieved reference samples per sentence, $k$ the number of nearest neighbors, and $\tau$ the temperature.
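
The concrete values are set inside the run script. A hypothetical invocation for the SK-MT$_{2}$ setting (the variable names below are illustrative only; the real ones are defined in scripts/domain_adaptation/run_sk_mt.sh):

# hypothetical variable names; check run_sk_mt.sh for the actual ones
tm_count=16 k=2 tau=100 bash scripts/domain_adaptation/run_sk_mt.sh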

Inference with NMT

bash scripts/domain_adaptation/run_nmt.sh

Online Learning

Inference with SK-MT

bash scripts/online_learning/run_sk_mt.sh

The recommended hyper-parameters are the same as those used in Domain Adaptation.

Citation

If you find this repo helpful for your research, please cite the following paper:

@inproceedings{DBLP:conf/iclr/DaiZLCLD023,
  author       = {Yuhan Dai and
                  Zhirui Zhang and
                  Qiuzhi Liu and
                  Qu Cui and
                  Weihua Li and
                  Yichao Du and
                  Tong Xu},
  title        = {Simple and Scalable Nearest Neighbor Machine Translation},
  booktitle    = {The Eleventh International Conference on Learning Representations,
                  {ICLR} 2023, Kigali, Rwanda, May 1-5, 2023},
  publisher    = {OpenReview.net},
  year         = {2023},
  url          = {https://openreview.net/pdf?id=uu1GBD9SlLe},
  timestamp    = {Fri, 30 Jun 2023 14:55:53 +0200},
  biburl       = {https://dblp.org/rec/conf/iclr/DaiZLCLD023.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

Contact

If you have questions, suggestions, or bug reports, please email dirkiedye@gmail.com or zrustc11@gmail.com.
