[TACL 2024] MAPS enables LLMs🤖 to mimic the human😁 translation process.

zwhe99/MAPS-mt

🗺️ MAPS: Multi-Aspect Prompting and Selection

Implementation of our paper:

Exploring Human-Like Translation Strategy with Large Language Models

🔥 Update

  • [March 19, 2024]: Accepted to TACL 2024!
  • [June 14, 2023]: This work now has a Demo. Try it!
  • [June 10, 2023]: interactive.py now runs MAPS-mt in interactive mode.
  • [June 10, 2023]: We now support translation between any pair of these languages: English, Chinese, Japanese, French, and German.

MAPS

Motivation

[Figure: intro]

The difference between machine and human translation in an English-Chinese example. Typical neural machine translation is a source-target mapping process, while human translators can take complex steps to ensure the quality and accuracy of the translation.



Framework

[Figure: method]

MAPS aims to enable LLMs to mimic the human translation process by multi-aspect prompting and selection.
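
The prompt-then-select idea can be sketched as follows. This is a minimal illustration, not the repository's implementation. It assumes the three aspects described in the paper (keywords, topics, and relevant demonstrations) and a reference-free quality scorer such as the COMET-QE checkpoint downloaded below. The names maps_translate, ask_llm, score_qe, and ASPECT_PROMPTS, as well as the prompt wording, are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical prompt templates for the three aspects; the real prompts differ.
ASPECT_PROMPTS: Dict[str, str] = {
    "keywords": "Extract the keywords of this sentence and translate them: {src}",
    "topics": "Describe the topic of this sentence: {src}",
    "demo": "Write a related sentence pair as a translation example for: {src}",
}


def maps_translate(
    src: str,
    ask_llm: Callable[[str], str],
    score_qe: Callable[[str, str], float],
) -> str:
    """Return the candidate translation with the highest quality-estimation score."""
    # Baseline candidate: translate without any extra knowledge.
    candidates: List[str] = [ask_llm(f"Translate: {src}")]

    # One extra candidate per aspect, each guided by knowledge the LLM produced itself.
    for template in ASPECT_PROMPTS.values():
        knowledge = ask_llm(template.format(src=src))
        candidates.append(ask_llm(f"{knowledge}\nTranslate: {src}"))

    # Selection: keep the candidate the reference-free scorer rates highest.
    return max(candidates, key=lambda hyp: score_qe(src, hyp))
```

Passing the LLM call and the scorer as plain callables keeps the sketch independent of any particular API; in practice ask_llm would wrap the model request and score_qe would wrap the quality-estimation model.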


Dependencies

  • Download COMET and BLEURT checkpoints:

    mkdir -p eval_ckpt
    wget https://unbabel-experimental-models.s3.amazonaws.com/comet/wmt21/wmt21-comet-qe-da.tar.gz
    tar -xf wmt21-comet-qe-da.tar.gz -C eval_ckpt/
    
    wget https://storage.googleapis.com/bleurt-oss-21/BLEURT-20.zip
    unzip -d eval_ckpt/ BLEURT-20.zip
  • Create and activate the conda environment:

    conda create -n maps -c pytorch -c nvidia python==3.8.13 krb5 git pytorch==2.0.0 pytorch-cuda=11.7
    conda activate maps
  • Install the other Python packages:

    pip3 install -r requirements.txt
    

Reproduce the main results

Run

Preparation

  • Set your OpenAI API_KEY in model/openai/translate.py
  • Set the path to the Alpaca checkpoint file in run-maps-alpaca.sh and run-translation-alpaca.sh
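
If you would rather not write the key into the source file, one option is to read it from an environment variable. The sketch below is an assumption, not part of the repository: load_api_key is a hypothetical helper, OPENAI_API_KEY is an assumed variable name, and how translate.py consumes API_KEY is not shown in this README.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None, name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the environment variable `name` (hypothetical helper)."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before running.")
    return key


# Assumed usage inside model/openai/translate.py:
# API_KEY = load_api_key()
```

Failing early with a clear message avoids sending requests with an empty key.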

Run MAPS

  • text-davinci-003: sh run-maps.sh
  • Alpaca: sh run-maps-alpaca.sh

Run other methods

  • text-davinci-003: sh run-translation-003.sh
  • Alpaca: sh run-translation-alpaca.sh

Note: The translation results have already been generated and saved in the output directory. Therefore, the scripts won't repeat the inference. If you want to regenerate the results, simply delete the contents within the output directory.
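
The skip-if-present behaviour described in the note can be illustrated with a small sketch. It only demonstrates the general pattern; run_if_missing is a hypothetical helper, and how the repository's scripts actually detect existing results is not shown here.

```python
from pathlib import Path
from typing import Callable


def run_if_missing(out_file: Path, infer: Callable[[], str]) -> bool:
    """Run `infer` and save its result only when `out_file` does not exist yet.

    Returns True if inference ran, False if the cached file was reused.
    """
    if out_file.exists():
        return False  # cached result found, nothing to do
    out_file.parent.mkdir(parents=True, exist_ok=True)
    out_file.write_text(infer(), encoding="utf-8")
    return True
```

Under this pattern, deleting a file from the output directory is what triggers a fresh inference run for it.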


Evaluation

sh run-evaluation.sh > evaluation.log


Interactive

To try MAPS quickly, run the interactive script as shown below. It needs no GPU or CUDA and currently supports only text-davinci-003:

# Preparation
wget https://unbabel-experimental-models.s3.amazonaws.com/comet/wmt21/wmt21-comet-qe-da.tar.gz
tar -xf wmt21-comet-qe-da.tar.gz -C eval_ckpt/   
conda create -n maps -c pytorch python==3.8.13 pytorch==2.0.0  
conda activate maps
pip3 install -r requirements.txt
# Interactive
python3 interactive.py --lang-pair en-zh

Enter source English sentence: Joint Aid for Dogs is a high specification joint and muscle supplement with glucosamine for dogs, designed to aid freedom of movement.

Output:

[Figure: example output of interactive.py]

Remember to set your OpenAI API_KEY in model/openai/translate.py. You can also try the demo website.
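
The --lang-pair value used above (en-zh) joins a source and a target language code with a hyphen. A minimal sketch of validating such a value against the five supported languages follows. The codes for English and Chinese come from the example; ja, fr, and de are assumed ISO 639-1 codes, and parse_lang_pair is a hypothetical helper rather than the parser in interactive.py.

```python
from typing import Tuple

# English and Chinese codes come from the `en-zh` example; the rest are assumed ISO 639-1 codes.
LANGUAGES = {"en": "English", "zh": "Chinese", "ja": "Japanese", "fr": "French", "de": "German"}


def parse_lang_pair(pair: str) -> Tuple[str, str]:
    """Split 'src-tgt' into full language names, rejecting unknown or identical codes."""
    src, sep, tgt = pair.partition("-")
    if not sep or src not in LANGUAGES or tgt not in LANGUAGES or src == tgt:
        raise ValueError(f"Unsupported language pair: {pair!r}")
    return LANGUAGES[src], LANGUAGES[tgt]
```

With five languages this accepts twenty ordered pairs, one for each direction between two distinct languages.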

Citation

@article{he2023exploring,
    author = {He, Zhiwei and Liang, Tian and Jiao, Wenxiang and Zhang, Zhuosheng and Yang, Yujiu and Wang, Rui and Tu, Zhaopeng and Shi, Shuming and Wang, Xing},
    title = "{Exploring Human-Like Translation Strategy with Large Language Models}",
    journal = {Transactions of the Association for Computational Linguistics},
    volume = {12},
    pages = {229-246},
    year = {2024},
    month = {03},
    issn = {2307-387X},
    doi = {10.1162/tacl_a_00642},
    url = {https://doi.org/10.1162/tacl\_a\_00642},
    eprint = {https://direct.mit.edu/tacl/article-pdf/doi/10.1162/tacl\_a\_00642/2346100/tacl\_a\_00642.pdf},
}
