
Improving Translation Faithfulness of Large Language Models via Augmenting Instructions

Overview

We introduce a novel method named SWIE (Segment-Weighted Instruction Embedding), which uses parameterized adapters to encode instructions and introduces segmented weights to enable a natural integration of instruction representations and global representations. To further improve translation faithfulness, we present OVERMISS, an instruction dataset that uses our proposed framework to collect contrastive negative samples specifically targeting over-translation and miss-translation issues. The paper is available on arXiv; please refer to it for more details.

Figure 1: The model structure of SWIE

Figure 2: An instance of a translation instruction and an instance of OVERMISS

Environment

  • python 3.8.3
  • transformers==4.28.0.dev0
  • deepspeed==0.8.3
  • numpy==1.21
  • torch==2.0.1+cu117
  • accelerate==0.16.0
  • datasets==2.9.0
  • sentencepiece
  • sacrebleu
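
A minimal setup sketch based on the list above (assumed commands, not taken from the repository): torch 2.0.1+cu117 is installed from the PyTorch CUDA 11.7 wheel index, and transformers 4.28.0.dev0 is a development version, so an install from source is assumed.

```bash
# Assumed setup commands; versions taken from the environment list above.
pip install deepspeed==0.8.3 numpy==1.21 accelerate==0.16.0 datasets==2.9.0
pip install sentencepiece sacrebleu
# torch 2.0.1 built against CUDA 11.7:
pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu117
# transformers 4.28.0.dev0 is a dev version; installing from source is assumed
# (the exact commit used by the authors is not specified here):
pip install git+https://github.com/huggingface/transformers.git
```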

Dataset

Training Data

Parrot-hint: open-sourced at https://github.com/wxjiao/ParroT

OverMiss: train_data/overmiss_hf.json
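
To sanity-check the OverMiss file, it can be pretty-printed with Python's built-in json.tool. This is an assumed inspection command, not part of the repository's scripts:

```bash
# Peek at the first lines of the OverMiss training data (assumed to be
# standard JSON; json.tool pretty-prints it, head keeps the output short).
python -m json.tool train_data/overmiss_hf.json | head -n 40
```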

Test Data

Flores: directory test/Flores

WMT22/WMT22-concat/WMT22-zero-shot: directory test/WMT22

How to Use

Train

  • for LLaMA-7b: sh train_scripts/finetune_4gpu_llama.sh
  • for BLOOMZ-3b: sh train_scripts/finetune_8gpu.sh
  • for BLOOMZ-7b1-mt: sh train_scripts/finetune_4gpu.sh

Inference

Run the following script to obtain the model inference results.

sh infer_scripts/run_infer.sh
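
After inference, translation quality can be scored with sacrebleu from the environment list above. The file names below are placeholders, since the actual output paths are set inside infer_scripts/run_infer.sh:

```bash
# Assumed evaluation sketch: score hypotheses against references with the
# sacrebleu CLI. reference.txt and hypothesis.txt are placeholder names.
sacrebleu reference.txt -i hypothesis.txt -m bleu -w 2
```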

Experiment

The experimental results are shown in the following table.

Table: Experimental results (see the paper for details).

Citation

Please cite us if you find the paper or code helpful.

@misc{chen2023improving,
    title={Improving Translation Faithfulness of Large Language Models via Augmenting Instructions},
    author={Yijie Chen and Yijin Liu and Fandong Meng and Yufeng Chen and Jinan Xu and Jie Zhou},
    year={2023},
    eprint={2308.12674},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
