PromptST: Abstract Prompt Learning for End-to-End Speech Translation

This repository implements the EMNLP 2023 paper "PromptST: Abstract Prompt Learning for End-to-End Speech Translation" (paper: https://aclanthology.org/2023.emnlp-main.627).

👀 Overview

PromptST aims to strengthen the abstract representation power of the encoder in speech-to-text (S2T) models.

Results on the CoVoST En-X dataset

We report case-sensitive detokenized BLEU via the sacrebleu toolkit.

Model	En-De	En-Ca	En-Ar	En-Tr	Avg.
Continue Train	25.9	33.3	19.3	17.6	24.0
PromptST	26.4	33.7	19.6	17.9	24.4
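
The Avg. column appears to be the unweighted mean of the four language pairs, rounded to one decimal place. A quick check in Python, using only the numbers from the table above (the dictionary and function names are illustrative and not part of the repository):

```python
# BLEU scores copied from the results table above.
RESULTS = {
    "Continue Train": {"En-De": 25.9, "En-Ca": 33.3, "En-Ar": 19.3, "En-Tr": 17.6},
    "PromptST": {"En-De": 26.4, "En-Ca": 33.7, "En-Ar": 19.6, "En-Tr": 17.9},
}


def average_bleu(scores):
    """Unweighted mean over language pairs, rounded to one decimal place."""
    return round(sum(scores.values()) / len(scores), 1)


for model, scores in RESULTS.items():
    print(model, average_bleu(scores))
```

Under this assumption the averages come out to 24.0 and 24.4, matching the table, so PromptST gains 0.4 BLEU on average over the Continue Train baseline.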

BLEU scores on the dev set when PromptST is applied to different ranges of encoder layers.

Model	En-De	En-Ca	En-Ar	En-Tr
0-24 layers	29.9	36.7	23.5	20.4
20-24 layers	29.8	36.8	23.4	20.6
16-24 layers	29.9	36.5	23.7	20.5
12-24 layers	30.1	37.4	23.8	21.0

If you find this repo useful, please cite:

@inproceedings{yu-etal-2023-promptst,
    title = "{P}rompt{ST}: Abstract Prompt Learning for End-to-End Speech Translation",
    author = "Yu, Tengfei and Ding, Liang and Liu, Xuebo and Chen, Kehai and Zhang, Meishan and Tao, Dacheng and Zhang, Min",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
    year = "2023",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-main.627",
    pages = "10140--10154",
}

Speech-Senteval Benchmark

You can download the Speech-Senteval benchmark here.

⬇️ Download Trained Models

The models are trained with PyTorch.

	Model
En-De	Download
En-Ca	Download
En-Ar	Download
En-Tr	Download

Training & Generation Instructions

⚙️ Requirements and Installation

PyTorch version >= 1.5.0
Python version >= 3.6
transformers == 4.27.3
For training new models, you'll also need an NVIDIA GPU and NCCL

git clone git@github.com:ytf-philp/PromptST.git
cd PromptST
pip3 install -r requirements.txt
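
After installation, it can be useful to confirm that the environment matches the requirements listed above. The following is a minimal sketch, not part of the repository: the helper names `version_tuple` and `check_requirements` are hypothetical, and the script itself assumes Python 3.8 or later because it relies on `importlib.metadata`.

```python
import re
import sys
from importlib import metadata  # part of the standard library from Python 3.8


def version_tuple(version):
    """Turn '1.13.1+cu117' into (1, 13, 1), ignoring local build suffixes."""
    parts = []
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:  # pad so that '2.0' compares as (2, 0, 0)
        parts.append(0)
    return tuple(parts)


def check_requirements():
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("Python >= 3.6 is required")
    # Version constraints taken from the requirements list above.
    for package, op, wanted in [("torch", ">=", "1.5.0"), ("transformers", "==", "4.27.3")]:
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            problems.append(f"{package} is not installed")
            continue
        have, want = version_tuple(installed), version_tuple(wanted)
        ok = have >= want if op == ">=" else have == want
        if not ok:
            problems.append(f"{package} {installed} found, expected {op} {wanted}")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("WARNING:", problem)
```

The check only compares the numeric release components, so pre-release or local build tags (such as `+cu117`) are ignored.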

🧪 Probing Task Analysis

Download Speech-Senteval
Unzip speech_senteval and save it to the root/data path (the unzipped dataset should contain two folders, "probing_text" and "sent_audio")
Extract the representation of every layer and run the probing tasks
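
Before extracting representations, it may help to confirm that the unzipped dataset has the layout described in the steps above. This is a small sketch under assumptions: the helper `missing_subdirs` is hypothetical, and the example path `data/speech_senteval` is a guess at where the archive was unzipped, so adjust it to your setup.

```python
from pathlib import Path

# Folder names taken from the steps above.
EXPECTED_SUBDIRS = ("probing_text", "sent_audio")


def missing_subdirs(dataset_dir):
    """Return the expected sub-folders that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Hypothetical location; change it to wherever you unzipped the archive.
    missing = missing_subdirs("data/speech_senteval")
    print("missing folders:", missing or "none")
```

An empty result means both folders are present.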

Run the probing task script:

bash probing_task/bash_example.sh

🚀 Train PromptST (Example: en-de)

Preprocessing Data

Download Common Voice audio clips and transcripts (version 4).
Use data_process/dataset_de.py to save the processed dataset offline.

python data_process/dataset_de.py

Training

To train the model (using En-De as an example), run:

python -m torch.distributed.launch --nproc_per_node=8 --master_port 21303 ./PromptST/main/en_de/en_de_continue.py
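
`torch.distributed.launch` starts one process per GPU (eight in the command above) and tells each process its index on the machine, either through a `--local_rank` command-line argument (spelled `--local-rank` in PyTorch 2.x) or through the `LOCAL_RANK` environment variable. How the repository's training script consumes this value is not shown here; the following is only a sketch of a common pattern, and the function name `resolve_local_rank` is hypothetical.

```python
import argparse
import os


def resolve_local_rank(argv=None):
    """Return this process's GPU index on the current machine.

    Prefers the flag supplied by the launcher, then falls back to the
    LOCAL_RANK environment variable, then to 0 for single-process runs.
    """
    parser = argparse.ArgumentParser(add_help=False)
    # Accept both spellings: --local_rank (older) and --local-rank (PyTorch 2.x).
    parser.add_argument("--local_rank", "--local-rank", type=int, default=None)
    # parse_known_args ignores any other arguments the script may define.
    args, _unknown = parser.parse_known_args(argv)
    if args.local_rank is not None:
        return args.local_rank
    return int(os.environ.get("LOCAL_RANK", 0))
```

A training script would typically use the returned index to select its device before building the model, so that each of the launched processes works on a different GPU.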

Evaluation

Convert the model

python ./main/model_convert.py

Inference

python -m torch.distributed.launch --nproc_per_node=4 --master_port 21393 ./main/en_de/en_de_inference.py --model_path ./output/model_convert

Contact

If you have any questions related to the code or the paper, feel free to email Tengfei Yu (921692739@qq.com).
