
TAB (Tuning with Auxiliary Branch)

Code and pretrained models for the ACL 2023 main conference short paper: "Modality Adaption or Regularization? A Case Study on End-to-End Speech Translation".

Environment Configuration

conda create -n tab python=3.6
conda activate tab
conda install pytorch==1.8.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
# cudatoolkit=10.2 is also acceptable.

mkdir ~/TAB && cd ~/TAB && mkdir data checkpoints
git clone https://github.com/hannlp/tab Fairseq-S2T
cd Fairseq-S2T
pip install --upgrade pip
pip install --editable ./
pip install sacrebleu==1.4.13 pandas sacremoses sentencepiece configargparse gpustat tensorboard editdistance
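Optionally, you can sanity-check the environment from Python before moving on. This is a minimal sketch, assuming the conda and pip installs above completed without errors:

# Quick environment sanity check (optional).
import torch
import torchaudio

print("torch:", torch.__version__)            # expected: 1.8.0
print("torchaudio:", torchaudio.__version__)  # expected: 0.8.0
print("CUDA available:", torch.cuda.is_available())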

Examples on MuST-C en-de

cd ~/TAB/Fairseq-S2T/egs/mustc/st

Data Preparation

ln -s path/to/mustc_root/ende ~/TAB/data/mustc.en-de

bash run.sh --tgt_lang de
# It may take a bit of time.

Once you have pre-processed the data, the directory should look like this:

ls ~/TAB/data/mustc.en-de/st
config_share.yaml  dev.tsv      spm_unigram10000_st_share.model  spm_unigram10000_st_share.vocab  tst-COMMON.en
dev.en             fbank80.zip  spm_unigram10000_st_share.txt    train.tsv                        tst-COMMON.tsv
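To take a quick look at what was generated, the following Python sketch loads the training manifest and the shared SentencePiece model. The column names are an assumption based on the usual fairseq speech-to-text TSV convention and may differ slightly in this repository:

# Sketch for inspecting the pre-processed data; column names are assumed.
import os
import pandas as pd
import sentencepiece as spm

st_dir = os.path.expanduser("~/TAB/data/mustc.en-de/st")

# The TSV manifests are tab-separated; quoting is disabled to keep raw text intact.
train = pd.read_csv(os.path.join(st_dir, "train.tsv"), sep="\t", quoting=3)
print(train.columns.tolist())    # e.g. id, audio, n_frames, tgt_text (assumed)
print(len(train), "training utterances")

# Shared subword model used for both source and target text.
sp = spm.SentencePieceProcessor()
sp.Load(os.path.join(st_dir, "spm_unigram10000_st_share.model"))
print(sp.GetPieceSize(), "subword units")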

Tuning with Auxiliary Branch

  1. Download the pre-trained ASR and MT models and edit conf/tab.yaml so that the load-pretrained-xxx-from entries point to the corresponding checkpoints.
  2. Other hyperparameters from the paper can be adjusted here as well, including use-auxiliary-branch, consistency-type, consistency-weight, replacement-probability-strategy, replacement-probability, and uncertainty-gamma.
  3. Run the following command to start training. Note that gpu_num * max_tokens * update_freq should be 320k (see the sketch below), and exp_tag can be set to mark and distinguish different experiments.
bash run.sh --stage 1 --stop_stage 1 --tgt_lang de --train_config tab --exp_tag uncertainty_gamma0.5_alpha5 --gpu_num 4 --max_tokens 20000 --update_freq 4

The trained models will be saved at ~/TAB/checkpoints/mustc.en-de/st/{DATE}_tab_{EXP_TAG}
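The 320k figure is simply the effective number of tokens per update, i.e. gpu_num × max_tokens × update_freq. A minimal check for the values used in the command above:

# Effective tokens per update for the command above; the recipe expects 320k.
gpu_num, max_tokens, update_freq = 4, 20000, 4
effective_tokens = gpu_num * max_tokens * update_freq
print(effective_tokens)  # 320000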

Evaluate

bash run.sh --stage 2 --stop_stage 2 --tgt_lang de --exp_name {DATE}_tab_{EXP_TAG} --gpu_num 1 --test_subset dev,tst-COMMON
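Scoring is handled inside run.sh, but if you want to re-score a detokenized hypothesis file against a reference yourself, here is a minimal sketch with the pinned sacrebleu; hyp.txt and ref.txt are placeholder file names, not outputs produced by the script:

# Standalone re-scoring sketch; hyp.txt and ref.txt are placeholders.
import sacrebleu

with open("hyp.txt", encoding="utf-8") as f:
    hyps = [line.strip() for line in f]
with open("ref.txt", encoding="utf-8") as f:
    refs = [line.strip() for line in f]

bleu = sacrebleu.corpus_bleu(hyps, [refs])  # a single reference stream
print(f"BLEU = {bleu.score:.2f}")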

Contact and Citation

To keep things simple, we have not included the scripts for the pre-training stage. If you have any questions, please do not hesitate to contact me at hanyuchen114@gmail.com. If you find this repository helpful, please cite the paper as:

@inproceedings{han-etal-2023-modality,
    title = "Modality Adaption or Regularization? A Case Study on End-to-End Speech Translation",
    author = "Han, Yuchen  and
      Xu, Chen  and
      Xiao, Tong  and
      Zhu, Jingbo",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-short.115",
    pages = "1340--1348",
    abstract = "Pre-training and fine-tuning is a paradigm for alleviating the data scarcity problem in end-to-end speech translation (E2E ST). The commonplace {``}modality gap{''} between speech and text data often leads to inconsistent inputs between pre-training and fine-tuning. However, we observe that this gap occurs in the early stages of fine-tuning, but does not have a major impact on the final performance. On the other hand, we find that there has another gap, which we call the {``}capacity gap{''}: high resource tasks (such as ASR and MT) always require a large model to fit, when the model is reused for a low resource task (E2E ST), it will get a sub-optimal performance due to the over-fitting. In a case study, we find that the regularization plays a more important role than the well-designed modality adaption method, which achieves 29.0 for en-de and 40.3 for en-fr on the MuST-C dataset.",
}
