Does Joint Training Really Help Cascaded Speech Translation?

This repository contains the code for the EMNLP 2022 paper "Does Joint Training Really Help Cascaded Speech Translation?" (arXiv) and is built on fairseq.

Cite This Work

To cite this work, please use the following BibTeX entry:

@InProceedings{tran22:joint_training_cascaded_speech_translation,
	author={Tran, Viet Anh Khoa and Thulke, David and Gao, Yingbo and Herold, Christian and Ney, Hermann},
	title={Does Joint Training Really Help Cascaded Speech Translation?},
	booktitle={Conference on Empirical Methods in Natural Language Processing},
	year={2022},
	address={Abu Dhabi, United Arab Emirates},
	month=nov,
	booktitlelink={https://2022.emnlp.org/},
}

Requirements and Installation (adapted from fairseq)

  • PyTorch version 1.7.1
  • torchaudio 0.7.2
  • Python version >= 3.7
  • To install fairseq and develop locally:
git clone https://github.com/tran-khoa/joint-training-cascaded-st
cd joint-training-cascaded-st
pip install --editable ./
# on macOS:
# CFLAGS="-stdlib=libc++" pip install --editable ./

cd projects/speech_translation
pip install -r requirements.txt
  • For faster training install NVIDIA's apex library:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" \
  --global-option="--deprecated_fused_adam" --global-option="--xentropy" \
  --global-option="--fast_multihead_attn" ./

Running experiments

The implementation is located in projects/speech_translation; please refer to the scripts in projects/speech_translation/experiments. In these scripts, joint-seq refers to Top-K-Train in the paper, and tight refers to tight integration as introduced in "Tight Integrated End-to-End Training for Cascaded Speech Translation".
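
As an illustration, an experiment is launched by running the corresponding shell script from projects/speech_translation (the script name below is hypothetical; pick an actual one from the experiments directory):

cd projects/speech_translation
ls experiments/                        # list the available experiment scripts
bash experiments/joint_seq_example.sh  # hypothetical name, for illustration only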

License (adapted from fairseq)

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.
