ESRT

Bandwidth-Efficient and Privacy-Preserving Edge-Cloud Many-to-Many Speech Translation.

ESRT supports many-to-many speech-to-text translation across 45 languages (45 × 44 = 1,980 translation directions). It uses an edge-cloud split inference architecture in which only compressed acoustic features, not raw audio, are transmitted, which protects voice privacy and reduces bandwidth.

Setup

uv venv --python 3.10
source .venv/bin/activate
uv pip install -r requirements.txt

Test Data

git clone https://huggingface.co/datasets/yxdu/fleurs_eng_test ./fleurs_eng_test

Inference

Inference runs in two stages: one on the edge side and one on the cloud side.

# Offline, for quick testing
python test_inference.py

# An online deployment guide is coming soon.
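
To make the edge/cloud split concrete, here is a toy sketch of the general idea only. It is not ESRT's encoder or API: the functions `edge_encode` and `cloud_decode`, the 16 kHz sample rate, the 20 ms frame size, and the log-energy feature are all assumptions chosen for illustration. The point is simply that the edge side sends a compact feature payload rather than the waveform.

```python
import math
import struct

SAMPLE_RATE = 16_000  # assumed sample rate in Hz (hypothetical)
FRAME = 320           # assumed frame length: 320 samples = 20 ms at 16 kHz


def edge_encode(samples):
    """Toy edge stage: reduce int16 PCM samples to one quantized byte per frame."""
    features = bytearray()
    for start in range(0, len(samples) - FRAME + 1, FRAME):
        frame = samples[start:start + FRAME]
        energy = sum(s * s for s in frame) / FRAME
        # log10 of int16 energy stays below ~9.04, so scaling by 25 fits in a byte
        log_energy = math.log10(energy + 1.0)
        features.append(min(255, int(log_energy * 25)))
    return bytes(features)


def cloud_decode(payload):
    """Toy cloud stage: stands in for the translation model; only dequantizes here."""
    return [b / 25 for b in payload]


# One second of a 440 Hz tone as stand-in audio
samples = [
    int(10_000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
    for n in range(SAMPLE_RATE)
]
raw_bytes = len(struct.pack(f"<{len(samples)}h", *samples))  # size of raw 16-bit PCM
payload = edge_encode(samples)                               # what the edge would send
received = cloud_decode(payload)

print(f"raw audio: {raw_bytes} bytes, transmitted features: {len(payload)} bytes")
```

A real system would transmit learned acoustic features rather than frame energies, so the sizes printed here say nothing about ESRT's actual bandwidth savings.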

Training

The training code will be open-sourced in a future release. It has been validated on:

GPU: NVIDIA A100 80GB × 8
NPU: Huawei Ascend 910C 64GB × 8

Supported Languages

Family	Languages
Afro-Asiatic	Arabic, Hebrew
Austroasiatic	Khmer, Vietnamese
Austronesian	Indonesian, Malay, Tagalog
Dravidian	Tamil
Indo-European	Bengali, Bulgarian, Catalan, Czech, Danish, Dutch, English, French, German, Greek, Hindi, Croatian, Italian, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Urdu
Japonic	Japanese
Koreanic	Korean
Kra–Dai	Lao, Thai
Sino-Tibetan	Chinese, Burmese, Cantonese
Turkic	Azerbaijani, Kazakh, Turkish, Uzbek
Uralic	Finnish, Hungarian
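
As a quick sanity check on the table above, the following sketch counts the listed languages and enumerates the ordered source-to-target pairs. The dictionary `LANGUAGES_BY_FAMILY` is just a transcription of the table and is not part of the ESRT code.

```python
from itertools import permutations

# Transcribed from the table above; not an ESRT data structure
LANGUAGES_BY_FAMILY = {
    "Afro-Asiatic": ["Arabic", "Hebrew"],
    "Austroasiatic": ["Khmer", "Vietnamese"],
    "Austronesian": ["Indonesian", "Malay", "Tagalog"],
    "Dravidian": ["Tamil"],
    "Indo-European": [
        "Bengali", "Bulgarian", "Catalan", "Czech", "Danish", "Dutch",
        "English", "French", "German", "Greek", "Hindi", "Croatian",
        "Italian", "Norwegian", "Persian", "Polish", "Portuguese", "Romanian",
        "Russian", "Slovak", "Slovenian", "Spanish", "Swedish", "Urdu",
    ],
    "Japonic": ["Japanese"],
    "Koreanic": ["Korean"],
    "Kra-Dai": ["Lao", "Thai"],
    "Sino-Tibetan": ["Chinese", "Burmese", "Cantonese"],
    "Turkic": ["Azerbaijani", "Kazakh", "Turkish", "Uzbek"],
    "Uralic": ["Finnish", "Hungarian"],
}

languages = [lang for langs in LANGUAGES_BY_FAMILY.values() for lang in langs]

# Each ordered (source, target) pair with source != target is one direction
directions = list(permutations(languages, 2))

print(len(languages), len(directions))  # 45 1980
```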

Citation

@misc{du2026bandwidthefficientprivacypreservingedgecloudmanytomany,
      title={Bandwidth-Efficient and Privacy-Preserving Edge-Cloud Many-to-Many Speech Translation}, 
      author={Yexing Du and Kaiyuan Liu and Youcheng Pan and Bo Yang and Ming Liu and Bing Qin and Yang Xiang},
      year={2026},
      eprint={2605.28642},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2605.28642}, 
}
