This is an implementation of the paper "End-to-end Speech Translation via Cross-modal Progressive Training" (Interspeech 2021)
Limits the need for end-to-end Speech Translation data by leveraging Automatic Speech Recognition and Machine Translation data instead, using zero-shot multilingual text translation techniques.
Systems submitted to IWSLT 2022 by the MT-UPC group.
Code for the paper "Cross-modal Contrastive Learning for Speech Translation" (NAACL 2022)
PyTorch toolkit for streaming speech recognition, speech translation and simultaneous translation based on fairseq.
Code for the paper "Does Joint Training Really Help Cascaded Speech Translation?" (EMNLP 2022)
Code for EMNLP 2022 main conference paper "Information-Transport-based Policy for Simultaneous Translation"
Speech-to-text and translation client-server application using Google Cloud
SHAS: Approaching optimal Segmentation for End-to-End Speech Translation
Revisiting End-to-End Speech-to-Text Translation From Scratch
Zero -- A neural machine translation system
Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"
The project for speech translation
Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".
Code for ACL 2023 main conference paper "Understanding and Bridging the Modality Gap for Speech Translation".
Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".
Unsupervised spoken utterance scoring
Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"
A real-time speech transcription and translation application using OpenAI Whisper and a free translation API. The interface is built with Tkinter, and the code is written entirely in Python.
A fast parallel PyTorch implementation of "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" (https://arxiv.org/abs/1905.11235).