# speech-translation

Here are 50 public repositories matching this topic...

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation tts speech-synthesis neural-networks deeplearning speaker-recognition asr multimodal speech-translation large-language-models speaker-diariazation generative-ai

Updated Jun 19, 2024
Python

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Updated Jun 19, 2024
Python

mt-upc / ZeroSwot

Pushing the Limits of Zero-shot End-to-End Speech Translation

translation speech-translation

Updated Jun 19, 2024
Python

ictnlp / ComSpeech

Code for ACL 2024 main conference paper "Can We Achieve High-quality Direct Speech-to-Speech Translation Without Parallel Speech Data?".

text-to-speech machine-translation speech-translation non-autoregressive-translation speech-to-speech-translation zero-shot-speech-translation

Updated Jun 19, 2024
Python

KevKibe / African-Whisper

🚀 Framework for seamless fine-tuning of Whisper model on a multi-lingual dataset and deployment to prod.

speech speech-recognition speech-to-text whisper asr speech-translation speech-transcription

Updated Jun 18, 2024
Python

ictnlp / StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Updated Jun 18, 2024
Python

espnet / espnet

End-to-End Speech Processing Toolkit

text-to-speech deep-learning chainer end-to-end machine-translation pytorch speech-synthesis speech-recognition kaldi voice-conversion speaker-diarization speech-separation speech-enhancement spoken-language-understanding speech-translation singing-voice-synthesis

Updated Jun 18, 2024
Python

hlt-mt / FBK-fairseq

Repository containing the open source code of works published at the FBK MT unit.

deep-learning pytorch speech-to-text subtitling gender-bias speech-translation simultaneous-translation

Updated Jun 17, 2024
Python

macairececile / speech-to-pictograms

Code from the paper "Towards Speech-to-Pictograms Translation" (Interspeech 2024)

machine-translation speech-recognition pictograms speech-translation interspeech2024 augmentative-and-alternative-communication

Updated Jun 10, 2024

zhangshaolei1998 / Awesome-Simultaneous-Translation

Paper list of simultaneous translation / streaming translation, including text-to-text machine translation and speech-to-text translation.

nlp natural-language-processing streaming awesome paper machine-translation text-translation paperlist speech-translation simultaneous-translation simultaneous-machine-translation

Updated Jun 7, 2024

echogarden-project / echogarden

Easy-to-use speech toolset. Written in TypeScript. Includes tools for synthesis, recognition, alignment, speech translation, language detection, source separation and more.

text-to-speech speech language-detection speech-synthesis speech-recognition speech-to-text source-separation language-identification forced-alignment speech-translation speech-alignment

Updated May 26, 2024
TypeScript

microsoft / SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

speech-synthesis speech-recognition speech-translation speech-pretraining speecht5 speech2c speechlm speechut speech-text-pretraining vatlm vallex

Updated Apr 24, 2024
Python

mllpresearch / ESO-dataset

ESO speech dataset: an English-language speech corpus of the oncology domain for ASR training and benchmarking and MT benchmarking.

machine-translation automatic-speech-recognition oncology domain-adaptation speech-corpus speech-translation large-language-models llm

Updated Apr 15, 2024

csikasote / bigc

This repository contains the data resources for the LacunaFund supported project, Multimodal datasets for the Bemba Language of Zambia.

machine-translation speech-recognition zambia multimodal-learning speech-translation bemba-language image-grounded-conversations africa-language

Updated Apr 1, 2024

George0828Zhang / torch_cif

A fast parallel PyTorch implementation of the "CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition" https://arxiv.org/abs/1905.11235.

speech torch pytorch speech-recognition alignment automatic-speech-recognition speech-to-text cif asr monotonic speech-translation continuous-integrate-and-fire

Updated Feb 10, 2024
Python

Dadangdut33 / Speech-Translate

A real-time speech transcription and translation application using OpenAI Whisper and a free translation API. Interface built with Tkinter. Code written entirely in Python.

python translate whisper tkinter-python speech-translation speech-transcription

Updated Jan 18, 2024
Python

ictnlp / DASpeech

Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".

machine-translation speech-translation speech-to-speech speech-to-speech-translation

Updated Jan 16, 2024
Python

mt-upc / SegAugment

SEGAUGMENT: Maximizing the Utility of Speech Translation Data with Segmentation-based Augmentations

data-augmentation audio-segmentation speech-translation

Updated Dec 21, 2023
Python

ictnlp / DiSeg

Source code for ACL 2023 paper "End-to-End Simultaneous Speech Translation with Differentiable Segmentation"

segment streaming machine-translation speech segmentation sequence-segmentation speech-translation simultaneous-translation simultaneous-machine-translation streaming-speech-to-text

Updated Dec 6, 2023
Python

yaya-sy / speechscorer

unsupervised spoken utterances scoring

speech speech-recognition whisper self-supervised-learning speech-translation hubert

Updated Nov 21, 2023
Python
