speech-to-text

Here are 1,570 public repositories matching this topic...

SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2

deep-learning inference transformer speech-recognition openai speech-to-text quantization whisper

Updated Aug 16, 2025
Python

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

speech speech-recognition speech-to-text whisper asr

Updated Jul 2, 2025
Python

jianchang512 / pyvideotrans

Translate a video from one language to another and add dubbing. Also supports speech recognition transcription, speech synthesis, and subtitle translation.

text-to-speech speech-to-text video-transition

Updated Sep 18, 2025
Python

speechbrain / speechbrain

A PyTorch-based Speech Toolkit

Updated Sep 25, 2025
Python

Uberi / speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.

audio python speech-recognition speech-to-text

Updated Sep 14, 2025
Python

KoljaB / RealtimeSTT

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

python realtime speech-to-text

Updated Jul 11, 2025
Python

nl8590687 / ASRT_SpeechRecognition

A Deep-Learning-Based Chinese Speech Recognition System

python tensorflow keras cnn python3 speech-recognition speech-to-text ctc chinese-speech-recognition asrt

Updated Sep 6, 2025
Python

FunAudioLLM / SenseVoice

Multilingual Voice Understanding Model

multilingual python ai pytorch speech-recognition speech-to-text asr cross-lingual speech-emotion-recognition audio-event-classification aigc llm gpt-4o

Updated Aug 15, 2025
Python

modelscope / FunClip

Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.

speech-recognition speech-to-text gradio video-clip subtitles-generator video-subtitles llm gradio-python-llm

Updated Jul 11, 2025
Python

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

text-to-speech translator audiobook podcasts tts speech-synthesis subtitles speech-recognition webui speech-to-text karaoke transcription gradio whisper voice-conversion voice-cloning yt-dlp faster-whisper whisperx

Updated Jul 20, 2025
Python

huggingface / speech-to-speech

Speech To Speech: an effort toward an open-source, modular GPT-4o

python machine-learning ai speech speech-synthesis assistant speech-to-text language-model speech-translation

Updated Apr 15, 2025
Python

jianchang512 / stt

Voice Recognition to Text Tool / A local, offline audio and video to subtitle tool that outputs JSON, SRT subtitles, and plain text

speech speech-recognition speech-to-text stt

Updated Aug 29, 2025
Python

ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

speech-to-text speech-to-speech large-language-models multimodal-large-language-models speech-language-model speech-interaction

Updated May 19, 2025
Python

ahmetoner / whisper-asr-webservice

OpenAI Whisper ASR Webservice API

docker speech speech-recognition automatic-speech-recognition speech-to-text asr openai-whisper

Updated Jul 1, 2025
Python

tensorflow / lingvo

Lingvo

nlp research translation tensorflow machine-translation speech distributed tts speech-synthesis mnist speech-recognition lm seq2seq speech-to-text gpu-computing language-model asr

Updated Sep 26, 2025
Python

Blaizzy / mlx-audio

A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speech analysis on Apple Silicon.

text-to-speech transformers speech-synthesis speech-recognition speech-to-text audio-processing mlx multimodal apple-silicon