WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Updated Aug 21, 2024 - Python
Faster Whisper transcription with CTranslate2
Translate a video from one language to another and add dubbing, with support for API calls.
A PyTorch-based Speech Toolkit
Speech recognition module for Python, supporting several engines and APIs, online and offline.
A Deep-Learning-Based Chinese Speech Recognition System
Open-source, accurate, and easy-to-use video speech recognition and clipping tool, with LLM-based AI clipping integrated.
Speech To Speech: an effort for an open-source and modular GPT-4o
Lingvo
Multilingual Voice Understanding Model
🎙 Speech recognition using the TensorFlow deep learning framework and sequence-to-sequence neural networks
Voice Recognition to Text Tool: an offline, locally running speech-to-text service that outputs JSON, SRT subtitles with timestamps, and plain text.
OpenAI Whisper ASR Webservice API
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Kalliope is a framework that will help you to create your own personal assistant.
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
The open-source virtual assistant for Ubuntu-based Linux distributions
A TensorFlow Implementation of DC-TTS: yet another text-to-speech model