🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Updated Jun 2, 2024 - Python
🎤📄 An innovative tool that transforms audio or video files into text transcripts and generates concise meeting minutes. Stay organized and efficient in your meetings, and get ready for Phase 2 where we'll be open for contributions to enable real-time meeting transcription! 🚀
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Fully Functional Voice Based Natural Language UI
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
Polyfill Web Speech API with Cognitive Services Bing Speech for both speech-to-text and text-to-speech services.
Achieve your goals and keep your data private with Lotti. This life tracking app is designed to help you stay motivated and on track, all while keeping your personal information safe and secure. Now with on-device speech recognition.
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
Speech recognition module for Python, supporting several engines and APIs, online and offline.
An automated speech trainer. Plays a beep every time you say an unwanted word.
Chrome/Edge BROWSER EXTENSION that can RECOGNIZE any live audio/video stream, TRANSLATE it for FREE (using the unofficial online Google Translate API), then display it as a LIVE CAPTION / LIVE SUBTITLE!
HTML Web template that can RECOGNIZE any live audio/video stream (using the Chrome webkitSpeechRecognition API), TRANSLATE it for FREE (using the unofficial online Google Translate API), then display it as a LIVE CAPTION / LIVE SUBTITLE.
💬📝 A small dictation app using OpenAI's Whisper speech recognition model.
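A fully automatic (audio) video translation project. It uses Whisper to recognize speech, a large AI model to translate the subtitles, and finally merges the subtitles with the video to produce the translated video.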
🧠 Leon is your open-source personal assistant.
Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by Whisper models!
This project implements a Speech Emotion Recognition (SER) model using TensorFlow Lite, specifically designed for deployment on microcontrollers like the Arduino Nano 33 BLE. The model is trained on the RAVDESS dataset and can recognize seven emotions: Angry, Disgust, Fear, Happy, Neutral, Sad, and Surprise.
A PyTorch-based Speech Toolkit
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.