A realtime speech transcription and translation application using OpenAI Whisper and a free translation API. Interface made using Tkinter. Code written fully in Python.
Updated Jan 18, 2024 - Python
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
Speech to Phoneme, Bandwidth Extension and Speaker Verification using the Vibravox dataset.
🚀 Framework for fine-tuning the Whisper model on a multilingual dataset and deploying it to production.
Long audio alignment using Kaldi
Identifying individual speakers in an audio stream from the unique characteristics of each voice, using Python
An offline AI-powered video analysis tool with object detection (YOLO), image captioning (BLIP), speech transcription (Whisper), audio event detection (PANNs), and AI-generated summaries (LLMs via Ollama). It ensures privacy and offline use with a user-friendly GUI.
Speech Transcription API is a RESTful service that processes audio input and converts speech into text using state-of-the-art speech recognition models. Ideal for building transcription tools, smart assistants, and voice-controlled applications.
Navigate websites by clicking your fingers and saying the link you want to visit.
CLI tool that continuously transcribes audio from the device's built-in microphone to a text file. Runs in the background, providing an ongoing log of ambient audio as text.
SPEAR-ASR and SPEAR-WakeUp Software Development Kit for Android
Speech transcription and speaker diarization
SPEAR-ASR and SPEAR-WakeUp Software Development Kit in Python for Linux
Whisper Transcription Service
Real-time caption generator using Microsoft Azure Speech services
A YARP plugin to perform speech transcription using OpenAI Whisper
An open-source AI writing tool for realtime speech transcription.