speech-transcription

Here are 17 public repositories matching this topic...

Dadangdut33 / Speech-Translate

A real-time speech transcription and translation application using OpenAI Whisper and free translation APIs. The interface is built with Tkinter, and the code is written entirely in Python.

python translate whisper tkinter-python speech-translation speech-transcription

Updated Jan 18, 2024
Python

Appen / UHV-OTS-Speech

A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.

speech-recognition speech-processing audio-segmentation gender-classification speaker-diarization synthetic-speech-detection topic-detection speech-seperation speaker-identification accent-detection speech-transcription speech-annotation

Updated Mar 25, 2023
Forth

jhauret / vibravox

Speech to Phoneme, Bandwidth Extension and Speaker Verification using the Vibravox dataset.

pytorch hydra datasets speaker-verification speech-enhancement pytorch-lightning speech-transcription bandwidth-extension

Updated Jun 16, 2025
Python

KevKibe / African-Whisper

🚀 Framework for seamless fine-tuning of the Whisper model on a multilingual dataset and deployment to production.

speech speech-recognition speech-to-text whisper asr speech-translation speech-transcription

Updated Feb 27, 2025
Python

srinivr / kaldi-long-audio-alignment

Long audio alignment using Kaldi

speech-recognition automatic-speech-recognition speech-to-text kaldi transcription asr speechrecognition split-audio longaudio-alignment audio-segments speech-transcription

Updated Apr 22, 2021
Shell

PranavPutsa1006 / Speaker-Diarization

Identifying individual speakers in an audio stream based on the unique characteristics found in individual voices using Python

deep-learning neural-networks speech-to-text mfcc speaker-diarization spectral-clustering voice-activity-detection speech-segmentation speech-detection speech-transcription embeddings-extraction

Updated Jun 18, 2023
Jupyter Notebook

arashsajjadi / ai-powered-video-analyzer

An offline AI-powered video analysis tool with object detection (YOLO), image captioning (BLIP), speech transcription (Whisper), audio event detection (PANNs), and AI-generated summaries (LLMs via Ollama). It ensures privacy and offline use with a user-friendly GUI.

gui privacy yolo image-captioning object-detection whisper offline-processing speech-transcription llm whisper-ai blip2 ollama panns image-captioning-ai ollama-api yolo11 ai-video-analysis audio-event-detection llm-summarization

Updated Feb 23, 2025
Python

laviprog / speech-transcription

Speech Transcription API is a RESTful service that processes audio input and converts speech into text using state-of-the-art speech recognition models. Ideal for building transcription tools, smart assistants, and voice-controlled applications.

python docker sqlalchemy docker-compose postgresql alembic speech-to-text transcription fastapi speech-transcription whisperx

Updated May 25, 2025
Python

capjamesg / awsnap.js

Navigate websites by clicking your fingers and saying the link you want to visit.

webaudio-api audio-classification tensorflow-js speech-transcription

Updated Oct 1, 2023
HTML

otonomee / mic2transcript

CLI tool that continuously transcribes audio from the device's built-in microphone to a text file. Runs in the background, providing an ongoing log of ambient audio as text.

audio cli speech openai transcription whisper cli-tool speech-transcription