hamza276/RealTimeSpeech

Real-Time Speech Stream Processor (Deepgram)

Running code

A Python 3.13.1 app that:

  • Continuously records microphone audio to a WAV file
  • Streams audio to Deepgram for real-time speech-to-text
  • Broadcasts each transcript segment over a local WebSocket server
  • Includes a minimal browser client that prints incoming text

Architecture

Concurrency model: asyncio + threads

  • Thread (microphone): captures raw PCM (linear16) chunks from the microphone into queues
  • Thread (WAV writer): writes audio to disk continuously (no interruptions)
  • Async task (Deepgram): streams audio to Deepgram over WebSocket + parses transcript messages
  • Async task (WS server): broadcasts transcript events to all connected clients
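The hand-off from a capture thread to an async task can be sketched as below. This is a minimal illustration, not the repository's actual code: it assumes an asyncio.Queue bridged with loop.call_soon_threadsafe, and the names capture_chunks, consume, and run_bridge are hypothetical. A list of fake PCM chunks stands in for the microphone so the sketch runs without audio hardware.

```python
import asyncio
import threading


def capture_chunks(loop, queue, chunks):
    """Stand-in for the microphone thread: hand each chunk to the event loop."""
    for chunk in chunks:
        # asyncio.Queue is not thread-safe, so the put is scheduled on the loop.
        loop.call_soon_threadsafe(queue.put_nowait, chunk)
    # Sentinel telling the consumer that capture has ended.
    loop.call_soon_threadsafe(queue.put_nowait, None)


async def consume(queue):
    """Stand-in for the Deepgram task: drain chunks until the sentinel arrives."""
    received = []
    while (chunk := await queue.get()) is not None:
        received.append(chunk)
    return received


async def run_bridge(chunks):
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    worker = threading.Thread(
        target=capture_chunks, args=(loop, queue, chunks), daemon=True
    )
    worker.start()
    received = await consume(queue)
    worker.join()  # the thread has already finished once the sentinel is seen
    return received


def demo_bridge():
    # Two fake linear16 chunks of 160 samples each (hypothetical sizes).
    fake_pcm = [b"\x00\x01" * 160, b"\x02\x03" * 160]
    return asyncio.run(run_bridge(fake_pcm))
```

Scheduling the put on the loop keeps all queue access on the event-loop thread, so the async side never needs a lock.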

Setup (Windows PowerShell)

python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install -U pip
pip install -e ".[dev]"

Configuration

Create a .env file (auto-loaded) from the template:

Copy-Item .env.example .env

Then edit .env and set DEEPGRAM_API_KEY.

You can also set it directly:

$env:DEEPGRAM_API_KEY="YOUR_DEEPGRAM_API_KEY"

Common optional settings:

  • WS_HOST (default 0.0.0.0)
  • WS_PORT (default 8765)
  • WS_PATH (default /ws)
  • OUT_DIR (default recordings)
  • AUDIO_INPUT_DEVICE (optional int index or device name)
  • SEND_INTERIM (default false) – set true to stream interim results to clients

See .env.example for the full list.
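As a rough illustration of how these variables might be read, the sketch below maps them onto a settings object with the defaults listed above. It is an assumption about structure, not the contents of config.py: the Settings class, the load_settings function, and the set of strings accepted as "true" are all hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Settings:
    deepgram_api_key: str
    ws_host: str = "0.0.0.0"
    ws_port: int = 8765
    ws_path: str = "/ws"
    out_dir: str = "recordings"
    audio_input_device: Optional[Union[int, str]] = None
    send_interim: bool = False


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("DEEPGRAM_API_KEY", "").strip()
    if not api_key:
        raise ValueError("DEEPGRAM_API_KEY is required")

    # AUDIO_INPUT_DEVICE may be an integer index or a device name.
    device = env.get("AUDIO_INPUT_DEVICE", "").strip() or None
    if device is not None and device.isdigit():
        device = int(device)

    # Assumption: "1", "true", and "yes" (any case) all enable interim results.
    send_interim = env.get("SEND_INTERIM", "false").strip().lower() in {"1", "true", "yes"}

    return Settings(
        deepgram_api_key=api_key,
        ws_host=env.get("WS_HOST", "0.0.0.0"),
        ws_port=int(env.get("WS_PORT", "8765")),
        ws_path=env.get("WS_PATH", "/ws"),
        out_dir=env.get("OUT_DIR", "recordings"),
        audio_input_device=device,
        send_interim=send_interim,
    )
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in unit tests.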

Run

speech-stream-processor

Then open web/client.html in a browser. It connects to ws://localhost:8765/ws.

Message format

Each message broadcast to clients is a JSON object:

{"type":"transcript","text":"hello world","is_final":true,"confidence":0.98,"start":0.0,"duration":1.2,"received_at":"2025-01-01T00:00:00+00:00"}
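A client written in Python could decode such a message along these lines. The TranscriptEvent class and parse_message function are hypothetical names used for illustration; the field names come from the example above, and treating start and duration as seconds is an assumption.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TranscriptEvent:
    text: str
    is_final: bool
    confidence: float
    start: float      # assumed to be seconds from the start of the stream
    duration: float   # assumed to be seconds
    received_at: datetime


def parse_message(raw):
    """Return a TranscriptEvent, or None for messages of any other type."""
    data = json.loads(raw)
    if data.get("type") != "transcript":
        return None
    return TranscriptEvent(
        text=data["text"],
        is_final=bool(data["is_final"]),
        confidence=float(data["confidence"]),
        start=float(data["start"]),
        duration=float(data["duration"]),
        # ISO 8601 timestamp with a UTC offset, as in the example message.
        received_at=datetime.fromisoformat(data["received_at"]),
    )
```

Returning None for unknown types lets a client ignore any message kinds it does not recognise instead of failing.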

Docker

docker build -t speech-stream-processor .
docker run -e DEEPGRAM_API_KEY="your_key" -p 8765:8765 speech-stream-processor

Note: microphone access from Docker containers varies by OS.

Tests

python -m pytest

Project Structure

src/speech_stream_processor/
  audio/        # Microphone capture + WAV recording
  broadcast/    # Fan-out broadcaster
  deepgram/     # Deepgram streaming client
  server/       # Local WebSocket server
  app.py        # Orchestrator
  config.py     # Environment configuration
web/client.html # Browser client
tests/          # Unit tests
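
The fan-out idea behind broadcast/ can be sketched as one queue per connected client. This is a hypothetical illustration, not the package's actual code: the Broadcaster class, its method names, and the drop-oldest policy for slow clients are all assumptions.

```python
import asyncio


class Broadcaster:
    """Hypothetical fan-out: every subscriber gets its own bounded queue."""

    def __init__(self, max_queue=100):
        self._max_queue = max_queue
        self._subscribers = set()

    def subscribe(self):
        queue = asyncio.Queue(maxsize=self._max_queue)
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue):
        self._subscribers.discard(queue)

    def publish(self, message):
        for queue in list(self._subscribers):
            if queue.full():
                # Assumed policy: drop the oldest message for a slow client
                # rather than block the publisher.
                queue.get_nowait()
            queue.put_nowait(message)


async def demo_fanout():
    hub = Broadcaster(max_queue=2)
    a, b = hub.subscribe(), hub.subscribe()
    for text in ("one", "two", "three"):
        hub.publish(text)
    hub.unsubscribe(b)
    hub.publish("four")  # only subscriber a receives this
    drained_a = [a.get_nowait() for _ in range(a.qsize())]
    drained_b = [b.get_nowait() for _ in range(b.qsize())]
    return drained_a, drained_b
```

With a per-client queue, one stalled browser tab only loses its own oldest messages and does not delay delivery to the others.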
