Real-Time Speech Stream Processor (Deepgram)

A Python 3.13.1 application that:

Continuously records microphone audio to a WAV file
Streams audio to Deepgram for real-time speech-to-text
Broadcasts each transcript segment over a local WebSocket server
Includes a minimal browser client that prints incoming text

Architecture

Concurrency model: asyncio + threads

Thread (microphone): captures raw PCM (linear16) chunks from the microphone into queues
Thread (WAV writer): writes audio to disk continuously (no interruptions)
Async task (Deepgram): streams audio to Deepgram over a WebSocket and parses the transcript messages it returns
Async task (WS server): broadcasts transcript events to all connected clients
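
The handoff from the capture thread to the async tasks could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code: the names capture_thread, consume and run_demo, the None end-of-stream sentinel, and the choice of loop.call_soon_threadsafe for crossing the thread boundary are all hypothetical. A list of byte strings stands in for real microphone chunks.

```python
import asyncio
import queue
import threading


def capture_thread(loop, wav_q, stream_q, chunks):
    """Stand-in for the microphone thread: fan each chunk out to two queues."""
    for chunk in chunks:
        # queue.Queue is thread-safe, so the WAV writer thread can read it directly.
        wav_q.put(chunk)
        # asyncio.Queue is not thread-safe; schedule the put on the event loop.
        loop.call_soon_threadsafe(stream_q.put_nowait, chunk)
    # Hypothetical end-of-stream sentinel for both consumers.
    wav_q.put(None)
    loop.call_soon_threadsafe(stream_q.put_nowait, None)


async def consume(stream_q):
    """Stand-in for the Deepgram task: collect chunks until the sentinel arrives."""
    received = []
    while (chunk := await stream_q.get()) is not None:
        received.append(chunk)
    return received


async def run_demo(chunks):
    loop = asyncio.get_running_loop()
    wav_q = queue.Queue()       # read by a thread
    stream_q = asyncio.Queue()  # read by an async task
    worker = threading.Thread(
        target=capture_thread,
        args=(loop, wav_q, stream_q, chunks),
        daemon=True,
    )
    worker.start()
    streamed = await consume(stream_q)
    await asyncio.to_thread(worker.join)  # wait without blocking the event loop

    # Drain the thread-side queue the way a WAV writer thread would.
    written = []
    while (chunk := wav_q.get()) is not None:
        written.append(chunk)
    return streamed, written


if __name__ == "__main__":
    print(asyncio.run(run_demo([b"\x00\x01", b"\x02\x03"])))
```

Keeping two separate queues means a slow network consumer cannot stall the disk writer, which is one plausible way to get the "no interruptions" behaviour described for the WAV thread.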

Setup (Windows PowerShell)

python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install -U pip
pip install -e ".[dev]"

Configuration

Create a .env file (auto-loaded) from the template:

Copy-Item .env.example .env

Then edit .env and set DEEPGRAM_API_KEY.

You can also set it directly:

$env:DEEPGRAM_API_KEY="YOUR_DEEPGRAM_API_KEY"

Common optional settings:

WS_HOST (default 0.0.0.0)
WS_PORT (default 8765)
WS_PATH (default /ws)
OUT_DIR (default recordings)
AUDIO_INPUT_DEVICE (optional; integer device index or device name)
SEND_INTERIM (default false) – set to true to also send interim (non-final) results to clients

See .env.example for the full list.
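
As a rough illustration of how these settings and their defaults might be read from the environment, here is a minimal sketch. The Settings class and load_settings function are hypothetical and are not taken from the project's config.py. The set of strings accepted as "true" and the rule that an all-digit AUDIO_INPUT_DEVICE is treated as an index are also assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Union


@dataclass(frozen=True)
class Settings:
    """Hypothetical container mirroring the settings listed above."""
    api_key: str
    ws_host: str = "0.0.0.0"
    ws_port: int = 8765
    ws_path: str = "/ws"
    out_dir: str = "recordings"
    audio_input_device: Optional[Union[int, str]] = None
    send_interim: bool = False


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (os.environ by default), applying defaults."""
    env = os.environ if env is None else env

    api_key = env.get("DEEPGRAM_API_KEY", "").strip()
    if not api_key:
        raise ValueError("DEEPGRAM_API_KEY is required")

    # Assumption: an all-digit value is a device index, anything else a device name.
    raw_device = env.get("AUDIO_INPUT_DEVICE", "").strip()
    if not raw_device:
        device = None
    elif raw_device.isdigit():
        device = int(raw_device)
    else:
        device = raw_device

    # Assumption: a few common spellings of "true" are accepted.
    raw_interim = env.get("SEND_INTERIM", "false").strip().lower()

    return Settings(
        api_key=api_key,
        ws_host=env.get("WS_HOST", "0.0.0.0"),
        ws_port=int(env.get("WS_PORT", "8765")),
        ws_path=env.get("WS_PATH", "/ws"),
        out_dir=env.get("OUT_DIR", "recordings"),
        audio_input_device=device,
        send_interim=raw_interim in {"1", "true", "yes", "on"},
    )
```

Passing a plain dict instead of os.environ, for example load_settings({"DEEPGRAM_API_KEY": "k", "WS_PORT": "9000"}), makes the parsing easy to exercise in unit tests.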

Run

speech-stream-processor

Open web/client.html in a browser; it connects to ws://localhost:8765/ws.

Message format

Each message broadcast to clients is a JSON object:

{"type":"transcript","text":"hello world","is_final":true,"confidence":0.98,"start":0.0,"duration":1.2,"received_at":"2025-01-01T00:00:00+00:00"}
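
To make the shape of this message concrete, the sketch below builds one with the same fields and shows how a consumer might pick out final segments. The helper names make_transcript_message and final_text are hypothetical and not part of the project. The field names and the ISO 8601 timestamp style follow the example above.

```python
import json
from datetime import datetime, timezone


def make_transcript_message(text, is_final, confidence, start, duration, now=None):
    """Serialize one transcript segment using the field names shown above."""
    now = now or datetime.now(timezone.utc)
    payload = {
        "type": "transcript",
        "text": text,
        "is_final": is_final,
        "confidence": confidence,
        "start": start,        # seconds from the start of the stream
        "duration": duration,  # seconds
        "received_at": now.isoformat(),
    }
    # Compact separators reproduce the single-line form of the example.
    return json.dumps(payload, separators=(",", ":"))


def final_text(raw):
    """Return the text of a final transcript message, or None for anything else."""
    msg = json.loads(raw)
    if msg.get("type") == "transcript" and msg.get("is_final"):
        return msg["text"]
    return None


if __name__ == "__main__":
    raw = make_transcript_message("hello world", True, 0.98, 0.0, 1.2)
    print(raw)
    print(final_text(raw))
```

A client that only wants settled text can drop every message for which final_text returns None, which matters when SEND_INTERIM is enabled.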

Docker

docker build -t speech-stream-processor .
docker run -e DEEPGRAM_API_KEY="your_key" -p 8765:8765 speech-stream-processor

Note: microphone access from inside a Docker container depends on the host OS; the container only sees an audio input if the host passes one through to it.

Tests

python -m pytest

Project Structure

src/speech_stream_processor/
  audio/        # Microphone capture + WAV recording
  broadcast/    # Fan-out broadcaster
  deepgram/     # Deepgram streaming client
  server/       # Local WebSocket server
  app.py        # Orchestrator
  config.py     # Environment configuration
web/client.html # Browser client
tests/          # Unit tests
