Sonus is a scalable, cloud-native automated transcription and diarization system built on Google Cloud Platform (GCP). It utilizes WhisperX to provide high-accuracy speech-to-text conversion with speaker identification (diarization).
- High Accuracy: Uses OpenAI's Whisper model via WhisperX for state-of-the-art recognition.
- Speaker Diarization: Automatically identifies and separates different speakers.
- Scalable Architecture: Built on GCP Cloud Run Jobs to handle variable workloads cost-effectively.
- Event-Driven: Uses Pub/Sub for asynchronous task processing.
- Infrastructure as Code: Fully managed via Terraform/OpenTofu.
- Format Support: Handles various audio (mp3, wav, m4a, flac) and video (mp4, mov, avi, mkv) formats.
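To make the format list above concrete, here is a minimal sketch of how a file could be classified by extension before it is queued. The `classify_media` helper and the two extension sets are assumptions derived from the formats named above; they are not taken from the repository, and the real services may accept more formats or detect them differently.

```python
from pathlib import PurePath
from typing import Optional

# Extensions copied from the format list above (assumption: the real list may be longer).
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac"}
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv"}


def classify_media(filename: str) -> Optional[str]:
    """Return "audio", "video", or None for an unsupported file (hypothetical helper)."""
    # Compare case-insensitively so "Interview.MP3" is treated like "interview.mp3".
    suffix = PurePath(filename).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None
```

A scanner could call `classify_media(name)` on each new file and skip anything that returns `None`.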
The system consists of two main components:
- Activator: A scheduled job that scans sources (e.g., Google Drive) for new files and publishes tasks to Pub/Sub.
- Transcriber: A worker job triggered by Pub/Sub messages that processes the audio/video files using WhisperX and saves the results.
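As a rough illustration of what "saves the results" could look like for the Transcriber, the sketch below renders diarized segments as plain-text transcript lines. It assumes each segment is a dict with `start` and `text` keys and an optional `speaker` key; that shape, and the `render_transcript` and `format_timestamp` names, are assumptions for illustration rather than the project's actual output format.

```python
def format_timestamp(seconds: float) -> str:
    """Format a second offset as HH:MM:SS, dropping fractions of a second."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def render_transcript(segments: list[dict]) -> str:
    """Render one "[start] SPEAKER: text" line per segment (hypothetical helper).

    Assumes "start" and "text" are present on every segment; segments
    without a "speaker" key are labeled UNKNOWN.
    """
    lines = []
    for segment in segments:
        speaker = segment.get("speaker", "UNKNOWN")
        start = format_timestamp(segment["start"])
        lines.append(f"[{start}] {speaker}: {segment['text'].strip()}")
    return "\n".join(lines)
```

Keeping the rendering separate from the speech model makes it possible to unit-test the output format without loading WhisperX.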
graph LR
Source["Source (e.g., Drive)"] --> Activator["Activator (Cloud Run)"]
Activator --> Topic["Pub/Sub Topic"]
Topic --> Transcriber["Transcriber (Cloud Run)"]
Transcriber --> Storage["Storage (Output)"]
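The flow in the diagram can be sketched as a message round trip: the Activator serializes a task to bytes, and the Transcriber parses those bytes back. The `TranscriptionTask` fields and the JSON encoding below are hypothetical; the real message schema is not documented here and may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TranscriptionTask:
    """Hypothetical task payload; field names are assumptions for illustration."""

    file_id: str    # identifier of the file in the source (e.g. a Drive file ID)
    file_name: str  # original file name
    source: str     # which source the Activator scanned


def encode_task(task: TranscriptionTask) -> bytes:
    """Serialize a task to UTF-8 JSON bytes, the form a Pub/Sub message body takes."""
    return json.dumps(asdict(task), sort_keys=True).encode("utf-8")


def decode_task(data: bytes) -> TranscriptionTask:
    """Parse message bytes back into a task on the Transcriber side."""
    return TranscriptionTask(**json.loads(data.decode("utf-8")))
```

With the `google-cloud-pubsub` client, the Activator side would pass the encoded bytes as the `data` argument of `PublisherClient.publish`; that call is left out here so the sketch runs without GCP credentials.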
- sonus-activator/: Source code for the trigger service.
- sonus-transcriber/: Source code for the WhisperX processing service.
- terraform/: Infrastructure configuration (Terraform/OpenTofu).
- docs/: Detailed technical documentation.
- examples/: Sample scripts and files for testing.
- Google Cloud Platform Account
- Terraform or OpenTofu
- Docker
- Python 3.11+
- GCP CLI (gcloud)
Navigate to the terraform/ directory and configure your variables. You can create a terraform.tfvars file:
project_id = "your-gcp-project-id"
region = "your-preferred-region" # e.g., europe-west1
Initialize and apply the Terraform configuration:
cd terraform
tofu init
tofu apply
This will create:
- Artifact Registry repositories
- Cloud Run Jobs (Activator & Transcriber)
- Cloud Storage Buckets
- Pub/Sub Topics and Subscriptions
- Service Accounts and IAM roles
Build and push the Docker images for both services:
# Example for Activator
cd sonus-activator
gcloud builds submit --tag REGION-docker.pkg.dev/PROJECT_ID/sonus/activator:latest .
# Example for Transcriber (run from the repository root)
cd sonus-transcriber
gcloud builds submit --tag REGION-docker.pkg.dev/PROJECT_ID/sonus/transcriber:latest .
- Create a virtual environment:
python -m venv .venv
source .venv/bin/activate
- Install dependencies:
pip install -r sonus-activator/requirements.txt
pip install -r sonus-transcriber/requirements.txt
The project uses pytest.
# Run Activator tests
pytest sonus-activator/tests
# Run Transcriber tests
pytest sonus-transcriber/tests
For a deep dive into the setup, configuration variables, and troubleshooting, refer to the StepByStep Guide.
CC0 1.0 Universal (Public Domain)