Real-time English speech translation for church services. Audio in, translated audio out — simultaneously to multiple languages.
Built for congregations that serve speakers of multiple languages. Runs entirely offline on local hardware after setup, with no subscriptions or cloud services required.
- Real-time — Low-latency English speech translation (speed depends on hardware)
- Two ASR backends — Batched Whisper (default, GPU) or Parakeet TDT (
--parakeet, CPU) for sub-second streaming updates - Simultaneous — Multiple languages at once (Spanish + Haitian Creole out of the box)
- Stereo channel split — Two languages on one stereo output (Spanish left / Haitian Creole right)
- GPU accelerated — AMD ROCm 7.2.2 or NVIDIA CUDA (required for Whisper backends — CPU-only Whisper is not supported)
- Fully offline — No internet required after initial setup and model download
- Headless — Runs as a systemd service, starts automatically at boot
- Open source — All components are FOSS
Microphone → Whisper ASR → English text
├─→ Opus-MT (en→es) → Piper TTS → Spanish audio out
└─→ Opus-MT (en→ht) → Piper TTS → Haitian Creole audio out
| Component | Minimum | Recommended |
|---|---|---|
| CPU | Modern x86_64 | AMD Ryzen / Intel Core (recent gen) |
| RAM | 8 GB | 16 GB+ |
| GPU | Required — AMD RDNA2+ (RX 6000+, Radeon 680M/780M/890M) or NVIDIA Maxwell+ (GTX 900+, RTX) | ≥8 GB VRAM for large-v3 |
| Storage | 15 GB free | 20 GB free |
| Audio in | Any microphone | USB audio interface |
| Audio out | Any speaker | Separate output per language |
GPU is mandatory. The translator runs
large-v3Whisper by default, which is ~3× slower than real time on CPU — unusable for live services.run.pyrefuses to start if no GPU is detected.
Tested on:
- AMD Ryzen AI 9 HX 370 + Radeon 890M (ROCm)
- NVIDIA RTX 3060 (CUDA)
Requires a fresh Debian 13 (Trixie) install. The installer handles everything else.
git clone https://github.com/LandmarkAdministrator/translator.git ~/translator
cd ~/translator
./install.shThe installer auto-detects your GPU. To specify manually:
./install.sh --rocm # AMD GPUs (RX 6000+, Radeon 680M / 780M / 890M)
./install.sh --cuda # NVIDIA GPUs (GTX 900+, RTX series)
./install.sh --rocm --parakeet # Also install the Parakeet streaming backendWhat the installer does:
- Enables required Debian repositories
- Installs GPU drivers (ROCm 7.2.2 or NVIDIA kernel driver)
- Creates Python virtual environment
- Installs Python dependencies with the correct GPU backend
- Downloads ML models (~4 GB)
- (Optional, with
--parakeet) Installsonnxruntime-rocm+ Parakeet TDT 0.6b v3 ONNX model - Sets up the systemd service
The Parakeet backend can also be added later by running
./scripts/install_parakeet.sh from inside the activated venv.
See docs/SETUP.md for the full step-by-step guide and troubleshooting.
source venv/bin/activate
# Interactive setup — select audio devices, test the pipeline, save config
python run.py --setup
# Run with saved configuration (batched Whisper on GPU — default)
python run.py
# Parakeet TDT 0.6b v3 streaming ASR (CPU, ~sub-second commits)
python run.py --parakeet
# Test GPU and all components
python run.py --test
# List available audio devices
python run.py --list-devicesThe default batched Whisper path waits for an utterance to finish before
translating. --parakeet emits partial transcripts in real time with a
token-level LocalAgreement-2 commit policy.
# Install and enable the systemd user service
./scripts/install_service.sh install
# Start / stop / status
systemctl --user start church-translator
systemctl --user stop church-translator
systemctl --user status church-translator
# View live logs
journalctl --user -u church-translator -fThe service starts automatically at boot (no login required).
Audio devices and enabled languages are configured interactively with python run.py --setup and saved to config/settings.yaml. Edit that file directly (or re-run --setup) to change settings — there is no separate per-language YAML to edit. The ASR model is fixed at large-v3 and is not configurable.
The system supports any language pair that has:
- A Helsinki-NLP Opus-MT model on HuggingFace (English → target)
- A Piper TTS voice for the target language
Adding a new language currently requires a small code change:
- Add the Opus-MT model ID to
TranslationService.MODEL_MAPin src/pipeline/translation.py. - Add the Piper voice entry to
VOICE_MAPin src/pipeline/tts.py. - Add the language to the
all_languageslist in scripts/setup.py. - Download the models with
python scripts/download_models.py --all. - Re-run
python run.py --setupto enable the new language.
| Stage | AMD 890M (ROCm) | NVIDIA RTX 3060 (CUDA) | CPU only |
|---|---|---|---|
| ASR (2s audio) | ~200ms | ~180ms | ~500ms |
| Translation | ~50ms | ~40ms | ~160ms |
| TTS | ~100ms | ~80ms | ~100ms |
| End-to-end latency | ~0.8s | ~0.7s | ~2–3s |
translator/
├── install.sh # Automated installer (--rocm | --cuda | --parakeet)
├── run.py # Main entry point (default batch Whisper | --parakeet)
├── requirements/
│ ├── base.txt # Runtime deps (audio, yaml, librosa, hf hub…)
│ └── ml.txt # ML deps (torch, transformers, faster-whisper…)
├── config/
│ └── settings.yaml # Generated by --setup (devices, languages)
├── src/
│ ├── audio/ # Audio input/output with PipeWire/ALSA
│ ├── config/ # Settings loader (config/settings.yaml)
│ ├── pipeline/ # ASR (batched Whisper / Parakeet), translation, TTS
│ └── utils/ # GPU setup, logging
├── scripts/
│ ├── download_models.py # Download ML models
│ ├── install_parakeet.sh # Parakeet backend installer (onnxruntime-rocm + ONNX model)
│ ├── install_rocm.sh # ROCm 7.2.2 installer
│ ├── install_service.sh # Systemd service installer
│ ├── test_gpu.py # GPU verification
│ └── env.sh # Environment setup (AMD iGPU)
├── systemd/
│ └── translator.service # Service file reference (generated by install_service.sh)
└── docs/
├── SETUP.md # Full installation guide
└── DEPLOYMENT.md # Deployment and operations guide
This project uses the following open-source components:
| Component | License | Notes |
|---|---|---|
| faster-whisper | MIT | Speech recognition |
| CTranslate2 | MIT | Inference engine |
| Helsinki-NLP Opus-MT | CC-BY 4.0 | Translation models |
| Piper TTS | MIT | Text-to-speech |
| piper-tts | MIT | Python TTS package |
| PyTorch | BSD 3-Clause | ML framework |
| Transformers | Apache 2.0 | Model loading |
| sounddevice | MIT | Audio I/O |
| NumPy | BSD 3-Clause | Array processing |
| loguru | MIT | Logging |
Translation model note: The Helsinki-NLP Opus-MT models are released under CC-BY 4.0, which requires attribution for redistribution. This project does not redistribute the models — they are downloaded directly from HuggingFace during setup.
GPU drivers: AMD ROCm is open source (MIT/Apache 2.0). NVIDIA kernel drivers are proprietary but freely available. PyTorch bundles its own CUDA runtime — no separate CUDA toolkit installation is needed.
This project is released under the MIT License. You are free to use, modify, and distribute it for any purpose, including commercial use.
See CONTRIBUTING.md.
Built for multilingual church congregations. Inspired by the need to include everyone in worship regardless of language.