Distill is a local-first live translation setup made of a Chrome extension and a FastAPI backend. The extension captures tab audio, streams it to the backend over WebSocket, and plays translated audio back through your selected output device.
- `backend/` is the active server. It handles translation sessions, settings, and profile storage.
- `extension/` is the active Chrome extension you load into Chrome.
- User profiles are stored locally in SQLite at `backend/profiles.db`.
- `web/` still contains older prototype assets and Convex-related code, but it is not part of the current local setup documented here.
- The extension captures tab audio in Chrome.
- Audio is streamed to `ws://localhost:8000/ws/translate`.
- The backend uses either `AzureTranslationClient` or `AzureConversationClient` for the STT path, depending on `STT_PROVIDER`.
- The backend produces translated text from the incoming speech.
- In the live extension flow, translated speech is synthesized with `AzureTtsClient`.
- Translated audio is streamed back to the extension for playback.
- The extension plays the translated audio through the selected output device.
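The streaming flow above can be sketched from the client side. This is a minimal illustration, not the extension's actual code: the endpoint URL comes from this README, but the wire format (raw PCM frames in, translated audio bytes out) and the frame size are assumptions.

```python
import asyncio

# Endpoint from this README; the frame size and message format below are
# illustrative assumptions, not the documented protocol.
WS_URL = "ws://localhost:8000/ws/translate"

def chunk_pcm(pcm: bytes, frame_bytes: int = 3200) -> list[bytes]:
    """Split raw PCM into fixed-size frames (3200 bytes = 100 ms of 16 kHz 16-bit mono)."""
    return [pcm[i:i + frame_bytes] for i in range(0, len(pcm), frame_bytes)]

async def stream_audio(pcm: bytes) -> None:
    import websockets  # third-party: pip install websockets

    async with websockets.connect(WS_URL) as ws:
        for frame in chunk_pcm(pcm):
            await ws.send(frame)
        async for message in ws:
            # Hypothetically, binary messages carry translated audio for playback.
            print(f"received {len(message)} bytes of translated audio")

# Usage (requires the backend running): asyncio.run(stream_audio(pcm_bytes))
```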
```
cd backend
cp .env.example .env
uv sync
uv run uvicorn main:app --reload --port 8000
```

Once the backend is running:

- Health check: `http://localhost:8000/health`
- Local dashboard: `http://localhost:8000/`
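A quick way to verify the backend from a script is to poll the health endpoint. This sketch assumes only that `GET /health` returns HTTP 200 when the server is up; the response body format is not assumed.

```python
import urllib.request

def backend_is_healthy(base_url: str = "http://localhost:8000") -> bool:
    """Return True if GET /health responds with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=2) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, DNS failure, or timeout: treat as unhealthy.
        return False
```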
Edit backend/.env and set the keys required for the path you are running.
For the current live Azure path:
- `AZURE_SPEECH_KEY`
- `AZURE_SPEECH_REGION`
Other keys used elsewhere in the backend:
- `SPEECHMATICS_API_KEY`
- `MINIMAX_API_KEY`
- `SUPERMEMORY_API_KEY`
The backend also reads provider and tuning settings such as:
- `STT_PROVIDER`
- `TTS_PROVIDER`
- `TRANSLATION_TRIGGER_CHAR_THRESHOLD`
- `AZURE_SEGMENTATION_SILENCE_MS`
- `SPEECHMATICS_MAX_DELAY`
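One plausible way the backend could read these settings is with `os.getenv` and typed defaults. The default values below are placeholders for illustration, not the backend's real defaults.

```python
import os

# Hypothetical defaults; check backend/.env.example for the real ones.
STT_PROVIDER = os.getenv("STT_PROVIDER", "azure")
TTS_PROVIDER = os.getenv("TTS_PROVIDER", "azure")
TRANSLATION_TRIGGER_CHAR_THRESHOLD = int(os.getenv("TRANSLATION_TRIGGER_CHAR_THRESHOLD", "40"))
AZURE_SEGMENTATION_SILENCE_MS = int(os.getenv("AZURE_SEGMENTATION_SILENCE_MS", "500"))
SPEECHMATICS_MAX_DELAY = float(os.getenv("SPEECHMATICS_MAX_DELAY", "2.0"))
```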
```
cd extension
npm install
npm run build
```

Then load it in Chrome:

- Open `chrome://extensions`
- Enable `Developer mode`
- Click `Load unpacked`
- Select `extension/dist`
- Open a tab with audio, such as Google Meet or YouTube
- Click the Distill extension icon
- Choose source and target language
- Choose an output device
- Start translation
Translated audio plays through the output device selected in the extension.
Profiles are managed by the backend and stored locally in SQLite. The relevant API routes are:
- `GET /api/profiles`
- `POST /api/profiles`
- `GET /api/voice-profile`
- `POST /api/voice-profile`
- `PATCH /api/voice-status`
The storage layer lives in `backend/services/profile_store.py`.
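To make the local-SQLite idea concrete, here is a minimal sketch of a profile store. It is loosely modeled on the role of `backend/services/profile_store.py`; the schema, column names, and function names are all assumptions for illustration.

```python
import sqlite3

def init_store(path: str = "profiles.db") -> sqlite3.Connection:
    """Open (or create) the local profile database. Schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS profiles ("
        "  id INTEGER PRIMARY KEY,"
        "  name TEXT NOT NULL,"
        "  source_lang TEXT,"
        "  target_lang TEXT)"
    )
    return conn

def add_profile(conn: sqlite3.Connection, name: str,
                source_lang: str, target_lang: str) -> int:
    """Insert a profile and return its row id."""
    cur = conn.execute(
        "INSERT INTO profiles (name, source_lang, target_lang) VALUES (?, ?, ?)",
        (name, source_lang, target_lang),
    )
    conn.commit()
    return cur.lastrowid

def list_profiles(conn: sqlite3.Connection) -> list[tuple]:
    """Return all stored profiles as (id, name, source_lang, target_lang) rows."""
    return conn.execute(
        "SELECT id, name, source_lang, target_lang FROM profiles"
    ).fetchall()
```

Using `":memory:"` as the path gives a throwaway in-memory database, which is handy for testing the same code paths without touching `backend/profiles.db`.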
- The extension expects the backend on `localhost:8000`.
- The backend serves a small local dashboard at `/` for health and some settings.
- The current live extension flow uses Azure for the active STT and TTS path.
- Some legacy Convex files still exist under `web/`, but the current README no longer treats them as part of the supported setup.
- Extension: React, TypeScript, Vite, Chrome MV3
- Backend: Python, FastAPI, WebSocket, SQLite, uv
- Live speech path: Azure Speech Translation or Azure Conversation Transcriber, plus Azure Speech Synthesis
- Other integrated services in the backend: Speechmatics, MiniMax, Supermemory