First public release of the LiveKit Voice AI Starter.
Most voice AI demos are a single script. This is the whole loop, structured the way you would actually ship it, and split into three pieces you can run, deploy, and swap independently.
What's inside
- agent/: LiveKit voice worker. Deepgram nova-3 STT, OpenAI gpt-4.1-mini, Cartesia TTS, Silero VAD, and the multilingual turn detector. Web and SIP on the same worker, with explicit dispatch.
- backend/: FastAPI token server. Mints LiveKit room tokens at POST /api/v1/token using the standard LiveKit schema, with a clean API-to-service-to-repository layout and a JWT User slice to copy.
- frontend/: React, Vite, and Tailwind on LiveKit's Agents UI. Audio visualizer, live transcript, and text chat.
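To make the token server's job concrete, here is a minimal sketch of what minting a LiveKit room token involves: an HS256-signed JWT whose issuer is the API key, whose subject is the participant identity, and whose `video` claim grants access to one room. It uses only the standard library so it runs anywhere; it is not the backend's actual code, which may well use PyJWT or the LiveKit server SDK instead. The function name `mint_room_token` and its parameters are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def mint_room_token(api_key, api_secret, room, identity, ttl_seconds=3600, now=None):
    """Build an HS256 JWT carrying a LiveKit-style room-join grant.

    Hypothetical helper for illustration; not taken from the repository.
    """
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,            # LiveKit API key identifies the issuer
        "sub": identity,           # participant identity
        "nbf": issued,
        "exp": issued + ttl_seconds,
        "video": {"room": room, "roomJoin": True},  # grant: join this room
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        api_secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"
```

In a real deployment the key and secret would come from the LIVEKIT_* variables in .env, and a maintained JWT library is usually the safer choice over hand-rolled signing.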
Highlights
- One command per service to run locally, one .env.example each.
- Docker Compose brings up Postgres, backend, agent, and frontend together.
- Web and telephony (SIP) on the same agent.
- Swappable STT, LLM, and TTS providers; self-hosted or LiveKit Cloud with a one-line change.
- Standard token endpoint, so LiveKit client SDKs connect with no glue.
- Typed end to end, tested, linted, with CI and a pre-commit hook across all three packages.
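The "standard token endpoint" highlight can be illustrated with a small client sketch. It assumes the endpoint follows LiveKit's documented token-endpoint shape as I understand it: a JSON request with `room_name` and `participant_identity`, and a JSON response with `server_url` and `participant_token`. The base URL used in the usage note, the helper names, and the absence of an Authorization header are all assumptions; check the backend's docs for the actual port and auth requirements.

```python
import json
from urllib import request


def build_token_request(base_url, room_name, identity):
    """Prepare a POST to the token endpoint (hypothetical helper)."""
    body = {"room_name": room_name, "participant_identity": identity}
    return request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/token",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_token_response(raw):
    """Extract the fields a LiveKit client needs in order to connect."""
    payload = json.loads(raw)
    return payload["server_url"], payload["participant_token"]


def fetch_token(base_url, room_name, identity, timeout=10):
    """Call the endpoint and return (server_url, participant_token)."""
    req = build_token_request(base_url, room_name, identity)
    with request.urlopen(req, timeout=timeout) as resp:
        return parse_token_response(resp.read())
```

With the stack running, something like `fetch_token("http://localhost:8000", "demo", "alice")` would return the pair that a LiveKit client SDK passes to its connect call; the port 8000 here is a guess, not a documented value.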
Stack
- STT, LLM, TTS: Deepgram nova-3, OpenAI gpt-4.1-mini, Cartesia (all swappable).
- Realtime: LiveKit Agents, WebRTC, Silero VAD, turn detector.
- Backend: FastAPI, async SQLModel and Postgres, PyJWT.
- Frontend: React 19, Vite, TypeScript, Tailwind v4.
Get started
git clone https://github.com/mahimairaja/livekit-starter
cd livekit-starter
cp .env.example .env # add LIVEKIT_* + OPENAI/DEEPGRAM/CARTESIA
docker compose up --build
Then open http://localhost:5173 and click Start conversation. See the README for manual setup and per-package docs.
License
MIT.