LiveKit Voice AI: WhatsApp Real-Time Integration

A low-latency voice AI system that connects WhatsApp Business calls to the LiveKit Multimodal Agent Framework, bridging traditional telephony and generative voice AI.

System Architecture: The Voice Bridge

This project implements a telephony-to-AI gateway: users call a WhatsApp Business number and talk to an AI agent in real time.

The Flow:

1. Incoming Call: Meta triggers a Webhook on our FastAPI Server (whatsapp_server.py).
2. WebRTC Signaling: The server performs SDP (Session Description Protocol) negotiation between Meta and LiveKit.
3. Room Orchestration: A unique LiveKit room is created and the caller is added as a participant.
4. Agent Dispatch: The LiveKit Agent Worker (agent.py) is dispatched to the room.
5. Multimodal Interaction: The agent uses Google's Realtime Multimodal Model to listen, reason, and speak back to the caller with ultra-low latency.
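The flow above can be sketched as a single orchestration function. This is an illustration only, not the code in whatsapp_server.py: the function name, the `whatsapp-call-` room prefix, and the three injected callables are hypothetical, and the sketch assumes the room is created before the SDP answer is produced.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CallResult:
    room_name: str
    sdp_answer: str


def handle_incoming_call(
    call_id: str,
    sdp_offer: str,
    create_room: Callable[[str], None],
    negotiate_sdp: Callable[[str, str], str],
    dispatch_agent: Callable[[str], None],
) -> CallResult:
    """Run the call setup steps in order for one incoming call."""
    # Room orchestration: derive a unique room name from the call ID
    # (hypothetical naming scheme).
    room_name = f"whatsapp-call-{call_id}"
    create_room(room_name)
    # WebRTC signaling: exchange the caller's SDP offer for an answer.
    sdp_answer = negotiate_sdp(room_name, sdp_offer)
    # Agent dispatch: ask a worker to join the room.
    dispatch_agent(room_name)
    return CallResult(room_name=room_name, sdp_answer=sdp_answer)
```

Passing the LiveKit and Meta operations in as callables keeps the sequencing testable without network access; a real server would supply functions backed by the LiveKit API and the Meta Graph API.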

Key Technical Features

Real-Time Multimodal AI: Powered by Google's Gemini realtime voice model for natural, human-like conversations.
Advanced Audio Processing: Integrated BVC Noise Cancellation to ensure clarity even in noisy environments.
Meta Webhook Security: Full implementation of HMAC-SHA256 signature verification to protect against unauthorized requests.
Full Call Lifecycle Management: Automatic room creation, participant signaling, and automated teardown to optimize resource usage.
Outbound Calling Capability: Dedicated endpoint to initiate AI-driven calls directly to customers.
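The webhook signature check listed above can be sketched with the standard library alone. Meta signs the raw request body with the app secret and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The function name is hypothetical, and this is a minimal sketch rather than the code in whatsapp_server.py.

```python
import hashlib
import hmac


def is_valid_signature(app_secret: str, raw_body: bytes, signature_header: str) -> bool:
    """Check an X-Hub-Signature-256 header value against the raw request body."""
    prefix = "sha256="
    if not signature_header.startswith(prefix):
        return False
    expected = hmac.new(app_secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    received = signature_header[len(prefix):]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

The check must run on the exact bytes received, before any JSON parsing, because re-serializing the payload can change whitespace or key order and invalidate the digest.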

🛠️ Technical Stack

Platform: LiveKit (Cloud or Self-Hosted).
Backend: FastAPI, Python 3.10+, Node.js (for token generation).
AI Models: Google Gemini Multimodal (Voice: Aoede).
Integration: Meta Graph API (WhatsApp Business SDK).
Signaling: WebRTC (SDP Offer/Answer), HTTPX for async communication.
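To make the SDP offer/answer step more concrete, here is a small sketch that lists the audio codecs advertised in an SDP blob. It is a debugging aid under stated assumptions, not part of the project: the function name is hypothetical, and it reads only the `a=rtpmap` lines of the first audio section.

```python
def audio_codecs(sdp: str) -> list[str]:
    """Return codec names advertised in the first audio section of an SDP blob."""
    codecs: list[str] = []
    in_audio = False
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # A new media section starts; stop once the audio section ends.
            if in_audio:
                break
            in_audio = line.startswith("m=audio")
        elif in_audio and line.startswith("a=rtpmap:"):
            # Format: a=rtpmap:<payload type> <codec>/<clock rate>[/<channels>]
            _, _, mapping = line.partition(" ")
            codecs.append(mapping.split("/")[0])
    return codecs
```

Logging this list for the offer received from Meta and the answer sent back is a quick way to confirm that both sides agreed on a common codec.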

⚙️ Setup & Configuration

1. Environment Variables (`.env.local`)

The system requires deep integration with Meta and LiveKit:

# Meta / WhatsApp
META_ACCESS_TOKEN="your_token"
META_PHONE_NUMBER_ID="your_id"
META_APP_SECRET="your_secret"
META_WEBHOOK_VERIFY_TOKEN="your_verify_token"

# LiveKit
LIVEKIT_URL="wss://your-project.livekit.cloud"
LIVEKIT_API_KEY="your_key"
LIVEKIT_API_SECRET="your_secret"

# LLM
GOOGLE_API_KEY="your_google_key"
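A startup check can catch a missing variable before the first call arrives. The sketch below assumes the values from `.env.local` have already been loaded into the process environment (how that happens is not shown here), and the function name is hypothetical.

```python
import os
from typing import Mapping

# Names taken from the .env.local listing above.
REQUIRED_SETTINGS = (
    "META_ACCESS_TOKEN",
    "META_PHONE_NUMBER_ID",
    "META_APP_SECRET",
    "META_WEBHOOK_VERIFY_TOKEN",
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "GOOGLE_API_KEY",
)


def missing_settings(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required settings that are unset or blank."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]
```

Calling `missing_settings()` at startup and exiting with the returned names gives a clearer error than a failed API call later on.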

2. Installation

# Setup Python Environment
python -m venv venv
source venv/bin/activate  # Windows (Git Bash): source venv/Scripts/activate
pip install -r requirements.txt

# Setup Token Generator (Optional)
npm install
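For readers curious what a token generator produces: a LiveKit access token is a JWT signed with the API secret using HS256, carrying the API key as issuer, the participant identity as subject, and a `video` grant naming the room. The Python sketch below builds one with the standard library purely to illustrate that structure. It is not the repository's generateToken.js, the function name is hypothetical, and in practice the official LiveKit server SDKs are the safer choice.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_join_token(api_key: str, api_secret: str, identity: str, room: str,
                    ttl_seconds: int = 3600) -> str:
    """Build an HS256 JWT granting `identity` permission to join `room`."""
    now = int(time.time())
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,
        "sub": identity,
        "nbf": now,
        "exp": now + ttl_seconds,
        "video": {"room": room, "roomJoin": True},
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        api_secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"
```

A short `ttl_seconds` limits the damage if a token leaks, since the token only needs to be valid at the moment the participant joins.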

Execution Guide

To run the complete system, start two components, each in its own terminal:

Step 1: Start the LiveKit Agent Worker

The worker listens for rooms that need an AI assistant.

python agent.py dev

Step 2: Start the WhatsApp Integration Server

The server handles incoming webhooks and dispatches the workers.

uvicorn whatsapp_server:app --reload --port 8000
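Once the server is reachable, Meta confirms the webhook URL with a GET request carrying `hub.mode`, `hub.verify_token`, and `hub.challenge` query parameters, and expects the challenge echoed back when the token matches `META_WEBHOOK_VERIFY_TOKEN`. The sketch below shows only that decision; the function name is hypothetical and the route it would be called from in whatsapp_server.py is not shown.

```python
import hmac
from typing import Mapping, Optional


def verify_subscription(params: Mapping[str, str], verify_token: str) -> Optional[str]:
    """Return the challenge to echo back, or None if the request should be rejected."""
    mode = params.get("hub.mode", "")
    token = params.get("hub.verify_token", "")
    challenge = params.get("hub.challenge")
    # Compare in constant time so the token cannot be guessed byte by byte.
    token_ok = hmac.compare_digest(token.encode("utf-8"), verify_token.encode("utf-8"))
    if mode == "subscribe" and token_ok and challenge is not None:
        return challenge
    return None
```

A handler would return the challenge as the plain-text response body on success and respond with HTTP 403 when the function returns `None`.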

Performance & Security

Latency: Sub-second response times using WebRTC.
Scalability: LiveKit runs each call as a separate agent job with its own JobContext, so concurrent voice sessions scale by adding workers.
Integrity: Every Meta request is validated with an HMAC-SHA256 signature check before processing.

