Give your OpenClaw agent a phone number. Real-time voice conversations over a regular phone line — powered by Deepgram, Twilio, and your LLM.
ClawVoice bridges phone calls (via Twilio or Telnyx) with Deepgram's Voice Agent API, allowing real-time spoken conversations with your OpenClaw agent.

    Phone Call → Twilio/Telnyx (PSTN)
               → ClawVoice (WebSocket bridge)
               → Deepgram Voice Agent API (STT + TTS + turn-taking)
               → OpenClaw Gateway (/v1/chat/completions)

- Someone calls your Twilio/Telnyx phone number
- Twilio sends a webhook to ClawVoice, which starts a media stream
- Audio is bridged to Deepgram's Voice Agent API via WebSocket
- Deepgram handles speech-to-text (Nova-3), text-to-speech (Aura-2), and semantic turn detection
- LLM requests are proxied back to your OpenClaw gateway's chat completions endpoint
- The caller hears the agent's response spoken back in real-time
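
The webhook step above can be sketched as a function that returns TwiML pointing Twilio at the media stream endpoint. This is a minimal sketch, not ClawVoice's actual handler: the function names are hypothetical, and it assumes the stream URL is derived from the public URL by swapping the scheme to WebSocket. `<Connect><Stream>` is the standard TwiML for a bidirectional media stream.

```typescript
// Escape characters that must not appear raw inside an XML attribute value.
function escapeXmlAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: build the TwiML returned from POST /voice/twilio/incoming.
// `publicUrl` is the externally reachable base URL (for example a tunnel URL).
function buildIncomingTwiml(publicUrl: string): string {
  // Media streams use WebSocket: https -> wss, http -> ws. Trailing slashes are dropped.
  const wsBase = publicUrl.replace(/^http/, "ws").replace(/\/+$/, "");
  const streamUrl = wsBase + "/voice/twilio/media";
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    "<Response><Connect>" +
    '<Stream url="' + escapeXmlAttr(streamUrl) + '"/>' +
    "</Connect></Response>"
  );
}
```

Twilio then opens a WebSocket to that URL and streams call audio over it, which is where the bridge to Deepgram takes over.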
Key features:
- Semantic turn detection (not just VAD) — understands when someone is done speaking
- Barge-in support — interrupt the agent mid-sentence
- ~90ms TTS latency with Deepgram Aura-2
- Session pre-warming for faster first response
- Markdown stripping — responses are cleaned for voice output
- Twilio webhook signature validation
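
To illustrate the Markdown stripping feature, here is a rough sketch of how an LLM response might be cleaned before it is sent to text-to-speech. The function name and the exact set of rules are assumptions for illustration; the real logic in ClawVoice may differ.

```typescript
// Hypothetical sketch: remove common Markdown syntax so TTS does not read
// symbols such as "#" or "**" aloud.
function stripMarkdownForSpeech(text: string): string {
  return text
    .replace(/`([^`]+)`/g, "$1")               // inline code -> plain text
    .replace(/!\[([^\]]*)\]\([^)]*\)/g, "$1")  // images -> alt text
    .replace(/\[([^\]]+)\]\([^)]*\)/g, "$1")   // links -> link text
    .replace(/^#{1,6}[ \t]+/gm, "")            // heading markers
    .replace(/^[ \t]*[-*+][ \t]+/gm, "")       // bullet markers
    .replace(/\*\*(.+?)\*\*/g, "$1")           // bold
    .replace(/\*(.+?)\*/g, "$1")               // italic
    .replace(/\s+/g, " ")                      // collapse whitespace and newlines
    .trim();
}

// Example:
// stripMarkdownForSpeech("## Hours\n- We are **open** from `9am` to [5pm](https://example.com).")
// returns "Hours We are open from 9am to 5pm."
```
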
Requirements:
- Node.js >= 18
- OpenClaw >= 2026.3.12
- A Deepgram account (comes with $200 free credit)
- A Twilio or Telnyx account with a phone number
To install as an OpenClaw plugin:

    cd ~/.openclaw/workspace
    git clone https://github.com/clawnify/clawvoice.git
    cd clawvoice
    npm install && npm run build

Add to your openclaw.json:

    {
      "plugins": {
        "allow": ["clawvoice"],
        "load": {
          "paths": ["/home/user/.openclaw/workspace/clawvoice"]
        },
        "entries": {
          "clawvoice": {
            "enabled": true,
            "config": {
              "voiceProvider": "deepgram",
              "telephonyProvider": "twilio",
              "serve": { "port": 8000 },
              "greeting": "Hello! How can I help you today?"
            }
          }
        }
      }
    }

Enable the chat completions endpoint:

    openclaw config set gateway.http.endpoints.chatCompletions.enabled true

To clone and build ClawVoice on its own:

    git clone https://github.com/clawnify/clawvoice.git
    cd clawvoice
    npm install && npm run build

| Variable | Required | Description |
|---|---|---|
| DEEPGRAM_API_KEY | Yes | Deepgram API key |
| TWILIO_ACCOUNT_SID | If using Twilio | Twilio Account SID |
| TWILIO_AUTH_TOKEN | If using Twilio | Twilio Auth Token |
| TELNYX_API_KEY | If using Telnyx | Telnyx API key |
| TELNYX_PUBLIC_KEY | If using Telnyx | Telnyx public key |
| OPENCLAW_GATEWAY_URL | No | Gateway URL (default: http://127.0.0.1:18789) |
| OPENCLAW_GATEWAY_TOKEN | No | Gateway auth token |
| CLAWVOICE_DEBUG | No | Enable debug logging |
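
The "Required" column above can be expressed as a small startup check. This is a sketch under assumptions: the helper names are hypothetical, and it ignores the option of supplying the Deepgram key through the deepgram.apiKey config entry instead of the environment.

```typescript
type TelephonyChoice = "twilio" | "telnyx";
type EnvMap = Record<string, string | undefined>;

// Variables the table marks as required for each telephony provider.
const REQUIRED_BY_TELEPHONY: Record<TelephonyChoice, string[]> = {
  twilio: ["TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN"],
  telnyx: ["TELNYX_API_KEY", "TELNYX_PUBLIC_KEY"],
};

// Hypothetical helper: names of required variables that are unset or empty.
function missingEnvVars(env: EnvMap, telephony: TelephonyChoice): string[] {
  const required = ["DEEPGRAM_API_KEY", ...REQUIRED_BY_TELEPHONY[telephony]];
  return required.filter((name) => !env[name]);
}

// Hypothetical helper: gateway URL, falling back to the documented default.
function resolveGatewayUrl(env: EnvMap): string {
  return env["OPENCLAW_GATEWAY_URL"] || "http://127.0.0.1:18789";
}

// In a Node.js process this would typically be called as
// missingEnvVars(process.env, "twilio").
```
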
All config goes under plugins.entries.clawvoice.config in openclaw.json:
| Key | Default | Description |
|---|---|---|
| voiceProvider | "deepgram" | Voice provider |
| telephonyProvider | "twilio" | "twilio" or "telnyx" |
| deepgram.apiKey | env var | Deepgram API key |
| deepgram.voice | "aura-2-asteria-en" | Deepgram TTS voice |
| deepgram.language | "en" | Language code |
| serve.port | 8000 | Voice server port |
| serve.host | "127.0.0.1" | Voice server bind address |
| publicUrl | (none) | Public URL for webhooks (tunnel URL) |
| voiceModel | "anthropic/claude-haiku-4-5-20251001" | Model for voice responses |
| greeting | "Hello! How can I help you today?" | Greeting when call connects |
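
As an illustration of how these defaults combine with a partial user config, the sketch below merges the two, including the nested deepgram and serve objects. The type and function names are hypothetical; only the keys and default values come from the table.

```typescript
interface DeepgramOptions { apiKey?: string; voice: string; language: string }
interface ServeOptions { port: number; host: string }

interface ClawVoiceConfig {
  voiceProvider: string;
  telephonyProvider: "twilio" | "telnyx";
  deepgram: DeepgramOptions;
  serve: ServeOptions;
  publicUrl?: string;
  voiceModel: string;
  greeting: string;
}

// Every key is optional in the user-supplied config.
interface ClawVoiceUserConfig {
  voiceProvider?: string;
  telephonyProvider?: "twilio" | "telnyx";
  deepgram?: Partial<DeepgramOptions>;
  serve?: Partial<ServeOptions>;
  publicUrl?: string;
  voiceModel?: string;
  greeting?: string;
}

const CONFIG_DEFAULTS: ClawVoiceConfig = {
  voiceProvider: "deepgram",
  telephonyProvider: "twilio",
  deepgram: { voice: "aura-2-asteria-en", language: "en" },
  serve: { port: 8000, host: "127.0.0.1" },
  voiceModel: "anthropic/claude-haiku-4-5-20251001",
  greeting: "Hello! How can I help you today?",
};

// Hypothetical helper: user values win, nested objects are merged key by key.
function resolveConfig(user: ClawVoiceUserConfig = {}): ClawVoiceConfig {
  return {
    ...CONFIG_DEFAULTS,
    ...user,
    deepgram: { ...CONFIG_DEFAULTS.deepgram, ...user.deepgram },
    serve: { ...CONFIG_DEFAULTS.serve, ...user.serve },
  };
}
```

Merging the nested objects separately means a config that sets only serve.port still keeps the default serve.host.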
- Get a phone number in the Twilio Console
- Go to Phone Numbers → your number → Voice Configuration
- Set A Call Comes In → Webhook → https://your-public-url/voice/twilio/incoming (POST)
If running behind a Cloudflare Tunnel, the public URL would be your tunnel hostname.
ClawVoice is built with a provider-agnostic architecture:

    src/
      providers/
        types.ts       # VoiceProvider interface
        deepgram.ts    # Deepgram Voice Agent API
      bridges/
        types.ts       # TelephonyBridge interface
        twilio.ts      # Twilio media stream bridge
        telnyx.ts      # Telnyx media stream bridge
      server.ts        # HTTP/WebSocket server
      plugin/
        index.ts       # OpenClaw plugin registration
      utils.ts         # Markdown stripping, signature validation

Adding a new voice provider: Implement the VoiceProvider interface in src/providers/.
Adding a new telephony provider: Implement the TelephonyBridge interface in src/bridges/.
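
The pattern behind a provider-agnostic design can be sketched as an interface plus a registry keyed by the voiceProvider config value. The interface below is invented for illustration and is not the real VoiceProvider definition; consult src/providers/types.ts for the actual methods. The echo provider is a toy stand-in.

```typescript
// HYPOTHETICAL shape, for illustration only. The real interface is in
// src/providers/types.ts and may look quite different.
interface SketchVoiceProvider {
  readonly name: string;
  // Accept a chunk of caller audio from the telephony bridge.
  sendAudio(chunk: Uint8Array): void;
  // Register a callback for audio to play back to the caller.
  onAudio(handler: (chunk: Uint8Array) => void): void;
  close(): void;
}

// Toy provider that echoes caller audio straight back.
class EchoProvider implements SketchVoiceProvider {
  readonly name = "echo";
  private handlers: Array<(chunk: Uint8Array) => void> = [];
  private closed = false;

  sendAudio(chunk: Uint8Array): void {
    if (this.closed) return; // ignore audio after the call has ended
    for (const handler of this.handlers) handler(chunk);
  }

  onAudio(handler: (chunk: Uint8Array) => void): void {
    this.handlers.push(handler);
  }

  close(): void {
    this.closed = true;
  }
}

// Registry: adding a provider means adding one entry here.
const sketchProviders: Record<string, () => SketchVoiceProvider> = {
  echo: () => new EchoProvider(),
};

function createSketchProvider(name: string): SketchVoiceProvider {
  const factory = sketchProviders[name];
  if (!factory) throw new Error("Unknown voice provider: " + name);
  return factory();
}
```
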
Voice providers on the roadmap:
- ElevenLabs — high-quality TTS with 80+ voices (using OpenClaw's built-in STT for the inbound side)
- OpenAI Realtime — GPT-4o native voice
- Google Cloud Speech — STT + TTS
Telephony:
- Telnyx — already scaffolded, lower cost alternative to Twilio
When loaded as an OpenClaw plugin, ClawVoice registers:
| Surface | What |
|---|---|
| Tools | voice_call_status, voice_call_info |
| Service | Background HTTP/WebSocket server |
| CLI | openclaw clawvoice status, openclaw clawvoice config |
| Skill | Voice call guidelines for the agent |
| Endpoint | Description |
|---|---|
| GET /health | Health check + active call count |
| GET /voice/status | List active calls with details |
| POST /voice/twilio/incoming | Twilio webhook (returns TwiML) |
| WS /voice/twilio/media | Twilio media stream |
| POST /voice/telnyx/webhook | Telnyx webhook |
| WS /voice/telnyx/media | Telnyx media stream |
| POST /v1/chat/completions | LLM proxy (internal, authenticated) |
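
For the LLM proxy, the request forwarded to the OpenClaw gateway might be assembled roughly as follows. This is a sketch with assumptions: it assumes the gateway accepts an OpenAI-style chat completions body and a Bearer token in the Authorization header, and the function and type names are hypothetical.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface GatewayRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: describe the POST to the gateway's chat completions endpoint.
function buildGatewayRequest(
  gatewayBaseUrl: string,
  model: string,
  messages: ChatMessage[],
  token?: string,
): GatewayRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // ASSUMPTION: the gateway token is sent as a Bearer token.
  if (token) headers["Authorization"] = "Bearer " + token;
  return {
    url: gatewayBaseUrl.replace(/\/+$/, "") + "/v1/chat/completions",
    method: "POST",
    headers,
    body: JSON.stringify({ model, messages }),
  };
}
```

The returned object can be passed to fetch as `fetch(req.url, req)` on Node.js 18 or later.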
- Twilio webhook requests are validated via X-Twilio-Signature (HMAC-SHA1)
- The LLM proxy endpoint is authenticated with a random secret generated at startup
- The voice server binds to 127.0.0.1 by default; expose it via a reverse proxy or tunnel
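
The X-Twilio-Signature check follows Twilio's documented scheme for form-encoded POST webhooks: take the full request URL, append each POST parameter name and value in sorted order, sign the result with HMAC-SHA1 using the auth token, and base64-encode it. A minimal sketch with hypothetical function names (ClawVoice's own implementation may differ):

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the signature Twilio would send for this URL and set of POST params.
function computeTwilioSignature(
  authToken: string,
  url: string,
  params: Record<string, string>,
): string {
  // URL first, then name + value for each parameter, sorted by name.
  const data = Object.keys(params)
    .sort()
    .reduce((acc, key) => acc + key + params[key], url);
  return createHmac("sha1", authToken).update(data, "utf8").digest("base64");
}

// Compare the header value against the expected signature in constant time.
function isValidTwilioSignature(
  authToken: string,
  url: string,
  params: Record<string, string>,
  headerValue: string,
): boolean {
  const expected = Buffer.from(computeTwilioSignature(authToken, url, params));
  const received = Buffer.from(headerValue);
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The URL must match what Twilio called, so behind a tunnel or reverse proxy it should be the public URL rather than the local 127.0.0.1 address.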
| Service | Cost |
|---|---|
| Deepgram Voice Agent | ~$4.50/hr |
| Twilio (phone + minutes) | ~$1/mo + $0.085/min |
| Telnyx (alternative) | ~$0.50/mo + $0.025/min |
| OpenClaw (your LLM keys) | Varies by provider |
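
To turn the table into a rough monthly figure, the sketch below adds the Deepgram hourly rate (prorated per call minute) to the telephony monthly fee and per-minute charge. It uses the approximate rates listed above, assumes Deepgram is billed only for call time, and excludes LLM costs; the names are hypothetical.

```typescript
type Carrier = "twilio" | "telnyx";

const DEEPGRAM_USD_PER_HOUR = 4.5;

const CARRIER_RATES: Record<Carrier, { monthly: number; perMinute: number }> = {
  twilio: { monthly: 1.0, perMinute: 0.085 },
  telnyx: { monthly: 0.5, perMinute: 0.025 },
};

// Estimated monthly cost in USD, rounded to cents, excluding LLM usage.
function estimateMonthlyCost(callMinutes: number, carrier: Carrier): number {
  const rates = CARRIER_RATES[carrier];
  const deepgram = (callMinutes / 60) * DEEPGRAM_USD_PER_HOUR;
  const telephony = rates.monthly + callMinutes * rates.perMinute;
  return Math.round((deepgram + telephony) * 100) / 100;
}

// 100 minutes on Twilio: 7.50 (Deepgram) + 1.00 + 8.50 = 17.00
// 100 minutes on Telnyx: 7.50 (Deepgram) + 0.50 + 2.50 = 10.50
```
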
MIT
