This is a minimal production-like prototype for a hackathon. It handles inbound calls, records voicemails, and notifies a driver via SMS.
- Receives inbound calls via Twilio.
- Plays a Japanese greeting and records voicemail.
- Processes recording callback to generate a structured "Request Card".
- Sends an SMS briefing to a driver.
- Includes mock LLM/STT logic for prototype demonstration.
- MiniMax TTS Integration: Generates audio briefings for drivers.
- Agora Voice Call: Real-time voice communication (with silent fallback to demo mode).
- Provides /health and /debug/latest endpoints.
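The recording-to-SMS step in the list above can be sketched as two small pure functions. This is only an illustration: the Request Card fields (`id`, `createdAt`, `caller`, `summary`, `transcript`, `status`) and the function names `buildRequestCard` and `formatSmsBriefing` are assumptions, not the project's actual schema.

```javascript
// Hypothetical Request Card shape; the real prototype's fields may differ.
function buildRequestCard(transcript, callerNumber, now = new Date()) {
  const text = String(transcript ?? "").trim();
  return {
    id: `req_${now.getTime()}`,
    createdAt: now.toISOString(),
    caller: callerNumber,
    // Keep the summary short enough to fit comfortably in an SMS.
    summary: text.length > 80 ? `${text.slice(0, 77)}...` : text,
    transcript: text,
    status: text ? "new" : "empty",
  };
}

// Render the card as a one-line SMS body for the driver.
function formatSmsBriefing(card) {
  return `New request from ${card.caller}: ${card.summary || "(no message)"}`;
}
```

In the real flow the transcript would come from the (mock) STT step, and the returned string would be passed to whatever SMS sender the backend uses.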
- Node.js (v18+)
- Twilio Account (Account SID, Auth Token, Phone Number)
- Agora App ID (for voice calls)
- MiniMax API Key & Group ID (for TTS)
- ngrok for exposing the local server to Twilio
- Clone the repository:
      git clone <repository-url>
      cd drivedesk
- Install dependencies:
      npm install
- Configuration: Copy .env.example to .env and fill in your details:
      cp .env.example .env
  - TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN, TWILIO_PHONE_NUMBER: Twilio credentials.
  - DRIVER_PHONE: The driver's phone number.
  - PUBLIC_BASE_URL: The public URL of your server (e.g., an ngrok URL).
  - AGORA_APP_ID, AGORA_APP_CERTIFICATE: Agora credentials.
  - MINIMAX_API_KEY, MINIMAX_GROUP_ID: MiniMax credentials for TTS.
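A startup check for these variables might look like the sketch below. The variable names come from the configuration list; the helper name `missingEnvVars` and the choice to treat every variable as required are assumptions (the Agora values, for instance, may be optional given the demo-mode fallback).

```javascript
// Variable names taken from the configuration list above.
const REQUIRED_ENV = [
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "TWILIO_PHONE_NUMBER",
  "DRIVER_PHONE",
  "PUBLIC_BASE_URL",
  "AGORA_APP_ID",
  "AGORA_APP_CERTIFICATE",
  "MINIMAX_API_KEY",
  "MINIMAX_GROUP_ID",
];

// Hypothetical helper: returns the names that are unset or blank.
function missingEnvVars(env = process.env) {
  return REQUIRED_ENV.filter((name) => !env[name] || env[name].trim() === "");
}
```

Calling `missingEnvVars()` once at startup and logging the result makes a half-filled .env file easier to spot than a failed Twilio or MiniMax request later on.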
- Start the server:
      npm run dev
  The server runs on http://localhost:3000 by default.
- POST /voice/incoming: Twilio webhook for inbound calls.
- POST /voice/recording-callback: Twilio webhook for recording completion.
- POST /api/briefing/tts: Generate TTS audio from text (MiniMax).
- GET /health: Check server status.
- GET /debug/latest: View the latest generated Request Card JSON.
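To illustrate what the inbound-call webhook could return, here is a minimal sketch that builds a TwiML document with a Japanese greeting followed by a recording step. The greeting text, the 60-second limit, the helper names, and the decision to wire /voice/recording-callback as the `action` of `<Record>` (rather than as a `recordingStatusCallback`) are all assumptions, not the project's confirmed behavior.

```javascript
// Escape text for safe use inside XML elements and attributes.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: TwiML for POST /voice/incoming.
// The default greeting is a placeholder, not the project's real wording.
function incomingCallTwiml(
  publicBaseUrl,
  greeting = "お電話ありがとうございます。発信音の後にご用件をお話しください。"
) {
  // Strip trailing slashes so the joined URL has exactly one separator.
  const base = publicBaseUrl.replace(/\/+$/, "");
  const callbackUrl = `${base}/voice/recording-callback`;
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    `  <Say language="ja-JP">${escapeXml(greeting)}</Say>`,
    `  <Record maxLength="60" playBeep="true" action="${escapeXml(callbackUrl)}" method="POST" />`,
    "</Response>",
  ].join("\n");
}
```

A route handler would send this string with a `text/xml` content type, passing PUBLIC_BASE_URL as the first argument so Twilio can reach the callback through ngrok.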
The frontend is a Next.js application located in the /frontend directory.
- Navigate to the frontend directory:
      cd frontend
- Install dependencies:
      npm install
- Configuration: The .env.local file is already created. Open it and set your Agora App ID:
      NEXT_PUBLIC_AGORA_APP_ID=your_agora_app_id_here
      NEXT_PUBLIC_BACKEND_BASE_URL=http://localhost:3000
- Run the frontend:
      npm run dev
  Access the UI at http://localhost:3001.
- Open two browser tabs/windows of the frontend app.
- Tab 1 (Customer):
  - Select the "Customer" role.
  - Click "Call Driver".
  - If microphone access fails (e.g., on a remote desktop), the app silently falls back to demo mode but keeps the "Calling..." status.
- Tab 2 (Driver):
  - Select the "Driver" role.
  - Click "Start Shift (Join)".
  - You will see "Incoming Call...".
  - Click "Ignore" (or wait 10s on the Customer side) to trigger Voicemail.
- Voicemail & TTS Demo:
  - The Customer UI switches to "Leave Voicemail" mode.
  - No Mic Fallback: Type a message in the text area (e.g. "Pickup at Airport") and click "Generate Request".
  - The Driver clicks "Refresh" to see the request.
  - The Driver clicks "Play Briefing":
    - The backend fetches audio from the MiniMax API.
    - The audio plays in the browser.
    - Click Stop (Square Icon) to interrupt playback immediately.
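The customer-side flow in the steps above (call, then voicemail on "Ignore" or after 10 seconds) can be modeled as a small state machine. This is a sketch only: the state names, the event names (including an `accept` event that the demo steps do not describe), and the function `nextCallState` are hypothetical and not taken from the frontend code.

```javascript
// Ring timeout taken from the "wait 10s" step in the demo flow.
const RING_TIMEOUT_MS = 10_000;

// Hypothetical transition function for the customer-side call state.
function nextCallState(state, event) {
  switch (state) {
    case "idle":
      return event === "call" ? "calling" : state;
    case "calling":
      if (event === "accept") return "connected";
      // Both an explicit "Ignore" and the ring timeout lead to voicemail.
      if (event === "ignore" || event === "timeout") return "voicemail";
      return state;
    case "voicemail":
      return event === "submit" ? "request_sent" : state;
    default:
      // Unknown or terminal states ignore further events.
      return state;
  }
}
```

In a UI, the `timeout` event would be dispatched from a `setTimeout(..., RING_TIMEOUT_MS)` started when the state becomes `calling`, and cleared if the driver responds first.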