DriveDesk Relay Backend

This is a minimal production-like prototype for a hackathon. It handles inbound calls, records voicemails, and notifies a driver via SMS.

Features

  • Receives inbound calls via Twilio.
  • Plays a Japanese greeting and records voicemail.
  • Processes recording callback to generate a structured "Request Card".
  • Sends an SMS briefing to a driver.
  • Includes mock LLM/STT logic for the prototype demonstration.
  • MiniMax TTS Integration: Generates audio briefings for drivers.
  • Agora Voice Call: Real-time voice communication (with silent fallback to demo mode).
  • Provides /health and /debug/latest endpoints.
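
The recording-callback and SMS steps listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `RequestCard` fields, `mockTranscribe`, `buildRequestCard`, and `formatSmsBriefing` are all hypothetical names, and the canned transcript stands in for the mock STT logic. `CallSid`, `From`, and `RecordingUrl` are form fields Twilio posts to a recording action URL.

```typescript
// Hypothetical shape of a "Request Card"; the real fields are not documented in this README.
interface RequestCard {
  id: string;
  callerNumber: string;
  recordingUrl: string;
  transcript: string;
  summary: string;
  createdAt: string;
}

// Subset of the form fields Twilio posts to a <Record> action URL.
interface RecordingCallback {
  CallSid: string;
  From: string;
  RecordingUrl: string;
}

// Stand-in for a real speech-to-text call: always returns a canned transcript.
function mockTranscribe(_recordingUrl: string): string {
  return "Pickup at Airport";
}

// Turn the callback payload into a structured card (assumed layout).
function buildRequestCard(cb: RecordingCallback, now: Date = new Date()): RequestCard {
  const transcript = mockTranscribe(cb.RecordingUrl);
  return {
    id: cb.CallSid,
    callerNumber: cb.From,
    recordingUrl: cb.RecordingUrl,
    transcript,
    summary: transcript.slice(0, 80), // keep the SMS short
    createdAt: now.toISOString(),
  };
}

// One-line briefing text suitable for an SMS body.
function formatSmsBriefing(card: RequestCard): string {
  return `New request from ${card.callerNumber}: ${card.summary}`;
}
```

Keeping these steps as pure functions means the card logic can be exercised without a Twilio account; the actual SMS send would wrap `formatSmsBriefing` with whatever client the backend uses.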

Prerequisites

  • Node.js (v18+)
  • Twilio Account (Account SID, Auth Token, Phone Number)
  • Agora App ID (for voice calls)
  • MiniMax API Key & Group ID (for TTS)
  • ngrok, to expose the local server to Twilio

Setup

  1. Clone the repository:

    git clone <repository-url>
    cd drivedesk
  2. Install dependencies:

    npm install
  3. Configuration: Copy .env.example to .env and fill in your details:

    cp .env.example .env
    • TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN, TWILIO_PHONE_NUMBER: Twilio credentials.
    • DRIVER_PHONE: The driver's phone number.
    • PUBLIC_BASE_URL: The public URL of your server (e.g., ngrok).
    • AGORA_APP_ID, AGORA_APP_CERTIFICATE: Agora credentials.
    • MINIMAX_API_KEY, MINIMAX_GROUP_ID: MiniMax credentials for TTS.
  4. Start the server:

    npm run dev

    The server runs on http://localhost:3000 by default.
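
Because the server depends on several variables in .env, a startup check can make a missing value obvious. The sketch below is an assumption about how such a check might look, not something the repository is known to contain; `missingEnvVars` is a hypothetical helper, and treating every variable as required is itself an assumption (the Agora values, for example, may be optional given the demo-mode fallback).

```typescript
// Variable names taken from the configuration step above.
const REQUIRED_VARS = [
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "TWILIO_PHONE_NUMBER",
  "DRIVER_PHONE",
  "PUBLIC_BASE_URL",
  "AGORA_APP_ID",
  "AGORA_APP_CERTIFICATE",
  "MINIMAX_API_KEY",
  "MINIMAX_GROUP_ID",
] as const;

type Env = Record<string, string | undefined>;

// Return the names of variables that are unset or blank (hypothetical helper).
function missingEnvVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js entry point this would typically be called as `missingEnvVars(process.env)` before the server starts listening, logging or throwing if the returned list is non-empty.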

Endpoints

  • POST /voice/incoming: Twilio webhook for inbound calls.
  • POST /voice/recording-callback: Twilio webhook for recording completion.
  • POST /api/briefing/tts: Generate TTS audio from text (MiniMax).
  • GET /health: Check server status.
  • GET /debug/latest: View the latest generated Request Card JSON.
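
As an illustration of what POST /voice/incoming might return, the sketch below builds a TwiML document that speaks a greeting and then records a voicemail, pointing the recording at /voice/recording-callback. `<Say>` and `<Record>` are standard TwiML verbs, but `buildIncomingTwiml`, the greeting text, and the 60-second `maxLength` are assumptions made for the example rather than the project's actual values.

```typescript
// Escape text so it is safe inside XML content and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build the TwiML for an inbound call (hypothetical helper; values are assumed).
function buildIncomingTwiml(publicBaseUrl: string, greeting: string): string {
  // Strip trailing slashes so the joined URL has exactly one separator.
  const action = `${publicBaseUrl.replace(/\/+$/, "")}/voice/recording-callback`;
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    `  <Say language="ja-JP">${escapeXml(greeting)}</Say>`,
    `  <Record action="${escapeXml(action)}" method="POST" maxLength="60" playBeep="true" />`,
    "</Response>",
  ].join("\n");
}
```

The handler would send this string with a `text/xml` content type. Deriving the action URL from PUBLIC_BASE_URL is why that variable must point at the ngrok address rather than localhost.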

Frontend Setup (DriveDesk Relay UI)

The frontend is a Next.js application located in the /frontend directory.

Setup

  1. Navigate to the frontend directory:

    cd frontend
  2. Install dependencies:

    npm install
  3. Configuration: The .env.local file already exists. Open it and set your Agora App ID:

    NEXT_PUBLIC_AGORA_APP_ID=your_agora_app_id_here
    NEXT_PUBLIC_BACKEND_BASE_URL=http://localhost:3000
  4. Run the frontend:

    npm run dev

    Access the UI at http://localhost:3001.
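
To show how the frontend might use NEXT_PUBLIC_BACKEND_BASE_URL, here is a small sketch that builds a request for the backend's /api/briefing/tts endpoint. `backendUrl` and `buildTtsRequest` are hypothetical helpers, and the `{ text }` JSON body is an assumed request shape, since the README does not document the endpoint's payload.

```typescript
// Join a base URL and a path with exactly one slash between them.
function backendUrl(path: string, baseUrl: string = "http://localhost:3000"): string {
  const base = baseUrl.replace(/\/+$/, "");
  const suffix = path.charAt(0) === "/" ? path : `/${path}`;
  return `${base}${suffix}`;
}

interface TtsRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Describe a TTS call; the { text } body is an assumption, not a documented contract.
function buildTtsRequest(text: string, baseUrl?: string): TtsRequest {
  return {
    url: backendUrl("/api/briefing/tts", baseUrl),
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text }),
  };
}
```

In the Next.js app, the base URL would come from `process.env.NEXT_PUBLIC_BACKEND_BASE_URL`, and the returned fields could be passed to `fetch(url, { method, headers, body })`.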

Usage Flow

  1. Open two browser tabs/windows of the frontend app.
  2. Tab 1 (Customer):
    • Select "Customer" role.
    • Click "Call Driver".
    • If microphone access fails (e.g., over remote desktop), the app silently falls back to demo mode but keeps the "Calling..." status.
  3. Tab 2 (Driver):
    • Select "Driver" role.
    • Click "Start Shift (Join)".
    • You will see "Incoming Call...".
    • Click "Ignore" (or wait 10 seconds on the Customer side) to trigger voicemail.
  4. Voicemail & TTS Demo:
    • Customer UI switches to "Leave Voicemail" mode.
    • No Mic Fallback: Type a message in the text area (e.g. "Pickup at Airport") and click "Generate Request".
    • Driver clicks "Refresh" to see the request.
    • Driver clicks "Play Briefing":
      • Backend fetches audio from MiniMax API.
      • Audio plays in browser.
      • Click Stop (Square Icon) to interrupt playback immediately.
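
The call-to-voicemail transition in the flow above can be modeled as a small state machine. The sketch below is one possible way to express it, assuming three states and the 10-second ring timeout mentioned in step 3; the state names, event names, and `nextState` function are hypothetical and do not come from the frontend code.

```typescript
type CallState = "idle" | "calling" | "voicemail";

type CallEvent =
  | { type: "CALL_DRIVER" }                // Customer clicks "Call Driver"
  | { type: "DRIVER_IGNORED" }             // Driver clicks "Ignore"
  | { type: "RING_ELAPSED"; ms: number };  // Time spent ringing so far

// Ring timeout from the usage flow: 10 seconds on the Customer side.
const RING_TIMEOUT_MS = 10000;

// Pure transition function: returns the next state without side effects.
function nextState(state: CallState, event: CallEvent): CallState {
  if (event.type === "CALL_DRIVER" && state === "idle") return "calling";
  // All remaining transitions only apply while a call is ringing.
  if (state !== "calling") return state;
  if (event.type === "DRIVER_IGNORED") return "voicemail";
  if (event.type === "RING_ELAPSED" && event.ms >= RING_TIMEOUT_MS) return "voicemail";
  return state;
}
```

A pure transition function like this fits the demo-mode fallback described above: the UI can keep showing "Calling..." and still reach voicemail even when no microphone or real Agora session is available.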
