Real-time multimodal sales assistant built for the Gemini Live Agent Challenge.
Sal helps sales reps prepare for meetings, provides live coaching via Gemini Live, and autonomously captures action items during the call.
- Challenge Category: Gemini Live Agent (Multimodal).
- Proactive Agentic Behavior: Sal autonomously identifies action items as they come up in conversation and writes them to Firestore, with no manual entry required.
- Native Multimodal Loop: Streams live audio/video context and handles native Gemini audio output with sub-second latency.
- Enterprise-Ready Stack: Built with Next.js 15, FastAPI, and Google ADK; deployed on Cloud Run behind IAP.
Use this table to quickly locate implementation evidence for the challenge rubric.
| Challenge Criterion | Implementation Detail | Code Evidence |
|---|---|---|
| Multimodal Input | Real-time audio/video streaming via Protobuf | realtime.proto, capture.ts |
| Agentic Behavior | Proactive action-item capture & autonomous tools | live_agent.py, tools.py |
| Multimodal Output | Native-audio responses & visual guidance | audioPlayer.ts, adk_event_projection.py |
| Cloud Native | Production-ready GCP (Cloud Run, Firestore, IAP) | skaffold.yaml, main.tf |
| UX & Design | Calendar-first sales workflow & meeting prep | GoogleCalendarSyncButton.tsx |
The system is layered to ensure low-latency communication between the browser and the Gemini Live API.
flowchart TB
subgraph Client["Client (Next.js)"]
direction LR
Rep["Sales rep"] --> Web["Meeting Assist UI"]
end
subgraph Gateway["Gateway Services (Python)"]
direction LR
Rpc["ConnectRPC APIs<br/>(Calendar & Sessions)"]
Realtime["Realtime WebSocket<br/>(Audio/Video Stream)"]
Agent["Google ADK live agent"]
end
subgraph Cloud["Cloud APIs"]
direction LR
Firestore["Firestore"]
CalendarAPI["Google Calendar API"]
Gemini["Gemini 2.5 Flash native audio"]
end
Web -->|"session, calendar RPCs"| Rpc
Web <-->|"audio/video stream"| Realtime
Realtime --> Agent
Agent -->|"run_live"| Gemini
Rpc -->|"meeting state"| Firestore
Agent -->|"proactive notes/actions"| Firestore
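The streaming path in the diagram (browser to realtime gateway to agent to model, and back) can be sketched as concurrent pumps around two bounded queues. This is a minimal, hypothetical model using only `asyncio`: `fake_model` stands in for the Gemini Live session that the agent opens via `run_live`, and none of the names below are taken from the repository.

```python
import asyncio

END = None  # sentinel marking end of stream (an assumption of this sketch)


async def fake_model(upstream: asyncio.Queue, downstream: asyncio.Queue) -> None:
    """Stand-in for the live model session: emits one reply per input chunk."""
    while (chunk := await upstream.get()) is not END:
        await downstream.put(f"reply-to:{chunk}")
    await downstream.put(END)


async def relay(client_chunks: list[str]) -> list[str]:
    """Send client chunks upstream while collecting model output concurrently."""
    # Bounded queues apply backpressure if one side falls behind.
    upstream: asyncio.Queue = asyncio.Queue(maxsize=8)
    downstream: asyncio.Queue = asyncio.Queue(maxsize=8)
    received: list[str] = []

    async def send() -> None:
        for chunk in client_chunks:
            await upstream.put(chunk)
        await upstream.put(END)

    async def receive() -> None:
        while (event := await downstream.get()) is not END:
            received.append(event)

    # Run all three coroutines together so sending never blocks receiving.
    await asyncio.gather(send(), receive(), fake_model(upstream, downstream))
    return received


def run_demo() -> list[str]:
    return asyncio.run(relay(["audio-0", "audio-1", "video-0"]))
```

Running the send and receive loops concurrently, rather than in a request/response cycle, is what lets output start flowing while input is still arriving. A real gateway would replace the lists and the stand-in model with a WebSocket connection and a live agent session.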
Follow this "Happy Path" to verify Sal's core functionality.
- Node.js >= 20, pnpm >= 10, Python >= 3.12, uv >= 0.5
- A Google API key with Gemini access
- Setup: `gmake setup` and set `GOOGLE_API_KEY` in `apps/agent-gateway/.env`.
- Launch: `gmake dev` (starts both the Next.js frontend and Python gateway).
- Open: Navigate to `http://localhost:3000/projects/sal-agent`.
- Get Started: Click on the "Get Started" button and tell Sal about what you're selling.
- Select: Pick "Discovery Call" from the list.
- Start: Choose one of the live chat options, "Audio Only" or "Video", and click "Start Live Assist".
- Verify Proactive Capture:
  - Speak into your mic: "Hey Sal, remind me to follow up with the client next Tuesday."
  - Observe the "Action Items" panel. Sal will autonomously add the task to Firestore.
- Mute Sal & Toggle the Picture-in-Picture Display: Mute Sal so he listens in the background, and toggle the picture-in-picture display to see his guidance in real time.
- Join a live meeting: Tell Sal that you're about to hop on the call, then work through your action items during the call. Watch Sal mark each action item complete and add new ones as things come up.
- Post-call: After the call, ask Sal for a call debrief.
- Node.js >= 20, pnpm >= 10, Python >= 3.12, uv >= 0.5
- A Google API key with Gemini access

gmake setup
gmake lint
gmake build
gmake dev

IMPORTANT: The root web app is bound to the route /projects/sal-agent.
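Before running the commands above, it can help to confirm that the local toolchain meets the stated minimums (Node.js >= 20, pnpm >= 10, Python >= 3.12, uv >= 0.5). The script below is a hypothetical helper, not part of the repository; it assumes each tool prints its version when invoked with `--version` and that Python is available as `python3`.

```python
import re
import shutil
import subprocess

# Minimum versions taken from the prerequisites list.
MINIMUMS = {
    "node": (20,),
    "pnpm": (10,),
    "python3": (3, 12),
    "uv": (0, 5),
}


def parse_version(output: str) -> tuple[int, ...]:
    """Extract the first dotted version number from --version output."""
    match = re.search(r"\d+(?:\.\d+)*", output)
    if match is None:
        raise ValueError(f"no version found in {output!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(found: tuple[int, ...], minimum: tuple[int, ...]) -> bool:
    """Compare component-wise, padding the shorter version with zeros."""
    width = max(len(found), len(minimum))
    padded_found = found + (0,) * (width - len(found))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_found >= padded_minimum


def check_tool(name: str, minimum: tuple[int, ...]) -> str:
    """Return a one-line status for a single tool."""
    if shutil.which(name) is None:
        return f"{name}: not found on PATH"
    result = subprocess.run(
        [name, "--version"], capture_output=True, text=True, check=False
    )
    try:
        found = parse_version(result.stdout or result.stderr)
    except ValueError:
        return f"{name}: could not parse version"
    status = "ok" if meets_minimum(found, minimum) else "too old"
    return f"{name}: {'.'.join(map(str, found))} ({status})"


if __name__ == "__main__":
    for tool, minimum in MINIMUMS.items():
        print(check_tool(tool, minimum))
```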
Production deployment is managed via Skaffold and Terraform in us-east1.
- Compute: Cloud Run services behind a Global Load Balancer.
- Security: Identity-Aware Proxy (IAP) for authenticated access.
- Persistence: Firestore (Native Mode).
- Secrets: Google Secret Manager.
High-level deploy flow:
gmake auth
gmake deploy-prod

/
├── apps/
│ ├── web/ # Next.js 15 web application
│ └── agent-gateway/ # Python realtime gateway app
├── core/ # Shared UI components and TS/Lint configs
├── protos/ # Protobuf definitions and generated SDKs
├── deployments/ # Skaffold and Kubernetes manifests
└── infra/ # Terraform stacks for Google Cloud
MIT