crigout/sal-agent

Sal

Real-time multimodal sales assistant built for the Gemini Live Agent Challenge.

Sal helps sales reps prepare for meetings, provides live coaching via Gemini Live, and autonomously captures action items during the call.

Project Highlights

  • Challenge Category: Gemini Live Agent (Multimodal).
  • Proactive Agentic Behavior: Sal autonomously identifies action items and updates them in Firestore as they come up during the call, with no manual entry required.
  • Native Multimodal Loop: Streams live audio/video context and handles native Gemini audio output with sub-second latency.
  • Enterprise-Ready Stack: Built with Next.js 15, FastAPI, Google ADK, and deployed on Cloud Run with IAP.
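
The proactive capture highlighted above can be pictured as plain Python functions that the live agent calls as tools. The following is a minimal sketch, not the project's actual tools.py: the names ActionItem, ActionItemStore, add_action_item, and complete_action_item are hypothetical, and an in-memory dictionary stands in for Firestore so the example runs without credentials.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class ActionItem:
    """One captured task (hypothetical shape, not the project's schema)."""
    text: str
    done: bool = False
    id: str = field(default_factory=lambda: uuid4().hex)


class ActionItemStore:
    """In-memory stand-in for a Firestore collection of action items."""

    def __init__(self) -> None:
        self._items: dict[str, ActionItem] = {}

    def add(self, text: str) -> ActionItem:
        item = ActionItem(text=text.strip())
        self._items[item.id] = item
        return item

    def complete(self, item_id: str) -> bool:
        item = self._items.get(item_id)
        if item is None:
            return False
        item.done = True
        return True

    def open_items(self) -> list[ActionItem]:
        return [i for i in self._items.values() if not i.done]


store = ActionItemStore()


def add_action_item(text: str) -> dict:
    """Tool the agent could call when it hears a commitment in the call."""
    item = store.add(text)
    return {"status": "ok", "id": item.id, "text": item.text}


def complete_action_item(item_id: str) -> dict:
    """Tool the agent could call once an item is handled during the call."""
    found = store.complete(item_id)
    return {"status": "ok" if found else "not_found", "id": item_id}
```

Google ADK accepts plain Python functions as tools, so functions of this shape could be registered with an agent. How Sal actually wires its tools and Firestore writes is not shown here.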

Judge Quick Scan

Use this table to quickly locate implementation evidence for the challenge rubric.

| Challenge Criterion | Implementation Detail | Code Evidence |
| --- | --- | --- |
| Multimodal Input | Real-time audio/video streaming via Protobuf | realtime.proto, capture.ts |
| Agentic Behavior | Proactive action-item capture & autonomous tools | live_agent.py, tools.py |
| Multimodal Output | Native-audio responses & visual guidance | audioPlayer.ts, adk_event_projection.py |
| Cloud Native | Production-ready GCP (Cloud Run, Firestore, IAP) | skaffold.yaml, main.tf |
| UX & Design | Calendar-first sales workflow & meeting prep | GoogleCalendarSyncButton.tsx |

Architecture

The system is layered to ensure low-latency communication between the browser and the Gemini Live API.

flowchart TB
    subgraph Client["Client (Next.js)"]
      direction LR
      Rep["Sales rep"] --> Web["Meeting Assist UI"]
    end

    subgraph Gateway["Gateway Services (Python)"]
      direction LR
      Rpc["ConnectRPC APIs<br/>(Calendar & Sessions)"]
      Realtime["Realtime WebSocket<br/>(Audio/Video Stream)"]
      Agent["Google ADK live agent"]
    end

    subgraph Cloud["Cloud APIs"]
      direction LR
      Firestore["Firestore"]
      CalendarAPI["Google Calendar API"]
      Gemini["Gemini 2.5 Flash native audio"]
    end

    Web -->|"session, calendar RPCs"| Rpc
    Web <-->|"audio/video stream"| Realtime
    Realtime --> Agent
    Agent -->|"run_live"| Gemini
    Rpc -->|"meeting state"| Firestore
    Agent -->|"proactive notes/actions"| Firestore
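
The Realtime-to-Agent hop in the diagram can be illustrated with a small asyncio relay. This is a sketch under assumptions, not the gateway's code: MediaFrame is a hypothetical stand-in for whatever realtime.proto defines, and two in-process queues replace the WebSocket and the ADK live agent.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class MediaFrame:
    """Hypothetical frame; the real wire format lives in realtime.proto."""
    kind: str       # "audio" or "video"
    payload: bytes


async def relay(inbound: asyncio.Queue, agent_input: asyncio.Queue) -> int:
    """Forward frames from the socket side to the agent until None arrives."""
    forwarded = 0
    while True:
        frame = await inbound.get()
        if frame is None:           # sentinel: the client closed the stream
            await agent_input.put(None)
            return forwarded
        # Drop unknown kinds and empty payloads instead of forwarding them.
        if frame.kind in ("audio", "video") and frame.payload:
            await agent_input.put(frame)
            forwarded += 1


async def demo() -> tuple[int, list[str]]:
    inbound: asyncio.Queue = asyncio.Queue()
    agent_input: asyncio.Queue = asyncio.Queue()
    for frame in (
        MediaFrame("audio", b"\x00\x01"),
        MediaFrame("video", b"\xff"),
        MediaFrame("audio", b""),   # empty payload, dropped by relay
        None,
    ):
        inbound.put_nowait(frame)
    count = await relay(inbound, agent_input)
    kinds = []
    while (f := agent_input.get_nowait()) is not None:
        kinds.append(f.kind)
    return count, kinds
```

Running asyncio.run(demo()) forwards the two non-empty frames in order. In a real gateway, giving the queues a maxsize would add backpressure when the agent falls behind the incoming stream.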

Reproducible Testing: Judge’s Guide

Follow this "Happy Path" to verify Sal's core functionality.

Prerequisites

  • Node.js >= 20, pnpm >= 10, Python >= 3.12, uv >= 0.5
  • A Google API key with Gemini access
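
A quick preflight script can confirm the tool versions listed above before running setup. This is an illustrative sketch that is not part of the repository: the command names (node, pnpm, python3, uv) and the assumption that each prints its version in response to --version may not hold on every platform.

```python
import re
import shutil
import subprocess

# Minimum versions copied from the prerequisites list above.
REQUIRED = {
    "node": (20, 0),
    "pnpm": (10, 0),
    "python3": (3, 12),
    "uv": (0, 5),
}


def parse_version(output: str) -> tuple[int, ...]:
    """Extract the first version number from a --version output string."""
    match = re.search(r"\d+(?:\.\d+)*", output)
    if match is None:
        raise ValueError(f"no version found in {output!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets(found: tuple[int, ...], minimum: tuple[int, ...]) -> bool:
    """Compare versions component-wise, padding the shorter with zeros."""
    width = max(len(found), len(minimum))
    found = found + (0,) * (width - len(found))
    minimum = minimum + (0,) * (width - len(minimum))
    return found >= minimum


def check_tools() -> dict[str, str]:
    """Return a status per tool: 'ok', 'too old', or 'missing'."""
    report = {}
    for tool, minimum in REQUIRED.items():
        if shutil.which(tool) is None:
            report[tool] = "missing"
            continue
        out = subprocess.run([tool, "--version"], capture_output=True, text=True)
        found = parse_version(out.stdout or out.stderr)
        report[tool] = "ok" if meets(found, minimum) else "too old"
    return report
```

Calling check_tools() returns a dictionary keyed by tool name, which can be printed before running the walkthrough.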

Full Walkthrough

  1. Setup: Run gmake setup, then set GOOGLE_API_KEY in apps/agent-gateway/.env.
  2. Launch: gmake dev (starts both the Next.js frontend and Python gateway).
  3. Open: Navigate to http://localhost:3000/projects/sal-agent.
  4. Get Started: Click on the "Get Started" button and tell Sal about what you're selling.
  5. Select: Pick the "Discovery Call" from the list.
  6. Start: Choose one of the live chat options, "Audio Only" or "Video", and click "Start Live Assist".
  7. Verify Proactive Capture:
    • Speak into your mic: "Hey Sal, remind me to follow up with the client next Tuesday."
    • Observe the "Action Items" panel. Sal will autonomously add the task to Firestore.
  8. Mute Sal & Toggle the Picture-in-Picture Display: Mute Sal so he listens in the background, and toggle the picture-in-picture display to see his guidance in real time.
  9. Join a live meeting: Tell Sal that you're about to hop on the call, and work through your action items during the call. Watch Sal mark each action item complete, and add new ones as things come up.
  10. Post-call: After the call, ask Sal for a call debrief.
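
If the live session fails to start, a missing key from step 1 is a likely cause. The sketch below checks that a .env file defines a non-empty GOOGLE_API_KEY. It is a hypothetical helper, not part of the repository, and it handles only simple KEY=VALUE lines rather than the full dotenv syntax.

```python
from pathlib import Path


def read_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and one layer of surrounding quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_api_key(env_path: Path) -> bool:
    """True when the file exists and defines a non-empty GOOGLE_API_KEY."""
    if not env_path.is_file():
        return False
    return bool(read_env(env_path.read_text()).get("GOOGLE_API_KEY"))
```

For example, has_api_key(Path("apps/agent-gateway/.env")) can be run from the repository root before gmake dev.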

Technical Setup

Local Install

gmake setup

Validation

gmake lint
gmake build

Run

gmake dev

IMPORTANT: The web app is served under the route /projects/sal-agent, not at the site root. In local development, open http://localhost:3000/projects/sal-agent.


Google Cloud Deployment

Production deployment is managed via Skaffold and Terraform in us-east1.

  • Compute: Cloud Run services behind a Global Load Balancer.
  • Security: Identity-Aware Proxy (IAP) for authenticated access.
  • Persistence: Firestore (Native Mode).
  • Secrets: Google Secret Manager.

High-level deploy flow:

gmake auth
gmake deploy-prod

Appendix

Repository Structure

/
├── apps/
│   ├── web/                # Next.js 15 web application
│   └── agent-gateway/      # Python realtime gateway app
├── core/                   # Shared UI components and TS/Lint configs
├── protos/                 # Protobuf definitions and generated SDKs
├── deployments/            # Skaffold and Kubernetes manifests
└── infra/                  # Terraform stacks for Google Cloud

License

MIT
