alexechoi/deepmind-vercel-hack

Shadow

Try it out: https://yourshadow.app

An ambient AI companion with a desktop pet interface, built for the Zero to Agent Hack (DeepMind x Vercel).

Shadow is a pixel-art cat that lives on your screen, listens via voice or text, and orchestrates a suite of AI agents to take real actions on your behalf: sending Slack messages, generating infographics, building and deploying landing pages, and creating audio content.

Architecture

frontend/
├── app/
│   ├── api/agents/        # Agent API routes (intent router, build, design, audio, slack)
│   ├── auth/              # Supabase auth callback
│   ├── login/             # Login page
│   ├── signup/            # Signup page
│   ├── settings/          # User settings (Slack webhook, display name)
│   └── page.tsx           # Main Shadow experience
├── components/
│   ├── Shadow.tsx           # Desktop pet (pixel-art cat with state-driven animation)
│   ├── ShadowExperience.tsx # Main UI wrapper (speech bubble, quick actions, input)
│   ├── SettingsForm.tsx     # User settings form
│   ├── TranscriptDisplay.tsx
│   └── ResultCard.tsx
├── hooks/
│   └── useGeminiLive.ts   # Real-time voice chat via Gemini Live API
├── lib/
│   ├── agents/router.ts   # Intent routing to agent endpoints
│   ├── gemini-live/       # Audio pipeline (recorder, streamer, worklets)
│   ├── supabase/          # Auth client (browser + server)
│   └── templates/         # HTML templates for generated landing pages
├── scripts/
│   ├── setup-db.sql       # Supabase profiles table + RLS policies
│   └── generate-voices.ts # Generate MP3 clips via ElevenLabs
└── public/
    ├── audio/             # Sound effects (sent-message, on-it, etc.)
    └── shadow/            # Pet sprite assets (webp, wav, png)

Tech Stack

| Layer | Technology |
| --- | --- |
| Framework | Next.js 16, React 19 |
| Styling | Tailwind CSS v4 |
| AI | Google Gemini (text + image generation, native audio) |
| Voice | Gemini Live (real-time streaming), ElevenLabs (TTS / jingles) |
| Auth | Supabase (email/password, SSR) |
| Deployment | Vercel (hosting + programmatic deployments via API) |
| Messaging | Slack (incoming webhooks) |

Sponsor Integrations

Vercel

  • Hosting & Deployment — The entire Next.js 16 app is deployed on Vercel.
  • AI SDK (ai + @ai-sdk/google) — Powers all server-side agent logic via generateText and generateObject for intent detection, content generation, and message drafting.
  • Deployments API — The Build agent programmatically creates and deploys landing pages by calling POST /v13/deployments, then configures project settings via the Vercel REST API — users get a live URL in seconds.
  • Blob Storage — Template assets (fonts, images) are served from vercel-storage.com.
  • Analytics — Generated landing pages include @vercel/analytics for page-view tracking out of the box.
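
As a rough illustration of the Deployments API call described above, the sketch below builds (but does not send) a POST /v13/deployments request that uploads a single inline HTML file. The helper name buildDeploymentRequest, the project name, and the decision to inline the file are assumptions made for this example; the Build agent in this repository may assemble its request differently.

```typescript
// Hypothetical helper: describes a Vercel "create deployment" request without sending it.
interface DeploymentRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildDeploymentRequest(
  token: string,
  projectName: string,
  html: string,
  teamId?: string,
): DeploymentRequestSpec {
  // Team-scoped requests pass the team as a query parameter.
  const query = teamId ? `?teamId=${encodeURIComponent(teamId)}` : "";
  return {
    url: `https://api.vercel.com/v13/deployments${query}`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      name: projectName,
      target: "production",
      // Inline upload: each entry is a file path plus its contents.
      files: [{ file: "index.html", data: html }],
      // A static page needs no framework preset.
      projectSettings: { framework: null },
    }),
  };
}
```

A caller would pass the result to fetch (for example `fetch(spec.url, spec)`) with VERCEL_API_TOKEN and, if set, VERCEL_TEAM_ID; the JSON response is expected to include the deployment URL.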

Google DeepMind (Gemini)

  • Gemini Live API (gemini-2.5-flash-native-audio-latest) — Real-time bidirectional voice chat between the user and Shadow. Raw PCM audio streams in (16 kHz) and out (24 kHz) over WebSocket with live transcription.
  • Gemini 2.5 Flash — Text generation for Slack message drafting, intent detection (generateObject with Zod schemas), and landing-page content creation.
  • Gemini Image Generation (gemini-2.5-flash-image) — The Design agent generates infographics on demand via Gemini's native image output.
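
To make the raw PCM audio path more concrete, here is a minimal sketch of the conversion a recorder typically performs before streaming: the Web Audio API produces 32-bit float samples in [-1, 1], while the stream carries 16-bit PCM. The function names are illustrative and are not taken from lib/gemini-live/; the 16-bit sample width is an assumption based on the Live API's documented audio format.

```typescript
const INPUT_SAMPLE_RATE = 16000; // Hz, microphone audio sent to Gemini Live
const OUTPUT_SAMPLE_RATE = 24000; // Hz, audio received from Gemini Live

// Convert Web Audio float samples in [-1, 1] to signed 16-bit PCM.
function floatTo16BitPcm(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);
  samples.forEach((sample, i) => {
    // Clamp first so out-of-range samples saturate instead of wrapping.
    const clamped = Math.max(-1, Math.min(1, sample));
    // Negative and positive ranges differ by one step in two's complement.
    pcm[i] = Math.round(clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff);
  });
  return pcm;
}

// Duration in milliseconds of a mono 16-bit PCM chunk (2 bytes per sample).
function pcmDurationMs(byteLength: number, sampleRate: number): number {
  return (byteLength / 2 / sampleRate) * 1000;
}
```

At these rates, one second of mono audio is 32,000 bytes on the way in and 48,000 bytes on the way out, which is useful when sizing chunks and playback buffers.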

Supabase

  • Auth — Email/password authentication with SSR support (@supabase/ssr). Middleware refreshes sessions and protects routes.
  • Database — A profiles table stores per-user settings (display name, Slack webhook URL, default channel).
  • Row-Level Security — RLS policies ensure users can only read and write their own profile data.
  • User Metadata — Slack configuration is persisted in user_metadata via supabase.auth.updateUser(), making it available across sessions.
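
The user-metadata step could look roughly like the sketch below, which builds the argument for supabase.auth.updateUser(); values under `data` are stored in user_metadata. The SlackSettings shape and the metadata key names (slack_webhook_url, slack_default_channel) are hypothetical and may not match the keys this project actually writes.

```typescript
// Hypothetical settings shape; field names are illustrative only.
interface SlackSettings {
  webhookUrl: string;
  defaultChannel?: string;
}

// Build the attributes object to pass to supabase.auth.updateUser().
function buildSlackMetadataUpdate(
  settings: SlackSettings,
): { data: Record<string, string> } {
  // Reject anything that is not an https URL before persisting it.
  if (!/^https:\/\//.test(settings.webhookUrl)) {
    throw new Error("Slack webhook URL must start with https://");
  }
  const data: Record<string, string> = {
    slack_webhook_url: settings.webhookUrl, // assumed key name
  };
  if (settings.defaultChannel) {
    data.slack_default_channel = settings.defaultChannel; // assumed key name
  }
  return { data };
}
```

With a Supabase client in scope, the call would then be `await supabase.auth.updateUser(buildSlackMetadataUpdate(settings))`, followed by a check of the returned `error`.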

ElevenLabs

  • Text-to-Speech — The Audio agent calls the ElevenLabs TTS API (eleven_flash_v2_5 model) to generate voice clips from text.
  • Sound Generation — Short jingles and audio effects are created via the /v1/sound-generation endpoint.
  • UI Sound Effects — A build script (scripts/generate-voices.ts) pre-generates MP3 clips ("on it", "done", "sent message", etc.) using the same voice for a consistent personality.
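
For orientation, the two ElevenLabs calls mentioned above can be described as plain request objects, as in this sketch. The helper names and the voice ID are placeholders; only the endpoint paths, the xi-api-key header, and the eleven_flash_v2_5 model ID come from the API and this README, and the Audio agent's real request options may differ.

```typescript
interface ElevenLabsRequestSpec {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

const ELEVENLABS_BASE_URL = "https://api.elevenlabs.io";

// Text-to-speech: the voice ID is part of the path, the model goes in the body.
function buildTtsRequest(apiKey: string, voiceId: string, text: string): ElevenLabsRequestSpec {
  return {
    url: `${ELEVENLABS_BASE_URL}/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    method: "POST",
    headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify({ text, model_id: "eleven_flash_v2_5" }),
  };
}

// Sound generation: a text prompt plus a target duration in seconds.
function buildSoundEffectRequest(
  apiKey: string,
  prompt: string,
  durationSeconds: number,
): ElevenLabsRequestSpec {
  return {
    url: `${ELEVENLABS_BASE_URL}/v1/sound-generation`,
    method: "POST",
    headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify({ text: prompt, duration_seconds: durationSeconds }),
  };
}
```

Both endpoints respond with audio bytes, so a caller would read the response as an ArrayBuffer rather than JSON.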

Features

Desktop Pet

A pixel-art cat companion with multiple states — idle roaming, listening, thinking, and replying — with smooth animations and sound effects.
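
One way to model the state-driven animation is a small transition table, sketched below. The four states come from the description above; the event names and the specific transitions are assumptions for illustration and are not taken from Shadow.tsx.

```typescript
type PetState = "idle" | "listening" | "thinking" | "replying";

// Hypothetical events; the real component may use different triggers.
type PetEvent =
  | "micOpened"
  | "messageSubmitted"
  | "replyStarted"
  | "replyFinished"
  | "cancelled";

const PET_TRANSITIONS: Record<PetState, Partial<Record<PetEvent, PetState>>> = {
  idle: { micOpened: "listening", messageSubmitted: "thinking" },
  listening: { messageSubmitted: "thinking", cancelled: "idle" },
  thinking: { replyStarted: "replying", cancelled: "idle" },
  // Opening the mic mid-reply lets the user interrupt.
  replying: { replyFinished: "idle", micOpened: "listening" },
};

// Events that are not valid in the current state leave it unchanged.
function nextPetState(state: PetState, event: PetEvent): PetState {
  return PET_TRANSITIONS[state][event] ?? state;
}
```

Keeping transitions in a table means the sprite and sound effect for each state can be looked up from a single value, and unexpected events cannot put the pet into an undefined state.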

Voice Chat

Real-time bidirectional audio via the Gemini Live API, using the native-audio model listed under Sponsor Integrations. Supports both microphone input and text messages.

Agent Actions

Shadow can execute actions through a set of specialized agents, routed by intent:

  • Slack — Send messages to Slack channels via webhooks
  • Build — Generate and deploy landing pages to Vercel (waitlist, pricing, launch templates)
  • Design — Create infographics using Gemini's image generation
  • Audio — Generate text-to-speech and jingles via ElevenLabs
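
The routing step can be sketched as follows: validate the structured intent returned by the model, then map it to an agent endpoint. The hand-written guard here stands in for the Zod schema used with generateObject, and the /api/agents/<name> paths are an assumption based on the directory layout; the actual logic in lib/agents/router.ts may differ.

```typescript
const AGENT_INTENTS = ["slack", "build", "design", "audio"] as const;
type AgentIntent = (typeof AGENT_INTENTS)[number];

// Assumed endpoint layout under app/api/agents/; real route names may differ.
function endpointForIntent(intent: AgentIntent): string {
  return `/api/agents/${intent}`;
}

// Minimal guard standing in for schema validation of the model output.
function parseIntent(raw: unknown): AgentIntent | null {
  if (typeof raw !== "object" || raw === null) return null;
  const value = (raw as { intent?: unknown }).intent;
  for (const candidate of AGENT_INTENTS) {
    if (candidate === value) return candidate;
  }
  return null;
}

// Returns the endpoint to call, or null when no agent should run
// (for example, ordinary conversation).
function routeIntent(raw: unknown): string | null {
  const intent = parseIntent(raw);
  return intent === null ? null : endpointForIntent(intent);
}
```

Returning null for unrecognized intents keeps plain chat from triggering an action by accident.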

Auth & Settings

Supabase-powered authentication with per-user settings for Slack webhook URL, display name, and default channel.
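
As a sketch of how these settings might feed the Slack agent, the helpers below pick a webhook URL and build the JSON payload that Slack incoming webhooks accept. The profile field names, the fallback to a default webhook such as SLACK_WEBHOOK_URL, and the message prefix are all assumptions for illustration.

```typescript
// Hypothetical profile shape; column names in the profiles table may differ.
interface UserSlackProfile {
  slackWebhookUrl?: string | null;
  displayName?: string | null;
}

// Prefer the user's own webhook, then the app-wide default, else none.
function resolveSlackWebhook(
  profile: UserSlackProfile,
  fallbackUrl?: string,
): string | null {
  return profile.slackWebhookUrl || fallbackUrl || null;
}

// Incoming webhooks accept a JSON body with a `text` field.
function buildSlackMessage(profile: UserSlackProfile, text: string): { text: string } {
  const prefix = profile.displayName ? `${profile.displayName} (via Shadow): ` : "";
  return { text: prefix + text };
}
```

The payload would be sent with a POST to the resolved URL, using `JSON.stringify` on the message and a Content-Type of application/json.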

Getting Started

Prerequisites

  • Node.js 20.9+ (the minimum supported by Next.js 16)
  • A Supabase project
  • A Google AI API key
  • (Optional) ElevenLabs API key for audio generation
  • (Optional) Vercel API token for page deployments
  • (Optional) Slack incoming webhook URL

Setup

  1. Clone the repository

    git clone https://github.com/alexechoi/deepmind-vercel-hack.git
    cd deepmind-vercel-hack/frontend

  2. Install dependencies

    npm install

  3. Configure environment variables

    Copy .env.local.example to .env.local, or create frontend/.env.local by hand, with the following:

    # Gemini
    GOOGLE_GENERATIVE_AI_API_KEY=       # Server-side (AI SDK)
    NEXT_PUBLIC_GEMINI_API_KEY=         # Client-side (Gemini Live)
    
    # Supabase
    NEXT_PUBLIC_SUPABASE_URL=
    NEXT_PUBLIC_SUPABASE_ANON_KEY=
    SUPABASE_SERVICE_ROLE_KEY=
    
    # App
    NEXT_PUBLIC_APP_URL=http://localhost:3000
    
    # Optional
    ELEVENLABS_API_KEY=                 # Audio generation
    VERCEL_API_TOKEN=                   # Page deployments
    VERCEL_TEAM_ID=                     # Vercel team (if applicable)
    SLACK_WEBHOOK_URL=                  # Default Slack webhook

  4. Set up the database

    Run scripts/setup-db.sql in your Supabase SQL editor to create the profiles table and RLS policies.

  5. Start the dev server

    npm run dev

    Open http://localhost:3000.
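
If the app fails to start, a quick check of the environment can help. The sketch below reports which variables from step 3 are unset; treating every variable outside the "Optional" group as required is an assumption based on how the list above is organized, and the checkEnv helper is not part of the repository.

```typescript
const REQUIRED_ENV = [
  "GOOGLE_GENERATIVE_AI_API_KEY",
  "NEXT_PUBLIC_GEMINI_API_KEY",
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
  "SUPABASE_SERVICE_ROLE_KEY",
  "NEXT_PUBLIC_APP_URL",
];

// Optional variables and the feature each one enables.
const OPTIONAL_ENV = [
  { key: "ELEVENLABS_API_KEY", feature: "audio generation" },
  { key: "VERCEL_API_TOKEN", feature: "page deployments" },
  { key: "SLACK_WEBHOOK_URL", feature: "default Slack webhook" },
];

interface EnvReport {
  missing: string[]; // required variables that are unset or blank
  disabledFeatures: string[]; // features whose optional variable is unset
}

function checkEnv(env: Record<string, string | undefined>): EnvReport {
  const isSet = (key: string): boolean => (env[key] ?? "").trim() !== "";
  return {
    missing: REQUIRED_ENV.filter((key) => !isSet(key)),
    disabledFeatures: OPTIONAL_ENV.filter((entry) => !isSet(entry.key)).map(
      (entry) => entry.feature,
    ),
  };
}
```

In a Node script this could be called as `checkEnv(process.env)` after loading .env.local; an empty `missing` array means the required configuration is present.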

Available Scripts

| Script | Description |
| --- | --- |
| npm run dev | Start the development server |
| npm run build | Create a production build |
| npm run start | Run the production build |
| npm run lint | Run ESLint |

License

Apache 2.0 — see LICENSE for details.
