Status: Alpha / Active Development
Stack: Next.js, Cloud Firestore, Gemini Multimodal Live via Vertex AI (WebSockets)
VantAIge is an AI-powered "Marketing Director" that combines real-time situational awareness (vision/audio) with a deep, persistent memory of brand identity (Vibe Profiles).
| Requirement | Details |
|---|---|
| 🔗 Public Code Repository | https://github.com/dexkcd/vantaige |
| 🖥️ Spin-Up Instructions | See Quick Start below |
| ☁️ Proof of Google Cloud Deployment | See GCP Proof below |
| 🏗️ Architecture Diagram | See Architecture Diagram below (Pro tip: Export or screenshot this for the submission image carousel) |
Reproducible local setup for judges:
- Node.js 18+
- Python 3.10+ (3.12+ recommended)
- gcloud CLI, authenticated with gcloud auth application-default login
- Google Cloud project with billing enabled; APIs: aiplatform.googleapis.com, Firestore
git clone https://github.com/dexkcd/vantaige.git
cd vantaige
npm install

Create .env.local in the project root:
GOOGLE_CLOUD_PROJECT=your_gcp_project_id
GOOGLE_CLOUD_LOCATION=us-central1
NEXT_PUBLIC_WS_URL=ws://localhost:8000/ws

Set up and start the Python WebSocket bridge:
cd backend
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
uvicorn main:app --reload --host 0.0.0.0 --port 8000

In a second terminal, from the project root, start the Next.js app:
npm run dev

Visit http://localhost:3000. Choose New Session or Continue Session (with passcode), then Start conversation.
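The .env.local values from the setup above can be sanity-checked before launching either server. The following is a minimal sketch and is not part of the repository: the helper names `parse_env` and `missing_keys` are hypothetical, and it assumes the file uses plain KEY=VALUE lines.

```python
from pathlib import Path

# Variables named in the .env.local example above.
REQUIRED_KEYS = ("GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION", "NEXT_PUBLIC_WS_URL")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env.local")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(parse_env(text))
    print("Missing:", ", ".join(missing) if missing else "none")
```

Run it from the project root; any key it reports as missing should be added to .env.local before starting the backend or the Next.js app.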
VantAIge runs on Google Cloud Run and uses multiple GCP services. Code references:
| GCP Service | Code Location | Purpose |
|---|---|---|
| Vertex AI (Gemini Live) | backend/main.py | WebSocket bridge to Gemini Multimodal Live API |
| Vertex AI (Imagen, Veo) | src/app/actions/memory.ts | Image generation (Imagen 4.0), video generation (Veo 3.1) |
| Cloud Firestore | src/lib/firestore.ts | Vibe profiles, session logs, brand assets, marketing plans |
| Firebase Storage | src/lib/storage.ts | Brand asset images, Veo video output |
| Cloud Run | scripts/deploy-cloud-run.sh | Deployment script for both services |
Optional proof: A short screen recording of the app running on GCP (e.g. Cloud Run console, logs) can be included in your submission.
flowchart TB
subgraph Client["🖥️ Frontend (Next.js)"]
UI[Studio UI]
Canvas[Webcam + Screen Capture]
Mic[Microphone]
end
subgraph GCP["☁️ Google Cloud"]
subgraph CloudRun["Cloud Run"]
WS[Python WebSocket Bridge<br/>backend/main.py]
Next[Next.js App]
end
subgraph VertexAI["Vertex AI"]
Live[Gemini Multimodal Live API]
Imagen[Imagen 4.0]
Veo[Veo 3.1]
end
Firestore[(Cloud Firestore)]
Storage[Firebase Storage]
end
Canvas --> WS
Mic --> WS
WS <-->|WebSocket| Live
UI --> Next
Next -->|REST| Imagen
Next -->|REST| Veo
Next --> Firestore
Next --> Storage
Live -.->|tool calls| Next
Pro tip: Screenshot this diagram or export it as an image for the submission file upload / image carousel.
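The bidirectional link between the client streams and the live model in the diagram can be illustrated with a small relay pattern. This is only a sketch of the general idea, not the code in backend/main.py: it uses in-memory `asyncio.Queue` objects as stand-ins for the two WebSocket connections, and the names `pump`, `bridge`, and `demo` are hypothetical.

```python
import asyncio


async def pump(source: asyncio.Queue, sink: asyncio.Queue, label: str) -> None:
    """Forward messages from source to sink until a None sentinel arrives."""
    while True:
        message = await source.get()
        if message is None:
            await sink.put(None)  # propagate shutdown to the receiving side
            return
        await sink.put((label, message))


async def bridge(
    client_in: asyncio.Queue,
    model_out: asyncio.Queue,
    model_in: asyncio.Queue,
    client_out: asyncio.Queue,
) -> None:
    """Run both directions concurrently so neither side blocks the other."""
    await asyncio.gather(
        pump(client_in, model_out, "client->model"),
        pump(model_in, client_out, "model->client"),
    )


async def demo() -> list:
    """Push a few fake messages through the bridge and collect the upstream side."""
    client_in, model_out, model_in, client_out = (asyncio.Queue() for _ in range(4))
    for item in ("audio-chunk", "video-frame", None):
        client_in.put_nowait(item)
    for item in ("audio-reply", None):
        model_in.put_nowait(item)
    await bridge(client_in, model_out, model_in, client_out)
    received = []
    while not model_out.empty():
        received.append(model_out.get_nowait())
    return received


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Running the two pumps as separate tasks is what lets model output keep flowing while the client is still sending frames and audio; a real bridge would replace the queues with socket reads and writes.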
Transition AI from a "chat box" to a proactive partner that understands physical products, digital designs, and brand DNA simultaneously.
- Dual-Stream Input (Webcam + Screen-Share)
- Camera lens toggle (front ↔ back) with live restart
- Low-Latency Compositor (1 FPS Gemini Stream)
- Proactive Visual Auditing (Interruption logic for brand mismatches)
- True Barge-in Support (VAD-driven buffer clearing)
- Affective Intelligence (Tone detection & adaptation)
- Zero-Latency Hand-off (Syncing UI assets with voice)
- Vibe Profile: Persistent brand DNA storage (Firestore)
- Session Continuity: Recalling past decisions & palettes
- Real-time Updates: upsert_vibe_profile mid-call
- Cross-session trend analysis (Gemini analysis over session logs + roadmap)
- Nano Banana Asset Gen: Image generation during live calls (Imagen 4.0 via Vertex AI)
- Short-Form Video: TikTok/YouTube Shorts (9:16) via Veo 3.1, with a daily limit of 5 videos or 30 seconds total per brand (UTC day) to control generation costs
- Launch Pack Sidebar: Pinned assets & copy for review
- Strategic Grounding: Google Search integration for trends
- Blog Post Generation: Long-form blog articles (Markdown or HTML) generated from live sessions and the Vibe Profile, auto-pinned in the Launch Pack Review tab so they can be copy-pasted into Medium, Ghost, Substack, or custom blogs.
- Kanban Bridge: Turning ideas into "Draft Plans"
- Roadmap Task Detail View with modal, image/video posts, status workflow
- Session Management: New session creates passcode; Continue session restores by passcode
- Mobile-first Studio shell: Responsive header/actions, flexible panel heights
- GTM Strategy: Define a Go-To-Market strategy with phases (e.g. Awareness, Launch). AI or user can create/update the strategy; tasks and assets can be assigned to phases and viewed in a phase-grouped Strategy Flow.
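The Short-Form Video item above mentions a daily cap of 5 videos or 30 seconds per brand per UTC day. A check of that kind could look roughly like the sketch below. It is an illustration under stated assumptions, not the logic in src/app/actions/memory.ts: the names `DailyUsage` and `try_reserve` are hypothetical, usage is held in memory rather than Firestore, and it assumes a request is rejected as soon as either limit would be exceeded.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

MAX_VIDEOS_PER_DAY = 5     # from the feature description
MAX_SECONDS_PER_DAY = 30   # from the feature description


@dataclass
class DailyUsage:
    """Per-brand counters for a single UTC day (hypothetical shape)."""
    day: str = ""      # UTC date key such as "2026-03-07"
    videos: int = 0
    seconds: int = 0


def utc_day(now: datetime) -> str:
    """Return the UTC calendar date for a timezone-aware datetime."""
    return now.astimezone(timezone.utc).date().isoformat()


def try_reserve(usage: DailyUsage, duration_s: int, now: datetime) -> bool:
    """Record one video of duration_s seconds if both limits still allow it."""
    today = utc_day(now)
    if usage.day != today:
        # A new UTC day has started, so reset the counters.
        usage.day, usage.videos, usage.seconds = today, 0, 0
    if usage.videos + 1 > MAX_VIDEOS_PER_DAY:
        return False
    if usage.seconds + duration_s > MAX_SECONDS_PER_DAY:
        return False
    usage.videos += 1
    usage.seconds += duration_s
    return True


if __name__ == "__main__":
    usage = DailyUsage()
    now = datetime.now(timezone.utc)
    for _ in range(5):
        print(try_reserve(usage, 8, now), usage)
```

With persistent storage, the read, check, and update would likely need to run inside a single transaction so that concurrent requests for the same brand cannot both pass the check.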
- PROJECT.md — Full technical architecture, deployment, IAM, and extension guide
- backend/README.md — Python WebSocket bridge setup
- AGENTS.md — Coding standards and workflow details
Last Updated: 2026-03-07