AI-powered system that transforms static pitch decks into dynamic, interactive presentations.
Traditional presentations are static and force founders to follow a rigid narrative. Investors often want to drill deeper into specific aspects such as market size, technology, or financials, but switching slides or answering verbally breaks the flow.
Presentation Agent solves this by turning a pitch deck into a live AI presentation system.
Users upload:
- A Brand Guide
- Supporting technical documents
- Knowledge base PDFs
The system builds a RAG-powered knowledge layer and generates slides dynamically in response to questions.
The presenter can then navigate the deck in a choose-your-own-adventure style: the audience asks questions, and the system generates new slides, visuals, and narration in real time.
- Interactive RAG: Conversations with your pitch deck. Ask a question, and the agent regenerates slides and audio on the fly.
- Brand Consistency: Upload a Brand Guide (PDF) to ensure all generated content aligns with your visual identity.
- Deep Knowledge: Ingest technical documents (PDFs) into a local Chroma vector store for accurate, grounded responses.
- Multimodal Output: Generates rich React-based slides, speaker notes, and high-quality Text-to-Speech (TTS) audio.
- Premium UI: Sleek, glassmorphic design system with smooth Framer Motion transitions.
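As a rough illustration of the Brand Consistency feature, the sketch below pulls a color palette out of text that has already been extracted from a Brand Guide PDF. It is an assumption about one plausible approach, not the project's actual analysis step: the `extract_palette` helper and the sample text are hypothetical, and it only recognizes six-digit hex colors.

```python
import re
from collections import Counter

# Matches six-digit hex colors such as #1A2B3C (assumed brand-guide notation).
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}\b")


def extract_palette(guide_text: str, limit: int = 5) -> list[str]:
    """Return the most frequently mentioned hex colors, most common first."""
    counts = Counter(match.upper() for match in HEX_COLOR.findall(guide_text))
    return [color for color, _ in counts.most_common(limit)]


# Invented sample text standing in for extracted brand-guide content.
sample = "Primary: #1a2b3c. Accent: #FF8800. Headings use #1A2B3C on white (#FFFFFF)."
palette = extract_palette(sample)
print(palette)
```

Ranking by frequency is a simple heuristic for guessing which color is primary; a real brand guide may need layout-aware parsing instead.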
- Backend: Python 3.11+, FastAPI, uv
- Frontend: React 18, Vite, TypeScript, Framer Motion
- AI/LLM: Google Gemini (via Google Generative AI API)
- Database/Storage: ChromaDB (Vector Store), Redis (Session Cache)
- Deployment: Docker, Docker Compose, Nginx
The system follows a modular AI-agent architecture built around Retrieval-Augmented Generation (RAG), allowing presentation content to be dynamically generated based on user questions and retrieved knowledge.
Users begin a session from the landing page.
Users upload a brand guide and supporting PDFs that define both style and content.
The system analyzes the brand guide to infer colors, design language, and presentation tone.
Documents are processed into chunks and stored in the retrieval layer for grounded generation.
The presenter asks a natural-language question to drill deeper into a topic.
The system produces dynamic slides tailored to the question while preserving brand consistency.
Generated material can be reviewed and extended through follow-up questions.
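To make the chunking step in this workflow more concrete, here is a minimal sketch that splits extracted document text into overlapping word windows. Fixed-size windows are used here as a simpler stand-in for the semantic chunking the project describes; the `chunk_text` helper and its default sizes are hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


demo = chunk_text("0 1 2 3 4 5 6 7 8 9", chunk_size=4, overlap=1)
print(demo)
```

The overlap keeps a sentence that straddles a boundary retrievable from either neighboring chunk, at the cost of storing some words twice.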
The fastest way to get the project running is using Docker Compose.
- Docker & Docker Compose
- A valid Gemini API Key (Get one at Google AI Studio)
Copy the example environment file and add your API key:
cp .env.docker.example .env
Edit .env and set:
GEMINI_API_KEY=your_key_here
docker compose up --build
Once the containers are healthy, access the application at:
- Frontend: http://localhost:8080
- API Health Check: http://localhost:8080/api/health
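If you want to script the wait for the stack to come up, a small poller along these lines may help. It is a sketch that only checks for an HTTP 200 status, since the response body of the health endpoint is not documented here; the `wait_until_healthy` helper and its `opener` parameter are hypothetical.

```python
import time
import urllib.request


def wait_until_healthy(url: str, attempts: int = 10, delay: float = 2.0,
                       opener=urllib.request.urlopen) -> bool:
    """Poll `url` until it answers with HTTP 200 or the attempts run out."""
    for attempt in range(attempts):
        try:
            with opener(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except OSError:
            # URLError and HTTPError are subclasses of OSError, so this covers
            # refused connections, timeouts, and non-2xx responses.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

After `docker compose up --build`, calling `wait_until_healthy("http://localhost:8080/api/health")` returns `True` once the API responds. The `opener` argument exists only so the function can be exercised without a running server.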
If you prefer to run the services individually without Docker:
cd backend
curl -LsSf https://astral.sh/uv/install.sh | sh # Install uv if you don't have it
uv sync
cp ../.env.example .env
# Edit .env and set GEMINI_API_KEY, REDIS_HOST=localhost
uv run uvicorn src.main:app --reload
Note: Requires a running Redis instance on localhost:6379.
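A common failure in manual setups is starting the backend with the placeholder API key still in place. The sketch below shows one way to validate the environment up front. It is not the project's actual configuration code: the `Settings` class and `load_settings` helper are hypothetical, and `REDIS_PORT` is an assumed variable name (only `GEMINI_API_KEY`, `REDIS_HOST`, and port 6379 appear in this README).

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    redis_host: str
    redis_port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read backend settings from the environment and fail fast on a missing key."""
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key or api_key == "your_key_here":
        raise RuntimeError("GEMINI_API_KEY is missing or still set to the placeholder")
    return Settings(
        gemini_api_key=api_key,
        redis_host=env.get("REDIS_HOST", "localhost"),
        # REDIS_PORT is an assumed name; 6379 is the port mentioned above.
        redis_port=int(env.get("REDIS_PORT", "6379")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to check in isolation.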
cd frontend
npm install
npm run dev
The frontend will be available at http://localhost:5173.
Presentation Agent can be deployed to any cloud provider that supports Docker.
- Clone the repository to your server.
- Follow the Quick Start with Docker steps.
- Use a reverse proxy (like the included Nginx setup) to handle SSL/TLS.
- Link your GitHub repository.
- Set the root directory for the build.
- Configure the environment variables (secrets) in the provider's dashboard.
- Most platforms will automatically detect the docker-compose.yml or the individual Dockerfiles.
Presentation Agent integrates Google Gemini models to power its generation and multimodal capabilities.
Gemini 2.5 Flash
- Natural language reasoning
- Intent classification
- Slide content generation
- Question answering over retrieved context
Gemini 2.5 Flash Preview TTS
- Generates narration audio for presentation slides
- Enables multimodal presentation output
The system uses a Retrieval-Augmented Generation (RAG) pipeline:
- Embedding Model: all-MiniLM-L6-v2 (SentenceTransformers)
- Vector Database: ChromaDB
- Document Processing: PDF ingestion and semantic chunking
At ingestion time:
- Documents are embedded and stored in ChromaDB.
When a user asks a question:
- Relevant context is retrieved using semantic search.
- The context is sent to Gemini 2.5 Flash.
- Gemini generates structured presentation content.
- The system optionally generates narration using Gemini TTS.
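The retrieval and prompt-assembly steps above can be sketched in plain Python. To keep the example dependency-free, a toy bag-of-words vector with cosine similarity stands in for the all-MiniLM-L6-v2 embeddings and the ChromaDB query, and the model call itself is left out. All function names, the prompt wording, and the sample chunks are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda chunk: cosine(query, embed(chunk)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble a grounded prompt from the retrieved context (wording is hypothetical)."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return (
        "Answer using only this context:\n"
        f"{joined}\n\n"
        f"Question: {question}\n"
        "Return the slide content as JSON."
    )


# Invented sample chunks, standing in for ingested document text.
chunks = [
    "The total addressable market is estimated at 4 billion dollars.",
    "Our stack uses FastAPI and React.",
    "Revenue grew 30 percent year over year.",
]
question = "How large is the market?"
context = retrieve(question, chunks, top_k=1)
prompt = build_prompt(question, context)
print(prompt)
```

In the real pipeline the resulting prompt would go to Gemini 2.5 Flash; restricting the model to retrieved context is what keeps generated slides grounded in the uploaded documents.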
This architecture enables real-time interactive presentations powered by AI.
Watch the full demonstration here:
The demo shows:
- Uploading brand & knowledge documents
- Running the ingestion pipeline
- Asking questions about the pitch
- Dynamic slide generation
- Real-time multimodal presentation output
presentation-agent/
├── backend/ # FastAPI App, ChromaDB, Gemini Logic
│ ├── src/ # Source code
│ ├── scripts/ # Utility scripts (Chroma inspection, etc.)
│ └── tests/ # Comprehensive Test Suite
├── frontend/ # React + TypeScript Web App
│ ├── src/ # UI Components & Branding Logic
│ └── public/ # Static Assets
├── docker-compose.yml # Full stack orchestration
└── specs/ # Architecture and Design Specs
# Backend Tests
cd backend
uv run python -m pytest
# Frontend Tests
cd frontend
npm test
Distributed under the MIT License. See LICENSE for more information.








