Presentation Agent

AI-powered system that transforms static pitch decks into dynamic, interactive presentations.

Traditional presentations are static and force founders to follow a rigid narrative. Investors often want to drill deeper into specific aspects such as market size, technology, or financials, but switching slides or answering verbally breaks the flow.

Presentation Agent solves this by turning a pitch deck into a live AI presentation system.

Users upload:

A Brand Guide
Supporting technical documents
Knowledge base PDFs

The system builds a RAG-powered knowledge layer and generates slides dynamically in response to questions.

This lets the presenter navigate the deck in a choose-your-own-adventure format: the audience asks questions, and the system generates new slides, visuals, and narration in real time.

Key Features

Interactive RAG: Conversations with your pitch deck. Ask a question, and the agent regenerates slides and audio on the fly.
Brand Consistency: Upload a Brand Guide (PDF) to ensure all generated content aligns with your visual identity.
Deep Knowledge: Ingest technical documents (PDFs) into a local Chroma vector store for accurate, grounded responses.
Multimodal Output: Generates rich React-based slides, speaker notes, and high-quality Text-to-Speech (TTS) audio.
Premium UI: Sleek, glassmorphic design system with smooth Framer Motion transitions.

Tech Stack

Backend: Python 3.11+, FastAPI, uv
Frontend: React 18, Vite, TypeScript, Framer Motion
AI/LLM: Google Gemini (via Google Generative AI API)
Database/Storage: ChromaDB (Vector Store), Redis (Session Cache)
Deployment: Docker, Docker Compose, Nginx

Architecture

The system follows a modular AI-agent architecture built around Retrieval-Augmented Generation (RAG), allowing presentation content to be dynamically generated based on user questions and retrieved knowledge.

How It Works

Step 1 — Start from the main interface

Users begin a session from the landing page.

Step 2 — Upload brand and knowledge sources

Users upload a brand guide and supporting PDFs that define both style and content.

Step 3 — Extract brand identity

The system analyzes the brand guide to infer colors, design language, and presentation tone.
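As an illustration of one part of this step, the sketch below pulls hex color codes out of text that has already been extracted from a brand guide. It is not the project's actual extraction logic, which may rely on Gemini rather than pattern matching. The names `BrandIdentity`, `normalize_hex`, and `extract_brand_identity` are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Matches 6-digit or 3-digit hex colors such as #1A2B3C or #fff.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b")


@dataclass
class BrandIdentity:
    """Hypothetical container for values inferred from a brand guide."""
    colors: list[str] = field(default_factory=list)


def normalize_hex(code: str) -> str:
    """Expand shorthand (#abc -> #AABBCC) and upper-case the digits."""
    digits = code.lstrip("#")
    if len(digits) == 3:
        digits = "".join(ch * 2 for ch in digits)
    return "#" + digits.upper()


def extract_brand_identity(guide_text: str) -> BrandIdentity:
    """Collect unique hex colors in order of first appearance."""
    seen: list[str] = []
    for match in HEX_COLOR.findall(guide_text):
        color = normalize_hex(match)
        if color not in seen:
            seen.append(color)
    return BrandIdentity(colors=seen)
```

Design language and tone are harder to capture with patterns, so a real pipeline would likely pass the guide text to the language model for those fields.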

Step 4 — Ingest and index documents

Documents are processed into chunks and stored in the retrieval layer for grounded generation.
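To make the chunking idea concrete, here is a minimal sketch that splits text into fixed-size character windows with overlap. This is a simpler stand-in for the semantic chunking the project describes, and the function name `chunk_text` and its default sizes are assumptions chosen for illustration.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share `overlap` characters so that a sentence
    cut at a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk would then be embedded and written to the vector store along with metadata such as the source file name.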

Step 5 — Ask presentation questions

The presenter asks a natural-language question to drill deeper into a topic.

Step 6 — Generate presentation output

The system produces dynamic slides tailored to the question while preserving brand consistency.
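One way to handle generated output is to ask the model for JSON and parse it into typed objects before rendering. The sketch below assumes a payload shape with a top-level `slides` array whose items carry `title`, `bullets`, and `speaker_notes`. That schema and the names `Slide` and `parse_slides` are hypothetical, not the project's actual contract.

```python
import json
from dataclasses import dataclass


@dataclass
class Slide:
    """Hypothetical slide structure used for illustration only."""
    title: str
    bullets: list[str]
    speaker_notes: str = ""


def parse_slides(raw_json: str) -> list[Slide]:
    """Parse model output into Slide objects.

    Raises json.JSONDecodeError on malformed JSON and KeyError when a
    required field ("slides" or "title") is missing.
    """
    payload = json.loads(raw_json)
    slides: list[Slide] = []
    for item in payload["slides"]:
        slides.append(
            Slide(
                title=str(item["title"]),
                bullets=[str(b) for b in item.get("bullets", [])],
                speaker_notes=str(item.get("speaker_notes", "")),
            )
        )
    return slides
```

Validating the structure before it reaches the frontend keeps a malformed model response from breaking slide rendering.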

Step 7 — Review and continue the interactive flow

Generated material can be reviewed and extended through follow-up questions.

Quick Start with Docker (Recommended)

The fastest way to get the project running is with Docker Compose.

1. Prerequisites

Docker & Docker Compose
A valid Gemini API Key (Get one at Google AI Studio)

2. Environment Setup

Copy the example environment file and add your API key:

cp .env.docker.example .env

Edit .env and set:

GEMINI_API_KEY=your_key_here
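Before starting the containers, it can help to confirm that the key was actually set. The sketch below parses simple KEY=VALUE lines and reports required keys that are missing or still hold the placeholder. The helper names `parse_env` and `missing_keys` are hypothetical, and the parser covers only the basic syntax shown above, not the full dotenv format.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str], required=("GEMINI_API_KEY",)) -> list[str]:
    """Return required keys that are absent, empty, or left as the placeholder."""
    return [k for k in required if not env.get(k) or env[k] == "your_key_here"]


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        problems = missing_keys(parse_env(handle.read()))
    print("Missing or placeholder keys:", problems or "none")
```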

3. Spin Up

docker compose up --build

Once the containers are healthy, access the application at:

Frontend: http://localhost:8080
API Health Check: http://localhost:8080/api/health
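Containers can take a few seconds to become healthy, so a small polling script is a convenient way to wait for the API. The sketch below assumes the health endpoint answers with HTTP 200 when ready, which the source does not state explicitly. The function names are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # The server answered, but with an error status.
    except OSError:
        return 0  # Connection refused, DNS failure, timeout, and similar.


def wait_for_health(
    url: str,
    attempts: int = 10,
    delay: float = 1.0,
    fetch: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the URL until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    ready = wait_for_health("http://localhost:8080/api/health")
    print("API ready" if ready else "API did not become healthy in time")
```

The `fetch` and `sleep` parameters exist only so the loop can be exercised without a running server.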

Local Development (Manual Setup)

If you prefer to run the services individually without Docker:

1. Backend Setup

cd backend
curl -LsSf https://astral.sh/uv/install.sh | sh  # Install uv if you don't have it
uv sync
cp ../.env.example .env
# Edit .env and set GEMINI_API_KEY, REDIS_HOST=localhost
uv run uvicorn src.main:app --reload

Note: The backend requires a running Redis instance on localhost:6379.
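A quick way to check this prerequisite is to test whether anything is listening on the Redis port. The sketch below only confirms that a TCP connection can be opened. It does not verify that the listener is actually Redis, and the helper name `is_port_open` is hypothetical.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_port_open("localhost", 6379):
        print("Something is listening on localhost:6379")
    else:
        print("Nothing is listening on localhost:6379; start Redis first")
```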

2. Frontend Setup

cd frontend
npm install
npm run dev

The frontend will be available at http://localhost:5173.

Cloud Deployment

Deployment Strategy

Presentation Agent can be deployed to any cloud provider that supports Docker.

Option A: VPS (DigitalOcean, AWS EC2, etc.)

Clone the repository to your server.
Follow the Quick Start with Docker steps.
Use a reverse proxy (like the included Nginx setup) to handle SSL/TLS.

Option B: Container Services (Render, Railway, Fly.io)

Link your GitHub repository.
Set the root directory for the build.
Configure the environment variables (secrets) in the provider's dashboard.
Many platforms detect individual Dockerfiles automatically. Support for docker-compose.yml varies by provider, so check your provider's documentation.

Google Cloud AI Usage

Presentation Agent integrates Google Gemini models to power its generation and multimodal capabilities.

AI Models Used

Gemini 2.5 Flash

Natural language reasoning
Intent classification
Slide content generation
Question answering over retrieved context

Gemini 2.5 Flash Preview TTS

Generates narration audio for presentation slides
Enables multimodal presentation output

Retrieval & Knowledge Layer

The system uses a Retrieval-Augmented Generation (RAG) pipeline:

Embedding Model: all-MiniLM-L6-v2 (SentenceTransformers)
Vector Database: ChromaDB
Document Processing: PDF ingestion and semantic chunking

The pipeline works as follows:

During ingestion, documents are embedded and stored in ChromaDB.
When a user asks a question, relevant context is retrieved using semantic search.
The retrieved context is sent to Gemini 2.5 Flash.
Gemini generates structured presentation content.
The system optionally generates narration using Gemini TTS.

This architecture enables real-time interactive presentations powered by AI.
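The retrieval and prompt-assembly steps can be sketched with the standard library alone. In the example below, a bag-of-words vector stands in for all-MiniLM-L6-v2 embeddings and a plain list stands in for ChromaDB, so it shows the shape of the flow rather than the project's implementation. All class and function names, and the prompt wording, are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: token counts. A real system would use a sentence encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryStore:
    """Stand-in for a vector database: stores (chunk, vector) pairs."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, chunks: list[str]) -> None:
        for chunk in chunks:
            self._items.append((chunk, embed(chunk)))

    def query(self, question: str, k: int = 2) -> list[str]:
        """Return the k chunks most similar to the question."""
        q = embed(question)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context: list[str]) -> str:
    """Combine retrieved chunks and the question into one grounded prompt."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"
```

The resulting prompt is what would be sent to the generation model, which keeps its answer tied to the uploaded documents rather than to general knowledge.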

Demo Video

Watch the full demonstration here:

👉 Demo Video Link

The demo shows:

Uploading brand & knowledge documents
Running the ingestion pipeline
Asking questions about the pitch
Dynamic slide generation
Real-time multimodal presentation output

Project Structure

presentation-agent/
├── backend/             # FastAPI App, ChromaDB, Gemini Logic
│   ├── src/             # Source code
│   ├── scripts/         # Utility scripts (Chroma inspection, etc.)
│   └── tests/           # Comprehensive Test Suite
├── frontend/            # React + TypeScript Web App
│   ├── src/             # UI Components & Branding Logic
│   └── public/          # Static Assets
├── docker-compose.yml   # Full stack orchestration
└── specs/               # Architecture and Design Specs

Testing

# Backend tests (run from the repository root)
cd backend
uv run python -m pytest

# Frontend tests (run from the repository root)
cd frontend
npm test

License

Distributed under the MIT License. See LICENSE for more information.
