HatemSal/presentation-agent

Presentation Agent

AI-powered system that transforms static pitch decks into dynamic, interactive presentations.

Traditional presentations are static and force founders to follow a rigid narrative. Investors often want to drill deeper into specific aspects such as market size, technology, or financials, but switching slides or answering verbally breaks the flow.

Presentation Agent solves this by turning a pitch deck into a live AI presentation system.

Users upload:

  • A Brand Guide
  • Supporting technical documents
  • Knowledge base PDFs

The system builds a RAG-powered knowledge layer and generates slides dynamically in response to questions.

This allows the presenter to explore the presentation in a choose-your-own-adventure format where the audience can ask questions and the system generates new slides, visuals, and narration in real time.


Key Features

  • Interactive RAG: Hold a conversation with your pitch deck. Ask a question, and the agent regenerates slides and audio on the fly.
  • Brand Consistency: Upload a Brand Guide (PDF) to ensure all generated content aligns with your visual identity.
  • Deep Knowledge: Ingest technical documents (PDFs) into a local Chroma vector store for accurate, grounded responses.
  • Multimodal Output: Generates rich React-based slides, speaker notes, and high-quality Text-to-Speech (TTS) audio.
  • Premium UI: Sleek, glassmorphic design system with smooth Framer Motion transitions.

Tech Stack

  • Backend: Python 3.11+, FastAPI, uv
  • Frontend: React 18, Vite, TypeScript, Framer Motion
  • AI/LLM: Google Gemini (via Google Generative AI API)
  • Database/Storage: ChromaDB (Vector Store), Redis (Session Cache)
  • Deployment: Docker, Docker Compose, Nginx

Architecture

The system follows a modular AI-agent architecture built around Retrieval-Augmented Generation (RAG), allowing presentation content to be dynamically generated based on user questions and retrieved knowledge.

Architecture Diagram


How It Works

Step 1 — Start from the main interface

Users begin a session from the landing page.

Home Screen

Step 2 — Upload brand and knowledge sources

Users upload a brand guide and supporting PDFs that define both style and content.

Documents Upload

Documents Upload 2

Step 3 — Extract brand identity

The system analyzes the brand guide to infer colors, design language, and presentation tone.
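
As a rough illustration of the color part of this step, the sketch below pulls hex color codes out of text that has already been extracted from a brand guide. It is not the project's implementation: the README does not say how the analysis is performed, and it may well be delegated to Gemini. The function name extract_hex_colors and the sample text are hypothetical.

```python
import re

# Matches 6-digit hex colors such as #1A2B3C (3-digit shorthand is ignored here).
HEX_COLOR = re.compile(r"#[0-9a-fA-F]{6}\b")


def extract_hex_colors(text: str) -> list[str]:
    """Return unique hex colors in order of first appearance, upper-cased."""
    seen: dict[str, None] = {}  # dict keeps insertion order and drops duplicates
    for match in HEX_COLOR.findall(text):
        seen.setdefault(match.upper(), None)
    return list(seen)


# Made-up sample text standing in for extracted brand guide content.
sample = "Primary: #1a2b3c. Accent: #FF8800. Primary again: #1A2B3C."
print(extract_hex_colors(sample))  # ['#1A2B3C', '#FF8800']
```

Design language and tone are much harder to capture with rules like this, which is one reason a language model is a plausible fit for the rest of the step.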

Brand Guide Processing

Step 4 — Ingest and index documents

Documents are processed into chunks and stored in the retrieval layer for grounded generation.
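
To make the chunking idea concrete, here is a minimal sketch that splits text into fixed-size word windows with overlap. This is a deliberately simpler technique than the semantic chunking the project describes, and the function name chunk_words and its default sizes are assumptions chosen for illustration.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


text = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(text, size=4, overlap=1):
    print(chunk)
```

The overlap keeps a sentence that straddles a boundary partially visible in both neighboring chunks, at the cost of storing some words twice.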

Starting Ingestion

Step 5 — Ask presentation questions

The presenter asks a natural-language question to drill deeper into a topic.

Generating Answer

Step 6 — Generate presentation output

The system produces dynamic slides tailored to the question while preserving brand consistency.

Slides Answer

Slides Answer 2

Step 7 — Review and continue the interactive flow

Generated material can be reviewed and extended through follow-up questions.

Review Screen


Quick Start with Docker (Recommended)

The fastest way to get the project running is using Docker Compose.

1. Prerequisites

  • Docker with Docker Compose (the steps below use docker compose)
  • A Google Gemini API key

2. Environment Setup

Copy the example environment file and add your API key:

cp .env.docker.example .env

Edit .env and set:

  • GEMINI_API_KEY=your_key_here

3. Spin Up

docker compose up --build

Once the containers are healthy, access the application at:


Local Development (Manual Setup)

If you prefer to run the services individually without Docker:

1. Backend Setup

cd backend
curl -LsSf https://astral.sh/uv/install.sh | sh  # Install uv if you don't have it
uv sync
cp ../.env.example .env
# Edit .env and set GEMINI_API_KEY, REDIS_HOST=localhost
uv run uvicorn src.main:app --reload

Note: Requires a running Redis instance on localhost:6379.
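
A backend like this typically validates its configuration at startup so that a missing key fails fast. The sketch below shows one way to do that with the standard library only. GEMINI_API_KEY and REDIS_HOST come from the steps above; the REDIS_PORT variable, the Settings class, and load_settings are hypothetical names and may not match the project's code.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    redis_host: str
    redis_port: int


def load_settings(env: Mapping[str, str] | None = None) -> Settings:
    """Read settings from the environment, failing early if the API key is absent."""
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key or key == "your_key_here":  # placeholder value from the example file
        raise RuntimeError("GEMINI_API_KEY is not set")
    return Settings(
        gemini_api_key=key,
        redis_host=env.get("REDIS_HOST", "localhost"),
        redis_port=int(env.get("REDIS_PORT", "6379")),  # REDIS_PORT is an assumed name
    )


settings = load_settings({"GEMINI_API_KEY": "example-key"})
print(settings.redis_host, settings.redis_port)
```

Passing a plain dictionary instead of os.environ, as in the last lines, makes the loader easy to exercise in tests.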

2. Frontend Setup

cd frontend
npm install
npm run dev

The frontend will be available at http://localhost:5173.


Cloud Deployment

Deployment Strategy

Presentation Agent can be deployed to any cloud provider that supports Docker.

Option A: VPS (DigitalOcean, AWS EC2, etc.)

  1. Clone the repository to your server.
  2. Follow the Quick Start with Docker steps.
  3. Use a reverse proxy (like the included Nginx setup) to handle SSL/TLS.

Option B: Container Services (Render, Railway, Fly.io)

  1. Link your GitHub repository.
  2. Set the root directory for the build.
  3. Configure the environment variables (secrets) in the provider's dashboard.
  4. Many platforms detect the individual Dockerfiles automatically; support for docker-compose.yml varies by provider.

Google Cloud AI Usage

Presentation Agent integrates Google Gemini models to power its generation and multimodal capabilities.

AI Models Used

Gemini 2.5 Flash

  • Natural language reasoning
  • Intent classification
  • Slide content generation
  • Question answering over retrieved context

Gemini 2.5 Flash Preview TTS

  • Generates narration audio for presentation slides
  • Enables multimodal presentation output

Retrieval & Knowledge Layer

The system uses a Retrieval-Augmented Generation (RAG) pipeline:

  • Embedding Model: all-MiniLM-L6-v2 (SentenceTransformers)
  • Vector Database: ChromaDB
  • Document Processing: PDF ingestion and semantic chunking
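
The retrieval idea behind these components can be sketched without any dependencies. The example below swaps in a toy bag-of-words embedding and a linear scan in place of all-MiniLM-L6-v2 and ChromaDB, so it shows only the ranking principle (cosine similarity between query and chunk vectors), not the project's code. All function names and sample chunks are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lower-cased word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Made-up chunks standing in for ingested document text.
chunks = [
    "The market size section covers three customer segments.",
    "The backend is written in Python with FastAPI.",
    "The founding team met at university.",
]
print(top_k("What is the market size?", chunks, k=1))
```

A learned embedding model ranks by meaning and not by shared words, so it would also match a question phrased as "How big is the opportunity?", which this toy version would not.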

The pipeline has an ingestion stage and a query stage:

  1. At ingestion time, documents are embedded and stored in ChromaDB.
  2. When a user asks a question, relevant context is retrieved using semantic search.
  3. The context is sent to Gemini 2.5 Flash.
  4. Gemini generates structured presentation content.
  5. The system optionally generates narration using Gemini TTS.
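
Steps 3 and 4 above can be sketched as a prompt builder plus a validator for the structured reply. The model call is passed in as a plain callable, so no Gemini client API is assumed here. The slide schema (title, bullets, notes), the prompt wording, and the fake_llm stub are all hypothetical and only stand in for whatever the project actually uses.

```python
import json
from typing import Callable

REQUIRED_FIELDS = {"title", "bullets", "notes"}  # assumed slide schema


def build_prompt(question: str, context: list[str]) -> str:
    """Combine retrieved chunks and the question into one grounded prompt."""
    joined = "\n---\n".join(context)
    return (
        "Answer using only the context below. "
        'Reply as JSON: {"title": str, "bullets": [str], "notes": str}.\n'
        f"Context:\n{joined}\n\nQuestion: {question}"
    )


def generate_slide(question: str, context: list[str], llm: Callable[[str], str]) -> dict:
    """Call the model and check that the reply contains every required field."""
    slide = json.loads(llm(build_prompt(question, context)))
    missing = REQUIRED_FIELDS - slide.keys()
    if missing:
        raise ValueError(f"slide is missing fields: {sorted(missing)}")
    return slide


def fake_llm(prompt: str) -> str:
    """Stub that returns a fixed reply so the sketch runs offline."""
    return json.dumps({"title": "Market", "bullets": ["Segment A"], "notes": "Intro."})


print(generate_slide("What is the market size?", ["sample context"], fake_llm))
```

Validating the reply before rendering means a malformed model response surfaces as a clear error and not as a broken slide.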

This architecture enables real-time interactive presentations powered by AI.


Demo Video

Watch the full demonstration here:

👉 Demo Video Link

The demo shows:

  • Uploading brand & knowledge documents
  • Running the ingestion pipeline
  • Asking questions about the pitch
  • Dynamic slide generation
  • Real-time multimodal presentation output

Project Structure

presentation-agent/
├── backend/             # FastAPI App, ChromaDB, Gemini Logic
│   ├── src/             # Source code
│   ├── scripts/         # Utility scripts (Chroma inspection, etc.)
│   └── tests/           # Comprehensive Test Suite
├── frontend/            # React + TypeScript Web App
│   ├── src/             # UI Components & Branding Logic
│   └── public/          # Static Assets
├── docker-compose.yml   # Full stack orchestration
└── specs/               # Architecture and Design Specs

Testing

# Backend Tests
cd backend
uv run python -m pytest

# Frontend Tests
cd frontend
npm test

License

Distributed under the MIT License. See LICENSE for more information.
