An intelligent multi-agent system for filmmakers that automates pre-production workflows by generating professional scripts, cinematic storyboards with AI-generated images, and detailed shot lists.
- Problem Statement
- Solution
- Architecture
- Features
- Technology Stack
- Setup Instructions
- Usage
- API Reference
- Project Structure
- AI Agent Concepts
- Contributing
Film pre-production is time-consuming and requires multiple specialized skills:
- Scriptwriting requires narrative structure and dialogue expertise
- Storyboarding needs visual storytelling and artistic skills
- Shot lists demand technical cinematography knowledge
Challenges:
- Time-intensive manual creation process
- Requires diverse skillsets (writer, artist, cinematographer)
- Difficult to visualize concepts before production
- Expensive to hire professionals for each task
FrameWork is an AI-powered multi-agent system that automates the entire pre-production pipeline:
- Intelligent Classification - Understands what the user already has vs. what needs to be generated
- Script Generation - Creates professional screenplays using Google Gemini Pro
- Visual Storyboarding - Generates scene breakdowns with AI-generated cinematic images (Pollinations.AI)
- Technical Shot Lists - Produces detailed camera specifications and equipment breakdowns
- Real-Time Updates - WebSocket-powered live progress tracking
Result: From a simple prompt to complete pre-production materials in minutes, not days.
+------------------------------------------------------------------+
| Frontend (Next.js/React) |
| |
| +-------------+ +---------------+ +--------------------+ |
| | Home Page | | Project Page | | Components | |
| | (Creation) | | (Dashboard) | | - ScriptView | |
| | | | | | - StoryboardView | |
| | | | | | - ShotListView | |
| +-------------+ +---------------+ +--------------------+ |
| | | | |
| +------------------+-------------------+ |
| | |
| HTTP REST + WebSocket |
+----------------------------+------------------------------------|
|
+----------------------------+------------------------------------|
| Backend (FastAPI) |
| |
| +----------------------------------------------------------+ |
| | API Layer (main.py) | |
| | | |
| | POST /projects/create | |
| | GET /projects/{id} | |
| | POST /projects/{id}/run | |
| | PUT /projects/{id}/script (triggers change) | |
| | WS /ws/{id} | |
| +----------------------------------------------------------+ |
| | |
| +----------------------------------------------------------+ |
| | Orchestrator (Pipeline + Router) | |
| | | |
| | Pipeline: | |
| | - Sequential execution | |
| | - Dependency management | |
| | - skip_script parameter | |
| | | |
| | Router: | |
| | - Classification routing | |
| | - Agent sequence determination | |
| +----------------------------------------------------------+ |
| | |
| +----------------------------------------------------------+ |
| | Multi-Agent System (5 Agents) | |
| | | |
| | +------------------+ +---------------------------+ | |
| | | InputClassifier | | ChangeDetectionAgent | | |
| | | (Gemini Pro) | | (Gemini Pro) | | |
| | | | | | | |
| | | - Analyzes input | | - Analyzes script diffs | | |
| | | - Detects user | | - Calculates change % | | |
| | | content | | - Evaluates significance | | |
| | | - Extracts data | | - Decides: regenerate? | | |
| | +------------------+ +---------------------------+ | |
| | | |
| | +----------------+ +------------------+ | |
| | | ScriptAgent | | StoryboardAgent | | |
| | | (Gemini Pro) | | (Gemini Pro + | | |
| | | | | Pollinations) | | |
| | | - Generates | | | | |
| | | screenplay | | - Parses script | | |
| | | - Formats | | - Generates imgs | | |
| | +----------------+ +------------------+ | |
| | | |
| | +----------------+ | |
| | | ShotListAgent | | |
| | | (Gemini Pro) | | |
| | | | | |
| | | - Technical | | |
| | | breakdown | | |
| | | - Camera specs | | |
| | +----------------+ | |
| +----------------------------------------------------------+ |
| | |
| +----------------------------------------------------------+ |
| | Database Layer (MongoDB) | |
| | | |
| | - Project persistence | |
| | - Script versions (for change detection) | |
| | - State management | |
| | - Stage tracking (pending/running/done/failed) | |
| +----------------------------------------------------------+ |
| |
| +----------------------------------------------------------+ |
| | WebSocket Manager | |
| | | |
| | - Real-time progress broadcasts | |
| | - Connection management | |
| | - Heartbeat (ping/pong) | |
| +----------------------------------------------------------+ |
+------------------------------------------------------------------+
User Input
|
v
+---------------------+
| InputClassifier |
| (Gemini Pro) |
|---------------------|
| - Analyzes prompt |
| - Detects content |
| - Extracts scripts |
+---------------------+
|
v
+---------------------+
| ScriptAgent |
| (Gemini Pro) |
|---------------------|
| - Generates script |
| - Formats properly |
+---------------------+
|
v
+---------------------+
| StoryboardAgent |
| (Gemini + Pollinate)|
|---------------------|
| - Parses script |
| - Generates images |
| - 8-15 frames |
+---------------------+
|
v
+---------------------+
| ShotListAgent |
| (Gemini Pro) |
|---------------------|
| - Technical breakdown|
| - Camera specs |
+---------------------+
|
v
Final Output
(Script + Storyboard + Shot List)
User Edits Script in UI
|
v
+---------------------+
| Frontend sends |
| PUT /script |
| with new content |
+---------------------+
|
v
+-------------------------+
| ChangeDetectionAgent |
| (Gemini Pro) |
|-------------------------|
| 1. Calculate diff |
| (old vs new script) |
| 2. Count changes |
| 3. Calculate % |
| 4. LLM analyzes |
| semantic meaning |
| 5. Evaluate impact |
+-------------------------+
|
+----------+----------+
| |
v v
+------------------+ +------------------+
| Minor Changes | | Significant |
| (<3% or typos) | | Changes |
| | | (>3% or scenes) |
| DECISION: | | |
| Skip Regen | | DECISION: |
+------------------+ | Regenerate |
| +------------------+
| |
v v
+------------------+ +---------------------------+
| Save script | | 1. Save script |
| Return success | | 2. Clear storyboard |
| No pipeline run | | 3. Clear shot list |
+------------------+ | 4. Run Pipeline |
| (skip_script=True) |
+---------------------------+
|
v
+---------------------------+
| Pipeline executes: |
| |
| StoryboardAgent |
| (regenerates) |
| | |
| v |
| ShotListAgent |
| (regenerates) |
+---------------------------+
|
v
+---------------------------+
| WebSocket broadcasts |
| progress to frontend |
| (real-time updates) |
+---------------------------+
|
v
Updated Output
(Same Script + New Storyboard + New Shot List)
- AI Script Generation - Professional screenplays with proper formatting
- Editable Scripts with Smart Regeneration - Edit scripts in-place, AI analyzes changes and auto-regenerates storyboard/shot list if needed
- AI Storyboard Creation - Visual breakdowns with generated images
- Shot List Automation - Technical cinematography specifications
- Intelligent Classification - Detects user-provided content vs. generation needs
- Real-Time Progress - WebSocket updates during generation
- Auto-Pipeline Execution - Pipeline starts automatically on project creation
- Content Preservation - Never rewrites user-provided scripts
- Multi-Agent System - 5 LLM-powered agents (Script, Storyboard, Shot List, Classifier, Change Detection)
- Loop Agents - Change detection triggers regeneration loop when script is edited
- Agent Evaluation - AI evaluates script changes for significance before regenerating
- Custom Tools - PollinationsAI for free image generation
- State Management - MongoDB persistence + WebSocket sessions
- Error Handling - Graceful fallbacks and safety filter management
- Logging & Observability - Comprehensive console logs and progress tracking
- Beautiful UI - Modern design with custom color palette
- FastAPI - Modern Python web framework
- Google Gemini Pro - LLM for script, storyboard, and shot list generation
- Pollinations.AI - Free text-to-image generation
- MongoDB - NoSQL database for persistence
- Motor - Async MongoDB driver
- Pydantic - Data validation and settings management
- WebSockets - Real-time communication
- Next.js 14 - React framework
- React 18 - UI library
- Tailwind CSS - Utility-first CSS
- WebSocket API - Real-time updates
- Uvicorn - ASGI server
- Python venv - Dependency isolation
- Python 3.10+ (Python 3.11+ recommended)
- Node.js 16+ and npm
- MongoDB (local or cloud instance)
- Google Gemini API Key (Get one here)
git clone <your-repo-url>
cd FrameWork
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
Create a .env file in the project root:
# Google Gemini API
GEMINI_API_KEY=your_gemini_api_key_here
# MongoDB Configuration
MONGODB_URL=mongodb://localhost:27017
MONGODB_DATABASE=frameworkdb
# API Configuration
API_HOST=0.0.0.0
API_PORT=8000
# WebSocket Configuration
WEBSOCKET_HEARTBEAT_INTERVAL=30
# CORS Configuration
CORS_ORIGINS=http://localhost:3000,http://127.0.0.1:3000
Important: Replace your_gemini_api_key_here with your actual Gemini API key.
Using Docker:
docker run -d -p 27017:27017 --name mongodb mongo:latest
Or use local MongoDB:
mongod --dbpath /path/to/data
Option 1: Using the startup script
chmod +x start_backend.sh
./start_backend.sh
Option 2: Manual start
cd app/backend
source ../../venv/bin/activate
uvicorn main:app --reload --host 0.0.0.0 --port 8000
Backend will be available at: http://localhost:8000
cd app/frontend
npm install
Create app/frontend/.env.local:
NEXT_PUBLIC_API_URL=http://localhost:8000
NEXT_PUBLIC_WS_URL=ws://localhost:8000
Option 1: Using the startup script
chmod +x start_frontend.sh
./start_frontend.sh
Option 2: Manual start
cd app/frontend
npm run dev
Frontend will be available at: http://localhost:3000
- Navigate to http://localhost:3000
- Enter a prompt describing your film project:
A sci-fi short film about an AI discovering emotions. Should be 5 minutes, dramatic tone, visually stunning.
- Click "Create Project"
- Pipeline auto-starts and generates:
- Professional screenplay
- 8-15 storyboard frames with images
- Technical shot list with camera specs
If you already have a script and want storyboard/shot list:
Here's my script:
FADE IN:
INT. LABORATORY - NIGHT
A scientist works late...
FADE OUT.
Generate a storyboard and shot list from this.
System will:
- Detect and extract your script
- Mark script as "done" (user-provided)
- Generate only storyboard and shot list
- Never rewrite your original script
After generation, you can edit the script directly in the UI:
- Click "Edit Script" button
- Make your changes in the text editor
- Click "Save Changes"
What happens next:
- AI analyzes your changes - Compares old vs. new script
- Calculates impact - Determines change percentage and significance
- Makes intelligent decision:
- Minor edits (<3%) → Script saved, no regeneration
- Typo fixes → Script saved, no regeneration
- New scenes → Automatically regenerates storyboard + shot list
- Character/location changes → Automatically regenerates
- Dialogue tweaks → Evaluated semantically
Example:
Original: "The hero walks slowly..."
Edited: "The hero runs quickly..."
Analysis: Significant change (motion affects visuals)
Action: Regenerating storyboard and shot list
Example:
Original: "The hero walks slowley..."
Edited: "The hero walks slowly..." [typo fix]
Analysis: Insignificant change (typo only)
Action: Saved, no regeneration needed
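The heuristic half of this decision can be sketched in a few lines. This is a minimal illustration, not the project's actual ChangeDetectionAgent: the function names, the use of `difflib`, and the scene-heading regex are assumptions; only the 3% threshold and the "scene changes always regenerate" rule come from the description above.

```python
import difflib
import re

# Threshold taken from the "<3%" rule above; everything else is illustrative.
MINOR_CHANGE_THRESHOLD = 3.0

# Matches screenplay scene headings such as "INT. LABORATORY - NIGHT".
SCENE_HEADING = re.compile(r"^[ \t]*(?:INT\.|EXT\.).*$", re.MULTILINE)


def change_percentage(old: str, new: str) -> float:
    """Return how much of the script changed, as a percentage from 0 to 100."""
    ratio = difflib.SequenceMatcher(None, old, new).ratio()
    return round((1.0 - ratio) * 100, 1)


def scene_headings(script: str) -> list[str]:
    """Collect scene headings in the order they appear."""
    return [match.group(0).strip() for match in SCENE_HEADING.finditer(script)]


def needs_regeneration(old: str, new: str) -> bool:
    """Heuristic pre-check; semantic analysis by the LLM would come after this."""
    if scene_headings(old) != scene_headings(new):
        return True  # scene-level change: always regenerate
    return change_percentage(old, new) >= MINOR_CHANGE_THRESHOLD
```

A one-word swap inside a long script can fall under the percentage threshold even when it changes the visuals, which is why a character-level diff alone is not enough and the semantic LLM step is still needed.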
Watch the generation happen live:
- Script - See text appear in real-time
- Storyboard - Images load as they're generated
- Shot List - Table populates with rows
No refresh needed! Everything updates automatically via WebSocket.
POST /projects/create
Content-Type: application/json
{
"user_prompt": "Your film description",
"title": "Optional project title"
}
Response: {
"_id": "project_id",
"status": "created",
...
}
GET /projects/{project_id}
Response: {
"_id": "project_id",
"script": "...",
"storyboard": [...],
"shot_list": [...],
...
}
POST /projects/{project_id}/run
Content-Type: application/json
{
"force_rerun": false
}
Response: {
"message": "Pipeline started",
"project_id": "..."
}
PUT /projects/{project_id}/script
Content-Type: application/json
{
"script": "FADE IN:\n\nINT. NEW SCENE - DAY..."
}
Response (if changes are significant): {
"message": "Script updated and regeneration started",
"project_id": "...",
"should_regenerate": true,
"regenerate_storyboard": true,
"regenerate_shot_list": true,
"reason": "Scene-level changes detected",
"change_summary": "Added new scene, modified dialogue",
"change_percentage": 18.5
}
Response (if changes are minor): {
"message": "Script updated (no regeneration needed)",
"project_id": "...",
"should_regenerate": false,
"reason": "Minor typo fixes only",
"change_summary": "3 typos corrected",
"change_percentage": 1.2
}
const ws = new WebSocket('ws://localhost:8000/ws/{project_id}');
ws.onmessage = (event) => {
const data = JSON.parse(event.data);
// Progress update
if (data.type === 'progress') {
console.log(`${data.stage}: ${data.status}`);
}
// Completion
if (data.type === 'completed') {
console.log('Pipeline completed!');
}
};
// Heartbeat
setInterval(() => ws.send('ping'), 30000);
FrameWork/
app/
backend/
main.py # FastAPI application entry point
config/
settings.py # Environment configuration
models/
schemas.py # Pydantic data models
project_status.py # Status enums
database/
mongodb.py # MongoDB connection manager
project_repo.py # CRUD operations
agents/
script_agent.py # Script generation (Gemini)
storyboard_agent.py # Storyboard generation (Gemini + Pollinations)
shot_list_agent.py # Shot list generation (Gemini)
change_detection_agent.py # Script change analysis (Gemini)
classifiers/
input_classifier.py # Content classification (Gemini)
orchestrator/
router.py # Classification & routing logic
pipeline.py # Sequential agent orchestration
websocket/
progress.py # WebSocket manager for real-time updates
utils/
image_generation.py # PollinationsAI custom tool
frontend/
pages/
index.js # Home page (project creation)
project/[id].js # Project dashboard page
components/
ProjectDashboard.jsx # Main dashboard layout
ScriptView.jsx # Script display component
StoryboardView.jsx # Storyboard grid display
ShotListView.jsx # Shot list table
hooks/
useProjectStatus.js # WebSocket hook for real-time updates
.env # Environment variables (create this)
requirements.txt # Python dependencies
start_backend.sh # Backend startup script
start_frontend.sh # Frontend startup script
README.md # This file
This project demonstrates several advanced AI agent concepts:
- Sequential Agents
- The generation agents execute in strict dependency order
- Script → Storyboard → Shot List
- Each agent depends on the previous agent's output
- Orchestrated by the Pipeline class
- Loop Agents
- Script edits trigger change detection
- System can loop back and regenerate downstream artifacts
- Intelligent looping based on significance analysis
- Prevents infinite loops with smart evaluation
- Agent Powered by LLM
- InputClassifier - Analyzes user input (Gemini Pro)
- ScriptAgent - Generates screenplays (Gemini Pro)
- StoryboardAgent - Creates visual breakdowns (Gemini Pro)
- ShotListAgent - Produces technical specs (Gemini Pro)
- ChangeDetectionAgent - Evaluates script edits (Gemini Pro)
- PollinationsAI Integration
- Custom tool for text-to-image generation
- Zero-cost, no API key required
- Generates cinematic storyboard images
- Located in app/backend/utils/image_generation.py
- Background Tasks
- Pipeline runs asynchronously using FastAPI's BackgroundTasks
- WebSocket provides progress updates during execution
- Non-blocking user experience
- Session Management
- WebSocket session management via WebSocketManager
- Persistent connections across component re-renders
- Global WebSocket cache for React Strict Mode compatibility
- Persistent State
- MongoDB stores all project data
- Classification results saved
- Stage status tracking (pending/running/done/failed)
- Long-term project history
- Change Significance Analysis
- ChangeDetectionAgent evaluates script edits
- Determines if changes warrant regeneration
- Uses both LLM analysis and heuristics
- Calculates change percentage and semantic impact
- Smart Decision Making
- < 3% changes → Skip regeneration
- Scene-level changes → Always regenerate
- Typo fixes → Skip regeneration
- New characters/locations → Regenerate
- Logging
- Comprehensive console logging with emojis
- Stage-specific progress messages
- Error tracking and reporting
- WebSocket-broadcasted status updates
- Tracing (Basic)
- Execution flow tracking via logs
- Pipeline stage monitoring
- Real-time progress visualization
The InputClassifier uses Gemini Pro to analyze user prompts and determine:
- Does the user already have a script?
- Does the user want a storyboard generated?
- Does the user want a shot list?
Examples:
Input: "Write a script about robots"
→ Classification: {script: false, storyboard: false, shot_list: false}
→ Generates: Script + Storyboard + Shot List
Input: "Here's my script: FADE IN... Generate storyboard"
→ Classification: {script: true, storyboard: false, shot_list: false}
→ Generates: Storyboard + Shot List (script preserved)
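The mapping from a classification result to the list of agents to run can be sketched as follows. The function name and `STAGE_ORDER` constant are hypothetical; the rule itself (a `false` flag means the stage still needs to be generated, and stages run in script, storyboard, shot list order) comes from the description in this README.

```python
# Fixed dependency order: each stage consumes the previous stage's output.
STAGE_ORDER = ["script", "storyboard", "shot_list"]


def agent_sequence(classification: dict[str, bool]) -> list[str]:
    """Return the stages that still need to run, in dependency order.

    A flag set to True means the user already has that artifact, so the
    corresponding agent is skipped. Missing flags are treated as False.
    """
    return [stage for stage in STAGE_ORDER if not classification.get(stage, False)]
```

For example, `agent_sequence({"script": True, "storyboard": False, "shot_list": False})` leaves the user's script untouched and schedules only the two downstream stages.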
- Professional screenplay format
- Proper scene headings (INT./EXT.)
- Natural dialogue
- Industry-standard formatting
- FADE IN/FADE OUT markers
- Configurable length based on user requirements
- 8-15 key visual frames
- Scene descriptions with cinematography details
- Camera angles for each frame
- Dialogue integration
- AI-Generated Images via Pollinations.AI (free, no API key)
- Cinematic prompts for realistic film frames
- 1:1 minimum ratio with storyboard frames
- Shot types (Wide, Medium, Close-Up, etc.)
- Scene identifiers
- Technical descriptions
- Duration estimates
- Equipment summary
- WebSocket connection per project
- Live progress tracking
- Auto-refresh when stages complete
- Persistent connections across page reloads
- Heartbeat mechanism (ping/pong every 30s)
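A per-project broadcast manager along these lines could look like the sketch below. It is an assumption about the shape of the code, not the project's actual `WebSocketManager`: the class and method names are hypothetical, and any object with an async `send_json` method (as FastAPI's `WebSocket` has) can stand in for a connection.

```python
import asyncio
from collections import defaultdict
from typing import Any, Protocol


class JsonSocket(Protocol):
    """Anything with an async send_json, e.g. a FastAPI WebSocket."""

    async def send_json(self, data: Any) -> None: ...


class ProgressManager:
    """Tracks open sockets per project and broadcasts progress messages."""

    def __init__(self) -> None:
        self._connections: dict[str, set[JsonSocket]] = defaultdict(set)

    def connect(self, project_id: str, socket: JsonSocket) -> None:
        self._connections[project_id].add(socket)

    def disconnect(self, project_id: str, socket: JsonSocket) -> None:
        self._connections[project_id].discard(socket)

    async def broadcast(self, project_id: str, stage: str, status: str) -> int:
        """Send a progress message to every socket; return how many received it."""
        message = {"type": "progress", "stage": stage, "status": status}
        delivered = 0
        for socket in list(self._connections[project_id]):
            try:
                await socket.send_json(message)
                delivered += 1
            except Exception:
                # Drop sockets that fail so later broadcasts skip them.
                self.disconnect(project_id, socket)
        return delivered


class RecordingSocket:
    """Stand-in connection used only to demonstrate the manager."""

    def __init__(self) -> None:
        self.sent: list[Any] = []

    async def send_json(self, data: Any) -> None:
        self.sent.append(data)


if __name__ == "__main__":
    manager = ProgressManager()
    demo_socket = RecordingSocket()
    manager.connect("demo-project", demo_socket)
    asyncio.run(manager.broadcast("demo-project", "script", "running"))
    print(demo_socket.sent)
```

Keying connections by project ID means a broadcast for one project never reaches clients watching another, and removing failed sockets during a broadcast keeps stale connections from accumulating.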
User Input:
A romantic comedy about two chefs competing in a cooking competition
but falling in love. Should be lighthearted, 8 minutes long.
System Execution:
1. Classification (2 seconds)
Detected: User has nothing, needs everything
→ Sequence: [script, storyboard, shot_list]
2. Script Agent (15-20 seconds)
⏳ Generating professional screenplay...
8-page script with dialogue and scene descriptions
WebSocket: "Script generation complete"
3. Storyboard Agent (40-60 seconds)
⏳ Parsing script into visual frames...
⏳ Generating 12 cinematic images...
12 storyboard frames with images
WebSocket: "Storyboard generation complete"
4. Shot List Agent (20-30 seconds)
⏳ Creating technical breakdown...
12+ shots with camera specs
WebSocket: "Shot list generation complete"
Total Time: ~80-110 seconds
User Input:
Here's my script:
FADE IN:
INT. COFFEE SHOP - DAY
SARAH, 30s, sits alone...
FADE OUT.
Generate storyboard and shot list.
System Execution:
1. Classification (2 seconds)
Detected: User provided script
Extracted and saved user's script
Marked script stage as DONE
→ Sequence: [storyboard, shot_list]
2. Storyboard Agent (40-60 seconds)
⏳ Using user's script...
10 storyboard frames with images
3. Shot List Agent (20-30 seconds)
10 shots from storyboard
Total Time: ~60-90 seconds (faster, skipped script!)
User Action:
User clicks "Edit Script" and makes changes:
OLD:
INT. COFFEE SHOP - DAY
SARAH sits alone reading.
NEW:
INT. COFFEE SHOP - NIGHT
SARAH sits alone reading.
TOM enters, approaches her table.
System Execution:
1. Change Detection (3-5 seconds)
Analyzing differences...
Detected changes:
- Time of day: DAY → NIGHT
- New character: TOM introduced
- Scene structure modified
Analysis Result:
{
"should_regenerate": true,
"reason": "Time change affects lighting, new character affects framing",
"change_percentage": 22.5
}
2. Script Update (instant)
New script saved to database
3. Selective Regeneration (60-80 seconds)
⏭ Skipping script generation (already saved)
Regenerating storyboard...
- Adjusting for NIGHT lighting
- Adding frames for TOM
12 new storyboard frames
Regenerating shot list...
- Over-the-shoulder shots for TOM
- Lighting equipment for night scene
14 new shots
4. Complete
All artifacts updated
WebSocket notifies frontend in real-time
Total Time: ~65-90 seconds (smart regeneration!)
Alternate Example: Minor Edit (No Regeneration)
User changes:
OLD: "SARAH sits alone reading."
NEW: "SARAH sits alone, reading quietly."
Change Detection:
{
"should_regenerate": false,
"reason": "Minor wording change, no visual impact",
"change_percentage": 2.1
}
Result:
Script saved
No regeneration (visuals unchanged)
Instant update
The UI uses a carefully selected color scheme:
- Palladian (#EEE9DF) - Warm background
- Blue Fantastic (#2C3B4D) - Primary headers
- Abyssal Anchorfish Blue (#1B2632) - Deep accents
- Burning Flame (#FFB162) - CTAs and highlights
- Truffle Trouble (#A35139) - Secondary accents
- Oatmeal (#C9C1B1) - Borders and neutrals
All agents configured with relaxed safety settings for creative filmmaking:
safety_settings = [
{"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE"},
]
Rationale: Filmmaking often involves dramatic, violent, or mature themes. These settings allow creative freedom while maintaining fallback mechanisms.
- API keys stored in .env file (not committed to git)
- Environment variables loaded via Pydantic settings
- .gitignore prevents accidental commits
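Since `CORS_ORIGINS` is stored as a comma-separated string in `.env` but the CORS middleware expects a list, some parsing step is needed. The project does this through Pydantic settings; the stdlib-only sketch below is a stand-in to show the idea, and both function names are hypothetical.

```python
import os


def parse_cors_origins(raw: str) -> list[str]:
    """Split a comma-separated origin string, ignoring blanks and stray spaces."""
    return [origin.strip() for origin in raw.split(",") if origin.strip()]


def load_cors_origins(default: str = "http://localhost:3000") -> list[str]:
    """Read CORS_ORIGINS from the environment, falling back to a local default."""
    return parse_cors_origins(os.environ.get("CORS_ORIGINS", default))
```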
app.add_middleware(
CORSMiddleware,
allow_origins=settings.cors_origins,
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
{
_id: string,
title: string,
user_prompt: string,
status: "created" | "processing" | "completed" | "failed",
classification: {
script: boolean, // User has script?
storyboard: boolean, // User has storyboard?
shot_list: boolean // User has shot list?
},
script_stage: {
status: "pending" | "running" | "done" | "failed",
started_at: Date | null,
completed_at: Date | null,
error: string | null
},
script: string | null,
storyboard: Frame[] | null,
shot_list: Shot[] | null,
created_at: Date,
updated_at: Date
}
{
frame_number: number,
scene: string,
description: string,
camera_angle: string,
dialogue: string,
image_url: string,
notes: string
}
{
shot_number: number,
scene: string,
shot_type: string,
camera_movement: string,
description: string,
duration: string,
equipment: string[],
lens: string,
notes: string
}
The system includes comprehensive error handling:
- Safety Filter Blocks: Falls back to keyword-based classification
- JSON Parsing Errors: Multiple cleanup strategies + retry logic
- Agent Failures: Generates basic fallback content
- WebSocket Disconnects: Auto-reconnect with exponential backoff
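As an illustration of the keyword-based fallback, the sketch below guesses whether a prompt already contains a screenplay by looking for standard markers. The marker list, the function name, and the choice to leave the other two flags at `False` are assumptions made for this example, not the project's actual rules.

```python
import re

# Screenplay markers at the start of a line suggest an embedded script.
SCRIPT_MARKERS = re.compile(r"^[ \t]*(?:FADE IN|INT\.|EXT\.)", re.MULTILINE)


def fallback_classification(prompt: str) -> dict[str, bool]:
    """Classify a prompt without the LLM, e.g. after a safety-filter block.

    Only the script flag is inferred; storyboard and shot list are assumed
    to be missing, so both will be generated.
    """
    has_script = bool(SCRIPT_MARKERS.search(prompt))
    return {"script": has_script, "storyboard": False, "shot_list": False}
```

The match is deliberately case-sensitive: scene headings are conventionally uppercase, so a prompt such as "write a script with an int. scene" does not count as a provided script.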
All errors are logged with:
- Error emoji for visibility
- Detailed error messages
- Stack traces for debugging
- WebSocket broadcasts for frontend awareness
Chosen: Sequential execution
Rationale:
- Storyboard requires script as input
- Shot list requires storyboard as input
- Dependencies enforce sequential order
- Simpler to debug and maintain
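The sequential design can be sketched as a small async loop. This is a simplified illustration under stated assumptions: `run_pipeline`, `STAGES`, and the stand-in agents are hypothetical names, and the real Pipeline class also persists results to MongoDB and broadcasts progress. The stage statuses and the `skip_script` parameter are taken from this README.

```python
import asyncio
from typing import Awaitable, Callable

Agent = Callable[[dict], Awaitable[None]]
STAGES = ["script", "storyboard", "shot_list"]


async def run_pipeline(
    project: dict, agents: dict[str, Agent], skip_script: bool = False
) -> dict[str, str]:
    """Run the stages in order and return each stage's final status."""
    status = {stage: "pending" for stage in STAGES}
    for stage in STAGES:
        if stage == "script" and skip_script:
            status[stage] = "done"  # keep the existing script as-is
            continue
        status[stage] = "running"
        try:
            await agents[stage](project)
            status[stage] = "done"
        except Exception:
            status[stage] = "failed"
            break  # later stages depend on this one, so stop here
    return status


# Stand-in agents: each reads the previous stage's output from the project.
async def fake_script_agent(project: dict) -> None:
    project["script"] = "FADE IN: ..."


async def fake_storyboard_agent(project: dict) -> None:
    project["storyboard"] = [{"frame_number": 1, "scene": project["script"]}]


async def fake_shot_list_agent(project: dict) -> None:
    project["shot_list"] = [
        {"shot_number": frame["frame_number"]} for frame in project["storyboard"]
    ]


AGENTS: dict[str, Agent] = {
    "script": fake_script_agent,
    "storyboard": fake_storyboard_agent,
    "shot_list": fake_shot_list_agent,
}
```

Because a failed stage stops the loop, downstream stages stay `pending` instead of running on missing input, which is the main debugging benefit of the sequential approach.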
Chosen: true = user HAS content, false = needs generation
Rationale:
- More intuitive: "What does the user already have?"
- Easier to extend (add new content types)
- Clear separation between provided vs. generated content
Chosen: WebSocket over polling
Rationale:
- Real-time updates without delay
- Lower server load (no constant polling)
- Better UX (instant feedback)
- Professional appearance
Chosen: Pollinations.AI over commercial APIs
Rationale:
- Free, no API key required
- Good quality for storyboard visualization
- No rate limits or costs
- Easy integration
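Part of what makes the integration easy is that an image request is just a URL. The sketch below assumes Pollinations' public URL-based endpoint (`https://image.pollinations.ai/prompt/...` with `width` and `height` query parameters); the function name, prompt template, and default dimensions are illustrative and not taken from the project's `image_generation.py`.

```python
from urllib.parse import quote, urlencode

# Assumed public endpoint; check the Pollinations docs before relying on it.
POLLINATIONS_BASE = "https://image.pollinations.ai/prompt/"


def storyboard_image_url(
    description: str, camera_angle: str, width: int = 1024, height: int = 576
) -> str:
    """Build an image URL for one storyboard frame (16:9 by default)."""
    prompt = f"cinematic film still, {camera_angle}, {description}"
    query = urlencode({"width": width, "height": height})
    # Percent-encode the whole prompt so commas and spaces are URL-safe.
    return f"{POLLINATIONS_BASE}{quote(prompt, safe='')}?{query}"
```

The returned string can be stored directly as a frame's `image_url`, so no image bytes need to pass through the backend.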
+-------------------------------------------------------------------+
| FRONTEND (React) |
+-------------------------------------------------------------------+
|
+--------------------+---------------------+
| | |
v v v
Create Project Get Project Data Edit Script
| | |
| | |
v v v
POST /projects/create GET /projects/{id} PUT /projects/{id}/script
| | |
| | |
+--------v--------------------v---------------------v----------------+
| FASTAPI BACKEND |
+--------------------------------------------------------------------+
| | |
| | |
v v v
+------------+ +-------------+ +-------------------+
| Create | | Fetch from | | ChangeDetection |
| Project | | MongoDB | | Agent |
| Record | | Return data | | |
+------------+ +-------------+ | 1. Get old script |
| ^ | 2. Compare with |
v | | new script |
+------------+ | | 3. Calculate diff |
| Auto-run | | | 4. Analyze with |
| Pipeline | | | Gemini Pro |
+------------+ | | 5. Decide action |
| | +-------------------+
| | |
v | +-----------+------------+
+-------------------+ | | |
| PIPELINE | | v v
| ORCHESTRATOR | | Minor Change Significant
+-------------------+ | (Skip regen) (Regenerate)
| | | |
v | v v
+-------------------+ | Save script Save script
| ROUTER | | Return success Clear storyboard
| - Classification | | | Clear shot_list
| - Determine | +---------+ Run Pipeline
| sequence | (skip_script=True)
+-------------------+ |
| |
v v
+-------------------+ +-------------------+
| Agent Sequence | | Agent Sequence |
| (Initial) | | (Regeneration) |
+-------------------+ +-------------------+
| |
v |
IF script=false: |
+-------------------+ |
| ScriptAgent | |
| (Gemini Pro) | |
| - Generate script | |
| - Save to DB | |
+-------------------+ |
| |
v<------------------------------------------------------+
IF storyboard=false:
+-------------------+
| StoryboardAgent |
| (Gemini Pro + |
| Pollinations) |
| - Parse script |
| - Generate frames |
| - Generate images |
| - Save to DB |
+-------------------+
|
v
IF shot_list=false:
+-------------------+
| ShotListAgent |
| (Gemini Pro) |
| - Analyze frames |
| - Technical specs |
| - Save to DB |
+-------------------+
|
v
+-------------------+
| Mark Complete |
| Update status |
+-------------------+
|
v
+-------------------+
| WebSocket Manager |
| - Broadcasts |
| progress at |
| each step |
+-------------------+
|
v
+--------------------------------------------------------------------+
| MONGODB DATABASE |
| |
| - Projects Collection |
| - _id, title, user_prompt |
| - script (versioned for change detection) |
| - storyboard (array of frames with images) |
| - shot_list (array of technical shots) |
| - classification (what user provided) |
| - script_stage, storyboard_stage, shot_list_stage |
| - status (created/processing/completed/failed) |
+--------------------------------------------------------------------+
|
v (Real-time updates)
+--------------------------------------------------------------------+
| WEBSOCKET CONNECTION |
| |
| - Persistent connection per project |
| - Heartbeat (ping/pong every 30s) |
| - Progress messages: {stage, status, message} |
| - Triggers frontend re-render on updates |
+--------------------------------------------------------------------+
|
v
+--------------------------------------------------------------------+
| FRONTEND (React) |
| |
| useProjectStatus Hook: |
| - Maintains WebSocket connection |
| - Updates status state |
| - Triggers data refresh via lastUpdate |
| |
| ProjectDashboard Component: |
| - Fetches latest data when lastUpdate changes |
| - Re-renders with new script/storyboard/shot_list |
| |
| ScriptView Component: |
| - "Edit Script" button |
| - Textarea editor |
| - "Save Changes" triggers PUT /script |
| - Shows analysis result (regenerate or skip) |
| - Shows real-time regeneration progress |
+--------------------------------------------------------------------+
- Entry Point: PUT /projects/{id}/script endpoint in main.py
- Invocation: Called before pipeline execution to analyze changes
- Decision Making: Returns should_regenerate boolean
- Pipeline Control: If true, pipeline runs with skip_script=True
- Database: Compares current script (from DB) vs. new script (from request)
- Feedback Loop: Results sent back to frontend for user notification
If user provides a script:
- Script is extracted and saved AS-IS
- Script agent is SKIPPED (not run)
- Script stage marked as DONE immediately
- Only requested outputs are generated
- Original script NEVER rewritten
generation_config={
"temperature": 0.7,
"top_p": 0.9,
"max_output_tokens": 4000,
"response_mime_type": "application/json", # Forces valid JSON
}
Multiple cleanup strategies:
- Remove markdown code blocks
- Strip trailing commas
- Remove text before/after JSON
- Fix single quotes → double quotes
- Retry with error logging
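Three of these strategies can be sketched as a single helper. This is an illustrative version, not the project's code: `parse_llm_json` is a hypothetical name, and the single-quote fix and retry loop are left out because they are harder to do safely with regular expressions.

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # markdown code fence that models often wrap JSON in


def parse_llm_json(raw: str) -> Any:
    """Clean up common LLM formatting noise, then parse the result as JSON."""
    text = raw.strip()
    # 1. Remove a leading fence (optionally tagged "json") and a trailing fence.
    text = re.sub(rf"^{FENCE}(?:json)?\s*", "", text)
    text = re.sub(rf"\s*{FENCE}$", "", text)
    # 2. Keep only the span from the first opening bracket to the last closing one.
    starts = [i for i in (text.find("{"), text.find("[")) if i != -1]
    end = max(text.rfind("}"), text.rfind("]"))
    if starts and end != -1:
        text = text[min(starts): end + 1]
    # 3. Strip trailing commas that appear right before a closing bracket.
    text = re.sub(r",\s*([}\]])", r"\1", text)
    return json.loads(text)
```

One known limitation of step 3: a string value that itself contains `,}` or `,]` would be altered, so the regex is a pragmatic cleanup rather than a full parser. Input that still fails raises `json.JSONDecodeError`, which a caller can catch to log the error and retry.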
Ensures shot count matches storyboard frame count:
if len(shot_list) < len(storyboard):
    # Auto-generate missing shots from remaining frames
    for i in range(len(shot_list), len(storyboard)):
        shot_list.append(create_shot_from_frame(storyboard[i]))
- Script Generation: ~15-20 seconds
- Storyboard Generation: ~40-60 seconds (with images)
- Shot List Generation: ~20-30 seconds
- Total Pipeline: ~80-110 seconds (from scratch)
- With User Script: ~60-90 seconds (skips script generation)
Terminal 1 - Backend:
./start_backend.sh
Terminal 2 - Frontend:
./start_frontend.sh
Both backend and frontend support hot reload:
- Backend: Uvicorn --reload flag
- Frontend: Next.js automatic refresh
Backend logs:
# Watch backend terminal for:
Gemini API key loaded
Starting pipeline for project...
Generating script...
Script generated successfully
Frontend console:
// Open DevTools (F12) to see:
Creating new WebSocket connection...
WebSocket connected
WebSocket message: {type: 'progress', ...}
Project data refreshed
fastapi>=0.109.0
uvicorn>=0.27.0
motor>=3.3.2
pydantic>=2.5.3
pydantic-settings>=2.1.0
python-dotenv>=1.0.0
google-generativeai>=0.3.0
httpx>=0.26.0
{
"dependencies": {
"next": "14.x",
"react": "18.x",
"react-dom": "18.x"
},
"devDependencies": {
"tailwindcss": "^3.x",
"autoprefixer": "^10.x",
"postcss": "^8.x"
}
}
- Set DEBUG=False in production
- Use environment variables for all secrets
- Set up MongoDB authentication
- Configure CORS for production domain
- Use process manager (PM2, systemd)
- Set up HTTPS/TLS
- Implement rate limiting
- Add authentication/authorization
- Set up monitoring and alerts
# Example Dockerfile (not yet implemented)
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY app/backend ./app/backend
CMD ["uvicorn", "app.backend.main:app", "--host", "0.0.0.0"]
Contributions are welcome! Areas for improvement:
- Parallel Agents - Run independent tasks concurrently
- Agent Evaluation - Metrics and quality assessment
- Memory Bank - Long-term context retention
- Context Engineering - Compaction for large inputs
- A2A Protocol - Agent-to-agent communication
- Production Deployment - Docker, cloud hosting
- Export Features - PDF, Final Draft, CSV
- File Uploads - Upload existing scripts/storyboards
MIT License - See LICENSE file for details
- Google Gemini - LLM for content generation
- Pollinations.AI - Free text-to-image generation
- FastAPI - Modern Python web framework
- Next.js - React framework
- MongoDB - Database persistence
For issues, questions, or contributions:
- Create an issue in the repository
- Check logs in backend/frontend terminals
- Review console output for debugging
Built with love for filmmakers by filmmakers