AI-Powered Video Production Tool
Video Agent Pro is an AI-powered video storyboard generation and editing tool built with Next.js 15 and multiple AI models (Gemini + Volcano Engine + Sora). It provides both conversational AI (Agent Mode) and fine-grained control (Pro Mode) to help creators produce videos from script to final cut.
⚠️ Note: This project requires user authentication. All data is stored in the cloud (Supabase + Cloudflare R2).
- Enter a script and the AI automatically breaks it down into scenes and shots
- Based on professional 8-principle storyboard rules
- Extracts shot size, camera movement, and descriptions
- AI-generated character design sheets
- Layout: 1/3 face closeup + 2/3 front/side/back views
- Pure white background, official art style
- Powered by Volcano Engine SeeDream 4.0
- Generate 2×2 (4 views) or 3×3 (9 views) storyboard grids
- Multiple aspect ratios: 16:9, 4:3, 21:9, 1:1, etc.
- Style presets: Cinematic, Anime, Realistic, Cyberpunk
- Reference image support for consistency
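The 2×2 and 3×3 grid options suggest that one generated image is cut into equal cells. Here is a minimal sketch of how the crop rectangles could be computed, assuming equal-sized cells with no gutters between them. `Rect` and `sliceRects` are hypothetical names for illustration, not taken from the project's code.

```typescript
// Hypothetical helper: crop rectangles for an n x n storyboard grid.
// Assumes equal-sized cells with no padding between them.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

function sliceRects(imageWidth: number, imageHeight: number, n: number): Rect[] {
  if (n < 1 || Math.floor(n) !== n) {
    throw new Error("grid size must be a positive integer");
  }
  const cellWidth = Math.floor(imageWidth / n);
  const cellHeight = Math.floor(imageHeight / n);
  const rects: Rect[] = [];
  // Row-major order: left to right, then top to bottom.
  for (let row = 0; row < n; row++) {
    for (let col = 0; col < n; col++) {
      rects.push({
        x: col * cellWidth,
        y: row * cellHeight,
        width: cellWidth,
        height: cellHeight,
      });
    }
  }
  return rects;
}
```

Calling `sliceRects(1920, 1080, 2)` yields four 960×540 rectangles, and a grid size of 3 yields nine.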
- GridPreviewModal: Visual preview of full grid and individual slices
- Click to assign slices to specific shots
- Smart auto-suggestion: first N slices → first N shots
- Confirmation before updating shot data
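The auto-suggestion rule (first N slices to first N shots) can be sketched as a simple positional pairing. The `SliceAssignment` shape and the `suggestAssignments` name are assumptions made for this example; the real modal may track more state.

```typescript
// Hypothetical sketch of "first N slices -> first N shots".
interface SliceAssignment {
  sliceIndex: number;
  shotId: string;
}

function suggestAssignments(sliceCount: number, shotIds: string[]): SliceAssignment[] {
  // Pair only as many items as both sides have available.
  const n = Math.max(0, Math.min(sliceCount, shotIds.length));
  return shotIds.slice(0, n).map((shotId, sliceIndex) => ({ sliceIndex, shotId }));
}
```

With a 2×2 grid (4 slices) and two shots, only the first two slices receive a suggestion. The remaining slices stay unassigned until the user clicks them.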
- Image-to-Video generation (4-6 seconds)
- Powered by Volcano Engine SeeDance 1.0 Pro
- Async task processing with progress tracking
- Agent Mode: Conversational AI control with natural language
- Pro Mode: Manual parameter adjustment for fine control
- Seamless mode switching
- Drag-and-drop scene and shot management
- Zoom (50%-200%) and pan controls
- Dot grid background
- Visual status indicators (draft/generating/done/failed)
- Three states: collapsed/default/expanded
- Video and audio tracks
- Time ruler with 5-second intervals
- Playhead indicator
- Preview and export buttons (UI ready)
- Sora Orchestrator - Automated video generation pipeline
- Character Registration - @username-based character consistency
- Dynamic Aspect Ratio - Auto-detect image ratio for optimal output
- Smart Scene Splitting - >15s scenes auto-split into chunks
- Quality Control - Mandated prompts for high-quality output
- R2 Persistence - Automatic upload to Cloudflare R2
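Smart Scene Splitting is described elsewhere in this README as greedy packing. The sketch below shows one way that could work: consecutive shots are packed into a chunk until adding the next shot would exceed the limit. The `Shot` shape, the `splitScene` name, and the choice to keep an over-long single shot in its own chunk are assumptions, not the project's actual implementation.

```typescript
// Hypothetical sketch of greedy packing for scenes longer than 15 seconds.
interface Shot {
  id: string;
  durationSec: number;
}

function splitScene(shots: Shot[], maxChunkSec: number = 15): Shot[][] {
  const chunks: Shot[][] = [];
  let current: Shot[] = [];
  let currentSec = 0;
  for (const shot of shots) {
    // Close the current chunk when the next shot would push it over the limit.
    if (current.length > 0 && currentSec + shot.durationSec > maxChunkSec) {
      chunks.push(current);
      current = [];
      currentSec = 0;
    }
    current.push(shot);
    currentSec += shot.durationSec;
  }
  if (current.length > 0) {
    chunks.push(current);
  }
  return chunks;
}
```

Shot order is preserved. A scene with shots of 6, 6, 6, and 5 seconds is split into two chunks of two shots each, while three 5-second shots fit in a single chunk.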
- Upload audio files (all formats)
- Category classification: Music / Voice / Sound Effects
- Auto-convert to Data URL for storage
- Display and delete functionality
- Supabase Auth Integration - Secure user authentication
- Three-tier Role System - admin (free) / vip (80% off) / user (standard price)
- Credits Management - All AI operations consume credits
- Auto Profile Creation - Profile auto-created on first login
- Session Persistence - Cookie-based session with auto-refresh (client + middleware)
- Server-side Refresh - Middleware refreshes expired access tokens and injects Authorization for API routes
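To make the three-tier pricing concrete, here is a small sketch of how a per-operation credit cost could be derived from a user's role. It reads "80% off" literally (a vip pays 20% of the standard price) and rounds up to whole credits. The literal reading, the rounding, and the names `PRICE_PERCENT` and `creditCost` are all assumptions; see CREDITS_SYSTEM.md for the authoritative rules.

```typescript
// Hypothetical sketch of role-based credit pricing.
type Role = "admin" | "vip" | "user";

// Percentage of the standard price each role pays.
// Assumption: "80% off" is literal, so vip pays 20%.
const PRICE_PERCENT: Record<Role, number> = {
  admin: 0,
  vip: 20,
  user: 100,
};

function creditCost(role: Role, standardCost: number): number {
  // Integer percentages avoid floating-point drift before rounding up.
  return Math.ceil((standardCost * PRICE_PERCENT[role]) / 100);
}
```

Under these assumptions an operation priced at 10 credits costs an admin 0, a vip 2, and a standard user 10.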
- Cancel AI Requests - Stop ongoing AI operations anytime
- Agent Mode Support - Cancel long-running conversations
- Clean Resource Cleanup - Proper cleanup of network requests
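Request cancellation relies on `AbortController`, as the changelog in this README notes. The sketch below illustrates only the general pattern, in which long-running work checks an `AbortSignal` between steps. `runUntilAborted` is a hypothetical name, and the project's real services most likely pass the signal to network calls instead.

```typescript
// Hypothetical sketch of cooperative cancellation with AbortController.
function runUntilAborted(steps: Array<() => void>, signal: AbortSignal): number {
  let completed = 0;
  for (const step of steps) {
    // Stop before starting a new step once cancellation was requested.
    if (signal.aborted) {
      break;
    }
    step();
    completed++;
  }
  return completed;
}

const controller = new AbortController();
const completedSteps = runUntilAborted(
  [
    () => {},
    () => controller.abort(), // simulates the user pressing "stop"
    () => {},
  ],
  controller.signal
);
// completedSteps is 2: the third step never starts.
```

The same `controller.signal` can be passed to `fetch` through its `signal` option, which rejects the pending request when `abort()` is called.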
- Supabase Database - PostgreSQL cloud storage for all data
- Cloudflare R2 - Media file storage (images, videos, audio)
- Chat History Sync - Three-level scope (project/scene/shot)
- Auto-sync - Automatic data synchronization across devices
⚠️ Guest mode is not supported. Login is required to use all features.
cd finalAgent/video-agent-pro
npm install
Create .env.local file:
# Gemini API (for Grid generation)
NEXT_PUBLIC_GEMINI_API_KEY=your_gemini_api_key
# Volcano Engine API
NEXT_PUBLIC_VOLCANO_API_KEY=your_volcano_api_key
NEXT_PUBLIC_VOLCANO_BASE_URL=https://ark.cn-beijing.volces.com/api/v3
# Model Endpoints (create in Volcano Engine Console)
NEXT_PUBLIC_SEEDREAM_MODEL_ID=ep-xxxxxx-xxxxx # Image generation
NEXT_PUBLIC_SEEDANCE_MODEL_ID=ep-xxxxxx-xxxxx # Video generation
NEXT_PUBLIC_DOUBAO_MODEL_ID=ep-xxxxxx-xxxxx # AI conversation
# Supabase (for cloud storage and authentication)
NEXT_PUBLIC_SUPABASE_URL=your_supabase_url
NEXT_PUBLIC_SUPABASE_ANON_KEY=your_supabase_anon_key
SUPABASE_SERVICE_ROLE_KEY=your_service_role_key # Server-side only
# Cloudflare R2 (optional, for file storage)
R2_ACCOUNT_ID=your_account_id
R2_ACCESS_KEY_ID=your_access_key
R2_SECRET_ACCESS_KEY=your_secret_key
R2_BUCKET_NAME=your_bucket_name
R2_PUBLIC_DOMAIN=https://your-domain.r2.dev
Get API Keys:
- Gemini: Google AI Studio
- Volcano Engine: Volcano Engine Console
- Supabase: Supabase Dashboard - Create a new project
npm run dev
Visit http://localhost:3000
- Click "Create New Project" on homepage
- Enter project name and description
- Enter project editing page
- Click "Script" tab in left sidebar
- Input or paste script content
- Click "AI Generate Storyboard"
- AI automatically analyzes and generates scenes and shots
- Click "Characters" tab in left sidebar
- Click "+ Add", fill in character information
- Enter character name, appearance, art style
- Click "AI Generate Character Turnaround"
- Generated image auto-added to reference library
Method 1: Pro Mode (Manual)
- Select a shot on canvas
- Switch to "Pro" mode on right panel
- Select "Grid Multi-View"
- Set Grid size (2x2 or 3x3)
- Set aspect ratio and style preset
- Enter prompt, click "Generate Grid"
- Manually assign slices to shots in preview modal
- Click "Confirm Assignment"
Method 2: Agent Mode (AI Conversation)
- Select shot, switch to "Agent" mode
- Type: "Generate a grid for this shot"
- AI automatically executes generation
Prerequisite: Shot must have Grid image
- Select shot with Grid image
- Switch to "Pro" mode, select "Video Generation"
- Enter video camera movement prompt
- Click "Generate Video", wait 2-3 minutes
- Framework: Next.js 15.1 with App Router + Turbopack
- Frontend: React 19, TypeScript 5.8
- Styling: Tailwind CSS 3.4 (Cinema Dark theme)
- State Management: Zustand + Immer middleware
- Database: Supabase (PostgreSQL) - Cloud only, no local fallback
- Authentication: Supabase Auth (Email/Password + OAuth)
- File Storage: Cloudflare R2 (images, videos, audio)
- AI Models:
- Google Gemini 3 Flash (Agent reasoning, Grid generation)
- Volcano Engine SeeDream 4.0 (Image generation)
- Volcano Engine SeeDance 1.0 Pro (Video generation)
- Sora 2 via Kaponai API (Professional video with character consistency)
- Jimeng (Chinese-optimized image generation)
src/
├── app/                          # Next.js App Router
│   ├── api/                      # API Routes (22+ endpoints)
│   ├── admin/                    # Admin dashboard
│   ├── auth/                     # Authentication pages
│   └── project/[id]/             # Project editing page
├── components/                   # React components (18 directories)
│   ├── layout/                   # Layout (sidebars, panels, timeline)
│   ├── canvas/                   # Infinite canvas
│   ├── agent/                    # Agent components
│   └── chat/                     # Chat interface
├── services/                     # Business services (19+ files)
│   ├── agentService.ts           # Agent core (Function Calling)
│   ├── agentToolDefinitions.ts   # 28 Agent tools
│   ├── geminiService.ts          # Gemini Grid generation
│   ├── SoraOrchestrator.ts       # Sora video orchestration
│   ├── KaponaiService.ts         # Sora API wrapper
│   └── jimengService.ts          # Jimeng integration
├── lib/                          # Core libraries
│   ├── dataService.ts            # Unified data service (1269 lines)
│   ├── storageService.ts         # R2 file upload
│   └── auth-middleware.ts        # Authentication middleware
├── store/                        # Zustand state management
│   └── useProjectStore.ts        # Project state (674 lines)
└── types/                        # TypeScript definitions
    └── project.ts                # Project types (512 lines)
- Grid generation history (per scene)
- Timeline playback with sync
- Drag shots to Timeline
- Video export with audio mixing
- TTS audio generation
- Scene drag & reorder on canvas
- Timeline clip adjustment (trim, reorder)
- Payment integration for credits
- OAuth login (GitHub, Google)
For detailed feature list, see FEATURES.md
- Quick Reference for AI: AGENTS.md - Commands and best practices
- API Architecture: API_ARCHITECTURE.md - API design and authentication
- Authentication System: AUTHENTICATION.md - User auth and roles
- Credits System: CREDITS_SYSTEM.md - Credits pricing and management
- Development Guide: CLAUDE.md - Detailed development philosophy
- Chat Migration: CHAT_STORAGE_MIGRATION.md - Cloud storage migration guide
- Check NEXT_PUBLIC_GEMINI_API_KEY in .env.local
- Ensure network can access Google API
- Verify inference endpoints created in Volcano Engine Console
- Confirm endpoint_id format is correct (ep-xxxxxx-xxxxx)
- Ensure shot has Grid image
- Check NEXT_PUBLIC_DOUBAO_MODEL_ID configuration
Click the button below for one-click deployment:
Manual Deployment Steps:
- Visit Vercel Import
- Connect your GitHub account
- Configure environment variables (see .env.example)
- Click "Deploy"
Required Environment Variables:
- Supabase: NEXT_PUBLIC_SUPABASE_URL, NEXT_PUBLIC_SUPABASE_ANON_KEY, SUPABASE_SERVICE_ROLE_KEY
- R2 Storage: R2_BUCKET_NAME, R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY, R2_ENDPOINT, NEXT_PUBLIC_R2_PUBLIC_URL
- Gemini API: GEMINI_TEXT_API_KEY, GEMINI_IMAGE_API_KEY, GEMINI_AGENT_API_KEY
- Volcano Engine: NEXT_PUBLIC_VOLCANO_API_KEY, model IDs for SeeDream/SeeDance/Doubao
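A missing variable tends to surface only when a feature is first used, so a startup check can save debugging time. The following is a minimal sketch, assuming the environment is available as a plain key/value object. `missingVars` and `REQUIRED_SUPABASE_VARS` are hypothetical names, and only the Supabase variables from the list above are included for brevity.

```typescript
// Hypothetical startup check for required environment variables.
const REQUIRED_SUPABASE_VARS: string[] = [
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
  "SUPABASE_SERVICE_ROLE_KEY",
];

function missingVars(
  env: Record<string, string | undefined>,
  required: string[]
): string[] {
  // A variable counts as missing when it is unset or blank.
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js context the first argument would typically be `process.env`. A non-empty result can be logged, or used to fail fast before the app starts serving requests.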
Post-Deployment:
- Auto-deploy on every push to main branch
- Preview deployments for PRs
- Custom domain configuration available
For detailed instructions, see DEPLOY.md
- ✅ Pure Cloud Architecture - Removed guest mode, all data stored in cloud
- ✅ 28 Agent Tools - Complete CRUD + generation + batch operations
- ✅ Jimeng Integration - Chinese-optimized image generation
- ✅ Location Management - Location reference image generation
- ✅ Planning Mode - Separate tool set for story conception
- ✅ Timeline Video Sync - Progress bar drag auto-switches shots
- ✅ Anti-Override Sync - Smart sync prevents overwriting user selections
- ✅ Sora Sync Performance Optimization - Batch processing, only new tasks (within 30s) write to database
- ✅ Sora Video Always Overwrites - Agent mode videos always overwrite existing shots
- ✅ URL Normalization Fix - Unified filename comparison for task status matching
- ✅ Session Auto-Refresh - Retrieve from Supabase when cookie missing, background refresh before expiry
- ✅ Agent Request Timeout Extended - CONTINUE_TIMEOUT_MS increased from 45s to 90s
- ✅ Sora Registration Sync Mode - register_direct now returns in 1-3 seconds
- ✅ Latest-Video API Optimization - Pure read-only, no more R2 upload delays
- ✅ Auth Middleware Refresh Token - Auto-refresh expired access tokens
- ✅ SmartRecovery Fix - Prevents infinite polling loop on page refresh
- ✅ Sora Registration Optimization - Async registration (<1s response) + Smart Task Recovery
- ✅ Smart Asset Generation - Auto-detects missing assets for imported storyboards
- ✅ Inspiration Auto-Trigger - Auto-starts AI storyboard from homepage inspiration
- ✅ Conflict Resolution - Mutual exclusion between Auto-Gen and Asset-Gen flows
- ✅ UI Refinement - Unified color theme (Zinc/Neutral) for progress indicators
- ✅ Sora Video Generation - Full Sora 2 integration via Kaponai API
- ✅ SoraOrchestrator - Automated pipeline for character registration and video generation
- ✅ Character Consistency - @username-based character tracking across scenes
- ✅ Dynamic Aspect Ratio - Auto-detect image ratio for optimal video output
- ✅ Smart Scene Splitting - >15s scenes auto-split into chunks (Greedy Packing)
- ✅ R2 Persistence - Automatic upload to Cloudflare R2 for video storage
- ✅ User Authentication System - Supabase Auth integration
- ✅ Credits System - Three-tier pricing (admin free, vip 80% off, user standard)
- ✅ Request Cancellation - AbortController support for AI requests
- ✅ Cloud Storage - Supabase PostgreSQL for projects and chat history
- ✅ Chat History Sync - Three-level scope (project/scene/shot) cloud storage
- ✅ Character AI turnaround generation (1/3 face + 2/3 views)
- ✅ GridPreviewModal component for slice preview & manual assignment
- ✅ Pro mode Grid generation integrated with preview modal
- ✅ Audio upload functionality (music/voice/sfx)
- ✅ Canvas zoom and pan
- ✅ Gemini API integration for Grid generation
- ✅ AI Agent conversation system (streaming output)
- ✅ AI storyboard generation (8-principle rules)
- ✅ Timeline editor
MIT License
Developed by 西羊石 Team, assisted by Claude Code + Gemini Code.
Star ⭐ this repo if you find it helpful!