An open-source AI video studio with a 9-agent director pipeline: from a one-line idea to a fully rendered video with voiceover, BGM, and storyboard.
Official Website: https://od.seme.cc
OpenDirector is a Docker-first, self-hosted AI video production studio. Describe your idea in one sentence, and a team of 9 specialized AI agents collaborates to produce a complete video, with optional web research, a storyboard, character designs, voiceover, background music, and rendered output.
Just run `docker compose up` and start creating.
Screenshots: AI Director Chat, Batch Production, Creation Editor, Storyboard Preview.
Your Idea
|
v
[Research Agent] --> [Script Agent] --> [Art Style Agent] --> [Storyboard Agent]
|
v
[Character Agent] --> [Location Agent] --> [Voice Agent] --> [BGM Agent]
|
v
[Media Agent]
|
v
[Render Worker] --> Final Video
9 specialized agents work in a pipeline:
- Research Agent: uses OpenAI `web_search_preview` when needed to check known stories, factual references, brands, products, and source notes
- Script Agent: generates the story outline and narrative structure, using research notes when available
- Art Style Agent: selects from 34 built-in styles (e.g. Futuristic Neon Noir, Dreamscape Watercolor Anime, Documentary Realism)
- Storyboard Agent: breaks the story into scenes with shot descriptions and dialogue
- Character Agent: designs characters with visual prompts and assigns voice profiles
- Location Agent: creates environment concepts for each scene
- Voice Agent: assigns TTS voices matched to character personality and gender
- BGM Agent: generates background music based on story atmosphere
- Media Agent: orchestrates image/voice/music generation into final assets
Each agent is a LangGraph node that streams its output in real time, so you can watch the plan build step by step.
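The node-by-node flow can be illustrated with a small sketch. This is not the LangGraph API and not the project's code: `PlanState`, `AgentNode`, and `runPipeline` are hypothetical names, and the two stub agents stand in for the real ones. It only shows the idea that each node returns a partial state update, which is merged and reported before the next node runs.

```typescript
// Hypothetical, simplified state; the real graph state is not shown in this README.
type PlanState = Record<string, string>;

// A node reads the current state and returns only the keys it wants to update.
type AgentNode = {
  name: string;
  run: (state: PlanState) => PlanState;
};

// Run the nodes in order, merge each update, and report it immediately
// so a UI could render the plan step by step.
function runPipeline(
  nodes: AgentNode[],
  initial: PlanState,
  onUpdate: (agent: string, state: PlanState) => void,
): PlanState {
  let state: PlanState = { ...initial };
  for (const node of nodes) {
    state = { ...state, ...node.run(state) };
    onUpdate(node.name, state);
  }
  return state;
}

// Two stub agents standing in for the Research and Script agents.
const demoNodes: AgentNode[] = [
  { name: "research", run: (s) => ({ research: `notes on: ${s["idea"]}` }) },
  { name: "script", run: (s) => ({ script: `outline using ${s["research"]}` }) },
];

const seenAgents: string[] = [];
const finalState = runPipeline(demoNodes, { idea: "a fox learns to fly" }, (agent) => {
  seenAgents.push(agent);
});
```

After the run, `seenAgents` lists the agents in execution order and `finalState` holds the merged plan.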
The shared graph state now includes a research field with notes, cautions, and sources. The Script Agent consumes those notes without copying source text.
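As a rough sketch of what such a field could look like, the example below uses the three names from the text (notes, cautions, sources). The element types, the `ResearchSource` shape, and the `researchContext` helper are assumptions for illustration, not the project's actual schema. The helper passes notes and cautions to the script prompt and refers to sources by title only, so no source text is copied.

```typescript
// Hypothetical shapes: only the field names notes, cautions, and sources come from the text.
interface ResearchSource {
  title: string;
  url: string;
}

interface Research {
  notes: string[];
  cautions: string[];
  sources: ResearchSource[];
}

// Build prompt context for the Script Agent. Returns an empty string when
// no research was done, so the script prompt is unchanged in that case.
function researchContext(research?: Research): string {
  if (!research || research.notes.length === 0) return "";
  const lines: string[] = ["Research notes:", ...research.notes.map((n) => `- ${n}`)];
  if (research.cautions.length > 0) {
    lines.push("Cautions:", ...research.cautions.map((c) => `- ${c}`));
  }
  if (research.sources.length > 0) {
    // Titles only: source text is never copied into the prompt.
    lines.push(`Sources: ${research.sources.map((s) => s.title).join("; ")}`);
  }
  return lines.join("\n");
}

// Placeholder data for illustration.
const exampleContext = researchContext({
  notes: ["The tortoise wins the race in Aesop's fable."],
  cautions: ["Do not quote translations verbatim."],
  sources: [{ title: "Aesop's Fables overview", url: "https://example.com/aesop" }],
});
```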
- Enter one sentence and the AI director generates a complete plan: brief, story, storyboard, voiceover, images, and BGM
- Optional web research for known stories, factual references, brands, products, and public information
- 34 built-in art styles across 9 categories: Cinematic, Commercial, Futuristic, Retro, Anime, 3D, Illustration, Realistic, Experimental
- AI-generated story scripts, editable manually
- AI voiceover with multiple voice options, real-time preview
- AI background music, auto-generated based on story atmosphere
- Storyboard preview with image + voiceover + BGM synced playback
- Support 16:9 / 9:16 / 1:1 aspect ratios
- Export at 480p / 720p / 1080p
- Enter topics and the AI generates multiple scripts for batch production of short videos
- Configurable clip duration (2-10 seconds) to control how often the footage switches
- Support Chinese and English video scripts
- Multiple TTS voices with built-in Edge TTS (free), real-time preview
- Subtitle generation with customizable font, size, color, position, stroke
- Background music: random or specified local files, adjustable volume
- Video materials are HD and royalty-free (Pexels / Pixabay), local files also supported
- Generate multiple output variations at once, pick the best one
- Multiple AI model providers: OpenAI, Google Gemini, DeepSeek, Qwen, MiniMax, Ollama, and more
- Pluggable media providers: AiHubMix and WaveSpeed, switchable via an environment variable
- Docker one-click deploy: `docker compose up` and you're ready
- Fully self-hosted: data stays on your server
- Chinese and English UI
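
The aspect ratios (16:9, 9:16, 1:1) and export resolutions (480p, 720p, 1080p) listed among the features combine into concrete frame sizes. The sketch below shows one plausible mapping. It assumes that the resolution label names the shorter side in pixels and that the longer side is rounded up to an even number. The `frameSize` helper is hypothetical, and the render worker may compute sizes differently.

```typescript
type AspectRatio = "16:9" | "9:16" | "1:1";
type Resolution = "480p" | "720p" | "1080p";

const RATIO_PARTS: Record<AspectRatio, [number, number]> = {
  "16:9": [16, 9],
  "9:16": [9, 16],
  "1:1": [1, 1],
};

// Assumption: the label (e.g. "720p") is the length of the shorter side.
const SHORT_SIDE: Record<Resolution, number> = {
  "480p": 480,
  "720p": 720,
  "1080p": 1080,
};

function frameSize(ratio: AspectRatio, resolution: Resolution): { width: number; height: number } {
  const [w, h] = RATIO_PARTS[ratio];
  const shortSide = SHORT_SIDE[resolution];
  // Scale the longer side and round up to an even pixel count.
  const longSide = Math.ceil((shortSide * Math.max(w, h)) / Math.min(w, h) / 2) * 2;
  return w >= h
    ? { width: longSide, height: shortSide }
    : { width: shortSide, height: longSide };
}

const landscape480 = frameSize("16:9", "480p"); // { width: 854, height: 480 }
const portrait1080 = frameSize("9:16", "1080p"); // { width: 1080, height: 1920 }
```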
- Docker & Docker Compose
- (Optional) Node.js 20+ for local development
git clone https://github.com/seme-org/open-director.git
cd open-director
cp .env.example .env
# Edit .env with your API keys
docker compose up --build

Then open http://localhost:3000.
| Service | URL | Credentials |
|---|---|---|
| App | http://localhost:3000 | - |
| MinIO Console | http://localhost:9001 | opendirector / opendirector-secret |
| MySQL | localhost:3307 | See `.env.prod` |
| Redis | localhost:6379 | - |
OpenDirector uses WaveSpeed for image generation. Character and location plates use a text-to-image model, while storyboard frames with character references must use an image-to-image/edit model so reference images are honored.
WAVESPEED_API_KEY="your-wavespeed-key"
WAVESPEED_IMAGE_MODEL="nano-banana"
WAVESPEED_IMAGE_TO_IMAGE_MODEL="nano-banana-2-edit"
EDGE_TTS_VOICE="zh-CN-XiaoxiaoNeural"

Do not set `WAVESPEED_IMAGE_TO_IMAGE_MODEL` to `nano-banana`: that alias routes to a text-to-image endpoint and can ignore character reference images.
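
The model choice described here can be sketched as a small guard. `pickImageModel` is a hypothetical helper, not the project's provider code. It assumes the two environment variables and defaults shown above, uses the text-to-image model when there are no reference images, and rejects the known misconfiguration instead of silently ignoring references.

```typescript
type ImageEnv = Record<string, string | undefined>;

// Defaults taken from the configuration shown in this README.
const DEFAULT_TEXT_TO_IMAGE = "nano-banana";
const DEFAULT_IMAGE_TO_IMAGE = "nano-banana-2-edit";

// Hypothetical helper: choose the model for a single image request.
function pickImageModel(env: ImageEnv, referenceImages: string[]): string {
  const textModel = env["WAVESPEED_IMAGE_MODEL"] || DEFAULT_TEXT_TO_IMAGE;
  const editModel = env["WAVESPEED_IMAGE_TO_IMAGE_MODEL"] || DEFAULT_IMAGE_TO_IMAGE;

  // Character and location plates have no references: text-to-image is enough.
  if (referenceImages.length === 0) return textModel;

  // Storyboard frames with references need an image-to-image/edit model.
  if (editModel === "nano-banana") {
    throw new Error(
      "WAVESPEED_IMAGE_TO_IMAGE_MODEL must be an image-to-image/edit model, not nano-banana",
    );
  }
  return editModel;
}
```

Failing fast on the bad value makes the problem visible at request time rather than as storyboard frames that drift from the character designs.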
Speech uses local Edge TTS. Background music uses local tracks from assets/bgm/default/.
The LLM is used for recipe generation, script writing, and the AI director. It uses the OpenAI-compatible API format.
OPENAI_API_KEY="your-key"
OPENAI_BASE_URL="https://api.openai.com/v1"
OPENAI_MODEL="gpt-4o-mini"

Use OpenAI directly or an OpenAI-compatible endpoint for the LLM.
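
To illustrate what "OpenAI-compatible" means in practice, the sketch below turns the three variables into a chat completions request description without sending it. `buildChatRequest` is a hypothetical helper and `https://llm.example.com/v1/` is a placeholder URL. Falling back to `https://api.openai.com/v1` when `OPENAI_BASE_URL` is unset is an assumption, since the project itself documents no default for that variable.

```typescript
type LlmEnv = Record<string, string | undefined>;

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: describe the HTTP request for an OpenAI-compatible endpoint.
function buildChatRequest(env: LlmEnv, messages: ChatMessage[]) {
  const apiKey = env["OPENAI_API_KEY"];
  if (!apiKey) throw new Error("OPENAI_API_KEY is required");

  // Assumed fallback; trailing slashes are trimmed so the path joins cleanly.
  const baseUrl = (env["OPENAI_BASE_URL"] || "https://api.openai.com/v1").replace(/\/+$/, "");

  return {
    url: `${baseUrl}/chat/completions`,
    method: "POST" as const,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: env["OPENAI_MODEL"] || "gpt-4o-mini", messages }),
  };
}

const exampleRequest = buildChatRequest(
  { OPENAI_API_KEY: "your-key", OPENAI_BASE_URL: "https://llm.example.com/v1/" },
  [{ role: "user", content: "Write a one-line video idea." }],
);
```

Switching providers then only changes the base URL, key, and model name, while the request shape stays the same.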
pnpm install
pnpm db:generate
pnpm dev

This starts the Next.js dev server on http://localhost:3000.
| File | Purpose |
|---|---|
| `.env.example` | Documented template for all variables |
| `.env` | Local machine overrides (git-ignored) |
| `.env.prod` | Docker Compose production defaults |
┌─────────────────────────────────────────────────┐
│                   Next.js App                   │
│  (App Router, React 19, TypeScript, Tailwind)   │
└────────┬──────────┬──────────┬──────────────────┘
         │          │          │
     ┌───▼────┐ ┌───▼────┐ ┌───▼────┐
     │ MySQL  │ │ Redis  │ │ MinIO  │
     │  8.4   │ │   7    │ │   S3   │
     └────────┘ └───┬────┘ └────────┘
                    │
             ┌──────▼──────┐
             │   Worker    │
             │ (FFCreator) │
             └─────────────┘
open-director/
├── apps/
│   ├── web/              # Next.js frontend + API routes + 9 AI agents
│   └── render/           # BullMQ render worker (FFCreator)
├── assets/
│   └── fonts/            # Subtitle rendering fonts
├── prisma/
│   └── schema.prisma     # Database schema (voices, art_styles, bgms, etc.)
├── docker-compose.yml
└── package.json
apps/web/src/server/agent/
├── media-provider.ts         # Types + factory + orchestrator
├── schemas/
│   └── research.ts           # Research notes, cautions, and source schema
├── voices.ts                 # TTS voice catalog (loaded from database)
├── art-styles.ts             # Art style catalog (loaded from database)
├── providers/
│   ├── wavespeed.ts          # WaveSpeed implementation
│   ├── aihubmix.ts           # AiHubMix implementation
│   ├── local-bgm.ts          # Local BGM (random track from database)
│   └── wavespeed.test.ts     # Provider tests
└── graph/nodes/recipe/       # 9 LangGraph agent nodes
| Layer | Technology |
|---|---|
| Frontend | Next.js 16, React 19, TypeScript, Tailwind CSS 4 |
| AI | LangChain + LangGraph |
| Database | Prisma + MySQL 8.4 |
| Queue | BullMQ + Redis |
| Storage | MinIO (S3-compatible) |
| Render | FFCreator (FFmpeg-based) |
| Auth | Custom credentials (Prisma-backed) |
| i18n | next-intl (English + Chinese) |
| Route | Description |
|---|---|
| `/` | Landing page |
| `/chat` | AI director studio |
| `/chat/[id]` | Existing conversation |
| `/creation/[id]` | Creation editor (storyboard preview + export) |
| `/space` | User workspace |
| `/batch` | Batch video production |
| `/signin`, `/signup` | Authentication |
| Endpoint | Description |
|---|---|
| `/api/agent-chat` | AI director chat (streaming) |
| `/api/threads` | Thread CRUD |
| `/api/messages` | Message CRUD |
| `/api/assets` | Asset management |
| `/api/recipes/thread/[id]` | Recipe operations |
| `/api/uploads/init`, `/complete` | File upload |
| `/api/render/quick-concat` | Video render |
| `/api/jobs/[id]` | Job status |
| Variable | Default | Description |
|---|---|---|
| `DATABASE_URL` | Set in `.env.prod` | MySQL connection string |
| `REDIS_HOST` | `redis` | Redis host |
| `REDIS_PORT` | `6379` | Redis port |
| `S3_ENDPOINT` | `http://minio:9000` | S3-compatible storage endpoint |
| `S3_ACCESS_KEY_ID` | `opendirector` | S3 access key |
| `S3_SECRET_ACCESS_KEY` | `opendirector-secret` | S3 secret key |
| `S3_BUCKET` | `open-director` | S3 bucket name |
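
The infrastructure variables and their defaults can be read into one typed object, as in the following sketch. `loadInfraConfig` is a hypothetical helper rather than the project's own loader. It applies the defaults from the table and treats `DATABASE_URL` as required because the table lists no literal default for it.

```typescript
type InfraEnv = Record<string, string | undefined>;

// Hypothetical loader: applies the documented defaults and validates the port.
function loadInfraConfig(env: InfraEnv) {
  const databaseUrl = env["DATABASE_URL"];
  if (!databaseUrl) {
    throw new Error("DATABASE_URL is required (Docker Compose sets it in .env.prod)");
  }

  const redisPort = Number(env["REDIS_PORT"] || "6379");
  if (!Number.isInteger(redisPort) || redisPort < 1 || redisPort > 65535) {
    throw new Error(`REDIS_PORT is not a valid port: ${env["REDIS_PORT"]}`);
  }

  return {
    databaseUrl,
    redis: { host: env["REDIS_HOST"] || "redis", port: redisPort },
    s3: {
      endpoint: env["S3_ENDPOINT"] || "http://minio:9000",
      accessKeyId: env["S3_ACCESS_KEY_ID"] || "opendirector",
      secretAccessKey: env["S3_SECRET_ACCESS_KEY"] || "opendirector-secret",
      bucket: env["S3_BUCKET"] || "open-director",
    },
  };
}

// The connection string below is a placeholder, not a real credential.
const exampleInfra = loadInfraConfig({
  DATABASE_URL: "mysql://user:password@localhost:3307/exampledb",
  REDIS_HOST: "localhost",
});
```

The default hostnames `redis` and `minio` match the `REDIS_HOST` and `S3_ENDPOINT` defaults in the table. Running outside Docker would typically mean overriding them, as the example does for Redis.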
| Variable | Default | Description |
|---|---|---|
| `WAVESPEED_API_KEY` | - | WaveSpeed API key |
| `WAVESPEED_IMAGE_MODEL` | `nano-banana` | Text-to-image model for character and location plates |
| `WAVESPEED_IMAGE_TO_IMAGE_MODEL` | `nano-banana-2-edit` | Image-to-image/edit model for storyboard frames with character references |
| `EDGE_TTS_VOICE` | `zh-CN-XiaoxiaoNeural` | Local Edge TTS voice |
| Variable | Default | Description |
|---|---|---|
| `OPENAI_API_KEY` | - | OpenAI-compatible API key |
| `OPENAI_BASE_URL` | - | API base URL |
| `OPENAI_MODEL` | `gpt-4o-mini` | Model name |
| Variable | Default | Description |
|---|---|---|
| `PEXELS_API_KEY` | - | Pexels API key for stock videos |
| `PIXABAY_API_KEY` | - | Pixabay API key for stock videos |
| `BATCH_TTS_PROVIDER` | `edge` | Batch TTS provider |
| `BATCH_EDGE_TTS_VOICE` | `zh-CN-XiaoxiaoNeural` | Batch Edge TTS voice |
docker compose up -d --build

This starts all services: MySQL, Redis, MinIO, web app, and render worker.
pnpm install
pnpm db:generate
pnpm db:migrate
pnpm build
pnpm start

- Fork the repository
- Create a feature branch: `git checkout -b feature/my-feature`
- Commit your changes: `git commit -m "feat: add my feature"`
- Push to the branch: `git push origin feature/my-feature`
- Open a Pull Request
- Run `pnpm typecheck` before committing
- Run `pnpm lint` to check code style
- Follow Conventional Commits for commit messages
- AI Digital Human: talking-head video generation with digital avatars
- Manga Drama: comic panel animation with expression switching and camera effects
- Multi-language voiceover: expand TTS voice catalog with more languages



