A recursive demonstration: the Hermes Agent Kanban system creates a video explaining itself.
Four AI agent profiles collaborate autonomously — a director, cinematographer, renderer, and editor — communicating through structured handoffs, comments, and shared workspace files. One command kicks off the entire pipeline; you come back to a finished video.
A Kanban-driven multi-agent video production pipeline built on Hermes Agent. You describe a video, the system decomposes it into a task graph, and specialized agent profiles execute each stage:
```
You type one command
  → Director decomposes the brief into 9 tasks
  → Cinematographer designs visual language
  → 7 Renderers execute in parallel (ASCII video + p5.js)
  → Editor reviews clips + assembles final cut
  → output/final.mp4
```
The entire pipeline auto-runs with zero human intervention. Every stage promotes automatically when its dependencies complete.
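The promotion rule can be illustrated with a small dependency-graph simulation. This is a minimal sketch, not Hermes code: the task names (`cinematographer`, `render-01` … `render-07`, `editor`) and the helper functions are hypothetical, and it assumes the 9 tasks are one cinematographer task, seven renderer tasks, and one editor task.

```python
def build_graph():
    """Return {task: set(dependencies)} for an assumed 9-task layout."""
    deps = {"cinematographer": set()}
    renderers = [f"render-{i:02d}" for i in range(1, 8)]
    for name in renderers:
        deps[name] = {"cinematographer"}   # each renderer waits for the visual spec
    deps["editor"] = set(renderers)        # the editor waits for every clip
    return deps


def promotion_waves(deps):
    """Group tasks into waves; a task is promoted once all its dependencies are done."""
    done, waves = set(), []
    remaining = dict(deps)
    while remaining:
        ready = sorted(t for t, d in remaining.items() if d <= done)
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return waves


waves = promotion_waves(build_graph())
for number, wave in enumerate(waves, 1):
    print(f"wave {number}: {', '.join(wave)}")
```

Tasks in the same wave have no dependencies on each other, which is why the seven renderer tasks can run in parallel.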
This video is about Kanban — and it was made by Kanban. The first project demonstrates the system by having it create an explainer video about itself. Two-part structure:
- Part 1 (captured with Screen Studio): You type the command, the dashboard lights up, tasks flow between columns
- Part 2 (generated by the pipeline): The actual video the system produced — ASCII art + generative visuals explaining how Kanban works
| Profile | Role | Communication |
|---|---|---|
| Director | Vision, decomposition, final approval | Handoffs to cinematographer, comments on anything |
| Cinematographer | Visual spec, quality gate | Specs to renderers, feedback via comments |
| Renderer | Code execution (ascii-video / p5.js) | Reads parent handoff + comments, writes clips |
| Editor | Pacing review, assembly, final cut | Reviews with AI video analysis, assembles with ffmpeg |
Agents talk to each other through three built-in Kanban channels:
- Structured handoffs: when a task completes with `summary` + `metadata`, every child task automatically receives it in context
- Comments: any profile can write `kanban_comment(task_id=<any_task>)` on any task; the next worker reads the full thread
- Shared workspace: all profiles read/write to the same project directory
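To make the first two channels concrete, here is a minimal sketch of what a worker's context might contain when it picks up a task. The `Task` class, the `worker_context` function, and all field values are hypothetical illustrations; the real Hermes data model may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    parents: list = field(default_factory=list)    # upstream Task objects
    summary: str = ""                               # handoff text written on completion
    metadata: dict = field(default_factory=dict)    # structured handoff data
    comments: list = field(default_factory=list)    # (author, text) tuples


def worker_context(task):
    """Collect what the next worker would read: parent handoffs plus the comment thread."""
    handoffs = [
        {"from": parent.task_id, "summary": parent.summary, "metadata": parent.metadata}
        for parent in task.parents
    ]
    return {"handoffs": handoffs, "comments": list(task.comments)}


# Hypothetical example: the cinematographer finishes, then the director comments on a render task.
spec = Task("cinematographer", summary="Visual spec ready", metadata={"palette": "mono"})
render = Task("render-03", parents=[spec])
render.comments.append(("director", "Keep scene 3 under 15 seconds"))

ctx = worker_context(render)
print(ctx)
```

The third channel, the shared workspace, needs no modeling here: it is simply the project directory that every profile reads and writes.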
- Hermes Agent installed
- OpenRouter API key (for claude-opus-4.6)
- Gateway running (`hermes gateway start`)
```bash
chmod +x setup.sh
./setup.sh
```

This creates 4 profiles (director, cinematographer, renderer, editor), writes their personalities, sets up the project structure, and initializes the Kanban board.
```bash
hermes kanban create "Make a video explaining how Hermes Kanban works" \
  --assignee director \
  --workspace dir:$(pwd)/kanban-launch-video \
  --tenant video-pipeline \
  --priority 2 \
  --max-runtime 4h \
  --body "Read brief.md. Decompose into: cinematographer → renderers → editor. All child tasks MUST use workspace_kind='dir' and workspace_path='$(pwd)/kanban-launch-video'. Pass tenant='video-pipeline' on every create."
```

```bash
hermes kanban watch --tenant video-pipeline    # live events
hermes kanban list --tenant video-pipeline     # board state
hermes dashboard                               # visual board (Kanban tab)
```

```
├── README.md                  ← you are here
├── ARCHITECTURE.md            ← system design, communication model
├── RUN.md                     ← step-by-step execution guide
├── brief.md                   ← the video specification
├── setup.sh                   ← one-shot setup script
├── taste/
│   ├── brand-guide.md         ← visual language rules
│   └── emotional-dna.md       ← the creative soul document
├── tools/
│   └── media_analyze.py       ← AI audio/video analysis CLI (Gemini via OpenRouter)
└── render-scripts/            ← what the agents actually wrote (reference)
    ├── scene-01/render.py     ← ASCII video (Python + NumPy + ffmpeg)
    ├── scene-02/render.py
    ├── scene-03/sketch.html   ← p5.js generative (HTML + Puppeteer)
    ├── scene-04/render.py
    ├── scene-05/sketch.html
    ├── scene-06/render.py
    └── scene-07/render.py
```
The pipeline produces:
- `output/final.mp4`: 1920×1080 @ 24fps, ~2 minutes, with audio
- `output/final-noaudio.mp4`: silent version for re-scoring
- Individual scene clips at `scenes/scene-NN/clip.mp4`
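As a rough illustration of how those three outputs relate, the sketch below builds (but does not run) the kind of ffmpeg invocations an assembly step could use: the concat demuxer to join the scene clips into the silent cut, then a second pass to mux audio. This is an assumption about the approach, not the editor agent's actual commands; the file names `clips.txt` and `audio.mp3` are hypothetical.

```python
import shlex


def concat_list(scene_count, root="scenes"):
    """Build the text of an ffmpeg concat-demuxer list for scene clips 01..scene_count."""
    lines = []
    for n in range(1, scene_count + 1):
        clip = f"{root}/scene-{n:02d}/clip.mp4"
        lines.append(f"file '{clip}'")
    return "\n".join(lines) + "\n"


def assembly_commands(list_path="clips.txt", audio="audio.mp3"):
    """Return (silent_cut_cmd, final_cut_cmd) as argument lists."""
    # Stream copy only works if all clips share codec, resolution, and frame rate.
    silent = ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_path,
              "-c", "copy", "output/final-noaudio.mp4"]
    # Copy the video stream, encode the audio as AAC, stop at the shorter input.
    final = ["ffmpeg", "-i", "output/final-noaudio.mp4", "-i", audio,
             "-c:v", "copy", "-c:a", "aac", "-shortest", "output/final.mp4"]
    return silent, final


listing = concat_list(7)
silent_cmd, final_cmd = assembly_commands()
print(listing)
print(shlex.join(silent_cmd))
print(shlex.join(final_cmd))
```

Relative paths in a concat list are resolved against the list file's location, so this layout assumes `clips.txt` sits in the workspace root next to `scenes/`.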
See ARCHITECTURE.md for:
- Full communication model diagram
- Task graph with dependency links
- Iteration loops (cinematographer ↔ renderer, editor ↔ renderer)
- Renderer flexibility (swap ascii-video / p5js / remotion per task)
- Media analysis tool usage
| Phase | Duration | What Happens |
|---|---|---|
| Director | ~3 min | Reads brief, creates 9 tasks |
| Cinematographer | ~8 min | Designs visual spec for all scenes |
| Renderers (×7 parallel) | ~20 min | ASCII/p5.js rendering |
| Editor | ~10 min | Reviews, assembles, muxes audio |
| Total | ~45 min | Fully autonomous |
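The arithmetic behind the table can be checked with a short calculation. It assumes each of the seven renderers takes roughly the 20 minutes listed, so a serial run would spend 7 × 20 minutes on rendering; that per-renderer figure is an interpretation of the table, not something the source states.

```python
# (phase, approximate minutes, number of parallel workers)
phases = [
    ("Director", 3, 1),
    ("Cinematographer", 8, 1),
    ("Renderers", 20, 7),
    ("Editor", 10, 1),
]

# Wall-clock time when the renderers run side by side: each phase counts once.
parallel_wall = sum(minutes for _, minutes, _ in phases)

# Wall-clock time if every renderer ran one after another (assumed ~20 min each).
serial_wall = sum(minutes * workers for _, minutes, workers in phases)

listed_total = 45
print(f"itemized phases, parallel: {parallel_wall} min")
print(f"same work, serial:         {serial_wall} min")
print(f"not itemized in the table: {listed_total - parallel_wall} min")
```

The itemized phases sum to a little less than the listed total; the table does not say where the remaining minutes go, though dispatch and handoff latency between stages is a plausible guess.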
To make your own video:
- Edit `brief.md` with your video spec
- Replace `taste/brand-guide.md` with your visual language
- Drop your audio into the workspace
- Run `setup.sh` → fire the kanban command
- Come back to a video
The renderer profile has both `ascii-video` and `p5js` skills; use `--skill <name>` on individual tasks to control which renderer is used per scene.
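A per-scene skill plan could be written down as a simple mapping. The sketch below mirrors the reference `render-scripts/` layout, where scenes 3 and 5 are p5.js sketches and the rest are ASCII video; the helper names are hypothetical, and how the `--skill` flag combines with other task arguments is not shown in this README, so only the flag pair itself is produced.

```python
P5JS_SCENES = {3, 5}  # scenes whose reference script is sketch.html


def skill_for(scene):
    """Pick the renderer skill for a scene number."""
    return "p5js" if scene in P5JS_SCENES else "ascii-video"


def skill_flag(scene):
    """Return the flag pair to attach to that scene's render task."""
    return ["--skill", skill_for(scene)]


plan = {f"scene-{n:02d}": skill_for(n) for n in range(1, 8)}
for scene, skill in plan.items():
    print(f"{scene}: --skill {skill}")
```

Changing the visual treatment of a scene then means editing one entry in `P5JS_SCENES` rather than touching the renderer profile.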
MIT