NousResearch/kanban-video-pipeline

Kanban Video Pipeline — Multi-Agent Creative Production

A recursive demonstration: the Hermes Agent Kanban system creates a video explaining itself.

Four AI agent profiles collaborate autonomously — a director, cinematographer, renderer, and editor — communicating through structured handoffs, comments, and shared workspace files. One command kicks off the entire pipeline; you come back to a finished video.

What This Is

A Kanban-driven multi-agent video production pipeline built on Hermes Agent. You describe a video; the system decomposes it into a task graph, and specialized agent profiles execute each stage:

You type one command
    → Director decomposes the brief into 9 tasks
        → Cinematographer designs visual language
            → 7 Renderers execute in parallel (ASCII video + p5.js)
                → Editor reviews clips + assembles final cut
                    → output/final.mp4

The entire pipeline auto-runs with zero human intervention. Every stage promotes automatically when its dependencies complete.
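
The promotion rule can be sketched as a small simulation. This is a toy model, not Hermes code: the `Task` class, the status names, and the task names are hypothetical, and it assumes the nine tasks are one cinematographer task, seven renderer tasks, and one editor task, which matches the diagram above.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    parents: list = field(default_factory=list)  # names of tasks this one waits on
    status: str = "blocked"  # hypothetical states: blocked -> ready -> done


def build_graph(n_renderers=7):
    """Build the task shape from the diagram: cinematographer, renderers, editor."""
    tasks = {"cinematographer": Task("cinematographer")}
    renderers = [f"render-{i:02d}" for i in range(1, n_renderers + 1)]
    for name in renderers:
        tasks[name] = Task(name, parents=["cinematographer"])
    tasks["editor"] = Task("editor", parents=renderers)
    return tasks


def promote(tasks):
    """Mark every blocked task whose parents are all done as ready."""
    promoted = []
    for task in tasks.values():
        if task.status == "blocked" and all(tasks[p].status == "done" for p in task.parents):
            task.status = "ready"
            promoted.append(task.name)
    return promoted


def run_to_completion(tasks):
    """Promote, 'run' every ready task, and repeat. Returns the waves of work."""
    waves = []
    while True:
        wave = promote(tasks)
        if not wave:
            break
        for name in wave:
            tasks[name].status = "done"  # stand-in for an agent finishing the task
        waves.append(wave)
    return waves
```

Running `run_to_completion(build_graph())` yields three waves: the cinematographer alone, then the seven renderers together, then the editor.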

The Recursive Concept

This video is about Kanban, and it was made by Kanban. The first project demonstrates the system by having it create an explainer video about itself. The demo has two parts:

  1. Part 1 (captured with Screen Studio): You type the command, the dashboard lights up, tasks flow between columns
  2. Part 2 (generated by the pipeline): The actual video the system produced — ASCII art + generative visuals explaining how Kanban works

The Team

| Profile | Role | Communication |
| --- | --- | --- |
| Director | Vision, decomposition, final approval | Handoffs to cinematographer, comments on anything |
| Cinematographer | Visual spec, quality gate | Specs to renderers, feedback via comments |
| Renderer | Code execution (ascii-video / p5.js) | Reads parent handoff + comments, writes clips |
| Editor | Pacing review, assembly, final cut | Reviews with AI video analysis, assembles with ffmpeg |

Cross-Profile Communication

Agents talk to each other through three built-in Kanban channels:

  • Structured handoffs — when a task completes with summary + metadata, every child task automatically receives it in context
  • Comments — any profile can call kanban_comment(task_id=<any_task>) on any task; the next worker to pick up that task reads the full thread
  • Shared workspace — all profiles read/write to the same project directory
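
To make the first two channels concrete, here is a minimal in-memory sketch of how a worker's context could be assembled from parent handoffs and the comment thread. Every name here (`Task`, `complete`, `comment`, `build_context`) is hypothetical and chosen for illustration; it does not reproduce the Hermes Kanban API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    task_id: str
    assignee: str
    parents: list = field(default_factory=list)   # parent Task objects
    handoff: Optional[dict] = None                 # summary + metadata, set on completion
    comments: list = field(default_factory=list)  # (author, text) pairs


def complete(task, summary, **metadata):
    """Record the structured handoff a task leaves behind when it finishes."""
    task.handoff = {"summary": summary, "metadata": metadata}


def comment(task, author, text):
    """Any profile may append to any task's thread."""
    task.comments.append((author, text))


def build_context(task):
    """What the next worker on `task` would see in this toy model."""
    return {
        "parent_handoffs": [p.handoff for p in task.parents if p.handoff is not None],
        "comments": list(task.comments),
    }


spec = Task("t1", "cinematographer")
clip = Task("t2", "renderer", parents=[spec])
complete(spec, "Visual spec ready", palette="mono")
comment(clip, "director", "Keep it under 15 seconds")
ctx = build_context(clip)
```

In this model the renderer never queries the cinematographer directly: the handoff arrives through the parent link, and the director's note arrives through the thread.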

Quick Start

Prerequisites

  • Hermes Agent installed
  • OpenRouter API key (for claude-opus-4.6)
  • Gateway running (hermes gateway start)

Setup

chmod +x setup.sh
./setup.sh

This creates 4 profiles (director, cinematographer, renderer, editor), writes their personalities, sets up the project structure, and initializes the Kanban board.

Run

hermes kanban create "Make a video explaining how Hermes Kanban works" \
    --assignee director \
    --workspace "dir:$(pwd)/kanban-launch-video" \
    --tenant video-pipeline \
    --priority 2 \
    --max-runtime 4h \
    --body "Read brief.md. Decompose into: cinematographer → renderers → editor. All child tasks MUST use workspace_kind='dir' and workspace_path='$(pwd)/kanban-launch-video'. Pass tenant='video-pipeline' on every create."
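
If you would rather launch from a script, building the argument list in Python sidesteps the nested shell quoting. The sketch below uses only the flags shown in the command above; the helper name `build_create_command` is hypothetical.

```python
import os
import shlex


def build_create_command(title, workspace, tenant="video-pipeline"):
    """Return the argv list for the `hermes kanban create` call shown above."""
    body = (
        "Read brief.md. Decompose into: cinematographer → renderers → editor. "
        f"All child tasks MUST use workspace_kind='dir' and workspace_path='{workspace}'. "
        f"Pass tenant='{tenant}' on every create."
    )
    return [
        "hermes", "kanban", "create", title,
        "--assignee", "director",
        "--workspace", f"dir:{workspace}",
        "--tenant", tenant,
        "--priority", "2",
        "--max-runtime", "4h",
        "--body", body,
    ]


if __name__ == "__main__":
    workspace = os.path.join(os.getcwd(), "kanban-launch-video")
    argv = build_create_command("Make a video explaining how Hermes Kanban works", workspace)
    # Print the command for inspection; to execute it, pass argv to
    # subprocess.run(argv, check=True) instead.
    print(shlex.join(argv))
```

Passing a list to `subprocess.run` means a workspace path containing spaces reaches the CLI as a single argument.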

Monitor

hermes kanban watch --tenant video-pipeline    # live events
hermes kanban list --tenant video-pipeline     # board state
hermes dashboard                                # visual board (Kanban tab)

Project Structure

├── README.md                     ← you are here
├── ARCHITECTURE.md               ← system design, communication model
├── RUN.md                        ← step-by-step execution guide
├── brief.md                      ← the video specification
├── setup.sh                      ← one-shot setup script
├── taste/
│   ├── brand-guide.md            ← visual language rules
│   └── emotional-dna.md          ← the creative soul document
├── tools/
│   └── media_analyze.py          ← AI audio/video analysis CLI (Gemini via OpenRouter)
└── render-scripts/               ← what the agents actually wrote (reference)
    ├── scene-01/render.py        ← ASCII video (Python + NumPy + ffmpeg)
    ├── scene-02/render.py
    ├── scene-03/sketch.html      ← p5.js generative (HTML + Puppeteer)
    ├── scene-04/render.py
    ├── scene-05/sketch.html
    ├── scene-06/render.py
    └── scene-07/render.py

Output

The pipeline produces:

  • output/final.mp4 — 1920×1080 @ 24fps, ~2 minutes, with audio
  • output/final-noaudio.mp4 — silent version for re-scoring
  • Individual scene clips at scenes/scene-NN/clip.mp4
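
A quick way to confirm that a finished file matches the stated resolution and frame rate is to ask ffprobe for the first video stream's properties. This is a sketch that assumes ffprobe is on your PATH; the function names are hypothetical and are not part of the repository.

```python
import json
import subprocess
from fractions import Fraction


def probe_video(path):
    """Return ffprobe's JSON description of the first video stream in `path`."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=width,height,r_frame_rate",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def matches_spec(probe, width=1920, height=1080, fps=24):
    """Check a probe result against the expected width, height, and frame rate."""
    stream = probe["streams"][0]
    return (
        stream["width"] == width
        and stream["height"] == height
        # ffprobe reports frame rate as a fraction string such as "24/1"
        and Fraction(stream["r_frame_rate"]) == fps
    )
```

For example, `matches_spec(probe_video("output/final.mp4"))` checks the final cut against 1920×1080 at 24 fps.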

How It Works (Deep Dive)

See ARCHITECTURE.md for:

  • Full communication model diagram
  • Task graph with dependency links
  • Iteration loops (cinematographer ↔ renderer, editor ↔ renderer)
  • Renderer flexibility (swap ascii-video / p5js / remotion per task)
  • Media analysis tool usage

Timing

| Phase | Duration | What Happens |
| --- | --- | --- |
| Director | ~3 min | Reads brief, creates 9 tasks |
| Cinematographer | ~8 min | Designs visual spec for all scenes |
| Renderers (×7 parallel) | ~20 min | ASCII/p5.js rendering |
| Editor | ~10 min | Reviews, assembles, muxes audio |
| Total | ~45 min | Fully autonomous |
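
The arithmetic behind these figures, and a rough sense of what the parallel renderer stage saves, can be worked out as follows. The serial estimate assumes each of the seven renderer tasks takes about as long as the whole renderer phase (~20 min); that is an illustrative assumption, not a measurement from the source.

```python
# Approximate phase durations in minutes, taken from the timing figures above.
phases = {"director": 3, "cinematographer": 8, "renderers": 20, "editor": 10}
n_renderers = 7

# With the renderers running in parallel, the stages run back to back.
parallel_total = sum(phases.values())

# Assumption: run one after another, each renderer would take ~20 min.
serial_total = parallel_total + (n_renderers - 1) * phases["renderers"]

print(f"parallel: ~{parallel_total} min, serial estimate: ~{serial_total} min")
```

The phases sum to 41 minutes, a little under the listed ~45 minute total; the source does not break down the difference. Under the stated assumption, a fully serial run would take roughly 161 minutes.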

Adapting This

To make your own video:

  1. Edit brief.md with your video spec
  2. Replace taste/brand-guide.md with your visual language
  3. Drop your audio into the workspace
  4. Run setup.sh, then run the hermes kanban create command from the Run section
  5. Come back to a video

The renderer profile has both ascii-video and p5js skills — use --skill <name> on individual tasks to control which renderer is used per scene.

License

MIT
