Skip to content

hallelx2/video-studio

Repository files navigation

Video Studio

Video Studio — 2-minute walkthrough

Prefer audio + full quality? Download the MP4 (5.4 MB).

A desktop agent-driven video studio. Compose, narrate, and render videos through a single in-app conversation with a Claude agent that drafts scripts, lays out HyperFrames compositions, and renders Chromium-safe MP4s — all without leaving the window.

License: MIT Release CI Electron React TypeScript Vite Tailwind Claude Agent SDK HyperFrames Conventional Commits


What it does

Video Studio is the cockpit for an agent that ships finished videos. You describe what you want; the agent turns that into:

  • A script drafted with the persona, tone, and beat structure you set.
  • A narration track generated through Kokoro TTS — runs entirely on your machine, no per-character billing.
  • HTML compositions authored as HyperFrames documents — multi-aspect (1080×1080, 1920×1080, 9:16, …) with shared media and per-aspect tuning.
  • A rendered MP4 post-processed into Chromium-safe H.264 / yuv420p so the in-app preview can stream the result without a transcode dance.

Every step is interruptible. The agent's reasoning streams into an activity feed, tool calls are surfaced inline, and the user can correct course at any approval point — /redraft, /renarrate, /recompose, or /rerender re-enter the pipeline at exactly the right stage instead of starting over.

Highlights

  • In-app preview — clicking preview spawns the HyperFrames dev server and renders it inside a slide-in panel. No external browser windows ever pop. The OS browser remains the explicit escape hatch via open in browser ↗.
  • Stage-retry slash commands/redraft → /renarrate → /recompose → /rerender cascade left-to-right, so a tweak high in the pipeline invalidates everything downstream.
  • Composio design system — pitch-black canvas, white-opacity hairline borders, rationed Electric Cyan / Composio Cobalt accents. Dual-font identity: geometric sans + JetBrains Mono. See DESIGN.md.
  • Resilient render path — rendered MP4s are post-processed into Chromium-safe profiles, with a one-click fix codec recovery for files that arrived encoded outside the safe set.
  • Custom studio-media:// protocol — the renderer streams arbitrary local files through Electron without exposing the filesystem to the renderer process.
  • Type-safe IPC bridge — every renderer ↔ main call is defined once in electron/types.ts and consumed identically on both sides.
  • GitHub-backed auto-update — installed builds check this repo's releases on launch and hourly thereafter; new versions download in the background and the user is prompted to restart when ready.

Stack

Layer Tech
Shell Electron 33, TypeScript
Renderer React 19, Vite 8, Tailwind 4
Agent Claude Agent SDK (Opus 4.7 · 1M context)
Composition HyperFrames HTML compositions, GSAP timelines
Rendering HyperFrames render → FFmpeg post-process (H.264 / yuv420p)
Audio Kokoro TTS — local, on-device narration (no API key, no metering)
Packaging electron-builder — Windows, macOS, Linux

Getting started

Prerequisites

  • Node.js ≥ 22
  • pnpm ≥ 9npm i -g pnpm
  • FFmpeg on PATH — required by HyperFrames render and the post-process step
  • Python ≥ 3.10 with kokoro-onnx + soundfile — drives narration synthesis. Install once:
    pip install kokoro-onnx soundfile
    Then point Settings → Python interpreter at the matching python.exe / python3 if it isn't your default. The first narration run downloads the voice model (~330 MB) and caches it.
  • An Anthropic API key for the Claude Agent SDK

Install & run

git clone https://github.com/hallelx2/video-studio.git
cd video-studio
pnpm install
pnpm dev

pnpm dev boots four concurrent processes (Vite, agent watch, electron tsc watch, electron run) and opens the studio window. Settings → API keys when you first launch.

Bundle for distribution

pnpm bundle:win     # Windows nsis installer + portable
pnpm bundle:mac     # macOS dmg
pnpm bundle:linux   # AppImage + deb
pnpm bundle         # current platform only

Artifacts land in dist-electron/.

Repository layout

video-studio/
├── electron/             main / preload / agent bridge / config / IPC types
├── agent/                @video-studio/agent — runs the Claude Agent SDK loop
├── src/                  renderer (React 19 + Vite + Tailwind)
│   ├── components/agent  ActivityStream, InlineApproval, PreviewPanel, …
│   ├── components/ui     design-system primitives
│   ├── lib               preview-context, agent-client, media-url
│   └── routes            Home, Workbench, Projects, Settings, Onboarding
├── scripts/              one-off codemods + dev launchers
├── composio/             upstream design-system reference
├── DESIGN.md             design system spec — single source of truth
└── .github/workflows/    release-please + cross-platform build jobs

Conventions

  • Conventional commitsfeat:, fix:, chore:, refactor:, docs:, test:. Release-please consumes these to compute the next version and write CHANGELOG.md.
  • Design tokens first — never hardcode a color or spacing value in a component. If a token is missing, add it to src/index.css and document it in DESIGN.md first.
  • Type-safe IPC — every new IPC method goes through StudioBridge in electron/types.ts and is exposed in electron/preload.ts.
  • Comments earn their keep — only when why is non-obvious. Don't restate what the code already says.

Releases

Releases are managed by release-please. Every push to main either:

  1. Opens / updates a release PR with the next version, an entry-by-entry changelog, and a manifest bump — driven entirely by the conventional-commit subjects since the last tag.
  2. On merge of that PR, tags vX.Y.Z, drafts a GitHub Release, and the cross-platform build workflow attaches Windows / macOS / Linux installers as release assets.

You don't bump package.json by hand — release-please does it.

License

MIT © Halleluyah Oludele. See LICENSE.


Built with Claude Code ✺ Co-authored by Claude Opus 4.7 (1M context).

About

Desktop agent-driven video studio — Electron + Claude Agent SDK + HyperFrames. Compose, narrate, render videos through a single in-app conversation.

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors