dliangthinks/iExplain

Remotion Video App

A macOS app that turns a narration script into a motion-graphics video. Each sentence becomes a scene with a theme-aware layout, generated text-to-speech audio, and a live preview. Export to MP4 when ready.

Usage

npm install
(cd packages/pipeline && npm run build)
(cd packages/shell && npx electron-vite dev)

UI overview

┌─────────────────────────────────────────────────┐
│ [☰ Menu]              [Theme]  [⤓ Render]       │  TopBar
├──────┬──────────────────────────────────┬───────┤
│ ▢ 1  │  ┌───────────────────────────┐   │       │
│ ▢ 2◄ │  │   Scene preview (webview) │   │ Theme │
│ ▢ 3  │  └───────────────────────────┘   │ panel │
│ ▢ 4  │  ┌──────────────────┬─────────┐  │ (tog- │
│ ▢ 5  │  │                  │ Layout ▾│  │ gle)  │
│      │  │  Scene script    │ ⟳ Regen │  │       │
│      │  │                  │         │  │       │
│      │  └──────────────────┴─────────┘  │       │
└──────┴──────────────────────────────────┴───────┘
  • Top bar: project menu (left), Theme toggle and Render button (right).
  • Left sidebar: one tile per scene (numbered, color-coded by archetype, with a thumbnail when the scene has been rendered). Click to seek the preview.
  • Center: live @remotion/player preview of the selected scene + per-scene script editor + scene-level layout dropdown + scene-level Regenerate button.
  • Right (toggleable): theme picker. Selecting a different theme reveals an Apply theme button that runs a deterministic theme swap.

Deterministic vs AI behaviors

The app makes a clear split between operations that cost AI tokens and operations that don't. The pipeline's change-detector chooses the cheapest path that still produces a correct result.

Deterministic — no Claude tokens consumed

Operation                               What runs
--------------------------------------  ------------------------------------------------------------
Open / save / list projects             Filesystem read/write of project.json
Archetype classification + layout pick  archetype-mapper.ts heuristics (numbers→stats, ?→rhetorical, …)
Theme swap (theme-only change)          theme-patcher.ts: string-replaces theme imports in the composition
Preview server                          npx vite serving the project's player.html
Render to MP4                           npx remotion render
Scene thumbnails                        npx remotion still per scene
Composition update on add/remove        MyVideo.tsx regenerated by template logic
TTS audio                               Calls the Inworld API (costs Inworld credits, not Claude tokens)
Scene removal                           Delete scene file + composition update
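
The theme swap row above describes a plain string replacement, which can be sketched as follows. This is a minimal illustration, not the contents of theme-patcher.ts: the function name patchThemeImports and the "./themes/<name>" import convention are assumptions made for the example.

```typescript
// Hypothetical sketch: swap the theme module in a composition's import lines.
// Assumes imports look like: import { theme } from "./themes/<name>";
function patchThemeImports(source: string, oldTheme: string, newTheme: string): string {
  // Escape regex metacharacters so the theme name is matched literally.
  const escaped = oldTheme.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Match `from "<anything>themes/<oldTheme>"` in either quote style.
  const pattern = new RegExp(`(from\\s+["'][^"']*themes/)${escaped}(["'])`, "g");
  // Keep the path prefix and the closing quote, replace only the theme name.
  return source.replace(pattern, (_match, prefix: string, quote: string) => `${prefix}${newTheme}${quote}`);
}
```

Because the edit touches only import specifiers, scene code and audio are left alone, which is why this path needs neither AI nor TTS.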

AI — consumes Claude API tokens via the Claude Code CLI

Operation                                What runs
---------------------------------------  ----------------------------------------------------------
Initial scene generation                 One full Claude Code session writes all scene .tsx files
Edit scene text → regenerate that scene  Targeted regen: only the changed scenes are sent to the AI
Change a scene's layout                  Targeted regen: affected scenes only
Add a new scene                          Targeted regen for added scenes + composition update

The change detector returns a ChangeSet describing exactly which scenes need AI work, which need new TTS, and whether a theme swap can shortcut everything else. Theme-only changes never go through AI; sentence edits never re-render unaffected scenes.

Design decisions

Two-package split (pipeline vs shell)

The app is split into @remotion-app/pipeline (pure Node.js) and @remotion-app/shell (Electron). The pipeline has zero Electron imports — it communicates through EventEmitter events and method calls.

Why: The shell may change. Today it's Electron; tomorrow it could be a native macOS app (Swift + WKWebView) or a Tauri app. By keeping all video production logic in a standalone Node.js package, a shell swap only requires rewriting the thin IPC bridge — not the scaffolding, TTS, AI generation, preview server, rendering, or thumbnail code.

The coupling point is a single file: packages/shell/src/main/ipc-bridge.ts. It's the only file that imports both Electron and the pipeline.
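
A rough sketch of that boundary, assuming Node's built-in EventEmitter. The class name Pipeline, the generate method, the event names, and the channel names are all hypothetical; the point is only that the pipeline side emits events and the shell side forwards them.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical payload; the real pipeline's event names and shapes may differ.
interface ProgressEvent {
  stage: string;
  percent: number;
}

// Pipeline side: plain Node.js, no Electron imports.
class Pipeline extends EventEmitter {
  generate(sceneCount: number): void {
    for (let i = 1; i <= sceneCount; i++) {
      const event: ProgressEvent = {
        stage: `scene ${i}`,
        percent: Math.round((i / sceneCount) * 100),
      };
      this.emit("progress", event);
    }
    this.emit("done");
  }
}

// Shell side: a bridge subscribes and forwards. The `send` callback stands in
// for whatever IPC call the shell uses (an assumption for this sketch).
function attachBridge(
  pipeline: Pipeline,
  send: (channel: string, payload?: unknown) => void,
): void {
  pipeline.on("progress", (e: ProgressEvent) => send("pipeline:progress", e));
  pipeline.on("done", () => send("pipeline:done"));
}
```

Swapping the shell would mean rewriting only something like attachBridge; the Pipeline class stays untouched.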

Pluggable providers (audio and AI)

Both audio generation and AI scene generation are behind interfaces, not hardcoded to any vendor.

Why: These are the two external dependencies most likely to change. TTS is a commodity — Inworld today, ElevenLabs or user-recorded audio tomorrow. AI scene generation is Claude Code today, but could be a different CLI, a direct API call, or a local model. By putting both behind interfaces (AudioProvider and AIProvider), swapping a vendor means writing one new class, not touching the pipeline or shell.

Built-in audio providers:

  • InworldProvider — calls the Inworld TTS API, writes MP3s, measures durations
  • FileAudioProvider — copies user-supplied MP3s, measures durations

Built-in AI providers:

  • ClaudeCodeProvider — spawns the Claude Code CLI with --print --output-format stream-json --verbose. Captures the session ID from the system.init event and passes it as --resume on subsequent turns for multi-turn context.
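
The session handling can be sketched as two small helpers. This is an illustration under assumptions, not the provider's actual code: it assumes the stream is newline-delimited JSON and that the init event carries type "system", subtype "init", and a session_id string. The helper names are made up for the example; the CLI flags are the ones listed above.

```typescript
// Find the session ID in newline-delimited JSON output.
// Assumed event shape: {"type":"system","subtype":"init","session_id":"..."}
function extractSessionId(streamOutput: string): string | undefined {
  for (const line of streamOutput.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    let event: unknown;
    try {
      event = JSON.parse(trimmed);
    } catch {
      continue; // Skip partial or non-JSON lines.
    }
    if (typeof event !== "object" || event === null) continue;
    const fields = event as Record<string, unknown>;
    if (
      fields.type === "system" &&
      fields.subtype === "init" &&
      typeof fields.session_id === "string"
    ) {
      return fields.session_id;
    }
  }
  return undefined;
}

// Build the CLI arguments; --resume is added only once a session ID is known.
function buildArgs(sessionId?: string): string[] {
  const args = ["--print", "--output-format", "stream-json", "--verbose"];
  if (sessionId) args.push("--resume", sessionId);
  return args;
}
```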

Smart change detection

change-detector.ts compares the current project state against the last generated snapshot and returns a ChangeSet. The pipeline routes work to the cheapest path that produces a correct result:

  • No changes → skip everything.
  • Theme only → patchTheme (deterministic file edit, no AI, no TTS).
  • Sentence edits → new TTS + AI scene regen for affected scenes.
  • Layout-only edits → AI scene regen for affected scenes (no new TTS).
  • Added scenes → TTS + AI for new scenes + composition update.
  • Removed scenes → delete scene files + composition update (no AI).

Thumbnails are regenerated at the end of any change so the sidebar stays in sync.
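
The routing rules above could be sketched roughly like this. The ChangeSet fields, the Scene shape, and the assumption that scenes carry stable ids are all hypothetical; change-detector.ts may model this differently.

```typescript
// Hypothetical shapes for illustration only.
interface Scene { id: string; sentence: string; layout: string }
interface ProjectState { theme: string; scenes: Scene[] }
interface ChangeSet {
  themeChanged: boolean;
  needsTts: string[];          // scene ids that need new audio
  needsAi: string[];           // scene ids that need AI regeneration
  removed: string[];           // scene ids whose files should be deleted
  compositionChanged: boolean; // true when scenes were added or removed
}

function detectChanges(prev: ProjectState, next: ProjectState): ChangeSet {
  const before = new Map(prev.scenes.map((s): [string, Scene] => [s.id, s]));
  const afterIds = new Set(next.scenes.map((s) => s.id));
  const needsTts: string[] = [];
  const needsAi: string[] = [];
  let added = false;

  for (const scene of next.scenes) {
    const old = before.get(scene.id);
    if (!old) {
      added = true;               // new scene: TTS + AI
      needsTts.push(scene.id);
      needsAi.push(scene.id);
    } else if (old.sentence !== scene.sentence) {
      needsTts.push(scene.id);    // sentence edit: TTS + AI
      needsAi.push(scene.id);
    } else if (old.layout !== scene.layout) {
      needsAi.push(scene.id);     // layout-only edit: AI, no new TTS
    }
  }

  const removed = prev.scenes.filter((s) => !afterIds.has(s.id)).map((s) => s.id);
  return {
    themeChanged: prev.theme !== next.theme,
    needsTts,
    needsAi,
    removed,
    compositionChanged: added || removed.length > 0,
  };
}

// A theme-only change can take the deterministic shortcut.
function isThemeOnly(c: ChangeSet): boolean {
  return c.themeChanged && c.needsTts.length === 0 && c.needsAi.length === 0 && !c.compositionChanged;
}
```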

Scenes are the source of truth

The user edits scene text per-scene, not as one large script. Each scene is a freeform text block, classified by archetype (hook, stats, rhetorical, contrast, enumeration, cta, statement) which determines its default layout. The user can override the layout per scene.

The whole-script view is derived (scenes.map(s => s.sentence).join("\n")) and exists only for change detection and pipeline compatibility.

Live preview via @remotion/player in a webview

The preview is a <webview> pointing at a Vite dev server that serves a small player.html / player.tsx mounting <Player> with the project's main composition. The webview reloads with ?from=N to seek to scene N.

Why: Earlier versions embedded Remotion Studio with CSS injection to hide the chrome. Switching to @remotion/player gave us a clean canvas with no UI to strip, full control over which composition mounts, and easier seeking via URL params. Vite serves the project files directly, so HMR still picks up scene/theme edits without restarting anything.
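
Seeking through a URL parameter can be sketched with the standard URL API. Only the ?from=N parameter and the player.html file name come from the description above; the function names, the port in the usage note, and the fallback to scene 1 are assumptions.

```typescript
// Shell side: build the webview URL for a given scene number.
function previewUrl(baseUrl: string, sceneNumber: number): string {
  const url = new URL("player.html", baseUrl);
  url.searchParams.set("from", String(sceneNumber));
  return url.toString();
}

// Player side: read the scene number back out of location.search.
// Falls back to 1 when the parameter is missing or not a positive integer.
function parseFrom(search: string): number {
  const raw = new URLSearchParams(search).get("from");
  const n = raw === null ? NaN : Number.parseInt(raw, 10);
  return Number.isInteger(n) && n > 0 ? n : 1;
}
```

For example, previewUrl("http://localhost:5173/", 3) yields "http://localhost:5173/player.html?from=3", where the port is only a placeholder.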

Archetype-based scene classification

When the user pastes a script, each sentence is classified into an archetype which maps to a default layout. The mapper is a pure function (no AI call) using simple heuristics:

  • contains a number → stats
  • ends with ? → rhetorical
  • first sentence and short → hook
  • last sentence → cta
  • contrast words (but, however, …) → contrast
  • list words (first, second, …) → enumeration
  • default → statement

Adjacent layout duplicates are resolved automatically — no two consecutive scenes get the same layout.
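
The heuristics above can be sketched as a pure function. Several details here are guesses rather than the behavior of archetype-mapper.ts: the rules are applied in the order listed, "short" is taken to mean eight words or fewer, the word lists are abbreviated, and the layout names and the primary/fallback scheme for avoiding adjacent duplicates are invented for the example.

```typescript
type Archetype =
  | "hook" | "stats" | "rhetorical" | "contrast" | "enumeration" | "cta" | "statement";

// Classify one sentence given its position in the script.
function classify(sentence: string, index: number, total: number): Archetype {
  const text = sentence.trim();
  if (/\d/.test(text)) return "stats";
  if (text.endsWith("?")) return "rhetorical";
  if (index === 0 && text.split(/\s+/).length <= 8) return "hook"; // threshold is a guess
  if (index === total - 1) return "cta";
  if (/\b(but|however)\b/i.test(text)) return "contrast";
  if (/\b(first|second)\b/i.test(text)) return "enumeration";
  return "statement";
}

// Hypothetical layout names: [primary, fallback] per archetype.
const LAYOUTS: Record<Archetype, [string, string]> = {
  hook: ["centered-title", "full-bleed"],
  stats: ["big-number", "split-stat"],
  rhetorical: ["question-card", "centered-title"],
  contrast: ["two-column", "before-after"],
  enumeration: ["numbered-list", "stacked-cards"],
  cta: ["call-to-action", "centered-title"],
  statement: ["plain-text", "quote-card"],
};

// Pick a layout per scene, using the fallback when the primary would repeat
// the previous scene's layout. Primary and fallback always differ, so no two
// consecutive scenes end up with the same layout.
function pickLayouts(archetypes: Archetype[]): string[] {
  const result: string[] = [];
  for (const archetype of archetypes) {
    const [primary, fallback] = LAYOUTS[archetype];
    const previous = result[result.length - 1];
    result.push(primary === previous ? fallback : primary);
  }
  return result;
}
```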

Claude Code CLI as the default AI (not direct API)

The default AI provider spawns the Claude Code CLI rather than calling the Anthropic API directly.

Why: Claude Code already has the user's API key, model preferences, and authentication configured. It handles file writing, error recovery, and the complex system prompt from skill.md. The stream-json output gives structured progress events for the UI.

This is a provider choice, not an architectural commitment. The AIProvider interface accepts { cwd, systemPrompt, userMessage, onProgress } — any backend that can generate Remotion scene files from a prompt can be plugged in.
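
As an illustration of how small a replacement backend can be, here is a sketch of the interface with a stand-in provider. Only the four request fields come from the description above; the method name generate, the string type of the progress callback, and the CannedProvider class are assumptions.

```typescript
// Request shape based on the fields named above; the types are assumed.
interface GenerateRequest {
  cwd: string;
  systemPrompt: string;
  userMessage: string;
  onProgress: (message: string) => void;
}

// Hypothetical method name; the real interface may differ.
interface AIProvider {
  generate(request: GenerateRequest): Promise<void>;
}

// A stand-in provider that reports progress but writes no files.
// A real backend would create scene files under request.cwd.
class CannedProvider implements AIProvider {
  async generate(request: GenerateRequest): Promise<void> {
    request.onProgress(`working in ${request.cwd}`);
    request.onProgress(`prompt length: ${request.userMessage.length}`);
  }
}
```

A provider like this is also handy for exercising the UI's progress display without spending tokens.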

Zustand for UI state

The Electron renderer uses Zustand for UI-only state (selected scene, theme panel open, render status, etc.). Pipeline state flows in through IPC events. Project persistence is handled by the pipeline's project-manager.ts.

Prerequisites

  • Node.js >= 20
  • npm >= 9
  • Claude Code CLI installed and authenticated
  • INWORLD_API_KEY in ~/.env (for TTS), or use FileAudioProvider with your own MP3s
