ragTechDev · natashaannn · Apr 25, 2026 · Apr 13, 2026 · Apr 14, 2026 · Apr 14, 2026
diff --git a/.gitignore b/.gitignore
@@ -15,7 +15,6 @@
 
 # next.js
 /.next/
-/out/
 
 # production
 /build
@@ -41,22 +40,38 @@ yarn-error.log*
 *.tsbuildinfo
 next-env.d.ts
 
-# Media content
-/public/sync/audio/*
-/public/sync/output/*
-/public/sync/video/*
-/public/transcribe/input/*
-/public/input/*
-/public/sync/*
+# ── Raw inputs (drop video/audio here before running the wizard) ───────────────
+# Never committed — files are 1–10 GB
+/input/
 
-# comment this out if using agent to edit transcript
-/public/transcribe/output/* 
+# ── Pipeline intermediates (large/generated media), in order: synced video/audio working files; audio extracted for Whisper; raw Whisper JSON + VTT; candidate frames, cutouts, manifest; final rendered .mp4 files; carousel exports
+public/sync/
+public/transcribe/input/
+public/transcribe/output/
+public/thumbnail/
+public/renders/
+public/output/
+
+# ── Editable pipeline outputs (text/JSON) ─────────────────────────────────────
+# public/edit/   → transcript.doc.txt, transcript.json, SRT exports
+# public/camera/ → camera-profiles.json, frame snapshots, detections
+#
+# These are committed by default so collaborators (and coding agents) can read
+# and edit the transcript. Coding agents like Claude access project context via
+# git — committing these files lets the agent see, diff, and propose edits to
+# the transcript doc without needing filesystem access.
+#
+# Uncomment to exclude (e.g. for a private or unreleased episode):
+# public/edit/*
+# public/camera/*
 
 # Whisper.cpp binary and models (downloaded on first run)
 /whisper.cpp/
+
+# Claude Code
 .claude/settings.local.json
 
-# Python Environments
+# Python environments
 .envrc
 .venv
 env/
@@ -65,8 +80,10 @@ ENV/
 env.bak/
 venv.bak/
 
-# Python Byte-compiled / optimized / DLL files
+# Python bytecode
 scripts/diarize/__pycache__/
 *.py[codz]
 *$py.class
+
+# Editor
 .vscode/settings.json
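
Review note on the `.gitignore` hunks above: Git treats `#` as a comment marker only at the start of a line, so a pattern followed by `# note` on the same line is read as one literal pattern. Below is a minimal sketch of a check for that. `findInlineComments` and its return shape are hypothetical names for illustration, not part of this repo, and the whitespace-then-`#` test is a heuristic rather than a full gitignore parser.

```typescript
// Hypothetical helper: flag .gitignore lines where a pattern is followed by
// an inline "# note". Git does not strip such notes from the pattern.
interface InlineCommentHit {
  line: number; // 1-based line number within the file
  text: string; // the offending line, trailing whitespace removed
}

function findInlineComments(gitignore: string): InlineCommentHit[] {
  const hits: InlineCommentHit[] = [];
  gitignore.split(/\r?\n/).forEach((raw, i) => {
    const text = raw.trimEnd();
    // Blank lines and real comments (leading '#') are fine.
    if (text === "" || text.startsWith("#")) return;
    // Heuristic: pattern text, then whitespace, then '#'.
    if (/\s#/.test(text)) hits.push({ line: i + 1, text });
  });
  return hits;
}

// Illustrative input only; the last entry is an escaped literal '#' filename.
const sample = [
  "# Pipeline intermediates",
  "public/sync/              # synced video/audio working files",
  "public/renders/",
  "\\#literal-hash-file",
].join("\n");

// Only line 2 of the sample is flagged.
const hits = findInlineComments(sample);
```

A check like this could run in CI or a pre-commit hook; moving each note onto its own `#` line is the usual fix.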
diff --git a/AGENTS.md b/AGENTS.md
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -1,53 +1,96 @@
 # RAG Tech Podcast — Project Context
 
 ## Show
-**RAG Tech** is a biweekly tech podcast that explores real-life topics in tech. New episodes drop every other week.
+**RAG Tech** — biweekly tech podcast. Episodes drop every other week.
 
 ## Cohosts
-| Name | Role | Image path |
-|------|------|------------|
+| Name | Role | Image |
+|------|------|-------|
 | Natasha | Software Engineer | `public/assets/team/natasha.PNG` |
 | Saloni | Software Developer | `public/assets/team/saloni.PNG` |
 | Victoria | Solutions Engineer | `public/assets/team/victoria.PNG` |
 
 All cohost images have transparent backgrounds.
 
-## Mascot
-**Techybara** — a capybara mascot. Individual PNGs live in `public/assets/techybara/`.
-
 ## Brand
-Brand config: `public/brand.json` (colors, typography, logo, shape radius).
-Logo: `public/assets/logo/transparent-bg-logo.png`
-Font: Nunito (variable, loaded via `remotion/loadFonts.ts`)
+- Config: `public/brand.json` (colors, typography, logo, shape radius)
+- Logo: `public/assets/logo/transparent-bg-logo.png`
+- Font: Nunito (variable, loaded via `remotion/loadFonts.ts`)
+- Mascot: **Techybara** (capybara) — PNGs in `public/assets/techybara/`
 
 ## Platforms
-- **Audio/Video:** Spotify, YouTube, Apple Podcasts
-- **Social:** Instagram, TikTok, LinkedIn
-- **Handle:** `@ragtechdev` (same on all platforms)
-
-## Tone
-Fun and accessible — tech content that doesn't take itself too seriously.
+Spotify · YouTube · Apple Podcasts · Instagram · TikTok · LinkedIn — handle `@ragtechdev`
 
 ## Key assets
 | Asset | Path |
 |-------|------|
 | Intro/outro music | `public/sounds/intro-outro-music.mp3` |
-| Background music (main) | `public/sounds/jazz-cafe-music.mp3` |
+| Background music | `public/sounds/jazz-cafe-music.mp3` |
 | Techybara images | `public/assets/techybara/` |
 | Cohost photos | `public/assets/team/` |
 | Logo | `public/assets/logo/` |
 
 ## Remotion compositions
 | ID | Component | Notes |
 |----|-----------|-------|
-| `ragTechVodcast` | `MyComposition` | Full episode (hooks → intro → main video) |
-| `PodcastIntro` | `PodcastIntroComposition` | Standalone 7 s intro (420 frames @ 60 fps) |
+| `ragTechVodcast` | `MyComposition` | Full episode: hooks → intro → main video |
+| `PodcastIntro` | `PodcastIntroComposition` | 7 s intro (420 frames @ 60 fps) |
+
+## Pipeline overview
+```
+[sync]           Audio ↔ video alignment → synced-output.mp4
+[transcribe]     Whisper.cpp → token-level timestamps
+[diarize]        Speaker turn detection
+[assign-speakers] Labels segments with speaker names
+[align]          WhisperX forced alignment → populates token.t_end
+[edit-transcript] Merges phrases into sentences → transcript.doc.txt
+Human edits doc (cuts, corrections, hooks, camera cues)
+[merge-doc]      Applies doc edits → transcript.json
+[setup-camera]   Face detection + GUI → camera-profiles.json
+Remotion         transcript.json + camera-profiles.json → composed video
+```
+
+Intermediate files: `public/transcribe/output/`. Synced video: `public/sync/output/`.
+
+## transcript.json key schema
+```
+meta
+  videoSrc?:   string     path relative to /public (overrides composition prop)
+  videoSrcs?:  string[]   all angle paths for multi-angle shoots
+  videoStart?: number     source seconds; segments before this are excluded
+  videoEnd?:   number     source seconds; segments after this are excluded
+  fps:         60
+segments[]
+  id, start, end, speaker, text, cut: boolean
+  tokens[]:    { t_dtw, t_end?, text, cut }
+  cuts[]:      [{ from, to }]  intra-segment ranges to skip
+  hook?        hookFrom?, hookTo?  hook clip bounds
+  cameraCues[] explicit camera overrides (> CAM directives)
+```
 
-## Video pipeline overview
-1. **Hooks** — selected transcript segments play first as teasers, with karaoke captions and the Techybara mascot overlay (`HookOverlay`).
-2. **Intro** — `PodcastIntro` plays between hooks and the main episode content.
-3. **Main episode** — full edited recording with optional camera punch-ins (`CameraPlayer`).
+`token.t_end` (word-end boundary) is populated only after forced alignment (`npm run align`). With it, cut boundaries are exact; without it, they fall back to heuristic bias constants.
 
-Forced alignment (`npm run align`) populates `token.t_end` (word-end boundary) alongside `token.t_dtw` (word start), enabling exact cut boundaries when words are marked for removal. Without it, cut boundaries fall back to heuristic bias constants. See `AGENTS.md` for full architecture details.
+## camera-profiles.json key schema
+```jsonc
+{
+  "sourceWidth": 1920, "sourceHeight": 1080,
+  "outputWidth": 1920, "outputHeight": 1080,
+  "wideViewport": { "cx": 0.5, "cy": 0.5, "w": 1, "h": 1 },
+  "angles": {                                   // multi-angle only
+    "angle1": { "videoSrc": "sync/output/synced-output-1.mp4",
+                "sourceWidth": 1920, "sourceHeight": 1080 },
+    "angle2": { "videoSrc": "sync/output/synced-output-2.mp4",
+                "sourceWidth": 1920, "sourceHeight": 1080 }
+  },
+  "speakers": {
+    "Natasha": {
+      "label": "Natasha",
+      "angleName": "angle1",                    // multi-angle only
+      "closeupViewport": { "cx": 0.3, "cy": 0.4, "w": 0.35, "h": 0.35 },
+      "portraitCx": 0.3
+    }
+  }
+}
+```
 
-Transcript editing scripts live in `scripts/`. Transcription pipeline is in `scripts/transcribe/`.
+**Multi-angle rendering**: `CameraPlayer` stacks one `SegmentPlayer` per unique angle video and switches visibility via `opacity` at shot boundaries. Non-active layers are `muted`. All angles share the same jump-cut sections because cuts are audio-driven. See `AGENTS.md` for full architecture.
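
Review note on the `transcript.json` schema in the CLAUDE.md hunk: since every angle shares the same jump-cut sections, the playable timeline can be derived once from `segments[].cut` and `segments[].cuts[]`. Below is a minimal sketch of that derivation, not the repo's actual implementation. It assumes `from`/`to` are source seconds on the same timeline as `start`/`end`; the `Segment` type is reduced to the fields used here, and `keptRanges` is a hypothetical name.

```typescript
// Assumed shapes, trimmed to the schema fields this sketch needs.
interface CutRange { from: number; to: number } // assumption: source seconds
interface Segment {
  id: number;
  start: number;
  end: number;
  cut: boolean;       // true means the whole segment is removed
  cuts?: CutRange[];  // intra-segment ranges to skip
}
interface KeptRange { start: number; end: number }

// Hypothetical helper: the source-time ranges that survive all cuts.
function keptRanges(segments: Segment[]): KeptRange[] {
  const out: KeptRange[] = [];
  for (const seg of segments) {
    if (seg.cut) continue; // segment removed entirely
    let cursor = seg.start;
    // Sort a copy so unordered cuts in the input are handled.
    const cuts = [...(seg.cuts ?? [])].sort((a, b) => a.from - b.from);
    for (const c of cuts) {
      if (c.to <= cursor) continue;  // cut already behind the cursor
      if (c.from >= seg.end) break;  // cut starts after the segment ends
      if (c.from > cursor) out.push({ start: cursor, end: c.from });
      cursor = Math.max(cursor, c.to);
      if (cursor >= seg.end) break;
    }
    if (cursor < seg.end) out.push({ start: cursor, end: seg.end });
  }
  return out;
}

// Illustrative data only.
const demo: Segment[] = [
  { id: 0, start: 0, end: 4, cut: false, cuts: [{ from: 1, to: 2 }] },
  { id: 1, start: 4, end: 6, cut: true },
  { id: 2, start: 6, end: 10, cut: false,
    cuts: [{ from: 9, to: 10 }, { from: 6, to: 7 }] },
];

// Kept ranges for the demo: 0–1, 2–4, and 7–9 seconds.
const kept = keptRanges(demo);
```

Multiplying each bound by `meta.fps` would give frame offsets; how rounding and the `t_end` heuristics interact with that is left to the real pipeline.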
diff --git a/Dockerfile b/Dockerfile
@@ -48,13 +48,16 @@ COPY . .
 RUN pip install --upgrade pip setuptools wheel && \
     pip install -r scripts/diarize/requirements.txt && \
     pip install whisperx faster-whisper && \
-    pip install -r scripts/camera/requirements.txt
+    pip install -r scripts/camera/requirements.txt && \
+    pip install -r scripts/thumbnail/requirements.txt && \
+    pip install "coverage>=7.0"
 
-# Ensure public directories exist
-RUN mkdir -p public/input/video public/input/audio \
+# Ensure directories exist
+RUN mkdir -p input/video input/audio \
     public/sync/video public/sync/audio public/sync/output \
-    public/transcribe/input public/transcribe/output/raw public/transcribe/output/edit \
-    public/transcribe/output/camera
+    public/transcribe/input public/transcribe/output/raw \
+    public/edit public/camera public/thumbnail \
+    public/renders public/output
 
 # Set environment variables
 ENV PYTHON_PATH="/usr/local/bin/python"