Manim generation lessons: font, layout, rendering, and sync

## Context

These are field-tested lessons from regenerating 18 demo segments for [tekton-dag](https://github.com/jmjava/tekton-dag) using `docgen`. Every issue here was hit in production and required manual intervention. When `docgen` absorbs auto-scene generation (#1) and visual validation (#2), these lessons should inform the defaults and guardrails.

## Lesson 1: Manim's default font is unusable

**Problem:** Manim uses Pango's default font, which renders text with terrible kerning: characters appear individually placed, like ransom-note typography. At 720p the text is blurry and nearly unreadable.

**Fix applied:** `Text.set_default(font="Lato")` at the start of `construct()`. Any clean sans-serif (Lato, Inter, Roboto, Liberation Sans) is dramatically better.

**Recommendation for docgen:**
- [ ] Set a default font in the Manim config/template — never ship Pango defaults
- [ ] Add `manim.font` to `docgen.yaml` so users can override (default: `"Lato"` or `"Liberation Sans"` for maximum availability)
- [ ] Document font requirements in `docgen init` scaffold output
- [ ] CI smoke test should verify the configured font is installed on the system
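
A minimal sketch of the font check behind the CI smoke test, assuming the list of installed font names is obtained elsewhere (for example from `manimpango.list_fonts()`). The function name `resolve_font` and the preference order are hypothetical, not part of `docgen`.

```python
from typing import Iterable, Optional, Sequence

# Preference order built from the fonts named in this lesson (an assumption,
# not a docgen default).
PREFERRED_FONTS = ("Lato", "Inter", "Roboto", "Liberation Sans")


def resolve_font(preferred: Sequence[str], installed: Iterable[str]) -> Optional[str]:
    """Return the first preferred font that is installed, or None.

    Matching is case-insensitive; the installed font's own spelling is returned.
    """
    by_lower = {name.lower(): name for name in installed}
    for candidate in preferred:
        match = by_lower.get(candidate.lower())
        if match is not None:
            return match
    return None


# Hypothetical system font list, hard-coded for illustration.
example_installed = ["DejaVu Sans", "Liberation Sans", "Noto Serif"]
chosen = resolve_font(PREFERRED_FONTS, example_installed)
```

A smoke test could fail the build when the result is `None` rather than silently falling back to the Pango default.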

## Lesson 2: Hardcoded coordinates cause overlap — use Manim's layout system

**Problem:** Positioning text with absolute coordinates (`move_to(UP * 1.3 + LEFT * 2.5)`) is fragile. Every font, size, and content change shifts bounding boxes, causing text-on-text collisions. We went through **3 full rewrites** of a single scene (RoadmapScene) trying to fix overlaps with coordinate adjustments.

**Fix applied:** Replaced all absolute positioning with Manim's layout primitives:
- `VGroup.arrange(DOWN, buff=0.25, aligned_edge=LEFT)` for vertical lists
- `mob.next_to(anchor, DOWN, buff=0.3)` to chain elements below each other
- Never compute Y positions manually — let Manim measure actual text bounds

**Recommendation for docgen:**
- [ ] Auto-generated scenes must ONLY use `arrange()` and `next_to()` for layout — ban absolute coordinates except for the top-level anchor points (strip Y, content area Y)
- [ ] Define 2–3 layout zones (nav strip, title, content area) with fixed Y anchors; everything inside a zone uses relative positioning
- [ ] The layout engine should enforce a maximum content height per section and warn if content would overflow below the visible frame
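
The overflow warning could be sketched roughly as follows. It assumes item heights have already been measured (in Manim, `mob.height` after the text is built) and that the frame bottom is at y = -4, which is Manim's default 8-unit frame height. The function names are hypothetical.

```python
from typing import Sequence


def stacked_height(heights: Sequence[float], buff: float) -> float:
    """Total height of items stacked vertically with `buff` between neighbours."""
    if not heights:
        return 0.0
    return sum(heights) + buff * (len(heights) - 1)


def overflow_below_frame(
    heights: Sequence[float],
    buff: float,
    zone_top_y: float,
    frame_bottom_y: float = -4.0,  # Manim's default frame spans y in [-4, 4]
) -> float:
    """Return how far the stacked content extends below the frame (0.0 if it fits)."""
    content_bottom_y = zone_top_y - stacked_height(heights, buff)
    return max(0.0, frame_bottom_y - content_bottom_y)


# Ten bullet lines, each 0.5 units tall, starting at y = 1.0 with buff = 0.25.
excess = overflow_below_frame([0.5] * 10, buff=0.25, zone_top_y=1.0)
if excess > 0:
    print(f"WARNING: content overflows the frame by {excess:.2f} units")
```

The check only needs measured heights, so it can run before rendering and fail fast instead of producing a clipped video.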

## Lesson 3: Font sizes must be minimum 14pt for video

**Problem:** Font sizes of 8–11pt are unreadable in video, even at 1080p. This is not print — viewers watch on screens at normal viewing distance, often in browser video players with compression artifacts.

**Fix applied:** Minimum body text 16pt, titles 20pt, section headings 36pt, pillar card labels 14pt.

**Recommendation for docgen:**
- [ ] Enforce minimum font sizes in the layout engine: body ≥ 14pt, subtitle ≥ 16pt, heading ≥ 20pt
- [ ] `docgen validate` should sample frames and flag text regions where OCR confidence is low (proxy for too-small or blurry text)
- [ ] `docgen.yaml` should allow `manim.min_font_size` override
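
One way the layout engine might apply these floors is sketched below. The role names and the `enforce_min_font_size` helper are assumptions made for illustration; the floor values are the ones listed above.

```python
from typing import Mapping, Optional, Tuple

# Floors from the recommendation above; the role names are hypothetical.
MIN_FONT_SIZE = {"body": 14, "subtitle": 16, "heading": 20}


def enforce_min_font_size(
    role: str,
    requested: float,
    floors: Mapping[str, float] = MIN_FONT_SIZE,
) -> Tuple[float, Optional[str]]:
    """Clamp `requested` to the floor for `role`.

    Returns the size to use and a warning message (None if nothing changed).
    """
    floor = floors[role]
    if requested < floor:
        return floor, f"{role} font size {requested} raised to minimum {floor}"
    return requested, None


size, warning = enforce_min_font_size("body", 11)
if warning:
    print(f"WARNING: {warning}")
```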

## Lesson 4: Render at 1080p minimum — 720p is too blurry for text-heavy content

**Problem:** 720p30 renders look blurry when the video contains dense text (bullet lists, code snippets, multi-line labels). Compression artifacts at 720p make small text illegible.

**Fix applied:** Render at 1920×1080 (or 2560×1440 for production quality). The compose step handles resolution normalization.

**Recommendation for docgen:**
- [ ] Default `manim.quality` in `docgen.yaml` should be `1080p30` not `720p30`
- [ ] `docgen manim` should render at the configured quality and warn if below 1080p
- [ ] Document that 720p is only suitable for terminal recordings (VHS), not Manim text scenes

## Lesson 5: `_wait_until` is essential but easy to miscalculate

**Problem:** Manim animations must fill the exact audio duration. `_wait_until(self, target_t, current_t)` is the mechanism, but timing errors accumulate. If any section runs over its allocated window, subsequent `_wait_until` calls become no-ops and the scene desynchronizes.

**Fix applied:** Conservative timing — each pillar section ends ~1s before the next starts, with an explicit fade-out transition to absorb timing drift.

**Recommendation for docgen:**
- [ ] Auto-generated scenes should budget animation time per section from Whisper segments, with 1–2s buffer between sections
- [ ] Add a `_wait_until` wrapper that logs a warning instead of crashing when `target_t < current_t`, so timing overflows surface during development
- [ ] After rendering, compare actual scene duration to audio duration and warn if drift exceeds 2%
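
A hedged sketch of the warning wrapper and the drift check. It is written as a free function that takes any object with a `wait(duration)` method (as Manim's `Scene` has); returning the updated clock value is an assumption about how the existing `_wait_until` helper tracks time, and the names `wait_until` and `drift_exceeds` are hypothetical.

```python
import logging

logger = logging.getLogger("docgen.timing")


def wait_until(scene, target_t: float, current_t: float) -> float:
    """Wait so the scene clock reaches `target_t`; return the new clock value.

    If the scene has already run past `target_t`, log a warning and skip the
    wait instead of raising, so the overflow is visible during development.
    """
    remaining = target_t - current_t
    if remaining < 0:
        logger.warning(
            "timing overflow: section ran %.2fs past its target of %.2fs",
            -remaining,
            target_t,
        )
        return current_t
    if remaining > 0:
        scene.wait(remaining)
    return target_t


def drift_exceeds(scene_duration: float, audio_duration: float, tolerance: float = 0.02) -> bool:
    """True if the rendered scene differs from the audio by more than `tolerance` (2% by default)."""
    return abs(scene_duration - audio_duration) > tolerance * audio_duration
```

Inside `construct()` this would be called as `t = wait_until(self, 12.5, t)` after each section, so that one overflow is reported once rather than silently shifting every later section.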

## Lesson 6: Pillar/section pattern should be a reusable template

**Problem:** Every pillar in segment 18 follows the same visual pattern: highlight card → show title → reveal bullet list → clear. We wrote this 7 times with slight variations, which is error-prone.

**Fix applied:** Extracted `_show_pillar()`, `_add_subtitle()`, `_reveal_list()`, `_clear()` helper methods.

**Recommendation for docgen:**
- [ ] Provide a `SectionScene` base class or mixin with built-in support for: nav strip, title card, bullet reveals, key-value pairs, flow diagrams, and section transitions
- [ ] Auto-scene generation should detect section boundaries from narration paragraphs and apply this pattern automatically
- [ ] Users can customize by overriding specific sections rather than writing the full `construct()` method

## Lesson 7: The overview "strip" of cards must scale dynamically

**Problem:** We started with 5 pillar cards at `width=1.8`. When expanding to 7 cards, they overflowed the frame width. Manual resizing to `width=1.4` was needed.

**Fix applied:** Used `VGroup.arrange(RIGHT, buff=0.15)` to auto-space cards, then positioned the group as a unit.

**Recommendation for docgen:**
- [ ] Nav strip should auto-calculate card width based on `(frame_width - margins) / num_cards`
- [ ] If labels are too long, truncate or reduce font size automatically
- [ ] Max 8–10 cards before switching to a two-row layout
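
The width calculation might look like the sketch below. It extends the `(frame_width - margins) / num_cards` formula by also subtracting the gaps between cards, which is an addition not stated above. The default frame width assumes Manim's standard 16:9 frame with a height of 8 units; the margin value and function name are hypothetical.

```python
# Manim's default frame is 8 units tall at 16:9, so about 14.22 units wide.
DEFAULT_FRAME_WIDTH = 8.0 * 16 / 9


def card_width(
    num_cards: int,
    frame_width: float = DEFAULT_FRAME_WIDTH,
    margin: float = 0.5,   # space kept free on each side (assumed value)
    buff: float = 0.15,    # gap between neighbouring cards
) -> float:
    """Width of each card so that `num_cards` cards plus gaps fit inside the frame."""
    if num_cards < 1:
        raise ValueError("num_cards must be at least 1")
    usable = frame_width - 2 * margin - buff * (num_cards - 1)
    if usable <= 0:
        raise ValueError("margins and gaps leave no room for cards")
    return usable / num_cards


width_for_seven = card_width(7)
```

Because the width is derived from the card count, going from 5 to 7 cards shrinks each card automatically instead of requiring a manual resize.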

## Lesson 8: Contrast rules for dark backgrounds

**Problem:** Colored text on dark backgrounds (e.g., `color=C_WARN` on `C_BG`) has insufficient contrast when the element is dimmed to 0.2 opacity. Inactive elements become invisible.

**Fix applied:** White text with colored accents (icon, background fill at 0.25 opacity). Inactive opacity floor of 0.4 (not 0.2).

**Recommendation for docgen:**
- [ ] Default text color should always be WHITE on dark backgrounds; use color only for accents (icons, borders, fills)
- [ ] Minimum opacity for dimmed elements: 0.35–0.40
- [ ] `docgen validate` should check frame-level contrast ratios (WCAG AA: 4.5:1 for text)
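
For reference, the WCAG contrast ratio can be computed as in this sketch. The luminance and ratio formulas follow the WCAG 2.x definition; blending the dimmed text onto the background directly in sRGB space is a simplifying assumption, and the function names are hypothetical.

```python
from typing import Tuple

RGB = Tuple[float, float, float]  # channel values in 0..255


def _linearize(channel: float) -> float:
    """Convert one sRGB channel (0..1) to linear light, per WCAG 2.x."""
    if channel <= 0.03928:
        return channel / 12.92
    return ((channel + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    r, g, b = (_linearize(v / 255.0) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def blend(fg: RGB, bg: RGB, opacity: float) -> RGB:
    """Composite `fg` over `bg` at the given opacity (simple sRGB-space blend)."""
    return tuple(opacity * f + (1.0 - opacity) * b for f, b in zip(fg, bg))


def contrast_ratio(fg: RGB, bg: RGB, opacity: float = 1.0) -> float:
    """WCAG contrast ratio of text colour `fg` at `opacity` on background `bg`."""
    l_text = relative_luminance(blend(fg, bg, opacity))
    l_bg = relative_luminance(bg)
    lighter, darker = max(l_text, l_bg), min(l_text, l_bg)
    return (lighter + 0.05) / (darker + 0.05)


def meets_wcag_aa(ratio: float) -> bool:
    """WCAG AA threshold for normal-size text."""
    return ratio >= 4.5
```

Passing the dimming opacity into `contrast_ratio` lets the same check cover both active and inactive elements.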

## Lesson 9: `docgen compose` path conventions must match render output

**Problem:** `docgen compose` looks for Manim output at `animations/media/videos/scenes/720p30/<Scene>.mp4`, but programmatic renders (via Python scripts) output to `animations/media/videos/720p30/<Scene>.mp4` (no `scenes/` subdirectory). This caused "FREEZE GUARD" failures that were actually just file-not-found falling through to stale cached files.

**Fix applied:** Manual `cp` to the expected path after each render.

**Recommendation for docgen:**
- [ ] `docgen manim` should handle the render and place the output in the canonical path — users should never need to run Manim directly
- [ ] If a stale file is found at the expected path, compare its duration to the audio and warn if they differ by more than 10%
- [ ] Support multiple resolution directories: look for `1080p30/`, `1440p60/`, `720p30/` in priority order
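
A possible lookup that tolerates both directory layouts and walks the resolution directories in priority order. The function name and the idea of checking the `scenes/` layout first are assumptions; the directory names come from the paths quoted above.

```python
from pathlib import Path
from typing import Optional, Sequence

# Priority order from the recommendation above.
QUALITY_PRIORITY = ("1080p30", "1440p60", "720p30")


def find_scene_video(
    videos_root: Path,
    scene: str,
    qualities: Sequence[str] = QUALITY_PRIORITY,
) -> Optional[Path]:
    """Return the first existing render of `scene`, or None if none is found.

    For each quality, the canonical `scenes/<quality>/` layout is tried first,
    then the `<quality>/` layout produced by programmatic renders.
    """
    for quality in qualities:
        candidates = (
            videos_root / "scenes" / quality / f"{scene}.mp4",
            videos_root / quality / f"{scene}.mp4",
        )
        for candidate in candidates:
            if candidate.is_file():
                return candidate
    return None


# Example call against the layout described in this lesson.
video = find_scene_video(Path("animations/media/videos"), "RoadmapScene")
```

Returning `None` explicitly makes a missing render distinguishable from a stale one, which is the confusion behind the misleading "FREEZE GUARD" failures.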

## Lesson 10: TTS regeneration invalidates everything downstream

**Problem:** After updating narration and regenerating TTS, the new audio has a different duration. This silently breaks all existing Manim timing, and nothing warns you. The compose step either pads the scene with freeze frames (if the video is shorter than the audio) or clips it (if the video is longer).

**Fix applied:** Full pipeline re-run: `tts → timestamps → rewrite scene → render → compose → validate`.

**Recommendation for docgen:**
- [ ] `docgen tts` should emit a duration-change summary: "18-roadmap: 205.0s → 314.6s (+53%)"
- [ ] If duration changed by more than 5%, print a WARNING that scenes and timestamps need regeneration
- [ ] `docgen compose` should refuse to compose if the scene MP4 was last modified before the audio file (stale visual)
- [ ] Add a `docgen rebuild <segment>` command that runs the full pipeline: tts → timestamps → manim → compose → validate
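
The duration-change summary and the staleness test could be sketched as follows. The function names are hypothetical, and how `docgen` stores the previous duration or reads file modification times is not specified here, so both are passed in as plain numbers.

```python
from typing import Tuple


def duration_change_summary(
    segment: str,
    old_seconds: float,
    new_seconds: float,
    warn_threshold: float = 0.05,
) -> Tuple[str, bool]:
    """Return a one-line summary and whether the change exceeds `warn_threshold`."""
    if old_seconds <= 0:
        raise ValueError("old_seconds must be positive")
    change = (new_seconds - old_seconds) / old_seconds
    line = f"{segment}: {old_seconds:.1f}s → {new_seconds:.1f}s ({change:+.0%})"
    return line, abs(change) > warn_threshold


def is_stale_visual(scene_mtime: float, audio_mtime: float) -> bool:
    """True if the scene video was last modified before the audio it must match."""
    return scene_mtime < audio_mtime


line, needs_regen = duration_change_summary("18-roadmap", 205.0, 314.6)
print(line)
if needs_regen:
    print("WARNING: scenes and timestamps need regeneration")
```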

## Summary

The core theme: **docgen should make the easy path the correct path.** Good fonts, relative layout, adequate font sizes, proper resolution, and duration-aware validation should all be defaults — not things a user discovers after 5 hours of debugging blurry overlapping text.
