feat(skill): website-to-hyperframes — concept-first authoring + per-beat read protocol #1005
…eat read protocol

Rewrite of the website-to-hyperframes skill that came out of 11 evaluation rounds. The honest read of those evals: prose-only guidance had hit its ceiling — sub-agents kept reporting "0 errors, looks good" without doing the work, producing slideshow-quality videos with mismatched brand colors, missing logos, and beats that didn't serve the storyboard. This restructure addresses the failure modes that real videos showed, not theoretical ones.

**Step structure (replaces 7-step layout with concept-first 6-step)**

Old: capture → design → script → storyboard → vo → build → validate
New: capture → design → brief → storyboard → vo → build → validate

The brief step (Step 2) is new: a conversation-shaped step that aligns message + audience + arc before any beat-writing happens. Concept-first throughout — message → arc → beats that serve the arc → which assets and techniques bring each beat to life.

**Step 0 (capture)**
- "View the contact sheets — carefully, every cell, not a glance" closes the failure mode where agents reported "viewed the contact sheet" after one scroll and later wrote beats referencing assets that didn't exist or missed the brand logo.
- Names the right artifacts to read in order (tokens.json → design-styles.json → asset-descriptions.md → fonts-manifest.json), with read-on-demand guidance for the rest.

**Step 1 (design)**
- DESIGN.md authoring guide. Restored component CSS sections (Component Stylings, Spacing & Layout, Depth & Elevation) that earlier batches over-collapsed.

**Step 2 (brief)**
- Strategy/messaging step. Clear instruction for "Surprise me" / minimal direction: state the minimum context (where the video runs, who it's for) and proceed bold.

**Step 3 (storyboard + script)**
- Concept gate at the top — answer "what makes this video distinct" before writing beat 1.
- Brand-floor MUST rules (logo in opener + closer; signature visual somewhere in the video).
- Captured assets (SVG logos, illustrations, hero art, gradients) are first-class beat content alongside composed UIs — many of them carry beats outright. The constraint is only that you start from the message, not the asset inventory.

**Step 4 (vo)**
- TTS ranking: HeyGen first (auto word timestamps), ElevenLabs second, Kokoro free. Audio timing reconciliation gate: if actual audio duration ≠ storyboard planned ±15%, rescale beats or trim script before Step 5.

**Step 5 (build) + beat-builder-guide.md**
- Sub-agent template now pastes brand values inline rather than telling the sub-agent to re-read DESIGN.md. Targeted file reads with specific sections + line ranges.
- "Patterns that ARE shots" affirmative list (captured logo draw-on, hero illustration push-in, captured screenshot with parallax layers, kinetic typography over captured asset).
- Webpage-mimicry patterns (full CSS browser chrome, parked-camera composition, ±2px breathing motion) marked ⚠ rather than ❌ — fine when the storyboard genuinely calls for them as the subject.
- Required cinematography per beat: shot type, camera move, depth strategy, purpose.

**Step 6 (validate) — per-beat read protocol**

This replaces the previous "spawn verify-beats CLI" gate. A grep of composition HTML can catch structural lies (missing hex codes, wrong asset paths) but it can't catch boring beats, off-screen logos, GSAP timelines that only cover the first 2 seconds, or camera moves that don't match the storyboard. Those failures only surface when somebody opens the file and reads it.

Per-beat verdict template names the brand hex codes used, captured asset paths referenced, headline `font-size`, GSAP timeline coverage, and storyboard alignment. Critic sub-agent scores a "Captured asset utilization" dimension specifically so the eval captures whether captured SVGs/illustrations carried beats or got recreated as divs.

**Asset bundle**
- 20 Pixabay-licensed SFX files with `CREDITS.md` documenting provenance. SFX assignment moved to Step 3 (creative decision) so Step 5 implements rather than improvises.
- Capabilities reference + html-in-canvas-patterns updated: Three.js 0.181.2 + ESM jsm imports, mulberry32 seeded PRNG for deterministic shatter, 24-effect text-animation catalog referenced (catalog itself lands in the hyperframes-skill PR).
- Visual vocabulary rewritten: replaces user-word lookup tables with brand-first derivation across 6 axes; user words land as modifiers, not replacements.
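For readers who have not met the mulberry32 PRNG named in the asset-bundle notes, here is a minimal TypeScript sketch of the commonly circulated algorithm. It only illustrates why a seeded generator makes a randomized effect repeat identically across renders. The `shardOffsets` helper and its parameters are hypothetical and are not taken from the skill's html-in-canvas patterns.

```typescript
// mulberry32: a small 32-bit seeded PRNG.
// The same seed always yields the same sequence of values in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0; // force to unsigned 32-bit state
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper (an assumption, not the skill's code):
// per-shard x/y offsets for a shatter effect, reproducible from the seed.
function shardOffsets(
  count: number,
  seed: number,
  spread: number,
): { x: number; y: number }[] {
  const rand = mulberry32(seed);
  return Array.from({ length: count }, () => ({
    x: (rand() - 0.5) * 2 * spread,
    y: (rand() - 0.5) * 2 * spread,
  }));
}
```

Calling `shardOffsets(12, 42, 100)` twice returns the same offsets, so frames rendered in separate passes should line up, whereas `Math.random()` would give a different shatter each time.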
- Delete `references/visual-vocabulary.md` and scrub the four call sites that referenced it. The 6-axis lookup framing it introduced contradicted the rest of the skill's "design from the brand, not from a table" stance.
- Replace all `npx tsx packages/cli/src/cli.ts <cmd>` invocations with `npx hyperframes <cmd>` in step-0-capture.md, step-5-build.md, step-6-validate.md, and beat-builder-guide.md. The capture- and snapshot-pipeline improvements that previously required the local CLI now ship in the published CLI via the stack's PRs #987 and #988, so once the stack lands the published CLI is the right invocation for the skill prose.
- Remove the now-contradictory "ALWAYS use the local CLI — never npx hyperframes" warnings in step-0-capture.md and step-6-validate.md.
SKILL.md grew to 192 lines from a 124-line baseline. Most of the
bloat was content duplicated in the step reference files it points
to. Removed 6 sections that duplicated step content, composed 2
small additions into the step files where they actually belonged.
Removed from SKILL.md (already covered elsewhere):
- "Take your time" / "Quality matters more than speed" paragraph
— operational philosophy already implicit in step-6-validate's
cell-by-cell review prose.
- "Creative Tension Principle" section — step-3-storyboard.md:21
already has the exact "What makes this video different from a
generic [video type] for any [industry] brand?" single-sentence
test. Duplicate removed; storyboard is the right home.
- "Step -1: What we're actually making" (30 lines: anti-patterns,
video grammar, shot framing, camera moves) — duplicates
step-3-storyboard.md:197+ (shot types), :229–232 (anti-patterns),
and beat-builder-guide.md:126+ (shot framing).
- "Sub-agent mode" + "No sub-agents" preamble —
step-5-build.md:286–292 already handles both parallel and serial
runtimes.
- "Image-viewing capability" warning — operationally implicit in
step-0 ("View the contact sheets") and step-6 ("View
snapshots/contact-sheet.jpg cell-by-cell").
- "User Interaction Points" table — redundant with the inline 💬
markers on Steps 3 and 4.
Composed into step files (content that wasn't there yet):
- step-1-design.md "Target length" paragraph: added the fast-pacing
/ billboard-per-beat exception (50-line DESIGN.md is enough when
beats are single hero elements on full-bleed backgrounds, not
full UIs).
- step-2-brief.md "Surprise me" section: added the global-propagation
rule — when the user signals autonomous mode at Step 2, every 💬
gate downstream (Step 3 storyboard approval, Step 4 TTS choice) is
also skipped.
Step 5 SKILL.md gate paragraph trimmed from a 6-clause description
of the per-beat read to one line that points at step-5-build.md
for the full checklist.
Updated the techniques.md reference counts from "20" to "13" in
SKILL.md, beat-builder-guide.md, and step-3-storyboard.md to match
the techniques.md trim in the upstream branch.
Net: SKILL.md 192 → 131 lines.
Step 0 had bloated to 91 lines that did the work of Steps 1–3: viewing contact sheets cell-by-cell, reading 8 data files, listing promising assets, inferring product purpose / audience / value prop / brand voice. That meant the agent did all the heavy lifting upfront, produced summaries that went stale before they were used, and the actual "run the capture" instruction was buried.

Step 0 now owns only what Step 0 is: run the capture command, sanity-check it succeeded, hand off. 91 → 55 lines.

Moved (composed into destination files, verified each was the right home before adding):
- Read tokens.json + design-styles.json → step-1-design.md replaces the passive "you read these in Step 0" line with an active "Read these now — primary data source for Sections 3–6."
- Contact-sheet "every cell, name 5 assets per page" anti-glance prose → step-3-storyboard.md asset-discovery bullet (which already covered contact-sheet viewing generally, now strengthened with the anti-glance rule).
- Strategic site summary (product / audience / voice / value prop) → step-2-brief.md absorbed this; the brief itself IS the summary. Replaced "After presenting the site summary (from Step 0)" with step-2 grounding itself by reading DESIGN.md + asset-descriptions + visible-text directly.

Step 0's new structure:
- Run the capture (CLI command + project-dir convention) — unchanged
- Confirm it succeeded (1-line summary, error-out on bad capture)
- Reference table mapping each capture/ file to the step that first reads it (explicit "DO NOT read these here")
- Gate: capture exits 0 + counts non-zero
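The Step 0 gate ("capture exits 0 + counts non-zero") can be pictured as a small check. The sketch below is illustrative only: the commit text does not say how the capture reports its results, so the `CaptureResult` shape, the count names, and the reading that every reported count must be above zero are all assumptions.

```typescript
// Hypothetical result shape: not the hyperframes CLI's actual output format.
interface CaptureResult {
  exitCode: number;
  // Count names (e.g. "screenshots", "assets", "fonts") are assumed for illustration.
  counts: Record<string, number>;
}

interface GateVerdict {
  pass: boolean;
  reasons: string[]; // empty when the gate passes
}

// Assumption: "counts non-zero" means every reported count is greater than zero.
function checkCaptureGate(result: CaptureResult): GateVerdict {
  const reasons: string[] = [];
  if (result.exitCode !== 0) {
    reasons.push(`capture exited with code ${result.exitCode}`);
  }
  const entries = Object.entries(result.counts);
  if (entries.length === 0) {
    reasons.push("no counts reported");
  }
  for (const [name, n] of entries) {
    if (!(n > 0)) {
      reasons.push(`count for "${name}" is ${n}`);
    }
  }
  return { pass: reasons.length === 0, reasons };
}
```

Returning the reasons alongside the verdict matches the "1-line summary, error-out on bad capture" behaviour described above: a failing capture can be reported in one line without any analysis of the captured files.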
Cleans up two related overcorrections that crept across the skill
prose: (a) "compose UIs from divs/SVG/CSS" repeated 6+ times in
step-1, anchoring agents to website-shaped beats; (b) "every beat's
primary visual stays composed from divs / SVG / CSS / GSAP" and
"captured assets are accents — they decorate, they don't carry"
overstatements in step-3 and step-5 that contradicted the dial-back
done earlier in this stack.
The real framing: a beat composes from whatever primitives the scene
needs — HTML/CSS, SVG, captured assets, WebGL, Canvas, Three.js,
kinetic typography, Lottie — alone or in combination. They're inputs
to one output (the video frame). No rule maps intent → primitive.
The narrow no-go is one rule: never paste a product-UI screenshot as
load-bearing content (the slideshow pattern).
step-1-design.md (8 edits):
- L5 intro: drop "composed from divs/SVG/CSS at build time" detail.
- L7 length: drop "compose UIs from scratch (divs/SVG/CSS)" framing;
merge L290's "over-investing in prose" caveat in.
- L97: "composing UIs from divs in Step 5" → "building beats".
- L161: "compose the X UI" → "a beat featuring the X".
- L290: duplicate length bullet — deleted.
- L293: "sub-agents compose UIs at build time from divs/SVG/CSS..."
→ "No separate Components section — Quick Reference is where
components live."
step-3-storyboard.md (3 edits):
- L3 (intro): "alongside composed UIs" → "alongside composed beats".
- L276 ("Compose the load-bearing visuals yourself") paragraph
replaced with the primitive-toolkit framing — toolkit is open, the
only no-go is product-UI screenshots as load-bearing content.
- L381–383 ("The bar:") three bullets collapsed to one bullet:
primary visuals use whatever combination the scene needs; accents
are optional; brand-floor minimums are the minimum.
step-5-build.md (2 edits):
- L104 stacked-beats intro: "composed from divs, SVG, canvas, and
CSS. Never a full-bleed screenshot." → "composes from whatever
primitives the storyboard called for ... Narrow no-go: never a
full-bleed product-UI screenshot as load-bearing content."
- L147: "Build the UI element from divs and CSS" → "Build the
element from divs and CSS" — drops the UI bias since this rule
applies only when the asset IS a product-UI screenshot.
Net result: "compose from divs/SVG/CSS" mentions drop from 10+ to 0
as a generalized framing; the term survives only in concrete
examples (e.g. "cards-as-divs" when the beat is specifically a
kanban demo) where divs/CSS IS the right answer.
Three follow-ups caught by a post-restructure audit pass. All three were places where the earlier "compose primary, asset is accent" framing survived after the step-3 and step-5 paragraphs already got the primitive-toolkit rewrite. Cleans up the contradiction so the skill speaks with one voice: captured assets can be primary content; the narrow no-go is just pasting product-UI screenshots.
- step-2-brief.md:80 — the "flip it" example said agents should reframe "the hero illustration centers the opener" into "kinetic typography ... hero illustration as ambient depth." That reverses the dial-back: captured illustrations CAN center an opener. The flip-it rule now applies narrowly to product-UI screenshots; for captured logos/illustrations/hero art, no flip is needed.
- step-2-brief.md:149 — option-template guidance said "primary content is 'the screenshot of X'" was forbidden. Narrowed to "primary content is a pasted product-UI screenshot." Other captured assets (SVG logos, illustrations, hero art) are valid primaries when the concept calls for them.
- step-3-storyboard.md:314 — Common-accent-uses bullet implied accents are always layered on "composed UI." Reframed: list accent uses for when the primary is something else; when the captured asset IS the primary (logo opener, hero parallax), document it under Composition, not Accents.
… normal

Second-batch audit cleanup after Ular's "logo isn't a requirement, just a nice default" correction. Three related places still framed captured-asset-primary beats as rare exceptions and the brand-floor rules as hard MUSTs — both overstatements that contradict the rest of the dial-back. Plus a TOC-only callout on capabilities.md.
- step-3:300 "for the RARE beat where a captured asset is the primary visual ... defaulted to the slideshow pattern this workflow exists to break" — rewritten. Captured-asset-primary beats are a normal valid choice. The narrow no-go is just pasting product-UI screenshots full-bleed.
- step-3:351 "Each one has a composed visual that carries it" — rewritten to "Each one has a primary visual that carries it (composed UI, captured asset, kinetic typography, WebGL, etc.)".
- step-3:353 "assets decorate concept-defined beats; they do not seed them" — kept "do not seed" (correct: don't write a beat because of a cool asset); dropped the "decorate" framing (overgeneralized — assets can be primary too).
- step-3 brand-inflection floor section: relabeled from "REQUIRED minimums" to "Brand defaults (nice-to-haves for most brand videos)". "MUST appear" softened to "for most brand videos, the logo lands in the opener and the closer" with explicit "skippable when the storyboard's concept calls for it" language.
- step-3:379 "The bar:" bullet: "brand-floor minimums ... the minimum, not the ceiling" → "brand-defaults section covers most brand videos but isn't a hard requirement."
- step-5:413 "Brand-floor check" section in the per-beat read protocol: relabeled "Brand-defaults check", reframed each item as a default not a fail-condition; agent checks against the storyboard's intent rather than enforcing a hard rule.
- capabilities.md top: added a "Scan the TOC; do NOT read this file linearly" callout — it's a 700+ line inventory; agents should jump to the section a beat needs, not read top-to-bottom.
…ugh-white regression

Five fixes from Ular's first-pass workflow run:
1. step-1-design.md Fonts section — sub-agents pointed @font-face for "ES Build Neutral" at the Inter .woff2 files because DESIGN.md only named families, never emitted exact src: paths. Now the Fonts section example shows per-family + per-weight file paths AND a copy-verbatim @font-face block sub-agents can paste, so there's no inference step. Adds an explicit narrative of the real failure mode and how to avoid it.
2. beat-builder-guide.md FONTS rule — was "brand fonts with capture/assets/fonts/ path need @font-face in <style>." Now: "copy the @font-face block VERBATIM from DESIGN.md. Do NOT guess which .woff2 file belongs to which family — capture filenames are content-hashed and there is no visible mapping. If DESIGN.md doesn't include exact src: paths per family, STOP and ask the main agent; never pair an arbitrary .woff2 with a family name from memory."
3. step-1-design.md Colors section — Sub-agents reproduced brand colors faithfully and hit WCAG AA failures on dark surfaces (#68686A on #18191B = 3.16:1). Now the Colors section example computes per-pairing contrast ratios with ✅/⚠/❌ markers, documents the dark-surface substitute color when the brand's own palette fails, and points at the /hyperframes-contrast skill for ratio computation. Sub-agents pick text colors by surface context, not by "this is the brand's secondary text color."
4. capabilities.md flash-through-white entry — the "ideal as invisible bridge at duration: 0.01" framing caused agents to scatter white flashes through every composition as transition bridges. The fix was documented in the branch's HANDOFF but never landed. Now: "Fade through white midpoint — a visible white flash between scenes. Use only when the brand specifically calls for a white-flash beat boundary; this is NOT a neutral 'default' transition."
5. step-6-validate.md Warnings list — adds a paragraph on WCAG contrast false positives. The validator samples at fixed timestamps; elements at opacity:0 / mid-fade get measured as if fully visible, producing spurious failures. Tells the agent to verify visually before changing colors to clear a WCAG warning — bumping a color to fix a sampling artifact changes brand identity for no real benefit.
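The 3.16:1 figure in fix 3 can be reproduced with the standard WCAG 2.x relative-luminance formula. The TypeScript below is a minimal sketch of that formula, not the `/hyperframes-contrast` skill's implementation, which this PR text does not show. The function names are chosen here for illustration.

```typescript
// Linearize one 8-bit sRGB channel (WCAG 2.x relative-luminance definition).
// 0.04045 is the sRGB threshold; the 0.03928 printed in WCAG 2.0/2.1 gives
// the same result for 8-bit channel values.
function channelToLinear(c8: number): number {
  const c = c8 / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a "#RRGGBB" color.
function relativeLuminance(hex: string): number {
  const m = /^#?([0-9a-f]{6})$/i.exec(hex.trim());
  if (!m) throw new Error(`expected #RRGGBB, got "${hex}"`);
  const n = parseInt(m[1], 16);
  const r = channelToLinear((n >> 16) & 0xff);
  const g = channelToLinear((n >> 8) & 0xff);
  const b = channelToLinear(n & 0xff);
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio: (lighter + 0.05) / (darker + 0.05), in the range 1..21.
function contrastRatio(fg: string, bg: string): number {
  const l1 = relativeLuminance(fg);
  const l2 = relativeLuminance(bg);
  const [hi, lo] = l1 >= l2 ? [l1, l2] : [l2, l1];
  return (hi + 0.05) / (lo + 0.05);
}

// The pairing cited in the commit message.
const ratio = contrastRatio("#68686A", "#18191B");
// WCAG AA asks for at least 4.5:1 for normal-size text.
const passesAANormalText = ratio >= 4.5;
console.log(ratio.toFixed(2), passesAANormalText);
```

Because the ratio depends only on the two colors, a check like this says nothing about opacity or mid-fade states, which is why the sampling false positives described in fix 5 still need a visual check.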
miguel-heygen
left a comment
Content matches previously approved #990 (exact same diff stats +3062/-863, same 8 commits). Re-approved.
jrusso1020
left a comment
Re-approving the v2 re-cut.
Content equivalence verified at the patch-id level — all 8 commits in #1005 produce byte-equivalent patches to #990:
#990 / #1005 commits (all 8 match):
71e143e8..., bff42f6a..., 5f37cddf..., efa13932...,
ea4cbfce..., ce98e538..., 44077189..., 6cddea00...
Stable patch-id equality across the full commit stack confirms no functional drift. The merge-base diff stat is inflated only by main's drift in the interim.
Merge via GitHub UI individually per Ular's commitment.
— Rames Jusso
The base branch was changed.
@miguel-heygen can I get restamp here as well, apparently rebasing drops the approvals
for sure, and yeah rebasing drops the approvals for security!
Re-stacked version of #990 (silently lost in yesterday's Graphite stack-merge).
Content identical to what was approved on #990. Stacked on #1004 (lint rules).
What's in this PR
Rewrites the `website-to-hyperframes` skill. Highlights:
- `SKILL.md` (131 lines) — step-pointer index, push detail to step files
- `step-0-capture.md` (~55 lines) — step 0 owns capture, not analysis
- `step-1-design.md` — explicit WCAG AA contrast pairings with examples of failures (e.g., `#68686A` on `#18191B` = 3.16:1 fails AA)
- `@font-face` paths spelled out
- `brand-floor` → `brand-defaults` rename

Original review history
graphite-base/988)Layer 3 of 4.