refactor(skills): split asset preprocessing out of hyperframes-cli by jrusso1020 · Pull Request #619 · heygen-com/hyperframes

jrusso1020 · 2026-05-04T21:59:58Z

What

Split the hyperframes-cli skill into two skills:

hyperframes-cli stays focused on the dev loop: init, lint, inspect, preview, render, doctor, browser, info, upgrade, compositions, docs, benchmark. Description rewritten so it no longer triggers on TTS / transcription / background-removal keywords.
hyperframes-media is new. It owns tts, transcribe, and remove-background — voice selection table, the critical .en-translates-non-English whisper rule, output codec choice (VP9 alpha WebM vs ProRes), the TTS → transcribe → captions chain, and composition integration snippets.

Doc references updated in README.md, CLAUDE.md, docs/quickstart.mdx, and docs/guides/prompting.mdx.

Why

Two problems with the current hyperframes-cli:

Description bloat causes spurious agent loads. The skill description listed every subcommand as a trigger keyword (tts, transcribe, remove-background, ...). Agents auto-loading skills by description were pulling hyperframes-cli for any mention of audio, transcription, or backgrounds — even when the task was just rendering a composition. Internal agents in particular were loading the CLI skill unnecessarily.
Body bloat penalizes the common path. The dev loop (lint/render) is invoked far more often than asset preprocessing, but voice tables, whisper model rules, and codec guidance loaded on every invocation. Now that the CLI ships three preprocessing commands (and they each download multi-hundred-MB models on first run), this content was only going to grow.
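
To make the description-bloat problem concrete, here is a toy model of keyword-based skill triggering. It is only a sketch: the PR does not say how any agent actually decides which skills to load, and the `Skill` type, the `triggeredSkills` function, the keyword list, and the shortened descriptions below are all hypothetical placeholders.

```typescript
// Toy model only: real agents may select skills in ways this PR does not describe.
interface Skill {
  name: string;
  description: string; // shortened placeholder, not the real description text
}

// Lowercase word tokens, keeping hyphenated command names such as
// "remove-background" as a single token.
function tokenize(text: string): Set<string> {
  return new Set<string>(text.toLowerCase().match(/[a-z0-9]+(?:-[a-z0-9]+)*/g) ?? []);
}

// A skill "triggers" when its description shares at least one keyword with the task.
function triggeredSkills(task: string, skills: Skill[], keywords: string[]): string[] {
  const taskTokens = tokenize(task);
  return skills
    .filter((skill) => {
      const descTokens = tokenize(skill.description);
      return keywords.some((k) => taskTokens.has(k) && descTokens.has(k));
    })
    .map((skill) => skill.name);
}

const keywords = ["render", "lint", "tts", "transcribe", "remove-background"];

// Before the split: one skill whose description names every subcommand.
const before: Skill[] = [
  {
    name: "hyperframes-cli",
    description: "init, lint, preview, render, tts, transcribe, remove-background",
  },
];

// After the split: the media commands live in the sibling skill's description.
const after: Skill[] = [
  { name: "hyperframes-cli", description: "init, lint, inspect, preview, render, doctor" },
  { name: "hyperframes-media", description: "tts, transcribe, remove-background" },
];

const task = "transcribe this interview";
console.log(triggeredSkills(task, before, keywords)); // lists only hyperframes-cli
console.log(triggeredSkills(task, after, keywords)); // lists only hyperframes-media
```

Under this model, a transcription task no longer pulls in the dev-loop skill once its description stops naming the media commands, which is the effect the split is aiming for.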

How

One sibling, not three. tts, transcribe, and remove-background share a workflow pattern (first-run model download → produce file → drop into composition) and chain naturally (TTS → transcribe → captions), so bundling them keeps the mental model coherent. The CLI skill references hyperframes-media from a one-paragraph "Asset Preprocessing" stub instead of inlining 50+ lines of substantive guidance.

Net change: hyperframes-cli 173 → 144 lines, new hyperframes-media at 147 lines. Tighter description on hyperframes-cli is the bigger win for spurious-load reduction.

Test plan

bunx oxfmt --check clean on all changed files
bunx oxlint clean on all changed files
Lefthook pre-commit + commit-msg hooks pass
Verify npx skills add heygen-com/hyperframes picks up hyperframes-media after merge
Smoke-test: invoke /hyperframes-media in Claude Code and confirm voice table + .en warning load
Smoke-test: invoke /hyperframes-cli and confirm body no longer mentions Kokoro/Whisper/u2net specifics

Move tts/transcribe/remove-background guidance into a new hyperframes-media sibling skill so the CLI skill stays focused on the dev loop (init/lint/inspect/preview/render/doctor).

Two motivations:

1. Description bloat. The CLI skill listed every subcommand as a trigger keyword, which made agents auto-load it for any mention of audio, transcription, or backgrounds, even when the task was just rendering a composition.
2. Body bloat. Voice tables, the .en-translates-non-English whisper rule, and codec selection guidance all loaded on every CLI invocation. With three preprocessing commands now in the CLI (tts, transcribe, remove-background), this is only going to grow.

The split keeps a single sibling (hyperframes-media), not three: the commands share a workflow (preprocess asset → drop into composition) and the same first-run-downloads-a-model pattern, so they belong together. CLI skill now references hyperframes-media from a one-paragraph "Asset Preprocessing" stub.

Doc references updated in README.md, CLAUDE.md, docs/quickstart.mdx, and docs/guides/prompting.mdx.

mintlify · 2026-05-04T22:04:17Z

Preview deployment for your docs. Learn more about Mintlify Previews.

Project	Status	Preview	Updated (UTC)
hyperframes	🟢 Ready	View Preview	May 4, 2026, 10:04 PM


…edia

Code review found the new hyperframes-media skill was parallel content with skills/hyperframes/references/tts.md and the "Whisper Model Guide" section of transcript-guide.md: same voice table, same .en-translates-non-English warning, same TTS→transcribe chain in both places. There was also some scope creep in hyperframes-media (audio/video HTML snippets that duplicate the canonical track docs in hyperframes/SKILL.md:265+).

Consolidation:

- hyperframes-media is now the single source of truth for CLI invocation, voice selection, multilingual phonemization, whisper model selection, and the .en gotcha. Picked up the multilingual prefix decoding from the deleted tts.md.
- skills/hyperframes/references/tts.md deleted; the bullet in hyperframes/SKILL.md is removed (no replacement; agents land on hyperframes-media via its own description).
- skills/hyperframes/references/transcript-guide.md keeps only the caption-side concerns: input-format table, mandatory quality check, cleaning JS, external-API import path, and the "if no transcript exists" flow. The intro bash recipe and Whisper Model Guide section both moved to hyperframes-media. Top of the file now points to hyperframes-media for CLI/model details.

Other tightening in hyperframes-media:

- Dropped WHAT-narration filler and the inline <audio>/<video> HTML snippets; they duplicate the canonical track-attribute docs in hyperframes/SKILL.md.
- Added the `id` field (`w0`, `w1`, ...) to the transcript output shape. The actual Word interface in packages/cli/src/whisper/normalize.ts includes it (optional for backwards compat), and caption override logic uses it.
- Compressed the TTS → Transcribe → Captions chain section.

Net: hyperframes-media 147 → 136 lines, transcript-guide.md 152 → 106 lines, tts.md gone (-75 lines).
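
For readers unfamiliar with the `id` field mentioned above, the following is a minimal sketch of how an optional per-word id might be used. Only the optional `id` field and its `w0`, `w1`, ... format come from the commit message. The other fields (`text`, `start`, `end`), the `withIds` and `applyOverrides` helpers, and the idea of filling missing ids by position are assumptions for illustration and are not taken from packages/cli/src/whisper/normalize.ts.

```typescript
// Hypothetical shape: only the optional `id` comes from the commit message.
interface Word {
  id?: string; // e.g. "w0", "w1", ...; optional so older transcripts still parse
  text: string; // assumed field
  start: number; // assumed field, in seconds
  end: number; // assumed field, in seconds
}

// Assumption: transcripts without ids get positional ids so both shapes behave alike.
function withIds(words: Word[]): Word[] {
  return words.map((word, index) => ({ ...word, id: word.id ?? `w${index}` }));
}

// Hypothetical caption override: replace the text of specific words, keyed by id.
function applyOverrides(words: Word[], overrides: Record<string, string>): Word[] {
  return withIds(words).map((word) => {
    const replacement = overrides[word.id as string];
    return replacement === undefined ? word : { ...word, text: replacement };
  });
}

const words: Word[] = [
  { text: "Hello", start: 0.0, end: 0.4 },
  { text: "wrold", start: 0.4, end: 0.9 },
];

const fixed = applyOverrides(words, { w1: "world" });
console.log(fixed.map((w) => `${w.id}:${w.text}`).join(" ")); // w0:Hello w1:world
```

Keying overrides by a stable id rather than by array index or by word text is one plausible reason for such a field: a correction keeps pointing at the same word even if neighbouring words are edited.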

vanceingalls

Verdict — Request changes (small, mechanical)

The split itself is sound: clear responsibility boundary, sensible bundling of tts / transcribe / remove-background (shared first-run-model-download pattern, natural TTS→transcribe chain), and at 147 lines hyperframes-media reads as a tight, useful skill. But two stale references on the PR branch directly undercut the stated goal of the split.

Blockers

skills/hyperframes/SKILL.md line 3 description still ends with: "For CLI commands (init, lint, preview, render, transcribe, tts) see the hyperframes-cli skill." The whole point of this PR is to stop transcribe/tts triggering hyperframes-cli autoloads. The parent skill is now telling agents the exact opposite. Update to point to hyperframes-media for those, or split the sentence.
packages/cli/src/templates/_shared/CLAUDE.md line 10 — the skills table that ships into every hyperframes init project still reads hyperframes-cli | CLI commands: init, lint, preview, render, transcribe, tts. Every new project bootstrapped after this lands will get the wrong mapping baked into its CLAUDE.md. Add the hyperframes-media row and trim the cli row to match the new README/quickstart wording.

Important

Test-plan checkboxes for the two smoke tests (invoke /hyperframes-media, invoke /hyperframes-cli and confirm body) are unchecked. For a skill-description refactor these are the only things that actually verify the behavior change — please run them before merge, since lint/format don't catch description-trigger regressions.
Consider whether hyperframes-media warrants a row in the docs feature-comparison / npx hyperframes skills output, if either enumerates skills statically (the skills command uses --all, so it's fine, but worth a grep for hardcoded lists).
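
The suggested grep for hardcoded lists could be approximated with a small line-based check like the one below. This is a sketch under stated assumptions: `findStaleReferences`, `StaleHit`, and the heuristic itself (a line that names hyperframes-cli and a media command but not hyperframes-media) are hypothetical, and the function works on in-memory strings rather than reading the repository.

```typescript
// Commands that the PR description moves to hyperframes-media.
const MEDIA_COMMANDS = ["tts", "transcribe", "remove-background"];

interface StaleHit {
  file: string;
  line: number; // 1-indexed
  text: string;
}

// Flag lines that mention hyperframes-cli together with a media command
// but never mention hyperframes-media. `files` maps a path to its contents.
function findStaleReferences(files: Record<string, string>): StaleHit[] {
  const hits: StaleHit[] = [];
  for (const file of Object.keys(files)) {
    files[file].split("\n").forEach((text, index) => {
      const mentionsCli = text.includes("hyperframes-cli");
      const mentionsMedia = text.includes("hyperframes-media");
      const mentionsCommand = MEDIA_COMMANDS.some((cmd) =>
        new RegExp(`\\b${cmd}\\b`).test(text),
      );
      if (mentionsCli && mentionsCommand && !mentionsMedia) {
        hits.push({ file, line: index + 1, text: text.trim() });
      }
    });
  }
  return hits;
}

// Sample inputs are the strings quoted in this review; the file names are placeholders.
const sample: Record<string, string> = {
  "parent-skill.md":
    "For CLI commands (init, lint, preview, render, transcribe, tts) see the hyperframes-cli skill.",
  "template.md":
    "hyperframes-cli | CLI commands: init, lint, preview, render, transcribe, tts",
  "cli-skill.md":
    "hyperframes-cli: For asset preprocessing commands (tts, transcribe, remove-background), invoke the hyperframes-media skill instead.",
};

for (const hit of findStaleReferences(sample)) {
  console.log(`${hit.file}:${hit.line}: ${hit.text}`);
}
```

With these inputs the two stale lines are reported and the redirect sentence is not, since it already names hyperframes-media. A per-line heuristic like this would miss references that wrap across lines, so it complements rather than replaces a manual read.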

Nits

New hyperframes-media/SKILL.md cross-links to ../hyperframes/references/transcript-guide.md — relative path traversal across sibling skills works but is a slightly leaky boundary. Probably fine since the guide is genuinely caption-authoring content; just flagging that if skills add ever flattens directories this breaks.
hyperframes-cli description: "For asset preprocessing commands (tts, transcribe, remove-background), invoke the hyperframes-media skill instead." — naming the commands here is clear for humans but does put those keywords back into the cli description. Probably unavoidable for the redirect to be useful, but worth eyes on whether spurious loads actually drop after merge.
README skills table description for hyperframes-media is great; the (Kokoro)/(Whisper)/(u2net) parentheticals duplicate trigger keywords but help humans scan — leave as-is.

What's good

The why/how in the PR description is excellent — clear problem statement (description bloat → spurious loads), clear bundling rationale.
hyperframes-media body is well-structured: voice-selection table, .en-translates-non-English rule called out as Non-Negotiable, codec-selection table.
transcript-guide.md correctly de-duped — caption-side vs CLI-side concerns now have distinct homes with a one-line redirect.
references/tts.md deletion is clean (no orphan links remaining in the parent skill).

— Review by Vai

…nters

Review on PR #619 caught two places that still pointed transcribe/tts at hyperframes-cli, directly undercutting the description-trigger goal of the split:

- skills/hyperframes/SKILL.md description ended with "For CLI commands (init, lint, preview, render, transcribe, tts) see the hyperframes-cli skill." Now splits the redirect: dev-loop commands (init, lint, inspect, preview, render) → hyperframes-cli; asset preprocessing (tts, transcribe, remove-background) → hyperframes-media.
- packages/cli/src/templates/_shared/CLAUDE.md is the skills table baked into every project bootstrapped by `hyperframes init`. Its hyperframes-cli row still listed transcribe/tts. Trimmed to the dev-loop commands and added a hyperframes-media row beside it, so new projects pick up the correct mapping.

Also caught by grepping for stale skill lists:

- .codex-plugin/plugin.json longDescription bundled transcribe/tts into "use the CLI for ...". Split into "use the CLI for the dev loop (init/preview/render), preprocess assets (tts/transcribe/remove-background)" so the Codex plugin store surface matches reality.

Confirmed `npx hyperframes skills` shells out to `npx skills add heygen-com/hyperframes --all` (packages/cli/src/commands/skills.ts), so the skill list is read dynamically from the repo and picks up hyperframes-media without code changes.

jrusso1020 · 2026-05-05T03:16:32Z

Thanks @vanceingalls — addressed both blockers in 94dc6c8:

Blockers

✅ skills/hyperframes/SKILL.md:3 description split: dev-loop commands (init, lint, inspect, preview, render) → hyperframes-cli; asset preprocessing (tts, transcribe, remove-background) → hyperframes-media. The parent skill no longer tells agents to load hyperframes-cli for transcribe/tts.
✅ packages/cli/src/templates/_shared/CLAUDE.md:10 — the skills table baked into every bootstrapped project. Trimmed the hyperframes-cli row to the dev-loop commands and added a hyperframes-media row alongside, so new projects from hyperframes init pick up the correct mapping.

Important

I also grepped for any other hardcoded skill lists and caught one more: .codex-plugin/plugin.json longDescription bundled transcribe/tts into "use the CLI for ...". Split it into dev-loop vs asset preprocessing in the same commit.
Verified npx hyperframes skills shells out to npx skills add heygen-com/hyperframes --all (packages/cli/src/commands/skills.ts:17), so the list is read dynamically from the repo — no code change needed there.
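
As a rough illustration of why no code change is needed, the delegation can be pictured as building a fixed argument list with no skill names in it. This is a sketch, not the contents of packages/cli/src/commands/skills.ts: the `skillsAddArgs` function is hypothetical, and only the command string itself is quoted from this thread.

```typescript
// Hypothetical helper: build the argv for the delegated `skills add` call.
// Note that no individual skill name appears anywhere in the arguments.
function skillsAddArgs(repo: string): string[] {
  return ["skills", "add", repo, "--all"];
}

const args = skillsAddArgs("heygen-com/hyperframes");

// A real command would hand these to a process spawner; here we only print them.
console.log(["npx", ...args].join(" ")); // npx skills add heygen-com/hyperframes --all
```

Because the argument list carries only the repo and `--all`, a newly added skill directory such as hyperframes-media is picked up without touching the CLI.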

Test-plan checkboxes — fair point, leaving them for a manual pass after merge in a clean session, since the verification has to come from a fresh agent decision (not the same agent that just authored the split). Will follow up in the PR if the spurious-load behavior doesn't actually drop.

Nits

Relative path ../hyperframes/references/transcript-guide.md — agreed, a slightly leaky boundary. Leaving as-is for now since transcript-guide.md is genuinely caption-authoring content. If skills add ever flattens directories we'll need to inline the redirect or move the file.
CLI description naming tts/transcribe/remove-background in the redirect — necessary for the redirect to be useful. Will keep eyes on whether spurious loads actually drop after merge; if they don't, we can swap the explicit names for "preprocessing commands".
(Kokoro)/(Whisper)/(u2net) parentheticals in the README — leaving as-is per your suggestion.

vanceingalls

Re-review — All blockers resolved

Verified the three commits since my prior review (range 4413d25b..94dc6c8). Commit 94dc6c8 is a surgical, scoped fix touching exactly the three files where stale references lived — no scope creep.

Prior blockers

Resolved — skills/hyperframes/SKILL.md description (line 3). Confirmed at HEAD: now reads "For dev-loop CLI commands (init, lint, inspect, preview, render) see the hyperframes-cli skill; for asset preprocessing commands (tts, transcribe, remove-background) see the hyperframes-media skill." The split-redirect approach (instead of a single redirect) is the right call — agents searching for tts/transcribe keywords now hit hyperframes-media, which is exactly the autoload-routing fix this PR was about.
Resolved — packages/cli/src/templates/_shared/CLAUDE.md (line 10–11). Confirmed at HEAD: hyperframes-cli row trimmed to init, lint, inspect, preview, render, doctor, and a new hyperframes-media row was added directly beneath it. Every project bootstrapped by hyperframes init after this lands will now ship the correct skill table — which was the failure mode I was worried about.

Bonus catch (good)

.codex-plugin/plugin.json longDescription previously said "use the CLI for init/preview/render/transcribe/tts". Now reads "use the CLI for the dev loop (init/preview/render), preprocess assets (tts/transcribe/remove-background) for compositions". Same class of stale reference as the two blockers — nice to have caught it in the same pass so the Codex plugin store surface isn't misleading.

New observations

nit — Worth a follow-up grep some weeks out for npx hyperframes skills output and any registry/site copy that enumerates skills; the dynamic skills add --all path covers most of it (good), but description/marketing copy elsewhere is the kind of thing that drifts.
nit — Test-plan smoke checkboxes (invoke /hyperframes-media, invoke /hyperframes-cli and confirm body) remain unchecked in the PR description. Since the PR is already merged, this is just a flag — the only real verification of a description-trigger refactor is running the agent against the new descriptions, so worth doing post-merge to confirm spurious-load reduction is real.

Verdict

Approve. Both blockers verified at the merge commit. Clean execution on the addressed-feedback pass.

— Review by Vai

mintlify Bot deployed to staging - docs May 4, 2026 22:04 View deployment

vanceingalls reviewed May 5, 2026


miguel-heygen approved these changes May 5, 2026


jrusso1020 merged commit 06f5422 into main May 5, 2026
29 checks passed

jrusso1020 deleted the split-cli-media-skills branch May 5, 2026 03:27

mintlify Bot deployed to staging - docs May 5, 2026 04:19 View deployment

jrusso1020 mentioned this pull request May 5, 2026

chore: release v0.4.45 #628

Closed

vanceingalls reviewed May 5, 2026

jrusso1020 merged 3 commits into main from split-cli-media-skills


3 participants
