refactor(skills): split asset preprocessing out of hyperframes-cli #619
jrusso1020 merged 3 commits into main
Move tts/transcribe/remove-background guidance into a new hyperframes-media sibling skill so the CLI skill stays focused on the dev loop (init/lint/inspect/preview/render/doctor).
Two motivations:
1. Description bloat. The CLI skill listed every subcommand as a trigger keyword, which made agents auto-load it for any mention of audio, transcription, or backgrounds, even when the task was just rendering a composition.
2. Body bloat. Voice tables, the .en-translates-non-English whisper rule, and codec selection guidance all loaded on every CLI invocation. With three preprocessing commands now in the CLI (tts, transcribe, remove-background), this is only going to grow.
The split keeps a single sibling (hyperframes-media), not three: the commands share a workflow (preprocess asset → drop into composition) and the same first-run-downloads-a-model pattern, so they belong together.
CLI skill now references hyperframes-media from a one-paragraph "Asset Preprocessing" stub. Doc references updated in README.md, CLAUDE.md, docs/quickstart.mdx, and docs/guides/prompting.mdx.
…edia
Code review found the new hyperframes-media skill was parallel content with skills/hyperframes/references/tts.md and the "Whisper Model Guide" section of transcript-guide.md: same voice table, same .en-translates-non-English warning, same TTS→transcribe chain in both places. Plus some scope creep in hyperframes-media (audio/video HTML snippets that duplicate the canonical track docs in hyperframes/SKILL.md:265+).
Consolidation:
- hyperframes-media is now the single source of truth for CLI invocation, voice selection, multilingual phonemization, whisper model selection, and the .en gotcha. Picked up the multilingual prefix decoding from the deleted tts.md.
- skills/hyperframes/references/tts.md deleted; the bullet in hyperframes/SKILL.md is removed (no replacement; agents land on hyperframes-media via its own description).
- skills/hyperframes/references/transcript-guide.md keeps only the caption-side concerns: input-format table, mandatory quality check, cleaning JS, external-API import path, and the "if no transcript exists" flow. The intro bash recipe and Whisper Model Guide section both moved to hyperframes-media. Top of the file now points to hyperframes-media for CLI/model details.
Other tightening in hyperframes-media:
- Dropped WHAT-narration filler and the inline <audio>/<video> HTML snippets; they duplicate the canonical track-attribute docs in hyperframes/SKILL.md.
- Added the `id` field (`w0`, `w1`, ...) to the transcript output shape; the actual Word interface in packages/cli/src/whisper/normalize.ts includes it (optional for backwards compat), used by caption override logic.
- Compressed the TTS → Transcribe → Captions chain section.
Net: hyperframes-media 147 → 136 lines, transcript-guide.md 152 → 106 lines, tts.md gone (-75 lines).
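To make the transcript output shape concrete, here is a minimal TypeScript sketch of a word entry carrying the optional `id` field. Only the `id` field and its `w0`, `w1`, ... pattern come from the commit message above; the `text`, `start`, and `end` field names and the `assignWordIds` helper are assumptions for illustration and are not taken from `packages/cli/src/whisper/normalize.ts`.

```typescript
// Hypothetical shape: only the optional `id` field is confirmed by the commit message.
interface Word {
  id?: string; // e.g. "w0", "w1", ...; optional for backwards compatibility
  text: string; // assumed field name
  start: number; // assumed: start time in seconds
  end: number; // assumed: end time in seconds
}

// Hypothetical helper: fills in a positional id where one is missing
// and leaves any existing id untouched.
function assignWordIds(words: Word[]): Word[] {
  return words.map((word, index) => ({ ...word, id: word.id ?? `w${index}` }));
}

// A transcript written before ids existed still type-checks and can be upgraded.
const legacy: Word[] = [
  { text: "Hello", start: 0.0, end: 0.4 },
  { text: "world", start: 0.5, end: 0.9 },
];
const withIds = assignWordIds(legacy);
```

Keeping `id` optional means older transcripts stay valid, while anything that needs a stable handle on a word (such as the caption override logic mentioned above) can rely on ids once they are present.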
vanceingalls
left a comment
Verdict — Request changes (small, mechanical)
The split itself is sound — clear responsibility boundary, sensible bundling of tts / transcribe / remove-background (shared first-run-model-download pattern, natural TTS→transcribe chain), and hyperframes-media's 147 lines reads as a tight, useful skill. But two stale references on the PR branch directly undercut the stated goal of the split.
Blockers
- `skills/hyperframes/SKILL.md` line 3 description still ends with: "For CLI commands (init, lint, preview, render, transcribe, tts) see the hyperframes-cli skill." The whole point of this PR is to stop `transcribe`/`tts` triggering `hyperframes-cli` autoloads. The parent skill is now telling agents the exact opposite. Update to point to `hyperframes-media` for those, or split the sentence.
- `packages/cli/src/templates/_shared/CLAUDE.md` line 10: the skills table that ships into every `hyperframes init` project still reads `hyperframes-cli | CLI commands: init, lint, preview, render, transcribe, tts`. Every new project bootstrapped after this lands will get the wrong mapping baked into its CLAUDE.md. Add the `hyperframes-media` row and trim the cli row to match the new README/quickstart wording.
Important
- Test-plan checkboxes for the two smoke tests (invoke `/hyperframes-media`, invoke `/hyperframes-cli` and confirm body) are unchecked. For a skill-description refactor these are the only things that actually verify the behavior change; please run them before merge, since lint/format don't catch description-trigger regressions.
- Consider whether `hyperframes-media` warrants a row in the docs feature-comparison / `npx hyperframes skills` output, if either enumerates skills statically (the `skills` command uses `--all`, so it's fine, but worth a grep for hardcoded lists).
Nits
- New `hyperframes-media/SKILL.md` cross-links to `../hyperframes/references/transcript-guide.md`; relative path traversal across sibling skills works but is a slightly leaky boundary. Probably fine since the guide is genuinely caption-authoring content; just flagging that if `skills add` ever flattens directories this breaks.
- `hyperframes-cli` description: "For asset preprocessing commands (`tts`, `transcribe`, `remove-background`), invoke the `hyperframes-media` skill instead." Naming the commands here is clear for humans but does put those keywords back into the cli description. Probably unavoidable for the redirect to be useful, but worth eyes on whether spurious loads actually drop after merge.
- README skills table description for `hyperframes-media` is great; the `(Kokoro)` / `(Whisper)` / `(u2net)` parentheticals duplicate trigger keywords but help humans scan, so leave as-is.
What's good
- The why/how in the PR description is excellent: clear problem statement (description bloat → spurious loads), clear bundling rationale.
- `hyperframes-media` body is well-structured: voice-selection table, `.en`-translates-non-English rule called out as Non-Negotiable, codec-selection table.
- `transcript-guide.md` correctly de-duped: caption-side vs CLI-side concerns now have distinct homes with a one-line redirect.
- `references/tts.md` deletion is clean (no orphan links remaining in the parent skill).
— Review by Vai
…nters
Review on PR #619 caught two places that still pointed transcribe/tts at hyperframes-cli, directly undercutting the description-trigger goal of the split:
- skills/hyperframes/SKILL.md description ended with "For CLI commands (init, lint, preview, render, transcribe, tts) see the hyperframes-cli skill." Now splits the redirect: dev-loop commands (init, lint, inspect, preview, render) → hyperframes-cli; asset preprocessing (tts, transcribe, remove-background) → hyperframes-media.
- packages/cli/src/templates/_shared/CLAUDE.md is the skills table baked into every project bootstrapped by `hyperframes init`. Its hyperframes-cli row still listed transcribe/tts. Trimmed to the dev-loop commands and added a hyperframes-media row beside it, so new projects pick up the correct mapping.
Also caught by grepping for stale skill lists:
- .codex-plugin/plugin.json longDescription bundled transcribe/tts into "use the CLI for ...". Split into "use the CLI for the dev loop (init/preview/render), preprocess assets (tts/transcribe/remove-background)" so the Codex plugin store surface matches reality.
Confirmed `npx hyperframes skills` shells out to `npx skills add heygen-com/hyperframes --all` (packages/cli/src/commands/skills.ts), so the skill list is read dynamically from the repo and picks up hyperframes-media without code changes.
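The grep for stale skill lists described above could be automated along these lines. This is a minimal sketch rather than anything in the repository: `findStaleCliReferences`, `StaleHit`, and the one-line heuristic are hypothetical, and the "after" row is an illustrative rendering of the trimmed table entry, not a verbatim copy. It flags any line that mentions `hyperframes-cli` together with a command that moved to `hyperframes-media`, unless the same line also names `hyperframes-media` (so redirect sentences are not reported).

```typescript
// Commands that moved to hyperframes-media according to this PR.
const MEDIA_COMMANDS = ["tts", "transcribe", "remove-background"];

interface StaleHit {
  lineNumber: number; // 1-based
  command: string;
  line: string;
}

// Hypothetical checker: reports lines that pair hyperframes-cli with a media
// command and never mention hyperframes-media.
function findStaleCliReferences(text: string): StaleHit[] {
  const hits: StaleHit[] = [];
  text.split("\n").forEach((line, index) => {
    if (!line.includes("hyperframes-cli") || line.includes("hyperframes-media")) return;
    for (const command of MEDIA_COMMANDS) {
      // Match the command as a whole token, not as part of a longer word.
      const pattern = new RegExp(`(^|[^\\w-])${command}($|[^\\w-])`);
      if (pattern.test(line)) hits.push({ lineNumber: index + 1, command, line });
    }
  });
  return hits;
}

// The stale row quoted in the review, and an illustrative trimmed row.
const staleRow = "| hyperframes-cli | CLI commands: init, lint, preview, render, transcribe, tts |";
const fixedRow = "| hyperframes-cli | CLI commands: init, lint, inspect, preview, render, doctor |";

console.log(findStaleCliReferences(staleRow).map((hit) => hit.command));
console.log(findStaleCliReferences(fixedRow).length);
```

A line-based heuristic like this only catches references where the skill name and the command share a line, which matches all three stale references fixed here; multi-line descriptions would need a wider window.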
Thanks @vanceingalls, addressed both blockers in 94dc6c8:
Blockers
Important
Test-plan checkboxes: fair point, leaving them for a manual pass after merge in a clean session, since the verification has to come from a fresh agent decision (not the same agent that just authored the split). Will follow up in the PR if the spurious-load behavior doesn't actually drop.
Nits
vanceingalls
left a comment
Re-review — All blockers resolved
Verified the three commits since my prior review (range 4413d25b..94dc6c8). Commit 94dc6c8 is a surgical, scoped fix touching exactly the three files where stale references lived — no scope creep.
Prior blockers
- Resolved: `skills/hyperframes/SKILL.md` description (line 3). Confirmed at HEAD: now reads "For dev-loop CLI commands (init, lint, inspect, preview, render) see the hyperframes-cli skill; for asset preprocessing commands (tts, transcribe, remove-background) see the hyperframes-media skill." The split-redirect approach (instead of a single redirect) is the right call: agents searching for `tts`/`transcribe` keywords now hit `hyperframes-media`, which is exactly the autoload-routing fix this PR was about.
- Resolved: `packages/cli/src/templates/_shared/CLAUDE.md` (lines 10–11). Confirmed at HEAD: `hyperframes-cli` row trimmed to `init, lint, inspect, preview, render, doctor`, and a new `hyperframes-media` row was added directly beneath it. Every project bootstrapped by `hyperframes init` after this lands will now ship the correct skill table, which was the failure mode I was worried about.
Bonus catch (good)
- `.codex-plugin/plugin.json` `longDescription` previously said "use the CLI for init/preview/render/transcribe/tts". Now reads "use the CLI for the dev loop (init/preview/render), preprocess assets (tts/transcribe/remove-background) for compositions". Same class of stale reference as the two blockers; nice to have caught it in the same pass so the Codex plugin store surface isn't misleading.
New observations
- nit: Worth a follow-up grep some weeks out for `npx hyperframes skills` output and any registry/site copy that enumerates skills; the dynamic `skills add --all` path covers most of it (good), but description/marketing copy elsewhere is the kind of thing that drifts.
- nit: Test-plan smoke checkboxes (invoke `/hyperframes-media`, invoke `/hyperframes-cli` and confirm body) remain unchecked in the PR description. Since the PR is already merged, this is just a flag: the only real verification of a description-trigger refactor is running the agent against the new descriptions, so worth doing post-merge to confirm spurious-load reduction is real.
Verdict
Approve. Both blockers verified at the merge commit. Clean execution on the addressed-feedback pass.
— Review by Vai
What
Split the `hyperframes-cli` skill into two skills:
- `hyperframes-cli` stays focused on the dev loop: `init`, `lint`, `inspect`, `preview`, `render`, `doctor`, `browser`, `info`, `upgrade`, `compositions`, `docs`, `benchmark`. Description rewritten so it no longer triggers on TTS / transcription / background-removal keywords.
- `hyperframes-media` is new. It owns `tts`, `transcribe`, and `remove-background`: voice selection table, the critical `.en`-translates-non-English whisper rule, output codec choice (VP9 alpha WebM vs ProRes), the TTS → transcribe → captions chain, and composition integration snippets.
Doc references updated in `README.md`, `CLAUDE.md`, `docs/quickstart.mdx`, and `docs/guides/prompting.mdx`.
Why
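Two problems with the current `hyperframes-cli`:
1. Description bloat. The description listed every subcommand as a trigger keyword (`tts`, `transcribe`, `remove-background`, ...). Agents auto-loading skills by description were pulling `hyperframes-cli` for any mention of audio, transcription, or backgrounds, even when the task was just rendering a composition. Internal agents in particular were loading the CLI skill unnecessarily.
2. Body bloat. Voice tables, the `.en`-translates-non-English whisper rule, and codec selection guidance all loaded on every CLI invocation.
How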
One sibling, not three. `tts`, `transcribe`, and `remove-background` share a workflow pattern (first-run model download → produce file → drop into composition) and chain naturally (TTS → transcribe → captions), so bundling them keeps the mental model coherent. The CLI skill references `hyperframes-media` from a one-paragraph "Asset Preprocessing" stub instead of inlining 50+ lines of substantive guidance.
Net change: `hyperframes-cli` 173 → 144 lines, new `hyperframes-media` at 147 lines. Tighter description on `hyperframes-cli` is the bigger win for spurious-load reduction.
Test plan
- `bunx oxfmt --check` clean on all changed files
- `bunx oxlint` clean on all changed files
- `npx skills add heygen-com/hyperframes` picks up `hyperframes-media` after merge
- Invoke `/hyperframes-media` in Claude Code and confirm voice table + `.en` warning load
- Invoke `/hyperframes-cli` and confirm body no longer mentions Kokoro/Whisper/u2net specifics
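The two smoke tests above need a real agent session, but a rough pre-check of description routing can be sketched in code. The example below is an assumption-heavy proxy: `Skill`, `skillsTriggeredBy`, and the substring-matching rule are hypothetical, real agents choose skills with a model rather than keyword matching, and the descriptions are short paraphrases rather than the text in the repository.

```typescript
interface Skill {
  name: string;
  description: string;
}

// Crude proxy: a skill "triggers" if any keyword appears in its description.
// Real agent skill selection is model-driven; this only approximates it.
function skillsTriggeredBy(keywords: string[], skills: Skill[]): string[] {
  return skills
    .filter((skill) => {
      const description = skill.description.toLowerCase();
      return keywords.some((keyword) => description.includes(keyword.toLowerCase()));
    })
    .map((skill) => skill.name);
}

// Illustrative descriptions, paraphrased rather than copied from the repo.
const skills: Skill[] = [
  { name: "hyperframes-cli", description: "Dev loop: init, lint, inspect, preview, render, doctor." },
  { name: "hyperframes-media", description: "Asset preprocessing: tts, transcribe, remove-background." },
];

console.log(skillsTriggeredBy(["transcribe"], skills));
console.log(skillsTriggeredBy(["render"], skills));
```

If the `hyperframes-cli` description also names `tts`, `transcribe`, and `remove-background` in its redirect sentence, this proxy would still match it for those keywords, which is the concern raised in the review nits. That is why the manual smoke tests remain the real verification.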