feat(capture): extract chips/stat-cells/tabs, detect icon fonts, transparent grounds by xuanruli · Pull Request #1827 · heygen-com/hyperframes

xuanruli · 2026-07-01T03:43:27Z

What

Extends the capture engine's design-style + font extraction so a wider range of real-world sites produce faithful, usable component/typography tokens.

designStyleExtractor.ts

Extracts three more component families beyond buttons/cards/nav: chips (pill/badge/tag), stat/metric cells, and tabs — by class-substring selector plus a shape fallback (small + fully-rounded + short text) so hashed/utility class names (Tailwind, CSS-modules, Next.js) are still caught.
Emits a "transparent" sentinel for fully-transparent (rgba(...,0)) grounds instead of collapsing them to #000000, so a transparent chip/tab/stat on a light-ground site no longer reads as solid black.
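A minimal sketch of the transparent-sentinel idea, for illustration only. The name `rgbToHexSketch`, its signature, and the `null` return for unsupported formats are assumptions; the PR's real `rgbToHex` runs inside a `page.evaluate` script and is not shown on this page. The sketch assumes the comma-separated `rgb(...)` / `rgba(...)` form that computed styles usually serialize to.

```typescript
// Hypothetical helper, not the PR's implementation.
// Converts a computed-style color string to hex, but reports a fully
// transparent color as the "transparent" sentinel instead of "#000000".
function rgbToHexSketch(color: string): string | null {
  // Matches "rgb(r, g, b)" and "rgba(r, g, b, a)".
  const m = color.match(
    /^rgba?\(\s*(\d+(?:\.\d+)?)\s*,\s*(\d+(?:\.\d+)?)\s*,\s*(\d+(?:\.\d+)?)\s*(?:,\s*(\d*\.?\d+)\s*)?\)$/,
  );
  if (!m) return null; // e.g. oklch(...) or hsl(...): out of scope for this sketch

  // A missing alpha channel means fully opaque.
  const alpha = m[4] === undefined ? 1 : parseFloat(m[4]);
  if (alpha === 0) return "transparent";

  // Clamp each channel to 0..255 and emit two hex digits.
  const toHex = (v: string | undefined): string =>
    Math.min(255, Math.max(0, Math.round(parseFloat(v ?? "0"))))
      .toString(16)
      .padStart(2, "0");

  return `#${toHex(m[1])}${toHex(m[2])}${toHex(m[3])}`;
}

// rgbToHexSketch("rgba(0, 0, 0, 0)")   → "transparent"
// rgbToHexSketch("rgb(255, 255, 255)") → "#ffffff"
```

Without the alpha check, `rgba(0, 0, 0, 0)` would fall through to the channel conversion and come out as `#000000`, which is the data-loss case described above.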

fontMetadataExtractor.ts

Flags icon fonts (isIcon) by Private-Use-Area glyph ratio (>50%). Icon fonts ship arbitrary names — swiper-icons, a custom hushly, Font Awesome — that no name-list can enumerate; without this they get mistaken for a text family and render headings as tofu/icons. A plain "no Latin letters" test is deliberately avoided: a text font served as a unicode-range subset legitimately lacks A yet is 0% PUA.

types.ts — DesignStyles gains optional chips/statCells/tabs; new StatCellStyle; FontFileMetadata gains isIcon.

Why

Validated end-to-end across 7 diverse sites (Stripe, LiveKit, DoorDash, Snowflake, Linear, ElevenLabs, Kuse). Each surfaced a distinct real-world case this PR handles generally (not per-site): oklch/hsl colors, camelCase/hashed font names, icon fonts, transparent grounds, unicode-range subsets. Snowflake's hushly icon font was rendering headings as icon glyphs before the isIcon fix.

Tests

New unit tests for isIconCharacterSet (PUA-heavy → icon; Latin / cyrillic-subset / empty → not).
fontMetadataExtractor.test.ts: 39 passing. bun run build, oxlint, oxfmt, typecheck all clean on changed files.

🤖 Generated with Claude Code

…sparent grounds

designStyleExtractor now also extracts chip/pill/badge/tag, stat/metric cells, and tab components, by class-substring selector plus a shape fallback (small + fully rounded + short text) so hashed/utility class names (Tailwind, CSS-modules) are still caught. It also emits a "transparent" sentinel for fully-transparent (rgba(...,0)) grounds instead of collapsing them to #000000, so a transparent chip/tab/stat on a light-ground site no longer reads as solid black.

fontMetadataExtractor now flags icon fonts (isIcon) by glyph coverage: a font is an icon font only when it BOTH lacks a real Latin alphabet (<26 of A-Za-z) AND is mostly (>50%) Private-Use-Area glyphs. The Latin gate matters: some text fonts pack thousands of PUA glyphs yet are plainly text (Apple SF Pro is ~81% PUA but ships a full alphabet; Descript's Booton ~50%); flagging by PUA ratio alone would strip a brand's real typeface. Measured icon fonts: "hushly" 63% PUA / 7 letters, Font Awesome 95% / 0 letters. Names alone can't identify icon fonts ("hushly", "swiper-icons"), hence the glyph-based test.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
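The dual gate described in the commit message could be sketched roughly as follows. This is an illustration under assumptions, not the PR's `isIconCharacterSet`: the function and helper names are hypothetical, the ratio is computed over unique code points, and counting the supplementary Private Use planes alongside the BMP range is a choice made here rather than something the source states.

```typescript
// Hypothetical sketch of a "lacks Latin alphabet AND mostly PUA" test.

// True for code points in any Unicode Private Use range.
function isPrivateUseSketch(cp: number): boolean {
  return (
    (cp >= 0xe000 && cp <= 0xf8ff) || // BMP Private Use Area
    (cp >= 0xf0000 && cp <= 0xffffd) || // Supplementary Private Use Area-A
    (cp >= 0x100000 && cp <= 0x10fffd) // Supplementary Private Use Area-B
  );
}

// True for A-Z and a-z only.
function isBasicLatinLetterSketch(cp: number): boolean {
  return (cp >= 0x41 && cp <= 0x5a) || (cp >= 0x61 && cp <= 0x7a);
}

function isIconCharacterSetSketch(codePoints: readonly number[]): boolean {
  const unique = new Set(codePoints);
  if (unique.size === 0) return false; // an empty set is not evidence of an icon font

  let puaCount = 0;
  let latinCount = 0;
  unique.forEach((cp) => {
    if (isPrivateUseSketch(cp)) puaCount += 1;
    if (isBasicLatinLetterSketch(cp)) latinCount += 1;
  });

  const lacksLatinAlphabet = latinCount < 26; // fewer than 26 of the 52 A-Za-z letters
  const mostlyPrivateUse = puaCount / unique.size > 0.5;
  return lacksLatinAlphabet && mostlyPrivateUse;
}
```

Under this sketch, a font with a full alphabet plus a large PUA block (the SF Pro case) fails the first gate, and a unicode-range subset with no Latin letters but 0% PUA (for example a Cyrillic subset) fails the second, so neither is flagged.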

miga-heygen

Review: feat(capture): extract chips/stat-cells/tabs, detect icon fonts, transparent grounds

Summary: Extends the capture engine's design-style extraction to three new component families (chips, stat cells, tabs), adds icon-font detection via PUA glyph ratio, and fixes transparent backgrounds being collapsed to #000000. Well-structured, well-tested, validated across diverse sites.

Findings:

| # | Location | Severity | Note |
|---|----------|----------|------|
| 1 | `fontMetadataExtractor.ts:206` | concern | `font as unknown as { characterSet?: number[] }` is a double-cast. Project CLAUDE.md says "Avoid `any` and `as T` assertions." The `try/catch` + `Array.isArray` guard makes it runtime-safe, but consider a type guard function instead: `function hasCharacterSet(f: Font): f is Font & { characterSet: number[] }`. |
| 2 | `designStyleExtractor.ts:209` | suggestion | `parseFloat(st.borderRadius)` only reads the first value of shorthand like `"24px 24px 0px 0px"`. An element with only top-rounded corners would pass the pill-shape check. Unlikely to cause false positives given the other constraints (height ≤ 44, width ≤ 260, short text, has skin). |
| 3 | `designStyleExtractor.ts:262` | suggestion | `[class*="tab"]` could match `tabpanel`, `tabindex`, `tabular`, `establish`, `stable`. Downstream size filters mitigate most false positives, but consider additional `:not()` exclusions if observed in practice. |
| 4 | `fontMetadataExtractor.ts:192` | nit | `isIcon` not propagated to `FontFamilySummary`. Consumers need to iterate `files` to discover if a family is an icon font. Worth noting for future consumers. |
| 5 | `designStyleExtractor.ts:216-226` | nit | Same DOM element can appear in both `chipEls` and `shapeChips`, getting `getStyles()` called twice before dedup by key. No correctness issue, just redundant work. |
| 6 | `rgbToHex` (transparent fix) | nit | No unit test for the `transparent` sentinel. The function lives inside a `page.evaluate` script so unit testing requires extraction or e2e. Not a blocker. |
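To illustrate finding 2, here is one way a pill-shape check could look at all four corners instead of only the first shorthand value. It is a sketch under assumptions: `BoxSketch`, `expandRadiiPx`, `isPillShapeSketch`, and the `maxTextLength` default of 24 are hypothetical; only the height ≤ 44 and width ≤ 260 limits come from the review. Percentage and elliptical radii are treated as unrounded here to keep the example small.

```typescript
// Hypothetical plain-data view of an element, so the check stays DOM-free.
interface BoxSketch {
  width: number;
  height: number;
  borderRadius: string; // computed shorthand, e.g. "24px 24px 0px 0px"
  text: string;
  hasSkin: boolean; // assumed to mean a visible ground or border
}

// Expands the shorthand into [top-left, top-right, bottom-right, bottom-left]
// using the usual CSS rules for 1 to 4 values. Non-px values count as 0.
function expandRadiiPx(borderRadius: string): [number, number, number, number] {
  const horizontal = borderRadius.split("/")[0] ?? "";
  const px = horizontal
    .trim()
    .split(/\s+/)
    .map((part) => (/^\d*\.?\d+px$/.test(part) ? parseFloat(part) : 0));
  const [a = 0, b = a, c = a, d = b] = px;
  return [a, b, c, d];
}

function isPillShapeSketch(box: BoxSketch, maxTextLength = 24): boolean {
  if (box.height <= 0 || box.height > 44 || box.width > 260) return false;
  if (!box.hasSkin) return false;
  if (box.text.trim().length > maxTextLength) return false;
  // Fully rounded: every corner radius covers at least half the height.
  return expandRadiiPx(box.borderRadius).every((r) => r >= box.height / 2);
}

// parseFloat("24px 24px 0px 0px") yields 24, which hides the two square corners;
// expandRadiiPx("24px 24px 0px 0px") yields [24, 24, 0, 0].
```

With this shape, a top-rounded tab such as `"24px 24px 0px 0px"` is rejected by the corner test itself rather than relying on the size and text filters.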

What looks good:

Icon-font detection heuristic is clever — dual-gate (Latin alphabet + PUA ratio) handles the tricky SF Pro / Booton false-positive case that naive PUA-only would miss
Transparent sentinel is a clean fix for a data-loss bug (transparent → #000000)
Type additions (chips?, statCells?, tabs?) are backward-compatible
Test coverage is solid — SF Pro test case is particularly valuable (validates the Latin gate)
Shape-fallback chip detection fills the gap for sites that don't use class names with "chip/tag/badge/pill"
Stat cell extraction correctly finds the biggest-font child for the "number" style

Verdict: LGTM — Well-designed extraction with real-world validation. The main actionable suggestion is replacing the as unknown as cast with a type guard.
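A rough sketch of what the suggested type guard might look like. `FontLike` is a stand-in for the font library's `Font` type, which this page does not show, and `readCodePointsSketch` is a hypothetical caller. Using `Reflect.get` inside a `try/catch` is one way to read the property without an `as` assertion while still tolerating a getter that throws; the PR may have solved this differently.

```typescript
// Placeholder for the real font type (an assumption for this example).
interface FontLike {
  familyName?: string;
}

// Type guard in the spirit of the reviewer's suggested signature.
function hasCharacterSet(f: FontLike): f is FontLike & { characterSet: number[] } {
  try {
    // Reflect.get avoids a cast; the read may still throw if characterSet
    // is a lazily evaluated getter on a malformed font.
    const value: unknown = Reflect.get(f, "characterSet");
    return Array.isArray(value) && value.every((n) => typeof n === "number");
  } catch {
    return false;
  }
}

// Hypothetical caller: returns the code points, or an empty list when the
// font object does not expose a usable characterSet.
function readCodePointsSketch(font: FontLike): number[] {
  return hasCharacterSet(font) ? font.characterSet : [];
}
```

Inside the `true` branch the compiler already knows `font.characterSet` is `number[]`, so the call site needs neither `any` nor `as unknown as`.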

— Miga

xuanruli force-pushed the feat/capture-component-extraction branch from f4f61b5 to 6cc8731 on July 1, 2026 03:55

xuanruli requested review from jrusso1020, miguel-heygen and ukimsanov July 1, 2026 04:04

miga-heygen reviewed Jul 1, 2026


miguel-heygen approved these changes Jul 1, 2026


xuanruli merged commit 8694424 into main Jul 1, 2026
41 checks passed

xuanruli deleted the feat/capture-component-extraction branch July 1, 2026 04:30

xuanruli merged 1 commit into main from feat/capture-component-extraction
