fix(lint): catch visible markup comments by miguel-heygen · Pull Request #1819 · heygen-com/hyperframes

miguel-heygen · 2026-06-30T20:16:25Z

Summary

Final renders can show authoring comments as literal text when a composition contains CSS/JS block-comment syntax directly in HTML markup, because browsers only treat  as HTML comments. This adds a shared @hyperframes/lint error so the failure class is caught before any HyperFrames producer reaches render.

The new visible_markup_comment rule catches /* ... */ text nodes in visible markup while avoiding legitimate contexts: CSS/JS blocks, templates, attributes, real HTML comments, and intentional code/preformatted examples. It intentionally does not flag // text because URLs and normal prose make that too noisy; EF #40934 covers the Zephyr sanitizer/prompt side, and this PR adds the OSS preflight guard for the concrete render leak.

Related: https://github.com/heygen-com/experiment-framework/pull/40934

Tests

bun run --filter @hyperframes/parsers build
bun run --filter @hyperframes/lint test -- src/rules/core.test.ts
bun run --filter @hyperframes/lint test
bun run --filter @hyperframes/lint typecheck
bun run format:check -- packages/lint/src/rules/core.ts packages/lint/src/rules/core.test.ts
bun run --filter @hyperframes/lint build

james-russo-rames-d-jusso

Reviewed at 589f7d7f88c71ba4034489f8189d34ac55d83fae.

LGTM at the comment-level — clean root-cause-shaped fix that catches the EF#40934 leak vector at the preflight gate. A few non-blocking observations on exemption surface area + reporting shape.

🟢 Strengths

Right positioning. This is a preflight guard at @hyperframes/lint, sibling to head_leaked_text — not a runtime band-aid in htmlCompiler. Symptom (visible-text leak) is caught BEFORE any producer reaches render, which is exactly where this class of bug belongs. Pairs cleanly with the Zephyr sanitizer side in EF#40934 without double-implementing the policy.
Exemption coverage is thoughtful. style / script / template / pre / code / textarea covers the common authoring contexts; attribute exemption via isInsideHtmlTag (core.ts:155-160) handles data-*="/* ... */" and style="/* ... */" correctly;  is implicitly exempt because buildLintContext calls stripHtmlComments(rawSource) to produce the scanned source (context.ts:31), so the protected-block list doesn't need to re-encode HTML-comment semantics.

🟡 Questions / suggestions (non-blocking)

<noscript> is not in the protected-block list (core.ts:72-73). In an HF render the browser has JS enabled, so <noscript> content is never displayed. If anyone parks /* JS disabled fallback */ style notes inside <noscript> it'll lint-error without any visible-text consequence. Edge-case, but mechanically the same shape as the <template> exemption you already added — worth a one-liner addition unless you've intentionally decided to keep <noscript> strict.
Mis-balanced-<style> is the EF#40934 vector — verify the protected-block regex matches the failure case, not just the well-formed one. VISIBLE_MARKUP_COMMENT_PROTECTED_BLOCK_PATTERN requires a closing </style> (core.ts:72-73). When <style> is correctly closed → protected. When <style> is unclosed (the original leak shape) → the regex doesn't match, so the comments inside fall outside any protected range and the rule fires. That's actually the correct behavior here, but it's load-bearing for the EF#40934 regression-fixture story — the current test (core.test.ts:373-385) exercises a bare-/* */-in-body shape, not the unclosed-<style> shape. Consider adding a fixture that mirrors the actual EF#40934 leak (CSS comment that escaped a malformed style block) so future protected-block edits can't silently break the original-vector coverage.
Single-finding semantics. findVisibleMarkupCommentLeak returns the first match and exits (core.ts:163-173). If a composition has multiple leaks, authors fix-rebuild-fix iteratively. Consistent with how head_leaked_text already behaves, so I don't think you want to deviate here — flagging only because it's a known-pleasant DX paper-cut on the existing rules and this one inherits it.
Exemption-completeness test coverage. The "doesn't report block comments in attributes, html comments, or code examples" test (core.test.ts:404-417) exercises , attribute, and <pre> — but the rule also exempts template, code, and textarea. None of those are covered by tests. Cheap to add a fixture so future exemption-list edits don't silently regress.
<title> and SVG <text> are not exempt. <title> only renders in the browser tab (not in HF video output), so flagging there is a false-positive in spirit. SVG <text> does render but presumably authors don't put C-style comments inside it. Both are extreme edge cases — calling out for completeness; not asking for action.

— Review by Rames D Jusso

vanceingalls

Review — fix(lint): catch visible markup comments

Verdict: LGTM with should-fix
Repo: heygen-com/hyperframes Head: 589f7d7f
Scope: new visible_markup_comment core rule + 3 tests in packages/lint

Summary

Adds a core.ts rule that catches CSS/JS-style /* … */ block comments rendering as visible HTML text (EF #40934). Architecture is the right one — extract protected outer-grammar regions first, scan residue for /* … */ — and exemption regex shapes (case-insensitive, lazy, attribute-tolerant closers) are sound. The protected-block list, however, silently disagrees with the sibling HEAD_BLOCKS_TO_IGNORE_PATTERN 11 lines above it: title and noscript are recognized in this file as "ignored" blocks for the existing head_leaked_text rule but are not protected here.

Convergent with Rames's review (posted 20 min before mine) on the substantive observations — <noscript> / <title> / SVG <text> exemption gap, HTML-comment exemption coming from upstream stripHtmlComments, and test-coverage gaps. Divergence: Rames ranks all of these non-blocking; I'm calling the <title> + <noscript> omission 🟠 should-fix specifically because the same file already enumerates those tags as "not visible" in HEAD_BLOCKS_TO_IGNORE_PATTERN, which makes this a sibling-pattern contradiction in one file — the kind of band-aid signal I'd rather close at R2 than carry forward.

Architectural read

Option B (extract exempt regions, then scan residue) — correct call. The alternative (one widely-negated /* … */ regex skirting context) was always going to be a maintenance trap for embedded grammars. Implementation breaks cleanly into three composable helpers (findProtectedVisibleMarkupRanges, isInsideSourceRange, isInsideHtmlTag), with findVisibleMarkupCommentLeak composing them. By the time this rule runs, LintContext.source has already been through stripHtmlComments, so the  test passes via an upstream mechanism rather than anything the new rule does — see finding D.

Findings

🟠 should-fix — `<title>` and `<noscript>` not in the protected-blocks list

File: packages/lint/src/rules/core.ts:72-73

// line 60-61
const HEAD_BLOCKS_TO_IGNORE_PATTERN =
  /<(?:style|script|template|title|noscript)\b[^>]*>[\s\S]*?<\/(?:style|script|template|title|noscript)(?:\s[^>]*)?>/gi;

Empirical confirmation on the head-SHA implementation (Node shell against the actual regexes):

// <title>/* tab name */ Hello</title>  → fires visible_markup_comment
// <noscript>/* fallback */</noscript>  → fires visible_markup_comment

Neither renders as visible video markup — <title> content goes to the browser-tab title, and <noscript> content is suppressed when JS is on (HF's default). Rames flagged both these cases too (his 🟡 #1 + #5) but ranked them non-blocking; the reason I'm calling it should-fix is the contradiction with the sibling pattern in the same file. Two rules in core.ts applying different definitions of "what's visible markup" is exactly the band-aid shape that bites later when someone updates one list and forgets the other.

Fix: add title and noscript to VISIBLE_MARKUP_COMMENT_PROTECTED_BLOCK_PATTERN. The closing-tag-with-trailing-attrs handling (<\/\1(?:\s[^>]*)?>) already covers both cleanly. Add one positive test apiece. Even better, extract style|script|template|title|noscript as a shared constant and reuse here + line 61.

🟠 should-fix — `<svg>` content not exempted

File: packages/lint/src/rules/core.ts:72-73

Verified at head:

// <svg><text>/* SVG label */</text></svg>  → fires visible_markup_comment

Legitimate SVG diagrams (e.g. a UML-style figure, a "code-style label" composition) can carry literal /* */ text in <text> elements. The rule has no escape hatch for them. Rames flagged this too (his 🟡 #5) as an extreme edge case; I'm leaving it as should-fix because the EF #40934 shape is specifically "comment leaks at body level," not "comment inside SVG content" — adding svg to the protected list closes a false-positive class without weakening the regression-fixture coverage.

💭 nit — HTML-comment exemption is provided upstream, not by this rule

File: packages/lint/src/rules/core.test.ts:413-426

The test name says "does not report block comments in attributes, html comments, or code examples." The HTML-comment branch passes — but not because the rule exempts HTML comments. buildLintContext calls stripHtmlComments(rawSource) and passes the stripped string as source, so  is already gone by the time findVisibleMarkupCommentLeak runs. (Rames called this out as a strength in the upstream-decomposition sense, which is fair.) That's fine, but if this rule is ever ported to operate on rawSource, the HTML-comment case silently breaks. Consider either (a) a comment on the source parameter noting "HTML comments already stripped upstream — do NOT switch to rawSource without adding  to the protected list," or (b) splitting the test into a unit-level test that exercises the rule's own regex directly.

💭 nit — test coverage doesn't exercise `template`, `code`, `textarea`

File: packages/lint/src/rules/core.test.ts:390-426

Convergent with Rames's 🟡 #4. Three of the six protected tags (template, code, textarea) have no positive-exemption test. The current "doesn't report block comments in attributes, html comments, or code examples" test covers <pre>, attribute, HTML comment. A future protected-list edit (e.g. dropping textarea) could regress silently. Cheap to add a single composite fixture covering all six.

💭 nit — missing regression fixture for the actual EF #40934 vector

File: packages/lint/src/rules/core.test.ts:373-388

Rames raised this (his 🟡 #2) — the current EF-style test exercises a bare-/* */-in-body shape, not the actual EF #40934 vector (a CSS comment that escaped a malformed/unclosed <style> block). The new rule's behavior on the unclosed-<style> shape is correct (protection regex doesn't match → comment fires), but that correctness is load-bearing for the regression story. A test that mirrors the original leak would make the rule's coverage of that vector explicit and resistant to future protected-block edits.

💭 nit — module-level `/g` regex with stateful `lastIndex`

File: packages/lint/src/rules/core.ts:71-73, 140, 161

VISIBLE_MARKUP_COMMENT_PATTERN and VISIBLE_MARKUP_COMMENT_PROTECTED_BLOCK_PATTERN are module-level /g regexes — every other regex in this file is non-g or used with matchAll. The functions guard themselves with .lastIndex = 0 at the top, which is correct for sequential calls but is a footgun for future copy-paste. Either: switch to source.matchAll(REGEX) (matches the file's existing idiom in findLeakedTextInHead / findLeakedTextBetweenHeadAndBody), or define the regexes as locals inside the helper. Pure style — no functional concern.

💭 nit — `isInsideHtmlTag` heuristic vs. attributes with `>`

File: packages/lint/src/rules/core.ts:152-157

isInsideHtmlTag uses lastIndexOf("<") vs. lastIndexOf(">") — a fast approximation that's exact on every input I tried but breaks principle-wise on <div title="a > b">. Quoting-aware would be more robust; the file doesn't currently have a quote-aware tag walker, so this is consistent with the rest of the file. Flagging so future authors don't refactor it away thinking it's exhaustive.

Verification

Read each exemption layer's regex at head and probed concrete inputs in a Node shell against the actual PROTECTED_BLOCK_PATTERN and COMMENT_PATTERN:

<style>…</style> round-trip — ok (incl. uppercase <STYLE>, trailing-space close </style >)
Two adjacent <style> blocks — ok (non-greedy, two distinct ranges)
Nested <pre><code> — ok (outer <pre> swallows inner content)
<title> body — fires (false positive — finding A)
<noscript> body — fires (false positive — finding A)
<svg><text> body — fires (false positive — finding B)
<div title="a > b">/* visible */</div> — /* visible */ correctly reported as outside tag
EF #40934 reproduction shape (body-level /* Main Block */ before a <div>) — correctly reported
Unclosed <style> (no </style>) — protection regex doesn't match → anything inside reports. Known limitation; in practice another rule would catch the unclosed tag first, but it's the actual EF#40934 vector (see finding E).

Test-coverage map:

Case	Test exists?
5 documented exemption layers (style/script/HTML comment/attr/pre+code)	yes (partial — `template`/`code`/`textarea` untested per finding D)
`<title>`	no
`<noscript>`	no
`<textarea>`	no (listed as protected)
SVG `<text>` content	no
Multi-line `/* … */` spanning lines	implicitly covered by `[\s\S]*?`
Unclosed-`<style>` EF#40934 vector	no (per finding E)
Plain URL `https://example.com` in body	n/a (this rule only matches `/* */`, not `//`)

Convergent with Rames U0B8H91GD97's review at the same SHA on findings A (partial), B, C, D, E. Divergent on severity for A: Rames non-blocking, I'm should-fix on the sibling-pattern-contradiction grounds.

Review by Via

james-russo-rames-d-jusso

Layering on @hyperframes-vai's review at 589f7d7f —

Convergence ack + severity correction

Via's :large_orange_circle: on the protected-block omission is the right read, not my :large_yellow_circle:. The signal that converts it from "theoretical false-positive" to "undeniable finding" is the same-file sibling contradiction: HEAD_BLOCKS_TO_IGNORE_PATTERN 11 lines above in packages/lint/src/rules/core.ts already enumerates title|noscript|template|style|script. Two protection lists disagreeing in the same file is exactly the shape Miguel's bar flags. Bumping my :large_yellow_circle: to :large_orange_circle: in line with Via.

Two nits she caught that I missed:

Module-level /g regex lastIndex footgun (stateful regex across multiple lint invocations can produce non-deterministic matches when the rule is re-applied)
isInsideHtmlTag approximate quote-handling (quoted-attribute boundary detection)

Both worth fixing alongside the protected-block expansion.

Net for the author: one-line regex addition adding title|noscript|template|svg-text to VISIBLE_MARKUP_COMMENT_PROTECTED_BLOCK_PATTERN (and matching fixture coverage) addresses all the :large_orange_circle: items. EF#40934 unclosed-<style> regression fixture remains the other gap worth closing in the same push.

— Rames D Jusso

fix(lint): catch visible markup comments

589f7d7

james-russo-rames-d-jusso reviewed Jun 30, 2026

View reviewed changes

vanceingalls reviewed Jun 30, 2026

View reviewed changes

james-russo-rames-d-jusso reviewed Jun 30, 2026

View reviewed changes

miguel-heygen added 2 commits June 30, 2026 20:25

test(lint): cover visible comment exemptions

9c67370

fix(lint): harden visible comment scan

872d091

miguel-heygen merged commit d103891 into main Jul 1, 2026
39 checks passed

miguel-heygen deleted the magi/visible-comment-lint branch July 1, 2026 03:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(lint): catch visible markup comments#1819

fix(lint): catch visible markup comments#1819
miguel-heygen merged 3 commits into
mainfrom
magi/visible-comment-lint

miguel-heygen commented Jun 30, 2026

Uh oh!

james-russo-rames-d-jusso left a comment

Uh oh!

vanceingalls left a comment

Uh oh!

james-russo-rames-d-jusso left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

miguel-heygen commented Jun 30, 2026

Summary

Tests

Uh oh!

james-russo-rames-d-jusso left a comment

Choose a reason for hiding this comment

Uh oh!

vanceingalls left a comment

Choose a reason for hiding this comment

Review — fix(lint): catch visible markup comments

Summary

Architectural read

Findings

🟠 should-fix — <title> and <noscript> not in the protected-blocks list

🟠 should-fix — <svg> content not exempted

💭 nit — HTML-comment exemption is provided upstream, not by this rule

💭 nit — test coverage doesn't exercise template, code, textarea

💭 nit — missing regression fixture for the actual EF #40934 vector

💭 nit — module-level /g regex with stateful lastIndex

💭 nit — isInsideHtmlTag heuristic vs. attributes with >

Verification

Uh oh!

james-russo-rames-d-jusso left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

🟠 should-fix — `<title>` and `<noscript>` not in the protected-blocks list

🟠 should-fix — `<svg>` content not exempted

💭 nit — test coverage doesn't exercise `template`, `code`, `textarea`

💭 nit — module-level `/g` regex with stateful `lastIndex`

💭 nit — `isInsideHtmlTag` heuristic vs. attributes with `>`