fix(lint): catch visible markup comments#1819
Conversation
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
Reviewed at 589f7d7f88c71ba4034489f8189d34ac55d83fae.
LGTM at the comment-level — clean root-cause-shaped fix that catches the EF#40934 leak vector at the preflight gate. A few non-blocking observations on exemption surface area + reporting shape.
🟢 Strengths
- Right positioning. This is a preflight guard at
@hyperframes/lint, sibling tohead_leaked_text— not a runtime band-aid inhtmlCompiler. Symptom (visible-text leak) is caught BEFORE any producer reaches render, which is exactly where this class of bug belongs. Pairs cleanly with the Zephyr sanitizer side in EF#40934 without double-implementing the policy. - Exemption coverage is thoughtful.
style/script/template/pre/code/textareacovers the common authoring contexts; attribute exemption viaisInsideHtmlTag(core.ts:155-160) handlesdata-*="/* ... */"andstyle="/* ... */"correctly;<!-- ... -->is implicitly exempt becausebuildLintContextcallsstripHtmlComments(rawSource)to produce the scannedsource(context.ts:31), so the protected-block list doesn't need to re-encode HTML-comment semantics.
🟡 Questions / suggestions (non-blocking)
-
<noscript>is not in the protected-block list (core.ts:72-73). In an HF render the browser has JS enabled, so<noscript>content is never displayed. If anyone parks/* JS disabled fallback */style notes inside<noscript>it'll lint-error without any visible-text consequence. Edge-case, but mechanically the same shape as the<template>exemption you already added — worth a one-liner addition unless you've intentionally decided to keep<noscript>strict. -
Mis-balanced-
<style>is the EF#40934 vector — verify the protected-block regex matches the failure case, not just the well-formed one.VISIBLE_MARKUP_COMMENT_PROTECTED_BLOCK_PATTERNrequires a closing</style>(core.ts:72-73). When<style>is correctly closed → protected. When<style>is unclosed (the original leak shape) → the regex doesn't match, so the comments inside fall outside any protected range and the rule fires. That's actually the correct behavior here, but it's load-bearing for the EF#40934 regression-fixture story — the current test (core.test.ts:373-385) exercises a bare-/* */-in-body shape, not the unclosed-<style>shape. Consider adding a fixture that mirrors the actual EF#40934 leak (CSS comment that escaped a malformed style block) so future protected-block edits can't silently break the original-vector coverage. -
Single-finding semantics.
findVisibleMarkupCommentLeakreturns the first match and exits (core.ts:163-173). If a composition has multiple leaks, authors fix-rebuild-fix iteratively. Consistent with howhead_leaked_textalready behaves, so I don't think you want to deviate here — flagging only because it's a known-pleasant DX paper-cut on the existing rules and this one inherits it. -
Exemption-completeness test coverage. The "doesn't report block comments in attributes, html comments, or code examples" test (
core.test.ts:404-417) exercises<!-- -->, attribute, and<pre>— but the rule also exemptstemplate,code, andtextarea. None of those are covered by tests. Cheap to add a fixture so future exemption-list edits don't silently regress. -
<title>and SVG<text>are not exempt.<title>only renders in the browser tab (not in HF video output), so flagging there is a false-positive in spirit. SVG<text>does render but presumably authors don't put C-style comments inside it. Both are extreme edge cases — calling out for completeness; not asking for action.
— Review by Rames D Jusso
vanceingalls
left a comment
There was a problem hiding this comment.
Review — fix(lint): catch visible markup comments
Verdict: LGTM with should-fix
Repo: heygen-com/hyperframes Head: 589f7d7f
Scope: new visible_markup_comment core rule + 3 tests in packages/lint
Summary
Adds a core.ts rule that catches CSS/JS-style /* … */ block comments rendering as visible HTML text (EF #40934). Architecture is the right one — extract protected outer-grammar regions first, scan residue for /* … */ — and exemption regex shapes (case-insensitive, lazy, attribute-tolerant closers) are sound. The protected-block list, however, silently disagrees with the sibling HEAD_BLOCKS_TO_IGNORE_PATTERN 11 lines above it: title and noscript are recognized in this file as "ignored" blocks for the existing head_leaked_text rule but are not protected here.
Convergent with Rames's review (posted 20 min before mine) on the substantive observations — <noscript> / <title> / SVG <text> exemption gap, HTML-comment exemption coming from upstream stripHtmlComments, and test-coverage gaps. Divergence: Rames ranks all of these non-blocking; I'm calling the <title> + <noscript> omission 🟠 should-fix specifically because the same file already enumerates those tags as "not visible" in HEAD_BLOCKS_TO_IGNORE_PATTERN, which makes this a sibling-pattern contradiction in one file — the kind of band-aid signal I'd rather close at R2 than carry forward.
Architectural read
Option B (extract exempt regions, then scan residue) — correct call. The alternative (one widely-negated /* … */ regex skirting context) was always going to be a maintenance trap for embedded grammars. Implementation breaks cleanly into three composable helpers (findProtectedVisibleMarkupRanges, isInsideSourceRange, isInsideHtmlTag), with findVisibleMarkupCommentLeak composing them. By the time this rule runs, LintContext.source has already been through stripHtmlComments, so the <!-- /* … */ --> test passes via an upstream mechanism rather than anything the new rule does — see finding D.
Findings
🟠 should-fix — <title> and <noscript> not in the protected-blocks list
File: packages/lint/src/rules/core.ts:72-73
The protected list is style|script|template|pre|code|textarea. Eleven lines above, HEAD_BLOCKS_TO_IGNORE_PATTERN already enumerates the file's authoritative "not visible markup" set:
// line 60-61
const HEAD_BLOCKS_TO_IGNORE_PATTERN =
/<(?:style|script|template|title|noscript)\b[^>]*>[\s\S]*?<\/(?:style|script|template|title|noscript)(?:\s[^>]*)?>/gi;Empirical confirmation on the head-SHA implementation (Node shell against the actual regexes):
// <title>/* tab name */ Hello</title> → fires visible_markup_comment
// <noscript>/* fallback */</noscript> → fires visible_markup_commentNeither renders as visible video markup — <title> content goes to the browser-tab title, and <noscript> content is suppressed when JS is on (HF's default). Rames flagged both these cases too (his 🟡 #1 + #5) but ranked them non-blocking; the reason I'm calling it should-fix is the contradiction with the sibling pattern in the same file. Two rules in core.ts applying different definitions of "what's visible markup" is exactly the band-aid shape that bites later when someone updates one list and forgets the other.
Fix: add title and noscript to VISIBLE_MARKUP_COMMENT_PROTECTED_BLOCK_PATTERN. The closing-tag-with-trailing-attrs handling (<\/\1(?:\s[^>]*)?>) already covers both cleanly. Add one positive test apiece. Even better, extract style|script|template|title|noscript as a shared constant and reuse here + line 61.
🟠 should-fix — <svg> content not exempted
File: packages/lint/src/rules/core.ts:72-73
Verified at head:
// <svg><text>/* SVG label */</text></svg> → fires visible_markup_commentLegitimate SVG diagrams (e.g. a UML-style figure, a "code-style label" composition) can carry literal /* */ text in <text> elements. The rule has no escape hatch for them. Rames flagged this too (his 🟡 #5) as an extreme edge case; I'm leaving it as should-fix because the EF #40934 shape is specifically "comment leaks at body level," not "comment inside SVG content" — adding svg to the protected list closes a false-positive class without weakening the regression-fixture coverage.
💭 nit — HTML-comment exemption is provided upstream, not by this rule
File: packages/lint/src/rules/core.test.ts:413-426
The test name says "does not report block comments in attributes, html comments, or code examples." The HTML-comment branch passes — but not because the rule exempts HTML comments. buildLintContext calls stripHtmlComments(rawSource) and passes the stripped string as source, so <!-- /* hidden */ --> is already gone by the time findVisibleMarkupCommentLeak runs. (Rames called this out as a strength in the upstream-decomposition sense, which is fair.) That's fine, but if this rule is ever ported to operate on rawSource, the HTML-comment case silently breaks. Consider either (a) a comment on the source parameter noting "HTML comments already stripped upstream — do NOT switch to rawSource without adding <!-- … --> to the protected list," or (b) splitting the test into a unit-level test that exercises the rule's own regex directly.
💭 nit — test coverage doesn't exercise template, code, textarea
File: packages/lint/src/rules/core.test.ts:390-426
Convergent with Rames's 🟡 #4. Three of the six protected tags (template, code, textarea) have no positive-exemption test. The current "doesn't report block comments in attributes, html comments, or code examples" test covers <pre>, attribute, HTML comment. A future protected-list edit (e.g. dropping textarea) could regress silently. Cheap to add a single composite fixture covering all six.
💭 nit — missing regression fixture for the actual EF #40934 vector
File: packages/lint/src/rules/core.test.ts:373-388
Rames raised this (his 🟡 #2) — the current EF-style test exercises a bare-/* */-in-body shape, not the actual EF #40934 vector (a CSS comment that escaped a malformed/unclosed <style> block). The new rule's behavior on the unclosed-<style> shape is correct (protection regex doesn't match → comment fires), but that correctness is load-bearing for the regression story. A test that mirrors the original leak would make the rule's coverage of that vector explicit and resistant to future protected-block edits.
💭 nit — module-level /g regex with stateful lastIndex
File: packages/lint/src/rules/core.ts:71-73, 140, 161
VISIBLE_MARKUP_COMMENT_PATTERN and VISIBLE_MARKUP_COMMENT_PROTECTED_BLOCK_PATTERN are module-level /g regexes — every other regex in this file is non-g or used with matchAll. The functions guard themselves with .lastIndex = 0 at the top, which is correct for sequential calls but is a footgun for future copy-paste. Either: switch to source.matchAll(REGEX) (matches the file's existing idiom in findLeakedTextInHead / findLeakedTextBetweenHeadAndBody), or define the regexes as locals inside the helper. Pure style — no functional concern.
💭 nit — isInsideHtmlTag heuristic vs. attributes with >
File: packages/lint/src/rules/core.ts:152-157
isInsideHtmlTag uses lastIndexOf("<") vs. lastIndexOf(">") — a fast approximation that's exact on every input I tried but breaks principle-wise on <div title="a > b">. Quoting-aware would be more robust; the file doesn't currently have a quote-aware tag walker, so this is consistent with the rest of the file. Flagging so future authors don't refactor it away thinking it's exhaustive.
Verification
Read each exemption layer's regex at head and probed concrete inputs in a Node shell against the actual PROTECTED_BLOCK_PATTERN and COMMENT_PATTERN:
<style>…</style>round-trip — ok (incl. uppercase<STYLE>, trailing-space close</style >)- Two adjacent
<style>blocks — ok (non-greedy, two distinct ranges) - Nested
<pre><code>— ok (outer<pre>swallows inner content) <title>body — fires (false positive — finding A)<noscript>body — fires (false positive — finding A)<svg><text>body — fires (false positive — finding B)<div title="a > b">/* visible */</div>—/* visible */correctly reported as outside tag- EF #40934 reproduction shape (body-level
/* Main Block */before a<div>) — correctly reported - Unclosed
<style>(no</style>) — protection regex doesn't match → anything inside reports. Known limitation; in practice another rule would catch the unclosed tag first, but it's the actual EF#40934 vector (see finding E).
Test-coverage map:
| Case | Test exists? |
|---|---|
| 5 documented exemption layers (style/script/HTML comment/attr/pre+code) | yes (partial — template/code/textarea untested per finding D) |
<title> |
no |
<noscript> |
no |
<textarea> |
no (listed as protected) |
SVG <text> content |
no |
Multi-line /* … */ spanning lines |
implicitly covered by [\s\S]*? |
Unclosed-<style> EF#40934 vector |
no (per finding E) |
Plain URL https://example.com in body |
n/a (this rule only matches /* */, not //) |
Convergent with Rames U0B8H91GD97's review at the same SHA on findings A (partial), B, C, D, E. Divergent on severity for A: Rames non-blocking, I'm should-fix on the sibling-pattern-contradiction grounds.
Review by Via
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
Layering on @hyperframes-vai's review at 589f7d7f —
Convergence ack + severity correction
Via's :large_orange_circle: on the protected-block omission is the right read, not my :large_yellow_circle:. The signal that converts it from "theoretical false-positive" to "undeniable finding" is the same-file sibling contradiction: HEAD_BLOCKS_TO_IGNORE_PATTERN 11 lines above in packages/lint/src/rules/core.ts already enumerates title|noscript|template|style|script. Two protection lists disagreeing in the same file is exactly the shape Miguel's bar flags. Bumping my :large_yellow_circle: to :large_orange_circle: in line with Via.
Two nits she caught that I missed:
- Module-level
/gregexlastIndexfootgun (stateful regex across multiple lint invocations can produce non-deterministic matches when the rule is re-applied) isInsideHtmlTagapproximate quote-handling (quoted-attribute boundary detection)
Both worth fixing alongside the protected-block expansion.
Net for the author: one-line regex addition adding title|noscript|template|svg-text to VISIBLE_MARKUP_COMMENT_PROTECTED_BLOCK_PATTERN (and matching fixture coverage) addresses all the :large_orange_circle: items. EF#40934 unclosed-<style> regression fixture remains the other gap worth closing in the same push.
— Rames D Jusso
Summary
Final renders can show authoring comments as literal text when a composition contains CSS/JS block-comment syntax directly in HTML markup, because browsers only treat
<!-- ... -->as HTML comments. This adds a shared@hyperframes/linterror so the failure class is caught before any HyperFrames producer reaches render.The new
visible_markup_commentrule catches/* ... */text nodes in visible markup while avoiding legitimate contexts: CSS/JS blocks, templates, attributes, real HTML comments, and intentional code/preformatted examples. It intentionally does not flag//text because URLs and normal prose make that too noisy; EF #40934 covers the Zephyr sanitizer/prompt side, and this PR adds the OSS preflight guard for the concrete render leak.Related: https://github.com/heygen-com/experiment-framework/pull/40934
Tests
bun run --filter @hyperframes/parsers buildbun run --filter @hyperframes/lint test -- src/rules/core.test.tsbun run --filter @hyperframes/lint testbun run --filter @hyperframes/lint typecheckbun run format:check -- packages/lint/src/rules/core.ts packages/lint/src/rules/core.test.tsbun run --filter @hyperframes/lint build