Fix: scrub instead of detach for framework-rendered DOM by twschiller · Pull Request #176 · pixiebrix/agent-browser-shield

twschiller · 2026-06-05T21:19:22Z

Summary

Four rules — meta-injection-strip, noscript-strip, html-comment-strip,
hidden-text-strip — called .remove() on DOM nodes the page framework had
rendered and was tracking for unmount. When React Router (and equivalents in
Vue / Svelte / Astro / htmx head-merge) tried to reconcile those nodes out
on route change, removeChild threw inside the commit phase and stranded
the route mid-render.

User-visible symptom on the demo site: clicking the RiverMart logo from a
product detail page updated the URL but never repainted the home content,
because React 19 hoists <title> and three <meta> tags from
ProductDetail.tsx into <head>, meta-injection-strip detached them on
match, and React's next commit tried to removeChild nodes whose
parentNode was already null.

Each rule now blanks its data carrier (attribute value, textContent, or
Text-node data) rather than detaching the carrier node. The element
references the framework holds stay valid; agent-readable content still
goes to empty.

meta-injection-strip — blanks content="" on matching <meta> (was
element.remove()).
noscript-strip — blanks <noscript> children via textContent = ""
(was element.remove()).
html-comment-strip — blanks comment.data only when the data matches
INJECTION_PATTERNS (was: unconditional removal of every comment outside
<script> / <style> / <noscript>). The injection-match gate
naturally preserves React Suspense markers ($, /$, $?, …) without
an explicit allowlist.
hidden-text-strip — walks SHOW_TEXT and blanks each Text node's data
(was element.remove()). Element / comment nodes inside the hidden
subtree stay where the framework put them.

Property-based tests pin the load-bearing invariants:

html-comment-strip.property.test.ts — across all known injection
fixtures, comments are blanked but stay attached; across the framework
marker set ($, /$, $?, $!, [, ], build stamps, license
headers, dev TODOs), comments are left untouched. Catches any future
widening of INJECTION_PATTERNS that would re-introduce the navigation
crash via a different path.
hidden-text-strip.property.test.ts — across the cross-product of
hidden-CSS triggers × child-subtree shapes, descendant text is blanked
AND every element node inside the hidden box stays attached. Drift back
to element.remove() or replaceChildren() would fail this as a class,
not just one specific case.

Docs and skill catalogs updated to describe each rule's current scrub
behavior and to broaden the strip verb's taxonomy entry.

Coverage trade-offs

Switching from "detach the carrier" to "blank the carrier" trades framework
safety for some coverage reduction. Four specific false-negative paths to
be aware of:

meta-injection-strip — in-place content updates. The subtree
watcher observes id / class attribute changes only, not content.
If a framework or page script later writes a new poisoned value to
content= on the same <meta> node we blanked, the new value is
visible to agents until the next route change or subtree addition
re-triggers a scan. Pre-fix, the meta was detached, so any framework
write landed on a node nobody was reading.
noscript-strip — in-place children replacement inside a kept
<noscript>. The watcher fires on the new children, but
stripNoscript's querySelectorAll walks downward — it doesn't
recognize that the added subtree's ancestor is a <noscript> we
already blanked. Pre-fix, the noscript was gone, so any new content had
to bring its own <noscript> wrapper. Real-world likelihood is very
low: frameworks essentially never re-render noscript children at
runtime.
html-comment-strip — comments not matching INJECTION_PATTERNS now
survive. Largest coverage delta in the PR. The rule went from "strip
every comment outside <script> / <style> / <noscript>" to "scrub
only comments whose data matches the injection pattern set" (currently
11 patterns). Novel injection phrasings, off-pattern prose, and any
benign-looking comments carrying instruction-shaped text that used to
be removed now stay visible. The narrowing is what makes React Suspense
markers safe automatically (they don't match any pattern) — but it caps
coverage at the recognized shapes.
hidden-text-strip — non-text payloads inside hidden subtrees
survive. The rule used to remove the wrapper, taking image alt,
aria-label, title, SVG <title> / <desc>, and data-*
attributes with it. Now it only blanks Text nodes. Attribute-shaped
payloads inside a hidden box are still caught by
attribute-injection-sanitize and svg-text-strip if they match
those rules' pattern sets, but the broad "wipe everything inside the
box" sweep is gone.

Follow-ups (not in this PR)

Extend OBSERVED_ATTRIBUTES in subtree-watcher.ts to include content
and add an upward closest(\"noscript\") check in stripNoscript to
close (1) and (2).
Decide whether html-comment-strip should restore broad-sweep behavior
with a framework-marker allowlist (skip $ / /$ / [ / ]-prefixed
data) to close (3). Kept out of this PR because the allowlist
enumeration is its own design discussion.
Audit INJECTION_PATTERNS coverage on attribute-targeted rules now that
hidden-text-strip no longer back-stops them (4).

Test plan

bun run check in extension/ clean
bun run test in extension/: 1380 / 1380 passing across 77 suites
Manual repro on the demo site: navigate from a product detail page
back to home via the RiverMart logo; confirm Home renders (was: URL
updated but content frozen on Product Detail)
Toggle each of the four rules off in the popup on the demo site;
navigation should still work either way
Sanity check on a real React 19 SSR site with metadata hoisting
(any Next.js 14+ deploy)

🤖 Generated with Claude Code

Four rules — meta-injection-strip, noscript-strip, html-comment-strip, hidden-text-strip — called .remove() on DOM nodes that the page framework had rendered and was tracking for unmount. When React Router (and equivalents in Vue / Svelte / Astro / htmx head-merge) tried to reconcile those nodes out on route change, removeChild threw inside the commit phase and stranded the route mid-render. User-visible symptom on the demo site: clicking the RiverMart logo from a product detail page updated the URL but never repainted the home content. Switched each rule to blank its data carrier (attribute value, textContent, or Text-node data) rather than detach the carrier node. Element references the framework holds stay valid, agent-readable content still goes to empty. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel · 2026-06-05T21:19:28Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agent-browser-shield-demo-site	Ready	Preview, Comment	Jun 5, 2026 9:27pm

unblocked

1 issue found.

About Unblocked

Unblocked has been set up to automatically review your team's pull requests to identify genuine bugs and issues.

📖 Documentation — Learn more in our docs.

💬 Ask questions — Mention @unblocked to request a review or summary, or ask follow-up questions.

👍 Give feedback — React to comments with 👍 or 👎 to help us improve.

⚙️ Customize — Adjust settings in your preferences.

unblocked · 2026-06-05T21:21:36Z

+  if (content !== null && content.length > 0 && containsInjection(content)) {
+    element.setAttribute("content", "");


The exported description at line 99 still reads "Remove <meta> tags and clear <title> text whose content carries prompt-injection patterns." — but the behavior now blanks the content attribute rather than removing the element. The other three rules in this PR (hidden-text-strip, html-comment-strip, noscript-strip) all had their description property updated from "Remove" to "Blank". This one was missed.

At line 99 update the description to something like:

"Blank <meta> content and <title> text whose value carries prompt-injection patterns.",

Fixed in bf0a0be — the exported description now reads "Blank content and <title> text whose value carries prompt-injection patterns.", matching what the other three converted rules adopted. Good catch — this is exactly the kind of drift the rule sync was supposed to prevent. — Claude Code, on behalf of @twschiller

Sync meta-injection-strip's exported description with its current behavior — the rule blanks the content attribute now, not the element. Matches the wording the other three converted rules adopted in the prior commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reflow docs through mdformat — pre-commit owns markdown wrapping, and my edits in the previous commits used different line widths and table column padding. Pure formatting, no content changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…uts in attribute-injection-sanitize These ARIA attributes surface to the accessibility tree as agent-readable name / description / value / shortcut text, so they're a quiet carrier for injection payloads. Pre-#176 they fell to hidden-text-strip's wrapper detachment; after #176 nothing was scrubbing them. Append the four names to CANDIDATE_ATTRIBUTES; the selector and scrub loop pick them up unchanged. Tests, property-test allowlist, and rule docs updated. Closes #182 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

unblocked Bot reviewed Jun 5, 2026

View reviewed changes

vercel Bot deployed to Preview June 5, 2026 21:22 View deployment

vercel Bot deployed to Preview June 5, 2026 21:27 View deployment

twschiller added the bug Something isn't working label Jun 5, 2026

twschiller merged commit 66620a5 into main Jun 5, 2026
7 checks passed

twschiller deleted the fix/scrub-not-detach-framework-nodes branch June 5, 2026 21:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix: scrub instead of detach for framework-rendered DOM#176

Fix: scrub instead of detach for framework-rendered DOM#176
twschiller merged 3 commits into
mainfrom
fix/scrub-not-detach-framework-nodes

twschiller commented Jun 5, 2026

Uh oh!

vercel Bot commented Jun 5, 2026 •

edited

Loading

Uh oh!

unblocked Bot left a comment

Uh oh!

unblocked Bot Jun 5, 2026

Uh oh!

twschiller Jun 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		if (content !== null && content.length > 0 && containsInjection(content)) {
		element.setAttribute("content", "");

Conversation

twschiller commented Jun 5, 2026

Summary

Coverage trade-offs

Follow-ups (not in this PR)

Test plan

Uh oh!

vercel Bot commented Jun 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

unblocked Bot left a comment

Choose a reason for hiding this comment

About Unblocked

Uh oh!

unblocked Bot Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

twschiller Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

vercel Bot commented Jun 5, 2026 •

edited

Loading