Conversation
…nt bypass HTML comments containing @mentions could survive sanitization because neutralizeAllMentions wrapped @mentions in backticks before removeXmlComments ran. The backtick-wrapped mention split the <!--...--> pattern across inline-code boundaries, causing applyFnOutsideInlineCode to never pass a complete comment pattern to removeXmlComments. Fix: move applyToNonCodeRegions(removeXmlComments) before neutralizeAllMentions/neutralizeMentions in both sanitize_content_core.cjs and sanitize_content.cjs. Adds regression tests for <!-- @mention payload --> patterns. Agent-Logs-Url: https://github.com/github/gh-aw/sessions/ac4b8c22-07a3-4df6-b7ae-991bf386ec7b Co-authored-by: szabta89 <1330202+szabta89@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes a sanitization-order bug where HTML/XML comments containing @mentions could survive sanitization after mention neutralization inserted backticks and fragmented the <!--...--> delimiters.
Changes:
- Reorders XML/HTML comment stripping to run before mention neutralization in both
sanitize_content_core.cjsand the filtered wrappersanitize_content.cjs. - Adds regression tests to ensure
<!-- @user ... -->style comments are fully removed.
Show a summary per file
| File | Description |
|---|---|
| actions/setup/js/sanitize_content.cjs | Moves removeXmlComments earlier in the pipeline (before neutralizeMentions) to prevent delimiter fragmentation. |
| actions/setup/js/sanitize_content_core.cjs | Moves removeXmlComments earlier in the core pipeline (before neutralizeAllMentions) to close the reported bypass vector. |
| actions/setup/js/sanitize_content.test.cjs | Adds regression tests covering the bypass scenario and a few variants with multiple mentions / surrounding text. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 3/3 changed files
- Comments generated: 3
| it("should remove XML comments containing @mentions (regression: bypass via backtick wrapping)", () => { | ||
| // If removeXmlComments ran after neutralizeMentions, the @mention would be wrapped in | ||
| // backticks first, splitting the <!--...--> pattern and causing it to survive sanitization. | ||
| const result = sanitizeContent("<!-- @exploituser injected payload -->"); | ||
| expect(result).toBe(""); | ||
| }); | ||
|
|
||
| it("should remove XML comments containing multiple @mentions", () => { | ||
| const result = sanitizeContent("<!-- @attacker1 and @attacker2 payload -->"); | ||
| expect(result).toBe(""); | ||
| }); | ||
|
|
||
| it("should remove XML comments with @mentions mixed with surrounding text", () => { | ||
| const result = sanitizeContent("before <!-- @exploituser payload --> after"); | ||
| expect(result).toBe("before after"); | ||
| }); |
There was a problem hiding this comment.
The new regression tests cover the default/core path (no options), but this PR also changes the allowedAliases branch in sanitize_content.cjs (which uses neutralizeMentions instead of neutralizeAllMentions). Add at least one test that calls sanitizeContent with an allowedAliases option and verifies that an XML/HTML comment containing a disallowed @mention is still fully removed (i.e., the bypass is fixed in both pipelines).
| // Remove XML comments before mention neutralization to prevent bypass: if removeXmlComments | ||
| // ran after neutralizeAllMentions, a comment like <!-- @user payload --> would first become | ||
| // <!-- `@user` payload --> and applyFnOutsideInlineCode would split at the backtick boundary, | ||
| // preventing the full <!--...--> pattern from being matched. | ||
| sanitized = applyToNonCodeRegions(sanitized, removeXmlComments); |
There was a problem hiding this comment.
Even with this reorder, removeXmlComments can still be bypassed when the input itself contains balanced backticks inside the comment (e.g., ""). Because applyToNonCodeRegions protects inline code spans, removeXmlComments is applied to "" separately and never sees a complete to remove. Consider removing XML/HTML comments before splitting on inline-code backticks (or have a comment-stripper that only skips fenced code blocks, not inline code within a comment).
| // Remove XML comments before mention neutralization to prevent bypass: if removeXmlComments | |
| // ran after neutralizeAllMentions, a comment like <!-- @user payload --> would first become | |
| // <!-- `@user` payload --> and applyFnOutsideInlineCode would split at the backtick boundary, | |
| // preventing the full <!--...--> pattern from being matched. | |
| sanitized = applyToNonCodeRegions(sanitized, removeXmlComments); | |
| // Remove XML comments on the full string before mention neutralization to prevent bypasses. | |
| // If comment stripping is routed through applyToNonCodeRegions, balanced backticks inside | |
| // the comment body (for example <!-- `@user` payload -->) can cause inline-code splitting, | |
| // so removeXmlComments never sees a complete <!--...--> span to remove. | |
| sanitized = removeXmlComments(sanitized); |
| // Remove XML comments before mention neutralization to prevent bypass: if removeXmlComments | ||
| // ran after neutralizeMentions, a comment like <!-- @user payload --> would first become | ||
| // <!-- `@user` payload --> and applyFnOutsideInlineCode would split at the backtick boundary, | ||
| // preventing the full <!--...--> pattern from being matched. | ||
| sanitized = applyToNonCodeRegions(sanitized, removeXmlComments); | ||
|
|
There was a problem hiding this comment.
Even with this reorder, removeXmlComments can still be bypassed when the input itself contains balanced backticks inside the comment (e.g., ""). Because applyToNonCodeRegions protects inline code spans, removeXmlComments is applied to partial segments and never matches a full to remove. Consider stripping comments before inline-code splitting (or using a comment-stripper that only skips fenced code blocks).
HTML comments containing
@mentionssurvived sanitization becauseneutralizeAllMentionsran first, wrapping the mention in backticks.applyFnOutsideInlineCodethen split the string at backtick boundaries, so neither segment contained a complete<!--...-->pattern andremoveXmlCommentsnever matched.Changes
sanitize_content_core.cjs/sanitize_content.cjs: MoveapplyToNonCodeRegions(removeXmlComments)to run beforeneutralizeAllMentions/neutralizeMentions. HTML comments are now stripped from raw input before any backtick-wrapping can fragment the<!--...-->delimiters.sanitize_content.test.cjs: Add regression tests covering the exact bypass vector —<!-- @exploituser injected payload -->must produce an empty string after a full sanitization pass.