fix: preserve template delimiters inside fenced code blocks in safe-output sanitizer#28736
fix: preserve template delimiters inside fenced code blocks in safe-output sanitizer#28736
Conversation
…utput sanitizer
The `neutralizeTemplateDelimiters` function now uses `applyToNonCodeRegions` so
that `{{`, `${`, `<%=`, `{#`, and `{%` inside fenced code blocks (including GitHub
`suggestion` blocks) and inline code spans are no longer escaped.
This fixes the bug where suggesting documentation changes that include valid
template-like syntax (e.g. Elastic `{{fleet-server}}`) would corrupt the patch
when a reviewer clicked "Commit suggestion".
Updated tests:
- Replaced the "should escape template delimiters in code blocks" test (which
asserted the now-incorrect behaviour) with four targeted tests:
- inline code spans are preserved
- fenced code blocks are preserved
- `suggestion` fenced blocks are preserved ← main regression case
- template delimiters outside code blocks are still escaped
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/b3d5dfb0-f0a0-4bc0-8294-ec87a63672bd
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
core is always available as a global (or a shim is loaded), so the typeof core !== "undefined" guards are unnecessary. Agent-Logs-Url: https://github.com/github/gh-aw/sessions/9b0d87f4-8e26-4ea0-9f21-395bf174b0e4 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
@copilot add extensive unit tests and fuzz tests for parser (no mocks) |
…, applyFnOutsideInlineCode, applyToNonCodeRegions) 63 new tests covering: - getFencedCodeRanges contract (backtick/tilde fences, info strings, suggestion blocks, unclosed fences, closing fence length, fence character mismatch, adjacent/consecutive blocks, CRLF) - applyFnOutsideInlineCode inline code span handling (single/double/triple backticks, mismatched counts, nested backticks, very long spans, many spans in a row) - applyToNonCodeRegions mixed code region combinations - Fuzz-style invariant tests (identity transform, fn never sees fenced content, no crash on adversarial inputs, fn never receives non-strings) - Edge cases (empty fence, fence-only document, Windows CRLF, Unicode) No mocks; core is stubbed as a plain object. Agent-Logs-Url: https://github.com/github/gh-aw/sessions/ac84c1d9-91a1-4cd1-a204-871dcd17d5a2 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Added in f9cf5e1 — new file
|
There was a problem hiding this comment.
Pull request overview
Fixes an output-sanitization regression where template delimiter escaping was corrupting content inside markdown code regions (notably GitHub suggestion blocks), leading to backslashes being committed literally when applying suggestions.
Changes:
- Refactors
neutralizeTemplateDelimitersto escape only in non-code markdown regions viaapplyToNonCodeRegions, and dedupes template-detection logs. - Updates/expands
sanitize_content.test.cjsto ensure inline code, fenced blocks, andsuggestionfences are preserved verbatim while prose is still escaped. - Adds a new
sanitize_content_core_parser.test.cjssuite with extensive unit + fuzz-style coverage for the markdown code-region helpers.
Show a summary per file
| File | Description |
|---|---|
| actions/setup/js/sanitize_content_core.cjs | Routes template-delimiter escaping through the non-code-region parser to avoid modifying fenced/inline code. |
| actions/setup/js/sanitize_content.test.cjs | Replaces prior expectations with targeted regression tests for code-region preservation and prose escaping. |
| actions/setup/js/sanitize_content_core_parser.test.cjs | Adds broad unit/fuzz coverage for fenced/inline code parsing and transformation boundaries. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 3/3 changed files
- Comments generated: 4
| it("multi-line inline code (inline code spans do not cross line boundaries in CommonMark)", () => { | ||
| // In CommonMark, inline code CAN span lines. Verify no crash and sensible output. | ||
| const result = applyToNonCodeRegions("`line1\nline2`", s => s.toUpperCase()); | ||
| expect(typeof result).toBe("string"); | ||
| }); |
There was a problem hiding this comment.
The test comment is internally inconsistent about CommonMark behavior: it says inline code spans do not cross line boundaries, then immediately says they can span lines. Please clarify/correct the comment to avoid future confusion when maintaining the parser.
| if (/\{\{/.test(result)) { | ||
| if (!detectedTypes.has("jinja2")) { | ||
| detectedTypes.add("jinja2"); | ||
| core.info("Template syntax detected: Jinja2/Liquid double braces {{"); | ||
| } | ||
| result = result.replace(/\{\{/g, "\\{\\{"); | ||
| } | ||
| result = result.replace(/\{\{/g, "\\{\\{"); | ||
| } | ||
|
|
||
| // Escape ERB delimiters: <%= ... %> | ||
| // Replace <%= with \<%= to prevent ERB evaluation | ||
| if (/<%=/.test(result)) { | ||
| templatesDetected = true; | ||
| if (typeof core !== "undefined" && core.info) { | ||
| core.info("Template syntax detected: ERB delimiter <%="); | ||
| // Escape ERB delimiters: <%= ... %> |
There was a problem hiding this comment.
neutralizeTemplateDelimiters now calls core.info / core.warning unconditionally. This will throw a ReferenceError if core is not present in the global scope (e.g., when importing sanitize_content_core.cjs in a plain Node context without setupGlobals()/shim.cjs). Other functions in this file still guard core access via typeof core !== "undefined", so this introduces an inconsistency and a potential runtime regression when template patterns are present. Consider restoring the typeof core !== "undefined" && core.* guards here, or consistently referencing global.core with a safe fallback/no-op logger.
| beforeEach(async () => { | ||
| // Set up a minimal stub so code that calls core.* doesn't throw. | ||
| // We deliberately avoid vi.fn() to keep tests mock-free. | ||
| global.core = { | ||
| info: () => {}, | ||
| warning: () => {}, | ||
| debug: () => {}, | ||
| error: () => {}, | ||
| }; | ||
|
|
||
| const mod = await import("./sanitize_content_core.cjs"); |
There was a problem hiding this comment.
This test file assigns global.core in beforeEach but never restores/deletes it. Leaving a modified global behind can create order-dependent interactions with other test files (many of which explicitly assert/assume global.core is unset between suites). Consider saving the previous global.core and restoring it in an afterEach/afterAll.
| /** | ||
| * Helper: verify that a transform fn is NOT called on regions inside a | ||
| * fenced block and IS called on regions outside it. | ||
| */ | ||
| function verifyFences(input, expectedOutside, expectedInsideRaw) { | ||
| const calls = []; | ||
| applyToNonCodeRegions(input, chunk => { | ||
| calls.push(chunk); | ||
| return chunk; | ||
| }); | ||
|
|
||
| // The raw fence text should appear verbatim in the result | ||
| const result = applyToNonCodeRegions(input, chunk => chunk.toUpperCase()); | ||
| return { result, calls }; | ||
| } | ||
|
|
There was a problem hiding this comment.
verifyFences (and its parameters expectedOutside, expectedInsideRaw) is defined but never used, and the parameters are unused even within the function. This adds dead code/noise to the test suite; consider removing the helper or using it in the fence tests so it contributes to coverage.
| /** | |
| * Helper: verify that a transform fn is NOT called on regions inside a | |
| * fenced block and IS called on regions outside it. | |
| */ | |
| function verifyFences(input, expectedOutside, expectedInsideRaw) { | |
| const calls = []; | |
| applyToNonCodeRegions(input, chunk => { | |
| calls.push(chunk); | |
| return chunk; | |
| }); | |
| // The raw fence text should appear verbatim in the result | |
| const result = applyToNonCodeRegions(input, chunk => chunk.toUpperCase()); | |
| return { result, calls }; | |
| } |
🧪 Test Quality Sentinel ReportTest Quality Score: 71/100
Test Classification DetailsView all 50 classified tests
Flagged Tests — Requires ReviewNo tests were flagged. All 50 analyzed tests verify observable behavioral contracts. Notes
|
There was a problem hiding this comment.
✅ Test Quality Sentinel: 71/100. Test quality is acceptable — 0% of new tests are implementation tests (threshold: 30%). All 50 analyzed tests verify observable behavioral contracts of the markdown code-block parser. Minor note: test file sanitize_content_core_parser.test.cjs shows a ~10× line ratio vs the production file, which is above the 2:1 inflation threshold; however the tests establish a comprehensive CommonMark-parser contract baseline (including fuzz/property-based invariants) that provides lasting regression value. No coding-guideline violations detected.
|
@copilot review all comments |
- Restore typeof core !== "undefined" guards in neutralizeTemplateDelimiters (core.info x5, core.warning) to match the rest of the file and prevent ReferenceError when running in a plain Node context - Add afterEach to delete global.core in the parser test suite so it does not leak into other test files - Remove unused verifyFences helper function (dead code) - Fix misleading test name/comment that contradicted itself about whether inline code spans can cross line boundaries (they can per CommonMark §6.11) Agent-Logs-Url: https://github.com/github/gh-aw/sessions/72c80953-78ef-4fc6-9f8a-3b139a884202 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
All four review comments addressed in f8a9047:
|
|
@copilot remove all core guards, a shim is loaded in node.js environments |
A shim is always loaded in Node.js environments, so the guards are unnecessary. Removed all 13 remaining typeof core !== "undefined" guard wrappers across the file (URL redaction, mention escaping, template delimiter detection, GitHub reference filtering). Agent-Logs-Url: https://github.com/github/gh-aw/sessions/d67f99b2-824d-43ca-a316-13659e8da6a5 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Tests in this file call sanitizeTitle → sanitizeContentCore, which now calls core.info/core.debug unconditionally (guards removed). Add beforeEach/afterEach to set up and tear down a global.core stub, consistent with sanitize_content.test.cjs. Agent-Logs-Url: https://github.com/github/gh-aw/sessions/a74f7f25-be32-4dc6-b4e5-48ded4834099 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Fixed in b8ae0f0 — |
|
✅ Smoke CI completed successfully! |
Comment MemoryNote This comment is managed by comment memory.It stores persistent context for this thread in the code block at the top of this comment. |
neutralizeTemplateDelimiterswas applying regex replacements globally, including inside fenced code blocks and inline code spans. This corrupted GitHubsuggestionblock content — e.g.{{fleet-server}}became\{\{fleet-server}}in the patch, causing "Commit suggestion" to write the backslashes literally into the file.Changes
sanitize_content_core.cjs— RefactoredneutralizeTemplateDelimitersto pass the escaping logic throughapplyToNonCodeRegions, which already skips fenced blocks and inline code spans. ASetdeduplicatescore.infolog calls across multiple non-code segments. Removed alltypeof core !== "undefined"guards throughout the file —coreis always available as a global or shim in every execution environment.sanitize_content.test.cjs— Replaced the test asserting (now-incorrect) escaping inside inline code with four targeted tests:suggestionfenced blocks preserved verbatim ← primary regression casesanitize_content_core_parser.test.cjs(new) — Extensive unit and fuzz tests for the core parser helpers (getFencedCodeRanges,applyFnOutsideInlineCode,applyToNonCodeRegions) with no mocks. 63 tests covering:suggestionblocksglobal.corerestored after each test to prevent cross-suite pollutionExample
Before:
{{fleet-server}}→\{\{fleet-server}}(patch corrupted)After:
{{fleet-server}}preserved verbatim inside the fence