Fix crash due to invalid characters in candidate#19829
Conversation
WalkthroughThe 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. 📝 Coding Plan
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/tailwindcss/src/utils/escape.test.ts (1)
15-23: Consider adding surrogate and NULL regression cases too.This test covers the out-of-range branch well, but
unescape()also replaces NULL and surrogate code points. Adding one case for each would lock in all invalid-path behavior.➕ Suggested test additions
describe('unescape', () => { test('removes backslashes', () => { expect(unescape(String.raw`red-1\/2`)).toMatchInlineSnapshot(`"red-1/2"`) }) test('replaces out-of-range escaped code points', () => { @@ ) }) + + test('replaces surrogate escaped code points', () => { + expect(unescape(String.raw`\d800`)).toBe('�') + }) + + test('replaces null escaped code points', () => { + expect(unescape(String.raw`\0 `)).toBe('�') + }) })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/tailwindcss/src/utils/escape.test.ts` around lines 15 - 23, Add two small tests to escape.test.ts that assert unescape()'s handling of surrogate and NULL code points: one test feeding a lone high-surrogate escape (e.g. String.raw`...\uD800...` or using the same backslash-escaped hex form as the existing test) and another feeding a NULL escape (e.g. String.raw`...\0...`), and assert the returned string matches the expected replacement behavior (the same replacement character or removal behavior that unescape() implements). Place these cases alongside the existing out-of-range test so they lock in unescape()'s invalid-path behavior for surrogate and NULL code points.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/tailwindcss/src/index.test.ts`:
- Around line 1506-1510: The test is using expect(() =>
run([...])).not.toThrow() which only checks synchronous throws and misses
Promise rejections from the async run function; update the test to use an async
assertion by awaiting the promise and asserting it resolves (for example: await
expect(run([...])).resolves.not.toThrow() or await
expect(run([...])).resolves.toBeUndefined()), ensuring you call run(...)
directly (not wrapped in a function) and mark the test async so any rejection
fails the test; locate the invocation of run in the index.test.ts test and
replace the synchronous-check pattern with the awaited expect(...).resolves
form.
---
Nitpick comments:
In `@packages/tailwindcss/src/utils/escape.test.ts`:
- Around line 15-23: Add two small tests to escape.test.ts that assert
unescape()'s handling of surrogate and NULL code points: one test feeding a lone
high-surrogate escape (e.g. String.raw`...\uD800...` or using the same
backslash-escaped hex form as the existing test) and another feeding a NULL
escape (e.g. String.raw`...\0...`), and assert the returned string matches the
expected replacement behavior (the same replacement character or removal
behavior that unescape() implements). Place these cases alongside the existing
out-of-range test so they lock in unescape()'s invalid-path behavior for
surrogate and NULL code points.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f6677bfb-2cbb-4997-af25-479a58ea0c2f
📒 Files selected for processing (3)
packages/tailwindcss/src/index.test.tspackages/tailwindcss/src/utils/escape.test.tspackages/tailwindcss/src/utils/escape.ts
This PR fixes an issue where the compiler can crash if it encounters an invalid codepoint.
When we extract potential candidates from files, it could be that we encounter values that look like a class or a CSS variable, if it turns out that it's an invalid CSS variable we can ignore it.
The problem is that sometimes there are escaped values in there that result in invalid code points crashing the compiler.
This PR fixes that by gracefully handling that and making sure that invalid code points are replaced by
\uFFFDas per the spec.The bug report (#19786) has a clean example where a piece of text looks like a CSS variable, but contains invalid code points.
Luckily we can fix this today by ignoring the file paths that contain these strings using
@source not "…";, but the better way is to actually fix this.To solve this, instead of blindly passing numbers to
String.fromCodePoint, we will first validate whether it's a valid codepoint:0x0000—0x10FFFF(inclusive) is the range of valid code points. See: https://infra.spec.whatwg.org/#code-point0xD800—0xDBFF(inclusive) are leading surrogates. See: https://infra.spec.whatwg.org/#leading-surrogate0xDC00—0xDFFF(inclusive) are trailing surrogates. See: https://infra.spec.whatwg.org/#trailing-surrogateIn the code we use the
0xD800—0xDFFFrange because the ranges overlap.There are various references in the spec to replace surrogates (and invalid codepoints) with
\uFFFD. Here is one of them: https://drafts.csswg.org/css-syntax-3/#consume-escaped-code-pointFixes: #19786
Fixes: #19801 (this issue talks about a similar invalid code point issue)
Test plan
\uFFFDreplacement character.[ci-all] Just to verify on Windows as well