Fix crash due to invalid characters in candidate #19829

Merged
RobinMalfait merged 4 commits into main from fix/issue-19786
Mar 20, 2026

RobinMalfait (Member) commented Mar 20, 2026

This PR fixes an issue where the compiler can crash if it encounters an invalid codepoint.

When we extract potential candidates from files, we can encounter values that look like a class or a CSS variable. If a value turns out to be an invalid CSS variable, we can ignore it.

The problem is that these values sometimes contain escape sequences that resolve to invalid code points, which crashes the compiler.

This PR handles that case gracefully by replacing invalid code points with \uFFFD, as per the spec.

The bug report (#19786) has a clean example where a piece of text looks like a CSS variable, but contains invalid code points.

--Coding-Projects-CharacterMapper-Master-Workspace\d8819554-4725-4235-9d22-2d0ed572e924

Luckily, this can be worked around today by ignoring the file paths that contain these strings using @source not "…";, but the better solution is to fix the underlying bug.

To solve this, instead of blindly passing numbers to String.fromCodePoint, we will first validate whether it's a valid codepoint:

  1. 0x0000–0x10FFFF (inclusive) is the range of valid code points. See: https://infra.spec.whatwg.org/#code-point
  2. 0xD800–0xDBFF (inclusive) are leading surrogates. See: https://infra.spec.whatwg.org/#leading-surrogate
  3. 0xDC00–0xDFFF (inclusive) are trailing surrogates. See: https://infra.spec.whatwg.org/#trailing-surrogate

In the code we use the combined 0xD800–0xDFFF range because the two surrogate ranges are adjacent.

There are various references in the spec to replace surrogates (and invalid codepoints) with \uFFFD. Here is one of them: https://drafts.csswg.org/css-syntax-3/#consume-escaped-code-point
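
The validation described above could be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the helper name `codePointToString` and the constant names are hypothetical. It also treats 0x0000 as invalid, following the CSS Syntax rule for consuming an escaped code point linked above.

```typescript
const REPLACEMENT_CHARACTER = "\uFFFD"
const MAX_CODE_POINT = 0x10ffff

// Hypothetical helper (not from the PR): converts a numeric escape value to a
// string, returning U+FFFD instead of throwing or emitting a lone surrogate.
function codePointToString(codePoint: number): string {
  // Anything outside 0x0000-0x10FFFF (or not an integer, e.g. NaN) is invalid.
  const isOutOfRange =
    !Number.isInteger(codePoint) || codePoint < 0 || codePoint > MAX_CODE_POINT
  // NULL is replaced as well, per the CSS Syntax escape rules.
  const isNull = codePoint === 0x0000
  // Leading (0xD800-0xDBFF) and trailing (0xDC00-0xDFFF) surrogates are
  // adjacent, so a single combined range check covers both.
  const isSurrogate = codePoint >= 0xd800 && codePoint <= 0xdfff

  if (isOutOfRange || isNull || isSurrogate) return REPLACEMENT_CHARACTER
  return String.fromCodePoint(codePoint)
}
```

For comparison, calling `String.fromCodePoint(0x110000)` directly throws a `RangeError`, whereas `codePointToString(0x110000)` in this sketch returns the replacement character.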

Fixes: #19786
Fixes: #19801 (this issue talks about a similar invalid code point issue)

Test plan

  1. Added a regression test where the above string was used as a CSS variable
  2. Added a regression test for the unescape functionality to make sure that invalid code points and surrogates are replaced by the \uFFFD replacement character.

[ci-all] Just to verify on Windows as well

RobinMalfait requested a review from a team as a code owner on March 20, 2026 12:42

coderabbitai bot (Contributor) commented Mar 20, 2026

Walkthrough

The unescape() utility now validates numeric CSS escapes: it treats code points 0x0000, values > 0x10FFFF, and surrogate-range values (0xD800–0xDFFF) as invalid and returns the Unicode replacement character; valid multi-digit escapes use String.fromCodePoint(), and short matches retain prior behavior. Added Vitest cases: one exercising the build with an out-of-range escaped sequence and one asserting unescape() replaces an invalid numeric escape with \uFFFD. A changelog entry documents a fixed crash when processing candidates with invalid characters.
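
As a rough illustration of the behavior summarized in this walkthrough, an unescape routine with this validation might look like the sketch below. The names `ESCAPE_PATTERN` and `unescapeSketch` are hypothetical, and the regular expression is an assumption rather than the one used in `escape.ts`. In particular, this sketch decodes every hex escape and does not reproduce the "short matches retain prior behavior" detail.

```typescript
// Matches a backslash followed by 1-6 hex digits (plus one optional trailing
// whitespace character), or a backslash followed by any single character.
const ESCAPE_PATTERN = /\\(?:([0-9a-fA-F]{1,6})[ \t\n\f\r]?|([\s\S]))/g

// Hypothetical sketch (not the PR's implementation) of unescaping CSS escapes
// while replacing invalid code points with U+FFFD.
function unescapeSketch(input: string): string {
  return input.replace(
    ESCAPE_PATTERN,
    (_match: string, hex: string | undefined, char: string | undefined) => {
      // Non-hex escape such as `\/`: drop the backslash, keep the character.
      if (hex === undefined) return char ?? ""

      const codePoint = Number.parseInt(hex, 16)
      const invalid =
        codePoint === 0 ||
        codePoint > 0x10ffff ||
        (codePoint >= 0xd800 && codePoint <= 0xdfff)

      return invalid ? "\uFFFD" : String.fromCodePoint(codePoint)
    },
  )
}
```

With this sketch, the string from the bug report no longer reaches `String.fromCodePoint` with an out-of-range value: the `\d88195` escape is replaced by the replacement character, and the remaining text is left untouched.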

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | Title accurately summarizes the main change: fixing a crash caused by invalid characters in CSS candidates. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #19786 by validating code points before conversion and replacing invalid/surrogate code points with U+FFFD per CSS spec. |
| Out of Scope Changes check | ✅ Passed | All changes are focused on fixing the invalid code point crash: the escape utility implementation, related tests, and changelog entry are all in scope. |
| Description check | ✅ Passed | The pull request description clearly explains the issue, the fix applied, and provides test coverage details. |


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/tailwindcss/src/utils/escape.test.ts (1)

15-23: Consider adding surrogate and NULL regression cases too.

This test covers the out-of-range branch well, but unescape() also replaces NULL and surrogate code points. Adding one case for each would lock in all invalid-path behavior.

➕ Suggested test additions
 describe('unescape', () => {
   test('removes backslashes', () => {
     expect(unescape(String.raw`red-1\/2`)).toMatchInlineSnapshot(`"red-1/2"`)
   })
 
   test('replaces out-of-range escaped code points', () => {
@@
     )
   })
+
+  test('replaces surrogate escaped code points', () => {
+    expect(unescape(String.raw`\d800`)).toBe('�')
+  })
+
+  test('replaces null escaped code points', () => {
+    expect(unescape(String.raw`\0 `)).toBe('�')
+  })
 })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/tailwindcss/src/utils/escape.test.ts` around lines 15 - 23, Add two
small tests to escape.test.ts that assert unescape()'s handling of surrogate and
NULL code points: one test feeding a lone high-surrogate escape (e.g.
String.raw`...\uD800...` or using the same backslash-escaped hex form as the
existing test) and another feeding a NULL escape (e.g. String.raw`...\0...`),
and assert the returned string matches the expected replacement behavior (the
same replacement character or removal behavior that unescape() implements).
Place these cases alongside the existing out-of-range test so they lock in
unescape()'s invalid-path behavior for surrogate and NULL code points.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/tailwindcss/src/index.test.ts`:
- Around line 1506-1510: The test is using expect(() =>
run([...])).not.toThrow() which only checks synchronous throws and misses
Promise rejections from the async run function; update the test to use an async
assertion by awaiting the promise and asserting it resolves (for example: await
expect(run([...])).resolves.not.toThrow() or await
expect(run([...])).resolves.toBeUndefined()), ensuring you call run(...)
directly (not wrapped in a function) and mark the test async so any rejection
fails the test; locate the invocation of run in the index.test.ts test and
replace the synchronous-check pattern with the awaited expect(...).resolves
form.
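
The difference between the two assertion styles can be illustrated without a test framework. The sketch below is a simplified model, not Vitest's implementation: `failingRun`, `doesNotThrowSync`, and `resolvesWithoutError` are hypothetical names standing in for the async `run` helper, a synchronous "does not throw" check, and an awaited check.

```typescript
// Hypothetical stand-in for an async helper whose work fails.
async function failingRun(): Promise<void> {
  throw new Error("invalid code point")
}

// Models a synchronous "does not throw" check: it only sees exceptions raised
// while `fn` executes, not a rejection of the promise that `fn` returns.
function doesNotThrowSync(fn: () => unknown): boolean {
  try {
    const result = fn()
    // Swallow a possible rejection so this demo does not trigger an
    // unhandled-rejection error in the runtime.
    if (result instanceof Promise) result.catch(() => {})
    return true
  } catch {
    return false
  }
}

// Awaiting the promise is what surfaces the rejection.
async function resolvesWithoutError(promise: Promise<unknown>): Promise<boolean> {
  try {
    await promise
    return true
  } catch {
    return false
  }
}

// The synchronous check reports success even though the async work failed.
const syncVerdict = doesNotThrowSync(() => failingRun())

// The awaited check resolves to false for the same failing call.
const asyncVerdict = resolvesWithoutError(failingRun())
```

Here `syncVerdict` is `true` although `failingRun()` rejects, while `asyncVerdict` resolves to `false`, which is why the awaited `expect(...).resolves` form is the one that would catch a regression.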


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f6677bfb-2cbb-4997-af25-479a58ea0c2f

📥 Commits

Reviewing files that changed from the base of the PR and between 7482d47 and 01221c7.

📒 Files selected for processing (3)
  • packages/tailwindcss/src/index.test.ts
  • packages/tailwindcss/src/utils/escape.test.ts
  • packages/tailwindcss/src/utils/escape.ts
