[JSC] Reject dangling hyphen in class set under /v flag #180
Open
robobun wants to merge 1 commit into oven-sh:main
In UnicodeSets mode (/v), - is a ClassSetSyntaxCharacter per ECMA-262 and is only legal between two ClassSetCharacters as part of a ClassSetRange. A bare or trailing - with no right-hand side (e.g. /[a-]/v, /[\d-]/v, /[\w-]/v, /[a-z\d-]/v) must be rejected.

ClassSetParserDelegate previously accepted the CachedCharacterHyphen and AfterCharacterClassHyphen states silently in flushCachedCharacterIfNeeded() and end(), so these patterns parsed without error and matched both operands and - literally. Neither V8, SpiderMonkey, nor the spec accepts them.

Add an InvalidClassSetCharacter error when either incomplete-range state is hit at a class-set transition point (nested class boundary, set operator, or closing ]). The valid-range path (CachedCharacter -> CachedCharacterHyphen -> completed range) is unaffected because it does not go through flushCachedCharacterIfNeeded() or end() while the hyphen is pending.

Fixes oven-sh/bun#29003.
robobun added a commit to oven-sh/bun that referenced this pull request Apr 8, 2026
See oven-sh/WebKit#180 for the JSC-side parser fix. In UnicodeSets mode, - is a ClassSetSyntaxCharacter that is only legal as part of a full ClassSetRange. A dangling - (e.g. /[a-]/v, /[\d-]/v, /[\w-]/v, /[a-z\d-]/v) must throw a SyntaxError, matching V8/SpiderMonkey and the spec. The fix lives in vendor/WebKit (not tracked in this repo) in yarr/YarrParser.h. The companion WebKit PR raises ErrorCode::InvalidClassSetCharacter in flushCachedCharacterIfNeeded() and end() when the parser is still in a CachedCharacterHyphen or AfterCharacterClassHyphen state at a class-set transition point.
Walkthrough: This change modifies YarrParser to properly reject invalid unescaped hyphens in character classes when using the /v flag.
robobun added a commit to oven-sh/bun that referenced this pull request Apr 8, 2026
- Drop the multi-line prose comments in favor of the single-line issue URL (coderabbit nit).
- Split the already-passing regression guards ([-a], [-]) into their own test() block so they keep running even if the pending /v fix lands late.
- Wrap the not-yet-landed assertions (/[a-]/v, /[\d-]/v, /[\w-]/v, /[a-z\d-]/v) in test.todo(). These need oven-sh/WebKit#180 to merge and a WEBKIT_VERSION bump; the todos get promoted to test() in the same commit as the bump.
In UnicodeSets mode (/v), - is a ClassSetSyntaxCharacter per ECMA-262 and is only legal between two ClassSetCharacters as part of a ClassSetRange. A bare or trailing - with no right-hand side must be rejected.

Patterns that currently parse but should not:

- /[a-]/v
- /[\d-]/v
- /[\w-]/v
- /[a-z\d-]/v

Neither V8, SpiderMonkey, nor the spec accepts them.
Root cause

ClassSetParserDelegate::flushCachedCharacterIfNeeded() only handled CachedCharacter. end() silently accepted CachedCharacterHyphen by emitting both the cached character and a literal -. AfterCharacterClassHyphen fell through entirely. Both states represent an incomplete ClassSetRange (a - with nothing on the right) and must be errors in /v.

Fix
Raise ErrorCode::InvalidClassSetCharacter in both flushCachedCharacterIfNeeded() and end() when either incomplete-range state is reached. The valid-range path (a-z) goes straight from CachedCharacterHyphen into the range-completion branch of atomPatternCharacter() without touching either of those helpers, so it is unaffected.

Verification
Covered by the bun regression test in the companion bun PR. Patterns that already error still error, the four newly detected dangling-hyphen forms now error, and /[a-z]/v, /[a\-]/v, /[\-a]/v, /[a--b]/v, /[a&&b]/v, /[\w--\d]/v still parse and match as before.

Fixes oven-sh/bun#29003.
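
For readers unfamiliar with the delegate, the state handling described in this PR can be sketched as a small toy model. This is not the Yarr code: only CachedCharacter, CachedCharacterHyphen, AfterCharacterClassHyphen, flushCachedCharacterIfNeeded(), end() and atomPatternCharacter() are named in the PR. The Empty and AfterCharacterClass states, the atomBuiltInCharacterClass() and setOperator() callbacks, the boolean error flag, and the toyClassSetParses() driver are assumptions made for illustration, and the toy handles only plain characters, \d, \w, escaped literals, and the -- and && operators.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy model, not the real Yarr code. CachedCharacter, CachedCharacterHyphen and
// AfterCharacterClassHyphen are named in the PR; Empty and AfterCharacterClass
// are assumed here for illustration.
enum class State { Empty, CachedCharacter, CachedCharacterHyphen,
                   AfterCharacterClass, AfterCharacterClassHyphen };

class ToyClassSetDelegate {
public:
    // A ClassSetCharacter, or an unescaped '-' when isHyphen is true.
    void atomPatternCharacter(bool isHyphen)
    {
        if (m_failed)
            return;
        if (isHyphen) {
            if (m_state == State::CachedCharacter)
                m_state = State::CachedCharacterHyphen;
            else if (m_state == State::AfterCharacterClass)
                m_state = State::AfterCharacterClassHyphen;
            else
                m_failed = true; // '-' with nothing usable on its left
            return;
        }
        if (m_state == State::CachedCharacterHyphen) {
            // Range-completion branch: a-z finishes here without flushing.
            m_state = State::Empty;
            return;
        }
        flushCachedCharacterIfNeeded();
        m_state = State::CachedCharacter;
    }

    // A built-in class escape such as \d or \w.
    void atomBuiltInCharacterClass()
    {
        flushCachedCharacterIfNeeded();
        m_state = State::AfterCharacterClass;
    }

    // A set operator ("--" or "&&"); operand validation is omitted in this toy.
    void setOperator()
    {
        flushCachedCharacterIfNeeded();
        m_state = State::Empty;
    }

    // Closing ']'. Returns true when the class set parsed without error.
    bool end()
    {
        flushCachedCharacterIfNeeded();
        return !m_failed;
    }

private:
    void flushCachedCharacterIfNeeded()
    {
        // A pending hyphen at a transition point is an incomplete range; the
        // real code reports it as ErrorCode::InvalidClassSetCharacter.
        if (m_state == State::CachedCharacterHyphen
            || m_state == State::AfterCharacterClassHyphen)
            m_failed = true;
    }

    State m_state { State::Empty };
    bool m_failed { false };
};

// Hypothetical driver: feeds the text between '[' and ']' to the delegate.
bool toyClassSetParses(const std::string& body)
{
    ToyClassSetDelegate delegate;
    for (std::size_t i = 0; i < body.size(); ++i) {
        char ch = body[i];
        bool hasNext = i + 1 < body.size();
        if (ch == '\\') {
            if (!hasNext)
                return false; // lone trailing backslash
            char escaped = body[++i];
            if (escaped == 'd' || escaped == 'w')
                delegate.atomBuiltInCharacterClass();
            else
                delegate.atomPatternCharacter(false); // e.g. \- is a literal
        } else if ((ch == '-' || ch == '&') && hasNext && body[i + 1] == ch) {
            delegate.setOperator(); // "--" or "&&"
            ++i;
        } else
            delegate.atomPatternCharacter(ch == '-');
    }
    return delegate.end();
}
```

In this toy, toyClassSetParses("a-") and toyClassSetParses("a-z\\d-") return false because the hyphen is still pending at the closing bracket, while toyClassSetParses("a-z") returns true because the range completes inside atomPatternCharacter() without ever reaching the flush helper.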