[RegExp] Match updated spec for `/\w/iu` and `/\W/iu` #1181

Closed
mathiasbynens opened this Issue Jun 23, 2016 · 4 comments

@mathiasbynens
mathiasbynens commented Jun 23, 2016 edited

Now that tc39/ecma262#525 has landed, \u017F (LATIN SMALL LETTER LONG S) and \u212A (KELVIN SIGN) are word characters for /iu patterns.

Expected vs. actual behavior:

/\w/iu.test('\u017F') // true
/\w/iu.test('\u212A') // true
/\W/iu.test('\u017F') // false, actual: true
/\W/iu.test('\u212A') // false, actual: true
/\W/iu.test('s')      // false
/\W/iu.test('S')      // false
/\W/iu.test('K')      // false
/\W/iu.test('k')      // false
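
The expectations above can be turned into a small table-driven check to run against any engine. This is only a sketch: the `cases` array and the `runCases` helper are illustrative names, not part of any existing test suite, and the `actual` column depends entirely on the engine it runs in.

```javascript
// Each row: [pattern, input, result expected under the updated spec].
const cases = [
  [/\w/iu, '\u017F', true],
  [/\w/iu, '\u212A', true],
  [/\W/iu, '\u017F', false],
  [/\W/iu, '\u212A', false],
  [/\W/iu, 's', false],
  [/\W/iu, 'S', false],
  [/\W/iu, 'K', false],
  [/\W/iu, 'k', false],
];

// Runs every case and records what the host engine actually returns.
function runCases(rows) {
  return rows.map(([re, input, expected]) => {
    const actual = re.test(input);
    return { pattern: String(re), input, expected, actual, ok: actual === expected };
  });
}

for (const r of runCases(cases)) {
  const cp = 'U+' + r.input.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
  console.log(`${r.pattern} ${cp} expected=${r.expected} actual=${r.actual} ${r.ok ? 'ok' : 'MISMATCH'}`);
}
```

Any row printed as MISMATCH points at an engine that has not picked up the spec change yet.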

See also

@dilijev
Member
dilijev commented Jun 24, 2016

Related to #517

@akroshg akroshg was assigned by dilijev Jun 24, 2016
@dilijev dilijev assigned dilijev and unassigned akroshg Nov 11, 2016
@dilijev dilijev added this to the 1.4 milestone Nov 11, 2016
@dilijev dilijev changed the title from RegExp: Match updated spec for `/\w/iu` and `/\W/iu` to [RegExp] Match updated spec for `/\w/iu` and `/\W/iu` Nov 11, 2016
@dilijev dilijev modified the milestone: 1.4.1, 1.4.0 Dec 20, 2016
@dilijev
Member
dilijev commented Jan 13, 2017 edited

Just to clarify, I'm seeing the following in ChakraCore for actual behavior.

/\w/iu.test('\u017F') // expected: true,  actual: false ***
/\w/iu.test('\u212A') // expected: true,  actual: false ***
/\W/iu.test('\u017F') // expected: false, actual: true
/\W/iu.test('\u212A') // expected: false, actual: true
/\W/iu.test('s')      // expected: false
/\W/iu.test('S')      // expected: false
/\W/iu.test('K')      // expected: false
/\W/iu.test('k')      // expected: false

Since sharp s and kelvin are word characters, \w should match those two characters (first two lines above).
Also, since \W should be the inverse set of \w, \W should not match those characters. Correct?

@dilijev dilijev added a commit to dilijev/ChakraCore that referenced this issue Jan 13, 2017
@dilijev dilijev Add Sharp S and Kelvin K to word-chars. Fixes #1181. 6dd0009
@dilijev dilijev added a commit to dilijev/ChakraCore that referenced this issue Jan 13, 2017
@dilijev dilijev Add Sharp S and Kelvin K to word-chars. Fixes #1181. 5638194
@mathiasbynens

The expected behavior you’ve outlined is correct.

One minor correction, though:

> Since sharp s and kelvin are word characters

They’re not (as far as \w goes in JavaScript), but with the i flag set they canonicalize to S and K, which are.
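
To illustrate the distinction, here is a rough model of the canonicalization step for /iu. It assumes a hand-written fold table holding just the two code points from this issue; `SIMPLE_FOLD`, `canonicalize`, and `isWordCharIU` are names made up for this sketch. A real engine uses the full Unicode simple case folding data, which maps these two code points to the lowercase letters s and k (either case is in the basic \w set, so the conclusion is the same).

```javascript
// Basic \w set: [A-Za-z0-9_].
function isBasicWordChar(ch) {
  return /^[A-Za-z0-9_]$/.test(ch);
}

// Tiny stand-in for Unicode simple case folding (assumption: only the entries needed here).
const SIMPLE_FOLD = new Map([
  ['\u017F', 's'], // LATIN SMALL LETTER LONG S
  ['\u212A', 'k'], // KELVIN SIGN
]);

function canonicalize(ch) {
  if (SIMPLE_FOLD.has(ch)) return SIMPLE_FOLD.get(ch);
  // ASCII capitals fold to lowercase; everything else is left alone in this sketch.
  return /^[A-Z]$/.test(ch) ? ch.toLowerCase() : ch;
}

// Under /iu, a character counts as a word character if it is one already
// or if its canonical form is one.
function isWordCharIU(ch) {
  return isBasicWordChar(ch) || isBasicWordChar(canonicalize(ch));
}

console.log(isBasicWordChar('\u017F')); // false: not a word character on its own
console.log(isWordCharIU('\u017F'));    // true: canonicalizes to "s"
console.log(isWordCharIU('\u212A'));    // true: canonicalizes to "k"
console.log(isWordCharIU('\u00E9'));    // false: no fold into the basic set
```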

@dilijev
Member
dilijev commented Jan 13, 2017

@mathiasbynens Oh, I see. That makes sense. I still think the simplest / most efficient way to get this behavior is to add them to the char set we use for \w when both /i and /u flags are set.
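
A sketch of that idea in JavaScript, for illustration only. This is not the ChakraCore code (which is C++); `wordRanges`, `matchesWordEscape`, and `matchesNonWordEscape` are hypothetical names, and the model only covers the class escapes themselves, not the rest of the matcher.

```javascript
// Code point ranges for the basic \w set: 0-9, A-Z, _, a-z.
const BASIC_WORD_RANGES = [[0x30, 0x39], [0x41, 0x5A], [0x5F, 0x5F], [0x61, 0x7A]];

// Extra code points that case-fold into the basic set (only relevant for /iu).
const IU_EXTRA_RANGES = [[0x017F, 0x017F], [0x212A, 0x212A]];

// Hypothetical helper: pick the \w ranges from the flags once, up front.
function wordRanges({ ignoreCase, unicode }) {
  return ignoreCase && unicode
    ? [...BASIC_WORD_RANGES, ...IU_EXTRA_RANGES]
    : BASIC_WORD_RANGES;
}

function inRanges(ranges, cp) {
  return ranges.some(([lo, hi]) => cp >= lo && cp <= hi);
}

// \w is membership in the set; \W is its complement, so the two never both match.
function matchesWordEscape(cp, flags) {
  return inRanges(wordRanges(flags), cp);
}
function matchesNonWordEscape(cp, flags) {
  return !matchesWordEscape(cp, flags);
}

const iu = { ignoreCase: true, unicode: true };
console.log(matchesWordEscape(0x017F, iu));                                  // true
console.log(matchesWordEscape(0x017F, { ignoreCase: true, unicode: false })); // false
console.log(matchesNonWordEscape(0x212A, iu));                               // false
```

Deriving \W as the complement of the flag-specific \w set is what keeps the two escapes consistent for U+017F and U+212A.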

@chakrabot chakrabot pushed a commit that referenced this issue Jan 13, 2017
@dilijev dilijev [MERGE #2375 @dilijev] Add Sharp S and Kelvin sign to word-chars. Fixes #1181.

Merge pull request #2375 from dilijev:regex-w

Fixes #1181

Reviewed by @tcare and @bterlson
115f937
@chakrabot chakrabot closed this in 5104ab2 Jan 13, 2017
@chakrabot chakrabot pushed a commit that referenced this issue Jan 13, 2017
@dilijev dilijev [1.4>master] [MERGE #2375 @dilijev] Add Sharp S and Kelvin sign to word-chars. Fixes #1181.

Merge pull request #2375 from dilijev:regex-w

Fixes #1181

Reviewed by @tcare and @bterlson
25a7f3f