Decide what to do for `\b` and `\B` #1

mathiasbynens · 2014-08-26T08:14:13Z

Should `\b` and `\B` follow http://unicode.org/reports/tr18/#Simple_Word_Boundaries or http://unicode.org/reports/tr18/#Default_Word_Boundaries? Or something else entirely?

See the UTS#18 `<word_character>` production:

> The class of `<word_character>` includes all the Alphabetic values from the Unicode character database, from UnicodeData.txt [UData], plus the decimals (General_Category=Decimal_Number, or equivalently Numeric_Type=Decimal), and the U+200C ZERO WIDTH NON-JOINER and U+200D ZERO WIDTH JOINER (Join_Control=True).

However, http://unicode.org/reports/tr18/#b says:

> If there is a requirement that \b align with \w, then it would use the approximation above instead.
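For illustration, the `<word_character>` class quoted above can be approximated in JavaScript with Unicode property escapes. This is only a sketch of the quoted definition, assuming an engine with ES2018 property escape support; the name `wordCharacter` is made up for this example and is not part of any proposal.

```javascript
// Approximation of the <word_character> class as quoted above:
// Alphabetic + Decimal_Number + Join_Control (U+200C, U+200D).
// Assumes Unicode property escapes (ES2018); the `u` flag is required.
const wordCharacter = /[\p{Alphabetic}\p{Decimal_Number}\p{Join_Control}]/u;

console.log(/\w/.test("\u00E9"));            // false: \w is ASCII-only
console.log(wordCharacter.test("\u00E9"));   // true: é is Alphabetic
console.log(wordCharacter.test("\u0663"));   // true: ARABIC-INDIC DIGIT THREE is a Decimal_Number
console.log(wordCharacter.test("\u200D"));   // true: ZERO WIDTH JOINER has Join_Control
console.log(wordCharacter.test("-"));        // false: punctuation
```

Note that, unlike `\w`, this approximation of the quoted text does not match `_`.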


patch · 2014-09-12T23:04:11Z

For compatibility with other Unicode regex engines, I think \b and \B should probably operate on simple word boundaries. What matters most, though, is that \b operates on the same level as \w: if \w matches Unicode word characters, then \b should match at the boundary of a word made of Unicode word characters, and if \w only matches ASCII word characters, then \b should match at the boundary of a word made of ASCII word characters. Default word boundaries could then be implemented as \b{w} and \B{w}, extended grapheme cluster boundaries as \b{g}, sentence boundaries as \b{s}, etc.
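To make the "\b on the same level as \w" point concrete, here is a rough sketch that emulates a simple word boundary against a Unicode-aware word class using lookarounds. The helper `withUnicodeBoundaries` and the constants `WORD` and `BOUNDARY` are hypothetical names for this example only, and the sketch assumes an engine that supports lookbehind and property escapes (ES2018).

```javascript
// Hypothetical word class: the UTS#18-style approximation
// (Alphabetic + Decimal_Number + Join_Control).
const WORD = "[\\p{Alphabetic}\\p{Decimal_Number}\\p{Join_Control}]";

// A simple word boundary is a position with a word character on exactly one side.
const BOUNDARY = `(?:(?<=${WORD})(?!${WORD})|(?<!${WORD})(?=${WORD}))`;

// Hypothetical helper: wraps `source` in Unicode-aware boundaries.
function withUnicodeBoundaries(source, flags = "") {
  const f = flags.includes("u") ? flags : flags + "u";
  return new RegExp(BOUNDARY + source + BOUNDARY, f);
}

// ASCII \b sees a boundary between "f" and "é", because "é" is not in \w.
console.log(/\bcaf\b/.test("caf\u00E9"));                       // true

// The Unicode-aware boundary treats "é" as part of the word.
console.log(withUnicodeBoundaries("caf").test("caf\u00E9"));    // false
console.log(withUnicodeBoundaries("caf\u00E9").test("un caf\u00E9 noir")); // true
```

The design choice here is simply to derive the boundary from the same character class used for the word test, so the two cannot drift apart.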

Details on the \b{…} syntax:
http://unicode.org/reports/tr18/#Default_Grapheme_Clusters
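As far as I know, JavaScript regular expressions have no `\b{w}` syntax, so the difference between simple and default word boundaries cannot be shown with a regex alone. As a stand-in, the sketch below uses `Intl.Segmenter`, a separate API that segments text along UAX #29 boundaries. It assumes a runtime that ships `Intl.Segmenter`; the sample string is made up.

```javascript
const text = "can't stop";

// Simple word boundaries (ASCII \b): the apostrophe splits the word.
const simple = text.split(/\b/);
console.log(simple); // [ 'can', "'", 't', ' ', 'stop' ]

// Default word boundaries (UAX #29) via Intl.Segmenter:
// the apostrophe between letters stays inside the word.
const segmenter = new Intl.Segmenter("en", { granularity: "word" });
const words = [...segmenter.segment(text)]
  .filter((s) => s.isWordLike)   // drop whitespace and punctuation segments
  .map((s) => s.segment);
console.log(words); // [ "can't", 'stop' ]
```

Passing `granularity: "grapheme"` or `granularity: "sentence"` instead gives the segmentations that `\b{g}` and `\b{s}` would correspond to.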

mathiasbynens · 2014-11-08T09:36:25Z

This is not gonna happen as per https://esdiscuss.org/topic/questions-regarding-es6-unicode-regular-expressions#content-5. Closing the issue.

mathiasbynens closed this as completed Nov 8, 2014

zeratax mentioned this issue May 10, 2018

mentions don't work for nonascii characters matrix-org/matrix-appservice-discord#118

Closed
