For compatibility with other Unicode regex engines, I think \b and \B should probably operate on simple word boundaries. What matters most, though, is that \b operates at the same level as \w: if \w matches Unicode word characters, then \b should match at the boundary of a word made up of Unicode word characters, and if \w matches only ASCII word characters, then \b should match at the boundary of a word made up of ASCII word characters. Default word boundaries could then be exposed as \b{w} and \B{w}, alongside extended grapheme cluster boundaries as \b{g}, sentence boundaries as \b{s}, etc.
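To make the "\b at the same level as \w" point concrete, here is a minimal sketch in Python. It uses Python's `re` module purely as a stand-in, not the engine under discussion, and Python's own `\w` is not the UTS#18 `<word_character>` production. The helper names (`simple_boundaries`, `regex_boundaries`, `is_word_unicode`, `is_word_ascii`) are made up for illustration. A simple word boundary is computed from whichever word-character predicate is in effect, so switching `\w` between Unicode and ASCII moves the boundaries with it.

```python
import re

def simple_boundaries(text, is_word):
    """Offsets where exactly one of the two neighbouring characters is a word character."""
    offsets = []
    for i in range(len(text) + 1):
        before = i > 0 and is_word(text[i - 1])
        after = i < len(text) and is_word(text[i])
        if before != after:
            offsets.append(i)
    return offsets

def is_word_unicode(ch):
    # Python's default \w for str patterns (Unicode-aware).
    return re.fullmatch(r"\w", ch) is not None

def is_word_ascii(ch):
    # \w restricted to [a-zA-Z0-9_] via the re.ASCII flag.
    return re.fullmatch(r"\w", ch, re.ASCII) is not None

def regex_boundaries(text, flags=0):
    """Offsets where the engine's own \\b matches."""
    return [m.start() for m in re.finditer(r"\b", text, flags)]

# "naïve café", written with escapes so the accented letters are single code points.
text = "na\u00efve caf\u00e9"

print(simple_boundaries(text, is_word_unicode))  # [0, 5, 6, 10]
print(simple_boundaries(text, is_word_ascii))    # [0, 2, 3, 5, 6, 9]

# The engine's \b follows the same predicate as its \w in each mode.
print(regex_boundaries(text) == simple_boundaries(text, is_word_unicode))          # True
print(regex_boundaries(text, re.ASCII) == simple_boundaries(text, is_word_ascii))  # True
```

In ASCII mode the accented letters count as non-word characters, so "naïve" is split into "na" and "ve". That is the result you want when \w is ASCII-only, and the inconsistency to avoid is a Unicode \w paired with an ASCII \b, or the reverse.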
Should \b follow http://unicode.org/reports/tr18/#Simple_Word_Boundaries or http://unicode.org/reports/tr18/#Default_Word_Boundaries? Or something else entirely?
See the UTS#18 <word_character> production:
However, http://unicode.org/reports/tr18/#b says: