
Commit d1e2a85

Add isIDCONT_lazy_if_safe()
Various places in the code are using isWORDCHAR to match the continuation in an identifier. This mostly works, but the two sets are not identical, and the proper thing to do is to match continuation characters. The infrastructure was lacking this macro that would make it easy to do the right thing. This commit adds the infrastructure, leaving it to future commits to use it.

A reasonably complete list of characters that differ between the two sets is:

    MIDDLE DOT
    GREEK YPOGEGRAMMENI
    GREEK ANO TELEIA
    COMBINING CYRILLIC HUNDRED THOUSANDS SIGN
    COMBINING CYRILLIC MILLIONS SIGN
    ARMENIAN MODIFIER LETTER LEFT HALF RING
    ARMENIAN EMPHASIS MARK
    NEW TAI LUE THAM DIGIT ONE
    COMBINING PARENTHESES OVERLAY
    COMBINING ENCLOSING CIRCLE
    COMBINING ENCLOSING CIRCLE BACKSLASH
    COMBINING ENCLOSING SCREEN
    COMBINING ENCLOSING UPWARD POINTING TRIANGLE
    MANDAIC LETTER AZ
    ESTIMATED SYMBOL
    CIRCLED LATIN CAPITAL LETTER A ... CIRCLED LATIN SMALL LETTER Z
    VERTICAL TILDE
    KATAKANA MIDDLE DOT
    COMBINING CYRILLIC TEN MILLIONS SIGN
    COMBINING CYRILLIC THOUSAND MILLIONS SIGN
    ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED FORM
    ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM
    ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
    ARABIC LIGATURE JALLAJALALOUHOU
    ARABIC FATHATAN ISOLATED FORM
    ARABIC DAMMATAN ISOLATED FORM
    ARABIC KASRATAN ISOLATED FORM
    ARABIC FATHA ISOLATED FORM
    ARABIC DAMMA ISOLATED FORM
    ARABIC KASRA ISOLATED FORM
    ARABIC SHADDA ISOLATED FORM
    ARABIC SUKUN ISOLATED FORM
    HALFWIDTH KATAKANA MIDDLE DOT
    SQUARED LATIN CAPITAL LETTER A
    SQUARED LATIN CAPITAL LETTER Z
    NEGATIVE CIRCLED LATIN CAPITAL LETTER A ... NEGATIVE CIRCLED LATIN CAPITAL LETTER Z
    NEGATIVE SQUARED LATIN CAPITAL LETTER A ... NEGATIVE SQUARED LATIN CAPITAL LETTER Z
1 parent d4c2561 commit d1e2a85

1 file changed: +4 -0 lines changed

utf8.h

Lines changed: 4 additions & 0 deletions
@@ -951,6 +951,10 @@ implementation of the latter. */
     ((IN_BYTES || !UTF) \
      ? isIDFIRST(*(p)) \
      : isIDFIRST_utf8_safe(p, e))
+#define isIDCONT_lazy_if_safe(p, e, UTF) \
+    ((IN_BYTES || !UTF) \
+     ? isIDCONT(*(p)) \
+     : isIDCONT_utf8_safe(p, e))
 #define isWORDCHAR_lazy_if_safe(p, e, UTF) \
     ((IN_BYTES || !UTF) \
      ? isWORDCHAR(*(p)) \
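
The dispatch pattern these macros share can be sketched outside the Perl source. The following is a minimal, self-contained illustration and not the code in utf8.h. Every `demo_` name is hypothetical, `DEMO_IN_BYTES` is a constant stand-in for `IN_BYTES`, and the classifiers are toy tables covering only ASCII word characters plus two hand-picked code points. U+24B6 CIRCLED LATIN CAPITAL LETTER A, which the commit message lists among the characters that differ between the two sets, is used to show the two macros giving different answers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Perl's IN_BYTES; always 0 in this sketch. */
#define DEMO_IN_BYTES 0

/* Toy single-byte classifier: ASCII letters, digits and underscore only. */
static int demo_isIDCONT_byte(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9') || c == '_';
}

/* Decode one UTF-8 sequence from [p, e). Returns its length in bytes, or 0
 * if the sequence is malformed or would run past e. Overlong forms and
 * surrogates are not rejected; this is a sketch, not a validator. */
static size_t demo_decode_utf8(const unsigned char *p, const unsigned char *e,
                               uint32_t *cp)
{
    size_t len, i;

    if (p >= e) return 0;
    if (*p < 0x80) { *cp = *p; return 1; }
    else if ((*p & 0xE0) == 0xC0) { len = 2; *cp = *p & 0x1F; }
    else if ((*p & 0xF0) == 0xE0) { len = 3; *cp = *p & 0x0F; }
    else if ((*p & 0xF8) == 0xF0) { len = 4; *cp = *p & 0x07; }
    else return 0;

    if ((size_t)(e - p) < len) return 0;      /* truncated: never read past e */
    for (i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80) return 0;  /* not a continuation byte */
        *cp = (*cp << 6) | (p[i] & 0x3F);
    }
    return len;
}

/* Toy code point tables (assumptions for illustration only): U+00E9 is in
 * both sets, U+24B6 is a word character but not an identifier continuation. */
static int demo_cp_isIDCONT(uint32_t cp)
{
    return cp < 0x80 ? demo_isIDCONT_byte((unsigned char)cp) : cp == 0xE9;
}

static int demo_cp_isWORDCHAR(uint32_t cp)
{
    return demo_cp_isIDCONT(cp) || cp == 0x24B6;
}

/* Bounded ("safe") UTF-8 variants: malformed or truncated input is a non-match. */
static int demo_isIDCONT_utf8_safe(const unsigned char *p, const unsigned char *e)
{
    uint32_t cp;
    return demo_decode_utf8(p, e, &cp) != 0 && demo_cp_isIDCONT(cp);
}

static int demo_isWORDCHAR_utf8_safe(const unsigned char *p, const unsigned char *e)
{
    uint32_t cp;
    return demo_decode_utf8(p, e, &cp) != 0 && demo_cp_isWORDCHAR(cp);
}

/* Same shape as the macros in the diff: classify one byte unless UTF-8 is in
 * effect, otherwise take the bounded UTF-8 path. */
#define demo_isIDCONT_lazy_if_safe(p, e, UTF)   \
    ((DEMO_IN_BYTES || !(UTF))                  \
     ? demo_isIDCONT_byte(*(p))                 \
     : demo_isIDCONT_utf8_safe(p, e))

#define demo_isWORDCHAR_lazy_if_safe(p, e, UTF) \
    ((DEMO_IN_BYTES || !(UTF))                  \
     ? demo_isIDCONT_byte(*(p))                 \
     : demo_isWORDCHAR_utf8_safe(p, e))

/* Length in bytes of the run of identifier-continuation characters at s. */
static size_t demo_ident_cont_len(const unsigned char *s, const unsigned char *e,
                                  int utf)
{
    const unsigned char *p = s;

    assert(s <= e);
    while (p < e && demo_isIDCONT_lazy_if_safe(p, e, utf)) {
        uint32_t cp;
        /* A match on the UTF-8 path implies a successful, non-zero decode. */
        p += utf ? demo_decode_utf8(p, e, &cp) : 1;
    }
    return (size_t)(p - s);
}
```

Note that the byte branch dereferences `p` without consulting `e`, so in this sketch the caller checks `p < e` first, as `demo_ident_cont_len` does. Only the UTF-8 branch needs the end pointer, because only there can a single character span several bytes.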
