Unicode® Standard Annex #31, Unicode Identifier and Pattern Syntax (http://www.unicode.org/reports/tr31/) may be relevant here, although last time I had a look at it, I didn't agree with all of it.
Several specifications define "names". As one example, XML says (https://www.w3.org/TR/REC-xml/#NT-Nmtoken):
NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
It is really not clear where these lists of characters come from, or why some characters are acceptable as name characters and others are not.
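To make the shape of these lists concrete, here is a minimal Python sketch that transcribes the two productions quoted above into range tables. The function names (`is_name_start_char`, `is_name_char`, `is_xml_name`) are made up for illustration, and the final check assumes the usual XML rule that a name is one NameStartChar followed by zero or more NameChar.

```python
# Code point ranges transcribed from the NameStartChar production quoted above.
NAME_START_RANGES = [
    (0x3A, 0x3A),  # ":"
    (0x41, 0x5A),  # A-Z
    (0x5F, 0x5F),  # "_"
    (0x61, 0x7A),  # a-z
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Ranges that NameChar adds on top of NameStartChar.
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E),      # "-" and "."
    (0x30, 0x39),      # 0-9
    (0xB7, 0xB7),      # middle dot
    (0x300, 0x36F),    # combining diacritical marks
    (0x203F, 0x2040),  # undertie and character tie
]


def _in_ranges(ch, ranges):
    """Return True if the code point of ch falls inside any (low, high) pair."""
    cp = ord(ch)
    return any(low <= cp <= high for low, high in ranges)


def is_name_start_char(ch):
    return _in_ranges(ch, NAME_START_RANGES)


def is_name_char(ch):
    return is_name_start_char(ch) or _in_ranges(ch, NAME_EXTRA_RANGES)


def is_xml_name(s):
    # Assumed rule: Name ::= NameStartChar (NameChar)*
    return bool(s) and is_name_start_char(s[0]) and all(is_name_char(c) for c in s[1:])
```

Written this way, the gaps stand out: U+00D7 (the multiplication sign) and U+00F7 (the division sign) are carved out of otherwise contiguous blocks, but nothing in the tables themselves says why.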
Unicode has the concept of 'General Category values' (http://www.unicode.org/reports/tr44/#General_Category_Values), which classify characters as, for instance, "Uppercase_Letter", "Lowercase_Letter", etc.
It seems to me that it would be good advice for specification writers to use the Unicode General Category values as the basis for defining names (amongst other things), rather than apparently randomly chosen lists of code points.
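As a rough illustration of what a category-based definition could look like, here is a small Python sketch using the standard `unicodedata` module. The particular category sets (`START_CATEGORIES`, `CONTINUE_CATEGORIES`) and the function names are my own assumptions for the sake of the example, not something taken from XML or any other specification; a real spec would have to choose its own sets.

```python
import unicodedata

# Hypothetical choice: letters and letter-like numbers may start a name.
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}

# Hypothetical choice: marks, decimal digits and connector punctuation may follow.
CONTINUE_CATEGORIES = START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}


def is_name_start(ch):
    # "_" has category Pc, so it is allowed at the start only by this explicit exception.
    return ch == "_" or unicodedata.category(ch) in START_CATEGORIES


def is_name_continue(ch):
    return unicodedata.category(ch) in CONTINUE_CATEGORIES


def is_name(s):
    """Return True if s is non-empty, starts with a start character,
    and every later character is a continue character."""
    return bool(s) and is_name_start(s[0]) and all(is_name_continue(c) for c in s[1:])
```

One thing to keep in mind with this approach is that the result depends on the version of the Unicode Character Database in use (in Python, `unicodedata.unidata_version`), so a specification would probably need to say how it handles characters assigned in later versions.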