Should include advice on specifying what a letter is. #16

spemberton · 2016-07-11T13:06:30Z

Several specifications define "names". As one example, XML says (https://www.w3.org/TR/REC-xml/#NT-Nmtoken):

NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]

It is really not clear where this list of characters comes from, or why some of them are acceptable as name characters and others are not.

Unicode has the concept of 'category values' (http://www.unicode.org/reports/tr44/#General_Category_Values), which classify characters as, for instance, "Uppercase_Letter", "Lowercase_Letter", etc.

It seems to me that it would be good advice for specification writers to use the Unicode Category Values as the basis for defining (amongst other things) names, rather than apparently randomly chosen lists of character numbers.
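
To make the suggestion concrete, here is a minimal Python sketch of a name check driven by General Category values instead of hard-coded code point ranges. The category sets and the `is_name` function are hypothetical choices made for illustration; they are not taken from XML or any other specification. Only `unicodedata.category` is a real standard-library call.

```python
import unicodedata

# Hypothetical policy for illustration only, not any specification's rule:
# a name starts with a letter (L*) or letter number (Nl) ...
NAME_START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
# ... and continues with those plus combining marks (Mn, Mc),
# decimal digits (Nd), and connector punctuation (Pc, e.g. "_").
NAME_CONTINUE_CATEGORIES = NAME_START_CATEGORIES | {"Mn", "Mc", "Nd", "Pc"}


def is_name(text: str) -> bool:
    """Return True if text is a name under the assumed category policy."""
    if not text:
        return False
    # The first character must be in a "start" category.
    if unicodedata.category(text[0]) not in NAME_START_CATEGORIES:
        return False
    # Every following character must be in a "continue" category.
    return all(
        unicodedata.category(ch) in NAME_CONTINUE_CATEGORIES
        for ch in text[1:]
    )
```

Under this assumed policy `is_name("x1")` is true and `is_name("1x")` is false. Note that "-" (category Pd), "." and #xB7 (both Po) are rejected, so a specification that wants them, as XML's NameChar does, would still have to list them explicitly alongside the categories.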

duerst · 2016-07-12T07:08:54Z

Unicode® Standard Annex #31, Unicode Identifier and Pattern Syntax (http://www.unicode.org/reports/tr31/) may be relevant here, although last time I had a look at it, I didn't agree with all of it.

aphillips · 2022-05-12T18:13:30Z

Is this addressed by the recently-added section in specdev found here?

aphillips · 2023-10-27T18:16:46Z

No response on the last comment. We appear to have addressed it. Please reopen if needed.

aphillips added the close? label May 12, 2022

aphillips closed this as completed Oct 27, 2023
