Update references to match current UTS 35 spec #77
Comments
We did just discuss what "editorial" means the other day, so I should remember quite clearly what it was...but even without remembering the exact meaning, this does not seem editorial. Perhaps most notably given the call yesterday, newer TR35 (or at least what we referenced in it) removed (or, moved) a requirement to do alias/preferred replacement in Unicode locale extensions when canonicalizing. That removal was what led me to no longer have concerns about advancing But if #43 is correct, we specifically chose to do replacements. So if TR35 updating changed that, we would also need to change to do replacements again. And that's a significant change in how Given that this proposal modifies |
So, looking closely at this, it seems like the only way to invoke replacements any more is to invoke TR35's new "canonical form" setup. We could hand-roll our own version of this to perform replacements, of course, but that seems best avoided. So...probably we want to invoke canonical form in |
Note that BCP 47 Language Tag to Unicode BCP 47 Locale Identifier, that is the operation currently performed by @anba Am I dumb, or is this just a TR35 bug? (Which would raise the question of whether this algorithm in TR35 should change, and if it did change -- and ideally deduplicated -- we'd only have duplicates to deal with at all in this spec.) |
Yes, this is a TR35 bug. When TR35 was changed to differentiate between "canonical syntax" and "canonical form" (in version 36), that sentence wasn't updated to use "canonical syntax". |
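To make the syntax/form distinction concrete, here is a minimal sketch using `Intl.getCanonicalLocales`. It assumes that casing belongs to "canonical syntax" and alias replacement to "canonical form", which is my reading of the UTS 35 section, and the sample tags are only illustrations. The second result depends on the engine and the CLDR data it ships, so no output is claimed for it.

```javascript
// Casing normalization: a syntax-level rule that conforming engines apply.
const [cased] = Intl.getCanonicalLocales("EN-latn-us");
console.log(cased); // "en-Latn-US"

// Alias replacement (e.g. a deprecated language code mapped to its
// preferred value) is a form-level rule. Whether it happens here depends
// on the engine and its data, so the result is only printed, not assumed.
const [aliased] = Intl.getCanonicalLocales("iw");
console.log(aliased);
```

Comparing the two results on a given engine shows which of the two levels its canonicalization actually implements.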
I ended up making us invoke canonical form in |
@zbraniecki #83 will solve one aspect of this issue, namely responding to the canonical form/syntax split introduced in UTS35 v35. I can't say with confidence that that is definitely the only problem this issue covers -- merely the most serious one from my point of view. @anba could say more, more quickly, than I could here, I think. |
This indeed was not an editorial issue. I'll wait for @anba to verify if there's anything left now and if what's left is editorial :) |
For example see step 5 in Intl.Locale.prototype.language:
The locale id "en-t-en" contains two |
Thank you! That's a good catch. I clarified that in the PR. Lmk if there's anything else you noticed. |
The current draft spec was written against UTS 35, version 34, but UTS 35 is now at version 35, and version 35 contained many changes for Unicode BCP 47 locale identifiers. I'd suggest checking the complete Intl.Locale spec to verify that it still matches what's currently in UTS 35.
For example:
unicode_language_subtag is not only used in unicode_language_id, but also in the tlang production.
The text refers to what was in http://www.unicode.org/reports/tr35/tr35-53/tr35.html#u_Extension, but that's now part of http://unicode.org/reports/tr35/#Canonical_Unicode_Locale_Identifiers (cf. canonical syntax and canonical form in that section).
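As a rough illustration of the first point, the sketch below lists every position in a locale id where a unicode_language_subtag can occur. `languageSubtagPositions` is a hypothetical helper, not anything from the spec, and it is deliberately simplified: it assumes a well-formed id, ignores private use, and only checks the subtag directly after the first "t" singleton.

```javascript
// Hypothetical helper (not spec text): finds subtags that sit in a
// unicode_language_subtag position, either at the start of the id or as
// the start of a tlang inside a transformed ("t") extension.
function languageSubtagPositions(localeId) {
  const subtags = localeId.toLowerCase().split("-");
  // unicode_language_subtag = alpha{2,3} | alpha{5,8}
  const isLanguageSubtag = (s) => /^(?:[a-z]{2,3}|[a-z]{5,8})$/.test(s);
  const found = [];

  // The leading subtag of unicode_language_id.
  if (isLanguageSubtag(subtags[0])) {
    found.push({ index: 0, subtag: subtags[0], production: "unicode_language_id" });
  }

  // The optional tlang directly after the "t" singleton. A tfield key such
  // as "h0" contains a digit, so it does not match and is skipped.
  const t = subtags.indexOf("t", 1);
  if (t !== -1 && t + 1 < subtags.length && isLanguageSubtag(subtags[t + 1])) {
    found.push({ index: t + 1, subtag: subtags[t + 1], production: "tlang" });
  }
  return found;
}

console.log(languageSubtagPositions("en-t-en"));
```

For "en-t-en" this reports two matches, so spec text that says "the unicode_language_subtag" without naming the production is ambiguous for such ids.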