Change the CanonicalizeLanguageTag operation so that it removes duplicate attributes/keywords in Unicode locale extension sequences just as Intl.Locale does #83
LGTM! It looks like V8 is already removing duplicate keywords, so there shouldn't be any web-compat problems:
d8> Intl.getCanonicalLocales("de-u-ca-gregory-ca-islamicc")
["de-u-ca-gregory"]
And duplicate attributes don't seem to be supported in V8, which is a V8 bug, but also means we don't need to worry about web-compat issues.
d8> Intl.getCanonicalLocales("de-u-attr-attr")
(d8):1: RangeError: Invalid language tag: de-u-attr-attr
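For illustration, the duplicate removal discussed above could be sketched roughly as follows. This is a minimal sketch, not the spec text of the abstract operation: the helper name `dedupeUnicodeExtension` is hypothetical, it assumes the input is a single, already well-formed `u` extension sequence, and it keeps the first occurrence of each attribute and keyword, matching the V8 output shown above. It does not sort or otherwise canonicalize.

```javascript
// Hypothetical helper (not part of any spec or engine API).
// Assumes `extension` is one well-formed Unicode locale extension
// sequence such as "u-ca-gregory-ca-islamicc".
function dedupeUnicodeExtension(extension) {
  const subtags = extension.toLowerCase().split("-");
  if (subtags[0] !== "u") {
    throw new RangeError("Not a Unicode locale extension: " + extension);
  }

  const attributes = [];      // attribute subtags, in first-seen order
  const keywords = [];        // { key, type: [...] } in first-seen order
  const seenKeys = new Set();
  let current = null;         // keyword currently collecting type subtags
  let skipping = false;       // true while inside a duplicate keyword

  for (const subtag of subtags.slice(1)) {
    if (subtag.length === 2) {
      // A two-character subtag starts a new keyword.
      if (seenKeys.has(subtag)) {
        current = null;
        skipping = true;      // drop this duplicate key and its type
      } else {
        seenKeys.add(subtag);
        current = { key: subtag, type: [] };
        keywords.push(current);
        skipping = false;
      }
    } else if (current !== null) {
      current.type.push(subtag);
    } else if (!skipping && !attributes.includes(subtag)) {
      // Before the first key, longer subtags are attributes.
      attributes.push(subtag);
    }
  }

  const parts = ["u", ...attributes];
  for (const { key, type } of keywords) {
    parts.push(key, ...type);
  }
  return parts.join("-");
}

console.log(dedupeUnicodeExtension("u-ca-gregory-ca-islamicc"));
console.log(dedupeUnicodeExtension("u-attr-attr"));
```

Keeping the first occurrence rather than the last is a choice made here to agree with the d8 result for `de-u-ca-gregory-ca-islamicc`; sorting of attributes and keywords is deliberately left out of this sketch.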
…removes duplicate attributes/keywords in Unicode locale extension sequences just as Intl.Locale does. Fixes tc39#82.
121c725 to b8b8b78
This doesn't cover several steps in https://www.unicode.org/reports/tr35/tr35.html#Canonical_Unicode_Locale_Identifiers. Is that OK?
This is one step of two. This only deals with duplications. Sorting and removal of |
…canonical syntax.
Okay, I added a second commit here that additionally makes
|
lgtm!
@srl295 - can you verify that this change is in line with ICU thinking?
This PR is blocking stage advancement, and it looks like the PR is waiting for review from a subject matter expert. Is that correct?
This change makes sense to me and seems like it should be an improvement vs previous behavior. I wonder if we should consider "upstreaming" this algorithm into the Unicode document.
Let me see if I can get a review.
I think the abstract operation name is In CLDR, a Unicode BCP 47 locale identifier is defined separately from a BCP 47 language tag. And such canonicalization is within the scope of LDML/u-extension, not the BCP 47 language tag specification. I think removing duplicate keys is good and should be clarified in LDML, but this operation is for Unicode locale identifiers. If the abstract operation name is |
@yumaoka Thanks for the review. It sounds like you're saying that there are editorial cleanups that we should do, but that the semantics are appropriate. Is that accurate?
rename the operation to make it clear that it is not generic to BCP47 but specific to Unicode locale IDs (which are a strict subset of BCP47)? |
…ore-precise name.
Made the change to rename |
Thank you!
...wait, that's not it. I did a git and forgot to git add the changes, so that third commit doesn't actually make all the name-changes. :-( Sec, I'll fix. |
Well, in theory I can fix. I have force-pushed to the same branch in my fork, but it does not appear to be showing up here. I'll open a new PR for it.
Yeah, since I merged the PR, we need a new PR. Sorry for rushing the merge!
This fixes #82.