New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Project C++ enums to Intl.js, implement Intl.Collator's "usage" option #5651
Project C++ enums to Intl.js, implement Intl.Collator's "usage" option #5651
Conversation
f590d1b
to
ee1b45a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks Jack.
cc: @Microsoft/target-dev as FYI.
int collationsCount = uenum_count(collations, &status); | ||
|
||
// we expect collationsCount to have at least "standard" and "search" in it | ||
ICU_ASSERT(status, collationsCount > 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't if it is guaranteed that all locales will have 2 (unique to the locale) collation types. I think that in some cases the "search" might be aliased to the "default/standard" collation type.
However, that said, theres fallback to the root data well, so they might always have at least two, but based on falling back to root.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While debugging, I printed out all of the collation values for a few different locales and they all had at least two; if they don't, then our logic below could try to create a JavascriptArray of -1 elements (only if it has 0 collation values, which maybe is fine?), which would just failfast elsewhere (I hope...)
The Intl spec says that every locale has SearchLocaleData and SortLocaleData and usage must accept both search and sort for all locales, so I think the API was designed with the assumption that everyone would have both. It seems like a rare enough case, at least, that I would be interested in being alerted via a defensive failfast if at some point it doesn't hold.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Jack. To clarify I mean that not every individual locale (in and of itself) would necessarily have multiple collation types. (Like for example "en" might not anything specific to that locale at all).
Since there is fallback to the root data, the net effect is that yes I think everybody would have multiple.
const char16 searchString[] = _u("search"); | ||
if (usage->BufferEquals(searchString, _countof(searchString) - 1)) // minus the null terminator | ||
{ | ||
uloc_setKeywordValue("collation", "search", localeID, _countof(localeID), &status); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yeah, it is probably better to only set the "search" type, that way the defaults for the locale would be used for the "standard" type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the intl spec isn't super clear about whether usage=search or -u-co takes precendence since its working at a different abstraction level with the SortLocaleData and SearchLocaleData; if nothing else, SpiderMonkey refuses to accept -u-co when usage == search
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it does seem a bit odd.. I was wondering about the following scenario, though perhaps it is an edge case.
For example, what if (somehow) the default user locale was set to something other than the standard type. Then if the usage is "standard", would it make sense to change it to force the standard type? If yes, then how would you get the "default" type? Seems a bit odd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though I guess you have to include the locale in the parameters first, before you can specify the options -- so maybe my scenario doesn't make sense, unless you can somehow pass "und" as the locale, and it picks up the host OS default locale.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also good point about what takes precedence as well.
Ex:
Intl.Collator('de-DE-u-co-phonebk', { usage: 'search' })
Will this give "search" type collation, or "phonebook" type collation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for any confusion -- it was more of a question to @jackhorton about how the code/change currently works in the pull-request at the moment.
Right now the code only looks to see if the caller of the Intl API user passed in the "usage: search
" in the options bag. If the caller passes in "usage: sort
" then it doesn't explicitly set anything, and will use the locale/language tag "as-is".
My question was perhaps it should also explicitly set the collation keyword in the "usage: sort
" case as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding was that to ICU, en@collation=standard
was ~equivalent to en
, so we don't need to do anything special in that case (since standard === sort === the default for both Intl and ICU)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, what about something like this:
new Intl.Collator('es-u-co-trad', { usage: 'sort' })
If "sort === standard", and the options bag takes precedence, it sounds like in the above case you should be effectively getting 'es-u-co-standard' instead of 'traditional' -- right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh... everyone (chakra/v8/sm) currently agrees that results in locale === es-u-co-trad and usage === sort. But I see your point that if usage is specified it should override -u-co, even if the specified value is sort... I can bring it up on the original tc39 issue that prompted me to implement usage, or bring it up at the meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can bring it up on the original tc39 issue that prompted me to implement usage, or bring it up at the meeting.
Yeah, would be good to discuss it. I guess the issue is that we don't want to alter unicode extensions you passed as we give them back in resolvedOptions. But maybe we should remove them? (those are the locales we matched, and those are the options we matched)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
ee1b45a
to
962624d
Compare
…l.Collator's "usage" option Merge pull request #5651 from jackhorton:intl/projected-enums-and-usage Also, some assorted fixes and cleanup: - We now correctly set the initSlotCapacity of the Intl native interface object - Moved around some of the asserts in GetLocaleData to be more defensive. None hit, just noticed they had gaps. - Added caching to Number.prototype.toLocaleString, which is now ~O(1) when calling with default arguments (locally on a debug build, time spent calling toLocaleString 100 times went from 140ms to 20ms) Fixes #5648 Fixes #4007
Also, some assorted fixes and cleanup:
Fixes #5648
Fixes #4007