Clarify "combining options" support in the spec #26
Comments
What is allowed to be implementation-dependent about it, and what's the rationale for that? |
This has to do with the "LocaleData" internal field. This is the part of the spec that is really long and wordy. I made a flow chart to demonstrate the data specified to be in LocaleData: Once you're out the bottom of the flow chart (into negativePattern, zeroPattern, or positivePattern), then this line from the spec applies:
What this means: if a user requests to format a currency like GBP, or a unit like meter, the implementation is required to produce a string that has a placeholder for the currency sign or the unit name. For the notation, if the implementation doesn't know how to display a certain notation, that's fine as long as the implementation doesn't artificially change the magnitude of the number without providing a corresponding pattern. However, right now, the spec doesn't forbid them from doing that. FYI, the scale of the number is determined by ComputeExponentForMagnitude, which reads:
Probably what we need to do is:
Does that make any sense? Thoughts? |
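The "no information lost" concern in the comment above can be read as a per-pattern check: if the number was scaled by a non-zero exponent, the selected pattern should carry some placeholder that communicates that scaling. Below is a minimal TypeScript sketch of that reading. The function name `losesMagnitude` is hypothetical, and the placeholder names (`{compactSymbol}`, `{compactName}`, `{scientificExponent}`) are assumptions for illustration, not quoted spec text.

```typescript
// Hypothetical check, not spec text. A pattern "loses information" when the
// number was scaled by a non-zero exponent but the pattern contains no
// placeholder that tells the reader about that scaling.
// The placeholder names below are assumed for illustration.
const MAGNITUDE_PLACEHOLDERS: string[] = [
  "{compactSymbol}",
  "{compactName}",
  "{scientificExponent}",
];

function losesMagnitude(pattern: string, exponent: number): boolean {
  if (exponent === 0) {
    return false; // nothing was scaled, so nothing can be lost
  }
  // Lossy if none of the magnitude placeholders appears in the pattern.
  return !MAGNITUDE_PLACEHOLDERS.some((p) => pattern.indexOf(p) !== -1);
}
```

Under this sketch, an implementation that does not know a notation can still return a plain `{number}` pattern, provided it also reports an exponent of 0.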
@sffc I did not have enough time to absorb all of this information today. I have set aside time tomorrow to look at this and just wanted to ping you with this update. Thanks for being descriptive and for the flow chart. |
I haven't reached a full conclusion on the topic, but I guess we could reuse the flow chart in the spec for better visualization? If the image is too big, maybe SVG is doable?
Seems valid.
I agree.
+1
I'm +1, but I guess we use just one notation? If we can't reuse Symbol and Name together, Back from the top post:
I'm in favor and I appreciate the proposed outcome. Hopefully it finds acceptance for possible compatibility across implementations. |
Traditionally, in ECMA-402, for things which are in ICU algorithms but not specified in CLDR, we specified the algorithm in ECMA-402 itself. This can be helpful for people who are writing polyfills based on the CLDR data, so they don't have to read ICU source. Would such an approach be possible here? |
Another problem is that there are locales that might not have a number in the unit or compact notation. For example, Hebrew and Somali and a few other languages conventionally use a singular word in place of the digit 1. For example, the compact notation in Somali for 1000 is "Kun" (not "1 Kun", just "Kun"). So, anything that generates a pattern with a placeholder does not technically work for all locales, based on what ICU does under the hood. LocaleData is really just a hack right now. I would prefer to see this overhauled, probably not just in Intl.NumberFormat but also across Intl. We should build it with the following ticket in mind:
Not sure; is there an accepted way to put images into the spec?
I followed the model from currencies, which was to use different placeholder symbols for symbols vs. names.
I already put in the spec the algorithm ICU uses for picking the exponent in scientific notation. I left the algorithm for picking the exponent in compact notation abstract. If I wanted to make that algorithm non-abstract, it would solve some of the problems in this thread, but it would mean adding another LocaleData tree specifically for encoding the compact notation. |
FYI, the algorithm for picking the exponent in compact notation is actually in the CLDR specification, but it requires data in order to execute on it: https://unicode.org/reports/tr35/tr35-numbers.html#Compact_Number_Formats |
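As a rough illustration of why the CLDR algorithm needs data to run, here is a minimal TypeScript sketch of picking the compact exponent from a table of patterns. The table entries are illustrative values shaped like CLDR short decimal formats for English, and the names `compactPatterns` and `compactExponent` are hypothetical. The rule it applies is the one described in the linked section: take the greatest type not exceeding the number, then drop as many zeros from that type as the pattern contains, less one.

```typescript
// Illustrative data shaped like CLDR short decimal formats (assumed values).
// Keys are powers of ten ("types"); the run of '0' in each pattern says how
// many integer digits stay visible after scaling.
const compactPatterns: Record<string, string> = {
  "1000": "0K",
  "10000": "00K",
  "100000": "000K",
  "1000000": "0M",
};

// Returns the power of ten to divide by for a number of the given magnitude
// (magnitude = floor(log10(|x|))). Returns 0 when no compact pattern applies.
function compactExponent(magnitude: number): number {
  // Walk down to the greatest type that is <= 10^magnitude.
  for (let m = magnitude; m >= 0; m--) {
    const pattern = compactPatterns[String(Math.pow(10, m))];
    if (pattern === undefined) {
      continue;
    }
    if (pattern === "0") {
      return 0; // an explicit "0" pattern means "do not compact"
    }
    const zeros = (pattern.match(/0/g) || []).length;
    // Remove the pattern's zeros from the type, then restore one.
    return m - zeros + 1;
  }
  return 0;
}
```

With this data, magnitudes 3 through 5 all map to exponent 3 and magnitude 6 maps to exponent 6, so 12345 would be scaled to 12.345 before the "00K" pattern is applied.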
Related: tc39/test262#2233 |
…attern As per tc39#26, the pattern hierarchy is `patterns` -> `signDisplay` -> `displayNotation` -> `zero/negative/positivePattern`. Currently in the spec it’s reversed (`patterns` -> `displayNotation` -> `signDisplay`). Reference implementation: https://github.com/formatjs/formatjs/blob/master/packages/intl-unified-numberformat/src/core.ts#L889
I'm really hoping this will be addressed before Stage 4. The current spec doesn't actually allow us (FormatJS) to write a 100% polyfill, because compact patterns are also ILD (which the spec doesn't allow room for in |
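The lookup order described in that commit message can be sketched as a nested table. This is a minimal TypeScript illustration, not the FormatJS code or spec text: the type names, the option values listed, the placeholder strings, and the sample data are all assumptions made for the example.

```typescript
// Leaf of the tree: one pattern per sign category.
type SignPatterns = {
  positivePattern: string;
  zeroPattern: string;
  negativePattern: string;
};

// Option values assumed for illustration.
type SignDisplay = "auto" | "always" | "never" | "exceptZero";
type DisplayNotation = "standard" | "scientific" | "compactShort" | "compactLong";

// patterns -> signDisplay -> displayNotation -> zero/negative/positivePattern
type PatternTree = {
  [S in SignDisplay]?: { [N in DisplayNotation]?: SignPatterns };
};

// Hypothetical sample data with assumed placeholder names.
const patterns: PatternTree = {
  auto: {
    standard: {
      positivePattern: "{number}",
      zeroPattern: "{number}",
      negativePattern: "{minusSign}{number}",
    },
    compactShort: {
      positivePattern: "{number}{compactSymbol}",
      zeroPattern: "{number}{compactSymbol}",
      negativePattern: "{minusSign}{number}{compactSymbol}",
    },
  },
};

// Walks the tree in the order above; returns undefined if a branch is missing.
function selectPattern(
  tree: PatternTree,
  signDisplay: SignDisplay,
  displayNotation: DisplayNotation,
  x: number
): string | undefined {
  const bySign = tree[signDisplay];
  const leaf = bySign && bySign[displayNotation];
  if (!leaf) {
    return undefined;
  }
  if (x === 0) {
    return leaf.zeroPattern;
  }
  return x < 0 ? leaf.negativePattern : leaf.positivePattern;
}
```

Reversing the two middle levels would not change which leaf is reached for a complete table; the order mainly matters for how the data is authored and how missing branches fall back.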
An idea I have is to create another layer in the tree to account for |
This issue is still open and I intend to address it before Stage 4.
The result of
I'll look into this. |
It's ILD in the sense that the tree comes from locale data, which is ILD. |
Sorry, I might have misunderstood this. So the compact pattern is in fact already ILD, but the issue here is that it's exponent- and plural-dependent (e.g. the |
This is hand-waved by the line,
So we basically say that the data is something like |
So I understand the symbol can be exponent/plural-dependent but we need some wording for the compact pattern to be as well. I believe this is more than the The pattern for A more extreme one would be |
Oh are you saying to just insert |
As written, yes, basically that's how you're intended to do it. That's something specific that we could clarify in the spec, that As written, we don't handle cases where the symbol moves around or removes the number field entirely. It's not the only case, though, where the spec doesn't handle all edge cases of rendered localized output. Intl.NumberFormat already has this pre-existing problem in currency formatting where the symbol position could replace the decimal separator (tc39/ecma402#241). So we're not really adding any new problems by not handling some of the compact notation edge cases in the spec. |
Gotcha thanks for the clarification! |
Just an update: I gave this a try and unfortunately ran into an edge case in production w/ |
You should include the space in |
OK. Great point. I'll see if there's a way to fix this in the spec without too big of a diff. |
I believe all the issues in this thread were addressed by #90. |
The "combining options" feature is allowed to be implementation-dependent, but the spec should at least require that no information is lost when processing it. Investigate how to do this in the data loading section of the spec text.