NumberFormat may give too much precision for some Numbers #128

Closed
littledan opened this issue Feb 12, 2017 · 32 comments
Assignees
Labels
c: numbers Component: numbers, currency, units s: in progress Status: the issue has an active proposal Small Smaller change solvable in a Pull Request web reality
Milestone

Comments

@littledan
Member

@Yaffle found a pretty interesting case for what the spec requires and wrote a test for it at tc39/test262#856 . The test fails on multiple implementations, and the web reality may be the more intuitive answer. See that thread for details.

@littledan
Member Author

It's not just large numbers: for decimals shown with up to 20 digits of precision, we may also get less precise answers. See the test here: tc39/test262#857

I wonder if the answer here is to have a default maxSignificantDigits setting which reflects the inherent precision of doubles. I can understand why ICU doesn't have this default: the same DecimalFormat class is used for 64-bit ints, which may have more significant figures (maybe we'll extend NumberFormat for a BigInt overload, which could make this idea not make sense, anyway). Or we could leave that setting the same, and just adopt whatever algorithm ICU is using here for finding the string of digits.
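
To make "the inherent precision of doubles" concrete: 17 significant decimal digits are always enough to round-trip a binary64 value, while 15 are not always enough. A minimal sketch using only standard `Number` methods (the helper name `roundTrips` is made up for illustration):

```javascript
// Hypothetical helper: does x survive a trip through `digits` significant decimal digits?
function roundTrips(x, digits) {
  return Number(x.toPrecision(digits)) === x;
}

const sample = 0.1 + 0.2; // the double nearest 0.30000000000000004
console.log(sample.toPrecision(15), roundTrips(sample, 15)); // 15 digits lose information here
console.log(sample.toPrecision(17), roundTrips(sample, 17)); // 17 digits identify the double
```

This only illustrates where a precision cap could sit; it does not say which cap the spec should choose.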

@littledan changed the title from "NumberFormat may give too much precision for large Numbers" to "NumberFormat may give too much precision for some Numbers" on Feb 12, 2017
@Yaffle

Yaffle commented Feb 14, 2017

(1.000000000000045).toLocaleString("ru",{"useGrouping":false,"minimumSignificantDigits":1,"maximumSignificantDigits":14})
===
"1,0000000000001"; // How?

UPDATE: this bug seems to have been fixed.

@littledan
Member Author

@Yaffle Yikes, that's awful. Seems like an ICU bug, since multiple browsers seem to just call out to ICU for all of this logic and they share the same bug. I reduced the test case to just calling ICU, and filed an issue upstream at http://bugs.icu-project.org/trac/ticket/12989 .

@Yaffle

Yaffle commented Feb 15, 2017

@littledan , thanks!
This case also gives "1" in IE and Edge, and "1,0000000000001" in other browsers.

@littledan
Member Author

This issue should probably be tagged "Web Reality", but I don't have tagging rights.

leobalter pushed a commit to tc39/test262 that referenced this issue Mar 1, 2017
Note:
12344501000000000487815444678311936 === 12344501000000000000000000000000000 for binary64 floating points;

Ref tc39/ecma402#128
leobalter pushed a commit to tc39/test262 that referenced this issue Mar 1, 2017
1) (123.44500) == 123.444999999999993179
2) (123.44500).toPrecision(5) === "123.44" gives correct value in Chrome and Firefox;

Ref tc39/ecma402#128
leobalter added a commit to tc39/test262 that referenced this issue Mar 1, 2017
@sffc
Contributor

sffc commented Mar 7, 2017

This is Shane from the ICU team. This is a known issue. Here's the upstream ticket:

http://bugs.icu-project.org/trac/ticket/11318

I'm working on a major overhaul of the number formatting pipeline in ICU, and this is going to be addressed as part of that effort.

@littledan
Member Author

littledan commented Mar 9, 2017

@sffc Great, thanks Shane! Just wondering, when this is done, will it be resolved within DecimalFormat, or will users need to upgrade to a different class?

@jungshik

@littledan It looks like it'll be resolved in DecimalFormat.

@caridy
Contributor

caridy commented Aug 10, 2017

We need a brave soul to work on this before we go to the committee for this.

@littledan
Member Author

Seems like we should add an upper limit on the right things related to numbers, to match web reality. Does anyone want to make a PR?

@sffc
Contributor

sffc commented Mar 16, 2018

FYI, ICU4C 61 now uses google/double-conversion to convert doubles into digits. It may resolve issues such as this one.

@littledan
Member Author

That's great that ICU is fixed now. Does the new algorithm correspond to Number.prototype.toString or is it something else?

The other side of the fix will be updating the specification text and test262 tests to verify that the right number of digits is output.

@sffc
Contributor

sffc commented Mar 16, 2018

It should in principle be the same algorithm, because it's the same library as used in V8 and elsewhere.

https://github.com/google/double-conversion

@anba
Contributor

anba commented Mar 16, 2018

This specific issue won't be fixed by ICU61, because the current algorithm still requires too much precision, just as mentioned in tc39/test262#856 (comment) ("[...] to allow some inaccuracy when x >= 1e+21.")

@littledan
Member Author

Good to know it's the same algorithm, though. That means we can probably just copy the text from Number.prototype.toString. Would anyone be interested in creating such a patch?

@sffc
Contributor

sffc commented Apr 7, 2018

It seems counterintuitive to me that the spec insists on showing digits outside of the range supported by an IEEE double, which is only 15-17 significant digits. The double 1E21, or 0x444B1AE4D6E2EF50 in IEEE bytes, should be rendered as 1000000000000000000000, not as 1000000000000000012906. The stuff at the end is just noise and is outside the range that IEEE Double was designed to support.
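
For anyone who wants to check a bit pattern like the one quoted here, a small sketch using the standard `DataView` API; the helper names `doubleToHex` and `hexToDouble` are assumptions made for illustration:

```javascript
// Hypothetical helpers for inspecting the raw IEEE 754 binary64 bits of a Number.
function doubleToHex(x) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x); // DataView is big-endian by default
  return view.getBigUint64(0).toString(16).padStart(16, "0");
}

function hexToDouble(hex) {
  const view = new DataView(new ArrayBuffer(8));
  view.setBigUint64(0, BigInt("0x" + hex));
  return view.getFloat64(0);
}

console.log(doubleToHex(1e21));               // bit pattern of the double 1E21
console.log(hexToDouble("444b1ae4d6e2ef50")); // the pattern quoted above, decoded
```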

@sffc
Contributor

sffc commented Apr 7, 2018

If ICU really needs to support un-rounded doubles, that will need to be a feature added to ICU. However, I would like to first see a discussion on whether or not this tc39/test262 spec test is actually the right behavior.

@Yaffle

Yaffle commented Apr 7, 2018

@sffc,

1000000000000000012906

How did you get this noise? 1E21 is stored exactly in double:

1E21=2**21 * 5**21 and 5**21 < 2**53, 2**21 goes to the exponent
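
The exactness argument can be checked with BigInt arithmetic. A short sketch that relies only on standard BigInt behaviour (`BigInt(number)` converts an integral Number to its exact mathematical value):

```javascript
// 5**21 fits in the 53-bit significand, so 1E21 = 5**21 * 2**21 needs no rounding.
console.log(5n ** 21n < 2n ** 53n);       // true
// The double 1E21 holds exactly 10**21, with no trailing noise digits.
console.log(BigInt(1e21) === 10n ** 21n); // true
console.log(BigInt(1e21).toString());     // "1000000000000000000000"
```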

@littledan
Member Author

@sffc It sounds like ICU's behavior is good, and we should look into a specification change to match it.

@sffc
Contributor

sffc commented Jan 9, 2019

I am still confused about test262 and Yaffle's interpretation of the spec.

The spec says:

Let n be an integer for which the exact mathematical value of n ÷ 10^f – x is as close to zero as possible. If there are two such n, pick the larger n.

It says nothing about that operation taking place in double space. In fact, the phrase "exact mathematical value" to me says, the result of the operation without floating-point noise.

@Yaffle, can you please clarify?

Implementation-wise, ICU 61 and higher use Google double-conversion such that conversion from numbers to strings is exact, not constrained by floating-point noise. As I said before, if ICU needs to support double values, that would be a feature that would need to be added to a future version of ICU.

FYI, @FrankYFTang opened tc39/test262#2027, which changes part of test262 to reflect the current (post-61) behavior of ICU.

@Yaffle

Yaffle commented Jan 9, 2019

@sffc,

(123.445).toLocaleString("en", {maximumFractionDigits:2})  // 123.45 in Chrome and Firefox

BUT here 123.445 is an IEEE 754 64-bit value; it is not a string.
That means it is 123.44499999999999317878973670303821563720703125.
So you have:

(123.44499999999999317878973670303821563720703125).toLocaleString("en", {maximumFractionDigits:2})

It is closer to decimal 123.44 than to 123.45.
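
The long expansion can be reproduced without trusting any formatting routine. A sketch, assuming positive normal doubles only; the helper names `decompose` and `exactDecimal` are hypothetical:

```javascript
// Split a positive, normal double so that x === significand / 2 ** shift exactly.
function decompose(x) {
  const view = new DataView(new ArrayBuffer(8));
  view.setFloat64(0, x);
  const bits = view.getBigUint64(0);
  const biasedExponent = Number((bits >> 52n) & 0x7ffn);
  const significand = (bits & ((1n << 52n) - 1n)) | (1n << 52n); // restore the hidden bit
  return { significand, shift: 1075 - biasedExponent };
}

// Exact decimal expansion of the value stored in x.
function exactDecimal(x) {
  const { significand, shift } = decompose(x);
  if (shift <= 0) return (significand << BigInt(-shift)).toString();
  // significand / 2**shift === significand * 5**shift / 10**shift
  const digits = (significand * 5n ** BigInt(shift)).toString().padStart(shift + 1, "0");
  const integerPart = digits.slice(0, -shift);
  const fractionPart = digits.slice(-shift).replace(/0+$/, "");
  return fractionPart ? `${integerPart}.${fractionPart}` : integerPart;
}

console.log(exactDecimal(123.445)); // the long form discussed above
```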

@sffc
Contributor

sffc commented Jan 9, 2019

The floating-point representation of 123.445 is an implementation detail; that's the point I'm trying to make. Every IEEE double has a one-to-one mapping with a short form of 15-17 significant digits. In this case, the bits 0x405EDC7AE147AE14 map to 123.445 (short form) or 123.444999999999993178789736703~ (long form). I do not see why you think we should use the long form for the rounding logic.

@Yaffle

Yaffle commented Jan 9, 2019

  1. JavaScript says that a Number is a 64-bit binary floating-point value.
  2. This format can store the value 2171667406252933 / 2**44 exactly. So this is the exact mathematical value used when the JavaScript parser converts "123.445" into the number representation format.
  3. The spec says nothing about the short form. It says: "Let n be an integer for which the exact mathematical value of n ÷ 10^f – x is as close to zero as possible. If there are two such n, pick the larger n."
  4. The exact mathematical "123.44" is closer to the exact mathematical "2171667406252933 / 2**44" than the exact mathematical "123.45" is.
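
The four points above can be checked with exact rational arithmetic. A minimal sketch for positive values, where `nearestScaledInteger` is a hypothetical helper that mirrors the quoted spec sentence (closest n, ties pick the larger n):

```javascript
// Hypothetical helper: the n for which n / 10**f is closest to num / den.
// num, den and f are BigInts; positive values only.
function nearestScaledInteger(num, den, f) {
  const scaled = num * 10n ** f;
  const quotient = scaled / den; // BigInt division truncates
  const remainder = scaled % den;
  return 2n * remainder >= den ? quotient + 1n : quotient;
}

// The exact value stored for the literal 123.445 (point 2).
const num = 2171667406252933n;
const den = 2n ** 44n;
console.log(Number(num) / 2 ** 44 === 123.445);        // true: it is the same double
console.log(nearestScaledInteger(num, den, 2n));       // 12344n, i.e. "123.44"
// The decimal short form 123.445 = 123445 / 1000 is an exact tie, so it rounds up.
console.log(nearestScaledInteger(123445n, 1000n, 2n)); // 12345n, i.e. "123.45"
```

The two results differ only because the second call starts from the decimal short form rather than from the stored binary value.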

@Yaffle

Yaffle commented Jan 9, 2019

one-to-one mapping

The spec does not say to do the mapping first and then the rounding; doing so leads to "double rounding".

@sffc
Contributor

sffc commented Jan 9, 2019

Okay. I also noticed that Number.prototype.toPrecision has the same language:

Let e and n be integers such that 10^(p–1) ≤ n < 10^p and for which the exact mathematical value of n × 10^(e–p+1) – x is as close to zero as possible. If there are two such sets of e and n, pick the e and n for which n × 10^(e–p+1) is larger.

https://www.ecma-international.org/ecma-262/9.0/index.html#sec-number.prototype.toprecision

The web reality of toFixed and toPrecision (not the intl versions) is:

[3.15, 30.15, 300.15, 3000.15, 30000.15, 300000.15, 3000000.15].forEach((num) => {
    console.log(num.toFixed(1) + " " + num.toPrecision(21));
});

Output (Chrome and Firefox agree):

3.1 3.14999999999999991118
30.1 30.1499999999999985789
300.1 300.149999999999977263
3000.2 3000.15000000000009095
30000.2 30000.1500000000014552
300000.2 300000.150000000023283
3000000.1 3000000.14999999990687

So, the web reality of the non-intl versions of these functions is to treat doubles as their long form when rounding. This is of course different from the web reality for intl, which assumes the numbers should be treated as if they were equivalent to their short form:

[3.15, 30.15, 300.15, 3000.15, 30000.15, 300000.15, 3000000.15].forEach((num) => { 
    console.log(num.toLocaleString("en", { maximumFractionDigits: 1 }) + " "
        + num.toLocaleString("en", { maximumSignificantDigits: 21 }));
});

Output (Chrome/FF):

3.2 3.15
30.2 30.15
300.2 300.15
3,000.2 3,000.15
30,000.2 30,000.15
300,000.2 300,000.15
3,000,000.2 3,000,000.15

I personally find the web reality Intl behavior more intuitive.

Was this ever discussed in TC39 at the language level? Are there backlinks to if, how, and when it was decided to make the non-intl functions use the floating-point behavior?

@sffc
Contributor

sffc commented Jan 12, 2019

Although I can see some reasons why one could make an argument to change Intl.NumberFormat to match the behavior of Number.prototype, here are some reasons to keep the status quo.

  1. Web reality. (Always a plus.)
  2. Intl.NumberFormat is intended to be "human friendly", and I claim that most humans would be surprised that 3.15 rounds down due only to a computer precision issue.
  3. The "default" behavior for stringifying a Number is Number.prototype.toString, which uses the short form, so one could argue that the default behavior of Intl.NumberFormat should also be to use short form, possibly with long form available via an override.
  4. There would not otherwise be a way in Intl.NumberFormat to produce the short form.
  5. This feature is not in ICU and will require some work, possibly coming at performance costs.

I could almost see an argument that when fraction digits are set, short form rounding is used, and when significant digits are set, long form rounding is used. This would retain benefits 2-4. However, we would lose benefits 1 and 5.

I will bring this up at next week's Ecma 402 meeting.

@sffc
Contributor

sffc commented Jan 19, 2019

The resolution from the Ecma 402 meeting is that the web reality behavior is acceptable and we should change the spec to allow for this behavior. It was suggested that the spec can specify that the direction of half rounding can be implementation dependent.

@sffc added the "c: numbers" (Component: numbers, currency, units), "Small" (Smaller change solvable in a Pull Request), and "s: help wanted" (Status: help wanted; needs proposal champion) labels and removed the "help wanted" label on Mar 19, 2019
@sffc modified the milestones: 3rd Edition, ES 2021 on Mar 20, 2020
@sffc
Contributor

sffc commented Mar 20, 2020

I'd like to see any required spec work completed in time for ES 2021.

Also relevant: a similar topic came up when discussing https://github.com/tc39/proposal-decimal at TC39 on 2020-02-04. CC @waldemarhorwat.

@sffc added the "s: in progress" (Status: the issue has an active proposal) label and removed the "s: help wanted" (Status: help wanted; needs proposal champion) label on Jun 5, 2020
@sffc self-assigned this on Jun 5, 2020
@Yaffle

Yaffle commented Jun 20, 2020

So, it seems it works as if number.toString() is called first, and then the resulting decimal is formatted.
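
That two-step model can be written out as a rough sketch. It is only a model of the observed behaviour, not ICU's or any engine's actual code; `roundShortForm` is a hypothetical helper and assumes a positive x whose toString() is plain decimal notation (no exponent):

```javascript
// Hypothetical model: take the shortest decimal string, then round it half up.
function roundShortForm(x, fractionDigits) {
  const [intPart, fracPart = ""] = x.toString().split(".");
  const padded = fracPart.padEnd(fractionDigits + 1, "0");
  let n = BigInt(intPart + padded.slice(0, fractionDigits));
  if (padded[fractionDigits] >= "5") n += 1n; // the first dropped digit decides
  const digits = n.toString().padStart(fractionDigits + 1, "0");
  if (fractionDigits === 0) return digits;
  return digits.slice(0, -fractionDigits) + "." + digits.slice(-fractionDigits);
}

// Short-form rounding next to Number.prototype.toFixed, which rounds the exact binary value.
console.log(roundShortForm(3.15, 1), (3.15).toFixed(1));       // "3.2" vs "3.1"
console.log(roundShortForm(123.445, 2), (123.445).toFixed(2)); // "123.45" vs "123.44"
```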

@sffc
Contributor

sffc commented Sep 18, 2023

We spent a great bit of time working in this problem space in the context of the Intl NumberFormat V3 proposal, which reached Stage 4 earlier this year. You can see the results of some of this effort here:

https://tc39.es/ecma402/#sec-tointlmathematicalvalue

I believe that ToIntlMathematicalValue resolves this issue involving the precision of Numbers in Intl.NumberFormat. If you disagree, please comment or re-open the issue.

@sffc closed this as completed on Sep 18, 2023
6 participants