Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

double.ToString("R") not working for certain number on x64 #13106

Closed
mazong1123 opened this issue Jul 29, 2017 · 19 comments

Comments

@mazong1123
Copy link
Collaborator

commented Jul 29, 2017

Extract this issue from #10651 (comment)

Generally speaking, double.ToString("R") tries to:

  1. Convert the double to string in 15 digits precision.
  2. Convert the string back to double and compare to the original double. If they are the same, we return the converted string whose precision is 15.
  3. Otherwise, convert the double to string in 17 digits precision.

This introduced a bug and also listed as a known issue on MSDN (See "Notes to Callers"):
https://stackoverflow.com/questions/24299692/why-is-a-round-trip-conversion-via-a-string-not-safe-for-a-double
https://msdn.microsoft.com/en-us/library/kfsatb94(v=vs.110).aspx

To reproduce this issue, simplely try following code:

using System;

namespace lab
{
    class Program
    {
        static void Main(string[] args)
        {
            double d1 = 0.84551240822557006;
            string s = d1.ToString("R");
            double d2 = double.Parse(s);
            Console.WriteLine(d1 == d2);
        }
    }
}

The output is False which we expect True. The reason is d1.ToString("R") wrongly chose the result in precision of 15, which is 0.84551240822557. If we choose result in precision of 17, which is 0.84551240822557006, we can convert it back to double accurately.

The proposal is just convert the double to string in 17 digits precision directly. We can give it a try but it may be a breaking change with compat issue as @jkotas said here: #10651 (comment)

@tannergooding

This comment has been minimized.

Copy link
Member

commented Jul 29, 2017

For the bug: 0.84551240822557 is 3FEB0E7009B61CDF and 0.84551240822557006 is 3FEB0E7009B61CE0, so the delta for their bit representations is 1.

So, it would seem that there is an off-by-one error somewhere in NumberToDouble.

I would think we should find this off-by-one error before taking any other course of action.

@tannergooding

This comment has been minimized.

Copy link
Member

commented Jul 29, 2017

It looks like we are taking the incorrect path here: https://github.com/dotnet/coreclr/blob/master/src/classlibnative/bcltype/number.cpp#L534.

Rounding to even in this scenario causes us to generate the incorrect value.

In any case, It's late and I'm not going to finish processing exactly what it is doing wrong tonight.

@tannergooding

This comment has been minimized.

Copy link
Member

commented Jul 29, 2017

Okay, the rounding logic is correct, so the error must be earlier on...

val is 0xD873804DB0E6FC00, this is the mantissa (0xB0E7009B61CDF) + the guard bit.

The logic for 'round to even' is, do nothing if the guard bit is 0 or if the guard bit is 1 and the LSB of the mantissa is 0. In this case, both the guard bit and the LSB of the mantissa are 1, so it rounds up setting val to 0xD873804DB0E70000.

I'm going to keep digging to see if the guard bit is being set incorrectly at some point.

@tannergooding

This comment has been minimized.

Copy link
Member

commented Jul 29, 2017

Ok. Implementation looks and behaves correctly, as best as I can tell.

After going through the IEEE spec again, it comes down to these two rules:

Conversions from a supported binary format bf to an external character sequence and back again results in a copy of the original number so long as there are at least Pmin (bf ) significant digits specified and the rounding-direction attributes in effect during the two conversions are round to nearest rounding-direction attributes.

Conversions from a supported decimal format df to an external character sequence and back again results in a canonical copy of the original number so long as the conversion to the external character sequence is one that preserves the quantum

For double-precision, the first rule is essentially: 'a double converted to a string with at least 17 digits and back must match'
The second rule is essentially: a string with 15 or less digits converted to a double and back to a string with the same number of digits must match

Here, we are only converting from a double to a string with 15 digits and back, so we are not meeting the requirements put forth in the IEEE spec.

@tannergooding

This comment has been minimized.

Copy link
Member

commented Jul 29, 2017

IMO, it is worth doing one of the following:

  1. Taking the breaking change and adding a quirk on .NET Framework if possible
    -or-
  2. Adding a new standard numeric format string for IEEE Compliant round-tripping (which always uses 17 digits)
@tarekgh

This comment has been minimized.

Copy link
Member

commented Jul 29, 2017

I believe for net core we can go ahead and take the breaking change there without any quirks. for the desktop, if we need to apply the fix there we have to introduce it with a quirk.

The second option listed by @tannergooding is possible too but I prefer to fix "R" to do the right thing as introducing a new format specifier will make "R" useless anyway.

CC @AlexGhiondea

@mazong1123

This comment has been minimized.

Copy link
Collaborator Author

commented Jul 30, 2017

@tannergooding are you starting fixing this? If not, I can help working for this issue when the option decided.

@tannergooding

This comment has been minimized.

Copy link
Member

commented Jul 30, 2017

@mazong1123, not yet. I was going to wait for a couple more people (those tagged already) to chime in first (with regards to back-compat concerns).

@tarekgh

This comment has been minimized.

Copy link
Member

commented Jul 30, 2017

As I mentioned we can apply the fix in .Net Core without any quirks. if we want to apply this fix from now, we need to ensure there is no perf regression when using "R' options.

CC @jkotas @AlexGhiondea

@mazong1123

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 1, 2017

@tannergooding are you working on this? If not, I'd like to send a PR to see if it would break anything.

@tarekgh

This comment has been minimized.

Copy link
Member

commented Aug 1, 2017

@mazong1123 go ahead and submit the PR. please try to measure the perf with your changes to know always formating with 17 significant digits can change the perf.

@tannergooding

This comment has been minimized.

Copy link
Member

commented Aug 22, 2018

Tagging all the people who were involved in #13140, plus a couple others. @eerhardt, @jkotas, @danmosemsft, @tarekgh, @AlexGhiondea, @gafter, @joperezr, @JamesNK, @terrajobst

As indicated by this issue, the current handling of "R" is incorrect and results in non-roundtrippable values for certain scenarios. The current handling is:

  • Attempt to convert to G15 digits
  • Convert back
  • If the value converted back matches, return
  • Otherwise, attempt to convert to G17 and return that value

This also causes a fairly decent perf hit, as compared to just converting to G17 directly.

The people who are using "R" today, are likely doing so for serialization purposes (they want to be able to round-trip the result) and they would be broken for any of the values that don't currently round-trip successfully. There are also likely some repos (like Newtonsoft.JSON, but possibly also CoreFX, CoreCLR, and Roslyn) which are reliant on the current behavior. Some examples would be in tests where they are directly comparing the "expected" vs "actual" string produced (rather than parsing the strings back to floating-point values and validating that those are equal).

My opinion is that we should take this breaking change, as it results in more correct/compliant behavior; However, I believe this is a large enough breaking change that we should also provide a "quirk" switch and ensure that the change is properly documented/announced to the public (with the reasoning of why the change is being made).

@tarekgh

This comment has been minimized.

Copy link
Member

commented Aug 22, 2018

I think we need to think in the scenario exchanging data between the .Net Core and the full framework. Serialization scenario can cross the framework versions. I think having the opt-out switch is reasonable but I heard before we use the switches in .Net Core only in scenarios prevents migration to .Net Core.

@tannergooding

This comment has been minimized.

Copy link
Member

commented Aug 22, 2018

I think the general point is that serialization can occur anywhere, and incorrect serialization is bad.

We are already in a world where CoreFX is doing the wrong thing, which means we are broken for converting to/from anything that isn't also doing the same broken behavior. As an example, serializing from C# and then deserializing that value in C++ (or vice versa) can wind up with different results (and may not be round-trippable).

My preference is to ensure we are IEEE compliant here, and actually round-trippable with anything else that is IEEE compliant (when using the "round-trippable" specifier).

@jkotas

This comment has been minimized.

Copy link
Member

commented Aug 22, 2018

My preference is to ensure we are IEEE compliant here

Agree.

we use the switches in .Net Core only in scenarios prevents migration to .Net Core

We add quirk switches to .NET Core re-actively, only once we get feedback that they are really needed for something important. We do not add them proactively like in .NET Framework just because of they might be needed in theory.

@tannergooding

This comment has been minimized.

Copy link
Member

commented Aug 22, 2018

For reference, the relevant bits from the IEEE spec are under "5.12 Details of conversion between floating-point data and external character sequences":

Some excerpts that are particularly indicative of the requirements are:

Implementations shall provide conversions between each supported binary format and external decimal character sequences such that, under roundTiesToEven, conversion from the supported format to external decimal character sequence and back recovers the original floating-point representation, except that a signaling NaN might be converted to a quiet NaN.

For the purposes of discussing the limits on correctly rounded conversion, define the following quantities:
― for binary16, Pmin (binary16) = 5
― for binary32, Pmin (binary32) = 9
― for binary64, Pmin (binary64) = 17

Conversions to and from supported decimal formats shall be correctly rounded regardless of how many digits are requested or given.

Conversions from a supported binary format bf to an external character sequence and back again results in a copy of the original number so long as there are at least Pmin (bf) significant digits specified and the rounding-direction attributes in effect during the two conversions are round to nearest rounding-direction attributes

There are others covered in the pages as well, but the above should cover the current issue.

@tannergooding

This comment has been minimized.

Copy link
Member

commented Aug 29, 2018

@jkotas, @danmosemsft.

What (if anything) else do we need to do here to get such a breaking API change approved?

Additionally, once (and if) this is approved, what needs to be done to communicate the breaking change to things like Newtonsoft.JSON (where they are known to be impacted, but will need to take a fix)? -- CC @JamesNK

@JamesNK

This comment has been minimized.

Copy link
Member

commented Aug 29, 2018

I don't think this would break Newtonsoft.Json, it would just change the value of serialized doubles. Of course that could break people who depend on the current serialized value.

Only action in Newtonsoft.Json would likely be fixing some unit tests where the serialized double value has changed.

@tannergooding

This comment has been minimized.

Copy link
Member

commented Feb 1, 2019

Now that #22040 is merged, for .NET Core 3.0, ToString(), ToString("G"), ToString("R"), double.ToString("G17"), and float.ToString("G9") should now always produce a string that will roundtrip to the original value..

I am closing this as #3313 is tracking the addition of an explicit testcase for 0.84551240822557006.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.