Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Double doesn't Parse correctly #17467

Closed
cyberphone opened this issue Apr 7, 2018 · 28 comments

Comments

@cyberphone
Copy link

commented Apr 7, 2018

While doing some mathematical stuff that worked fine in Java I found that C# / .NET does a rather poor job on parsing double values:

Number2Parse=4.3413853813852797e+192 IEEE754=67ee7316b1545878 C#=4.3413853813852797E+192
Number2Parse=4.34138538138528e+192 IEEE754=67ee7316b1545878 C#=4.3413853813852797E+192

You can verify in any browser that 4.3413853813852797e+192 and 4.34138538138528e+192 are distinct values. Version: CLR v4.0.30319

using System;

namespace mathbug
{
    class Program
    {
        static void ShowNumber(string number2Parse)
        {
            double d = Double.Parse(number2Parse);
            ulong ieee754 = (ulong)BitConverter.DoubleToInt64Bits(d);
            Console.WriteLine("Number2Parse=" + number2Parse + " IEEE754=" + Convert.ToString((long)ieee754, 16) + " C#=" + d.ToString("G17"));
        }
 
        static void Main(string[] args)
        {
            ShowNumber("4.3413853813852797e+192");
            ShowNumber("4.34138538138528e+192");
        }
    }
}
@danmosemsft

This comment has been minimized.

Copy link
Member

commented Apr 7, 2018

@tannergooding is this expected?

@cyberphone

This comment has been minimized.

Copy link
Author

commented Apr 7, 2018

This is what a Java equivalent does:

Number2Parse=4.3413853813852797e+192 IEEE754=67ee7316b1545878 Java=4.3413853813852797E192
Number2Parse=4.34138538138528e+192 IEEE754=67ee7316b1545879 Java=4.34138538138528E192
@tannergooding

This comment has been minimized.

Copy link
Member

commented Apr 7, 2018

@danmosemsft, no. There was quite a bit of work to make double.ToString accurate (and more performant).

However, it looks like there is a bug in the reverse case (converting string to double). The relevant code is here: https://source.dot.net/#System.Private.CoreLib/shared/System/Number.Parsing.cs,458977730d4a33f6, but I haven't dug in to find out what is wrong.

@tannergooding

This comment has been minimized.

Copy link
Member

commented Apr 7, 2018

The other relevant code is here: https://source.dot.net/#System.Private.CoreLib/shared/System/Number.Parsing.cs,1e8cf7c499ee514a

and here:

void NumberToDouble(NUMBER* number, double* value)

@Suchiman

This comment has been minimized.

Copy link
Collaborator

commented Apr 7, 2018

On my machines (i5-4670K, i5-6300U), on Debug and Release with netcoreapp2.0, netcoreapp2.1 and net4.7.1 this actually produces

Number2Parse=4.3413853813852797e+192 IEEE754=6b40e724b8d6a5ee C#=4,34138538138528E+208
Number2Parse=4.34138538138528e+192 IEEE754=6ad5a2be5d3bb5ac C#=4,3413853813852803E+206
@AlexGhiondea

This comment has been minimized.

Copy link
Member

commented Apr 12, 2018

@tarekgh could you help with this one?

@AlexGhiondea AlexGhiondea added this to the 2.1.0 milestone Apr 12, 2018

@tannergooding

This comment has been minimized.

Copy link
Member

commented Apr 12, 2018

@Suchiman, what locale are you? I'd like to try and reproduce your issue, since it doesn't appear to reproduce under my default.

@tarekgh

This comment has been minimized.

Copy link
Member

commented Apr 12, 2018

@cyberphone would be nice if you can share more info regarding your environment. I mean OS version, you user locale, your machine architecture. we just need to repro it first and then we can try to help to tell what is going on.

@tannergooding

This comment has been minimized.

Copy link
Member

commented Apr 12, 2018

@tarekgh, this repros on a clean Windows 10, English machine. There is definitely a bug somewhere in our the CoreCLR native parsing layer (see #17467 (comment)), but I didn't dig further into it.

@cyberphone

This comment has been minimized.

Copy link
Author

commented Apr 12, 2018

@tarekgh I have Windows 10 Pro, Version 1709, Build 16299.371, US Locale
running on a Dell XPS laptop with i7-4702HQ processor

@tarekgh

This comment has been minimized.

Copy link
Member

commented Apr 12, 2018

@cyberphone Thanks, I am able to repo it too now.

@jkotas

This comment has been minimized.

Copy link
Member

commented Apr 12, 2018

#13615 is related/dup.

@tarekgh

This comment has been minimized.

Copy link
Member

commented Apr 12, 2018

Thanks, @jkotas for pointing to the other issue. I am seeing the direction is to port the parsing code from Roslyn to coreclr as mentioned in the comment #13615 (comment), I'll move this issue to the future milestone and leave it open to track porting the parsing code.

@tarekgh tarekgh modified the milestones: 2.1.0, Future Apr 12, 2018

@tarekgh tarekgh removed their assignment Apr 12, 2018

@tarekgh tarekgh added the bug label Apr 12, 2018

@tarekgh

This comment has been minimized.

Copy link
Member

commented Apr 12, 2018

A quick note too, I am seeing Roslyn implementation is using BigInteger class which I am not sure if there will be a perf issue there or we need to get optimized version.

@jkotas

This comment has been minimized.

Copy link
Member

commented Apr 12, 2018

I am seeing the direction is to port the parsing code from Roslyn to coreclr

This is one possible direction. I think we should do our homework similar to what we have done for formatting. See what is the state of the art out there, and then pick the best option.

@tannergooding

This comment has been minimized.

Copy link
Member

commented Apr 12, 2018

A quick note too, I am seeing Roslyn implementation is using BigInteger class which I am not sure if there will be a perf issue there or we need to get optimized version.

The CoreCLR native code uses a highly optimized (and limited) BigInteger implementation

@cyberphone

This comment has been minimized.

Copy link
Author

commented Apr 13, 2018

Delete this!. I ran another program which was incomplete.

PARDON

@cyberphone

This comment has been minimized.

Copy link
Author

commented Apr 13, 2018

@tarekgh Delete the above,I erred :-(

BTW, I have a file with 100 million random and specific test values. This is how I found the problem. My application depends on perfect "roundtrips" including compatibility with ES6/V8.

This is my application:
https://cyberphone.github.io/doc/security/draft-rundgren-json-canonicalization-scheme.html#json.ser.number

It would be great if there was a Double.ToString("ES") method.

@Suchiman

This comment has been minimized.

Copy link
Collaborator

commented Apr 13, 2018

@tannergooding Windows 10 Pro 1709 de-DE.

@cyberphone

This comment has been minimized.

Copy link
Author

commented Apr 13, 2018

@Suchiman My simple test program didn't take locale in consideration so '.' doesn't get interpreted as expected in DE.

Double.Parse(number2Parse) must be changed to use invariant locale. Probably like:
Double.Parse(number2Parse, NumberStyles.Float, CultureInfo.InvariantCulture)

@Suchiman

This comment has been minimized.

Copy link
Collaborator

commented Apr 13, 2018

@cyberphone Wow 🤦‍♂️ i didn't pay attention to that, with , instead of . it gets the result described above 👍

@cyberphone

This comment has been minimized.

Copy link
Author

commented Apr 27, 2018

In case you are interested in 100M test values you may download the following file:
https://onedrive.live.com/embed?cid=9341770E0D0D5468&resid=9341770E0D0D5468%21222&authkey=ADOClRsuPv3_pTk

Associated test program:

using System;
using System.IO;
using System.Globalization;

namespace mathbug
{
    class Program
    {
        static void Main(string[] args)
        {
            using (StreamReader sr = new StreamReader("c:\\es6\\numbers\\es6testfile100m.txt"))
            {
                string line;
                // Read and display lines from the file until the end of 
                // the file is reached.
                long counter = 0;
                while ((line = sr.ReadLine()) != null)
                {
                    string origIeeeHex = line.Substring(0, line.IndexOf(','));
                    while (origIeeeHex.Length < 16)
                    {
                        origIeeeHex = '0' + origIeeeHex;
                    }
                    ulong origIeeeBin = Convert.ToUInt64(origIeeeHex, 16);
                    double origIeee = BitConverter.Int64BitsToDouble((long)origIeeeBin);
                    string number2Parse = line.Substring(line.IndexOf(',') + 1);
                    if (++counter % 100000 == 0)
                    {
                        Console.WriteLine("Count=" + counter);
                    }
                    double parsedIeee = double.Parse(number2Parse, NumberStyles.Float, CultureInfo.InvariantCulture);
                    String parsedIeeeHex = Convert.ToString(BitConverter.DoubleToInt64Bits(parsedIeee), 16);
                    while (parsedIeeeHex.Length < 16)
                    {
                        parsedIeeeHex = '0' + parsedIeeeHex;
                    }
                    bool roundTripOk = parsedIeee.ToString("G17").Equals(origIeee.ToString("G17"));
                    if (origIeee != parsedIeee || !roundTripOk)
                    {
                        Console.WriteLine("Number2Parse={0,-24:S} C#={1,-24:S} Original=" + origIeeeHex + 
                                          " Parsed=" + parsedIeeeHex +
                                          " Roundtrip OK=" +roundTripOk, number2Parse, origIeee.ToString("G17"));
                    }
                }
            }
        }
    }
}
@cyberphone

This comment has been minimized.

Copy link
Author

commented Jan 6, 2019

Hi Guys,
Since this seems to (finally) be headed for IETF standardization
https://tools.ietf.org/html/draft-rundgren-json-canonicalization-scheme-02
it would be interesting knowing when the fix reaches W10 in an .NET update because even if I do a local fix in my software, the standard JSON tools will have this problem, making signature validation fail

@danmosemsft

This comment has been minimized.

Copy link
Member

commented Jan 6, 2019

@cyberphone the fix is in master which means it will be in.NET Core 3.0. Another preview will be out in not long. There are no plans to port to.NET Framework, I think.

@cyberphone

This comment has been minimized.

Copy link
Author

commented Jan 6, 2019

@danmosemsft thanx!

@cyberphone

This comment has been minimized.

Copy link
Author

commented Jan 6, 2019

And it is working as well. SUPER!

@danmosemsft

This comment has been minimized.

Copy link
Member

commented Jan 6, 2019

That is great @cyberphone. Any interest in making a contribution of your own perhaps? Lots of up for grabs labeled issues. :)

@cyberphone

This comment has been minimized.

Copy link
Author

commented Jan 7, 2019

@danmosemsft Well, there are contributions and contributions...
https://github.com/cyberphone/json-canonicalization/tree/master/dotnet#json-canonicalizer-for-net
It contains among many things a 2000-line ES6-compatible number serializer. Anyway thanx again, it was a great relief for this project as a whole.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.