jsonlint changes float values #24
I'm seeing the following unexpected behaviour:
I have a data structure which contains floats, like
After running jsonlint -s this gets changed to
which is a different value and causes a subsequent run of jsonlint to report the warning:
The parameter --keep-format doesn't influence this behaviour.
I would expect the linter not to change the value of floats and generate output which would not contain warnings when run through the linter again.
I'm not seeing that myself with the current version (2.2.4),
Can you report the jsonlint version and other platform information? The output from running:
Also you might try adding the
Demo of the issue on my system
My system is a RHEL 6.6 machine
Detailed stats info from jsonlint, rerunning the output back into jsonlint
Thanks for the output. The only thing that stands out at the moment is that it's using Python 2.6, which is quite ancient. Though demjson should still work with 2.6, I wonder if it has some slight difference in its floating point operations. I've just tested with Python 2.7 and don't see the problem. I'll need to set up a specific test environment that more closely matches yours to track this down. There's even
In the meantime, is it possible for you to install a newer Python environment (2.7, or even 3.*) for running demjson/jsonlint?
Sorry, on this particular environment I cannot install anything.
This issue is seemingly caused by Python 2.6. Notice the difference (all run on the same linux OS, so it's not a libc thing):
# Python 2.6.9
>>> repr( 108.9 )
'108.90000000000001'

# Python 2.7.3
>>> repr( 108.9 )
'108.9'

# Python 3.5.1
>>> repr( 108.9 )
'108.9'
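The difference can be reproduced on a single modern interpreter. This is a minimal sketch that assumes Python 2.6's repr() behaved like fixed 17-significant-digit formatting; that assumption is mine and is not stated in the thread. The names `short` and `long17` are illustrative only.

```python
x = 108.9

# Shortest string that converts back to the same double;
# this is what repr() returns on Python 2.7 and 3.x.
short = repr(x)

# Fixed 17 significant digits. ASSUMPTION: this approximates
# what Python 2.6's repr() produced for floats.
long17 = "%.17g" % x

print(short)
print(long17)

# Both strings name the same stored double, so no information differs,
# only the textual form.
print(float(short) == float(long17))
```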
Although demjson implements a complete custom parser for the input of JSON numbers, for output (generating JSON) it simply relies on Python's builtin
Note that most IEEE 754 double-precision floating point numbers (which the Linux C implementation of Python uses) have approximately 16 significant decimal digits. However, floats are represented in binary with a 53-bit significand, which does not correspond to a whole number of decimal digits. So to output a number, a compromise must be made: either round so as to avoid partial digits, or preserve all the binary bits even if a partial (ambiguous) digit must be output.
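The figures in this paragraph can be read off the interpreter itself. A small sketch, assuming a platform where Python floats are IEEE 754 doubles (true of typical CPython builds, but not guaranteed by the language); the variable names are illustrative.

```python
import math
import sys

info = sys.float_info

# Number of bits in the significand of a Python float.
mant_bits = info.mant_dig

# Number of decimal digits that are guaranteed to survive
# a string -> float -> string conversion.
safe_digits = info.dig

# The same precision expressed in decimal digits: not a whole number,
# which is where the "partial digit" comes from.
equiv_digits = mant_bits * math.log10(2)

print(mant_bits, safe_digits, equiv_digits)
```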
In any version of Python the number 108.9 is actually 108.900000000000005684341... but those last digits from the "5" onward are not exact. In fact you'll find that 108.9 == 108.9000000000000056. Even if you round to one less significant digit you'll see the difficulty with decimal representations:
This is why demjson produces a lint warning when it sees the number 108.90000000000001, as it is not truly portable—notwithstanding that some JSON implementations may not even use IEEE 754 at all.
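The claim that 108.9 is not stored exactly can be checked with the standard decimal module, which converts a float to its exact decimal expansion. This is only an illustrative sketch and is not the snippet from the original comment.

```python
from decimal import Decimal

x = 108.9

# Decimal(float) is exact: it shows every digit of the stored binary value.
exact = Decimal(x)
print(exact)

# The two literals differ on paper but map to the same double.
same = (108.9 == 108.9000000000000056)
print(same)

# Gap between the stored value and the decimal number 108.9.
error = exact - Decimal("108.9")
print(error)
```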
Generally, using 15 significant digits will ensure that a decimal representation survives a round-trip conversion between string and numeric format, while using 17 significant digits ensures that the binary representation survives a round trip. (This assumes that IEEE subnormal forms aren't used, which Python does not employ.)
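The two round-trip directions can be sketched as a pair of helper functions. Both helpers are hypothetical names introduced here for illustration; they are not part of demjson.

```python
def survives_decimal_round_trip(text, digits=15):
    """True if text -> float -> text is unchanged at `digits` significant digits.

    `text` is expected in the form "%g" would produce (no trailing zeros).
    """
    return ("%.*g" % (digits, float(text))) == text


def survives_binary_round_trip(value, digits=17):
    """True if float -> text -> float is unchanged at `digits` significant digits."""
    return float("%.*g" % (digits, value)) == value


# A short decimal string comes back unchanged at 15 digits.
print(survives_decimal_round_trip("108.9"))

# A float that is not the nearest double to any short decimal:
y = 0.1 + 0.2

# 17 digits recover the identical double; 15 digits do not.
print(survives_binary_round_trip(y, 17))
print(survives_binary_round_trip(y, 15))
```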
The reference C implementations of Python 2.6 and Python 2.7 render a different number of significant decimal digits. I don't know what other implementations (PyPy, Jython, etc) do.
Since Python 2.6 reached end-of-life in 2013 and I see no easy solution that isn't risky in terms of potentially introducing further bugs, I am unlikely to provide a patch for this issue. Feel free, though, to fork and/or provide a pull request, or to convince me this needs to be fixed.
Still, there are some things you may be able to do if you have to use Python 2.6.
Choice 1: If you can edit the demjson.py source (or make a local copy), you may be able to change the code that outputs floating point values. In version 2.2.4, at line 4038 of demjson.py, you'll see:
else:
    # A normal float.
    state.append( repr(n) )
you may be able to substitute the
else:
    s = "%.16g" % n
    if "e" not in s and "." not in s:  # [edited]
        s = s + ".0"
    state.append( s )
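To try the substituted formatting outside of demjson, the same logic can be wrapped in a standalone function. The name `format_float` is hypothetical and does not exist in demjson; this is a sketch of the idea, not a tested patch.

```python
def format_float(n):
    """Format a finite float with at most 16 significant digits.

    Mirrors the substitution shown above; `format_float` is an
    illustrative name, not a demjson function.
    """
    s = "%.16g" % n
    # "%g" drops a trailing ".0", so restore it to keep the output
    # recognisable as a float rather than an integer.
    if "e" not in s and "." not in s:
        s = s + ".0"
    return s


print(format_float(108.9))
print(format_float(2.0))
print(format_float(1e20))
print(format_float(0.1 + 0.2))
```

Note the trade-off: with 16 digits the output is shorter and avoids the lint warning, but a value such as 0.1 + 0.2 is written as 0.3, which does not convert back to the identical double.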
Choice 2: You can tell jsonlint to ignore the excess significant digits and not output a warning,
though this will also suppress other kinds of warnings about non-portable data too.