Infinity handling differs to python native json #80
Comments
The specification on json.org is very clear about numeric types only being numbers or exponents.

From: Danny Price [mailto:notifications@github.com]

It looks like ujson handles infinity differently to the native json encoder:

import ujson, json, numpy
a = numpy.array([1,2,3,4,numpy.inf])
b = json.dumps({"test" : a.tolist()})

outputs '{"test": [1.0, 2.0, 3.0, 4.0, Infinity]}'

b = ujson.dumps({"test" : a.tolist()})

raises OverflowError: Invalid Inf value when encoding double

Similarly, an OverflowError is raised when NaN is encountered. It seems the relevant lines are 499-508 in ultrajsonenc.c (https://github.com/esnme/ultrajson/blob/master/lib/ultrajsonenc.c#L499).
In the interests of keeping ujson as a drop-in replacement for json, may I suggest this is changed so that it converts the NaN / Infinity to strings (like json) and doesn't raise an error? The decoder would likely need to be changed too... Cheers
Good point - for anyone else interested there's a discussion on StackOverflow of exactly this. I find it bemusing that JSON cannot fully represent IEEE 754 floating point numbers! However, if ujson is intended as a drop-in replacement for Python's json, I maintain that handling inf/NaN in exactly the same way would be preferable to raising an exception. So, may I instead request an allow_nan argument for loading/dumping?
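For reference, here is a minimal sketch of how the standard library's `json.dumps` behaves with its existing `allow_nan` parameter. It only illustrates the stdlib behaviour that the request refers to; it does not show ujson, and whether ujson would adopt the same parameter is an open question in this thread.

```python
import json
import math

values = [1.0, math.inf, -math.inf, math.nan]

# Default (allow_nan=True): non-finite floats are written as JavaScript-style
# tokens, which are not valid JSON according to json.org.
lenient = json.dumps(values)

# allow_nan=False: the stdlib encoder refuses non-finite floats.
try:
    json.dumps(values, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc

print(lenient)
print(type(strict_error).__name__)
```

With the default setting the caller gets output that round-trips in Python but may be rejected by stricter parsers; with `allow_nan=False` the caller gets an exception instead.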
I think this should be solved in a more generic way by allowing a custom encoder/decoder that handles unknown values. Adding an option for every special case where someone is abusing the JSON spec is not the way to go IMO.
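Until such a hook exists, one way to approximate custom handling is to sanitise the data before it reaches the encoder. The sketch below is an assumption-laden illustration, not ujson's API: `replace_non_finite` is a hypothetical helper, and the stdlib `json` module stands in for ujson. A pre-pass is used because the stdlib `default` hook is not invoked for floats, which the encoder already knows how to serialise.

```python
import json
import math

def replace_non_finite(obj, replacement=None):
    """Return a copy of obj with NaN/Infinity floats swapped for `replacement`.

    Hypothetical helper for illustration; not part of json or ujson.
    """
    if isinstance(obj, float) and not math.isfinite(obj):
        return replacement
    if isinstance(obj, dict):
        return {k: replace_non_finite(v, replacement) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [replace_non_finite(v, replacement) for v in obj]
    return obj

data = {"test": [1.0, 2.0, math.inf, math.nan]}

# allow_nan=False proves that no non-finite value survived the pre-pass.
encoded = json.dumps(replace_non_finite(data), allow_nan=False)
print(encoded)
```

The trade-off is that the substitution is lossy: once Infinity and NaN have both become `null`, the decoder cannot tell them apart.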
I would argue that it's not "abuse" of the JSON spec, but is instead adding extra functionality and compatibility -- already included in simplejson, json, and cjson. Having an identical feature set to the standard json would be fantastic, and I really think handling IEEE floating point is a pretty solid move.
Well, you are asking a JSON encoder to output invalid JSON...
So following the robustness principle, ujson should accept NaN and inf on decode ("code that receives input should accept non-conformant input as long as the meaning is clear"), but shouldn't encode them ("code that sends commands or data to other machines should conform completely to the specifications")? While this is obviously not what I would prefer, I do see the advantages of following design principles. I'll leave the ball in your court on this one...
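The asymmetry described above (lenient on input, strict on output) can be sketched with the standard library alone. `lenient_loads` and `strict_dumps` are hypothetical names chosen for this illustration, and the stdlib `json` module is used as a stand-in; this is not how ujson is implemented.

```python
import json
import math

def lenient_loads(text):
    # The stdlib decoder accepts NaN, Infinity and -Infinity by default.
    return json.loads(text)

def strict_dumps(obj):
    # Refuse to emit anything the JSON grammar does not define.
    return json.dumps(obj, allow_nan=False)

decoded = lenient_loads('[NaN, Infinity, -Infinity]')

try:
    strict_dumps(decoded)
    encode_refused = False
except ValueError:
    encode_refused = True

print(decoded, encode_refused)
```

A consequence of this design is that a value can be read but not written back, so decode followed by encode is no longer guaranteed to succeed.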
Feel free to contribute a patch towards the robustness principle. Closing this as an issue.
+1 for encoding
Have to agree strongly with @hsk81, and the robustness principle is clear on this too. You SHOULD decode things when the meaning is clear, and you SHOULD NOT encode things against the spec. In programming the meaning of SHOULD and MUST are distinct. (And now I shall use it in practice). You SHOULD NOT follow a rule containing the word SHOULD, when it introduces unnecessary complexity into a fundamental part of many people's code.
We do appear to have a bit of a deviation here.

>>> import math
>>> import ujson
>>> import json
>>> json.dumps([math.nan, math.inf, -math.inf])
'[NaN, Infinity, -Infinity]'
>>> ujson.dumps([math.nan, math.inf, -math.inf])
'[NaN,Inf,-Inf]'
>>> ujson.loads('[NaN,Inf,-Inf]')
ujson.JSONDecodeError: Unexpected character found when decoding 'Infinity'
>>> ujson.loads('[NaN, Infinity, -Infinity]')
[nan, inf, -inf]

And on the topic of JSON's inclusion/exclusion of non-finite floats, one of the original JSON authors said that he didn't add them to the spec only because he didn't expect anyone to need them, and that if anyone found a good use for them then he would consider his own argument to be moot.
Infinity was being encoded as 'Inf' which, whilst the JSON spec doesn't include any non-finite floats, differs from the convention in other JSON libraries and JavaScript of using 'Infinity'. It also differs from what `ujson.loads()` expects, so that `ujson.loads(ujson.dumps(math.inf))` raises an exception. Closes ultrajson#80.
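The round-trip failure described in this commit message can be checked with a small helper. This is a sketch under stated assumptions: `round_trips` is a hypothetical function, it defaults to the stdlib `json` functions so it runs without ujson installed, and it assumes the library under test signals failure with `ValueError` (or a subclass) or `OverflowError`.

```python
import json
import math

def round_trips(value, dumps=json.dumps, loads=json.loads):
    """Return True if loads(dumps(value)) reproduces value.

    NaN is compared with math.isnan because NaN != NaN.
    """
    try:
        result = loads(dumps(value))
    except (ValueError, OverflowError):
        return False
    if isinstance(value, float) and math.isnan(value):
        return isinstance(result, float) and math.isnan(result)
    return result == value

# Simulate the encoder bug: emit 'Inf', which the decoder does not recognise.
broken = round_trips(math.inf, dumps=lambda v: "Inf")

print(round_trips(math.inf), round_trips(math.nan), broken)
```

To exercise ujson instead, pass `dumps=ujson.dumps, loads=ujson.loads`; the result depends on the installed version.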