Support parsing NaN, Infinity and -Infinity #514
This ports the changes from pandas (https://github.com/pandas-dev/pandas/pull/30295/files) mentioned by @WillAyd in #146 back to ujson.
Thus far I've only done a nearly verbatim port of the diff. There seems to be an issue. Currently:
So there are some details missing from the direct patch. I'll try and check for them, but my C is much worse than my Python, so any insights would be helpful.
I've fixed the above issue, although I'm not sure why the ordering of the function pointers in `__JSONObjectDecoder` in ultrajson.h was important. I assume it's a C thing; if anyone wants to tell me why that fix worked, I'd like to know.
While working on this I discovered a (hidden) bug in the pandas port. When they parse NaN in C, they actually return None instead of NaN. The pandas layer after that seems to interpret None as NaN, which is why their test passes:
But the parser itself is actually wrong
And that can be demonstrated by adding a dict to the list, which prevents the pandas engine from realizing that None should have been NaN. This probably needs to be a pandas bug report (see here).
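For reference, here is a minimal sketch of the behaviour I'd expect from the decoder. It uses the standard library's `json` module as a stand-in, not ujson or pandas, and the `is_nan` helper is only a hypothetical name for illustration. The point is that a decoded `NaN` should come back as a float NaN rather than `None`, including when a dict sits in the same list:

```python
import json
import math

# Reference behaviour from the standard library's json module (not ujson):
# the non-standard NaN / Infinity / -Infinity literals decode to floats.
plain = json.loads("[NaN, Infinity, -Infinity]")
mixed = json.loads('[NaN, {"a": 1}]')


def is_nan(value):
    """Return True only for a float NaN; None is deliberately rejected."""
    return isinstance(value, float) and math.isnan(value)


# A correct parser yields a real NaN in both cases, with or without the dict.
print(is_nan(plain[0]), plain[1], plain[2])
print(is_nan(mixed[0]), mixed[1])

# None is not NaN, so a parser that returns None only looks right when a
# later layer converts it.
print(is_nan(None))
```

A check like this against the raw decoder output, rather than against a DataFrame or array built from it, should catch the None-instead-of-NaN case directly.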
But I think I can fix it here. I'm getting a handle on how this is working.