json fails to serialise numpy.int64 #68501
When I run the attached example in Python 2.7.9, it succeeds. In Python 3.4, it fails as shown below. I use json 2.0.9 and numpy 1.9.2 with both versions of Python. Python and all packages are provided by Anaconda 2.2.0.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/home/tha/tmp/debug_json/debug_json.py in <module>()
      4 test = {'value': np.int64(1)}
      5
----> 6 obj=json.dumps(test)
/home/tha/.conda/envs/python3/lib/python3.4/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
/home/tha/.conda/envs/python3/lib/python3.4/json/encoder.py in encode(self, o)
/home/tha/.conda/envs/python3/lib/python3.4/json/encoder.py in iterencode(self, o, _one_shot)
/home/tha/.conda/envs/python3/lib/python3.4/json/encoder.py in default(self, o)
TypeError: 1 is not JSON serializable |
All python3 ints are what used to be long ints in python2, so the code that recognized short ints no longer exists. Do the numpy types implement __index__? It looks like json doesn't check for __index__, and I wonder if it should. |
I don't know. Simply, under 2.7, int64 inherits from int:
>>> np.int64.__mro__
(<type 'numpy.int64'>, <type 'numpy.signedinteger'>, <type 'numpy.integer'>, <type 'numpy.number'>, <type 'numpy.generic'>, <type 'int'>, <type 'object'>)
while it doesn't under 3.x:
>>> np.int64.__mro__
(<class 'numpy.int64'>, <class 'numpy.signedinteger'>, <class 'numpy.integer'>, <class 'numpy.number'>, <class 'numpy.generic'>, <class 'object'>) |
Ah, so this is a numpy bug? |
Yes, it looks like a bug (or rather a missing feature) in numpy, but numpy has no chance of fixing it without help from Python. The json module is not flexible enough. For now, this issue can only be worked around on the user side, with a special default handler.
>>> import numpy, json
>>> def default(o):
... if isinstance(o, numpy.integer): return int(o)
... raise TypeError
...
>>> json.dumps({'value': numpy.int64(42)}, default=default)
'{"value": 42}' |
I wouldn't call it a bug in Numpy (a quirk perhaps?). Numpy ints are fixed-width ints, so some of them can inherit from Python int in 2.x, but not in 3.x.
>>> issubclass(np.int64, int)
True
>>> issubclass(np.int32, int)
False
>>> issubclass(np.int16, int)
False |
So in python2, some were json serializable and some weren't? Yes, I'd call that a quirk :) So back to the question of whether it makes sense for json to look for __index__ to decide if something can be serialized as an int. If not, I don't think there is anything we can do. |
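To make the `__index__` idea concrete, here is a minimal sketch of a user-side `default` hook that accepts anything implementing `__index__`. It is not what the json module does today. `index_default` and `FakeInt64` are hypothetical names, and `FakeInt64` only stands in for a fixed-width integer scalar such as numpy.int64 so that the example runs without numpy.

```python
import json
import operator


def index_default(o):
    """Fallback for json.dumps: serialise integer-like objects via __index__."""
    try:
        # operator.index() succeeds only for objects that define __index__.
        return operator.index(o)
    except TypeError:
        raise TypeError(
            f"Object of type {type(o).__name__} is not JSON serializable"
        ) from None


class FakeInt64:
    """Hypothetical stand-in for a fixed-width integer scalar."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value


print(json.dumps({"value": FakeInt64(1)}, default=index_default))
```

Because floats do not define `__index__`, this hook would not help with floating-point scalars; it only covers the integer case discussed here.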
I don't know about __index__, but there's the ages-old discussion of allowing some kind of __json__ hook on types. Of course, none of those solutions would allow round-tripping. |
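For illustration only, a `__json__`-style hook can be emulated today through the `default` parameter. The `__json__` method name is a convention assumed here, not something the json module recognises, and `json_hook_default` and `Celsius` are hypothetical names. The sketch also shows the round-tripping limitation: decoding gives back a plain dict, not the original object.

```python
import json


def json_hook_default(o):
    """Fallback for json.dumps: call a hypothetical __json__() method."""
    hook = getattr(type(o), "__json__", None)
    if hook is None:
        raise TypeError(
            f"Object of type {type(o).__name__} is not JSON serializable"
        )
    return hook(o)


class Celsius:
    """Hypothetical type that opts in to serialisation via __json__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __json__(self):
        return {"celsius": self.degrees}


encoded = json.dumps({"temp": Celsius(21.5)}, default=json_hook_default)
decoded = json.loads(encoded)

# The type information is lost on the way back.
print(encoded)
print(type(decoded["temp"]).__name__)
```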
On 64-bit Windows, my 64-bit Python 2.7.9 and my 32-bit 2.7.10 Python both reproduce the failure with a similar traceback. |
Is there any possibility that json could implement special handling of NumPy types? This "lack of a feature" seems to have propagated back into Python 2.7 now in some recent update... |
Nothing's changed in python 2.7. Basically: (a) no numpy ints have ever serialized in py3. (b) in py2, either np.int32 *xor* np.int64 will serialize correctly, and which one it is depends on sizeof(long) in the C compiler used to build Python. (This follows from the fact that in py2, the Python 'int' type is always the same size as C 'long'.) So the end result is: on OS X and Linux, 32-bit Pythons can JSON-serialize np.int32 objects, and 64-bit Pythons can JSON-serialize np.int64 objects, because 64-bit OS X and Linux are LP64. On Windows, both 32- and 64-bit Pythons can JSON-serialize np.int32 objects, and can't serialize np.int64 objects, because 64-bit Windows is LLP64. |
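The platform dependence described above can be checked from Python itself. The sketch below only reports the width of C `long` for the running interpreter and maps it to the numpy type that, per the explanation above, would have subclassed `int` under Python 2. The helper name `py2_int_compatible_type` is hypothetical.

```python
import ctypes
import struct

# Width of C 'long' and of a pointer for the running interpreter, in bits.
long_bits = ctypes.sizeof(ctypes.c_long) * 8
pointer_bits = struct.calcsize("P") * 8


def py2_int_compatible_type(bits):
    """Name of the numpy type matching a Python 2 'int' of the given width."""
    return {32: "np.int32", 64: "np.int64"}[bits]


print("pointer width:", pointer_bits)
print("C long width:", long_bits)
print("py2 int would match:", py2_int_compatible_type(long_bits))
```

On an LP64 platform (64-bit Linux or OS X) both widths should be 64, while on LLP64 (64-bit Windows) the pointer width is 64 but `long` stays at 32.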
Thanks for the clarification. |
This is still broken. With pandas being popular, it's more likely someone might hit it. Can we fix this? At the very least, the error message needs to be made much more specific. I have created a dictionary containing pandas stats.
Apparently, pandas (sometimes) returns numpy ints and numpy floats. Here's a piece of the dictionary:
with open('Data/station_stats.json', 'w') as fp:
TypeError: Object of type int64 is not JSON serializable
for key, value in station_stats['657']['Fluorescence'].items():
count 3183 <class 'numpy.int64'>
Problem description
pandas statistics sometimes produce numpy numerics. numpy ints are not supported by json.dump.
Expected Output
I expect ints, floats, strings, ... to be JSON serializable.
INSTALLED VERSIONS
commit : None
pandas : 0.25.0 |
Note also that pandas DataFrame.to_json() method has no issue with int64. Perhaps you could borrow their code. |
What is the next step for this 4-year-old issue? I think I can prepare a patch for using __index__ (as suggested by @r.david.murray) |
We could use __index__ for serializing numpy.int64. But what to do with numpy.float32 and numpy.float128? It is a part of a much larger problem (which includes other numbers, collections, encoded strings, named tuples and data classes, etc). I am working on it, but there is a lot of work. |
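One user-side way to cover floats as well as ints is to dispatch on the `numbers` abstract base classes in a `JSONEncoder` subclass. This is only a sketch, not the planned fix: `NumericEncoder` is a hypothetical name, and it assumes the third-party numeric types in question are registered with `numbers.Integral` or `numbers.Real` (as far as I know NumPy scalars are, but verify this for your version). `fractions.Fraction` is used below purely as a stdlib stand-in for a non-builtin real number.

```python
import json
import numbers
from fractions import Fraction


class NumericEncoder(json.JSONEncoder):
    """Encode integer-like and real-like objects as JSON numbers."""

    def default(self, o):
        # default() is only called for objects json cannot encode natively.
        if isinstance(o, numbers.Integral):
            return int(o)
        if isinstance(o, numbers.Real):
            # Lossy for types wider than a C double, e.g. extended precision.
            return float(o)
        return super().default(o)


print(json.dumps({"ratio": Fraction(1, 4)}, cls=NumericEncoder))
```

The conversion through `float` is exactly where something like numpy.float128 would lose precision, which is part of why a general solution is harder than the integer case.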
Just ran into this. Are there any updates? Is there any task to contribute to regarding this? |
See #71549 |
As far as I can see, this issue can be closed. |