
json fails to serialise numpy.int64 #68501

Open
thomas-arildsen mannequin opened this issue May 28, 2015 · 19 comments
Labels
  • 3.7 (EOL) end of life
  • stdlib (Python modules in the Lib dir)
  • type-bug (An unexpected behavior, bug, or error)

Comments


thomas-arildsen mannequin commented May 28, 2015

BPO 24313
Nosy @pitrou, @bitdancer, @njsmith, @eli-b, @serhiy-storchaka, @isidentical, @vlbrown, @mxposed
Files
  • debug_json.py: Minimal example to demonstrate the problem
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2015-05-28.08:32:31.477>
    labels = ['type-bug', '3.7']
    title = 'json fails to serialise numpy.int64'
    updated_at = <Date 2022-02-24.00:38:48.258>
    user = 'https://bugs.python.org/thomas-arildsen'

    bugs.python.org fields:

    activity = <Date 2022-02-24.00:38:48.258>
    actor = 'mxposed'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = []
    creation = <Date 2015-05-28.08:32:31.477>
    creator = 'thomas-arildsen'
    dependencies = []
    files = ['39530']
    hgrepos = []
    issue_num = 24313
    keywords = []
    message_count = 17.0
    messages = ['244288', '244321', '244352', '244355', '244359', '244363', '244370', '244371', '254734', '257451', '257455', '257459', '350567', '350581', '355133', '355143', '413869']
    nosy_count = 10.0
    nosy_names = ['pitrou', 'r.david.murray', 'njs', 'Eli_B', 'serhiy.storchaka', 'thomas-arildsen', 'Amit Feller', 'BTaskaya', 'vlbrown', 'mxposed']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue24313'
    versions = ['Python 3.7']


    thomas-arildsen mannequin commented May 28, 2015

    When I run the attached example in Python 2.7.9, it succeeds. In Python 3.4, it fails as shown below. I use json 2.0.9 and numpy 1.9.2 with both versions of Python; Python and all packages are provided by Anaconda 2.2.0.
    The error seems to be caused by the serialised object containing a numpy.int64 type. It might fail with other 64-bit numpy types as well (untested).

    ---------------------------------------------------------------------------

    TypeError                                 Traceback (most recent call last)
    /home/tha/tmp/debug_json/debug_json.py in <module>()
          4 test = {'value': np.int64(1)}
          5 
    ----> 6 obj=json.dumps(test)

    /home/tha/.conda/envs/python3/lib/python3.4/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    228 cls is None and indent is None and separators is None and
    229 default is None and not sort_keys and not kw):
    --> 230 return _default_encoder.encode(obj)
    231 if cls is None:
    232 cls = JSONEncoder

    /home/tha/.conda/envs/python3/lib/python3.4/json/encoder.py in encode(self, o)
    190 # exceptions aren't as detailed. The list call should be roughly
    191 # equivalent to the PySequence_Fast that ''.join() would do.
    --> 192 chunks = self.iterencode(o, _one_shot=True)
    193 if not isinstance(chunks, (list, tuple)):
    194 chunks = list(chunks)

    /home/tha/.conda/envs/python3/lib/python3.4/json/encoder.py in iterencode(self, o, _one_shot)
    248 self.key_separator, self.item_separator, self.sort_keys,
    249 self.skipkeys, _one_shot)
    --> 250 return _iterencode(o, 0)
    251
    252 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,

    /home/tha/.conda/envs/python3/lib/python3.4/json/encoder.py in default(self, o)
    171
    172 """
    --> 173 raise TypeError(repr(o) + " is not JSON serializable")
    174
    175 def encode(self, o):

    TypeError: 1 is not JSON serializable

    @thomas-arildsen thomas-arildsen mannequin added the type-crash A hard crash of the interpreter, possibly with a core dump label May 28, 2015
    @bitdancer
    Member

    All python3 ints are what used to be long ints in python2, so the code that recognized short ints no longer exists. Do the numpy types implement __index__? It looks like json doesn't check for __index__, and I wonder if it should.
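(Numpy integer scalars do implement `__index__`. A minimal sketch of the point being made, using a hypothetical stand-in class so it runs without numpy: `operator.index()` accepts any object with the hook, but `json.dumps` never consults it.)

```python
import json
import operator

class FakeInt64:
    """Stand-in for numpy.int64: an integer-like type implementing __index__."""
    def __init__(self, value):
        self._value = value
    def __index__(self):
        return self._value
    def __repr__(self):
        return f"FakeInt64({self._value})"

n = FakeInt64(1)
# operator.index() consults __index__ and returns a plain int ...
print(operator.index(n))  # 1
# ... but json.dumps does not, so this still raises TypeError:
try:
    json.dumps({'value': n})
except TypeError as e:
    print(e)
```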


    pitrou commented May 28, 2015

    It looks like json doesn't check for __index__, and I wonder if it should.

    I don't know. Simply, under 2.7, int64 inherits from int:

    >>> np.int64.__mro__
    (<type 'numpy.int64'>, <type 'numpy.signedinteger'>, <type 'numpy.integer'>, <type 'numpy.number'>, <type 'numpy.generic'>, <type 'int'>, <type 'object'>)

    while it doesn't under 3.x:

    >>> np.int64.__mro__
    (<class 'numpy.int64'>, <class 'numpy.signedinteger'>, <class 'numpy.integer'>, <class 'numpy.number'>, <class 'numpy.generic'>, <class 'object'>)

    @pitrou pitrou added type-feature A feature request or enhancement and removed type-crash A hard crash of the interpreter, possibly with a core dump labels May 28, 2015
    @bitdancer
    Member

    Ah, so this is a numpy bug?

    @serhiy-storchaka
    Member

    Yes, it looks like a bug (or rather a missing feature) in numpy, but numpy has no chance to fix it without help from Python. The json module is not flexible enough.

    For now this issue can only be worked around on the user side, with a special default handler.

    >>> import numpy, json
    >>> def default(o):
    ...     if isinstance(o, numpy.integer): return int(o)
    ...     raise TypeError
    ... 
    >>> json.dumps({'value': numpy.int64(42)}, default=default)
    '{"value": 42}'
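The same workaround can also be packaged as a `json.JSONEncoder` subclass and passed via the `cls` argument; a sketch using a hypothetical stand-in integer type in place of `numpy.integer`, so it runs without numpy:

```python
import json

class FakeNumpyInt:
    """Stand-in for a numpy integer scalar in this sketch."""
    def __init__(self, value):
        self.value = value
    def __int__(self):
        return self.value

class NumpyIntEncoder(json.JSONEncoder):
    def default(self, o):
        # Convert the integer-like scalar to a plain Python int
        if isinstance(o, FakeNumpyInt):
            return int(o)
        # Fall back to the base class, which raises TypeError
        return super().default(o)

print(json.dumps({'value': FakeNumpyInt(42)}, cls=NumpyIntEncoder))  # {"value": 42}
```

With real numpy, the `isinstance` check would be against `numpy.integer` instead of the stand-in class.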


    pitrou commented May 29, 2015

    I wouldn't call it a bug in Numpy (a quirk, perhaps?). Numpy ints are fixed-width ints, so some of them can inherit from Python int in 2.x, but not in 3.x.
    Not all of them do even in 2.x, though, since the bit widths differ:

    >>> issubclass(np.int64, int)
    True
    >>> issubclass(np.int32, int)
    False
    >>> issubclass(np.int16, int)
    False

    @bitdancer
    Member

    So in python2, some were json serializable and some weren't? Yes, I'd call that a quirk :)

    So back to the question of whether it makes sense for json to look for __index__ to decide if something can be serialized as an int. If not, I don't think there is anything we can do.


    pitrou commented May 29, 2015

    I don't know about __index__, but there's the ages-old discussion of allowing some kind of __json__ hook on types. Of course, none of those solutions would allow round-tripping.
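(Such a `__json__` hook was never added to the stdlib, but the idea can be emulated today through `default`. A sketch; the `__json__` method name here is purely conventional, not a real protocol:)

```python
import json

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __json__(self):
        # Hypothetical hook: return a JSON-compatible representation
        return {'x': self.x, 'y': self.y}

def default(o):
    # Dispatch to the object's __json__ method if it defines one
    hook = getattr(o, '__json__', None)
    if hook is not None:
        return hook()
    raise TypeError(f'Object of type {type(o).__name__} is not JSON serializable')

print(json.dumps({'p': Point(1, 2)}, default=default))  # {"p": {"x": 1, "y": 2}}
```

As noted, none of these approaches round-trip: the deserialized value is a plain dict, not a `Point`.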


    eli-b mannequin commented Nov 16, 2015

    On 64-bit Windows, my 64-bit Python 2.7.9 and my 32-bit 2.7.10 Python both reproduce the failure with a similar traceback.


    thomas-arildsen mannequin commented Jan 4, 2016

    Is there any possibility that json could implement special handling of NumPy types? This "lack of a feature" seems to have propagated back into Python 2.7 now in some recent update...


    njsmith commented Jan 4, 2016

    Nothing's changed in python 2.7. Basically: (a) no numpy ints have ever serialized in py3. (b) in py2, either np.int32 *xor* np.int64 will serialize correctly, and which one it is depends on sizeof(long) in the C compiler used to build Python. (This follows from the fact that in py2, the Python 'int' type is always the same size as C 'long'.)

    So the end result is: on OS X and Linux, 32-bit Pythons can JSON-serialize np.int32 objects, and 64-bit Pythons can JSON-serialize np.int64 objects, because 64-bit OS X and Linux are LP64. On Windows, both 32- and 64-bit Pythons can JSON-serialize np.int32 objects and can't serialize np.int64 objects, because 64-bit Windows is LLP64.
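(The sizeof(long) dependence is easy to check from Python itself; a small stdlib-only sketch:)

```python
import struct

# Width of the C 'long' type on this interpreter build:
# 8 bytes on LP64 platforms (64-bit Linux/macOS), 4 bytes on LLP64
# (64-bit Windows) and on all 32-bit builds.
long_bits = struct.calcsize('l') * 8
print(f'C long is {long_bits} bits on this build')
```

On Python 2 this width determined which fixed-width numpy integers subclassed `int`; Python 3's `int` is arbitrary-precision, so none of them do.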


    thomas-arildsen mannequin commented Jan 4, 2016

    Thanks for the clarification.


    vlbrown mannequin commented Aug 26, 2019

    This is still broken. With pandas being popular, it's more likely someone might hit it. Can we fix this?

    At the very least, the error message needs to be made much more specific.

    I have created a dictionary containing pandas stats.

    def summary_stats(s):
        """
        Calculate summary statistics for a series or list, s.
        Returns a dictionary.
        """
        stats = {
            'count': 0,
            'max': 0,
            'min': 0,
            'mean': 0,
            'median': 0,
            'mode': 0,
            'std': 0,
            'z': (0, 0)
        }

        stats['count'] = s.count()
        stats['max'] = s.max()
        stats['min'] = s.min()
        stats['mean'] = round(s.mean(), 3)
        stats['median'] = s.median()
        stats['mode'] = s.mode()[0]
        stats['std'] = round(s.std(), 3)

        std3 = 3 * stats['std']
        low_z = round(stats['mean'] - std3, 3)
        high_z = round(stats['mean'] + std3, 3)
        stats['z'] = (low_z, high_z)

        return stats
    

    Apparently, pandas (sometimes) returns numpy ints and numpy floats.

    Here's a piece of the dictionary:

     {'count': 597,
       'max': 0.95,
       'min': 0.01,
       'mean': 0.585,
       'median': 0.58,
       'mode': 0.59,
       'std': 0.122,
       'z': (0.219, 0.951)}
    
    It looks fine, but when I try to dump the dict to json:

        with open('Data/station_stats.json', 'w') as fp:
            json.dump(station_stats, fp)

    
    I get this error:

        TypeError: Object of type int64 is not JSON serializable

    **Much searching** led me to discover that I apparently have numpy ints, which I have confirmed:

        for key, value in station_stats['657']['Fluorescence'].items():
            print(key, value, type(value))

    count 3183 <class 'numpy.int64'>
    max 2.8 <class 'float'>
    min 0.02 <class 'float'>
    mean 0.323 <class 'float'>
    median 0.28 <class 'float'>
    mode 0.24 <class 'numpy.float64'>
    std 0.194 <class 'float'>
    z (-0.259, 0.905) <class 'tuple'>
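(A common workaround for output like this is a `default` handler that calls the scalar's `item()` method; numpy scalars return the equivalent plain Python object from `.item()`. Sketched below with a hypothetical stand-in class so the example runs without numpy:)

```python
import json

class FakeScalar:
    """Stand-in for a numpy scalar such as numpy.int64 or numpy.float64."""
    def __init__(self, value):
        self._value = value
    def item(self):
        # numpy scalars expose .item(), returning a plain Python int/float
        return self._value

def np_default(o):
    # Delegate to .item() when available, mimicking numpy scalar conversion
    if hasattr(o, 'item'):
        return o.item()
    raise TypeError(f'Object of type {type(o).__name__} is not JSON serializable')

stats = {'count': FakeScalar(3183), 'mode': FakeScalar(0.24)}
print(json.dumps(stats, default=np_default))  # {"count": 3183, "mode": 0.24}
```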

    
    

    Problem description

    pandas statistics sometimes produce numpy numerics.

    numpy ints are not supported by json.dump

    Expected Output

    I expect ints, floats, strings, ... to be JSON serializable.

    INSTALLED VERSIONS

    commit : None
    python : 3.7.3.final.0
    python-bits : 64
    OS : Darwin
    OS-release : 15.6.0
    machine : x86_64
    processor : i386
    byteorder : little
    LC_ALL : None
    LANG : en_US.UTF-8
    LOCALE : en_US.UTF-8

    pandas : 0.25.0
    numpy : 1.16.4
    pytz : 2019.1
    dateutil : 2.8.0
    pip : 19.1.1
    setuptools : 41.0.1
    Cython : 0.29.12
    pytest : 5.0.1
    hypothesis : None
    sphinx : 2.1.2
    blosc : None
    feather : None
    xlsxwriter : 1.1.8
    lxml.etree : 4.3.4
    html5lib : 1.0.1
    pymysql : 0.9.3
    psycopg2 : None
    jinja2 : 2.10.1
    IPython : 7.7.0
    pandas_datareader: None
    bs4 : 4.7.1
    bottleneck : 1.2.1
    fastparquet : None
    gcsfs : None
    lxml.etree : 4.3.4
    matplotlib : 3.1.0
    numexpr : 2.6.9
    odfpy : None
    openpyxl : 2.6.2
    pandas_gbq : None
    pyarrow : None
    pytables : None
    s3fs : None
    scipy : 1.3.0
    sqlalchemy : 1.3.5
    tables : 3.5.2
    xarray : None
    xlrd : 1.2.0
    xlwt : 1.3.0
    xlsxwriter : 1.1.8


    @vlbrown vlbrown mannequin added 3.7 (EOL) end of life type-bug An unexpected behavior, bug, or error and removed type-feature A feature request or enhancement labels Aug 26, 2019

    vlbrown mannequin commented Aug 26, 2019

    Note also that pandas DataFrame.to_json() method has no issue with int64. Perhaps you could borrow their code.

    @isidentical
    Sponsor Member

    What is the next step for this 4-year-old issue? I think I can prepare a patch for using __index__ (as suggested by @r.david.murray).

    @serhiy-storchaka
    Member

    We could use __index__ for serializing numpy.int64. But what to do with numpy.float32 and numpy.float128? It is part of a much larger problem (which includes other numbers, collections, encoded strings, named tuples, data classes, etc.). I am working on it, but there is a lot of work.
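(A `default` handler covering both cases could try `operator.index()` for integer-like scalars and fall back to `float()` for float-like ones; a sketch with hypothetical stand-in types. Note that coercing a wider type like float128 through `float()` can lose precision, which is part of the problem described above:)

```python
import json
import operator

class IntLike:
    """Stand-in for numpy.int64: implements __index__."""
    def __init__(self, v): self._v = v
    def __index__(self): return self._v

class FloatLike:
    """Stand-in for numpy.float32: implements __float__."""
    def __init__(self, v): self._v = v
    def __float__(self): return self._v

def default(o):
    # Integer-likes serialize losslessly via __index__ ...
    try:
        return operator.index(o)
    except TypeError:
        pass
    # ... but coercing wider floats through float() may silently lose
    # precision, which is why this is not a complete fix.
    try:
        return float(o)
    except TypeError:
        raise TypeError(f'{o!r} is not JSON serializable') from None

print(json.dumps([IntLike(7), FloatLike(0.5)], default=default))  # [7, 0.5]
```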


    mxposed mannequin commented Feb 24, 2022

    Just ran into this. Are there any updates? Is there any task to contribute to regarding this?

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @petsuter

    there's the ages-old discussion of allowing some kind of json hook on types

    See #71549

    @iritkatriel iritkatriel added the stdlib Python modules in the Lib dir label Nov 23, 2023
    @flofriday

    As far as I can see, this issue can be closed.
    There is no precedent for giving some libraries like numpy special treatment in the interpreter (even if they are popular), and the more general discussion of allowing custom hooks for json serialisation already has its own thread, as previously mentioned.
