The current behaviour doesn't return a DataFrame for valid JSON. Note that when the number is smaller, it works fine. It also works when only big numbers are present. It would be cool to have big numbers work the same way small numbers do.
Expected Output
A dataframe with a number and string
col
0 3.190044e+19
1 Text
Output of pd.read_json()
Traceback (most recent call last):
File "", line 1, in <module>
File ".../.venv/lib/python3.6/site-packages/pandas/io/json/_json.py", line 592, in read_json
result = json_reader.read()
File ".../.venv/lib/python3.6/site-packages/pandas/io/json/_json.py", line 717, in read
obj = self._get_object_parser(self.data)
File ".../.venv/lib/python3.6/site-packages/pandas/io/json/_json.py", line 739, in _get_object_parser
obj = FrameParser(json, **kwargs).parse()
File ".../.venv/lib/python3.6/site-packages/pandas/io/json/_json.py", line 855, in parse
self._try_convert_types()
File ".../.venv/lib/python3.6/site-packages/pandas/io/json/_json.py", line 1151, in _try_convert_types
lambda col, c: self._try_convert_data(col, c, convert_dates=False)
File ".../.venv/lib/python3.6/site-packages/pandas/io/json/_json.py", line 1131, in _process_converter
new_data, result = f(col, c)
File ".../.venv/lib/python3.6/site-packages/pandas/io/json/_json.py", line 1151, in <lambda>
lambda col, c: self._try_convert_data(col, c, convert_dates=False)
File ".../.venv/lib/python3.6/site-packages/pandas/io/json/_json.py", line 927, in _try_convert_data
new_data = data.astype("int64")
File ".../.venv/lib/python3.6/site-packages/pandas/core/generic.py", line 5882, in astype
dtype=dtype, copy=copy, errors=errors, **kwargs
File ".../.venv/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 581, in astype
return self.apply("astype", dtype=dtype, **kwargs)
File ".../.venv/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 438, in apply
applied = getattr(b, f)(**kwargs)
File ".../.venv/lib/python3.6/site-packages/pandas/core/internals/blocks.py", line 559, in astype
return self._astype(dtype, copy=copy, errors=errors, values=values, **kwargs)
File ".../.venv/lib/python3.6/site-packages/pandas/core/internals/blocks.py", line 643, in _astype
values = astype_nansafe(vals1d, dtype, copy=True, **kwargs)
File ".../.venv/lib/python3.6/site-packages/pandas/core/dtypes/cast.py", line 707, in astype_nansafe
return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
File "pandas/_libs/lib.pyx", line 547, in pandas._libs.lib.astype_intsafe
OverflowError: Python int too large to convert to C long
I'm new to open source contributions, so please bear with me. It seems that we coerce values to int wherever possible while parsing JSON. The code between lines 943 and 950 of pandas/io/json/_json.py is what causes the problem: the int coercion is attempted inside a try/except that only catches TypeError and ValueError. If it also catches OverflowError, things work as intended. I will submit a PR for this soon.
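To make the proposed change concrete, here is a minimal sketch of the pattern, not the actual pandas code. The helper name `try_coerce_int64` is hypothetical, and the standard library's `array("q", ...)` stands in for the int64 cast, since it also raises OverflowError for values outside the 64-bit range.

```python
from array import array


def try_coerce_int64(values):
    """Try to coerce every value to a 64-bit int.

    Returns (data, converted). On failure the input is returned
    unchanged, which mirrors the "leave the column alone" fallback.
    Hypothetical helper for illustration only.
    """
    try:
        ints = [int(v) for v in values]  # ValueError for "Text", TypeError for None
        # array("q") holds signed 64-bit ints; out-of-range values raise OverflowError.
        return array("q", ints).tolist(), True
    except (TypeError, ValueError, OverflowError):
        # Catching OverflowError as well is the whole point of the fix.
        return list(values), False


small = try_coerce_int64(["1", "2"])
big = try_coerce_int64(["31900441201190696999"])
mixed = try_coerce_int64(["31900441201190696999", "Text"])
print(small, big, mixed, sep="\n")
```

Without OverflowError in the except clause, the second call would propagate the exception instead of falling back to the original values.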
If test_json is [{"col": "31900441201190696999"}, {"col": "Text"}], would we not expect the result to be a string, and OverflowError: int too big to convert not to be raised? (For example, a number stored as a string could be a barcode.)
If test_json is [{"col": 31900441201190696999}, {"col": "Text"}], then expecting a DataFrame with a number and a string would be reasonable. This currently raises ValueError: Value is too big
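The difference between the two payloads is already visible at the JSON level. The sketch below uses the standard library's `json` module rather than the parser pandas uses, so it only illustrates the type distinction and says nothing about what `pd.read_json` returns. `INT64_MAX` is a name introduced here for illustration.

```python
import json

INT64_MAX = 2**63 - 1  # largest value a signed 64-bit int can hold

# Quoted: the big number is a JSON string (e.g. a barcode).
quoted = json.loads('[{"col": "31900441201190696999"}, {"col": "Text"}]')
# Bare: the big number is a JSON number.
bare = json.loads('[{"col": 31900441201190696999}, {"col": "Text"}]')

quoted_value = quoted[0]["col"]
bare_value = bare[0]["col"]

print(type(quoted_value).__name__)                     # str
print(type(bare_value).__name__, bare_value > INT64_MAX)  # int True
```

In the first case the value never needed to be an integer at all, while in the second it is a genuine number that simply does not fit in int64.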