BUG: json_normalize crashes when len(record_path)==1 and non-leaf node of nested meta path is None #44312

Closed
3 tasks done
qdbp opened this issue Nov 4, 2021 · 0 comments · Fixed by #44325
Labels
Bug IO JSON read_json, to_json, json_normalize
qdbp commented Nov 4, 2021

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

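import pandas as pd

# 'c' is None, but the nested meta path ['c', 'e'] expects 'c' to be a dict.
j = {'a': 'foo', 'b': [1, 2], 'c': None}
# record_path has length one; errors="ignore" is expected to yield NaN for the missing meta value.
pd.json_normalize(j, record_path='b', meta=['a', ['c', 'e']], errors="ignore")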

>>>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/main/.local/lib/python3.9/site-packages/pandas/io/json/_normalize.py", line 504, in _json_normalize
    _recursive_extract(data, record_path, {}, level=0)
  File "/home/main/.local/lib/python3.9/site-packages/pandas/io/json/_normalize.py", line 492, in _recursive_extract
    meta_val = _pull_field(obj, val[level:])
  File "/home/main/.local/lib/python3.9/site-packages/pandas/io/json/_normalize.py", line 388, in _pull_field
    result = result[field]

Issue Description

When calling json_normalize with:

  • a record path of length one (the example above succeeds with a longer record path)
  • a nested metadata field where a non-leaf node is None
  • errors="ignore"

The parsing crashes with a TypeError.

The error is raised on line 388 of _normalize.py, inside _pull_field, at the statement result = result[field].
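The failure mode described above can be reproduced without pandas. The following is a minimal sketch, not the pandas source: `pull_nested` and `pull_nested_safe` are hypothetical names that only mimic the key-by-key lookup shown in the traceback. It assumes that a `KeyError` is the exception that errors="ignore" is meant to absorb, so one possible guard is to raise `KeyError` when an intermediate node is None.

```python
from typing import Any, Sequence


def pull_nested(record: dict, path: Sequence[str]) -> Any:
    """Naive nested lookup (hypothetical helper): index one key at a time."""
    result: Any = record
    for field in path:
        # Once result is None, subscripting it raises TypeError, not KeyError.
        result = result[field]
    return result


def pull_nested_safe(record: dict, path: Sequence[str]) -> Any:
    """Same lookup, but a None parent is reported as a missing key."""
    result: Any = record
    for field in path:
        if result is None:
            # Assumption: callers already handle KeyError for missing fields.
            raise KeyError(field)
        result = result[field]
    return result


record = {'a': 'foo', 'b': [1, 2], 'c': None}

try:
    pull_nested(record, ['c', 'e'])
    naive_error = None
except (TypeError, KeyError) as exc:
    naive_error = type(exc).__name__

try:
    pull_nested_safe(record, ['c', 'e'])
    safe_error = None
except (TypeError, KeyError) as exc:
    safe_error = type(exc).__name__

print(naive_error, safe_error)
```

With the guard in place, the None parent surfaces as the same exception type as an absent key, so a caller that already tolerates missing keys needs no extra branch.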

Expected Behavior

The missing nested metadata columns should be filled with NaN instead of raising an exception.
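Until a fixed version is available, one caller-side option is to patch the input before normalizing. The sketch below is an assumption-laden workaround, not part of pandas: `fill_none_parents` is a hypothetical helper that replaces a None non-leaf node on each nested meta path with an empty dict, so that the lookup fails with a missing key rather than on a None parent.

```python
import copy
from typing import Any, List, Sequence, Union

MetaPath = Union[str, Sequence[str]]


def fill_none_parents(record: dict, meta_paths: List[MetaPath]) -> dict:
    """Return a copy of record where None non-leaf nodes on meta paths become {}."""
    patched = copy.deepcopy(record)  # leave the caller's data untouched
    for path in meta_paths:
        if isinstance(path, str):
            continue  # a single-key path has no non-leaf node to patch
        node: Any = patched
        for key in path[:-1]:  # walk every node except the leaf
            if not isinstance(node, dict) or key not in node:
                break  # path is absent or not a mapping; nothing to patch
            if node[key] is None:
                node[key] = {}
            node = node[key]
    return patched


j = {'a': 'foo', 'b': [1, 2], 'c': None}
meta = ['a', ['c', 'e']]
patched = fill_none_parents(j, meta)
print(patched)
```

Passing `patched` instead of `j` to pd.json_normalize with the same record_path, meta and errors="ignore" arguments is then expected to produce NaN in the nested meta column, although that has not been verified here against every pandas version.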

Installed Versions

INSTALLED VERSIONS

commit : 945c9ed
python : 3.9.7.final.0
python-bits : 64
OS : Linux
OS-release : 5.14.14-arch1-1
Version : #1 SMP PREEMPT Wed, 20 Oct 2021 21:35:18 +0000
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.4
numpy : 1.20.3
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.4
setuptools : 41.6.0
Cython : 0.29.24
pytest : 6.2.5
hypothesis : None
sphinx : 4.2.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.3
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.0.2
IPython : 7.19.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.3
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.7.1
sqlalchemy : 1.4.23
tables : None
tabulate : None
xarray : None
xlrd : 2.0.1
xlwt : None
numba : 0.54.1

@qdbp qdbp added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 4, 2021
@qdbp qdbp changed the title from "BUG: json_normalize crashes" to "BUG: json_normalize crashes when len(record_path)==1 and non-leaf node of nested meta path is None" Nov 4, 2021
qdbp added a commit to qdbp/pandas that referenced this issue Nov 4, 2021
qdbp added a commit to qdbp/pandas that referenced this issue Nov 5, 2021
@mroeschke mroeschke added IO JSON read_json, to_json, json_normalize and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 6, 2021
@jreback jreback added this to the 1.4 milestone Nov 6, 2021
qdbp added a commit to qdbp/pandas that referenced this issue Nov 8, 2021
qdbp added a commit to qdbp/pandas that referenced this issue Nov 9, 2021
jreback pushed a commit that referenced this issue Nov 14, 2021
nickleus27 pushed a commit to nickleus27/pandas that referenced this issue Nov 28, 2021