json_normalize should be able to accept an empty list #15534

Closed
rgbkrk opened this issue Feb 28, 2017 · 1 comment

@rgbkrk (Contributor) commented Feb 28, 2017

Code Sample, a copy-pastable example if possible

In[21]: pandas.io.json.json_normalize([])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-21-1bd834af8a9b> in <module>()
----> 1 pandas.io.json.json_normalize([])

/usr/local/lib/python3.6/site-packages/pandas/io/json.py in json_normalize(data, record_path, meta, meta_prefix, record_prefix)
    791 
    792     if record_path is None:
--> 793         if any([isinstance(x, dict) for x in compat.itervalues(data[0])]):
    794             # naive normalization, this is idempotent for flat records
    795             # and potentially will inflate the data considerably for

IndexError: list index out of range

Problem description

json_normalize indexes data[0] before checking whether data has any elements, so an empty list raises IndexError. It should probably return an empty DataFrame when len(data) is 0.
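To make the failure mode concrete, here is a minimal pure-Python sketch of the check shown at line 793 of the traceback, with an emptiness guard placed in front of the `data[0]` lookup. The function name `has_nested_values` is hypothetical and is not part of pandas; this illustrates the idea and is not the patch that was eventually merged.

```python
def has_nested_values(data):
    """Return True if any value in the first record is itself a dict.

    Hypothetical stand-in for the check in the traceback, with a guard
    added so that an empty list never reaches the data[0] lookup.
    """
    if len(data) == 0:
        # Nothing to inspect. Without this guard, data[0] raises IndexError.
        return False
    return any(isinstance(value, dict) for value in data[0].values())


print(has_nested_values([]))                 # empty input is handled
print(has_nested_values([{"a": {"b": 1}}]))  # nested record
print(has_nested_values([{"a": 1}]))         # flat record
```

With a guard like this in place, the caller can decide what to return for empty input, for example an empty DataFrame as proposed here.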

Expected Output

>>> pandas.io.json.json_normalize([])

Empty DataFrame
Columns: []
Index: []

Admittedly, this could be a Series too. I was using this across several collections and noticed I had to code around the cases where some of them were empty.
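For anyone hitting the same thing, the workaround can be sketched roughly as below. This assumes a pandas version where `json_normalize` is exposed at the top level as `pandas.json_normalize` (pandas 1.0 or later); on 0.19.2 the function lives at `pandas.io.json.json_normalize` instead. The helper name `normalize_or_empty` and the sample collections are made up for illustration.

```python
import pandas as pd


def normalize_or_empty(records):
    """Hypothetical helper: normalize records, or return an empty DataFrame."""
    if len(records) == 0:
        # Skip json_normalize entirely so an empty list cannot raise.
        return pd.DataFrame()
    return pd.json_normalize(records)


# Made-up sample data: one populated collection and one empty one.
collections = {
    "users": [{"id": 1, "name": {"first": "Ada"}}],
    "orders": [],
}

frames = {name: normalize_or_empty(records) for name, records in collections.items()}

for name, frame in frames.items():
    print(name, frame.shape)
```

Every collection then yields a DataFrame, so downstream code does not need a special case for the empty ones.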

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.0.final.0
python-bits: 64
OS: Darwin
OS-release: 16.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 34.1.0
Cython: None
numpy: 1.12.0
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.2.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 2.0.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8.1
boto: None
pandas_datareader: None

@jreback jreback added this to the Next Major Release milestone Feb 28, 2017

@jreback (Contributor) commented Feb 28, 2017

Yep, I think returning an empty DataFrame would be fine. PRs welcome!

@jreback jreback added IO JSON and removed IO JSON labels Feb 28, 2017

@rgbkrk rgbkrk referenced this issue Mar 1, 2017

Merged

BUG: handle empty lists in json_normalize #15535

4 of 4 tasks complete

rgbkrk added a commit to rgbkrk/pandas that referenced this issue Mar 1, 2017

rgbkrk added a commit to rgbkrk/pandas that referenced this issue Mar 4, 2017

rgbkrk added a commit to rgbkrk/pandas that referenced this issue Mar 4, 2017

@jorisvandenbossche jorisvandenbossche modified the milestones: 0.20.0, Next Major Release Mar 4, 2017

jorisvandenbossche added a commit that referenced this issue Mar 4, 2017

AnkurDedania added a commit to AnkurDedania/pandas that referenced this issue Mar 21, 2017
