Empty data frames not round-trippable to JSON #21287

Closed
ludaavics opened this Issue Jun 1, 2018 · 5 comments

ludaavics commented Jun 1, 2018

Code Sample

import pandas as pd
df = pd.DataFrame([], columns=['a', 'b', 'c'])
df.to_json('tmp.json', orient='table')
pd.read_json('tmp.json', orient='table')

>> KeyError: "['index' 'a' 'b' 'c'] not in index"
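The failure can be narrowed down by looking at what `to_json(orient='table')` writes for an empty frame. The sketch below is only an illustration (the names `payload`, `field_names` and `rebuilt` are mine, not from pandas): the column names live in the schema, while the `data` list is empty, so a frame built from `data` alone has no columns to select from.

```python
import json

import pandas as pd

df = pd.DataFrame([], columns=['a', 'b', 'c'])

# With no path argument, to_json returns the JSON text instead of writing a file.
payload = json.loads(df.to_json(orient='table'))

# The schema still lists every field, including the index field.
field_names = [field['name'] for field in payload['schema']['fields']]
print(field_names)

# There are no row records at all.
print(payload['data'])

# A frame built only from the (empty) records therefore has no columns.
rebuilt = pd.DataFrame(payload['data'])
print(list(rebuilt.columns))
```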

Problem description

Empty data frames saved as JSON fail to load back to data frames.
Quick fix: replace this line with

df = DataFrame(table['data'], columns=col_order)[col_order]
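A minimal sketch of why passing `columns=` helps, assuming the line being replaced built the frame from `table['data']` without naming the columns (the original link target is not reproduced here, so that is an assumption). `data` and `col_order` are stand-ins for the values the reader would see for the empty frame above.

```python
import pandas as pd

# Stand-ins for table['data'] and the field names taken from the schema.
data = []
col_order = ['index', 'a', 'b', 'c']

# Without columns=, the empty frame has no columns, so selecting them fails.
try:
    pd.DataFrame(data)[col_order]
    selection_failed = False
except KeyError as exc:
    selection_failed = True
    print('KeyError:', exc)

# With columns=, the empty frame already has the expected columns.
fixed = pd.DataFrame(data, columns=col_order)[col_order]
print(fixed)
```

Note that `'index'` is still an ordinary column in `fixed`; turning it back into the index is a separate step that this sketch does not cover.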

Expected Output

print(df)
Empty DataFrame
Columns: [a, b, c]
Index: []

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.23.0
pytest: 3.2.3
pip: 10.0.1
setuptools: 36.6.0
Cython: 0.27.2
numpy: 1.13.3
scipy: 1.0.0
pyarrow: 0.7.1
xarray: 0.9.6
IPython: 6.2.1
sphinx: 1.6.5
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: 1.5.1
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: 0.4.0
matplotlib: 2.1.0
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: 1.0.2
lxml: None
bs4: 4.6.0
html5lib: 1.0b10
sqlalchemy: 1.2.3
pymysql: None
psycopg2: 2.7.3.2 (dt dec pq3 ext lo64)
jinja2: 2.9.6
s3fs: None
fastparquet: 0.1.3
pandas_gbq: None
pandas_datareader: None

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Darwin
OS-release: 17.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.23.0
pytest: None
pip: 9.0.3
setuptools: 39.1.0
Cython: None
numpy: 1.14.3
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: None
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: 2.6.4
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: 1.0.4
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: 1.2.7
pymysql: None
psycopg2: 2.7.4 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Member

WillAyd commented Jun 1, 2018

Thanks for the report and investigation - care to make a PR?

ludaavics commented Jun 5, 2018

Contributor

pyryjook commented Jun 5, 2018

Hi!

I actually ended up diving into this with my PR #21318. I made the changes the way @ludaavics initially suggested (BTW, thanks for the heads-up with the example!). After writing a unit test for it, I realised that, now that the actual error is fixed, the DataFrame read back from the JSON gets a different index type than it originally had.

Using tm.assert_frame_equal(expected, result), this is the result:

E       AssertionError: DataFrame.index are different
E
E       DataFrame.index classes are not equivalent
E       [left]:  Index([], dtype='object')
E       [right]: Float64Index([], dtype='float64')

I need to dig into this a bit deeper now. Any initial thoughts on why this might happen, or am I missing something?

(This is my first contribution to this project, so it might be something obvious that I have not (yet) noticed.)
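For reference, the same kind of assertion failure can be reproduced without going through JSON at all. The sketch below builds the two empty frames by hand (`expected`, `result` and `mismatch` are illustrative names, and the float index is constructed explicitly rather than produced by `read_json`), then compares them with the public `pd.testing.assert_frame_equal`. The exact wording of the message depends on the pandas version.

```python
import pandas as pd

cols = ['a', 'b', 'c']

# Empty frame with an object-dtype index, like the [left] side above.
expected = pd.DataFrame(columns=cols, index=pd.Index([], dtype='object'))

# Empty frame with a float64 index, like the [right] side above.
result = pd.DataFrame(columns=cols, index=pd.Index([], dtype='float64'))

try:
    pd.testing.assert_frame_equal(expected, result)
    mismatch = None
except AssertionError as exc:
    # Both frames are empty, yet the index types differ, so the check fails.
    mismatch = str(exc)

print(mismatch)
```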

Member

WillAyd commented Jun 5, 2018

@pyryjook for questions specific to your commits, it is easier to help if you push the commit to the PR and ask the question there.

Contributor

pyryjook commented Jun 5, 2018

Yeah, I'll do it and let's then continue there.

gfyoung added the Bug label Jun 6, 2018

jreback added this to the 0.23.1 milestone Jun 8, 2018
