Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ValueError when reading a Dataframe with HDFStore in Python 3 fro… #24510

Merged
merged 8 commits into from
Jan 1, 2019
Merged

Conversation

faulaire
Copy link
Contributor

Pull request to solve : 24404

@codecov
Copy link

codecov bot commented Dec 31, 2018

Codecov Report

Merging #24510 into master will not change coverage.
The diff coverage is 0%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #24510   +/-   ##
=======================================
  Coverage   31.89%   31.89%           
=======================================
  Files         166      166           
  Lines       52421    52421           
=======================================
  Hits        16722    16722           
  Misses      35699    35699
Flag Coverage Δ
#multiple 30.29% <0%> (ø) ⬆️
#single 31.89% <0%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/io/pytables.py 19.53% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 100ffff...66c7e8d. Read the comment docs.

@codecov
Copy link

codecov bot commented Dec 31, 2018

Codecov Report

Merging #24510 into master will decrease coverage by 0.01%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #24510      +/-   ##
==========================================
- Coverage   31.89%   31.88%   -0.02%     
==========================================
  Files         166      166              
  Lines       52421    52426       +5     
==========================================
- Hits        16722    16717       -5     
- Misses      35699    35709      +10
Flag Coverage Δ
#multiple 30.29% <0%> (-0.01%) ⬇️
#single 31.88% <0%> (-0.02%) ⬇️
Impacted Files Coverage Δ
pandas/io/pytables.py 19.52% <0%> (-0.01%) ⬇️
pandas/core/internals/blocks.py 38.09% <0%> (-0.4%) ⬇️
pandas/core/frame.py 27.82% <0%> (-0.14%) ⬇️
pandas/core/reshape/reshape.py 7.97% <0%> (-0.09%) ⬇️
pandas/core/reshape/pivot.py 8.77% <0%> (ø) ⬆️
pandas/core/computation/expressions.py 58.82% <0%> (ø) ⬆️
pandas/core/generic.py 31.41% <0%> (ø) ⬆️
pandas/plotting/_style.py 23.65% <0%> (ø) ⬆️
pandas/core/groupby/generic.py 13.71% <0%> (ø) ⬆️
pandas/core/arrays/categorical.py 29.81% <0%> (+0.12%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 100ffff...173265c. Read the comment docs.

@gfyoung gfyoung added IO HDF5 read_hdf, HDFStore 2/3 Compat labels Dec 31, 2018
@jreback
Copy link
Contributor

jreback commented Dec 31, 2018

so need a sample hdf file that we can include in the repo, generated in py2, that fails on reading (make a new test), and works with your patch. you can put it in here: pandas/tests/io/data/legacy_hdf

read it like

def test_legacy_table_read(self, datapath):
        # legacy table types
        with ensure_clean_store(
                datapath('io', 'data', 'legacy_hdf', 'legacy_table.h5'),
                mode='r') as store:

            with catch_warnings():
                simplefilter("ignore", pd.io.pytables.IncompatibilityWarning)
                store.select('df1')
                store.select('df2')
                store.select('wp1')

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comments, would also need a whatsnew note

@@ -2661,6 +2661,8 @@ def read_index_node(self, node, start=None, stop=None):

if 'name' in node._v_attrs:
name = _ensure_str(node._v_attrs.name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can just directly call this, no if is needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

datapath('io', 'data', 'legacy_hdf', 'legacy_table_fixed_py2.h5'),
mode='r') as store:
with catch_warnings():
simplefilter("ignore", pd.io.pytables.IncompatibilityWarning)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this come up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy / past error, this filter is useless

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

@@ -1515,6 +1515,7 @@ MultiIndex
I/O
^^^

- Fix ``ValueError`` when reading a Dataframe with HDFStore in Python 3 from fixed format written in Python 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add the issue number

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Add issue  number to whatsnew
Try to decode index name systematically
Remove useless simplefilter
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks ok, some minor comments, you have some lint issue: https://dev.azure.com/pandas-dev/pandas/_build/results?buildId=6198

@@ -1515,6 +1515,7 @@ MultiIndex
I/O
^^^

- Fix ``ValueError`` when reading a Dataframe with HDFStore in Python 3 from fixed format written in Python 2 (:issue:`24510`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move below (e.g. add it right before Plotting)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry but I didn't understand your request, I'm not familiar with rst format

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I got It, I moved my entry in the file

@@ -4540,6 +4540,18 @@ def test_pytables_native2_read(self, datapath):
d1 = store['detector']
assert isinstance(d1, DataFrame)

def test_legacy_table_fixed_format_read_py2(self, datapath):
# legacy table with fixed format written en Python 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add the issue number here as a comments

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

datapath('io', 'data', 'legacy_hdf',
'legacy_table_fixed_py2.h5'),
mode='r') as store:
with catch_warnings():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are the warnings from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just used test_legacy_table_read(...) as a template to read the file. I removed this context manager locally and the test is still ok. I'll commit it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great

@jreback jreback added this to the 0.24.0 milestone Dec 31, 2018
@jreback
Copy link
Contributor

jreback commented Dec 31, 2018

lgtm @faulaire ping on green.

@jreback jreback merged commit d5283ea into pandas-dev:master Jan 1, 2019
@jreback
Copy link
Contributor

jreback commented Jan 1, 2019

thanks @faulaire

@faulaire
Copy link
Contributor Author

faulaire commented Jan 1, 2019

Thanks for the review & commit @jreback

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO HDF5 read_hdf, HDFStore
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ValueError when reading a Dataframe with HDFStore in Python 3 from fixed format written in Python 2
3 participants