BUG: DataFrame doesn't roundtrip with HDFStore(..., format='table', dropna=True) #37624

Open
arw2019 opened this issue Nov 4, 2020 · 0 comments
Labels
Bug Index Related to the Index class or subclasses IO HDF5 read_hdf, HDFStore

Comments

@arw2019
Member

arw2019 commented Nov 4, 2020

xref #37564

Example

In [3]: import numpy as np
   ...: import pandas as pd
   ...: import pandas._testing as tm
   ...: from pandas.tests.io.pytables.common import ensure_clean_path
   ...: 
   ...: df_with_missing = pd.DataFrame(
   ...:     {"col1": [0, np.nan, 2], "col2": [1, np.nan, np.nan]}
   ...: )
   ...: df_without_missing = pd.DataFrame({"col1": [0, 2], "col2": [1, np.nan]})
   ...: 
   ...: setup_path = '/tmp/store'
   ...: with ensure_clean_path(setup_path) as path:
   ...:     df_with_missing.to_hdf(path, "df_with_missing", dropna=True, index=False, format="table")
   ...:     reloaded = pd.read_hdf(path, "df_with_missing")
   ...:     tm.assert_frame_equal(df_without_missing, reloaded)
   ...: 
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-3-77cd279acb0b> in <module>
     13     df_with_missing.to_hdf(path, "df_with_missing", dropna=True, index=False, format="table")
     14     reloaded = pd.read_hdf(path, "df_with_missing")
---> 15     tm.assert_frame_equal(df_without_missing, reloaded)
     16 

    [... skipping hidden 2 frame]

/workspaces/pandas/pandas/_libs/testing.pyx in pandas._libs.testing.assert_almost_equal()
     44 
     45 
---> 46 cpdef assert_almost_equal(a, b,
     47                           rtol=1.e-5, atol=1.e-8,
     48                           bint check_dtype=True,

/workspaces/pandas/pandas/_libs/testing.pyx in pandas._libs.testing.assert_almost_equal()
    159             msg = (f"{obj} values are different "
    160                    f"({np.round(diff * 100.0 / na, 5)} %)")
--> 161             raise_assert_detail(obj, msg, lobj, robj, index_values=index_values)
    162 
    163         return True

/workspaces/pandas/pandas/_testing.py in raise_assert_detail(obj, message, left, right, diff, index_values)
   1053         msg += f"\n[diff]: {diff}"
   1054 
-> 1055     raise AssertionError(msg)
   1056 
   1057 

AssertionError: DataFrame.index are different

DataFrame.index values are different (50.0 %)
[left]:  RangeIndex(start=0, stop=2, step=1)
[right]: Int64Index([0, 2], dtype='int64')
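
The mismatch above can be illustrated without an HDF5 file. The sketch below is not the HDFStore code path: it assumes that `dropna=True` skips rows in which every column is NaN, and stands in for that with an in-memory `DataFrame.dropna(how="all")`. The expected frame is written with float literals here (an assumption made for this sketch) so that only the index differs, not the dtypes.

```python
import numpy as np
import pandas as pd

df_with_missing = pd.DataFrame(
    {"col1": [0, np.nan, 2], "col2": [1, np.nan, np.nan]}
)
# Float literals are an assumption of this sketch, so dtypes line up with
# the frame above and the comparison isolates the index difference.
df_without_missing = pd.DataFrame({"col1": [0.0, 2.0], "col2": [1.0, np.nan]})

# Stand-in for the round trip (assumption): drop the all-NaN row in memory.
simulated = df_with_missing.dropna(how="all")

# The surviving rows keep their original labels 0 and 2, whereas the
# expected frame has a fresh RangeIndex(start=0, stop=2, step=1).
labels = list(simulated.index)
index_matches = simulated.index.equals(df_without_missing.index)

# Discarding the old labels makes the two frames compare equal.
values_match = simulated.reset_index(drop=True).equals(df_without_missing)

print(labels, index_matches, values_match)
```

If the reloaded frame behaves like `simulated`, calling `reset_index(drop=True)` on it before comparing may serve as a workaround until the round trip itself is fixed.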
@arw2019 arw2019 added Bug Needs Triage Issue that has not been reviewed by a pandas team member IO HDF5 read_hdf, HDFStore Dtype Conversions Unexpected or buggy dtype conversions Index Related to the Index class or subclasses and removed Dtype Conversions Unexpected or buggy dtype conversions labels Nov 4, 2020
@mroeschke mroeschke removed the Needs Triage Issue that has not been reviewed by a pandas team member label Aug 14, 2021