In pandas 0.10.0 I cannot use HDF5 compression when storing sparse series for which all values are "sparsified"/censored (i.e. the same as fill_value).
Compression works fine if there's at least one non-sparse value in each series of a sparse DataFrame, and DataFrames with series that are completely sparse can be stored in HDF5 without compression.
But combining the two, as in case 4 below, breaks:
    import pandas as pd
    import numpy as np

    # make sparse dataframe
    df = pd.DataFrame(np.random.binomial(n=1, p=.01, size=(1e4, 1e2))).to_sparse(fill_value=0)

    # case 1: store uncompressed (works)
    store1 = pd.HDFStore('sparse_uncompressed.h5')
    store1['sparse_df'] = df
    store1.close()

    # case 2: store compressed (works)
    store2 = pd.HDFStore('sparse_compressed.h5', complib='zlib', complevel=9)
    store2['sparse_df'] = df
    store2.close()

    # set one series to be completely sparse
    df[0] = np.zeros(1e4)

    # case 3: store df with completely sparse series uncompressed (works)
    store3 = pd.HDFStore('sparser_uncompressed.h5')
    store3['sparse_df'] = df
    store3.close()

    # case 4: try storing df with completely sparse series compressed (fails)
    store4 = pd.HDFStore('sparser_compressed.h5', complib='zlib', complevel=9)
    store4['sparse_df'] = df
    store4.close()
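To see ahead of time which columns fall into the "completely sparse" category, a check along these lines may help. This is only a minimal sketch in plain Python: `fully_sparse_columns` and the small `data` mapping are hypothetical stand-ins for the DataFrame above, not part of pandas, and the comparison assumes a non-NaN fill_value such as 0 (NaN never compares equal to itself, so a NaN fill_value would need `math.isnan` instead).

```python
def fully_sparse_columns(columns, fill_value=0):
    """Return the names of columns whose values all equal fill_value.

    `columns` is a hypothetical mapping of column name -> sequence of values,
    standing in for a dense DataFrame before it is converted to sparse.
    """
    return [
        name
        for name, values in columns.items()
        if all(v == fill_value for v in values)
    ]


# Small stand-in for the DataFrame in the report: column 0 is all zeros,
# the other columns contain at least one non-fill value.
data = {
    0: [0, 0, 0, 0],
    1: [0, 1, 0, 0],
    2: [0, 0, 0, 1],
}

affected = fully_sparse_columns(data, fill_value=0)
print(affected)
```

Columns reported by this check are the ones that are completely sparse, which is the condition that makes the compressed write in case 4 fail.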
The resulting error comes from tables (the PyTables package):