Cannot deserialize pandas SparseDataFrame #18230

Closed
asfimport opened this issue Mar 6, 2018 · 6 comments

Comments

@asfimport

import pyarrow
import pandas
a = pandas.SparseDataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
pyarrow.deserialize(pyarrow.serialize(a).to_buffer())
Traceback (most recent call last):
  File "", line 1, in
  File "serialization.pxi", line 441, in pyarrow.lib.deserialize
  File "serialization.pxi", line 404, in pyarrow.lib.deserialize_from
  File "serialization.pxi", line 257, in pyarrow.lib.SerializedPyObject.deserialize
  File "serialization.pxi", line 174, in pyarrow.lib.SerializationContext._deserialize_callback
  File ".../.virtualenv/arrow/lib/python3.6/site-packages/pyarrow/serialization.py", line 77, in _deserialize_pandas_dataframe
    return pdcompat.serialized_dict_to_dataframe(data)
  File ".../.virtualenv/arrow/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 450, in serialized_dict_to_dataframe
    for block in data['blocks']]
  File ".../.virtualenv/arrow/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 450, in
    for block in data['blocks']]
  File ".../.virtualenv/arrow/lib/python3.6/site-packages/pyarrow/pandas_compat.py", line 478, in _reconstruct_block
    block = _int.make_block(block_arr, placement=placement)
  File ".../.virtualenv/arrow/lib/python3.6/site-packages/pandas/core/internals.py", line 2957, in make_block
    return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
  File ".../.virtualenv/arrow/lib/python3.6/site-packages/pandas/core/internals.py", line 120, in __init__
    len(self.mgr_locs)))
ValueError: Wrong number of items passed 3, placement implies 1

Reporter: Mitar / @mitar
Assignee: Licht Takeuchi / @Licht-T

Note: This issue was originally created as ARROW-2273. Please see the migration documentation for further details.

@asfimport

Licht Takeuchi / @Licht-T:
SparseDataFrame is planned to be deprecated in pandas.
pandas-dev/pandas#19239

@asfimport

Uwe Korn / @xhochy:
Should we then simply check for that on the serialize side and raise an error?

@asfimport

Licht Takeuchi / @Licht-T:
Yes, I'll do that.
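
For illustration, a serialize-side check of the kind suggested above could look roughly like the sketch below. This is only a sketch under assumptions, not the code that was merged: `is_sparse_dataframe` and `check_serializable` are hypothetical names, and the local `SparseDataFrame` class is a stand-in so the snippet runs without pandas or pyarrow installed. It matches on the class name rather than importing `pandas.SparseDataFrame`.

```python
class SparseDataFrame:
    """Hypothetical stand-in for pandas.SparseDataFrame (illustration only)."""


def is_sparse_dataframe(obj):
    # Match on the class name across the MRO, so the check needs no pandas
    # import and also catches subclasses of SparseDataFrame.
    return any(cls.__name__ == "SparseDataFrame" for cls in type(obj).__mro__)


def check_serializable(obj):
    """Hypothetical guard: reject sparse frames before serialization starts."""
    if is_sparse_dataframe(obj):
        raise NotImplementedError(
            "Serializing SparseDataFrame is not supported"
        )
    return obj


try:
    check_serializable(SparseDataFrame())
except NotImplementedError as exc:
    print("rejected:", exc)
```

Failing early like this would give the user a clear message at serialize time instead of the `ValueError` from `make_block` at deserialize time.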

@asfimport

Mitar / @mitar:
Isn't it still open for debate whether it will be deprecated?

@asfimport

Licht Takeuchi / @Licht-T:
@mitar,

Yes, it is still there. SparseDataFrame is a naive implementation and has many bugs. I've spent a lot of time fixing these, but it is hard to fix them all. IMO, this is not the right time to support SparseDataFrame in pyarrow.

@asfimport

Uwe Korn / @xhochy:
Issue resolved by pull request 1997
#1997
