[BUG-REPORT] describe_categorical in interchange columns is a tuple, not a dict #2113

Closed
honno opened this issue Jul 5, 2022 · 1 comment · Fixed by #2150

In the interchange protocol, describe_categorical should return a dict (note that the type annotation in the spec's API is faulty), but Vaex returns a tuple:

return ordered, is_dictionary, mapping

This prevents interchanging dataframes with categorical columns, for example with pandas-dev/pandas#46141:

>>> import numpy as np
>>> import vaex
>>> df = vaex.from_items(("foo", np.asarray([4, 2, 1, 3, 3], dtype="int8")))
>>> df = df.categorize("foo")
>>> from pandas.api.exchange import from_dataframe
>>> from_dataframe(df)
.../pandas/core/exchange/from_dataframe.py:184, in categorical_column_to_series(col)
    169 """
    170 Convert a column holding categorical data to a pandas Series.
    171 
   (...)
    180     that keeps the memory alive.
    181 """
    182 categorical = col.describe_categorical
--> 184 if not categorical["is_dictionary"]:
    185     raise NotImplementedError("Non-dictionary categoricals not supported yet")
    187 mapping = categorical["mapping"]
TypeError: tuple indices must be integers or slices, not str
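
The mismatch can be reproduced without Vaex or pandas. The sketch below is only an illustration: `TupleColumn`, `DictColumn`, and `read_mapping` are hypothetical names, the mapping values are made up, and the `"is_ordered"` key is assumed from the protocol spec (the traceback above only confirms `"is_dictionary"` and `"mapping"`). `read_mapping` mimics the key-based access the consumer performs in the traceback.

```python
# Illustrative mapping from category codes to labels (made-up values).
MAPPING = {0: 1, 1: 2, 2: 3, 3: 4}


class TupleColumn:
    """Hypothetical stand-in for the reported behaviour: returns a tuple."""

    @property
    def describe_categorical(self):
        ordered, is_dictionary, mapping = False, True, MAPPING
        return ordered, is_dictionary, mapping


class DictColumn:
    """Hypothetical stand-in for the expected behaviour: returns a dict."""

    @property
    def describe_categorical(self):
        # "is_ordered" is an assumed key name; the other two keys are the
        # ones the consumer in the traceback looks up.
        return {"is_ordered": False, "is_dictionary": True, "mapping": MAPPING}


def read_mapping(col):
    """Access the description by key, as the consumer in the traceback does."""
    categorical = col.describe_categorical
    if not categorical["is_dictionary"]:
        raise NotImplementedError("Non-dictionary categoricals not supported yet")
    return categorical["mapping"]


print(read_mapping(DictColumn()))  # key lookup works on the dict

try:
    read_mapping(TupleColumn())
except TypeError as exc:
    print("TypeError:", exc)  # a tuple cannot be indexed with a string key
```

Returning a dict with these keys from the column's describe_categorical property would let key-based consumers proceed; whether the eventual fix takes exactly this shape is not stated in this report.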