DataFrame has incorrect dtype after .compute() when group by categorical column #6134
Labels
dataframe
good first issue
If you join a Dask DataFrame on a categorical column, the column in the resulting Dask DataFrame still reports category dtype. However, as soon as you call .compute() on the result, the column has the wrong dtype and is no longer categorical.

Tested on Dask 2.14.0 and pandas 1.0.3.
In this example the category values look like floats, so after .compute() the dtype is float.
If the categorical column looks like a float, the Dask DataFrame reports "category" dtype, but after .compute() the pandas DataFrame reports float dtype.

Similarly, if the categorical column holds strings that do not look like integers, the Dask DataFrame reports "category" dtype, but after .compute() the pandas DataFrame reports object dtype.