ENH: support .astype('category') on DataFrame / aka co-factorization #12860
We don't allow an astype of a DataFrame to category directly.
Instead, you can apply the astype per column.
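As a minimal sketch of the per-column approach (the frame and its values here are made up for illustration, not taken from the thread), each column is converted on its own and so infers its own set of categories:

```python
import pandas as pd

# Hypothetical example data: two string columns with overlapping values.
df = pd.DataFrame({"A": list("aabcda"), "B": list("bcdaae")})

# Convert each column independently; every column infers its own categories.
converted = df.apply(lambda col: col.astype("category"))

cats_a = list(converted["A"].cat.categories)
cats_b = list(converted["B"].cat.categories)
print(cats_a)
print(cats_b)
```

Column `A` never sees the value `'e'`, so its categories end up different from those of `B`.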
But if you have 'similar' categories, then you would usually want this done automatically.
This is fairly straightforward to actually implement, and I think it is a nice, easy way of coding, w/o having to actually support 2D categoricals internally (and we are moving away from internal 2-D structures anyhow).
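One way the shared-categories result could be produced by hand is sketched below. It assumes a pandas version that provides `pd.CategoricalDtype` (newer than the release this thread was written against), and the sample frame is invented for illustration:

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"A": list("aabcda"), "B": list("bcdaae")})

# Gather every distinct value across all columns, sorted for a stable order.
uniques = sorted(pd.unique(df.to_numpy().ravel()))

# One dtype object shared by every column.
shared = pd.CategoricalDtype(categories=uniques)
co_factorized = df.apply(lambda col: col.astype(shared))

print(list(co_factorized["A"].cat.categories))
print(co_factorized["A"].dtype == co_factorized["B"].dtype)
```

No 2D categorical is involved here: each column is still an ordinary 1D categorical, and the columns just happen to share the same dtype.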
I'm not so sure what you are proposing here? That
For my usecase
My usecase is more:
well, you would oftentimes do this on a sub-set I think, e.g.
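A sketch of the sub-set case, under the same assumptions as before (`pd.CategoricalDtype` is available; the column names and the extra numeric column `C` are hypothetical):

```python
import pandas as pd

# Hypothetical example data: only A and B should become categorical.
df = pd.DataFrame({"A": list("aabcda"), "B": list("bcdaae"), "C": range(6)})
subset = ["A", "B"]

# Build the shared categories from the selected columns only.
uniques = sorted(pd.unique(df[subset].to_numpy().ravel()))
shared = pd.CategoricalDtype(categories=uniques)

# Replace just the selected columns; C keeps its original dtype.
result = df.assign(**{name: df[name].astype(shared) for name in subset})
print(result.dtypes)
```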
the reason I bring this up is whether we should form the uniques FIRST before conversions, IOW
As opposed to individually create them per-column.
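The practical difference between the two strategies shows up in the integer codes. The sketch below uses invented data and assumes `pd.CategoricalDtype` is available; it compares the code assigned to the label `'b'` in each column under both approaches:

```python
import pandas as pd

# Hypothetical example data: 'b' appears in both columns.
df = pd.DataFrame({"A": ["b", "c", "b"], "B": ["a", "b", "a"]})

# Strategy 1: categories created individually per column.
per_column = df.apply(lambda col: col.astype("category"))

# Strategy 2: form the uniques first, then convert every column with them.
shared = pd.CategoricalDtype(sorted(pd.unique(df.to_numpy().ravel())))
uniques_first = df.apply(lambda col: col.astype(shared))

# Code of 'b' in column A (row 0) and in column B (row 1).
b_per_column = (int(per_column["A"].cat.codes.iloc[0]),
                int(per_column["B"].cat.codes.iloc[1]))
b_uniques_first = (int(uniques_first["A"].cat.codes.iloc[0]),
                   int(uniques_first["B"].cat.codes.iloc[1]))
print(b_per_column, b_uniques_first)
```

With per-column categories, the same label can map to different codes in different columns. With the uniques formed first, the codes agree across columns, which is what makes them directly comparable.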
Here is a complete example
Note this can actually be implemented in a more performant way via https://github.com/pandas-dev/pandas/blob/master/pandas/core/reshape/merge.py#L1453