Allow categoricals/enums backed by smaller int dtypes #13109
Labels
A-dtype-categorical
Area: categorical data type
accepted
Ready for implementation
enhancement
New feature or an improvement of an existing feature
Description
MATLAB's categorical arrays are always backed by the smallest integer type that can support the number of categories. While this isn't necessarily the best implementation (adding a new category after the limit is reached requires an upcast), it helps a lot with performance when the number of categories is small. For example, using a u8 to represent your categories can help with space. I was wondering what people thought about either:
pl.Categorical(dtype=pl.UInt8)
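To make the "smallest integer type" idea concrete, here is a minimal pure-Python sketch of how a backing width could be chosen and when an upcast would be needed. The helper names `smallest_uint_bits` and `needs_upcast` are hypothetical and are not part of Polars or MATLAB. The sketch also assumes no code is reserved for nulls, which a real implementation might handle differently.

```python
# Hypothetical helpers for illustration only; not part of any library.

def smallest_uint_bits(n_categories: int) -> int:
    """Return the narrowest unsigned width (in bits) that can index n_categories codes."""
    if n_categories < 0:
        raise ValueError("n_categories must be non-negative")
    for bits in (8, 16, 32, 64):
        # An unsigned integer with `bits` bits can represent 2**bits distinct codes.
        # Assumption: no code is reserved as a null sentinel.
        if n_categories <= 2 ** bits:
            return bits
    raise OverflowError("too many categories for a 64-bit code")


def needs_upcast(current_bits: int, n_categories_after_add: int) -> bool:
    """True if adding categories pushes the count past what current_bits can index."""
    return smallest_uint_bits(n_categories_after_add) > current_bits


print(smallest_uint_bits(3))    # a handful of categories fits in 8 bits
print(smallest_uint_bits(257))  # one past the 8-bit limit needs 16 bits
print(needs_upcast(8, 257))     # so an 8-bit backing would have to be upcast
```

With one byte per row instead of four, a column with few categories would use roughly a quarter of the memory for its codes, at the cost of an upcast once the category count exceeds 256.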