machow/catfact

catfact

Categorical wrangling for Python. Supports both Polars and Pandas. Enables categorical and ordinal scales in plotting tools like Plotnine.

catfact addresses some common challenges when working with categorical data. Categorical data is useful when you want to display your data in a specific order, such as alphabetical, most frequent first, or along a scale. It is a port of the popular R package forcats.

Installation

pip install catfact

Basic example

import polars as pl
import catfact as fct
from catfact.polars.data import starwars

# Count the rows for each eye color, most common first
(
    starwars
    .group_by("eye_color")
    .agg(pl.len())
    .sort("len", descending=True)
)
shape: (15, 2)

eye_color        len
str              u32
"brown"          21
"blue"           19
"yellow"         11
"black"          10
"orange"         8
…                …
"white"          1
"pink"           1
"blue-gray"      1
"green, yellow"  1
"dark"           1

from plotnine import ggplot, aes, geom_bar, coord_flip

(
    ggplot(starwars, aes("eye_color"))
    + geom_bar()
    + coord_flip()
)

# Reorder the eye_color categories by frequency, then pipe the
# data frame into ggplot with >>
(
    starwars
    .with_columns(
        fct.infreq(pl.col("eye_color"))
    )
    >> ggplot(aes("eye_color"))
    + geom_bar()
    + coord_flip()
)
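
To make the idea behind frequency ordering concrete, here is a minimal sketch in plain Python. It does not use catfact and is not its implementation. It assumes that `fct.infreq` orders categories from most to least frequent, as forcats' `fct_infreq` does. The `levels_by_frequency` helper and the sample list are hypothetical, made up for illustration.

```python
from collections import Counter

# A small made-up sample, not the real starwars data.
eye_colors = ["brown", "blue", "brown", "yellow", "blue", "brown", "pink"]


def levels_by_frequency(values):
    """Return the distinct values, most frequent first.

    Hypothetical helper for illustration only. Ties keep the order
    in which the values first appear in the input.
    """
    return [level for level, _count in Counter(values).most_common()]


levels = levels_by_frequency(eye_colors)
print(levels)  # ['brown', 'blue', 'yellow', 'pink']
```

A plotting tool that respects this category order would draw the bars in the order of `levels` rather than in its default order.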