# Polars unit

A Polars plugin that adds support for physical units.


This package is still in the early phases of development. Feature status:

- [x] use expressions transparently on numeric columns
- [ ] functions that take an argument (e.g. `clip`) that is not an expression
- [ ] propagate physical units when combining different units

The value and its physical unit are stored together as a struct.

## Example

Call `unit.with_("m")` on a `Series` to specify its unit of measure.

In [1]:
import polars as pl
import polars_unit as plu

df = pl.DataFrame({
    "distance": pl.Series([1.0, 2.0, 3.0]).unit.with_("m"),
    "time": pl.Series([1.0, 2.0, 3.0]).unit.with_("s"),
})
df

distance,time
struct[2],struct[2]
"{1.0,""m""}","{1.0,""s""}"
"{2.0,""m""}","{2.0,""s""}"
"{3.0,""m""}","{3.0,""s""}"


You can apply functions to the underlying numeric column by calling `.unit.<func>` on an expression.

In [2]:
df.select(pl.col("distance").unit.mean())

distance
struct[2]
"{2.0,""m""}"


`<func>` can be any expression function that is supported on a numeric column. It also works with functions that take two columns.

In [6]:
df.with_columns(
    dist_neg = pl.col("distance").unit.neg(),
    dist_dist = pl.col("distance").unit.add(pl.col("distance"))
)

distance,time,dist_neg,dist_dist
struct[2],struct[2],struct[2],struct[2]
"{1.0,""m""}","{1.0,""s""}","{-1.0,""m""}","{2.0,""m""}"
"{2.0,""m""}","{2.0,""s""}","{-2.0,""m""}","{4.0,""m""}"
"{3.0,""m""}","{3.0,""s""}","{-3.0,""m""}","{6.0,""m""}"


For basic arithmetic, you need to use `.unit` on at least one operand, because `pl.Series` cannot be subclassed.

In [7]:
df.with_columns(
    dist_squared = pl.col("distance").unit * pl.col("distance") 
)

distance,time,dist_neg,dist_dist,dist_squared
struct[2],struct[2],struct[2],struct[2],struct[2]
"{1.0,""m""}","{1.0,""s""}","{-1.0,""m""}","{2.0,""m""}","{1.0,""m""}"
"{2.0,""m""}","{2.0,""s""}","{-2.0,""m""}","{4.0,""m""}","{4.0,""m""}"
"{3.0,""m""}","{3.0,""s""}","{-3.0,""m""}","{6.0,""m""}","{9.0,""m""}"


The plugin rejects operations between different units. Computing the appropriate resulting unit is planned.

In [17]:
try:
    df.with_columns(
        speed = pl.col("distance").unit / pl.col("time")
    )
except Exception as e:
    print(e)

the plugin failed with message: Expected units to be the same, got Scalar { dtype: String, value: StringOwned("m") } and Scalar { dtype: String, value: StringOwned("s") }


## Details

The plugin is implemented as a Rust Polars plugin.

A *unit* `Series` is stored as a Struct with two fields:

- `value`: a numeric column
- `unit`: a string (soon to be an Enum) with the unit

Polars does not yet support extension dtypes, so this implementation detail is visible to the user.

The core of the plugin unpacks the `value` from the given series, applies the original expression, and then repacks the result into a *unit* Series.

### Unit system

We need a runtime unit system, so we can't use the `uom` crate, which checks units at compile time.

The design is inspired by Julia's Unitful.jl.
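To illustrate what checking units at runtime could look like, here is a small Python sketch that represents a dimension as integer exponents over base dimensions. It is not the plugin's design and not Unitful.jl's API. The `Dimension` class, its methods, and `check_same_dimension` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    """Hypothetical runtime dimension: a name plus exponents over base dimensions."""

    name: str
    exponents: tuple  # sorted ((base, power), ...) pairs, zero powers dropped

    @staticmethod
    def of(name: str, **powers: int) -> "Dimension":
        pairs = tuple(sorted((b, p) for b, p in powers.items() if p != 0))
        return Dimension(name, pairs)

    def _combine(self, other: "Dimension", sign: int, name: str) -> "Dimension":
        merged = dict(self.exponents)
        for base, power in other.exponents:
            merged[base] = merged.get(base, 0) + sign * power
        return Dimension.of(name, **merged)

    def __mul__(self, other: "Dimension") -> "Dimension":
        return self._combine(other, +1, f"{self.name}*{other.name}")

    def __truediv__(self, other: "Dimension") -> "Dimension":
        return self._combine(other, -1, f"{self.name}/{other.name}")

    def same_as(self, other: "Dimension") -> bool:
        # Names are labels only; two dimensions match when their exponents match.
        return self.exponents == other.exponents


def check_same_dimension(a: Dimension, b: Dimension) -> None:
    """Raise at runtime when an operation like addition mixes dimensions."""
    if not a.same_as(b):
        raise ValueError(f"Expected dimensions to be the same, got {a.name} and {b.name}")


length = Dimension.of("length", L=1)
time = Dimension.of("time", T=1)
speed = length / time  # derived dimension with exponents L=1, T=-1
```

Because the exponents are plain data, the check happens when the operation runs rather than at compile time, which is the property the plugin needs.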

We need:

- Dimension: a physical dimension (e.g. speed, length), which can be derived from base dimensions. It has a name.
- uni