We can add columns whose values come from conditional logic on existing columns, much like a SQL CASE statement or IF/THEN/ELSE. In Polars this is done with `pl.when`.

In [1]:
import polars as pl
import pathlib
path_to_data = pathlib.Path("data/titanic.csv")
path_to_data.exists()

True

In [3]:
df = pl.scan_csv(path_to_data)
df.head()

Suppose we want a column called `firstClass` that is `1` if `Pclass == 1` and `0` otherwise. You could do that like this:

In [5]:
(
    df
    .select([
        pl.col("Pclass"),
        pl.when(
            pl.col("Pclass") == 1
        )
        .then(1)
        .otherwise(0)
        .alias("firstClass")
    ]).head(2).collect()  # first two rows; `fetch` is deprecated in newer Polars releases
)

Pclass,firstClass
i64,i32
3,0
1,1


The full syntax is:
```python
pl.when(**Boolean Expression**)
.then(**Value if True**)
.otherwise(**Value if False**)
.alias(**New Column Name**)
```

In [7]:
# here is an example of a combined condition for creating a new column
(
    df
    .select([
        pl.col("Pclass"),
        pl.col("Age"),
        pl.when(
            (pl.col("Pclass") == 1) & (pl.col("Age") < 30)
        )
        .then(1)
        .otherwise(0)
        .alias("youngFirstClass")
    ]).collect().tail(5)
)

Pclass,Age,youngFirstClass
i64,f64,i32
2,27.0,0
1,19.0,1
3,,0
1,26.0,1
3,32.0,0


To add more conditions, chain further `.when(...)` and `.then(...)` pairs before the final `.otherwise(...)`:

In [9]:
# here is an example of chaining several conditions to create a new column
(
    df
    .select([
        pl.col("Pclass"),
        pl.col("Age"),
        pl.when(
            (pl.col("Pclass") == 1) & (pl.col("Age") < 30)
        )
        .then(1)
        .when(
            (pl.col("Pclass") == 1) & (pl.col("Age") >= 30)
        )
        .then(2)
        .otherwise(0)
        .alias("ageClass")
    ]).collect().head(5)
)

Pclass,Age,ageClass
i64,f64,i32
3,22.0,0
1,38.0,2
3,26.0,0
1,35.0,2
3,35.0,0
