## Selecting columns 5: Transforming and adding multiple columns
By the end of this lesson you will be able to:
- transform multiple columns in-place
- add multiple columns
- transform and add multiple columns based on dtype

In [None]:
import polars as pl

In [None]:
csvFile = "../data/titanic.csv"

In [None]:
df = pl.read_csv(csvFile)
df.head(3)

## Transforming existing columns

We can transform existing columns by passing a `list` of expressions to `with_columns`. An expression that keeps the name of an existing column overwrites that column in the output.

In [None]:
df = pl.read_csv(csvFile)
(
    df
    .with_columns(
        [
            pl.col('Fare').round(0),
            pl.col('Age').round(0)
        ]
    )
    .head(3)
)

## Adding new columns from existing columns
Similarly, we can add new columns derived from existing columns by giving each expression a new output name with `alias`. The original columns are left unchanged.

In [None]:
df = pl.read_csv(csvFile)
(
    df
    .with_columns(
        [
            pl.col('Fare').round(0).alias('roundFare'),
            pl.col('Age').round(0).alias('roundAge')
        ]
    )
    .select(
        ['Age','Fare','roundFare','roundAge']
    )
    .head(3)
)

## Adding multiple new columns with `with_columns`
We pass a `list` of expressions to `with_columns` to create multiple new columns in a single call.

In [None]:
df = pl.read_csv(csvFile)
(
    df
    .with_columns(
        [
            (2*pl.col('Fare')).alias('DoubleFare'),
            pl.col('Fare').log(base=10).alias('LogFare')
        ]
    )
    .select(['Fare','DoubleFare','LogFare'])
    .head(2)
)

## Transforming multiple columns based on dtype
We can apply the same transformation to all columns of the same dtype by passing the dtype to `pl.col`.

In this example we use the `str.to_uppercase` method, which we will see more of in the Text section.

In [None]:
df = pl.read_csv(csvFile)
(
    df
    .with_columns(
        pl.col(pl.Utf8).str.to_uppercase()
    )
    .select(
        pl.col(pl.Utf8)
    )
    .head(2)
)

## Exercises

In the exercises you will develop your understanding of:
- adding multiple columns
- transforming multiple columns based on dtype

## Exercise 1: Adding multiple columns
Add:
- a `familySize` column as the sum of the siblings and spouses, the parents and children, and the passenger themselves
- a Boolean `overThirty` column showing whether a passenger is aged 30 or over

In [None]:
df = pl.read_csv(csvFile)
(
    df
    <blank>
    .head()
)

## Exercise 2: Transform columns based on dtype
Convert all of the floating point columns to integer dtype.

In [None]:
df = pl.read_csv(csvFile)
(
    df
    <blank>
    .head()
)

## Solutions

## Solution to Exercise 1: Adding multiple columns

In [None]:
df = pl.read_csv(csvFile)
(
    df
    .with_columns(
        [
            (pl.col("SibSp")+pl.col("Parch")+1).alias("familySize"),
            (pl.col("Age")>=30).alias("overThirty")
        ]
    )
    .head()
)

## Solution to Exercise 2: Transform columns based on dtype

In [None]:
df = pl.read_csv(csvFile)
(
    df
    .with_columns(
        pl.col(pl.Float64).cast(pl.Int64)
    )
    .head()
)