# Pivoting and unpivoting

In [2]:
import polars as pl
import numpy as np

## Pivot

Pivoting makes a `DataFrame` wider: the distinct values of one column become new columns.

In [3]:
sales_data = pl.DataFrame({
    'date': ['2022-01-01', '2022-01-02', '2022-01-01', '2022-01-02','2022-01-03'],
    'region': ['East', 'West', 'East', 'West','West'],
    'bike_type': ['Mountain', 'Mountain', 'Road', 'Road','Mountain'],
    'sales': [100, 200, 300, 400,500]
})
sales_data

date,region,bike_type,sales
str,str,str,i64
"""2022-01-01""","""East""","""Mountain""",100
"""2022-01-02""","""West""","""Mountain""",200
"""2022-01-01""","""East""","""Road""",300
"""2022-01-02""","""West""","""Road""",400
"""2022-01-03""","""West""","""Mountain""",500


Put the column (or columns) to keep fixed as row identifiers in `index`.

Put the column whose distinct values should become the new column names in `on`.

`values` is the column that supplies the cell values for those new columns.

In [4]:
sales_data.pivot(
    index="date",
    on="bike_type",  # each distinct value in this column becomes its own column
    values="sales"
)

date,Mountain,Road
str,i64,i64
"""2022-01-01""",100,300.0
"""2022-01-02""",200,400.0
"""2022-01-03""",500,


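When several rows share the same `index` and `on` values, they land in the same cell of the pivoted `DataFrame`. The `aggregate_function` argument tells Polars how to combine them; with the default of `None`, Polars raises an error if it finds more than one value for a cell.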

In [5]:
sales_data.pivot(
    index="date",
    on="bike_type",
    values="sales",
    aggregate_function="mean"
)

date,Mountain,Road
str,f64,f64
"""2022-01-01""",100.0,300.0
"""2022-01-02""",200.0,400.0
"""2022-01-03""",500.0,


Pivoting on multiple columns

When `on` is a list of columns, each new column name combines the corresponding values inside `{}`, for example `{"East","Mountain"}`.

In [6]:
sales_data.pivot(
    index="date",
    on=["region", "bike_type"],
    values="sales",
    aggregate_function="first"
).select(
    "date", '{"East","Mountain"}'
)

date,"{""East"",""Mountain""}"
str,i64
"""2022-01-01""",100.0
"""2022-01-02""",
"""2022-01-03""",


### Pivots and aggregation
When there are multiple values in the original `DataFrame` that correspond to a position in the pivoted `DataFrame` then Polars must aggregate them.

In [7]:
sales_data

date,region,bike_type,sales
str,str,str,i64
"""2022-01-01""","""East""","""Mountain""",100
"""2022-01-02""","""West""","""Mountain""",200
"""2022-01-01""","""East""","""Road""",300
"""2022-01-02""","""West""","""Road""",400
"""2022-01-03""","""West""","""Mountain""",500


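`aggregate_function` accepts the name of a built-in aggregation, including:
- `first`
- `sum`
- `max`
- `min`
- `mean`
- `median`
- `last`
- `count` (recent Polars releases call this `len`)

It also accepts an expression built on `pl.element()`, as in the next cell.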

In [8]:
sales_data.pivot(
    index="date",
    on=["region", "bike_type"],
    values="sales",
    aggregate_function=pl.element().quantile(0.75, interpolation="linear")
)

date,"{""East"",""Mountain""}","{""West"",""Mountain""}","{""East"",""Road""}","{""West"",""Road""}"
str,f64,f64,f64,f64
"""2022-01-01""",100.0,,300.0,
"""2022-01-02""",,200.0,,400.0
"""2022-01-03""",,500.0,,


In [9]:
sales_data.pivot(
    index="date",
    on=["region", "bike_type"],
    values="sales",
    aggregate_function="mean",
    sort_columns=True # order lexically
)

date,"{""East"",""Mountain""}","{""East"",""Road""}","{""West"",""Mountain""}","{""West"",""Road""}"
str,f64,f64,f64,f64
"""2022-01-01""",100.0,300.0,,
"""2022-01-02""",,,200.0,400.0
"""2022-01-03""",,,500.0,


### Pivot in lazy mode?
In lazy mode Polars has to know the schema (column names and dtypes) at each stage of a query plan.

However, the column names produced by a `pivot` depend on the data and cannot be known in advance, which means `pivot` is not available in lazy mode.

The workaround is to `collect` the query, call `pivot` on the resulting `DataFrame`, and then call `lazy` to resume in lazy mode.

## Unpivoting

Unpivoting makes a `DataFrame` longer: several columns are stacked into a pair of name and value columns.

In [10]:
sales_pv = sales_data.pivot(
    index="date",
    on="bike_type",
    values="sales",
    aggregate_function="mean",
)

sales_pv

date,Mountain,Road
str,f64,f64
"""2022-01-01""",100.0,300.0
"""2022-01-02""",200.0,400.0
"""2022-01-03""",500.0,


Put the identifier (metadata) columns to keep fixed in `index`.

Put the columns to be stacked into a single column in `on`.

In [11]:
sales_pv.unpivot(
    on=["Mountain", "Road"],
    index="date"
)

date,variable,value
str,str,f64
"""2022-01-01""","""Mountain""",100.0
"""2022-01-02""","""Mountain""",200.0
"""2022-01-03""","""Mountain""",500.0
"""2022-01-01""","""Road""",300.0
"""2022-01-02""","""Road""",400.0
"""2022-01-03""","""Road""",


The column names listed in `on` become the values of the `variable` column in the unpivoted `DataFrame`, and their contents go into the `value` column.

To unpivot every column that is not listed in `index`, omit `on`.

In [12]:
sales_pv.unpivot(
    index="date"
)

date,variable,value
str,str,f64
"""2022-01-01""","""Mountain""",100.0
"""2022-01-02""","""Mountain""",200.0
"""2022-01-03""","""Mountain""",500.0
"""2022-01-01""","""Road""",300.0
"""2022-01-02""","""Road""",400.0
"""2022-01-03""","""Road""",


Set the names of the variable and value columns with `variable_name` and `value_name`.

In [13]:
sales_pv.unpivot(
    index="date",
    variable_name="bike_type",
    value_name="sales"
)

date,bike_type,sales
str,str,f64
"""2022-01-01""","""Mountain""",100.0
"""2022-01-02""","""Mountain""",200.0
"""2022-01-03""","""Mountain""",500.0
"""2022-01-01""","""Road""",300.0
"""2022-01-02""","""Road""",400.0
"""2022-01-03""","""Road""",


### Unpivot in lazy mode?

`unpivot` supports lazy mode as the new column names along with their dtypes can be known in advance.

### Unstacking

Another way to transform from long to wide format is called `unstack`.

Unlike `pivot`, where Polars first does a `group_by` to get the pivot keys, the `unstack` method works off an integer `step` argument.

For example, with `step=2` and `how="vertical"` Polars:
- goes through the column and finds the first two values
- creates the first column of the new `DataFrame` from these two values
- gets the next two values in that column
- creates another column with these two values
- repeats this process for each column

In [14]:
sales_pv.unpivot(
    index="date"
).unstack(step=2, how="vertical")

date_0,date_1,date_2,variable_0,variable_1,variable_2,value_0,value_1,value_2
str,str,str,str,str,str,f64,f64,f64
"""2022-01-01""","""2022-01-03""","""2022-01-02""","""Mountain""","""Mountain""","""Road""",100.0,500.0,400.0
"""2022-01-02""","""2022-01-01""","""2022-01-03""","""Mountain""","""Road""","""Road""",200.0,300.0,


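With `how="horizontal"` the values fill the new columns row by row, so `step` sets the number of new columns per original column.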

In [15]:
sales_pv.unpivot(
    index="date"
).unstack(step=2, how="horizontal")

date_0,date_1,variable_0,variable_1,value_0,value_1
str,str,str,str,f64,f64
"""2022-01-01""","""2022-01-02""","""Mountain""","""Mountain""",100.0,200.0
"""2022-01-03""","""2022-01-01""","""Mountain""","""Road""",500.0,300.0
"""2022-01-02""","""2022-01-03""","""Road""","""Road""",400.0,


`unstack` is typically faster than `pivot` because it only needs the integer step and skips the `group_by`.

## Exercises

### Exercise 1

In [16]:
sales_df = (
    pl.read_parquet("data/bike_sales.parquet")
    .with_columns(
        pl.col("date").dt.year().alias("year")
    )
)
sales_df.head(3)

date,customer age,customer gender,country,sub category,order quantity,unit cost,unit price,cost,revenue,year
date,i64,str,str,str,i64,i64,i64,i64,i64,i32
2013-01-28,31,"""M""","""Australia""","""Mountain Bikes""",1,1912,3400,1912,2856,2013
2015-01-28,31,"""M""","""Australia""","""Mountain Bikes""",1,1912,3400,1912,2856,2015
2013-07-22,31,"""M""","""Australia""","""Mountain Bikes""",1,1912,3400,1912,2856,2013


Pivot the data to have a year on each row and a column for each `sub category`. 

Aggregate by getting the sum of the `order quantity`. 

Ensure the years are in ascending order

In [17]:
sales_df.pivot(
    on="sub category",
    index="year",
    values="order quantity",
    aggregate_function="sum"
).sort("year")

year,Mountain Bikes,Road Bikes,Touring Bikes
i32,i64,i64,i64
2011,1245,4015,0
2012,1230,4124,0
2013,2088,2797,825
2014,1724,1856,1024
2015,3124,4202,1230
2016,2581,2777,1569


We want to visualize this data as a time series with Plotly, so unpivot the pivoted `DataFrame` and assign the result to `annual_sales_df`.

In [18]:
annual_sales_df = sales_df.pivot(
    on="sub category",
    index="year",
    values="order quantity",
    aggregate_function="sum"
).sort("year").unpivot(
    index="year"
)

annual_sales_df

year,variable,value
i32,str,i64
2011,"""Mountain Bikes""",1245
2012,"""Mountain Bikes""",1230
2013,"""Mountain Bikes""",2088
2014,"""Mountain Bikes""",1724
2015,"""Mountain Bikes""",3124
…,…,…
2012,"""Touring Bikes""",0
2013,"""Touring Bikes""",825
2014,"""Touring Bikes""",1024
2015,"""Touring Bikes""",1230


We can now plot the output using `px.line` in Plotly

In [19]:
import plotly.express as px

px.line(
    data_frame=annual_sales_df,
    x="year",
    y="value",
    color="variable"
)

### Exercise 2

In [20]:
fake_news_df = pl.DataFrame({
    'publication': ['The Daily Deception', 'Faux News Network', 'The Fabricator', 'The Misleader', 
                     'The Hoax Herald', ],
    'date': ['2022-01-01', '2022-01-03', '2022-01-04', '2022-01-05', '2022-01-06', 
             ],
    'title': ['Scientists Discover New Species of Flying Elephant', 
              'Aliens Land on Earth and Offer to Solve All Our Problems', 
              'Study Shows That Eating Pizza Every Day Leads to Longer Life', 
              'New Study Finds That Smoking is Good for You', 
              "World's Largest Iceberg Discovered in Florida"],
    'text': ['In a groundbreaking discovery, scientists have found a new species of elephant that can fly. The flying elephants, which were found in the Amazon rainforest, have wings that span over 50 feet and can reach speeds of up to 100 miles per hour. This is a game-changing discovery that could revolutionize the field of zoology.',
             'In a historic moment for humanity, aliens have landed on Earth and offered to solve all our problems. The extraterrestrial visitors, who arrived in a giant spaceship that landed in Central Park, have advanced technology that can cure disease, end hunger, and reverse climate change. The world is waiting to see how this incredible offer will play out.',
             'A new study has found that eating pizza every day can lead to a longer life. The study, which was conducted by a team of Italian researchers, looked at the eating habits of over 10,000 people and found that those who ate pizza regularly lived on average two years longer than those who didn\'t. The study has been hailed as a breakthrough in the field of nutrition.',
             'In a surprising twist, a new study has found that smoking is actually good for you. The study, which was conducted by a team of British researchers, looked at the health outcomes of over 100,000 people and found that those who smoked regularly had lower rates of heart disease and cancer than those who didn\'t. The findings have sparked controversy among health experts.',
             'In a bizarre turn of events, the world\'s largest iceberg has been discovered in Florida. The iceberg, which is over 100 miles long and 50 miles wide, was found off the coast of Miami by a group of tourists on a whale-watching tour. Scientists are baffled by the discovery and are scrambling to figure out how an iceberg of this size could have']
})
fake_news_df

publication,date,title,text
str,str,str,str
"""The Daily Deception""","""2022-01-01""","""Scientists Discover New Specie…","""In a groundbreaking discovery,…"
"""Faux News Network""","""2022-01-03""","""Aliens Land on Earth and Offer…","""In a historic moment for human…"
"""The Fabricator""","""2022-01-04""","""Study Shows That Eating Pizza …","""A new study has found that eat…"
"""The Misleader""","""2022-01-05""","""New Study Finds That Smoking i…","""In a surprising twist, a new s…"
"""The Hoax Herald""","""2022-01-06""","""World's Largest Iceberg Discov…","""In a bizarre turn of events, t…"


Begin by:
- converting the text to lowercase and splitting the text by whitespace
- adding a new column called `placeholder` with 1 as a placeholder value

In [21]:
fake_news_df.with_columns(
    pl.col("text").str.to_lowercase().str.split(" "),
    placeholder = pl.lit(1)
)

publication,date,title,text,placeholder
str,str,str,list[str],i32
"""The Daily Deception""","""2022-01-01""","""Scientists Discover New Specie…","[""in"", ""a"", … ""zoology.""]",1
"""Faux News Network""","""2022-01-03""","""Aliens Land on Earth and Offer…","[""in"", ""a"", … ""out.""]",1
"""The Fabricator""","""2022-01-04""","""Study Shows That Eating Pizza …","[""a"", ""new"", … ""nutrition.""]",1
"""The Misleader""","""2022-01-05""","""New Study Finds That Smoking i…","[""in"", ""a"", … ""experts.""]",1
"""The Hoax Herald""","""2022-01-06""","""World's Largest Iceberg Discov…","[""in"", ""a"", … ""have""]",1


Explode the lists in the `text` column

In [22]:
fake_news_df.with_columns(
    pl.col("text").str.to_lowercase().str.split(" "),
    placeholder = pl.lit(1)
).explode("text")

publication,date,title,text,placeholder
str,str,str,str,i32
"""The Daily Deception""","""2022-01-01""","""Scientists Discover New Specie…","""in""",1
"""The Daily Deception""","""2022-01-01""","""Scientists Discover New Specie…","""a""",1
"""The Daily Deception""","""2022-01-01""","""Scientists Discover New Specie…","""groundbreaking""",1
"""The Daily Deception""","""2022-01-01""","""Scientists Discover New Specie…","""discovery,""",1
"""The Daily Deception""","""2022-01-01""","""Scientists Discover New Specie…","""scientists""",1
…,…,…,…,…
"""The Hoax Herald""","""2022-01-06""","""World's Largest Iceberg Discov…","""of""",1
"""The Hoax Herald""","""2022-01-06""","""World's Largest Iceberg Discov…","""this""",1
"""The Hoax Herald""","""2022-01-06""","""World's Largest Iceberg Discov…","""size""",1
"""The Hoax Herald""","""2022-01-06""","""World's Largest Iceberg Discov…","""could""",1


Pivot the output so that the article metadata is preserved on each row and the remainder of the columns indicate if the column name is present in the text of that article. 

Ensure the column names are sorted

In [23]:
fake_news_df.with_columns(
    pl.col("text").str.to_lowercase().str.split(" "),
    placeholder = pl.lit(1)
).explode("text").pivot(
    on="text",
    index=["publication", "date", "title"],
    values="placeholder",
    sort_columns=True,
    aggregate_function="max"
)

publication,date,title,"10,000",100,"100,000",50,a,actually,advanced,aliens,all,amazon,among,an,and,are,arrived,as,at,ate,average,baffled,been,bizarre,breakthrough,british,by,can,cancer,central,change.,climate,coast,conducted,controversy,could,…,spaceship,span,sparked,species,speeds,study,"study,",surprising,team,technology,than,that,the,this,those,to,tour.,tourists,turn,"twist,",two,up,"visitors,",waiting,was,were,whale-watching,which,who,"wide,",will,wings,world,world's,years,you.,zoology.
str,str,str,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,…,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32
"""The Daily Deception""","""2022-01-01""","""Scientists Discover New Specie…",,1.0,,1.0,1,,,,,1.0,,,1,,,,,,,,,,,,,1.0,,,,,,,,1.0,…,,1.0,,1.0,1.0,,,,,,,1.0,1,1.0,,1.0,,,,,,1.0,,,,1.0,,1.0,,,,1.0,,,,,1.0
"""Faux News Network""","""2022-01-03""","""Aliens Land on Earth and Offer…",,,,,1,,1.0,1.0,1.0,,,,1,,1.0,,,,,,,,,,,1.0,,1.0,1.0,1.0,,,,,…,1.0,,,,,,,,,1.0,,1.0,1,1.0,,1.0,,,,,,,1.0,1.0,,,,,1.0,,1.0,,1.0,,,,
"""The Fabricator""","""2022-01-04""","""Study Shows That Eating Pizza …",1.0,,,,1,,,,,,,,1,,,1.0,1.0,1.0,1.0,,1.0,,1.0,,1.0,1.0,,,,,,1.0,,,…,,,,,,1.0,1.0,,1.0,,1.0,1.0,1,,1.0,1.0,,,,,1.0,,,,1.0,,,1.0,1.0,,,,,,1.0,,
"""The Misleader""","""2022-01-05""","""New Study Finds That Smoking i…",,,1.0,,1,1.0,,,,,1.0,,1,,,,1.0,,,,,,,1.0,1.0,,1.0,,,,,1.0,1.0,,…,,,1.0,,,1.0,1.0,1.0,1.0,,1.0,1.0,1,,1.0,,,,,1.0,,,,,1.0,,,1.0,1.0,,,,,,,1.0,
"""The Hoax Herald""","""2022-01-06""","""World's Largest Iceberg Discov…",,1.0,,1.0,1,,,,,,,1.0,1,1.0,,,,,,1.0,1.0,1.0,,,1.0,,,,,,1.0,,,1.0,…,,,,,,,,,,,,,1,1.0,,1.0,1.0,1.0,1.0,,,,,,1.0,,1.0,1.0,,1.0,,,,1.0,,,


Replace the `null` values with 0

In [24]:
fake_news_df.with_columns(
    pl.col("text").str.to_lowercase().str.split(" "),
    placeholder = pl.lit(1)
).explode("text").pivot(
    on="text",
    index=["publication", "date", "title"],
    values="placeholder",
    sort_columns=True,
    aggregate_function="max"
).fill_null(value=0)

publication,date,title,"10,000",100,"100,000",50,a,actually,advanced,aliens,all,amazon,among,an,and,are,arrived,as,at,ate,average,baffled,been,bizarre,breakthrough,british,by,can,cancer,central,change.,climate,coast,conducted,controversy,could,…,spaceship,span,sparked,species,speeds,study,"study,",surprising,team,technology,than,that,the,this,those,to,tour.,tourists,turn,"twist,",two,up,"visitors,",waiting,was,were,whale-watching,which,who,"wide,",will,wings,world,world's,years,you.,zoology.
str,str,str,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,…,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32,i32
"""The Daily Deception""","""2022-01-01""","""Scientists Discover New Specie…",0,1,0,1,1,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,…,0,1,0,1,1,0,0,0,0,0,0,1,1,1,0,1,0,0,0,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,0,1
"""Faux News Network""","""2022-01-03""","""Aliens Land on Earth and Offer…",0,0,0,0,1,0,1,1,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,1,0,1,1,1,0,0,0,0,…,1,0,0,0,0,0,0,0,0,1,0,1,1,1,0,1,0,0,0,0,0,0,1,1,0,0,0,0,1,0,1,0,1,0,0,0,0
"""The Fabricator""","""2022-01-04""","""Study Shows That Eating Pizza …",1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,1,1,1,1,0,1,0,1,0,1,1,0,0,0,0,0,1,0,0,…,0,0,0,0,0,1,1,0,1,0,1,1,1,0,1,1,0,0,0,0,1,0,0,0,1,0,0,1,1,0,0,0,0,0,1,0,0
"""The Misleader""","""2022-01-05""","""New Study Finds That Smoking i…",0,0,1,0,1,1,0,0,0,0,1,0,1,0,0,0,1,0,0,0,0,0,0,1,1,0,1,0,0,0,0,1,1,0,…,0,0,1,0,0,1,1,1,1,0,1,1,1,0,1,0,0,0,0,1,0,0,0,0,1,0,0,1,1,0,0,0,0,0,0,1,0
"""The Hoax Herald""","""2022-01-06""","""World's Largest Iceberg Discov…",0,1,0,1,1,0,0,0,0,0,0,1,1,1,0,0,0,0,0,1,1,1,0,0,1,0,0,0,0,0,1,0,0,1,…,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,1,1,0,0,0,0,0,1,0,1,1,0,1,0,0,0,1,0,0,0


### Exercise 3

We have a table showing electricity rates paid by a household. The rates can vary by:
- day of the week
- time of day with 00:00 overnight and 12:00 during the day

We identify each household with the `id` column.

In [25]:
pl.Config.set_tbl_rows(14)
df_rates = (
    pl.DataFrame(
        {
            "id":['A','A'],
            "Mon":[1,None],
            "Tue":[1,None],
            "Wed":[1,None],
            "Thu":[1,None],
            "Fri":[1,None],
            "Sat":[None,1],
            "Sun":[None,1],
            "00:00":[0.1,0.3],
            "12:00":[0.15,0.25]
            
        }
    )
)
df_rates

id,Mon,Tue,Wed,Thu,Fri,Sat,Sun,00:00,12:00
str,i64,i64,i64,i64,i64,i64,i64,f64,f64
"""A""",1.0,1.0,1.0,1.0,1.0,,,0.1,0.15
"""A""",,,,,,1.0,1.0,0.3,0.25


Use `unpivot` and `pivot` (and other methods) to transform `df_rates` to the following `DataFrame` where each row is a (day-of-week,time-of-day) pair and each column is the rate for each `id` in that period

In [26]:
target_df = pl.DataFrame(
    [
        {"weekday": "Mon", "variable": "00:00", "A": 0.1},
        {"weekday": "Tue", "variable": "00:00", "A": 0.1},
        {"weekday": "Wed", "variable": "00:00", "A": 0.1},
        {"weekday": "Thu", "variable": "00:00", "A": 0.1},
        {"weekday": "Fri", "variable": "00:00", "A": 0.1},
        {"weekday": "Sat", "variable": "00:00", "A": 0.3},
        {"weekday": "Sun", "variable": "00:00", "A": 0.3},
        {"weekday": "Mon", "variable": "12:00", "A": 0.15},
        {"weekday": "Tue", "variable": "12:00", "A": 0.15},
        {"weekday": "Wed", "variable": "12:00", "A": 0.15},
        {"weekday": "Thu", "variable": "12:00", "A": 0.15},
        {"weekday": "Fri", "variable": "12:00", "A": 0.15},
        {"weekday": "Sat", "variable": "12:00", "A": 0.25},
        {"weekday": "Sun", "variable": "12:00", "A": 0.25},
    ]
)
target_df

weekday,variable,A
str,str,f64
"""Mon""","""00:00""",0.1
"""Tue""","""00:00""",0.1
"""Wed""","""00:00""",0.1
"""Thu""","""00:00""",0.1
"""Fri""","""00:00""",0.1
"""Sat""","""00:00""",0.3
"""Sun""","""00:00""",0.3
"""Mon""","""12:00""",0.15
"""Tue""","""12:00""",0.15
"""Wed""","""12:00""",0.15


In [30]:
df_rates.unpivot(
    index=["id", "00:00", "12:00"],
    variable_name="weekday"
)

id,00:00,12:00,weekday,value
str,f64,f64,str,i64
"""A""",0.1,0.15,"""Mon""",1.0
"""A""",0.3,0.25,"""Mon""",
"""A""",0.1,0.15,"""Tue""",1.0
"""A""",0.3,0.25,"""Tue""",
"""A""",0.1,0.15,"""Wed""",1.0
"""A""",0.3,0.25,"""Wed""",
"""A""",0.1,0.15,"""Thu""",1.0
"""A""",0.3,0.25,"""Thu""",
"""A""",0.1,0.15,"""Fri""",1.0
"""A""",0.3,0.25,"""Fri""",


In [31]:
df_rates.unpivot(
    index=["id", "00:00", "12:00"],
    variable_name="weekday"
).filter(
    pl.col("value").is_not_null()
).drop("value")

id,00:00,12:00,weekday
str,f64,f64,str
"""A""",0.1,0.15,"""Mon"""
"""A""",0.1,0.15,"""Tue"""
"""A""",0.1,0.15,"""Wed"""
"""A""",0.1,0.15,"""Thu"""
"""A""",0.1,0.15,"""Fri"""
"""A""",0.3,0.25,"""Sat"""
"""A""",0.3,0.25,"""Sun"""


In [32]:
df_rates.unpivot(
    index=["id", "00:00", "12:00"],
    variable_name="weekday"
).filter(
    pl.col("value").is_not_null()
).drop("value").unpivot(
    index=["id", "weekday"],
)

id,weekday,variable,value
str,str,str,f64
"""A""","""Mon""","""00:00""",0.1
"""A""","""Tue""","""00:00""",0.1
"""A""","""Wed""","""00:00""",0.1
"""A""","""Thu""","""00:00""",0.1
"""A""","""Fri""","""00:00""",0.1
"""A""","""Sat""","""00:00""",0.3
"""A""","""Sun""","""00:00""",0.3
"""A""","""Mon""","""12:00""",0.15
"""A""","""Tue""","""12:00""",0.15
"""A""","""Wed""","""12:00""",0.15


In [28]:
df_rates.unpivot(
    index=["id", "00:00", "12:00"],
    variable_name="weekday"
).filter(
    pl.col("value").is_not_null()
).drop("value").unpivot(
    index=["id", "weekday"],
).pivot(
    index=["weekday", "variable"],
    on="id",
    values="value"
)

weekday,variable,A
str,str,f64
"""Mon""","""00:00""",0.1
"""Tue""","""00:00""",0.1
"""Wed""","""00:00""",0.1
"""Thu""","""00:00""",0.1
"""Fri""","""00:00""",0.1
"""Sat""","""00:00""",0.3
"""Sun""","""00:00""",0.3
"""Mon""","""12:00""",0.15
"""Tue""","""12:00""",0.15
"""Wed""","""12:00""",0.15
