### London Weather

I've been learning a little bit of polars this week. The syntax is reasonably straightforward coming from what I know of pandas and PySpark, so I thought I'd take a weather dataset and see how I could transform it to use some of the conditional formatting (the .data_color method) that the great_tables library allows.

The finished table can be switched from displaying one attribute to another (i.e. changing which column is aggregated: cloud cover, sunshine, temperature, etc.) by changing the "measure" variable. With some work, I expect this could be embedded in a Dash or Streamlit app to act as a dashboard (maybe one for the future).
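
As a rough sketch of how one variable can drive the table's labels, the snippet below looks up a title and subtitle for a chosen measure and fails early on an unknown name. `MEASURE_LABELS` and `header_text` are hypothetical names used only for illustration, and the dictionary is a two-entry subset of the lookup used in the notebook.

```python
# Hypothetical two-entry subset of the measure lookup used in this notebook.
MEASURE_LABELS = {
    "sunshine": ("Sunshine", "sunshine (hours)"),
    "min_temp": ("Minimum Temperature", "minimum temperature (°C)"),
}


def header_text(measure):
    """Return (title, subtitle) for a measure, rejecting unknown names."""
    if measure not in MEASURE_LABELS:
        raise ValueError(
            f"Unknown measure {measure!r}; expected one of {sorted(MEASURE_LABELS)}"
        )
    title, unit_text = MEASURE_LABELS[measure]
    return (
        f"London {title} Measurements",
        f"Mean {unit_text} measurements in London (1980-2020)",
    )


print(header_text("sunshine"))
```

Checking the name up front is just one design choice; it means a typo in the measure fails with a readable message rather than a KeyError part-way through building the table.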

great_tables was upgraded to v0.3.0 the other day, so I've made use of the new .cols_width method and a dict comprehension to give my columns even widths.
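
The even-width trick is plain Python, so it can be tried without building a table. This is a minimal sketch: `even_widths` is a hypothetical helper, and the column list is a shortened stand-in for the pivoted data frame's columns.

```python
def even_widths(columns, width="50px", skip=1):
    """Map every column after the first `skip` columns to the same CSS width."""
    return {col: width for col in columns[skip:]}


# Shortened stand-in for the pivoted frame's columns (Year plus the months).
columns = ["Year", "January", "February", "March"]
widths = even_widths(columns)
print(widths)  # {'January': '50px', 'February': '50px', 'March': '50px'}
```

A dictionary of this shape is what the notebook passes as `cases=` to .cols_width, leaving the Year column at its default width.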

The data sources used in this project are:

- [CSV of London Weather Data source](https://www.kaggle.com/datasets/emmanuelfwerr/london-weather-data/data)
- [Original London Weather source](https://www.ecad.eu/dailydata/)

In [None]:
import great_tables
from great_tables import GT, loc, style, html, md
import polars as pl

def month_number_to_name(month_number):
    month_names = ["January", "February", "March", "April", "May", "June",
                   "July", "August", "September", "October", "November", "December"]
    return month_names[month_number - 1]

df = pl.read_csv("london_weather.csv")
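
The next cell casts the `date` column to a string and parses it with the `%Y%m%d` format, which implies the dates are stored as integers such as 19790101. The standard-library sketch below shows the same parsing step on single values, assuming that storage format; `split_date` and `MONTH_NAMES` are hypothetical names used only here.

```python
from datetime import datetime

MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]


def split_date(raw):
    """Turn an integer like 19790101 into a (year, month name) pair."""
    parsed = datetime.strptime(str(raw), "%Y%m%d")
    return parsed.year, MONTH_NAMES[parsed.month - 1]


print(split_date(19790101))  # (1979, 'January')

# Only years divisible by five are kept for the table.
year, _ = split_date(19800615)
print(year % 5 == 0)  # True
```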

In [37]:
# cloud_cover, sunshine, global_radiation, max_temp, mean_temp, min_temp, precipitation, pressure, snow_depth

measure = "min_temp"

measure_dict = {"cloud_cover": ["Cloud Cover", "cloud cover (oktas)"],
    "sunshine": ["Sunshine", "sunshine (hours)"],
    "global_radiation": ["Global Radiation", "global radiation (W/m^2)"],
    "max_temp": ["Maximum Temperature", "maximum temperature (°C)"],
    "mean_temp": ["Mean Temperature", "mean temperature (°C)"],
    "min_temp": ["Minimum Temperature", "minimum temperature (°C)"],
    "precipitation": ["Precipitation", "precipitation (mm)"],
    "pressure": ["Pressure", "pressure (Pa)"],
    "snow_depth": ["Snow Depth", "snow depth (cm)"]}

# Parse the integer YYYYMMDD dates once, keep every fifth year,
# and average the chosen measure per year and month.
monthly = (df
        .with_columns(date=pl.col("date").cast(str).str.strptime(pl.Date, "%Y%m%d"))
        .with_columns(Year=pl.col("date").dt.year(),
                    month=pl.col("date").dt.month())
        .filter(pl.col("Year") % 5 == 0)
        .group_by(["Year", "month"]).agg(pl.mean(measure).round(2))
        .sort("Year", "month")
        )

# One row per year, one column per month (January to December).
mean_measure = (monthly
        .with_columns(month=pl.col("month").map_elements(month_number_to_name))
        .pivot(index="Year", columns="month", values=measure)
        )

# Smallest and largest monthly means, used as the colour scale limits.
min_val = monthly.select(pl.col(measure).min())
max_val = monthly.select(pl.col(measure).max())

(GT(mean_measure)
.tab_header(
    title=f"London {measure_dict[measure][0]} Measurements",
    subtitle=f"Mean {measure_dict[measure][1]} measurements in London (1980-2020)",
)
.opt_align_table_header(align="center")
.cols_label(January=md("<center>Jan</center>"), February=md("<center>Feb</center>"), March=md("<center>Mar</center>"), 
            April=md("<center>Apr</center>"), May=md("<center>May</center>"), June=md("<center>Jun</center>"), 
            July=md("<center>Jul</center>"), August=md("<center>Aug</center>"), September=md("<center>Sep</center>"), 
            October=md("<center>Oct</center>"), November=md("<center>Nov</center>"), December=md("<center>Dec</center>"))
.cols_width(
    cases={col: "50px" for col in mean_measure.columns[1:]}
    )
.tab_style(
    style=[
        style.text(align="center", size="12px")
    ],
    locations=loc.body(columns=mean_measure.columns[1:])
)
.data_color(
        domain=[max_val.item(), min_val.item()],
        palette=["rebeccapurple", "white", "orange"],
        na_color="white",
        columns=mean_measure.columns[1:]
    )
.tab_source_note(
        source_note=html('Reference: European Climate Assessment &amp; Dataset (<a href="https://www.ecad.eu/dailydata/">https://www.ecad.eu/dailydata/</a>)')
    )
)

**London Minimum Temperature Measurements**

Mean minimum temperature (°C) measurements in London (1980-2020)

| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1980 | -0.14 | 2.9 | 2.42 | 5.35 | 7.44 | 10.85 | 11.88 | 13.32 | 12.01 | 5.76 | 4.52 | 2.7 |
| 1985 | -1.83 | 0.12 | 1.28 | 5.27 | 8.22 | 9.92 | 13.18 | 11.94 | 11.36 | 8.75 | 1.22 | 5.19 |
| 1990 | 4.37 | 5.29 | 4.81 | 4.21 | 9.0 | 10.83 | 13.2 | 14.67 | 9.98 | 9.63 | 5.13 | 2.49 |
| 1995 | 2.23 | 4.39 | 2.0 | 6.05 | 8.18 | 10.84 | 15.2 | 15.62 | 10.73 | 10.17 | 5.14 | 0.97 |
| 2000 | 2.41 | 3.78 | 4.93 | 5.37 | 9.57 | 12.81 | 12.82 | 14.0 | 12.52 | 8.52 | 4.5 | 4.57 |
| 2005 | 3.79 | 2.52 | 4.49 | 6.17 | 8.51 | 12.88 | 14.12 | 13.01 | 12.73 | 11.25 | 3.52 | 1.91 |
| 2010 | -0.31 | 1.71 | 3.68 | 5.55 | 7.67 | 12.13 | 15.07 | 13.18 | 11.24 | 8.33 | 3.95 | -1.48 |
| 2015 | 1.63 | 1.75 | 4.1 | 5.95 | 8.76 | 11.4 | 13.84 | 14.07 | 10.16 | 9.26 | 7.95 | 8.85 |
| 2020 | 5.24 | 5.32 | 4.17 | 6.57 | 9.32 | 12.71 | 13.54 | 15.74 | 11.58 | 9.04 | 7.03 | 4.19 |
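
The `domain` passed to .data_color gives the two endpoints that the palette is stretched across. As an illustration of that idea only (this is not great_tables' actual implementation), a linear rescale onto the 0 to 1 range could look like the sketch below. `rescale` is a hypothetical helper, and the two endpoints are the coldest and warmest cells in the table above.

```python
def rescale(value, domain):
    """Linearly map `value` onto 0..1 across domain = [start, end]."""
    start, end = domain
    if start == end:
        raise ValueError("domain needs two distinct endpoints")
    return (value - start) / (end - start)


# Coldest (January 1985) and warmest (August 2020) cells in the table.
low, high = -1.83, 15.74

print(rescale(low, [low, high]))   # 0.0
print(rescale(high, [low, high]))  # 1.0

# In this sketch, swapping the endpoints flips the direction of the scale.
print(rescale(low, [high, low]))   # 1.0
```

Under that assumption, the order of the two domain values decides which end of the palette the cold months land on, so it may be worth trying both orders when the colours look inverted.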
