## Recode Open-Meteo weather codes into custom clusters

WMO Weather interpretation codes (WW) ([source](https://open-meteo.com/en/docs))

| Code | Description |
|------|-------------|
| 0 | Clear sky |
| 1, 2, 3 | Mainly clear, partly cloudy, and overcast |
| 45, 48 | Fog and depositing rime fog |
| 51, 53, 55 | Drizzle: Light, moderate, and dense intensity |
| 56, 57 | Freezing Drizzle: Light and dense intensity |
| 61, 63, 65 | Rain: Slight, moderate and heavy intensity |
| 66, 67 | Freezing Rain: Light and heavy intensity |
| 71, 73, 75 | Snow fall: Slight, moderate, and heavy intensity |
| 77 | Snow grains |
| 80, 81, 82 | Rain showers: Slight, moderate, and violent |
| 85, 86 | Snow showers: Slight and heavy |
| 95 * | Thunderstorm: Slight or moderate |
| 96, 99 * | Thunderstorm with slight and heavy hail |

(*) Thunderstorm forecast with hail is only available in Central Europe

In [16]:
import pandas as pd

In [17]:
file_path = 'data/processed/weather_hourly_all_locations_2023.parquet'
weather = pd.read_parquet(file_path)

In [18]:
weather_clusters = {
    "clear_and_cloudy": [0, 1, 2, 3],
    "precipitation": [51, 53, 55, 61, 63, 65, 80, 81, 82],
    "frozen_precipitation": [56, 57, 66, 67, 71, 73, 75, 77, 85, 86],
    "low_visibility": [45, 48],
    "severe_weather": [95, 96, 99]
}

code_to_cluster = {}
for cluster_name, weather_codes in weather_clusters.items():
    for code in weather_codes:
        code_to_cluster[code] = cluster_name
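
Because the flattening loop silently lets a later cluster overwrite an earlier one, it can help to check that every documented code lands in exactly one cluster. The following is a minimal sketch of such a check, not part of the original notebook; `documented_codes`, `duplicated`, `missing`, and `extra` are names introduced here for illustration, and the cluster dictionary is repeated only so the snippet runs on its own.

```python
from collections import Counter

# Copy of the cluster definition from the cell above, repeated so this check is self-contained.
weather_clusters = {
    "clear_and_cloudy": [0, 1, 2, 3],
    "precipitation": [51, 53, 55, 61, 63, 65, 80, 81, 82],
    "frozen_precipitation": [56, 57, 66, 67, 71, 73, 75, 77, 85, 86],
    "low_visibility": [45, 48],
    "severe_weather": [95, 96, 99],
}

# Codes listed in the WMO table at the top of this notebook.
documented_codes = {
    0, 1, 2, 3, 45, 48, 51, 53, 55, 56, 57, 61, 63, 65, 66, 67,
    71, 73, 75, 77, 80, 81, 82, 85, 86, 95, 96, 99,
}

# Count how many clusters each code appears in.
code_counts = Counter(
    code for codes in weather_clusters.values() for code in codes
)

duplicated = sorted(code for code, n in code_counts.items() if n > 1)
missing = sorted(documented_codes - set(code_counts))
extra = sorted(set(code_counts) - documented_codes)

print("assigned to more than one cluster:", duplicated)
print("documented but not clustered:", missing)
print("clustered but not documented:", extra)
```

All three lists come back empty for the mapping above, so each documented code belongs to exactly one cluster.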

In [19]:
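def map_code_to_cluster(code):
    # Fall back to "Unknown" for codes that are not in any cluster.
    return code_to_cluster.get(code, "Unknown")

# Map through the function so unmapped codes become "Unknown" instead of NaN.
weather['weather_cluster'] = weather['weather_code'].map(map_code_to_cluster)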
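
When a pandas `Series.map` call is given a dict, any value missing from the dict becomes `NaN`, so it is worth looking for unmapped codes before saving. The sketch below illustrates this on a toy frame rather than the real data; `toy`, the code `4`, and the three-entry `code_to_cluster` subset are assumptions made up for the example.

```python
import pandas as pd

# Small subset of the real mapping, for illustration only.
code_to_cluster = {
    0: "clear_and_cloudy",
    61: "precipitation",
    95: "severe_weather",
}

# Toy data; 4 is a hypothetical code that is not in the mapping.
toy = pd.DataFrame({"weather_code": [0, 61, 95, 4]})

# Mapping with a dict leaves NaN wherever the code has no entry.
mapped = toy["weather_code"].map(code_to_cluster)

# Collect the codes that failed to map so they can be inspected.
unmapped_codes = sorted(toy.loc[mapped.isna(), "weather_code"].unique().tolist())

# Replace the NaN entries with an explicit label.
toy["weather_cluster"] = mapped.fillna("Unknown")

# Rows per cluster, including the "Unknown" bucket.
cluster_counts = toy["weather_cluster"].value_counts()

print(unmapped_codes)
print(cluster_counts)
```

Running the same `isna()` check on the real `weather` frame would show whether the data contains codes outside the table above.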

In [20]:
# Note: this overwrites the input file in place, now including the weather_cluster column.
weather.to_parquet(file_path, index=False, compression='brotli')