# Visualisations

- [Notebook Preparations](#notebook-preparations)

- [Visualisations](#visualisations)
    - Mann-Kendall Test Results
        - Month
        - Month & Elevation
        - Country

## Notebook Preparations

In [1]:
# Import Packages

import pandas as pd
import numpy as np
import plotly
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import geopandas as gpd
from shapely.geometry import Point
from pathlib import Path
import sys


In [2]:
# Directories

NB_DIR = Path.cwd()                 # Notebook Directory
REPO_ROOT = NB_DIR.parent           # Main Repo Directory
sys.path.insert(0, str(REPO_ROOT))  # Assign REPO ROOT as ROOT Directory for Import searches

# Files

# Macro-perspective Trends
avg_country = pd.read_csv(REPO_ROOT / 'Data/Cleaned/Tests/avg_country_trends.csv',index_col=False)                  # Average Snowpack Depth Trend Per Country
avg_country_month = pd.read_csv(REPO_ROOT / 'Data/Cleaned/Tests/avg_country_month_trends.csv',index_col=False)      # Average Snowpack Depth Trend Per Country Month
avg_month = pd.read_csv(REPO_ROOT / 'Data/Cleaned/Tests/avg_month_trends.csv',index_col=False)                      # Average Snowpack Depth Trend per Month
avg_elevation_month = pd.read_csv(REPO_ROOT / 'Data/Cleaned/Tests/avg_elevation_month_trends.csv',index_col=False)  # Average Snowpack Depth Trend Per Elevation Band Month

# Micro-perspective Trends
typical_country = pd.read_csv(REPO_ROOT / 'Data/Cleaned/Tests/station-month-time-series-by-country.csv',index_col=False)                # Typical Snowpack Depth Trend of Weather Station Per Country 
typical_country_month = pd.read_csv(REPO_ROOT / 'Data/Cleaned/Tests/station-month-time-series-by-country-month.csv',index_col=False )   # Typical Snowpack Depth Trend of Weather Station Per Country Month
typical_station_month = pd.read_csv(REPO_ROOT / 'Data/Cleaned/Tests/station-month-time-series.csv',index_col=False)                     # Typical Snowpack Depth Trend of Weather Station Per Month
per_station = pd.read_csv(REPO_ROOT / 'Data/Cleaned/Tests/per_station_series.csv',index_col=False)                                      # Annual Snowpack Depth Trend per Weather Station



## Visualisations

Visualisations of Statistical Trends for Average Snowpack Depths



### Station Coverage For Each Country

This chart shows, by country and winter month, the median number of stations per year that pass the analysis filters of ≥ 30 years of data and ≥ 10 stations/year. It serves as a quality check on data coverage for the country-level and month-level trend plots that follow.
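
As a rough illustration of the coverage metric described above, the sketch below counts stations per year for a single country-month and takes the median across years. The `(station_id, year)` record layout and the helper name `median_stations_per_year` are assumptions made for this example; the project's own filtering may be implemented differently.

```python
from collections import defaultdict
from statistics import median


def median_stations_per_year(records, min_years=30, min_stations=10):
    """Median yearly station count for one country-month (hypothetical helper).

    records: iterable of (station_id, year) pairs, one per station-year with data.
    """
    records = list(records)

    # Collect the distinct years reported by each station.
    years_by_station = defaultdict(set)
    for station_id, year in records:
        years_by_station[station_id].add(year)

    # Filter 1: keep stations with at least `min_years` years of data.
    kept = {s for s, yrs in years_by_station.items() if len(yrs) >= min_years}

    # Count the surviving stations in each year.
    stations_by_year = defaultdict(set)
    for station_id, year in records:
        if station_id in kept:
            stations_by_year[year].add(station_id)

    # Filter 2: keep years with at least `min_stations` reporting stations.
    counts = [len(s) for s in stations_by_year.values() if len(s) >= min_stations]
    return median(counts) if counts else None


# Toy data: 12 stations with 30 years each, plus one short record that is dropped.
toy = [(s, y) for s in range(12) for y in range(1990, 2020)]
toy += [(99, y) for y in range(1990, 2000)]
coverage = median_stations_per_year(toy)
```

Returning `None` when nothing passes the filters keeps an empty country-month from being mistaken for a coverage of zero.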

In [3]:
from Scripts.figures import country_coverage
coverage_fig = country_coverage(avg_country_month)
coverage_fig

- Austria maintains the highest and most stable coverage (≈ 358–362 stations/year) across all winter months.

- France, Slovenia and Italy have the lowest coverage (≈ 48–80 stations/year), with Italy dipping lowest in May and Nov.

- Germany and Switzerland cluster in the mid-range (≈ 145–155 stations/year) with minimal month-to-month variation.

### Country Trends

#### Distribution Of Station Slopes Per Country Month

This chart shows the distribution of station–month time-series slopes (Theil–Sen, cm/decade) by country. Each dot is one station-month series; the violin summarizes its spread. 

The black ♦ marks a macro signal, the median of the country-month slopes across the winter season (Nov–May), which captures each country’s central seasonal trend while remaining robust to outliers and month-to-month imbalance. 

The horizontal dashed blue line is zero-slope change; values below it indicate long-term declines.
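
For readers unfamiliar with the Theil–Sen estimator behind each dot, here is a minimal pure-Python sketch: the slope is the median of all pairwise slopes, scaled from cm/year to cm/decade. The function name and the toy series are illustrative assumptions, not the project's implementation.

```python
from itertools import combinations
from statistics import median


def theil_sen_slope_per_decade(years, depths_cm):
    """Median of all pairwise slopes, converted from cm/year to cm/decade."""
    points = list(zip(years, depths_cm))
    slopes = [
        (d2 - d1) / (y2 - y1)
        for (y1, d1), (y2, d2) in combinations(points, 2)
        if y2 != y1  # skip pairs from the same year
    ]
    return 10 * median(slopes)


# Toy series: a steady 1 cm/year decline with one outlier year (1992).
years = [1990, 1991, 1992, 1993, 1994]
depths = [50.0, 49.0, 80.0, 47.0, 46.0]
slope = theil_sen_slope_per_decade(years, depths)
```

Because the estimate is a median, the single outlier year barely moves it, which is why the estimator suits noisy station records.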


In [4]:
from Scripts.figures import country_station_slope_distrib
country_station_disb = country_station_slope_distrib(typical_station_month, avg_country_month)
country_station_disb

##### Interpretation
- Extreme values are visible in both positive and negative directions for France, Germany, Italy and Switzerland. All 5,309 station-month time series in this chart passed the cleaning threshold of ≥ 30 years of data and were tested with the Mann–Kendall test using the Hamed–Rao autocorrelation adjustment, but the extreme values may still warrant further review.

- All countries except Italy have a negative median Theil–Sen slope (cm/decade) for the typical regional weather station. By comparison, the country-level average Theil–Sen slope is negative for every country and falls within the interquartile range of the typical-station distribution.


#### Country-Year Mean Snowpack Series

This figure compares station-level and country-level trend results using the Mann–Kendall (MK) test with the Hamed–Rao autocorrelation adjustment and Theil–Sen slope estimates (cm/decade).

- Left panel
    - Stations: each point is a station’s Theil–Sen slope plotted against its two-sided MK p-value.

- Right panel
    - Countries: one point per country from MK applied to the country–year mean snowpack time series.

Guide lines: 
- The horizontal dashed red line marks the significance threshold (α = 0.05); points below it are statistically significant. 
- The vertical dashed blue line marks zero slope (left = declines; right = increases).
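
To make the p-value axis concrete, the sketch below implements the classic two-sided Mann–Kendall test with a normal approximation. It is a simplification: it omits the Hamed–Rao autocorrelation adjustment (which rescales the variance of S) and the tie correction, so its p-values will not match those plotted in this notebook. The function name is an assumption for illustration.

```python
import math


def mann_kendall(values):
    """Classic two-sided Mann-Kendall test (no tie or autocorrelation correction)."""
    n = len(values)

    # S = (number of increasing pairs) - (number of decreasing pairs), in time order.
    s = sum(
        (values[j] > values[i]) - (values[j] < values[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )

    # Variance of S under the null hypothesis of no trend (no ties assumed).
    var_s = n * (n - 1) * (2 * n + 5) / 18

    # Continuity-corrected normal approximation.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return s, z, p_value


# Toy series: a strictly decreasing 10-year record.
s, z, p = mann_kendall([40, 38, 37, 35, 34, 30, 29, 27, 25, 20])
significant = p <= 0.05
```

Positive autocorrelation makes the classic test overconfident, which is the motivation for the Hamed–Rao variant used throughout this analysis.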

In [5]:
from Scripts.figures import country_trends_fig
country_fig = country_trends_fig(per_station,avg_country)
country_fig

##### Interpretation

Italy, Slovenia, and Austria exhibit statistically significant decreases in country-level mean snowpack depth, with Sen slopes of roughly −2 to −4 cm per decade. Germany and Switzerland show negative but non-significant trends (about −1 cm/decade), meaning the decreases are not distinguishable from zero at α = 0.05. France shows a slight, non-significant increase. Results are from Mann–Kendall (Hamed–Rao) tests applied to country-year average snowpack series; slopes are reported in cm per decade.

### Month Trends

#### Distribution Of Station Slopes Per Month

This chart shows the distribution of Theil–Sen slopes (cm/decade) for all station–month time series. Each dot is one station-month series; the violin summarizes the spread and includes summary statistics.

The black ♦ is the month’s median from a macro-aggregated summary for that month. Diamonds below 0 indicate that the aggregated tests show a median decline for that month; the more negative, the steeper the decline.

The horizontal dashed blue line is zero-slope change; values below it indicate long-term declines.


In [6]:
from Scripts.figures import month_station_slope_distrib
month_station_disb = month_station_slope_distrib(typical_station_month, avg_month)
month_station_disb





##### Interpretation

Each violin shows the distribution of station-month Theil–Sen slopes for a given month (cm/decade). The dashed blue line marks zero slope. The black ♦ is the month-aggregate median—the median slope from the aggregated month series (i.e., one series per month created by aggregating stations and then estimating its trend).

A difference is present between the **Macro (aggregated) median (per month) ♦** and the **Station-level median (per month)** inside the box (the IQR, from Q1 to Q3) for each month. This difference indicates that macro-aggregation tends to yield different slopes than the typical station, producing an aggregation bias. The amount by which the month-aggregate median differs from the typical station's median is listed below.

- February shows the largest aggregation bias (−2.1 cm/decade): the month-aggregate median is much more negative than the typical station’s median.

- May and December also skew more negative (−1.09 and −0.83 cm/decade).

- November and March are slightly positive (+0.36 and +0.34 cm/decade), meaning the aggregate month series is a bit less negative than the typical station’s behaviour in those months.

Across months, distributions remain predominantly on the negative side of zero, consistent with overall declines in snowpack, but the magnitude of that decline depends on how you aggregate.

**Area Of Concern**
Aggregating first (then estimating a single trend) does not always equal the median of station-level trends. Seasonal changes in coverage, station heterogeneity, and nonlinearity can make the aggregate month series exaggerate winter-core declines (Feb) relative to the typical station, and occasionally mute them (Nov/Mar).

Further analysis of this aggregation bias is a natural next step for this project.
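
The aggregation bias discussed above can be reproduced on toy data. In this sketch, three hypothetical stations are summarized two ways: station-first (one slope per station, then the median) and aggregate-first (yearly mean across stations, then one slope). The station values and the `sen_slope` helper are assumptions chosen only to make the effect visible.

```python
from itertools import combinations
from statistics import mean, median


def sen_slope(years, values):
    """Theil-Sen slope per decade: median of pairwise slopes, times 10."""
    points = list(zip(years, values))
    return 10 * median(
        (v2 - v1) / (y2 - y1) for (y1, v1), (y2, v2) in combinations(points, 2)
    )


years = [2000, 2001, 2002, 2003, 2004]
stations = {
    "A": [10, 9, 8, 7, 6],        # shallow decline, 1 cm/year
    "B": [20, 19, 18, 17, 16],    # shallow decline, 1 cm/year
    "C": [100, 90, 80, 70, 60],   # deep snowpack, steep decline, 10 cm/year
}

# Station-first: one slope per station, then the median across stations.
station_median = median(sen_slope(years, v) for v in stations.values())

# Aggregate-first: average the stations within each year, then fit one slope.
yearly_mean = [mean(vals) for vals in zip(*stations.values())]
aggregate_slope = sen_slope(years, yearly_mean)

# Negative bias: the aggregate series declines faster than the typical station.
aggregation_bias = aggregate_slope - station_median
```

Here the single deep-snowpack station dominates the yearly mean, so the aggregate slope is far more negative than the slope of the typical station. Real station networks differ in many more ways, so this shows only one possible mechanism.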


#### Month Mean Snowpack Series

This figure uses the same dataframes of station-month-level and month-level trend results. Each point is the Theil–Sen slope (cm/decade) and p-value computed on the month-average snowpack time series (Nov to May), using the Hamed–Rao autocorrelation-adjusted variant of the Mann–Kendall test.

- Left panel
    - Station-Months: each point is a station-month time series Theil–Sen slope plotted against its two-sided MK p-value.

- Right panel
    - Month: one point per Month from MK applied to the Month mean snowpack time series.

Guide lines: 
- The horizontal dashed red line marks the significance threshold (α = 0.05); points below it are statistically significant. 
- The vertical dashed blue line marks zero slope (left = declines; right = increases).

In [7]:
from Scripts.figures import month_trends_fig
month_fig = month_trends_fig(typical_station_month,avg_month)
month_fig

##### Interpretation

April and May exhibit statistically significant decreases in European Alps mean snowpack depth, with Sen slopes of −1.66 and −1.10 cm per decade respectively. All other winter months (November, December, January, February and March) show negative but non-significant trends, with Sen slopes of −0.57 to −1.94 cm per decade. These decreases cannot be distinguished from a zero trend at α = 0.05.

Further analysis could investigate changes in weather patterns in the spring months (April, May) to explore what is driving the decreasing Sen slopes of mean snowpack depth.



### Country Month Heatmap

The following figure shows the median Theil-Sen slope per decade for the typical station of each country-month. 

The value on top of each heat square is the share of the station-month series that are statistically significant by the Hamed–Rao–adjusted Mann–Kendall test (two-sided, α = 0.05).

Guide lines:
- Use color to compare the magnitude and direction of the typical (median) station trend for each country-month.
- Use the percent label to gauge how widespread that statistically significant trend is across stations for that country-month.
- The denominator for each tile is the number of stations available after quality control in that country-month (hover statistic: 'n_stations').
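
The percent label on each tile can be thought of as a simple ratio. The sketch below computes it from a list of `(country, month, p_value)` tuples, one per station-month series. That layout, the placeholder country labels, and the use of `p <= alpha` as the cut-off are assumptions for illustration.

```python
from collections import defaultdict


def significant_share(series, alpha=0.05):
    """Percent of station-month series with p <= alpha, per (country, month) tile."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for country, month, p_value in series:
        key = (country, month)
        totals[key] += 1                 # denominator: stations in the tile
        hits[key] += p_value <= alpha    # numerator: significant series
    return {key: 100 * hits[key] / totals[key] for key in totals}


# Toy p-values for two hypothetical tiles.
toy = [
    ("A", "Apr", 0.01), ("A", "Apr", 0.20),
    ("B", "Nov", 0.03), ("B", "Nov", 0.40), ("B", "Nov", 0.50), ("B", "Nov", 0.90),
]
shares = significant_share(toy)
```

A high share on a tile with only a handful of stations is weak evidence, so the label is best read together with the station count.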

In [8]:
from Scripts.figures import country_month_heat
country_month_heatmap = country_month_heat(avg_country_month, typical_country_month)
country_month_heatmap


##### Interpretation

Each tile shows the **median station-level** Theil–Sen slope (cm/decade) for a given country × month.
Color encodes the slope (blue = decline, red = increase; centered at 0), and the label is the share of station-month series that are statistically significant by the Hamed–Rao–adjusted Mann–Kendall test (two-sided, α = 0.05). Hover text includes the exact slope and the number of stations contributing to that cell.


**Declines dominate**. Most tiles are blue, indicating negative trends in mean snowpack depth.

- Slovenia (Nov–Dec): Declines of 2.5 to 3.48 cm/decade with 43–64% of station series significant.
- Italy (Apr): Decline of 2.55 cm/decade with 50% of station series significant.
- France (Mar): Decline of 4.09 cm/decade with 26% of station series significant.
- Austria (Nov) and Switzerland (Apr) also show sizeable significant shares in fringe months, with declines of 3.33 and 2.64 cm/decade at 41% and 31% significance respectively.

Many cells have limited station-level significance (single-digit to low-teens %). In those months/countries, the median slope should be viewed as a descriptive signal rather than a broad, station-level consensus.

Seasonality & geography matter. The trends appear more negative during the core winter and early spring months for several countries. However, May lacks strong significance and clear station-level trends, because most of its median slopes sit at 0.00 cm/decade. This finding conflicts with the macro-aggregation trends in the previous charts and is an objective for further analysis.

### Elevation Band Heatmap

The following figure shows the Theil–Sen slope (cm/decade) for the **month-level mean snowpack depth for each Elevation Band** across the European Alps.

The calculated slope is the Theil–Sen estimate computed on the yearly mean snowpack series for each elevation band and month (i.e., the slope of the band-level mean across years).

Guide Lines:
- Use color to compare the magnitude and direction of the trend for each elevation band and month.
- The **●** black dot on top of a heat square marks cells where the Hamed–Rao–adjusted Mann–Kendall test indicates the trend is statistically significant (**p ≤ 0.05**).
- Hover Tile Details
    - **Slope** : Theil–Sen monotonic trend in cm / decade
    - **p-value** : Mann–Kendall (Hamed–Rao variant) two-sided p-value
    - **Years Of Data** : The count of years contributing to that elevation band-month estimate
    - **Median # Stations** : The median number of stations per year within that elevation band-month.

*Notes:* Treat cells with **few years** or **few stations** (see hover) with extra caution; small samples can yield unstable estimates.


In [9]:
# How many stations were removed from Data during the cleaning process in MK Testing?
#---------#

# How many distinct stations originally in each elevation band prior to MK Test?
snow_recordings = pd.read_csv(REPO_ROOT/'data/cleaned/snow_recordings.csv')
stations_per_band = (
    snow_recordings
    .dropna(subset=['station_id'])
    .groupby('elevation_band')['station_id']
    .nunique()                                  # distinct stations
    .rename('n_stations')
    .sort_values(ascending=False)
)

# The median number of stations from Elevation Band-Month series in each elevation band after cleaning during MK Test?
stations_per_band_MK = (
    avg_elevation_month[['elevation_band', 'median_stations_per_year']]
    .groupby('elevation_band')
    .median()   # median of 'median_stations_per_year' within each elevation band
)

# View comparison Pre-MK and Post-MK Testing
Elevation_band_stations = pd.merge(stations_per_band,stations_per_band_MK,how='inner',on='elevation_band')

Elevation_band_stations = (Elevation_band_stations.reset_index()
                        .rename(columns={
                                'median_stations_per_year':'Median # Stations After MK Test',
                                'n_stations':'# Stations Pre Testing'}))
Elevation_band_stations['Median # Stations After MK Test'] = Elevation_band_stations['Median # Stations After MK Test'].astype(int)

Elevation_band_stations

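|   | elevation_band | # Stations Pre Testing | Median # Stations After MK Test |
|---|----------------|------------------------|---------------------------------|
| 0 | Low Elevation  | 750                    | 431                             |
| 1 | Mid Elevation  | 604                    | 249                             |
| 2 | High Elevation | 60                     | 16                              |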


In [10]:
from Scripts.figures import elevation_band_heat
elevation_fig = elevation_band_heat(avg_elevation_month)
elevation_fig

##### Interpretation

All elevation bands show a general decline in average snowpack depth in every month, and the tiles are predominantly blue. The strongest declines are at High Elevation (> 2,000 m), where April shows the largest decrease; its magnitude is extreme even though the cell is marked significant (● p ≤ 0.05).

High Elevation has fewer contributing stations (hover shows median stations/year ≈ 13–19, along with years of data).
Data cleaning requirements during MK testing (≥ 30 station-years per time-month series) reduced station availability, especially at High Elevation (from 60 to ~16). The median stations/year in the hovers reflect this post-cleaning support. **Treat those cells as higher-variance**, as small samples can yield unstable estimates.

Late-season signals at lower bands: In May, both Mid (1,000–2,000 m) and Low (≤1,000 m) show small but significant declines (●), while earlier months at these bands are weak or non-significant.

P-values are unadjusted across 21 cells; small significant signals (especially in May at lower bands) should be interpreted with that context.
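
One way to put the unadjusted p-values in context is a multiple-testing correction. The sketch below applies the Benjamini–Hochberg procedure, which controls the false discovery rate. This notebook does not apply any such correction, and the p-values here are toy numbers, not results from the heatmap.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first

    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example with four cells; the heatmap above would pass all 21 p-values.
toy_p = [0.01, 0.04, 0.03, 0.5]
toy_adjusted = benjamini_hochberg(toy_p)
still_significant = [p <= 0.05 for p in toy_adjusted]
```

In this toy case only the smallest p-value stays below 0.05 after adjustment, which illustrates why borderline cells deserve cautious interpretation.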