# Stationbench Tutorial

This tutorial demonstrates how to use the stationbench repository to:
1. Preprocess weather forecast and ground truth data
2. Calculate verification metrics
3. Compare multiple forecasts and visualize results

This tutorial runs in a notebook environment. The same commands can be run in a terminal or script.

## Setup

First, complete the [setup guide](setup.md), then import the required packages.

In [1]:
# Authenticate with Google Cloud and restart the kernel
# !gcloud auth application-default login

In [2]:
import xarray as xr
import numpy as np
import stationbench

## 1. Data Preprocessing

Stationbench expects forecast data and ground truth observations in Zarr format. Let's look at both datasets.

### Data format

The forecast data should be a Zarr dataset with the following structure:

```
<xarray.Dataset>
Dimensions:
  - time: Forecast initialization times
  - prediction_timedelta: Forecast lead times
  - latitude: Grid latitudes
  - longitude: Grid longitudes

Coordinates:
  - latitude: (latitude) float32, grid latitudes in degrees North
  - longitude: (longitude) float32, grid longitudes in degrees East  
  - prediction_timedelta: (prediction_timedelta) timedelta64[ns], forecast lead times
  - time: (time) datetime64[ns], initialization times

Data variables:
  - 10m_wind_speed: (time, prediction_timedelta, latitude, longitude) float32
  - 2m_temperature: (time, prediction_timedelta, latitude, longitude) float32
```

The wind speed and temperature data should be in m/s and °C respectively.
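Before running a benchmark, it can help to confirm that a dataset follows the layout listed above. The following is a minimal sketch, not part of stationbench: `check_forecast_layout` is a hypothetical helper, the dataset is synthetic, and the check assumes the default variable names (`10m_wind_speed`, `2m_temperature`).

```python
import numpy as np
import pandas as pd
import xarray as xr

# Expected layout, copied from the structure listed above.
EXPECTED_DIMS = ("time", "prediction_timedelta", "latitude", "longitude")
EXPECTED_VARS = ("10m_wind_speed", "2m_temperature")


def check_forecast_layout(ds: xr.Dataset) -> list[str]:
    """Hypothetical helper: return a list of problems (empty means OK)."""
    problems = []
    for dim in EXPECTED_DIMS:
        if dim not in ds.dims:
            problems.append(f"missing dimension: {dim}")
    for var in EXPECTED_VARS:
        if var not in ds.data_vars:
            problems.append(f"missing variable: {var}")
        elif ds[var].dims != EXPECTED_DIMS:
            problems.append(f"{var} has dims {ds[var].dims}")
    return problems


# Tiny synthetic forecast: 2 init times, 3 lead times, 4 x 5 grid.
shape = (2, 3, 4, 5)
rng = np.random.default_rng(0)
toy_forecast = xr.Dataset(
    data_vars={
        name: (EXPECTED_DIMS, rng.random(shape, dtype=np.float32))
        for name in EXPECTED_VARS
    },
    coords={
        "time": pd.date_range("2022-01-01T12:00", periods=2, freq="D"),
        "prediction_timedelta": pd.to_timedelta([0, 24, 48], unit="h"),
        "latitude": np.linspace(-90, 90, 4, dtype=np.float32),
        "longitude": np.linspace(0, 270, 5, dtype=np.float32),
    },
)

layout_problems = check_forecast_layout(toy_forecast)
print(layout_problems)
```

If your variables use different names, adapt `EXPECTED_VARS` accordingly; the tutorial later passes the names through `name_10m_wind_speed` and `name_2m_temperature`.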

Let's analyze ensemble forecast data from ECMWF's Integrated Forecasting System (IFS) using the datasets provided by WeatherBench2.
We'll benchmark 10m wind speed and 2m temperature for forecasts initialized at 12:00 UTC on the first day of each month in 2022.


In [3]:
forecast = xr.open_zarr("gs://weatherbench2/datasets/ifs_ens/2018-2022-1440x721_mean.zarr")
forecast = forecast[['10m_wind_speed', '2m_temperature']]   

# select only the first day of each month
time_mask_forecast = (
    (forecast.time.dt.year == 2022) & 
    (forecast.time.dt.day == 1) & 
    (forecast.time.dt.hour == 12)
)
forecast = forecast.sel(time=forecast.time[time_mask_forecast])

# select lead times from 0 to 10 days in 24-hour steps (the data is 6-hourly, so take every 4th step)
forecast = forecast.isel(prediction_timedelta=slice(0, 41, 4))  

forecast


Output: an `xarray.Dataset` whose data variables each have shape (12, 11, 721, 1440), dtype float32 and size 522.80 MiB, stored as 132 Dask chunks of shape (1, 1, 721, 1440).
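The leading dimensions (12, 11) of the selected forecast follow from the time mask and the lead-time slice. The sketch below reproduces that arithmetic on synthetic coordinates; the 12-hourly initialization times are an assumption made for illustration, not a statement about the actual dataset.

```python
import numpy as np
import pandas as pd

# Assumed for illustration: initialization times every 12 hours through 2022.
init_times = pd.date_range(
    "2022-01-01T00:00", "2022-12-31T12:00", freq=pd.Timedelta(hours=12)
)
first_of_month = init_times[(init_times.day == 1) & (init_times.hour == 12)]

# 6-hourly lead times; the first 41 steps span 0 to 240 hours (10 days).
lead_hours = np.arange(0, 41 * 6, 6)
daily_lead_hours = lead_hours[slice(0, 41, 4)]

print(len(first_of_month), len(daily_lead_hours))
print(daily_lead_hours)
```

Twelve initialization times (one per month) and eleven daily lead times (0 to 240 hours) match the shape reported for the dataset.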


Let's also have a look at the METEOSTAT ground truth data.

In [4]:
rng = np.random.default_rng(seed=42)  # Create a random number generator instance
stations = xr.open_zarr("https://opendata.jua.sh/stationbench/meteostat_benchmark.zarr")
stations_subset = stations.isel(station_id=rng.choice(stations.station_id.size, 1000, replace=False))
stations_subset

Output: an `xarray.Dataset` with one int64 and two float32 arrays of shape (1000,), one value per sampled station, and two float32 arrays of shape (61368, 1000), 234.10 MiB each, stored as 15 Dask chunks of shape (4380, 1000).


In [5]:
# Create map using plotly
import plotly.graph_objects as go

# Create figure
fig = go.Figure()

# Add scatter points for all stations
fig.add_trace(go.Scattergeo(
    lon=stations.longitude,
    lat=stations.latitude,
    mode='markers',
    marker=dict(
        size=1.5,
        color='blue',
        opacity=0.5
    ),
    name='All stations'
))

# Add scatter points for subset
fig.add_trace(go.Scattergeo(
    lon=stations_subset.longitude,
    lat=stations_subset.latitude,
    mode='markers',
    marker=dict(
        size=3,
        color='red',
        opacity=0.5
    ),
    name='Selected subset'
))

# Update layout
fig.update_layout(
    title='Data availability',
    geo=dict(
        showland=True,
        showcountries=True,
        showocean=True,
        countrywidth=0.5,
        landcolor='rgb(243, 243, 243)',
        oceancolor='rgb(204, 229, 255)',
        projection_type='equirectangular',
        showcoastlines=True,
        coastlinewidth=0.5,
    ),
    width=1200,
    height=600,
)

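The cells above randomly sample 1000 stations from the METEOSTAT ground truth data and plot them in red on top of all available stations in blue. The subset is used only for this illustration: the metric calculations below pass the full `stations` dataset.

Unlike the forecast, the ground truth data is not a grid but unstructured point data made up of stations. Stationbench automatically aligns the gridded forecast with the station locations using linear interpolation.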
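To make the grid-to-station alignment concrete, here is a minimal sketch of linear (bilinear) interpolation from a regular latitude/longitude grid to a single point. It is a plain NumPy illustration under simplifying assumptions (ascending coordinates, point inside the grid, no wrap-around at the date line) and is not necessarily how stationbench implements the step; `bilinear_interp` is a hypothetical name.

```python
import numpy as np


def bilinear_interp(field, grid_lat, grid_lon, lat, lon):
    """Bilinearly interpolate a 2D field (lat, lon) to one point.

    Assumes ascending grid coordinates and a point inside the grid;
    wrap-around at the date line is not handled.
    """
    i = np.clip(np.searchsorted(grid_lat, lat) - 1, 0, len(grid_lat) - 2)
    j = np.clip(np.searchsorted(grid_lon, lon) - 1, 0, len(grid_lon) - 2)
    # Fractional position of the point inside its grid cell.
    wy = (lat - grid_lat[i]) / (grid_lat[i + 1] - grid_lat[i])
    wx = (lon - grid_lon[j]) / (grid_lon[j + 1] - grid_lon[j])
    south = (1 - wx) * field[i, j] + wx * field[i, j + 1]
    north = (1 - wx) * field[i + 1, j] + wx * field[i + 1, j + 1]
    return (1 - wy) * south + wy * north


grid_lat = np.array([50.0, 51.0, 52.0])
grid_lon = np.array([4.0, 5.0, 6.0])
# Synthetic field that is linear in both coordinates: value = lat + 2 * lon.
field = grid_lat[:, None] + 2.0 * grid_lon[None, :]

station_value = bilinear_interp(field, grid_lat, grid_lon, lat=50.25, lon=5.5)
print(station_value)
```

Because the synthetic field is linear in both coordinates, the interpolated value equals `lat + 2 * lon` at the station location, which makes the result easy to check by hand.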

## 2. Calculate Verification Metrics

Now we'll calculate the RMSE between the forecast and the ground truth data.
This requires the following parameters, listed here as command-line flags; in Python they are passed as keyword arguments of the same name:
- `--forecast`: Path or Xarray Dataset of forecast data (required)
- `--stations`: Path or Xarray Dataset of ground truth data (optional, defaults to METEOSTAT)
- `--start_date`: Start date for benchmarking (required)
- `--end_date`: End date for benchmarking (required)
- `--output`: Output path for benchmarks (required)
- `--region`: Region to benchmark (see `regions.py` for available regions)
- `--name_10m_wind_speed`: Name of 10m wind speed variable (optional)
- `--name_2m_temperature`: Name of 2m temperature variable (optional)
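For reference, the two error metrics that appear in the results later in this tutorial (RMSE and MBE) have standard definitions: RMSE is the square root of the mean squared forecast error, and MBE is the mean of the signed error. The sketch below applies them to synthetic numbers; how stationbench aggregates over time and stations internally is not shown here, and skipping NaN pairs is an assumption.

```python
import numpy as np


def rmse_and_mbe(forecast, observed):
    """RMSE and mean bias error over the first axis, skipping NaN pairs."""
    error = forecast - observed
    rmse = np.sqrt(np.nanmean(error**2, axis=0))
    mbe = np.nanmean(error, axis=0)
    return rmse, mbe


# Synthetic values: rows are forecast-observation pairs, columns are lead times.
forecast_values = np.array([[2.0, 5.0], [4.0, 1.0], [3.0, np.nan]])
observed_values = np.array([[1.0, 2.0], [6.0, 5.0], [3.0, 4.0]])

rmse, mbe = rmse_and_mbe(forecast_values, observed_values)
print(rmse, mbe)
```

A negative MBE means the forecast is on average lower than the observations; RMSE is always non-negative and penalizes large errors more strongly.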

In [6]:
start_date = "2022-01-01"
end_date = "2022-12-31"
output_forecast = "data/forecast_benchmark.zarr"
region = "global"
name_10m_wind_speed = "10m_wind_speed"
name_2m_temperature = "2m_temperature"

stationbench.calculate_metrics(
    forecast=forecast,
    stations=stations,
    start_date=start_date,
    end_date=end_date,
    output=output_forecast,
    region=region,
    name_10m_wind_speed=name_10m_wind_speed,
    name_2m_temperature=name_2m_temperature,
    use_dask=False
)

2025-02-21 11:04:22,007 - stationbench.calculate_metrics - INFO - Preparing stations data
2025-02-21 11:04:22,007 - stationbench.calculate_metrics - INFO - Selecting region: https://linestrings.com/bbox/#-180,-90,180,90
2025-02-21 11:04:22,327 - stationbench.calculate_metrics - INFO - Filtered stations: 14491 -> 14491
2025-02-21 11:04:22,328 - stationbench.calculate_metrics - INFO - Preparing forecast dataset
2025-02-21 11:04:22,338 - stationbench.calculate_metrics - INFO - Selecting region: https://linestrings.com/bbox/#-180,-90,180,90
2025-02-21 11:04:22,340 - stationbench.calculate_metrics - INFO - Converting longitudes from 0-360 to -180-180 range
2025-02-21 11:04:22,351 - stationbench.calculate_metrics - INFO - Renaming wind speed variable from 10m_wind_speed to 10m_wind_speed
2025-02-21 11:04:22,352 - stationbench.calculate_metrics - INFO - Renaming temperature variable from 2m_temperature to 2m_temperature
2025-02-21 11:04:22,352 - stationbench.calculate_metrics - INFO - Interpo

Output: the benchmark `xarray.Dataset`, with two float32 arrays and one int64 array of shape (14491,), one value per station, and two float32 arrays of shape (2, 11, 14491), 1.22 MiB each.

## 3. Compare Multiple Forecasts

To compare the forecast against one or more reference forecasts, set the following parameters:
- `--benchmark_datasets_locs`: Dictionary of benchmark dataset locations (required)
- `--wandb_run_name`: W&B run name (optional)
- `--regions`: Comma-separated list of regions, see `regions.py` for available regions (required)

We'll use ECMWF's HRES dataset as the reference forecast and calculate the same metrics for it.

In [7]:
reference = xr.open_zarr("gs://weatherbench2/datasets/hres/2016-2022-0012-1440x721.zarr")

# select only the first day of each month
time_mask_reference = (
    (reference.time.dt.year == 2022) & 
    (reference.time.dt.day == 1) & 
    (reference.time.dt.hour == 12)
)
reference = reference.sel(time=reference.time[time_mask_reference])


# select only lead times 0 to 10 days every 24 hours
reference = reference.isel(prediction_timedelta=slice(0, 41, 4))  

reference = reference[['10m_wind_speed', '2m_temperature']]
reference

Output: an `xarray.Dataset` whose data variables each have shape (12, 11, 721, 1440), dtype float32 and size 522.80 MiB, stored as 132 Dask chunks of shape (1, 1, 721, 1440).

In [8]:
output_reference = "data/reference_benchmark.zarr"

stationbench.calculate_metrics(
    forecast=reference,
    stations=stations,
    start_date=start_date,
    end_date=end_date,
    output=output_reference,
    region=region,
    name_10m_wind_speed=name_10m_wind_speed,
    name_2m_temperature=name_2m_temperature,
    use_dask=False
)

2025-02-21 11:06:06,885 - stationbench.calculate_metrics - INFO - Preparing stations data
2025-02-21 11:06:06,886 - stationbench.calculate_metrics - INFO - Selecting region: https://linestrings.com/bbox/#-180,-90,180,90
2025-02-21 11:06:07,065 - stationbench.calculate_metrics - INFO - Filtered stations: 14491 -> 14491
2025-02-21 11:06:07,066 - stationbench.calculate_metrics - INFO - Preparing forecast dataset
2025-02-21 11:06:07,070 - stationbench.calculate_metrics - INFO - Selecting region: https://linestrings.com/bbox/#-180,-90,180,90
2025-02-21 11:06:07,071 - stationbench.calculate_metrics - INFO - Converting longitudes from 0-360 to -180-180 range
2025-02-21 11:06:07,079 - stationbench.calculate_metrics - INFO - Renaming wind speed variable from 10m_wind_speed to 10m_wind_speed
2025-02-21 11:06:07,080 - stationbench.calculate_metrics - INFO - Renaming temperature variable from 2m_temperature to 2m_temperature
2025-02-21 11:06:07,080 - stationbench.calculate_metrics - INFO - Interpo

Output: the reference benchmark `xarray.Dataset`, with two float32 arrays and one int64 array of shape (14491,), one value per station, and two float32 arrays of shape (2, 11, 14491), 1.22 MiB each.

Let's compare our forecast against the reference forecast and visualize the results.

In [9]:
benchmark_datasets_locs = {"evaluation": "data/forecast_benchmark.zarr", "reference": "data/reference_benchmark.zarr"}
regions = "global"

stationbench.compare_forecasts(
    benchmark_datasets_locs=benchmark_datasets_locs,
    regions=regions
)

2025-02-21 11:07:45,888 - stationbench.compare_forecasts - INFO - Saving tables to stationbench-results


By default, the results are saved in the `stationbench-results` directory. To change the output directory, pass `output_dir`:
```
stationbench.compare_forecasts(
    benchmark_datasets_locs=benchmark_datasets_locs,
    regions=regions,
    output_dir="your_output_directory"
)
```

In [10]:
!ls stationbench-results/

MBE_10m_wind_speed_Mid_term_(3-7_days).html
MBE_10m_wind_speed_Short_term_(6-48_hours).html
MBE_2m_temperature_Mid_term_(3-7_days).html
MBE_2m_temperature_Short_term_(6-48_hours).html
RMSE_10m_wind_speed_Mid_term_(3-7_days).html
RMSE_10m_wind_speed_Short_term_(6-48_hours).html
RMSE_2m_temperature_Mid_term_(3-7_days).html
RMSE_2m_temperature_Short_term_(6-48_hours).html
skill_score_10m_wind_speed_Mid_term_(3-7_days).html
skill_score_10m_wind_speed_Short_term_(6-48_hours).html
skill_score_2m_temperature_Mid_term_(3-7_days).html
skill_score_2m_temperature_Short_term_(6-48_hours).html
temporal_metrics.csv


## Understanding the Results

The comparison generates several visualizations:

1. **Geographical scatter plots**:
   - RMSE values at each station location
   - RMSE skill scores comparing against reference forecasts
   - MBE values at each station location

In [11]:
import os
import webbrowser

output_path = "stationbench-results/RMSE_2m_temperature_Short_term_(6-48_hours).html"

# Get absolute path
abs_path = os.path.abspath(output_path)
webbrowser.open(f'file://{abs_path}')

print(f"To view the interactive plot, open this file in your browser:\nfile://{abs_path}")

To view the interactive plot, open this file in your browser:
file:///Users/leonie/Documents/stationbench/docs/stationbench-results/RMSE_2m_temperature_Short_term_(6-48_hours).html


2. **Time series plots**:
   - RMSE evolution over forecast lead time
   - RMSE skill score evolution over forecast lead time
   - MBE evolution over forecast lead time

In [12]:
import pandas as pd
import plotly.graph_objects as go

# Read the CSV file
df = pd.read_csv('stationbench-results/temporal_metrics.csv')

# filters to plot
metric = 'rmse'
region = 'global'
variable = '2m_temperature'

# Filter for RMSE metric and global region for 2m temperature
mask = (df['metric'] == metric) & (df['region'] == region)
eval_data = df[mask & (df['model'] == 'evaluation')][['lead_time', variable]]
ref_data = df[mask & (df['model'] == 'reference')][['lead_time', variable]]

# Create the plot
fig = go.Figure()

fig.add_trace(go.Scatter(
    x=eval_data['lead_time'],
    y=eval_data[variable],
    name='Evaluation',
    mode='lines'
))

fig.add_trace(go.Scatter(
    x=ref_data['lead_time'], 
    y=ref_data[variable],
    name='Reference',
    mode='lines'
))

fig.update_layout(
    title='RMSE Evolution Over Forecast Lead Time',
    xaxis_title='Lead Time (hours)',
    yaxis_title='RMSE (2m Temperature)',
    showlegend=True,
    template='plotly_white'
)

fig.show()
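The same table layout can be used to derive an RMSE skill score per lead time. The sketch below uses synthetic rows that mimic the columns read in the cell above (`metric`, `region`, `model`, `lead_time`, and one column per variable) and assumes the common definition `1 - RMSE_evaluation / RMSE_reference`; whether stationbench uses exactly this definition is not confirmed here.

```python
import pandas as pd

# Synthetic rows mimicking the columns read in the cell above.
toy = pd.DataFrame(
    {
        "metric": ["rmse"] * 4,
        "region": ["global"] * 4,
        "model": ["evaluation", "evaluation", "reference", "reference"],
        "lead_time": [24, 48, 24, 48],
        "2m_temperature": [1.8, 2.4, 2.0, 3.0],
    }
)

rmse_by_model = toy[toy["metric"] == "rmse"].pivot(
    index="lead_time", columns="model", values="2m_temperature"
)
# Assumed definition: 1 - RMSE_evaluation / RMSE_reference.
skill = 1.0 - rmse_by_model["evaluation"] / rmse_by_model["reference"]
print(skill)
```

With this definition, a positive score means the evaluated forecast has a lower RMSE than the reference, zero means parity, and negative values mean it is worse.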

## Optional: Log results to Weights & Biases

You can log the results to Weights & Biases by setting the following parameter:
- `--wandb_run_name`: W&B run name (optional)

To authenticate with Weights & Biases, run the following command in your terminal and follow the instructions:
```
wandb login
```


In [14]:
import datetime
wandb_run_name = f"tutorial-run-{datetime.datetime.now().strftime('%Y-%m-%d_%H-%M')}"

stationbench.compare_forecasts(
    benchmark_datasets_locs=benchmark_datasets_locs,
    regions=regions,
    wandb_run_name=wandb_run_name
)

2025-02-21 11:09:23,573 - stationbench.compare_forecasts - INFO - Saving tables to stationbench-results
2025-02-21 11:09:24,783 - stationbench.compare_forecasts - INFO - Logging metrics to WandB: WandB not available


AttributeError: 'NoneType' object has no attribute 'log'
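The `AttributeError` above follows a log line reporting that WandB is not available, which suggests (as an assumption, not verified against the package source) that the `wandb` package was missing or unusable in this environment. One defensive pattern is to pass `wandb_run_name` only when the package can be imported. The sketch below uses only the standard library; `wandb_is_installed` and the placeholder run name are hypothetical.

```python
import importlib.util


def wandb_is_installed() -> bool:
    """Return True if the `wandb` package can be found in this environment."""
    return importlib.util.find_spec("wandb") is not None


# Build keyword arguments so the run name is only passed when W&B is usable.
compare_kwargs = {}
if wandb_is_installed():
    compare_kwargs["wandb_run_name"] = "tutorial-run"  # placeholder name

print(compare_kwargs)
```

The resulting dictionary can then be unpacked into the call used earlier, for example `stationbench.compare_forecasts(benchmark_datasets_locs=benchmark_datasets_locs, regions=regions, **compare_kwargs)`. This check only detects whether the package is installed; it does not verify that `wandb login` has been completed.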