Code for the paper "Postprocessing of Ensemble Weather Forecasts Using Permutation-invariant Neural Networks" by Kevin Höhlein, Benedikt Schulz, Rüdiger Westermann and Sebastian Lerch.
The main part of the project is written in Python. The repository contains a `requirements.txt` file, from which a virtual Python environment can be built as follows:

```
python -m venv venv
source venv/bin/activate
pip install wheel
pip install -r requirements.txt
```
Parts of the evaluation rely on the availability of an R installation. We recommend using, e.g., Anaconda to install R and subsequently running R to install the required packages. The R dependencies originate from the use of baseline evaluation metrics from a third-party repository, to which we refer for further information.
The paper investigates the utility of permutation-invariant neural networks for statistical postprocessing of ensemble weather forecasts using two forecast-observation datasets, for wind-gust postprocessing (COSMO-DE dataset) and surface temperature postprocessing (EUPPBench dataset). Training your own models requires downloading parts of the data and listing the data locations in path configuration files so that they can be found by the training scripts.
The COSMO-DE data is proprietary and can only be retrieved directly from the German weather service (DWD). The EUPPBench dataset is open access and consists of two parts, the reforecast and the forecast datasets. Scripts for downloading both parts to disk in zarr format are provided in `data/euppbench` and are used as follows:

```
python data/euppbench/download_reforecasts.py --path /path/to/reforecasts
# optional if no evaluation on forecast data is intended:
python data/euppbench/download_forecasts.py --path /path/to/forecasts
```
Disk space requirements amount to roughly 10.0 GB for the reforecast data and 8.5 GB for the forecast data.
Note that the download scripts cannot be run from within the standard Python environment of the project due to package version conflicts between `climetlab` and other dependencies. A `conda` environment for the download can be built from `environment_download.yml`. For further information, we refer to the documentation of CliMetLab and EUPPBench.
Access to the downloaded data is managed automatically using `.json`-based path configuration files in `data/config`. Paths to the dataset base directories should be listed in the respective configuration files for the reforecast and forecast datasets as a mapping of host name to data path. An example configuration is shown in `example.json`. To check the host name as used by the program, run `python -c "import socket; print(socket.gethostname())"`.
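The host-based lookup can be illustrated with a short sketch; `resolve_data_path` and the example mapping are hypothetical names for illustration, not the repository's actual API:

```python
import json
import socket

def resolve_data_path(config_file):
    """Pick the data path for the current machine from a host-to-path mapping."""
    with open(config_file) as f:
        mapping = json.load(f)  # e.g. {"my-workstation": "/data/euppbench/reforecasts"}
    hostname = socket.gethostname()
    try:
        return mapping[hostname]
    except KeyError:
        raise KeyError(f"No data path configured for host {hostname!r} in {config_file}")
```

Keying the configuration by host name allows the same repository checkout to be shared across machines with different storage layouts.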
Loading the data from disk and preprocessing it for training may take a few minutes. To speed up this process in repeated training runs, a preprocessed version of the dataset can be cached and loaded directly when re-running scripts with identical data settings. To use this feature, paths to caching directories for reforecast and forecast data can be specified in `euppbench_reforecasts_cache.json` and `euppbench_forecasts_cache.json`, respectively.
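The training scripts handle this caching automatically. Purely as an illustration of the general pattern (caching preprocessed data keyed by the data settings), a minimal sketch might look as follows; `load_or_preprocess` and `preprocess_fn` are hypothetical names, not the repository's API:

```python
import hashlib
import json
import os
import pickle

def load_or_preprocess(settings, cache_dir, preprocess_fn):
    """Return a cached preprocessed dataset for identical settings, else build and cache it."""
    # hash the settings so that identical data settings map to the same cache file
    key = hashlib.sha256(json.dumps(settings, sort_keys=True).encode()).hexdigest()
    cache_file = os.path.join(cache_dir, f"{key}.pkl")
    if os.path.exists(cache_file):
        with open(cache_file, "rb") as f:
            return pickle.load(f)  # cache hit: skip preprocessing
    data = preprocess_fn(settings)  # cache miss: preprocess from raw data
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_file, "wb") as f:
        pickle.dump(data, f)
    return data
```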
Training of the baseline methods (DRN, BQN, EMOS) requires summarization and export of the data to `.csv` format. Data in the required form can be exported by running the scripts `export_reforecasts_csv.py` and `export_forecasts_csv.py`, each of which writes out two files containing the training (and validation) dataset and the test set, respectively. The target directory can be specified by passing the command-line argument `--csv-directory /path/to/target/dir` when running the script. Note that the generated training data are identical in both cases.
The main entry points for training ensemble-based models are `experiments/cosmo_de/run_training.py` and `experiments/euppbench_reforecasts/run_training.py` for COSMO-DE and EUPPBench data, respectively. The scripts are mostly identical but load different datasets internally. The examples below focus on the EUPPBench case study; training COSMO-DE models works equivalently.
- ED-DRN with truncated logistic prediction parameterization and mean-based merger:

  ```
  # --data:flt sets the intended lead time; --training:ensemble-size sets the
  # number of models for ensemble averaging; for GPU training, add --training:use-gpu
  python experiments/euppbench_reforecasts/run_training.py \
      --output:path /path/to/output/dir \
      --data:flt 6 \
      --training:batch-size 64 \
      --training:optimizer:lr 1.e-4 \
      --training:ensemble-size 10 \
      --loss:type logistic \
      --model:merger:type mean \
      --model:encoder:type mlp \
      --model:encoder:num-layers 3 \
      --model:encoder:num-channels 64 \
      --model:decoder:type mlp \
      --model:decoder:num-layers 3 \
      --model:decoder:num-channels 64 \
      --model:bottleneck 64
  ```
- ST-BQN with attention-based merger:

  ```
  # --data:flt sets the intended lead time; --training:ensemble-size sets the
  # number of models for ensemble averaging; --model:encoder:num-layers 8
  # corresponds to two attention blocks; for GPU training, add --training:use-gpu
  python experiments/euppbench_reforecasts/run_training.py \
      --output:path /path/to/output/dir \
      --data:flt 6 \
      --training:batch-size 64 \
      --training:optimizer:lr 1.e-4 \
      --training:ensemble-size 10 \
      --loss:type bqn \
      --loss:kwargs "{'integration_scheme': 'uniform', 'num_quantiles': 99}" \
      --model:merger:type weighted-mean \
      --model:merger:kwargs "{'num_heads': 8}" \
      --model:encoder:type attention \
      --model:encoder:num-layers 8 \
      --model:encoder:num-channels 64 \
      --model:decoder:type mlp \
      --model:decoder:num-layers 3 \
      --model:decoder:num-channels 64 \
      --model:bottleneck 64
  ```
To export predictions for validation and test data in a unified format, run

```
python experiments/euppbench_reforecasts/predict.py --path /path/to/training/output --valid --test
```
Scripts for training the baseline models are provided in `experiments/baselines`. Note that these scripts require input data in `.csv` format.

- DRN with truncated logistic prediction parameterization:

  ```
  python experiments/baselines/run_training_drn_eupp.py --data-train /path/to/data/train.csv --data-test /path/to/data/test.csv --output:path /path/to/output/dir --model:posterior logistic
  ```

- BQN:

  ```
  # for ensemble input, add: --model:use-ensemble
  python experiments/baselines/run_training_bqn_eupp.py --data-train /path/to/data/train.csv --data-test /path/to/data/test.csv --output:path /path/to/output/dir --model:p-degree 12
  ```

- EMOS (custom PyTorch implementation using the LBFGS optimizer):

  ```
  python experiments/baselines/run_training_emos_eupp.py --data-train /path/to/data/train.csv --data-test /path/to/data/test.csv --output:path /path/to/output/dir --model:posterior logistic
  ```
The training scripts for the baseline methods export predictions for validation and test data automatically. To copy the data to the required location for subsequent evaluation, run

```
python evaluation/copy_predictions.py --path /path/to/training/output
```
The folder `experiments/euppbench_forecasts` contains scripts for exporting predictions for the EUPPBench forecast dataset. Separate scripts are used for ensemble-based models (`predict.py`), DRN and BQN (`predict_minimal.py`), and EMOS (`predict_emos.py`).
The script `evaluation/eval.py` provides functionality for computing forecast evaluation metrics. A distinction between ensemble-based and baseline models is not required, but the prediction format must be specified.

- Truncated logistic predictions for reforecast data:

  ```
  # size of reference ensemble: 11 for reforecasts
  python evaluation/eval.py exp logistic --path /path/to/training/output --valid --test --num-members 11
  ```

- BQN predictions for forecast data:

  ```
  # size of reference ensemble: 51 for forecasts
  python evaluation/eval.py exp bqn --path /path/to/training/output --forecasts --num-members 51
  ```
The functionality to reproduce the feature permutation results is contained in `evaluation/feature_permutation`. The scripts `predict_perturbed.py` and `predict_perturbed_minimal.py` can be used to compute perturbed predictions for ensemble-based models and summary-based models (DRN and BQN), respectively. Permutation importance scores can be computed with `eval_scalar_predictors.py` for summary-based models (DRN and BQN) and with `eval_single_features.py` for ensemble-based architectures. The implementation of the binned shuffling perturbation is located in `perturbations.py`.
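For the repository's actual implementation, see `perturbations.py`; the sketch below only illustrates the general idea of a binned shuffling perturbation, i.e., permuting a feature's values only among samples that fall into the same quantile bin, which breaks the feature-target association while approximately preserving the feature's marginal distribution (function and parameter names are hypothetical):

```python
import random

def binned_shuffle(values, num_bins=10, seed=None):
    """Shuffle feature values only within quantile bins of the sample."""
    rng = random.Random(seed)
    # indices sorted by feature value define the quantile bins
    order = sorted(range(len(values)), key=lambda i: values[i])
    bin_size = max(1, len(values) // num_bins)
    shuffled = list(values)
    for start in range(0, len(order), bin_size):
        bin_indices = order[start:start + bin_size]  # sample indices in one bin
        permuted = bin_indices[:]
        rng.shuffle(permuted)
        # reassign the bin's values among the bin's positions only
        for src, dst in zip(bin_indices, permuted):
            shuffled[dst] = values[src]
    return shuffled
```

Importance scores can then be derived by comparing an evaluation metric on predictions computed from the original versus the perturbed feature.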