## Downscaling with the DeepESD model

This notebook showcases a simple application of deep4downscaling for the statistical downscaling of precipitation. To do so, we will implement the following actions:

- Define and train the DeepESD architecture [1].
- Downscale and evaluate results over a test period.

### Train the model

In [3]:
DATA_PATH = './data/input'
FIGURES_PATH = './figures'
MODELS_PATH = './models'
ASYM_PATH = './data/asym'


When working with climate data, xarray is an essential library, and deep4downscaling heavily relies on it. For the deep learning component, deep4downscaling uses PyTorch, one of the most popular frameworks in the field.

In [4]:
import numpy as np
import xarray as xr
import torch
from torch.utils.data import DataLoader, random_split

import os
import sys; sys.path.append('/current_folder')
import deep4downscaling.viz
import deep4downscaling.trans
import deep4downscaling.deep.loss
import deep4downscaling.deep.utils
import deep4downscaling.deep.models
import deep4downscaling.deep.train
import deep4downscaling.deep.pred
import deep4downscaling.metrics
import deep4downscaling.metrics_ccs

We will begin by loading the predictor. In this case, we select various large-scale variables from ERA5 at different height levels. These variables are already stored in a NetCDF file, the standard data format for deep4downscaling. Unfortunately, due to GitHub's size restrictions, we are unable to upload these files to the repository. However, the following cells provide an overview of the data, making it straightforward to reproduce this notebook with a similar file.

In [5]:
# Load predictors
predictor_filename = f'{DATA_PATH}/ERA5_NorthAtlanticRegion_1-5dg_full.nc'
predictor = xr.open_dataset(predictor_filename)

In [6]:
predictor
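
Because the ERA5 file cannot be distributed with the repository, a synthetic stand-in can help when checking the rest of the workflow. The sketch below builds a small `xarray.Dataset` with `(time, lat, lon)` dimensions. The variable names (`z500`, `t850`), the grid extent and the 1.5° spacing are assumptions made for illustration and are not taken from the actual file.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical grid and variables; the real ERA5 file may differ
time = pd.date_range("1990-01-01", periods=10, freq="D")
lat = np.arange(20.0, 35.1, 1.5)
lon = np.arange(-25.0, -4.9, 1.5)

rng = np.random.default_rng(0)
shape = (time.size, lat.size, lon.size)

# One random field per variable, stored as float32 like typical reanalysis data
synthetic_predictor = xr.Dataset(
    data_vars={
        "z500": (("time", "lat", "lon"), rng.normal(size=shape).astype("float32")),
        "t850": (("time", "lat", "lon"), rng.normal(size=shape).astype("float32")),
    },
    coords={"time": time, "lat": lat, "lon": lon},
)
print(synthetic_predictor)
```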

The deep4downscaling library provides several functions to facilitate an initial visualization of the data. For example, the `deep4downscaling.viz.multiple_map_plot` function allows you to visualize an `xarray.Dataset`. These functions rely on matplotlib and cartopy. By default, the figure is saved as a `.pdf` file in the path specified by the `output_path` argument.

In [7]:
deep4downscaling.viz.multiple_map_plot(data=predictor.mean('time'),
                                       output_path=f'./{FIGURES_PATH}/predictor_climatology.pdf')

The predictand is an `xarray.Dataset` containing a single variable (the target). In this notebook, we will focus on downscaling accumulated precipitation over the Canary Islands region. The dataset used is ROCIO+_CAN 2.5km.

In [6]:
predictand_filename = os.path.join(DATA_PATH, "sfcan*.nc")

predictand = xr.open_mfdataset(
    predictand_filename,
    combine="by_coords"
)

We preprocess the predictand dataset by retaining only surface-level data (height = 0.0), renaming the precipitation variable to `pr`, and standardizing the time dimension to daily resolution. These steps ensure compatibility with the predictor dataset.

In [7]:
predictand = predictand.rename({'precipitation': 'pr'})
predictand = predictand.sel(height=0.0, drop=True)
predictand["time"] = predictand["time"].dt.floor("D")
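
To see what flooring the time coordinate does, consider a toy dataset whose daily records are stamped at 12:00. The values and dates below are made up. The sketch uses `assign_coords`, which has the same effect on the coordinate as the in-place assignment used in the cell above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical daily records stamped at 12:00 instead of 00:00
time = pd.date_range("2015-10-04 12:00", periods=3, freq="D")
toy = xr.Dataset(
    {"pr": (("time",), np.array([0.0, 1.2, 3.4], dtype="float32"))},
    coords={"time": time},
)

# Flooring drops the hour component while keeping the calendar day
toy = toy.assign_coords(time=toy["time"].dt.floor("D"))
print(toy["time"].values)
```

After this step the timestamps fall on midnight, so they can match a predictor that is indexed by calendar day.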

In [10]:
predictand

| | Array | Chunk |
|---|---|---|
| Bytes | 2.03 GiB | 315.30 MiB |
| Shape | (12053, 161, 281) | (1827, 161, 281) |
| Dask graph | 7 chunks in 16 graph layers | 7 chunks in 16 graph layers |
| Data type | float32 numpy.ndarray | float32 numpy.ndarray |


Similar to the predictors, deep4downscaling can also be used for an initial visualization of the predictand.

In [11]:
day_to_viz = '2015-10-04'  # ISO format; the same day pandas parses from '10-04-2015'
deep4downscaling.viz.simple_map_plot(data=predictand.sel(time=day_to_viz),
                                     colorbar='hot_r', var_to_plot='pr',
                                     output_path=f'./{FIGURES_PATH}/predictand_day.pdf')

Deep4downscaling also includes several common preprocessing techniques used in statistical downscaling, such as removing NaN values, aligning datasets (e.g., across time), bias adjustment, and standardization, among others.

In [8]:
# Remove days with nans in the predictor
predictor = deep4downscaling.trans.remove_days_with_nans(predictor)

# Align both datasets in time
predictor, predictand = deep4downscaling.trans.align_datasets(predictor, predictand, 'time')
predictor.load()
predictand.load()

There are no observations containing null values
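
The two preprocessing steps can be illustrated with plain xarray on toy data. This is only a sketch of the idea, not the implementation used by `remove_days_with_nans` or `align_datasets`, which is not shown in this notebook. The variable names and dates are made up.

```python
import numpy as np
import pandas as pd
import xarray as xr

time_x = pd.date_range("2000-01-01", periods=5, freq="D")  # 1 to 5 January
time_y = pd.date_range("2000-01-03", periods=5, freq="D")  # 3 to 7 January

# Toy predictor with a missing value on 2000-01-04
x_values = np.ones((5, 2))
x_values[3, 0] = np.nan

x = xr.Dataset({"z500": (("time", "lat"), x_values)},
               coords={"time": time_x, "lat": [0.0, 1.0]})
y = xr.Dataset({"pr": (("time", "lat"), np.ones((5, 2)))},
               coords={"time": time_y, "lat": [0.0, 1.0]})

# Keep only the days on which every predictor value is present
x_clean = x.dropna(dim="time", how="any")

# Keep only the time steps present in both datasets
x_aligned, y_aligned = xr.align(x_clean, y, join="inner")
print(x_aligned["time"].values)
```

Only 3 and 5 January survive: 4 January is dropped because of the missing value, and the remaining days are outside the overlap of the two periods.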


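In this particular case, we set to NaN the predictand values inside the box 26.5°N to 28.13°N, 14°W to 12°W (the south-eastern corner of the domain), so that only the Canary Islands region remains. This prevents the model from learning patterns that are not representative of the target area, thereby reducing potential bias during training.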

In [9]:
lat_mask = (predictand.lat >= 26.5) & (predictand.lat <= 28.13)
lon_mask = (predictand.lon >= -14) & (predictand.lon <= -12)

mask = lat_mask & lon_mask
mask_3d = mask.broadcast_like(predictand.pr)

predictand['pr'] = predictand.pr.where(~mask_3d, np.nan)
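
A toy example may help clarify how `where` interacts with the mask: `where(cond)` keeps the values where `cond` is true, so passing the negated mask sets the points inside the box to NaN and leaves the rest untouched. The 4x4 grid below is made up for illustration.

```python
import numpy as np
import xarray as xr

lat = np.array([26.5, 27.5, 28.5, 29.5])
lon = np.array([-15.0, -14.0, -13.0, -12.0])
pr = xr.DataArray(np.ones((2, 4, 4)), dims=("time", "lat", "lon"),
                  coords={"lat": lat, "lon": lon})

lat_mask = (pr.lat >= 26.5) & (pr.lat <= 28.13)
lon_mask = (pr.lon >= -14) & (pr.lon <= -12)
box = lat_mask & lon_mask  # broadcasts to a 2D (lat, lon) boolean mask

# The condition is True outside the box, so only the box becomes NaN
masked = pr.where(~box)
print(masked.isel(time=0).values)
```

xarray broadcasts the 2D mask over the `time` dimension automatically, so the same grid points are blanked on every day.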

Following standard machine learning practice, we split the predictors and the predictand into a training period (1990-2015) and a test period (2016-2022).

In [10]:
years_train = ('1990', '2015')
years_test = ('2016', '2022')

x_train = predictor.sel(time=slice(*years_train))
y_train = predictand.sel(time=slice(*years_train))

x_test = predictor.sel(time=slice(*years_test))
y_test = predictand.sel(time=slice(*years_test))

Before feeding the predictors to the deep learning model, we standardize them to have a mean of zero and a standard deviation of one. This is done using the `deep4downscaling.trans.standardize` function.

In [11]:
x_train_stand = deep4downscaling.trans.standardize(data_ref=x_train, data=x_train)
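
The role of the `data_ref` argument can be sketched in NumPy: the statistics come from the reference data and are then applied to whichever data is being scaled. The helper `standardize_with_reference` is hypothetical, and computing the statistics per grid point over the time axis is an assumption; the library function may aggregate differently.

```python
import numpy as np

def standardize_with_reference(data_ref, data):
    """Scale `data` with the mean and standard deviation of `data_ref`."""
    mean = data_ref.mean(axis=0)  # statistics taken over the time axis
    std = data_ref.std(axis=0)
    return (data - mean) / std

rng = np.random.default_rng(0)
x_train_toy = rng.normal(loc=5.0, scale=2.0, size=(1000, 3, 4))
x_test_toy = rng.normal(loc=5.0, scale=2.0, size=(200, 3, 4))

# The training data is scaled with its own statistics
x_train_toy_stand = standardize_with_reference(x_train_toy, x_train_toy)
# Any other data is scaled with the training statistics, never its own
x_test_toy_stand = standardize_with_reference(x_train_toy, x_test_toy)
```

Reusing the training statistics keeps the test data on the same scale the model saw during training and avoids leaking information from the test period.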

For training and inference, the data will be transformed into the torch.Tensor type. To facilitate the transition from NetCDF to torch.Tensor, especially when computing projections (predictions), we define a mask around the predictand to use throughout the entire workflow.

In [12]:
y_mask = deep4downscaling.trans.compute_valid_mask(y_train) 

All deep learning models implemented in deep4downscaling flatten their output into a vector, standardizing its dimensions to the shape `(time, grid point)`.

In [13]:
y_train_stack = y_train.stack(gridpoint=('lat', 'lon'))
y_mask_stack = y_mask.stack(gridpoint=('lat', 'lon'))

The DeepESD architecture consists of a set of convolutional layers followed by a final dense layer. In our case, since the predictand contains NaN values for sea grid points, we filter out these grid points to save computation. This reduces the number of neurons in the final fully connected layer. By applying this operation using the mask, the conversion between the model's output and the corresponding NetCDF becomes straightforward.

In [14]:
y_mask_stack_filt = y_mask_stack.where(y_mask_stack==1, drop=True)
y_train_stack_filt = y_train_stack.where(y_train_stack['gridpoint'] == y_mask_stack_filt['gridpoint'],
                                             drop=True)
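
The stacking and filtering steps, together with the inverse operation that maps model output back to the grid, can be sketched in NumPy. The 2x3 grid and its mask are made up; the library performs the equivalent operations on xarray objects.

```python
import numpy as np

n_time, n_lat, n_lon = 3, 2, 3
y = np.arange(n_time * n_lat * n_lon, dtype="float32").reshape(n_time, n_lat, n_lon)

# Hypothetical mask: 1 = valid grid point (land), 0 = always NaN (sea)
mask = np.array([[1, 0, 1],
                 [0, 0, 1]])

# Flatten (lat, lon) into a single grid-point axis
y_stack = y.reshape(n_time, n_lat * n_lon)
mask_stack = mask.reshape(-1).astype(bool)

# Keep only the valid grid points: these are the targets of the dense layer
y_filt = y_stack[:, mask_stack]  # shape (3, 3)

# Inverse operation: scatter the values back onto the full grid
y_back = np.full((n_time, n_lat * n_lon), np.nan, dtype="float32")
y_back[:, mask_stack] = y_filt
y_back = y_back.reshape(n_time, n_lat, n_lon)
```

Because the same mask drives both directions, the position of each output neuron on the map is never ambiguous.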

The deep4downscaling library includes various loss functions for training deep learning models. In this notebook, we follow [2] and use the ASYMmetric loss function (ASYM). The loss accepts the `asym_weight` and `cdf_weight` arguments, which make it flexible; `asym_weight=3` and `cdf_weight=10` are one possible choice. The defaults, `asym_weight=1` and `cdf_weight=2`, are equivalent to the original loss in [3]. Implementing custom loss functions should be straightforward, as they follow the usual PyTorch conventions.

In [19]:
loss_function = deep4downscaling.deep.loss.Asym(ignore_nans=True,
                                                asym_path=ASYM_PATH)
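
As an illustration of the PyTorch conventions mentioned above, a custom loss can be written as an `nn.Module` with a `forward` method. The class below is a hypothetical example (a mean squared error that skips NaN targets), not part of deep4downscaling. The argument order `(output, target)` follows the usual PyTorch convention and is an assumption about what the training loop expects.

```python
import torch
import torch.nn as nn

class MaskedMSELoss(nn.Module):
    """Mean squared error that skips NaN targets (illustrative only)."""

    def forward(self, output: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        valid = ~torch.isnan(target)  # True where the target is observed
        return torch.mean((output[valid] - target[valid]) ** 2)

toy_loss = MaskedMSELoss()
toy_target = torch.tensor([1.0, float("nan"), 3.0])
toy_output = torch.tensor([1.0, 5.0, 1.0])
toy_value = toy_loss(toy_output, toy_target)
print(toy_value)
```

The NaN entry is excluded from the average, so only the first and third points contribute to the loss.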

For this loss function to work, we need to pre-compute a gamma distribution for each grid point in the predictand data (training set) on a yearly basis and calculate the mean of their parameters (see [3] for more details). This process can be handled by deep4downscaling.

In [20]:
if loss_function.parameters_exist():
    loss_function.load_parameters()
else:
    loss_function.compute_parameters(data=y_train_stack_filt,
                                     var_target='pr')
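
As a rough illustration of fitting one gamma distribution per year and averaging the parameters, the sketch below uses a method-of-moments estimate on synthetic wet-day values for a single grid point. The estimator, the function name `gamma_moments` and the synthetic data are assumptions; the fitting procedure used by `compute_parameters` is not shown in this notebook and may differ.

```python
import numpy as np

def gamma_moments(pr_wet):
    """Method-of-moments estimate of the gamma shape and scale."""
    mean = pr_wet.mean()
    var = pr_wet.var()
    shape = mean ** 2 / var
    scale = var / mean
    return shape, scale

rng = np.random.default_rng(0)
# Synthetic wet-day precipitation for one grid point, two "years"
years = [rng.gamma(shape=2.0, scale=3.0, size=50_000) for _ in range(2)]

# Fit each year separately, then average the parameters across years
params = np.array([gamma_moments(year) for year in years])
shape_mean, scale_mean = params.mean(axis=0)
print(shape_mean, scale_mean)
```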

xarray objects cannot be passed directly to PyTorch, so we first convert the data to NumPy arrays, which are easily converted to the `torch.Tensor` type.

In [15]:
x_train_stand_arr = deep4downscaling.trans.xarray_to_numpy(x_train_stand)
y_train_arr = deep4downscaling.trans.xarray_to_numpy(y_train_stack_filt)

With our data now in NumPy format, we can create the `torch.utils.data.Dataset` and `torch.utils.data.DataLoader` objects that feed batches of data to the deep learning model during training.

In [22]:
# Create Dataset
train_dataset = deep4downscaling.deep.utils.StandardDataset(x=x_train_stand_arr,
                                                            y=y_train_arr)

# Split into training and validation sets
train_dataset, valid_dataset = random_split(train_dataset,
                                            [0.9, 0.1])

# Create DataLoaders
batch_size = 64

train_dataloader = DataLoader(train_dataset, batch_size=batch_size,
                              shuffle=True)
valid_dataloader = DataLoader(valid_dataset, batch_size=batch_size,
                              shuffle=True)
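
For readers unfamiliar with the PyTorch data utilities, the following sketch shows what a minimal paired dataset might look like. `PairDataset` is a hypothetical class written for illustration; the actual `StandardDataset` implementation may differ.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class PairDataset(Dataset):
    """Pairs predictor and predictand samples along the first axis."""

    def __init__(self, x: np.ndarray, y: np.ndarray):
        assert len(x) == len(y), "x and y need the same number of samples"
        self.x = torch.from_numpy(x).float()
        self.y = torch.from_numpy(y).float()

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]

# Toy arrays: 10 days, 2 predictor variables on a 4x5 grid, 7 target points
x_toy = np.zeros((10, 2, 4, 5), dtype="float32")
y_toy = np.zeros((10, 7), dtype="float32")

toy_loader = DataLoader(PairDataset(x_toy, y_toy), batch_size=4, shuffle=False)
x_batch, y_batch = next(iter(toy_loader))
print(x_batch.shape, y_batch.shape)
```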

Deep4downscaling includes several predefined deep learning architectures (e.g., DeepESD and U-Net), but custom architectures can be easily defined using the standard PyTorch framework. However, because deep4downscaling relies on a final flattening operation (as mentioned earlier), we recommend reviewing the implementations in `deep4downscaling.deep.models` and using them as a foundation.

While deep4downscaling lacks a formal documentation page, all its functions and arguments are properly documented within the code.

In [23]:
?deep4downscaling.deep.models.DeepESDpr

Init signature:
deep4downscaling.deep.models.DeepESDpr(
    x_shape: tuple,
    y_shape: tuple,
    filters_last_conv: int,
    stochastic: bool,
    last_relu: bool = False,
)
Docstring:
DeepESD model as proposed in Baño-Medina et al. 2024 for precipitation
downscaling. This implementation allows for a deterministic (MSE-based)
and stochastic (NLL-based) definition.

Baño-Medina, J., Manzanas, R., Cimadevilla, E., Fernández, J., González-Abad,
J., Cofiño, A. S., and Gutiérrez, J. M.: Downscaling multi-model cl

In this notebook, we will train the DeepESD architecture with a single filter in the final convolutional layer (`filters_last_conv=1`).

In [16]:
model_name = 'deepesd_pr'
model = deep4downscaling.deep.models.DeepESDpr(x_shape=x_train_stand_arr.shape,
                                               y_shape=y_train_arr.shape,
                                               filters_last_conv=1,
                                               stochastic=False)
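
To make the "convolutional layers plus a final dense layer" structure concrete, here is a small PyTorch sketch. It is not the library's `DeepESDpr` implementation: the class name `TinyConvDense`, the number of layers, the filter counts (50 and 25), the kernel size and the padding are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

class TinyConvDense(nn.Module):
    """Convolutional stack followed by one dense layer (illustrative sketch)."""

    def __init__(self, x_shape, y_shape, filters_last_conv):
        super().__init__()
        _, n_vars, n_lat, n_lon = x_shape
        self.conv = nn.Sequential(
            nn.Conv2d(n_vars, 50, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(50, 25, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(25, filters_last_conv, kernel_size=3, padding=1), nn.ReLU(),
        )
        # padding=1 keeps the spatial size, so the flattened length is known
        self.dense = nn.Linear(filters_last_conv * n_lat * n_lon, y_shape[1])

    def forward(self, x):
        x = self.conv(x)
        x = torch.flatten(x, start_dim=1)  # (batch, filters * lat * lon)
        return self.dense(x)               # (batch, valid grid points)

toy_net = TinyConvDense(x_shape=(8, 3, 6, 7), y_shape=(8, 11), filters_last_conv=1)
toy_out = toy_net(torch.zeros(8, 3, 6, 7))
print(toy_out.shape)
```

The size of the dense layer grows with the number of target grid points, which is why dropping the masked points beforehand reduces the parameter count.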

We set the typical training hyperparameters, as is commonly done in PyTorch.

In [25]:
num_epochs = 10000
patience_early_stopping = 20

learning_rate = 0.0001
optimizer = torch.optim.Adam(model.parameters(),
                             lr=learning_rate)

Deep learning models can run on either CPU or GPU devices. We provide the corresponding `.yml` environment files (`deep4downscaling/requirement`) to set up a basic Conda environment for running deep4downscaling.

In [21]:
device = ('cuda' if torch.cuda.is_available() else 'cpu')

# Move ASYM parameters to device
loss_function.prepare_parameters(device=device)

Deep4downscaling provides `deep4downscaling.deep.train.standard_training_loop`, which implements a basic training routine. Models are saved based on their performance on a validation set through an early stopping process, so the final saved model is the one that achieves the best score on this set. To disable early stopping, pass `None` to the `patience_early_stopping` argument. We recommend consulting `?deep4downscaling.deep.train.standard_training_loop` for further details about this function.

In [27]:
train_loss, val_loss = deep4downscaling.deep.train.standard_training_loop(
                            model=model, model_name=model_name, model_path=MODELS_PATH,
                            device=device, num_epochs=num_epochs,
                            loss_function=loss_function, optimizer=optimizer,
                            train_data=train_dataloader, valid_data=valid_dataloader,
                            patience_early_stopping=patience_early_stopping)

Epoch 1 (32.98 secs) | Training Loss 1.2821 Valid Loss 1.5444 (Model saved)
Epoch 2 (32.85 secs) | Training Loss 1.2801 Valid Loss 1.5329 (Model saved)
Epoch 3 (35.97 secs) | Training Loss 1.2791 Valid Loss 1.5181 (Model saved)
Epoch 4 (32.4 secs) | Training Loss 1.2797 Valid Loss 1.5373
Epoch 5 (31.72 secs) | Training Loss 1.2783 Valid Loss 1.5192
Epoch 6 (31.92 secs) | Training Loss 1.2809 Valid Loss 1.5253
Epoch 7 (33.33 secs) | Training Loss 1.2851 Valid Loss 1.5191
Epoch 8 (34.98 secs) | Training Loss 1.2787 Valid Loss 1.5214
Epoch 9 (35.37 secs) | Training Loss 1.2782 Valid Loss 1.5409
Epoch 10 (35.53 secs) | Training Loss 1.2766 Valid Loss 1.519
Epoch 11 (35.56 secs) | Training Loss 1.28 Valid Loss 1.5269
Epoch 12 (35.45 secs) | Training Loss 1.2786 Valid Loss 1.529
Epoch 13 (35.26 secs) | Training Loss 1.2815 Valid Loss 1.5353
Epoch 14 (35.13 secs) | Training Loss 1.2831 Valid Loss 1.5173 (Model saved)
Epoch 15 (35.33 secs) | Training Loss 1.2816 Valid Loss 1.5265
Epoch 16 (35.
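
The checkpointing pattern visible in the log can be reproduced with a few lines of plain Python. This is a generic early stopping sketch, not the code of `standard_training_loop`; the function name and the exact stopping rule (stop after `patience` consecutive epochs without improvement) are assumptions. The losses are the first 15 validation values from the log above.

```python
def early_stopping_trace(val_losses, patience):
    """Return (best_epoch, stop_epoch) using 1-based epoch numbers."""
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch  # a checkpoint would be saved here
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if patience is not None and epochs_without_improvement >= patience:
                return best_epoch, epoch

    return best_epoch, len(val_losses)

log_losses = [1.5444, 1.5329, 1.5181, 1.5373, 1.5192, 1.5253, 1.5191, 1.5214,
              1.5409, 1.519, 1.5269, 1.529, 1.5353, 1.5173, 1.5265]

print(early_stopping_trace(log_losses, patience=20))  # prints (14, 15)
print(early_stopping_trace(log_losses, patience=3))   # prints (3, 6)
```

With a patience of 20 the run continues past the plateau after epoch 3 and finds a better model at epoch 14, whereas a patience of 3 would have stopped at epoch 6.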

### Downscale the test set

Once a model has been trained and saved as a `.pt` file, it is easy to compute predictions on a new set of predictors. In this example, we will compute predictions on the test set, which was subset a few cells above. It is important to standardize the test data using the mean and standard deviation computed from the training set.

In [22]:
# Load the model weights into the DeepESD architecture
# (map_location lets a checkpoint trained on GPU load on a CPU-only machine)
model.load_state_dict(torch.load(f'{MODELS_PATH}/{model_name}.pt',
                                 map_location=device))

# Standardize
x_test_stand = deep4downscaling.trans.standardize(data_ref=x_train, data=x_test)

# Compute predictions
pred_test = deep4downscaling.deep.pred.compute_preds_standard(
                                x_data=x_test_stand, model=model,
                                device=device, var_target='pr',
                                mask=y_mask, batch_size=16)
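
The `batch_size` argument reflects a common inference pattern: the predictors are passed through the model in chunks, with gradient tracking disabled, and the outputs are concatenated. The sketch below shows that pattern in isolation. The helper `predict_in_batches` and the toy model are hypothetical, and `compute_preds_standard` additionally handles the conversion back to xarray, which is not covered here.

```python
import numpy as np
import torch
import torch.nn as nn

def predict_in_batches(model, x, device, batch_size):
    """Run the model over `x` in chunks without tracking gradients."""
    model = model.to(device)
    model.eval()
    outputs = []
    with torch.no_grad():
        for start in range(0, len(x), batch_size):
            batch = torch.from_numpy(x[start:start + batch_size]).float().to(device)
            outputs.append(model(batch).cpu().numpy())
    return np.concatenate(outputs, axis=0)

# Toy stand-in model: flatten the predictors and map them to 5 grid points
toy_predictor_model = nn.Sequential(nn.Flatten(), nn.Linear(2 * 3 * 4, 5))
x_toy_test = np.zeros((10, 2, 3, 4), dtype="float32")

toy_preds = predict_in_batches(toy_predictor_model, x_toy_test,
                               device="cpu", batch_size=4)
print(toy_preds.shape)
```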

In [23]:
# Visualize the predictions
deep4downscaling.viz.simple_map_plot(data=pred_test.mean('time'),
                                     colorbar='hot_r', var_to_plot='pr', vlimits=(0, 3.5),
                                     output_path=f'./{FIGURES_PATH}/prediction_test_mean.pdf') #TODO MSE

The `deep4downscaling.metrics` module implements various metrics commonly used to assess deep learning models in the context of statistical downscaling. These include biases of different indices, spatial and probabilistic metrics, and multivariate indices, among others. In this example, we compute the relative bias of the Rx1day index (the maximum 1-day precipitation) between the target (test set) and the predictions for the winter months.

In [24]:
bias_rel_rx1day = deep4downscaling.metrics.bias_rel_rx1day(target=y_test, pred=pred_test,
                                                           var_target='pr', season='winter') 

In [25]:
print(bias_rel_rx1day)

<xarray.Dataset>
Dimensions:  (lat: 161, lon: 281)
Coordinates:
  * lat      (lat) float64 26.5 26.52 26.55 26.57 ... 30.43 30.45 30.48 30.5
  * lon      (lon) float64 -19.0 -18.98 -18.95 -18.93 ... -12.05 -12.02 -12.0
Data variables:
    pr       (lat, lon) float32 nan nan nan nan nan nan ... nan nan nan nan nan
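
The idea behind this metric can be sketched in NumPy on toy data. The helpers below are hypothetical: they take the maximum over the whole time axis and express the bias as a percentage of the target, whereas the library function may, for example, compute maxima per year and apply the seasonal filter differently.

```python
import numpy as np

def rx1day(pr):
    """Maximum 1-day precipitation over the time axis, per grid point."""
    return np.nanmax(pr, axis=0)

def relative_bias(target_index, pred_index):
    """Relative bias in percent: 100 * (pred - target) / target."""
    return 100.0 * (pred_index - target_index) / target_index

# Toy data: 4 days, 2 grid points
toy_target = np.array([[1.0, 0.0], [10.0, 4.0], [2.0, 8.0], [0.0, 1.0]])
toy_pred = np.array([[2.0, 0.0], [8.0, 5.0], [1.0, 6.0], [0.0, 2.0]])

toy_bias = relative_bias(rx1day(toy_target), rx1day(toy_pred))
print(toy_bias)
```

Negative values indicate that the predictions underestimate the most extreme daily precipitation at that grid point.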


Another commonly used metric is the relative bias with respect to the mean, which we will compute and visualize as an illustrative example.

In [26]:
bias_mean = deep4downscaling.metrics.bias_mean(target=y_test, pred=pred_test, var_target='pr')


deep4downscaling.viz.simple_map_plot(data=bias_mean,
                                     colorbar='hot_r', var_to_plot='pr', vlimits=(-2.5, 2.5),
                                     output_path=f'./{FIGURES_PATH}/bias_pred_test.pdf')



### References

[1] Baño-Medina, J., Manzanas, R., Cimadevilla, E., Fernández, J., González-Abad, J., Cofiño, A. S., & Gutiérrez, J. M. (2022). Downscaling multi-model climate projection ensembles with deep learning (DeepESD): Contribution to CORDEX EUR-44. Geoscientific Model Development Discussions, 2022, 1-14.

[2] González-Abad, J., & Gutiérrez, J. M. (2024). Are Deep Learning Methods Suitable for Downscaling Global Climate Projections? Review and Intercomparison of Existing Models. arXiv preprint arXiv:2411.05850.

[3] Doury, A., Somot, S., & Gadat, S. (2024). On the suitability of a convolutional neural network based RCM-emulator for fine spatio-temporal precipitation. Climate Dynamics, 62(9), 8587-8613.

[4] Baño-Medina, J., Manzanas, R., & Gutiérrez, J. M. (2021). On the suitability of deep convolutional neural networks for continental-wide downscaling of climate change projections. Climate Dynamics, 57(11), 2941-2951.