<a name="software-requirements"></a>
# Software Requirements
This notebook requires the following libraries:
*   climate_learn (pip)

`climate_learn` contains the source files used for modeling climate extremes.

The package is written using the `PyTorch` machine learning library.

In [None]:
%%bash
python -m pip install --upgrade pip
pip install git+https://github.com/aditya-grover/climate-learn.git

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


<a name="temporal-forecasting"></a>
# Temporal Forecasting


Precise and reliable weather forecasting is of great importance to many aspects of society: precipitation forecasts are essential for agriculture, and wind speed and solar power forecasts are essential for energy generation.

<br>
<center><img src="https://drive.google.com/uc?export=view&id=1_tsaaogqzkYVV0jdawnO9GCJTToW_FCi" height=300></center>

The forecasting task can be categorized into (a) **nowcasting** (timescale of a few hours), (b) **weather-scale** prediction (typically 1 day to 1 week), (c) **seasonal** prediction (typically months), and (d) **multi-year or decadal** prediction (timescale of multiple years).

In this tutorial, we focus on **medium-range**, weather-scale prediction of climate variables, i.e., typically 3-5 days into the future. This Colab notebook demonstrates temporal forecasting of the *Temperature* variable at *2m* height above the Earth's surface. This variable serves as a good indicator of future temperatures on the Earth's surface for forecasters.

We shall further use the 2m temperature data at 5.625 degree resolution that divides the Earth's surface into a latitude x longitude grid of 32 x 64.
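
The grid size follows directly from the resolution: 180 degrees of latitude and 360 degrees of longitude divided by the cell size. The short sketch below checks that arithmetic. The helper names `grid_shape` and `lat_centers` are hypothetical (they are not part of `climate_learn`), and placing cell centres half a cell away from the poles is an assumption about the grid layout.

```python
def grid_shape(resolution_deg):
    """Return (n_lat, n_lon) for a regular global grid with the given cell size in degrees."""
    n_lat = round(180 / resolution_deg)
    n_lon = round(360 / resolution_deg)
    return n_lat, n_lon


def lat_centers(resolution_deg):
    """Latitude of each cell centre, assuming centres sit half a cell away from the poles."""
    n_lat, _ = grid_shape(resolution_deg)
    return [-90 + resolution_deg * (j + 0.5) for j in range(n_lat)]


print(grid_shape(5.625))       # (32, 64)
print(lat_centers(5.625)[0])   # -87.1875
```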

<br/><br/>



**References:**
1. Rasp S, Dueben PD, Scher S, Weyn JA, Mouatadid S, Thuerey N. WeatherBench: a benchmark data set for data‐driven weather forecasting. Journal of Advances in Modeling Earth Systems. 2020 Nov;12(11):e2020MS002203 [(Paper)](https://arxiv.org/abs/2002.00469).
2. Civitarese DS, Szwarcman D, Zadrozny B, Watson C. Extreme Precipitation Seasonal Forecast Using a Transformer Neural Network. arXiv preprint arXiv:2107.06846. 2021 Jul 14. [(Paper)](https://arxiv.org/abs/2107.06846)
3. Sønderby CK, Espeholt L, Heek J, Dehghani M, Oliver A, Salimans T, Agrawal S, Hickey J, Kalchbrenner N. Metnet: A neural weather model for precipitation forecasting. arXiv preprint arXiv:2003.12140. 2020 Mar 24. [(Paper)](https://arxiv.org/pdf/2003.12140.pdf)


## Data Preparation

Check out the [Data Processing Notebook](https://github.com/aditya-grover/climate-learn/tree/main/docs/notebooks/Data_Processing) for more info on this part.

In [3]:
from climate_learn.data import download
from climate_learn.utils.data import load_dataset, view
from climate_learn.utils.datetime import Year, Days, Hours
from climate_learn.data import DataModule

# Download data from weatherbench (~2-3 minutes)
download(root = "/content/drive/MyDrive/Climate/.climate_tutorial", source = "weatherbench", variable = "2m_temperature", dataset = "era5", resolution = "5.625")
dataset = load_dataset("/content/drive/MyDrive/Climate/.climate_tutorial/data/weatherbench/era5/5.625/2m_temperature")

data_module = DataModule(
    dataset = "ERA5",
    task = "forecasting",
    root_dir = "/content/drive/MyDrive/Climate/.climate_tutorial/data/weatherbench/era5/5.625/",
    in_vars = ["2m_temperature"],
    out_vars = ["2m_temperature"],
    train_start_year = Year(2014),
    val_start_year = Year(2015),
    test_start_year = Year(2016),
    end_year = Year(2017),
    pred_range = Days(3),
    subsample = Hours(6),
    batch_size = 128,
    num_workers = 1
)

Downloading era5 2m_temperature data for 5.625 resolution from weatherbench to /content/drive/MyDrive/Climate/.climate_tutorial/data/weatherbench/era5/5.625/2m_temperature
Creating train dataset


100%|██████████| 1/1 [00:00<00:00, 54.82it/s]


Creating val dataset


100%|██████████| 1/1 [00:00<00:00, 60.26it/s]


Creating test dataset


100%|██████████| 2/2 [00:00<00:00, 78.22it/s]
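
To make `pred_range` and `subsample` more concrete, here is a minimal sketch of how input/target pairs could be formed from an hourly time axis: each input at hour `t` is paired with the target at `t + pred_range`, and inputs are taken every `subsample` hours. This is an illustration under the assumption of hourly data; `forecasting_pairs` is a hypothetical helper, and the way `DataModule` builds its samples internally may differ.

```python
HOURS_PER_DAY = 24


def forecasting_pairs(n_hours, pred_range_hours, subsample_hours):
    """Return (input_index, target_index) pairs into an hourly time axis."""
    # Inputs must leave room for a target pred_range_hours later.
    last_start = n_hours - pred_range_hours
    return [(t, t + pred_range_hours) for t in range(0, last_start, subsample_hours)]


# One non-leap year of hourly data, 3-day lead time, one sample every 6 hours.
pairs = forecasting_pairs(
    n_hours=365 * HOURS_PER_DAY,
    pred_range_hours=3 * HOURS_PER_DAY,
    subsample_hours=6,
)
print(pairs[:3])   # [(0, 72), (6, 78), (12, 84)]
print(len(pairs))  # 1448
```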


## Neural Networks Architectures

We consider three deep neural network architectures in this tutorial.

1. Convolutional Neural Networks (CNN)
<center><img src="https://viso.ai/wp-content/uploads/2021/03/cnn-convolutional-neural-networks-1060x362.jpg" width=400></center>


Variants of the CNN architecture:

a. **ResNet**

<center><img src="https://miro.medium.com/max/875/1*WpX_8eCeTsEcCs8vdXtUCw.png" width=400></center>

ResNets have been used to achieve state-of-the-art (SOTA) neural-network weather forecasts of temperature and geopotential [1].

Paper: [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)

b. **U-Net**

<center><img src="https://miro.medium.com/max/875/1*f7YOaE4TWubwaFF7Z1fzNw.png" width=400></center>

The basic building blocks of the U-Net architecture are downsampling and upsampling convolutions. The downsampling blocks project the input from a higher-dimensional space to a lower-dimensional one, and the upsampling blocks project the low-dimensional latent space back to the higher-dimensional input space. U-Net first gained popularity in the biomedical domain; our package allows users to benchmark it in the climate modeling space too.


Paper: [U-Net: Convolutional Networks for Biomedical Image Segmentation](https://arxiv.org/abs/1505.04597)
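
The shape changes in a U-Net can be illustrated without any learned weights. The sketch below uses 2x2 average pooling as a stand-in for a downsampling convolution and value repetition as a stand-in for an upsampling convolution; both are simplifications chosen for illustration, not the layers used by the package, and the function names are hypothetical.

```python
def downsample_2x(field):
    """Average each non-overlapping 2x2 block (stand-in for a strided convolution)."""
    rows, cols = len(field), len(field[0])
    return [
        [
            (field[r][c] + field[r][c + 1] + field[r + 1][c] + field[r + 1][c + 1]) / 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def upsample_2x(field):
    """Repeat each value into a 2x2 block (stand-in for a transposed convolution)."""
    out = []
    for row in field:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


# A dummy 32 x 64 field, matching the 5.625 degree grid used in this tutorial.
field = [[float(r * 64 + c) for c in range(64)] for r in range(32)]
low = downsample_2x(field)
restored = upsample_2x(low)
print(len(low), len(low[0]))            # 16 32
print(len(restored), len(restored[0]))  # 32 64
```

A real U-Net additionally concatenates the features from each downsampling stage onto the matching upsampling stage, which is what lets it recover fine detail lost in the low-dimensional representation.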



2. Vision Transformers

Vision Transformers (ViTs) are a recent alternative to CNN variants for visual recognition. We refer the reader to the paper linked below for architectural details.

<center><img src="https://viso.ai/wp-content/uploads/2021/09/vision-transformer-vit.png" width=400></center>

Vision Transformers have gained immense popularity in the vision community, but their usefulness for learning representations of climate variables is still under-explored. [2] used Transformers for short-range temperature forecasting.
We believe that our ViT implementation will allow users to benchmark ViT on climate modeling tasks.

Paper: <a href="https://arxiv.org/abs/2010.11929">An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale</a>

<br/><br/>


**References**:

[1] Rasp S, Thuerey N. Data‐driven medium‐range weather prediction with a resnet pretrained on climate simulations: A new model for weatherbench. Journal of Advances in Modeling Earth Systems. 2021 Feb;13(2):e2020MS002405.\
[2] Bilgin O, Mąka P, Vergutz T, Mehrkanoon S. TENT: Tensorized encoder transformer for temperature forecasting. arXiv preprint arXiv:2106.14742. 2021 Jun 28.



<br/><br/>

In this tutorial, we demonstrate training a ResNet from scratch. Note that the choice of model architecture and hyperparameters is for demonstration purposes only.

## Model initialization 

The hyperparameters and ResNet architecture chosen allow for a model that forecasts with 85.7% test accuracy, while still training within a reasonable amount of time for the sake of the tutorial (by nature of being a smaller model). We leave it to the user to perform a more exhaustive search of hyperparameter values for training models that perform better.

In [4]:
from climate_learn.models import load_model

model_kwargs = {
    "in_channels": len(data_module.hparams.in_vars),
    "out_channels": len(data_module.hparams.out_vars),
    "n_blocks": 4
}

optim_kwargs = {
    "lr": 1e-4,
    "weight_decay": 1e-5,
    "warmup_epochs": 1,
    "max_epochs": 5,
}

model_module = load_model(name = "resnet", task = "forecasting", model_kwargs = model_kwargs, optim_kwargs = optim_kwargs)
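
The `warmup_epochs` and `max_epochs` entries suggest a learning-rate schedule with a warmup phase. One common choice for such parameters is linear warmup followed by cosine decay. The sketch below shows that schedule for the values above. Whether `climate_learn` uses exactly this schedule is an assumption, and `lr_at_epoch` is a hypothetical helper.

```python
import math


def lr_at_epoch(epoch, base_lr=1e-4, warmup_epochs=1, max_epochs=5):
    """Learning rate for a 0-indexed epoch: linear warmup, then cosine decay (assumed schedule)."""
    if epoch < warmup_epochs:
        # Ramp linearly up to base_lr over the warmup epochs.
        return base_lr * (epoch + 1) / warmup_epochs
    # Fraction of the post-warmup period that has elapsed, in [0, 1).
    progress = (epoch - warmup_epochs) / max(1, max_epochs - warmup_epochs)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))


for epoch in range(5):
    print(epoch, f"{lr_at_epoch(epoch):.2e}")
```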

In [5]:
from climate_learn.models import set_climatology
set_climatology(model_module, data_module)

## Training


The training objective ensures that the machine learning model makes accurate forecasts over the gridded data. We employ latitude weighted RMSE given by:

<br>
$RMSE = \frac{1}{N_{forecasts}}\sum_{i}^{N_{forecasts}}\sqrt{\frac{1}{N_{lat}N_{lon}}\sum_{j}^{N_{lat}}\sum_{k}^{N_{lon}}L(j)(f_{i,j,k}-t_{i,j,k})^{2}} \tag{1}$ 
<br>

where $f$ is the model forecast and $t$ is the ERA5 truth. $L(j)$ is the latitude weighting factor at the $j^{th}$ latitude index:

<br>
$L(j) = \frac{\cos(lat(j))}{\frac{1}{N_{lat}}\sum_{j}^{N_{lat}}\cos(lat(j))} \tag{2}$
<br>
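
The following is a minimal pure-Python sketch of Eq. 1 and Eq. 2, written for readability rather than speed. It is not the package's implementation; the function names and the nested-list layout `[forecast][lat][lon]` are assumptions made for this illustration.

```python
import math


def latitude_weights(lats_deg):
    """Eq. 2: cos(lat) normalized by its mean over all latitudes."""
    cos_lat = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean_cos = sum(cos_lat) / len(cos_lat)
    return [c / mean_cos for c in cos_lat]


def lat_weighted_rmse(forecasts, truths, lats_deg):
    """Eq. 1: per-forecast latitude-weighted RMSE, averaged over forecasts."""
    weights = latitude_weights(lats_deg)
    per_forecast = []
    for f, t in zip(forecasts, truths):
        n_lat, n_lon = len(f), len(f[0])
        total = sum(
            weights[j] * (f[j][k] - t[j][k]) ** 2
            for j in range(n_lat)
            for k in range(n_lon)
        )
        per_forecast.append(math.sqrt(total / (n_lat * n_lon)))
    return sum(per_forecast) / len(per_forecast)


# Toy example: one forecast on a 3 x 2 grid that is off by 2 K everywhere.
lats = [-45.0, 0.0, 45.0]
truth = [[[280.0, 280.0] for _ in lats]]
forecast = [[[282.0, 282.0] for _ in lats]]
print(round(lat_weighted_rmse(forecast, truth, lats), 6))  # 2.0
```

Because the weights average to 1, a uniform error of 2 K gives an RMSE of 2 K; errors near the poles count for less than errors near the equator, reflecting the smaller area of high-latitude grid cells.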

(Optional) If you want to monitor training and validation curves of the model using [Weights and Biases](https://docs.wandb.ai/), uncomment the lines in the following code block and login to your Wandb account (only once).

In [6]:
from climate_learn.training import Trainer

trainer = Trainer(
    seed = 0,
    accelerator = "gpu",
    # accelerator = "cpu",
    precision = 16,
    max_epochs = 1,
)

INFO:lightning_fabric.utilities.seed:Global seed set to 0


In [7]:
from climate_learn.models import fit_lin_reg_baseline
fit_lin_reg_baseline(model_module, data_module, reg_hparam=0.0)

In [8]:
trainer.fit(model_module, data_module)

Output()

## Evaluation 


Once our prediction model is trained, we want to be able to evaluate it against the ground truth labels for data samples in the test set. 

In addition to the Latitude weighted RMSE (Eq. 1), we shall look at the Anomaly Correlation Coefficient (ACC) which is defined as:

<br>
$ACC = \frac{\sum_{i,j,k}L(j)f'_{i,j,k}t'_{i,j,k}}{\sqrt{\sum_{i,j,k}L(j)f'^{2}_{i,j,k}\sum_{i,j,k}L(j)t'^{2}_{i,j,k}}} \tag{3}$
<br>

where the prime ($'$) denotes the difference from the climatology. We define climatology as:

<br>
$climatology_{j,k} = \frac{1}{N_{time}}\sum_{i}^{N_{time}}t_{i,j,k}\tag{4}$
<br>

For the RMSE metric, we compare the deep learning model with a climatological forecast.
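
As a rough illustration of the climatology and the ACC, the sketch below computes a per-cell time mean and then the weighted correlation of forecast and truth anomalies, with the denominator taken as the square root of the product of the two weighted sums of squares. The function names and the `[time][lat][lon]` nested-list layout are assumptions for this example, and the latitude weights are passed in directly to keep it self-contained.

```python
import math


def climatology(truth_series):
    """Time mean per grid cell; truth_series is indexed [time][lat][lon]."""
    n_time = len(truth_series)
    n_lat, n_lon = len(truth_series[0]), len(truth_series[0][0])
    return [
        [sum(truth_series[i][j][k] for i in range(n_time)) / n_time for k in range(n_lon)]
        for j in range(n_lat)
    ]


def acc(forecasts, truths, clim, weights):
    """Weighted correlation between forecast and truth anomalies."""
    num = f_sq = t_sq = 0.0
    for f, t in zip(forecasts, truths):
        for j, w in enumerate(weights):
            for k in range(len(clim[j])):
                f_anom = f[j][k] - clim[j][k]
                t_anom = t[j][k] - clim[j][k]
                num += w * f_anom * t_anom
                f_sq += w * f_anom ** 2
                t_sq += w * t_anom ** 2
    return num / math.sqrt(f_sq * t_sq)


# Toy data: 2 time steps on a 1 x 2 grid, with a single latitude weight of 1.
truths = [[[1.0, 2.0]], [[3.0, 6.0]]]
clim = climatology(truths)
print(clim)  # [[2.0, 4.0]]

# Mirror the truth around the climatology to get perfectly anti-correlated anomalies.
mirrored = [[[2 * clim[0][k] - t[0][k] for k in range(2)]] for t in truths]
print(acc(truths, truths, clim, [1.0]))    # 1.0
print(acc(mirrored, truths, clim, [1.0]))  # -1.0
```

Note that a purely climatological forecast has zero anomaly everywhere, so this ACC is undefined for it (the denominator is zero). That is one reason the climatological baseline is compared on RMSE instead.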

In [9]:
trainer.test(model_module, data_module)

Output()

The model's predictions correlate strongly with the ground truth, as indicated by a high ACC value. Compared to a climatological forecast, the deep learning model achieves a much smaller RMSE.

<a name="spatial-downscaling"></a>
# Spatial Downscaling


General Circulation Models (GCMs) provide projections of future climate scenarios. These raw estimates have to be downscaled to the desired resolution to provide actionable guidance.

<br>
<center><img src="https://drive.google.com/uc?export=view&id=11i2CIRxlVRqOHIgZRABwF05Qf5KeqVwc" height=300></center>

In practice, statistical spatial downscaling can be used to make predictions about a climate variable (a) over a latitude-longitude grid of **higher** resolution than the input grid and (b) at specific sites at the target locations. For example, we can predict the temperature at a specific station in Germany based on the gridded temperature data over the whole country.

A major class of statistical downscaling models is Perfect Prognosis (PP) [1], which aims to learn a transfer function $$\hat{y} = f(x, Z)$$ where $\hat{y}$ is the estimate of the true value $y$ at location $x$ and $Z$ is the set of model predictors from the climate model. The various PP models differ in how they realize the transfer function $f$. The related-work discussion in [2] gives more detail on previous work. In [2,3], the authors use CNNs as the transfer function, largely because of their inherent inductive bias towards image-like data. The ability of deep CNNs to perform super-resolution is well studied [4].

<br/><br/>

**References:**
1. Maraun D, Wetterhall F, Ireson AM, Chandler RE, Kendon EJ, Widmann M, Brienen S, Rust HW, Sauter T, Themeßl M, Venema VK. Precipitation downscaling under climate change: Recent developments to bridge the gap between dynamical models and the end user. Reviews of geophysics. 2010 Sep;48(3).
2. Vaughan A, Tebbutt W, Hosking JS, Turner RE. Convolutional conditional neural processes for local climate downscaling. arXiv preprint arXiv:2101.07950. 2021 Jan 20.
3. Baño-Medina J, Manzanas R, Gutiérrez JM. Configuration and intercomparison of deep learning neural models for statistical downscaling. Geoscientific Model Development. 2020 Apr 28;13(4):2109-24.
4. Yamanaka J, Kuwashima S, Kurita T. Fast and accurate image super resolution by deep CNN with skip connection and network in network. In International Conference on Neural Information Processing 2017 Nov 14 (pp. 217-225). Springer, Cham.

In this tutorial, we focus on mapping coarse-resolution data for a variable to a finer resolution at a given timestamp. Specifically, we continue with the _Temperature at 2m_ climate variable and use a ResNet model.

## Data Preparation

To perform climate downscaling, we need temperature-at-2m data at different resolutions. In addition to the 5.625deg dataset we downloaded above, here we download the 2.8125deg dataset, which divides the Earth's surface into a latitude x longitude grid of 64 x 128.
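
Going from 5.625deg to 2.8125deg doubles the number of cells along each axis. The simplest possible "downscaler" just copies each coarse cell into the finer cells it covers. The sketch below shows this nearest-neighbour baseline to make the input and output shapes concrete. It is not the ResNet used in this tutorial, and `nearest_neighbor_upsample` is a hypothetical helper.

```python
def nearest_neighbor_upsample(coarse, factor):
    """Copy every coarse cell into a factor x factor block of fine cells."""
    n_rows, n_cols = len(coarse) * factor, len(coarse[0]) * factor
    return [
        [coarse[r // factor][c // factor] for c in range(n_cols)]
        for r in range(n_rows)
    ]


coarse_res, fine_res = 5.625, 2.8125
factor = round(coarse_res / fine_res)

# Dummy 32 x 64 temperature field in Kelvin (values vary by row only).
coarse = [[280.0 + r for _ in range(64)] for r in range(32)]
fine = nearest_neighbor_upsample(coarse, factor)
print(factor, len(fine), len(fine[0]))  # 2 64 128
```

A learned model is trained to improve on this kind of baseline by predicting fine-scale structure that plain copying or interpolation cannot recover.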

In [11]:
from climate_learn.data import download
from climate_learn.utils.datetime import Year, Days, Hours
from climate_learn.data import DataModule

# Download data from weatherbench (~4-6 minutes)
download(root = "/content/drive/MyDrive/Climate/.climate_tutorial", source = "weatherbench", variable = "2m_temperature", dataset = "era5", resolution = "2.8125")

data_module = DataModule(
    dataset = "ERA5",
    task = "downscaling",
    root_dir = "/content/drive/MyDrive/Climate/.climate_tutorial/data/weatherbench/era5/5.625",
    root_highres_dir = "/content/drive/MyDrive/Climate/.climate_tutorial/data/weatherbench/era5/2.8125",
    in_vars = ["2m_temperature"],
    out_vars = ["2m_temperature"],
    train_start_year = Year(2014),
    val_start_year = Year(2015),
    test_start_year = Year(2016),
    end_year = Year(2018),
    subsample = Hours(6),
    batch_size = 128,
    num_workers = 1
)

Downloading era5 2m_temperature data for 2.8125 resolution from weatherbench to /content/drive/MyDrive/Climate/.climate_tutorial/data/weatherbench/era5/2.8125/2m_temperature
Creating train dataset


100%|██████████| 1/1 [00:00<00:00, 43.18it/s]
100%|██████████| 1/1 [00:00<00:00, 30.77it/s]


Creating val dataset


100%|██████████| 1/1 [00:00<00:00, 47.43it/s]
100%|██████████| 1/1 [00:00<00:00, 26.84it/s]


Creating test dataset


100%|██████████| 3/3 [00:00<00:00, 35.66it/s]
100%|██████████| 3/3 [00:03<00:00,  1.24s/it]


## Model initialization

In [13]:
from climate_learn.models import load_model

model_kwargs = {
    "in_channels": len(data_module.hparams.in_vars),
    "out_channels": len(data_module.hparams.out_vars),
    "n_blocks": 4,
}

optim_kwargs = {
    "optimizer": "adamw",
    "lr": 1e-4,
    "weight_decay": 1e-5,
    "warmup_epochs": 1,
    "max_epochs": 5,
}

model_module = load_model(name = "resnet", task = "downscaling", model_kwargs = model_kwargs, optim_kwargs = optim_kwargs)

In [14]:
# latitude/longitude info
from climate_learn.models import set_climatology
set_climatology(model_module, data_module)

## Training

In [15]:
from climate_learn.training import Trainer

trainer = Trainer(
    seed = 0,
    accelerator = "gpu",
    precision = 16,
    max_epochs = 5,
)

INFO:lightning_fabric.utilities.seed:Global seed set to 0


In [16]:
trainer.fit(model_module, data_module)

  rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")


Output()

## Evaluation

In [17]:
trainer.test(model_module, data_module)

Output()