# Week 2 Day 3 Lab 

LEAP Summer Bootcamp 2025

Aya Lahlou

## Model Comparison Lab: Predicting TAS from CO₂ and CH₄

Choose **two model architectures** discussed in the bootcamp to **predict TAS** (near-surface air temperature) from **CO₂ and CH₄** under different scenarios (refer to the W2D1 lab).

Use the **Hugging Face library or Model Zoo** to explore and load model examples.

### For Each Model, Complete the Following:

- **Explore a data processing method:**
  - Try at least one of the following:
    - Normalization
    - Standardization
    - Downsampling (temporal or spatial resolution)
    - Data interpolation to increase resolution

- **Experiment with hyperparameters:**
  - Tune hyperparameters using one of the following approaches:
    - Grid search
    - Random search
    - Manual/arbitrary tuning
    - Load pretrained/tuned hyperparameters from Hugging Face

- **Evaluate model performance using 2–3 metrics:**
  - Examples include:
    - RMSE (Root Mean Square Error)
    - MAE (Mean Absolute Error)
    - Bias
    - R² (Coefficient of Determination)

- **Train your model** on the CO₂ and CH₄ data to predict TAS, using different sets of hyperparameters.

- **Record and compare performance metrics** across different hyperparameter settings:
  - Which set performed best?
  - Provide a brief interpretation of the results.

- **Perform uncertainty analysis:**
  - As in Lab 1, plot the **range of predictions**.
  - Comment on **model uncertainty** and its spatial/temporal patterns.

- **Plot your results**:
  - Visualize model predictions.
  - Comment on prediction performance and observed **spatial patterns**.
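
As a rough illustration of how the preprocessing, tuning, and evaluation steps above fit together, the sketch below standardizes inputs with training-set statistics, runs a small grid search, and reports RMSE, MAE, bias, and R². It rests on assumptions that are not part of the lab: the data are synthetic, the model is a closed-form ridge regression used only as a stand-in for the bootcamp architectures, and the names `standardize`, `fit_ridge`, `metrics`, and `alpha` are illustrative. Bias is taken here as mean(prediction minus truth), which is one common convention.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption, NOT ClimateBench): two forcing-like
# inputs and a temperature-like target with a known linear signal plus noise.
n = 200
X = np.column_stack([
    rng.uniform(300.0, 900.0, n),  # hypothetical CO2-like input
    rng.uniform(0.2, 0.8, n),      # hypothetical CH4-like input
])
y = 0.004 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(0.0, 0.1, n)

X_train, X_val = X[:150], X[150:]
y_train, y_val = y[:150], y[150:]


def standardize(train, other):
    """Scale both arrays with the mean and std of the training array only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    return (train - mean) / std, (other - mean) / std


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    yc = y - y.mean()
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    b = y.mean() - x_mean @ w
    return w, b


def metrics(y_true, y_pred):
    """RMSE, MAE, bias (prediction minus truth), and R^2."""
    err = y_pred - y_true
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        "bias": float(np.mean(err)),
        "r2": float(1.0 - ss_res / ss_tot),
    }


X_train_s, X_val_s = standardize(X_train, X_val)

# Grid search over one hypothetical hyperparameter (the ridge penalty).
results = {}
for alpha in [0.01, 1.0, 100.0, 10000.0]:
    w, b = fit_ridge(X_train_s, y_train, alpha)
    results[alpha] = metrics(y_val, X_val_s @ w + b)

best_alpha = min(results, key=lambda a: results[a]["rmse"])
for alpha, m in results.items():
    print(alpha, m)
print("best alpha by validation RMSE:", best_alpha)
```

The same loop structure should carry over to a neural network: replace `fit_ridge` with your training routine, replace the `alpha` grid with the hyperparameters you want to compare, and keep the scaling statistics tied to the training split so the validation data do not leak into preprocessing.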

### Bonus

- Explore **interpretability methods** (e.g., SHAP, attention visualization) using Hugging Face tools to better understand the relationship between CO₂, CH₄, and TAS.
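
SHAP and attention visualization depend on the model and library you choose, so they are not sketched here. As a simpler, model-agnostic alternative (a swapped-in technique, not one named above), the sketch below computes permutation importance: shuffle one input at a time and measure how much the error grows. The data are synthetic, the least-squares model is only a stand-in, and the names `co2_like` and `ch4_like` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, already-standardized stand-in inputs (an assumption, not lab data).
n = 500
X = rng.normal(size=(n, 2))  # columns: hypothetical CO2-like, CH4-like
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.1, n)

# Stand-in model: ordinary least squares with an intercept column.
A = np.column_stack([X, np.ones(n)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)


def predict(X_in):
    """Apply the fitted linear model to an (n_samples, 2) array."""
    return np.column_stack([X_in, np.ones(len(X_in))]) @ coef


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))


baseline = rmse(y, predict(X))

# Shuffle one column at a time and record the mean increase in RMSE.
importance = {}
for j, name in enumerate(["co2_like", "ch4_like"]):
    increases = []
    for _ in range(20):
        X_perm = X.copy()
        X_perm[:, j] = rng.permutation(X_perm[:, j])
        increases.append(rmse(y, predict(X_perm)) - baseline)
    importance[name] = float(np.mean(increases))

print("baseline RMSE:", baseline)
print("mean RMSE increase when shuffled:", importance)
```

For brevity this scores the model on the data it was fitted to; with a real model, computing the importances on a held-out split is usually more informative.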

### Resources

`tsaug` is a Python package for time series augmentation. [github](https://github.com/arundo/tsaug)

`time_series_augmentation` provides methods like jittering, scaling, and time warping with Keras examples. [github](https://github.com/uchidalab/time_series_augmentation)

`tsai - Time Series Data Preparation` shows how to process and prepare time series data using the tsai library. [github](https://github.com/timeseriesAI/tsai/blob/master/tutorial_nbs/00c_Time_Series_data_preparation.ipynb)

`tsai - Time Series Regression` includes model training, evaluation, and hyperparameter tuning examples for time series regression. [github](https://github.com/timeseriesAI/tsai/blob/main/tutorial_nbs/04_Intro_to_Time_Series_Regression.ipynb)

`pytorch-forecasting` is a framework for time series forecasting using PyTorch, with support for interpretable deep learning models and tuning. [github](https://github.com/sktime/pytorch-forecasting)

`tsai - Time Series Classification` walks through training and evaluating deep learning models for classification tasks on time series. [github](https://github.com/timeseriesAI/tsai/blob/main/tutorial_nbs/01_Intro_to_Time_Series_Classification.ipynb)

`TimeSeries-Forecasting` is a full guide to building and evaluating deep learning models on time series datasets. [github](https://github.com/AayushSameerShah/TimeSeries-Forecasting)

`gxercavins/time-series` contains notebooks on forecasting and visualizing uncertainty in time series predictions. [github](https://github.com/gxercavins/time-series)

`tsai - PatchTST Tutorial` demonstrates a transformer-based model for long-term forecasting with visual performance diagnostics. [github](https://github.com/timeseriesAI/tsai/blob/main/tutorial_nbs/15_PatchTST_a_new_transformer_for_LTSF.ipynb)

`AugmentTS` uses generative models for time series data augmentation and interpretability exploration. [github](https://github.com/DrSasanBarak/AugmentTS)



In [None]:
!pip install tensorflow

In [None]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import xarray as xr
from glob import glob

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.layers import *
from tensorflow.keras import Sequential
from utils import * 

%matplotlib inline
%config InlineBackend.figure_format = 'retina'
%load_ext autoreload
%autoreload 2

plt.rcParams['savefig.dpi'] = 400
plt.rcParams['font.size'] = 13
plt.rcParams["legend.frameon"] = False

ClimateBench is a spatiotemporal dataset of simulations generated by the NorESM2 model. It provides both historical simulations and future projections under different scenarios (e.g., ssp245).

Four future scenarios are considered here: `ssp126`, `ssp245`, `ssp370`, and `ssp585`.
1. ssp126 (Low): low population growth, high levels of education, and global cooperation to address environmental and social issues.
2. ssp245 (Medium): intermediate challenges to mitigation and adaptation - moderate population growth, intermediate levels of education, and a balanced emphasis on economic development and environmental sustainability.
3. ssp370 (High): continued high population growth, limited environmental regulations, and slow technological progress in achieving sustainability goals.
4. ssp585 (Very High): rapid, energy-intensive economic growth, limited technological innovation in sustainability, and high reliance on fossil fuels.

In [None]:
cwd = os.getcwd()
train_path = "gs://leap-persistent/jbusecke/data/climatebench/train_val/"
test_path = "gs://leap-persistent/jbusecke/data/climatebench/test/"

In [None]:
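# Scenarios stored under train_path; ssp245 is stored under test_path.
scenarios = ['historical','ssp126','ssp370','ssp585']
inputs = [os.path.join(train_path , f"inputs_{scenario}") for scenario in scenarios]
inputs.append(os.path.join(test_path, "inputs_ssp245"))
# Sort by scenario name (the text after the last underscore) so that the
# input and output path lists share the same order.
inputs.sort(key=lambda x:x.split('_')[-1])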

outputs = [os.path.join(train_path , f"outputs_{scenario}") for scenario in scenarios]
outputs.append(os.path.join(test_path, "outputs_ssp245"))
outputs.sort(key=lambda x:x.split('_')[-1])
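
A quick sanity check on the path lists can catch input/output mismatches before any data are opened. The sketch below rebuilds the two lists exactly as in the cell above and confirms that they end up in the same scenario order. It only manipulates strings and does not touch the bucket; the names `order_in` and `order_out` are illustrative.

```python
import os

train_path = "gs://leap-persistent/jbusecke/data/climatebench/train_val/"
test_path = "gs://leap-persistent/jbusecke/data/climatebench/test/"
scenarios = ['historical', 'ssp126', 'ssp370', 'ssp585']

inputs = [os.path.join(train_path, f"inputs_{s}") for s in scenarios]
inputs.append(os.path.join(test_path, "inputs_ssp245"))
inputs.sort(key=lambda x: x.split('_')[-1])

outputs = [os.path.join(train_path, f"outputs_{s}") for s in scenarios]
outputs.append(os.path.join(test_path, "outputs_ssp245"))
outputs.sort(key=lambda x: x.split('_')[-1])

# The scenario name is the text after the last underscore in each path.
order_in = [p.split('_')[-1] for p in inputs]
order_out = [p.split('_')[-1] for p in outputs]

assert order_in == order_out, "inputs and outputs are not aligned"
print(order_in)  # ['historical', 'ssp126', 'ssp245', 'ssp370', 'ssp585']
```

Because the sort places `ssp245` in the middle of both lists, selecting the test scenario by name is safer than assuming it is the last element.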
