<a name="software-requirements"></a>
# Software Requirements

In [None]:
%%bash
python -m pip install --upgrade pip
pip install git+https://github.com/aditya-grover/climate-learn.git

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


<a name="temporal-forecasting"></a>
# Temporal Forecasting


In [3]:
from climate_learn.data import download

In [4]:
# Download data from copernicus (~15-20 mins)
# Generate API KEY: https://cds.climate.copernicus.eu/api-how-to
# api_key = "your_api_key" # Replace with your own CDS API key; never commit a real key
# download(source = "copernicus", variable = "2m_temperature", dataset = "era5", year = 1979, api_key = api_key)

In [5]:
# Download data from weatherbench (~2-3 minutes)
download(root = "/content/drive/MyDrive/Climate/.climate_tutorial", source = "weatherbench", variable = "2m_temperature", dataset = "era5", resolution = "5.625")

Downloading era5 2m_temperature data for 5.625 resolution from weatherbench to /content/drive/MyDrive/Climate/.climate_tutorial/data/weatherbench/era5/5.625/2m_temperature


The ERA5 data downloaded from WeatherBench is laid out with one NetCDF file per year:

```
|-- 5.625deg
|   |-- 2m_temperature
|       |-- 2m_temperature_1979_5.625deg.nc
|       |-- 2m_temperature_1980_5.625deg.nc
|       |-- ...
|       |-- 2m_temperature_2018_5.625deg.nc
```
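As a quick sanity check on a download, the file names implied by the tree above can be generated and compared against the directory contents. This is a minimal sketch: `expected_files` is a hypothetical helper, not part of `climate_learn`, and the naming pattern is inferred only from the listing shown here.

```python
def expected_files(variable, resolution, first_year, last_year):
    """Build the per-year NetCDF file names implied by the directory tree.

    Assumes the pattern <variable>_<year>_<resolution>deg.nc seen in the listing.
    """
    return [
        f"{variable}_{year}_{resolution}deg.nc"
        for year in range(first_year, last_year + 1)  # last_year is inclusive
    ]


files = expected_files("2m_temperature", "5.625", 1979, 2018)
print(len(files))  # one file per year from 1979 to 2018
print(files[0])
print(files[-1])
```

Comparing this list with the output of `os.listdir` on the variable's directory is one way to spot a partially completed download.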

## Data Preprocessing


In [6]:
from climate_learn.utils.data import load_dataset, view

dataset = load_dataset("/content/drive/MyDrive/Climate/.climate_tutorial/data/weatherbench/era5/5.625/2m_temperature")
view(dataset)

| | Array | Chunk |
|---|---|---|
| Bytes | 2.68 GiB | 68.62 MiB |
| Shape | (350640, 32, 64) | (8784, 32, 64) |
| Count | 120 Tasks | 40 Chunks |
| Type | float32 | numpy.ndarray |
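The numbers in this summary can be cross-checked by hand. The sketch below assumes hourly data covering 1979 through 2018 (which is what the 40 yearly files suggest) and 4 bytes per `float32` value; under those assumptions the time dimension, the total size, and the largest chunk size all follow from simple arithmetic.

```python
import calendar


def hours_in_years(first_year, last_year):
    """Count hourly time steps across an inclusive range of calendar years."""
    return sum(
        (366 if calendar.isleap(year) else 365) * 24
        for year in range(first_year, last_year + 1)
    )


n_lat, n_lon = 32, 64      # grid shape reported in the summary
bytes_per_value = 4        # float32

n_hours = hours_in_years(1979, 2018)
total_gib = n_hours * n_lat * n_lon * bytes_per_value / 2**30

# The reported chunk length of 8784 steps corresponds to one leap year of hours.
leap_year_hours = 366 * 24
chunk_mib = leap_year_hours * n_lat * n_lon * bytes_per_value / 2**20

print(n_hours)          # 350640
print(leap_year_hours)  # 8784
print(total_gib, chunk_mib)
```

The 40 chunks in the summary are consistent with one chunk per yearly file.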


## Data Conversion
We then convert the *NetCDF* files into *PyTorch* dataloaders.

We store useful information about the data, such as the latitude and longitude coordinates ('lat', 'long') of the regions, as _data members_ of our dataloaders.
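To illustrate the idea of keeping coordinates as data members, here is a minimal sketch of a map-style dataset, that is, an object exposing `__len__` and `__getitem__`, which is the protocol PyTorch's `DataLoader` expects. It is written in plain Python and is not the `climate_learn` implementation: the class name, the attribute names `lat` and `lon`, and the cell-centred coordinate values are all assumptions made for this example.

```python
class GriddedFieldDataset:
    """Hypothetical map-style dataset over a sequence of 2-D grids.

    The coordinates travel with the data as plain attributes, so any
    downstream code (plotting, metrics) can read them from the dataset.
    """

    def __init__(self, fields, lat, lon):
        # fields[t][i][j] is the value at time t, latitude lat[i], longitude lon[j]
        for grid in fields:
            if len(grid) != len(lat) or any(len(row) != len(lon) for row in grid):
                raise ValueError("every grid must have shape (len(lat), len(lon))")
        self.fields = fields
        self.lat = lat  # data member: latitude of each grid row
        self.lon = lon  # data member: longitude of each grid column

    def __len__(self):
        return len(self.fields)

    def __getitem__(self, idx):
        return self.fields[idx]


# Assumed coordinates for a 5.625 degree grid (32 x 64 cells):
# cell-centred latitudes and longitudes starting at 0.
step = 5.625
lat = [-90 + step / 2 + step * i for i in range(32)]
lon = [step * j for j in range(64)]

# Three dummy time steps filled with zeros stand in for real temperature fields.
fields = [[[0.0] * len(lon) for _ in lat] for _ in range(3)]
ds = GriddedFieldDataset(fields, lat, lon)
print(len(ds), len(ds[0]), len(ds[0][0]))
```

Because the object follows the map-style protocol, it could be wrapped in a `torch.utils.data.DataLoader` once the grids are converted to tensors.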





In [7]:
from climate_learn.utils.datetime import Year, Days, Hours
from climate_learn.data.climate_dataset.args import ERA5Args
from climate_learn.data.tasks.args import ForecastingArgs
from climate_learn.data import DataModuleArgs, DataModule

data_args = ERA5Args(
    root_dir = "/content/drive/MyDrive/Climate/.climate_tutorial/data/weatherbench/era5/5.625/",
    variables = ["2m_temperature"],
    years = range(1979, 2015),
)

forecasting_args = ForecastingArgs(
    dataset_args = data_args,
    in_vars = ["2m_temperature"],
    out_vars = ["2m_temperature"],
    pred_range = 3*24,  # lead time of 72 steps; with hourly data this is 3 days
    subsample = 6,      # presumably keeps every 6th hourly step
)

data_module_args = DataModuleArgs(
    task_args = forecasting_args,
    train_start_year = 1979,
    val_start_year = 2015,
    test_start_year = 2017,
    end_year = 2018,
)

data_module = DataModule(
    data_module_args = data_module_args,
    batch_size = 128,
    num_workers = 1
)

Creating train dataset


100%|██████████| 36/36 [00:00<00:00, 79.05it/s]


Creating val dataset


100%|██████████| 2/2 [00:00<00:00, 10.82it/s]


Creating test dataset


100%|██████████| 2/2 [00:00<00:00, 44.91it/s]
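The year boundaries and the `pred_range` and `subsample` settings above can be made concrete with a small sketch. Both helpers are hypothetical and only mimic what the arguments appear to mean: it is assumed that each split runs up to the next start year, that `end_year` is inclusive, that `pred_range` is an offset in time steps between input and target, and that `subsample` is a stride over the time axis. The resulting split sizes agree with the 36, 2, and 2 items reported by the progress bars.

```python
def split_years(train_start, val_start, test_start, end_year):
    """Partition calendar years into train/val/test (end_year assumed inclusive)."""
    train = list(range(train_start, val_start))
    val = list(range(val_start, test_start))
    test = list(range(test_start, end_year + 1))
    return train, val, test


def forecast_index_pairs(n_steps, pred_range, subsample):
    """Pair each input time index t with the target index t + pred_range.

    Inputs are taken every `subsample` steps; pairs whose target would fall
    beyond the end of the series are dropped.
    """
    return [(t, t + pred_range) for t in range(0, n_steps - pred_range, subsample)]


train, val, test = split_years(1979, 2015, 2017, 2018)
print(len(train), len(val), len(test))  # 36 2 2

# Ten days of hourly data, 3-day lead time, one sample every 6 hours.
pairs = forecast_index_pairs(n_steps=240, pred_range=3 * 24, subsample=6)
print(pairs[0], pairs[-1], len(pairs))  # (0, 72) (162, 234) 28
```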


<a name="spatial-downscaling"></a>
# Spatial Downscaling

## Data Download

To perform climate downscaling, we need 2m temperature data at more than one resolution. In addition to the 5.625deg dataset downloaded above, here we download the 2.8125deg dataset, which divides the Earth's surface into a 64 x 128 (latitude x longitude) grid.
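The grid sizes follow directly from the resolution: 180 degrees of latitude and 360 degrees of longitude divided by the cell size. The helper below is a hypothetical illustration of that arithmetic, assuming a regular grid that covers the whole globe.

```python
def grid_shape(resolution_deg):
    """Return (n_lat, n_lon) for a regular global grid with the given cell size."""
    n_lat = 180 / resolution_deg
    n_lon = 360 / resolution_deg
    if not (n_lat.is_integer() and n_lon.is_integer()):
        raise ValueError("resolution must divide 180 and 360 evenly")
    return int(n_lat), int(n_lon)


print(grid_shape(5.625))   # (32, 64)
print(grid_shape(2.8125))  # (64, 128)
```

Halving the cell size doubles the number of cells along each axis, so the high-resolution target has four times as many grid points as the low-resolution input.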

In [8]:
from climate_learn.data import download

# Download data from weatherbench (~2-3 minutes)
# download(root = "/content/drive/MyDrive/Climate/.climate_tutorial", source = "weatherbench", variable = "2m_temperature", dataset = "era5", resolution = "5.625")
# Download data from weatherbench (~4-6 minutes)
download(root = "/content/drive/MyDrive/Climate/.climate_tutorial", source = "weatherbench", variable = "2m_temperature", dataset = "era5", resolution = "2.8125")

Downloading era5 2m_temperature data for 2.8125 resolution from weatherbench to /content/drive/MyDrive/Climate/.climate_tutorial/data/weatherbench/era5/2.8125/2m_temperature


## Data Conversion

In [9]:
from climate_learn.utils.datetime import Year, Days, Hours
from climate_learn.data.climate_dataset.args import ERA5Args
from climate_learn.data.tasks.args import DownscalingArgs
from climate_learn.data import DataModuleArgs, DataModule

lowres_data_args = ERA5Args(
    root_dir = "/content/drive/MyDrive/Climate/.climate_tutorial/data/weatherbench/era5/5.625/",
    variables = ["2m_temperature"],
    years = range(1979, 2015),
)

highres_data_args = ERA5Args(
    root_dir = "/content/drive/MyDrive/Climate/.climate_tutorial/data/weatherbench/era5/2.8125",
    variables = ["2m_temperature"],
    years = range(1979, 2015),
)

downscaling_args = DownscalingArgs(
    dataset_args = lowres_data_args,
    highres_dataset_args = highres_data_args,
    in_vars = ["2m_temperature"],
    out_vars = ["2m_temperature"],
    subsample = 6,
)

data_module_args = DataModuleArgs(
    task_args = downscaling_args,
    train_start_year = 1979,
    val_start_year = 2015,
    test_start_year = 2017,
    end_year = 2018,
)

data_module = DataModule(
    data_module_args = data_module_args,
    batch_size = 128,
    num_workers = 1
)

Creating train dataset


100%|██████████| 36/36 [00:01<00:00, 34.18it/s]
100%|██████████| 36/36 [00:17<00:00,  2.01it/s]


Creating val dataset


100%|██████████| 2/2 [00:00<00:00, 25.79it/s]
100%|██████████| 2/2 [00:01<00:00,  1.85it/s]


Creating test dataset


100%|██████████| 2/2 [00:00<00:00, 42.52it/s]
100%|██████████| 2/2 [00:00<00:00,  2.11it/s]
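With this data module, each sample pairs a 32 x 64 low-resolution input with a 64 x 128 high-resolution target. A simple reference point for such a task is nearest-neighbour upsampling, which repeats every low-resolution cell to fill the finer grid. The sketch below is plain Python on nested lists and is not a `climate_learn` model; it only shows the shape relationship a downscaling model has to learn to improve on.

```python
def upsample_nearest(grid, factor):
    """Repeat each cell `factor` times along both axes of a 2-D grid."""
    out = []
    for row in grid:
        wide_row = [value for value in row for _ in range(factor)]
        # Copy the widened row so the output rows are independent lists.
        out.extend(list(wide_row) for _ in range(factor))
    return out


small = [[1, 2],
         [3, 4]]
print(upsample_nearest(small, 2))
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]

# A dummy 32 x 64 field upsampled by 2 matches the 64 x 128 target grid.
lowres = [[0.0] * 64 for _ in range(32)]
highres_guess = upsample_nearest(lowres, 2)
print(len(highres_guess), len(highres_guess[0]))  # 64 128
```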
