# An extensive walk through `DownClim`

```{note}
Although all the methods and functions described below are fully functional, this is not the recommended way to use the library, because it does not take full advantage of the workflow and abstractions that `DownClim` provides. See the other notebooks in the `examples` section for a more efficient way to use `DownClim`.

However, these functions give you finer control over how your data is handled, and you may eventually find your own way to use `DownClim`!
```

In [1]:
from importlib_resources import files

from downclim import get_logger

#logger = get_logger("downclim")
#logger.setLevel("ERROR")

2025-08-29 14:18:12,380 - downclim - INFO - DownClim starting... Enjoy!
2025-08-29 14:18:12,381 - downclim - INFO - DownClim logging system initialized
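
The two commented lines in the cell above hint at how to quiet these messages. Here is a minimal sketch, assuming that DownClim's loggers are ordinary standard-library `logging` loggers registered under the `downclim` namespace. The logger names come from the log output in this notebook, but the use of `logging` itself is an assumption.

```python
import logging

# Assumption: DownClim registers its loggers under the "downclim" namespace
# of the standard `logging` hierarchy (e.g. "downclim", "downclim.aoi").
package_logger = logging.getLogger("downclim")
package_logger.setLevel(logging.ERROR)

# A child logger with no explicit level inherits the level of its parent,
# so setting the level once on "downclim" also covers "downclim.aoi".
child_logger = logging.getLogger("downclim.aoi")
print(logging.getLevelName(child_logger.getEffectiveLevel()))
```

If the assumption holds, INFO messages such as the two lines above would no longer be printed, while errors would still appear.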


## Definition of the workflow

In this intentionally simplified example, we will use the `DownClim` library to:
- Define some areas of interest for using `DownClim`.
- Download `CHELSA` data for the historical period 1980-1981, to be used later as the baseline product for downscaling.
- Download `CHIRPS` and `GSHTD` data for the evaluation period 2006-2007, to be used later to evaluate the downscaled simulations.
- Download `CORDEX` and `CMIP6` data for the historical, evaluation and projection periods.
- Downscale `CMIP6` and `CORDEX` data over the evaluation and projection periods.
- Evaluate the performance of the downscaled data.

We will use the following periods. They are deliberately short to limit download and computation time, so they have no meaningful climatological interpretation and serve only as a demonstration:

- Historical period: (1980, 1981) 
- Evaluation period: (2006, 2007)
- Projection period: (2071, 2072)

Please refer to the [getting started tutorial](../getting_started.md) for a definition of these periods.

In [2]:
historical_period = (1980, 1981)
evaluation_period = (2006, 2007)
projection_period = (2071, 2072)
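
Each period is a `(start_year, end_year)` tuple. As a sanity check before launching long downloads, the sketch below validates the three tuples. `check_periods` is a hypothetical helper, not part of `DownClim`, and the requirement that the periods follow each other without overlapping is an assumption made for this illustration.

```python
def check_periods(historical, evaluation, projection):
    """Validate three (start_year, end_year) tuples (hypothetical helper)."""
    periods = {
        "historical": historical,
        "evaluation": evaluation,
        "projection": projection,
    }
    # Each period must start no later than it ends.
    for name, (start, end) in periods.items():
        if start > end:
            raise ValueError(f"{name} period starts after it ends: {(start, end)}")
    # Assumption: the periods are chronological and do not overlap.
    if not historical[1] < evaluation[0] < evaluation[1] + 1 <= projection[0]:
        raise ValueError("periods must be ordered: historical, evaluation, projection")
    return periods


periods = check_periods((1980, 1981), (2006, 2007), (2071, 2072))
print(list(periods))
```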

Please refer to the [DownClim Datasets documentation](./datasets.ipynb) for details about datasets available with DownClim.

Do not hesitate to explore the `DownClim` documentation, especially the [API documentation](reference.md), to see all the options and default values used by each function.

## Areas of interest

We first need to define the areas of interest (AOIs). They set the boundaries used when downloading the data and the areas over which the downscaling is computed.

There are multiple ways to define an area of interest (see `get_aoi` in the [API documentation](reference.md)).

In [None]:
from shapely.geometry import MultiPolygon
import geopandas as gpd

from downclim.aoi import get_aoi, get_aoi_informations

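# A custom geometry: a unit square with a small square hole
ob = MultiPolygon([
(((0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)),
[((0.1,0.1), (0.1,0.2), (0.2,0.2), (0.2,0.1))])
])
gdf = gpd.GeoDataFrame({"geometry":ob, "NAME_0":["ob"]})

aoi1 = get_aoi("Vanuatu", save_aoi_file=True)  # from a country name
aoi2 = get_aoi((10, 10, 15, 15, "box"), save_aoi_file=True)  # from a bounding box and a name
aoi3 = get_aoi(gdf, save_aoi_file=False)  # from an existing GeoDataFrame
# Only the first two AOIs are used in the rest of this notebook
aoi = [aoi1, aoi2]
aoi_name, aoi_bounds = get_aoi_informations(aoi)
print(aoi_name, aoi_bounds)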

2025-08-29 14:18:16,833 - downclim.aoi - INFO - Retrieving AOI from Vanuatu
2025-08-29 14:18:16,835 - downclim.aoi - INFO -    AOI given as a string: retrieving from GADM for Vanuatu
2025-08-29 14:18:20,241 - downclim.aoi - INFO -    AOI CRS not defined: setting to EPSG:4326 / WGS84
2025-08-29 14:18:20,369 - downclim.aoi - INFO - Retrieving AOI from (10, 10, 15, 15, 'box')
2025-08-29 14:18:20,370 - downclim.aoi - INFO -    AOI given as a tuple: creating geometry for box: (10, 10, 15, 15), named box
2025-08-29 14:18:20,373 - downclim.aoi - INFO -    AOI CRS not defined: setting to EPSG:4326 / WGS84
2025-08-29 14:18:20,380 - downclim.aoi - INFO - Retrieving AOI names and bounds
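
The tuple passed to `get_aoi` above bundles four numbers and a name. As an illustration only, the sketch below expands such a tuple into the closed ring of corner coordinates of a rectangle. It assumes the order `(xmin, ymin, xmax, ymax, name)`, which this notebook does not confirm because the example values are symmetric. `bbox_to_ring` is a hypothetical helper, not part of `DownClim`.

```python
def bbox_to_ring(bbox):
    """Expand (xmin, ymin, xmax, ymax, name) into a closed corner ring.

    Hypothetical helper; the coordinate order is an assumption.
    """
    xmin, ymin, xmax, ymax, name = bbox
    if xmin >= xmax or ymin >= ymax:
        raise ValueError(f"empty or inverted bounding box: {bbox}")
    # Walk the four corners and repeat the first one to close the ring.
    ring = [
        (xmin, ymin),
        (xmin, ymax),
        (xmax, ymax),
        (xmax, ymin),
        (xmin, ymin),
    ]
    return name, ring


name, ring = bbox_to_ring((10, 10, 15, 15, "box"))
print(name, ring)
```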


## Downloading data

### Authentication

First, we need to authenticate to Earth Engine to retrieve `GSHTD`, `CHIRPS` and `CMIP6` data.

`CORDEX` data additionally requires authentication to `ESGF`. The login information can be provided in a separate credential file.

In [4]:
from __future__ import annotations

import ee

In [5]:
your_ee_project_id = "downclim"
your_esgf_credential_file = files("downclim").parent.parent.joinpath("config", "esgf_credential.yaml")

In [6]:
# Initialize Earth Engine (used to retrieve GSHTD, CHIRPS and CMIP6 data).
# Run ee.Authenticate() first if necessary; this is only needed once per machine.
ee.Initialize(opt_url="https://earthengine-highvolume.googleapis.com", project=your_ee_project_id)

### Download CHELSA data

In [7]:
from downclim.dataset.chelsa2 import get_chelsa2

get_chelsa2(
    aoi = aoi,
    variable = ["pr", "tas"],
    period = historical_period,
    keep_tmp_dir = True,
)

2025-08-29 14:18:29,747 - downclim.dataset.chelsa2 - INFO - Downloading CHELSA data...
2025-08-29 14:18:29,750 - downclim.dataset.utils - INFO - Checking output directory...
2025-08-29 14:18:29,754 - downclim.dataset.utils - INFO - Setting output directory: ./results/chelsa
2025-08-29 14:18:29,755 - downclim.dataset.utils - INFO - Checking output directory...
2025-08-29 14:18:29,757 - downclim.dataset.utils - INFO - Setting output directory: ./results/tmp/chelsa
2025-08-29 14:18:29,759 - downclim.aoi - INFO - Retrieving AOI names and bounds
2025-08-29 14:18:29,761 - downclim.dataset.chelsa2 - INFO - Downloading CHELSA data...
              but the behaviour of the function is not affected.
              If this is not the desired behavior, please remove the file(s) from the temporary folder
              ./results/tmp/chelsa and rerun the function.
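
The message above suggests removing leftover files from the temporary folder `./results/tmp/chelsa` before rerunning. The sketch below is one way to inspect and optionally clear that folder using only the standard library. `list_tmp_files` and `clear_tmp_files` are hypothetical helpers, not part of `DownClim`, and the sketch deliberately defaults to a dry run.

```python
from pathlib import Path


def list_tmp_files(tmp_dir):
    """Return the files directly inside tmp_dir (empty list if it is missing)."""
    tmp_path = Path(tmp_dir)
    if not tmp_path.is_dir():
        return []
    return sorted(p for p in tmp_path.iterdir() if p.is_file())


def clear_tmp_files(tmp_dir, dry_run=True):
    """List leftover files and delete them only when dry_run is False."""
    leftovers = list_tmp_files(tmp_dir)
    if not dry_run:
        for leftover in leftovers:
            leftover.unlink()
    return leftovers


# Dry run: report what would be removed without deleting anything.
leftovers = clear_tmp_files("./results/tmp/chelsa", dry_run=True)
print(f"{len(leftovers)} leftover file(s) in ./results/tmp/chelsa")
```

Passing `dry_run=False` actually deletes the listed files, so it is worth checking the dry-run output first.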

### Download CHIRPS and GSHTD data

In [8]:
from downclim.dataset.chirps import get_chirps
from downclim.dataset.gshtd import get_gshtd

get_chirps(
    aoi = aoi,
    period = evaluation_period,
)

get_gshtd(
    aoi = aoi,
    variable = ["tas"],
    period = evaluation_period,
)


2025-08-29 14:18:58,458 - downclim.dataset.chirps - INFO - Downloading CHIRPS data...
2025-08-29 14:18:58,461 - downclim.dataset.utils - INFO - Checking output directory...
2025-08-29 14:18:58,463 - downclim.dataset.utils - INFO - Setting output directory: ./results/chirps
2025-08-29 14:18:58,465 - downclim.aoi - INFO - Retrieving AOI names and bounds
2025-08-29 14:18:59,505 - downclim.dataset.connectors - INFO - Already connected to Earth Engine with project 'downclim'.
2025-08-29 14:18:59,507 - downclim.dataset.chirps - INFO - Getting CHIRPS data for period : "(2006, 2007)" and area of interest : "Vanuatu"




2025-08-29 14:19:12,203 - downclim.dataset.utils - INFO - Saving chirps grid for Vanuatu...
2025-08-29 14:19:12,212 - downclim.dataset.utils - INFO - Grid saved to ./results/chirps/chirps_Vanuatu_grid.nc
2025-08-29 14:19:12,214 - downclim.dataset.chirps - INFO - Getting CHIRPS data for period : "(2006, 2007)" and area of interest : "box"
2025-08-29 14:19:20,952 - downclim.dataset.utils - INFO - Saving chirps grid for box...
2025-08-29 14:19:20,961 - downclim.dataset.utils - INFO - Grid saved to ./results/chirps/chirps_box_grid.nc
2025-08-29 14:19:20,962 - downclim.dataset.gshtd - INFO - Downloading GSHTD data...
2025-08-29 14:19:20,963 - downclim.dataset.utils - INFO - Checking output directory...
2025-08-29 14:19:20,966 - downclim.dataset.utils - INFO - Setting output directory: ./results/gshtd
2025-08-29 14:19:20,968 - downclim.aoi - INFO - Retrieving AOI names and bounds
2025-08-29 14:19:21,813 - downclim.dataset.connectors - INFO - Already connected to Earth Engine with project 'do

## Download climate simulations

### Download `CORDEX` data

In [None]:
from downclim.dataset.cordex import (
    CORDEXContext,
    get_cordex,
    get_download_scripts,
)

# Define the research context for CORDEX data
cordex_context = CORDEXContext(
    domain=["AUS-22", "AFR-44"],
    experiment=["historical", "rcp26", "rcp85"],
    frequency="mon",
    variable=["pr", "tas"],
)

# Use the previously defined context to list available simulations
# ! This step requires ESGF credentials
cordex_simulations = cordex_context.list_available_simulations(esgf_credential = your_esgf_credential_file)

# Retrieve download scripts for the available simulations
cordex_simulations = get_download_scripts(cordex_simulations)

# Save the list of simulations to a CSV file. This can be useful if you want to perform hand-selection.
cordex_simulations.to_csv("results/cordex/cordex_simulations.csv")

get_cordex(
    aoi = aoi,
    cordex_simulations = cordex_simulations,
    historical_period = historical_period,
    evaluation_period = evaluation_period,
    projection_period = projection_period,
    output_dir = "./results/cordex",
    tmp_dir = "./results/tmp/cordex",
    keep_tmp_dir = True,
    esgf_credential = your_esgf_credential_file
)


FileNotFoundError: [Errno 2] No such file or directory: 'config/esgf_credential.yaml'
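
The error above is raised because no file exists at `config/esgf_credential.yaml`. A minimal sketch of a check that could be run before calling `get_cordex`, so that the problem is reported with the resolved path. `require_file` is a hypothetical helper, not part of `DownClim`.

```python
from pathlib import Path


def require_file(path, description):
    """Return path as a Path, or raise a descriptive error if it is not a file."""
    file_path = Path(path).expanduser()
    if not file_path.is_file():
        raise FileNotFoundError(
            f"{description} not found at '{file_path.resolve()}'. "
            "Create it or point to its actual location before continuing."
        )
    return file_path


try:
    credential_file = require_file("config/esgf_credential.yaml", "ESGF credential file")
except FileNotFoundError as err:
    # Report the problem instead of failing deep inside the download step.
    print(err)
```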

### Download `CMIP6` data

Please refer to the [documentation about getting CMIP simulations](./get_available_simulation.ipynb#CMIP6-simulations) for more details about how to get available simulations.

In [None]:
from downclim.dataset.cmip6 import (
    CMIP6Context,
    get_cmip6,
)

cmip6_context = CMIP6Context(
    project = ["ScenarioMIP", "CMIP"],
    institute = ["NOAA-GFDL", "CMCC"],
    experiment = ["ssp126", "historical"],
    ensemble = "r1i1p1f1",
    frequency = "mon",
    variable = ["tas", "pr"],
    grid_label = "gn",
)

cmip6_simulations = cmip6_context.list_available_simulations()

get_cmip6(
    aoi = aoi,
    cmip6_simulations = cmip6_simulations,
    historical_period = historical_period,
    evaluation_period = evaluation_period,
    projection_period = projection_period,
    output_dir = "./results/cmip6",
)

cmip6_simulations.to_csv("results/cmip6/cmip6_simulations.csv")

       project institute        source  experiment  ensemble table variable  \
0         CMIP      CMCC  CMCC-CM2-SR5  historical  r1i1p1f1  Amon      tas   
1         CMIP      CMCC  CMCC-CM2-SR5  historical  r1i1p1f1  Amon       pr   
2  ScenarioMIP      CMCC  CMCC-CM2-SR5      ssp126  r1i1p1f1  Amon      tas   
3  ScenarioMIP      CMCC  CMCC-CM2-SR5      ssp126  r1i1p1f1  Amon       pr   
4         CMIP      CMCC     CMCC-ESM2  historical  r1i1p1f1  Amon      tas   
5         CMIP      CMCC     CMCC-ESM2  historical  r1i1p1f1  Amon       pr   
6  ScenarioMIP      CMCC     CMCC-ESM2      ssp126  r1i1p1f1  Amon       pr   
7  ScenarioMIP      CMCC     CMCC-ESM2      ssp126  r1i1p1f1  Amon      tas   

  grid_label                                           datanode  \
0         gn  gs://cmip6/CMIP6/CMIP/CMCC/CMCC-CM2-SR5/histor...   
1         gn  gs://cmip6/CMIP6/CMIP/CMCC/CMCC-CM2-SR5/histor...   
2         gn  gs://cmip6/CMIP6/ScenarioMIP/CMCC/CMCC-CM2-SR5...   
3         gn  gs://c
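
Once the list of simulations is saved as CSV, it can be filtered by hand or with a few lines of code. The sketch below uses the standard-library `csv` module on a few rows copied from the table above. The truncated `datanode` column is left out, and the file written by `to_csv` would also contain an index column. `select_rows` is a hypothetical helper; filtering the DataFrame directly would work just as well.

```python
import csv
import io

# Sample rows copied from the table above (without the `datanode` column).
SAMPLE_CSV = """project,institute,source,experiment,ensemble,table,variable,grid_label
CMIP,CMCC,CMCC-CM2-SR5,historical,r1i1p1f1,Amon,tas,gn
CMIP,CMCC,CMCC-CM2-SR5,historical,r1i1p1f1,Amon,pr,gn
ScenarioMIP,CMCC,CMCC-CM2-SR5,ssp126,r1i1p1f1,Amon,tas,gn
ScenarioMIP,CMCC,CMCC-CM2-SR5,ssp126,r1i1p1f1,Amon,pr,gn
CMIP,CMCC,CMCC-ESM2,historical,r1i1p1f1,Amon,tas,gn
"""


def select_rows(csv_text, **criteria):
    """Keep the rows whose columns match every column=value criterion."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row
        for row in reader
        if all(row[column] == value for column, value in criteria.items())
    ]


# Keep only the temperature simulations of one source model.
selected = select_rows(SAMPLE_CSV, source="CMCC-CM2-SR5", variable="tas")
for row in selected:
    print(row["project"], row["experiment"], row["variable"])
```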

## Downscaling

We have now downloaded the `CMIP6` and `CORDEX` simulations needed for all three periods, as well as the baseline product over the historical period. We can proceed with the downscaling.

In [None]:
from pathlib import Path

from downclim.dataset.utils import DataProduct
from downclim.downscale import run_downscaling

baseline_product = DataProduct.CHELSA2
# Downscale all the CMIP6 and CORDEX simulations that were downloaded.
# The glob results are stored as sorted lists so that they can be iterated more than once.
cmip6_simulations_files = sorted(Path("./results/cmip6").glob("*.nc"))
cordex_simulations_files = sorted(Path("./results/cordex").glob("*.nc"))

run_downscaling(
    aoi = aoi,
    historical_period = historical_period,
    evaluation_period = evaluation_period,
    projection_period = projection_period,
    baseline_product = baseline_product,
    cmip6_simulations_to_downscale = cmip6_simulations_files,
    cordex_simulations_to_downscale = cordex_simulations_files,
    downscaling_grid_file = [f"./results/chelsa/chelsa_{aoi_n}_grid.nc" for aoi_n in aoi_name],
    periods_to_downscale = ["evaluation", "projection"],
    input_dir = "./results",
    output_dir = "./results/downscaled",
)
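
Downscaling depends on files produced by the earlier steps: the simulation files in `./results/cmip6` and `./results/cordex`, and one baseline grid file per area of interest. The sketch below checks that these inputs are present before a run is launched. `nc_files` and `preflight` are hypothetical helpers, not part of `DownClim`. The grid file pattern is taken from the `downscaling_grid_file` argument above, and the AOI names from the log output earlier in this notebook.

```python
from pathlib import Path


def nc_files(directory):
    """Return the sorted NetCDF files in directory, or [] if it does not exist."""
    directory = Path(directory)
    return sorted(directory.glob("*.nc")) if directory.is_dir() else []


def preflight(results_dir, aoi_names, baseline="chelsa"):
    """Collect available simulation files and report missing baseline grids."""
    results = Path(results_dir)
    expected_grids = [
        results / baseline / f"{baseline}_{name}_grid.nc" for name in aoi_names
    ]
    return {
        "cmip6": nc_files(results / "cmip6"),
        "cordex": nc_files(results / "cordex"),
        "missing_grids": [grid for grid in expected_grids if not grid.is_file()],
    }


report = preflight("./results", ["Vanuatu", "box"])
for key, paths in report.items():
    print(f"{key}: {len(paths)} file(s)")
```

An empty `cordex` list, for example, would indicate that the CORDEX download did not complete.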


## Evaluation

To evaluate the downscaled projections, two things must already be in place: the evaluation products downloaded over the evaluation period, and the downscaling performed over that same period (cf. the `periods_to_downscale` parameter of the `run_downscaling` function).

In [None]:
from downclim.evaluation import run_evaluation

evaluation_product = [DataProduct.CHIRPS, DataProduct.GSHTD]
run_evaluation(
    aoi = aoi,
    evaluation_period = evaluation_period,
    evaluation_product = evaluation_product,
    cmip6_simulations_to_evaluate = [],
    cordex_simulations_to_evaluate = [],
    input_dir = "./results/downscaled",
    output_dir = "./results/evaluation"
)