# Get started with DownClim

`DownClim` is a Python package for downscaling future climate projections. Starting from a reference product and a set of projections from [CMIP6](https://pcmdi.llnl.gov/CMIP6/) and [CORDEX](https://cordex.org/) simulations, `DownClim` helps you generate a climate projection for your region of interest.

Let's see it in action!

## The `DownClimContext` object 

In [1]:
from __future__ import annotations

from downclim.downclim import DownClimContext
from downclim.dataset.cmip6 import CMIP6Context

The `DownClimContext` object is the main object of the `DownClim` package. It contains all the information needed to perform a downscaling on your area of interest.

The documentation of the `DownClimContext` object gives an extensive overview of how to define it properly. The following section covers the main fields needed to perform a downscaling.

### Main fields of the `DownClimContext` object

`DownClimContext` objects contain multiple attributes. The most relevant ones are:

- `aoi`: the areas of interest. You can define one or several areas of interest, each given as a `str`, a `geopandas.GeoDataFrame`, or a Python `tuple` of coordinates.
    - `str`: the name must be recognized by [GADM](https://gadm.org/), which is used to retrieve the corresponding shapefile.
    - `geopandas.GeoDataFrame`: directly contains the shapefile of the area of interest.
    - `tuple`: coordinates `(lon_min, lon_max, lat_min, lat_max, str)`, where the last `str` is the name of the area of interest.

- `variable`: the climate variables to downscale. Provide them as a list of strings whose names match the `CMIP6` / `CORDEX` naming convention (see [CMIP6 CMOR Tables](https://github.com/PCMDI/cmip6-cmor-tables)). For example, `['tas', 'pr']` selects 2 m temperature and precipitation. The selected variables must be available both in the climate simulations and in the reference product.

- `baseline_product`: the reference product used for downscaling.

- `downscaling_method`: the downscaling method to use. So far, `DownClim` only supports the `bias_correction` method; other methods can be added easily and will be available soon.

- `use_cmip6` & `use_cordex`: whether to use CMIP6 or CORDEX simulations. You can select the climate simulation you want to use for your future projections. `CMIP6` projections are available for a large number of models and scenarios, at global scale and usually at a coarse resolution. `CORDEX` simulations are regional projections, much less numerous but at a finer resolution.

- `baseline_period`, `evaluation_period`, `projection_period`: the years to use for the baseline, evaluation and projection periods. All three must be defined to perform a downscaling.
  - The baseline period is used to calibrate the downscaling method. It is a historical period that must overlap the reference product and the historical climate simulations.
  - The evaluation period overlaps the reference product and the future climate simulations. It is used to evaluate the calibration of the downscaling method performed on the baseline period.
  - The projection period is the period for which you want to generate downscaled future climate projections.
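
The constraints on the three periods can be checked before any data is downloaded. The sketch below is a hypothetical helper, not part of `DownClim`. It assumes that periods are inclusive `(start_year, end_year)` pairs and, more strictly than the text requires, that the baseline and evaluation periods lie entirely inside the coverage of the reference product.

```python
from __future__ import annotations

Period = tuple[int, int]  # inclusive (start_year, end_year)


def is_within(period: Period, coverage: Period) -> bool:
    """Return True if `period` lies entirely inside `coverage`."""
    return coverage[0] <= period[0] and period[1] <= coverage[1]


def check_periods(
    baseline: Period,
    evaluation: Period,
    projection: Period,
    product_coverage: Period,
) -> list[str]:
    """Hypothetical check: return a list of problems (empty if none were found)."""
    problems = []
    named = {"baseline": baseline, "evaluation": evaluation, "projection": projection}
    for name, period in named.items():
        if period[0] > period[1]:
            problems.append(f"{name} period starts after it ends: {period}")
        # Only the baseline and evaluation periods are compared with the reference product.
        elif name != "projection" and not is_within(period, product_coverage):
            problems.append(f"{name} period {period} is outside {product_coverage}")
    return problems


# Periods from the example configuration file; (1980, 2019) is an assumed product coverage.
print(check_periods((1980, 2005), (2006, 2019), (2071, 2100), (1980, 2019)))
```

An empty list means that no problem was found under these assumptions.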


You can create a `DownClimContext` object either directly, or by defining a configuration file in a `.yaml` format.

### Creating a `DownClimContext` object directly

You can instantiate a `DownClimContext` object directly, either by passing the necessary information to the constructor or by providing a dictionary with the required keys. Not all fields are mandatory; default values are used for omitted fields.

In [None]:
DownClimContext_example = DownClimContext(
    aoi="Vanuatu",
    variable=["tas", "pr"],
    baseline_period=(1980, 1981),
    evaluation_period=(2018, 2019),
    projection_period=(2099, 2100),
    use_cordex=False,
    use_cmip6=True,
    cmip6_context=CMIP6Context()
)
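
The dictionary route can be sketched with plain keyword unpacking, which is equivalent to calling the constructor with the same arguments. This is an assumption about usage rather than a documented entry point. `build_context` is a hypothetical helper, and the imports sit inside it so that the dictionary can be inspected without `DownClim` installed.

```python
# Same settings as in the constructor call, collected in a plain dictionary.
context_settings = {
    "aoi": "Vanuatu",
    "variable": ["tas", "pr"],
    "baseline_period": (1980, 1981),
    "evaluation_period": (2018, 2019),
    "projection_period": (2099, 2100),
    "use_cordex": False,
    "use_cmip6": True,
}


def build_context(settings: dict):
    """Hypothetical helper: build a DownClimContext from a settings dictionary."""
    # Imported lazily so this block loads even where DownClim is not installed.
    from downclim.dataset.cmip6 import CMIP6Context
    from downclim.downclim import DownClimContext

    # Keyword unpacking passes each dictionary key as a constructor argument.
    return DownClimContext(cmip6_context=CMIP6Context(), **settings)
```

Calling `build_context(context_settings)` then requires `DownClim` to be installed.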


### Creating a `DownClimContext` object from a `.yaml` file

You can also define a `DownClimContext` object from a `.yaml` file. First, generate a template `.yaml` file.

In [3]:
from downclim.downclim import generate_DownClimContext_template_file

generate_DownClimContext_template_file(output_file="./DownClimContext_example.yaml")

After filling in the sections of the file, your `DownClim` configuration file should look like this:

```yaml
##
## All fields except aoi have default values. If the field is not specified, the default value is used.
####################################################
# aoi
aoi: "Vanuatu"

# variables
variable: ["tas", "tasmin", "tasmax", "pr"]
time_frequency: "mon"

# downscaling
downscaling_aggregation: "monthly-mean"
downscaling_method: "bias_correction"

# data and simulations
baseline_product: "chelsa2"
use_cordex: false
use_cmip6: true
cordex_context: {}
cmip6_context: {}
baseline_period: [1980, 2005]
evaluation_period: [2006, 2019]
projection_period: [2071, 2100]
evaluation_product: ["chirps", "chelsa2"]

# internal computation
nb_threads: 2
memory_mb: 4096
chunks: {"time": 1, "lat": 1000, "lon": 1000}

# directories
output_dir: "results"
tmp_dir: "tmp"
keep_tmp_dir: false

# data access
esgf_credential: "../config/esgf_credential.yaml"
```

In [2]:
from downclim.downclim import define_DownClimContext_from_file

DownClimContext_example = define_DownClimContext_from_file('./DownClimContext_example.yaml')
DownClimContext_example.model_dump()

{'aoi': [                                            geometry GID_0   NAME_0
  0  MULTIPOLYGON (((169.7766 -20.2488, 169.7719 -2...   VUT  Vanuatu],
 'variable': ['tas', 'pr'],
 'time_frequency': <Frequency.MONTHLY: 'monthly'>,
 'downscaling_aggregation': <Aggregation.MONTHLY_MEAN: 'monthly-mean'>,
 'baseline_product': <DataProduct.CHELSA: product_name='chelsa', period=(1980, 2019), scale_factor={'pr': 0.1, 'tas': 0.1, 'tasmin': 0.1, 'tasmax': 0.1}, add_offset={'pr': 0, 'tas': -273.15, 'tasmin': -273.15, 'tasmax': -273.15}, url='https://os.zhdk.cloud.switch.ch/chelsav2/GLOBAL'>,
 'evaluation_product': [<DataProduct.CHELSA: product_name='chelsa', period=(1980, 2019), scale_factor={'pr': 0.1, 'tas': 0.1, 'tasmin': 0.1, 'tasmax': 0.1}, add_offset={'pr': 0, 'tas': -273.15, 'tasmin': -273.15, 'tasmax': -273.15}, url='https://os.zhdk.cloud.switch.ch/chelsav2/GLOBAL'>],
 'downscaling_method': <DownscaleMethod.BIAS_CORRECTION: 'bias_correction'>,
 'use_cordex': False,
 'use_cmip6': True,
 ...
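
The `scale_factor` and `add_offset` entries in the output above describe how stored values map to physical units. Here is a minimal sketch, assuming the usual convention `value = raw * scale_factor + add_offset`; whether `DownClim` applies them in exactly this way is an assumption, and the raw value is made up for the example.

```python
# Values copied from the CHELSA entry in the output above.
SCALE_FACTOR = {"pr": 0.1, "tas": 0.1, "tasmin": 0.1, "tasmax": 0.1}
ADD_OFFSET = {"pr": 0, "tas": -273.15, "tasmin": -273.15, "tasmax": -273.15}


def to_physical(raw: float, variable: str) -> float:
    """Scale first, then offset (assumed order: raw * scale_factor + add_offset)."""
    return raw * SCALE_FACTOR[variable] + ADD_OFFSET[variable]


raw_tas = 2981  # hypothetical stored value, i.e. 298.1 K after scaling
print(round(to_physical(raw_tas, "tas"), 2))  # 24.95, in degrees Celsius
```

With these settings, the offset of `-273.15` converts scaled temperatures from kelvin to degrees Celsius, while precipitation is only rescaled.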

## Download required data

Once your `DownClimContext` object is defined correctly, you can download the data it requires with the `download_data` method of the `DownClimContext` object.

In [None]:
DownClimContext_example.download_data()

Downloading baseline product...
Downloading CHELSA data...
Getting year "1980" for variables "tas" and areas of interest : "['Vanuatu']"



Getting year "1987" for variables "tas" and areas of interest : "['Vanuatu']"



Concatenating data for area of interest : Vanuatu
saving file ./tmp/chelsa_Vanuatu_tas_1987.nc
Getting year "1988" for variables "tas" and areas of interest : "['Vanuatu']"
Concatenating data for area of interest : Vanuatu
saving file ./tmp/chelsa_Vanuatu_tas_1988.nc
Getting year "1989" for variables "tas" and areas of interest : "['Vanuatu']"
Concatenating data for area of interest : Vanuatu
saving file ./tmp/chelsa_Vanuatu_tas_1989.nc
Getting year "1990" for variables "tas" and areas of interest : "['Vanuatu']"
Concatenating data for area of interest : Vanuatu
saving file ./tmp/chelsa_Vanuatu_tas_1990.nc
Getting year "1994" for variables "tas" and areas of interest : "['Vanuatu']"
Concatenating data for area of interest : Vanuatu
saving file ./tmp/chelsa_Vanuatu_tas_1994.nc
Getting year "2001" for variables "tas" and areas of interest : "['Vanuatu']"
Concatenating data for area of interest : Vanuatu
saving file ./tmp/chelsa_Vanuatu_tas_2001.nc
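
Since the years in the log above do not arrive in sequence, it can help to check which years have actually been saved. The sketch below assumes the file naming seen in the log (`chelsa_<aoi>_<variable>_<year>.nc`). `missing_years` is a hypothetical helper, not a `DownClim` function.

```python
import re
from pathlib import Path

# File naming inferred from the log: chelsa_<aoi>_<variable>_<year>.nc
FILE_PATTERN = re.compile(r"chelsa_(?P<aoi>.+)_(?P<variable>[a-z]+)_(?P<year>\d{4})\.nc$")


def missing_years(filenames, aoi, variable, period):
    """Return the years of the inclusive `period` that have no matching file."""
    found = set()
    for name in filenames:
        match = FILE_PATTERN.match(Path(name).name)
        if match and match["aoi"] == aoi and match["variable"] == variable:
            found.add(int(match["year"]))
    start, end = period
    return [year for year in range(start, end + 1) if year not in found]


# File names taken from the log above.
saved = [
    "./tmp/chelsa_Vanuatu_tas_1987.nc",
    "./tmp/chelsa_Vanuatu_tas_1988.nc",
    "./tmp/chelsa_Vanuatu_tas_1989.nc",
    "./tmp/chelsa_Vanuatu_tas_1990.nc",
    "./tmp/chelsa_Vanuatu_tas_1994.nc",
    "./tmp/chelsa_Vanuatu_tas_2001.nc",
]
print(missing_years(saved, "Vanuatu", "tas", (1987, 1994)))  # [1991, 1992, 1993]
```

To check a real directory, the list could be built with `Path("tmp").glob("chelsa_*.nc")` instead.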


## Perform a downscaling

Once your `DownClimContext` object is defined correctly, you can directly perform a downscaling according to the context.

If your data (reference product or simulations) have not been downloaded yet, `DownClim` will download them first.
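
To make the role of the three periods concrete, here is a minimal sketch of an additive mean bias correction on plain Python lists. It illustrates the general idea only and is not `DownClim`'s implementation; the function names and all numbers are made up for the example.

```python
from statistics import mean


def fit_bias(sim_baseline, ref_baseline):
    """Calibration: mean offset of the simulation from the reference over the baseline period."""
    return mean(sim_baseline) - mean(ref_baseline)


def apply_bias(simulated, bias):
    """Correction: subtract the calibrated offset from simulated values."""
    return [value - bias for value in simulated]


# Made-up annual mean temperatures in degrees Celsius.
ref_baseline = [24.0, 24.5, 25.0]
sim_baseline = [25.5, 26.0, 26.5]  # the simulation runs 1.5 degrees too warm
ref_evaluation = [25.0, 25.0]
sim_evaluation = [26.5, 26.75]
sim_projection = [28.5, 29.0]

# Baseline period: calibrate the correction.
bias = fit_bias(sim_baseline, ref_baseline)

# Evaluation period: compare the corrected simulation with reference data not used for calibration.
corrected_evaluation = apply_bias(sim_evaluation, bias)
evaluation_error = mean(corrected_evaluation) - mean(ref_evaluation)

# Projection period: apply the same correction to the future simulation.
corrected_projection = apply_bias(sim_projection, bias)

print(bias, evaluation_error, corrected_projection)  # 1.5 0.125 [27.0, 27.5]
```

The residual error on the evaluation period indicates how well a correction calibrated on the baseline period carries over to years it was not fitted on.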
