# Welcome to xEarthStat for AgERA5

xEarthStat for AgERA5 allows users to download and aggregate AgERA5 climate data for a specified Region of Interest (ROI). This document outlines the installation process, setup, and usage instructions to get you started.

### Installation

To use xEarthStat for AgERA5, you first need to install the `earthstat` Python package. Run the following command in your Python environment:

### Step 1: Install & Import xEarthStat

In [None]:
# Install the earthstat package

!pip install earthstat

In [None]:
# import the package

from earthstat import xEarthStat as xES

### Step 2: Initialize xEarthStat Workflow

Create an instance of xEarthStat and initialize the workflow.

In [None]:
# create an instance of the workflow

EU_AgERA5 = xES()

Initialize the workflow with the following arguments:

- **ROI Name** (`str`): Unique identifier for your ROI.
- **shape_file_path** (`str`): Path to the shapefile (optional).

In [None]:
ROI_name = 'EU'
shape_file_path = 'EU.shp'

EU_AgERA5.init_workflow(
    ROI_name,
    shape_file_path  # The shapefile path is optional
)

> <span style="color:red;">**Note & Caution:**</span> The shapefile is optional: if you only want to download data without aggregating it, you can omit the shapefile.
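Before initializing the workflow with a shapefile, it can help to confirm that the file and its companion files exist, since a shapefile dataset normally consists of `.shp`, `.shx`, and `.dbf` files sharing one base name. The sketch below is a standalone check using only the standard library. `missing_shapefile_parts` is a hypothetical helper, not part of `earthstat`.

```python
from pathlib import Path


def missing_shapefile_parts(shape_file_path):
    """Return the mandatory shapefile components that are missing on disk.

    Hypothetical helper for illustration; not provided by earthstat.
    """
    base = Path(shape_file_path)
    required = (".shp", ".shx", ".dbf")
    return [
        str(base.with_suffix(ext))
        for ext in required
        if not base.with_suffix(ext).exists()
    ]


missing = missing_shapefile_parts("EU.shp")
if missing:
    print("Missing shapefile components:", missing)
```

An empty list means all three components were found next to each other.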

### Step 3: Download AgERA5 From CDS

Download the AgERA5 data for your ROI from the Copernicus Climate Data Store (CDS) by defining the following:
- **AgERA5_parameters** (`list`): The variables to download from CDS.

> <span style="color:red;">**Note & Caution:**</span> Currently, xEarthStat can download only the 7 variables listed in the table below.

| Variable                 | AgERA5 Parameter            | Statistical Download Type |
|--------------------------|-----------------------------|---------------------------|
| Maximum Temperature      | 2m_temperature              | 24_hour_maximum           |
| Minimum Temperature      | 2m_temperature              | 24_hour_minimum           |
| Mean Temperature         | 2m_temperature              | 24_hour_mean              |
| Solar Radiation Flux     | solar_radiation_flux        | -                         |
| Precipitation Flux       | precipitation_flux          | -                         |
| Wind Speed               | 10m_wind_speed              | 24_hour_mean              |
| Vapour Pressure          | vapour_pressure             | 24_hour_mean              |


- **Bounding Box** (`list` of `float`): The `north`, `west`, `south`, and `east` coordinates of the ROI, in that order.
- **start_year** (`int`): The first year of data to download.
- **end_year** (`int`): The last year of data to download.
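The variable names accepted in `AgERA5_parameters` correspond to the rows of the table above. The lookup below is only an illustrative restatement of that table, pairing the underscore-separated names used in this guide with the table's parameter and statistic columns. `AGERA5_VARIABLES` and `unsupported` are hypothetical names, not part of `earthstat`, and the package may store this mapping differently.

```python
# Illustrative lookup: variable name -> (AgERA5 parameter, statistic).
# None marks the rows whose statistic column is "-" in the table.
AGERA5_VARIABLES = {
    "Maximum_Temperature": ("2m_temperature", "24_hour_maximum"),
    "Minimum_Temperature": ("2m_temperature", "24_hour_minimum"),
    "Mean_Temperature": ("2m_temperature", "24_hour_mean"),
    "Solar_Radiation_Flux": ("solar_radiation_flux", None),
    "Precipitation_Flux": ("precipitation_flux", None),
    "Wind_Speed": ("10m_wind_speed", "24_hour_mean"),
    "Vapour_Pressure": ("vapour_pressure", "24_hour_mean"),
}


def unsupported(requested):
    """Return the requested names that do not appear in the table."""
    return [name for name in requested if name not in AGERA5_VARIABLES]


# "Cloud_Cover" is just an example of a name that is not in the table.
print(unsupported(["Mean_Temperature", "Cloud_Cover"]))  # ['Cloud_Cover']
```

Checking a request against such a lookup before contacting the server catches misspelled variable names early.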

***Example:***
To download data from 2000 to 2020, set start_year to 2000 and end_year to 2020. For a single year, set both to the same year.

In [None]:
# Define the AgERA5's variables to be downloaded

AgERA5_parameters = [
    'Maximum_Temperature', 'Minimum_Temperature', 'Mean_Temperature',
    'Solar_Radiation_Flux', 'Precipitation_Flux', 'Wind_Speed', 'Vapour_Pressure'
]

# Define the ROI's bounding box, start year, and end year

ROI_bounding_box = [71, -31, 34.5, 40]  # [north, west, south, east]
start_year = 2000
end_year = 2001
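Because the bounding box is a plain list, it is easy to swap two coordinates by accident. The following sketch checks the `[north, west, south, east]` order and expands the inclusive year range. It assumes coordinates in decimal degrees and a box that does not cross the antimeridian. `check_download_inputs` is a hypothetical helper, not an `earthstat` function.

```python
def check_download_inputs(bounding_box, start_year, end_year):
    """Validate a [north, west, south, east] box and return the years covered.

    Hypothetical helper; assumes degrees and no antimeridian crossing.
    """
    north, west, south, east = bounding_box
    if not (-90 <= south < north <= 90):
        raise ValueError("expected -90 <= south < north <= 90")
    if not (-180 <= west < east <= 180):
        raise ValueError("expected -180 <= west < east <= 180")
    if start_year > end_year:
        raise ValueError("start_year must not be later than end_year")
    # Both ends are inclusive, so a single year yields a one-item list.
    return list(range(start_year, end_year + 1))


years = check_download_inputs([71, -31, 34.5, 40], 2000, 2001)
print(years)  # [2000, 2001]
```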

Next, initialize the AgERA5 downloader with the parameters defined above:

In [None]:
# Initialize the AgERA5 downloader

EU_AgERA5.init_AgERA5_downloader(
    AgERA5_parameters,
    ROI_bounding_box,
    start_year,
    end_year
)

**`xES.download_AgERA5`** options:

- `num_requests`: The number of download requests sent to the CDS API server until all data is downloaded.
- `extract`: Set to `True` to extract the downloaded AgERA5 zip files, or `False` to leave them zipped.

In [None]:
# Start downloading the AgERA5 data

EU_AgERA5.download_AgERA5(num_requests=5,  # stay within the 5-request limit noted below
                          extract=True)

> <span style="color:red;">**Note & Caution:**</span> 
- Don't send more than 5 requests to the server; exceeding this can lead the server to block your API key from downloading.
- If your ROI is very small, decrease the number of requests to two.
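One way to respect the limit above is to clamp the requested value before using it. The sketch below assumes that `num_requests` controls how many requests are in flight at once, which this guide does not state explicitly. `capped_requests` and `fake_download` are hypothetical stand-ins, and the file names are invented for illustration. No real request is sent.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_SAFE_REQUESTS = 5  # limit stated in the note above


def capped_requests(requested, limit=MAX_SAFE_REQUESTS):
    """Clamp a requested number of parallel requests to the range [1, limit]."""
    return max(1, min(requested, limit))


def fake_download(year):
    """Stand-in for a real download call; returns a made-up file name."""
    return f"AgERA5_{year}.zip"


years = [2000, 2001]
workers = capped_requests(6)  # 6 exceeds the limit, so this becomes 5

# The pool size bounds how many downloads run at the same time.
with ThreadPoolExecutor(max_workers=workers) as pool:
    files = list(pool.map(fake_download, years))

print(workers, files)  # 5 ['AgERA5_2000.zip', 'AgERA5_2001.zip']
```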

### Step 4: Aggregate Data

xEarthStat's aggregation process uses a GPU for parallel computation when one is available and uses the available CPU cores for multiprocessing. It automatically detects whether a GPU is present; if not, it shifts the computation to the CPU.

`xES.Aggregate_AgERA5` options:

- `dataset_type`: The type of dataset: `dekadal` for the aggregated dekadal dataset (dekads starting on days 1, 11, and 21 of each month), or `daily` for the daily dataset.

- `all_touched`: Defaults to `False`, which considers only pixels within the geometry. Set to `True` to consider all pixels touched by the geometry.

- `stat`: The aggregation statistic: `"mean"` (default), `"median"`, `"min"`, `"max"`, or `"sum"`.

- `multi_processing`: Enables parallel processing.

- `max_workers`: Defaults to the total number of CPU cores. Set it to change the number of cores used in multiprocessing.
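Two of these options can be illustrated in plain Python. The sketch below shows the calendar convention behind dekads (three per month, starting on days 1, 11, and 21) and how a `stat` name could select a reducer over the pixel values chosen for one geometry. It is a minimal illustration, not xEarthStat's implementation. `dekad_start`, `reduce_pixels`, and `REDUCERS` are hypothetical names, and the pixel values are made up.

```python
from datetime import date
from statistics import mean, median

# The five statistics named in the `stat` option, as plain Python reducers.
REDUCERS = {"mean": mean, "median": median, "min": min, "max": max, "sum": sum}


def dekad_start(day):
    """Map a calendar date to the first day of its dekad (1, 11, or 21)."""
    if day.day <= 10:
        return day.replace(day=1)
    if day.day <= 20:
        return day.replace(day=11)
    return day.replace(day=21)  # the third dekad runs to the end of the month


def reduce_pixels(values, stat="mean"):
    """Collapse the pixel values selected for one geometry into one number."""
    return REDUCERS[stat](values)


print(dekad_start(date(2000, 1, 31)))                  # 2000-01-21
print(reduce_pixels([280.0, 282.0, 287.0], "median"))  # 282.0
```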

In [None]:
# Check the total number of CPU cores

import os

cpu_cores = os.cpu_count()

print(f'Number of CPU cores: {cpu_cores}')
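If you prefer not to occupy every core, you can derive a smaller value to pass as `max_workers`. The sketch below leaves one core free and also handles the case where `os.cpu_count()` returns `None`, which Python allows when the count cannot be determined. `pick_max_workers` is a hypothetical helper, not part of `earthstat`.

```python
import os


def pick_max_workers(reserve=1):
    """Return a worker count that leaves `reserve` cores free (at least 1)."""
    total = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, total - reserve)


print(f'Suggested max_workers: {pick_max_workers()}')
```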

In [None]:
# Start aggregating the downloaded AgERA5 data

EU_AgERA5.Aggregate_AgERA5(
    dataset_type="dekadal",
    all_touched=False,
    stat='mean',
    multi_processing=False,
    max_workers=None,  # None means using all CPU cores
)

### Step 5 (Optional): Merge Aggregated AgERA5 Variable CSVs

Optionally, merge the CSV files generated for each variable into a single CSV containing all aggregated variables:

`xES.AgERA5_merged_csv` options:

- `kelvin_to_celsius`: Set to `True` to convert temperatures from kelvin to degrees Celsius.

- `output_name`: Optional name for the merged CSV. Defaults to `AgERA5_{ROI_name}_merged_parameters_{workflow}_{timestamp}.csv`.

In [None]:
EU_AgERA5.AgERA5_merged_csv(kelvin_to_celsius=False, 
                            output_name=None)
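For reference, the kelvin-to-Celsius conversion is a fixed offset of 273.15. The sketch below shows that arithmetic on a few made-up values. The function and constant names are hypothetical, and how xEarthStat applies the conversion internally (for example, to which columns) is not covered here.

```python
KELVIN_OFFSET = 273.15  # 0 degrees Celsius expressed in kelvin


def to_celsius(kelvin_values):
    """Convert a list of temperatures from kelvin to degrees Celsius."""
    return [round(value - KELVIN_OFFSET, 2) for value in kelvin_values]


print(to_celsius([273.15, 300.0]))  # [0.0, 26.85]
```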