# Processing the Composite Drought Index using Python and STAC

This notebook introduces the **Composite Drought Index (CDI)** and shows how it can be estimated using Python and open-source satellite products available through the **SpatioTemporal Asset Catalog (STAC)** framework.

The CDI incorporates **three main components** that impact drought severity: 1) precipitation deficit, 2) excess temperature and 3) vegetation response. Each component is represented by its own drought index:

- Precipitation Drought Index (PDI)
- Temperature Drought Index (TDI)
- Vegetation Drought Index (VDI)

In this example, we will use [ERA5-Land](https://cds.climate.copernicus.eu/cdsapp#!/dataset/reanalysis-era5-land?tab=overview) from the Copernicus Climate Change Service to calculate the **PDI** and **TDI**, and the [MODIS 8-day Surface Reflectance](https://lpdaac.usgs.gov/products/mod09gqv061/) product to calculate the Normalized Difference Vegetation Index (NDVI), which serves as the response variable for the **VDI**.

The general form of each of the drought indices is:

![equation](images/DroughtIndex_equation.png)

where IP is the interest period, LTM is the long-term mean, and the right-hand term is the ratio between the run length of continuous deficit (in the case of precipitation and NDVI) or excess (in the case of temperature) and its long-term mean.

In this example, we use a monthly time step and the interest period is a 3-month moving window.
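
The two ingredients of the general form (the mean over the interest period and the run length) can be illustrated on a single pixel. The sketch below is not the `pyCDI` implementation: the function names, the synthetic rainfall values and the definition of run length as the longest streak of consecutive below-reference months are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd


def ip_mean(series: pd.Series, window: int = 3) -> pd.Series:
    """Mean over a trailing moving window (the interest period)."""
    return series.rolling(window=window, min_periods=window).mean()


def longest_deficit_run(values, reference) -> int:
    """Longest streak of consecutive values below their reference."""
    below = np.asarray(values) < np.asarray(reference)
    longest = current = 0
    for flag in below:
        current = current + 1 if flag else 0
        longest = max(longest, current)
    return longest


# Synthetic monthly rainfall (mm) for one pixel; all values are made up
months = pd.date_range("2021-01-01", periods=6, freq="MS")
rain = pd.Series([10.0, 20.0, 30.0, 5.0, 5.0, 50.0], index=months)
ltm = pd.Series([15.0, 15.0, 25.0, 25.0, 25.0, 25.0], index=months)

rain_ip = ip_mean(rain)                      # NaN until three months are available
run_length = longest_deficit_run(rain, ltm)  # consecutive months below the LTM
print(rain_ip.round(2).tolist(), run_length)
```

For temperature, the same run-length logic would count months above the reference instead of below it.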

We will produce a CDI time series over the Borena region in Southern Ethiopia, which is particularly vulnerable to droughts and their impacts. This notebook showcases the use of open-source satellite-based datasets available through STAC to produce an early warning system for drought events.

![study area](images/Borena_LULC_DEM.png)

The notebook is organized as follows:

## General structure of notebook

### 1. Precipitation Drought Index (PDI)

##### 1.1 Long-term-mean (LTM)
##### 1.2 Actual conditions over the interest period (IP)
##### 1.3 PDI calculation

### 2. Temperature Drought Index (TDI)

##### 2.1 Long-term-mean (LTM)
##### 2.2 Actual conditions over the interest period (IP)
##### 2.3 TDI calculation

### 3. Vegetation Drought Index (VDI)

##### 3.1 Computing monthly mean NDVI
##### 3.2 Long-term-mean (LTM)
##### 3.3 Actual conditions over the interest period (IP)
##### 3.4 VDI calculation

### 4. Composite Drought Index (CDI)
##### 4.1 Merging drought indices and calculating CDI
##### 4.2 Visualizing results with leafmap


## Acknowledgements

This work was done within the [EO Africa](https://www.eoafrica-rd.org/) [R&D research projects](https://www.eoafrica-rd.org/research/research-projects-2023-2024/) funded by the European Space Agency (ESA).

![EO AFRICA](images/EOAFRICA-logo.png) 





## 0. Import libraries
Before beginning, make sure to read the README and install all the dependencies needed to run this code.

In [1]:
from pathlib import Path
from cubo import cubo
from rasterio.crs import CRS
import gdal_utils as gu
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from pyCDI import cdi_functions as cdi
import xarray as xr
import ee
print('libraries imported correctly')

libraries imported correctly


### 1. PDI estimation

Monthly precipitation is obtained from the monthly [ERA5-Land dataset](https://radiantearth.github.io/stac-browser/#/external/storage.googleapis.com/earthengine-stac/catalog/ECMWF/ECMWF_ERA5_MONTHLY.json), accessed through the STAC catalog of the Google Earth Engine (GEE) repositories.

We will extract the data using [Cubo](https://github.com/ESDS-Leipzig/cubo), which facilitates the manipulation of geospatial datasets in STAC format. You will first need to authenticate your GEE account using ee.Authenticate().

In [2]:
ee.Authenticate() 

True

Once authenticated, initialize the high volume endpoint from Google Earth Engine:

In [3]:
ee.Initialize(opt_url='https://earthengine-highvolume.googleapis.com')

#### Acquire STAC data using cubo
As a first step, we extract the dataset from GEE and use Cubo to import it as an xarray object. We need to specify the start and end dates as well as the geographical location of the center of the area of interest. We take the opportunity to import both precipitation and air temperature in the same request.

In [4]:
# select start and end date
start_date = '2001-01-01'
end_date = '2022-12-31'

# select centroid of area of interest (Borena Region, Ethiopia)
lon = 38.18749189093533
lat = 4.428931591589981


# get monthly precipitation and air temperature from ERA5-Land STAC from GEE
da = cubo.create(
     lat=lat,
     lon=lon,
     collection="ECMWF/ERA5_LAND/MONTHLY_AGGR", # ID of the GEE collection
     bands=["total_precipitation_sum", "temperature_2m"], # Bands to retrieve
     start_date=start_date,
     end_date=end_date, # End date of the cube (remember in GEE this date is not included)
     edge_size=70,
     resolution=10000,
     gee=True # read the collection from Google Earth Engine instead of a STAC endpoint
)

da = da.assign_coords(epsg=da.attrs['epsg'])
da = da.rio.write_crs(f"EPSG:{da['epsg'].data}")

# get georeferencing metadata
epsg_code = da.attrs['epsg']
center_x = da.attrs['central_x']
center_y = da.attrs['central_y']
width = da.attrs['edge_size']
pixel_size = da.attrs['resolution']

# get geotransform and projection info to be used later to save outputs as rasters
gt = gu.calculate_geotransform(center_x, center_y, pixel_size, width, width)
proj = CRS.from_epsg(epsg_code).wkt
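
`gu.calculate_geotransform` is a helper from this repository and its implementation is not shown here. As a rough sketch, assuming a north-up raster and the GDAL six-element geotransform convention, the transform can be derived from the cube centre, the pixel size and the edge size as follows. The function name `geotransform_from_center` and the example coordinates are hypothetical.

```python
def geotransform_from_center(center_x, center_y, pixel_size, n_cols, n_rows):
    """Return a GDAL-style geotransform for a north-up raster.

    GDAL order: (x_min, pixel_width, row_rotation, y_max, column_rotation, -pixel_height)
    """
    x_min = center_x - (n_cols * pixel_size) / 2.0
    y_max = center_y + (n_rows * pixel_size) / 2.0
    return (x_min, pixel_size, 0.0, y_max, 0.0, -pixel_size)


# Hypothetical centre coordinates in a projected CRS (metres)
gt_example = geotransform_from_center(500000.0, 490000.0, 10000, 70, 70)
print(gt_example)
```

The negative pixel height reflects that row indices increase southwards while northing decreases.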


You can inspect the Cubo xarray object:

In [5]:
da

| | Array | Chunk |
|---|---|---|
| Bytes | 9.87 MiB | 918.75 kiB |
| Shape | (264, 2, 70, 70) | (48, 1, 70, 70) |
| Dask graph | 12 chunks in 6 graph layers | 12 chunks in 6 graph layers |
| Data type | float32 numpy.ndarray | float32 numpy.ndarray |


#### 1.1 calculation of Long-term-mean (LTM) of Precipitation

In [6]:
# select precipitation and slice the xarray DataArray
var = "total_precipitation_sum"
da_var = da.sel(band=var)
# get all time steps
time_steps = pd.to_datetime(da['time'].values)
years = np.array(time_steps.year)

# select outfolder to save LTM inputs
outfolder_ltm = Path() / 'CDI_data' / 'inputs_downloads' / 'Precip' / 'LTM'

Since calculating the LTM may take some time, you can import previously produced results directly by setting ltm_produced=True. If this is the first run, or you want to recalculate the LTM, set ltm_produced=False.

In [7]:
ltm_produced = False
if ltm_produced:
    # get mean data
    mean_ltm_folder = outfolder_ltm / 'mean'
    mean_img_list = sorted(list(mean_ltm_folder.glob('*.tif')))
    P_ltm_ds = gu.rasterlist2dict(mean_img_list)
    
    # get RL data
    rl_img_folder = outfolder_ltm / 'RL'
    rl_img_list = sorted(list(rl_img_folder.glob('*.tif')))
    P_rl_ltm_ds = gu.rasterlist2dict(rl_img_list)

else:
    P_ltm_ds, P_rl_ltm_ds = cdi.compute_ltm(da_var, time_steps, gt, proj, outfolder_ltm, unit_scaler=1000, deficit=True, gee=True)


Processing Long-Term-Mean (LTM) for Month 1
	saving result to CDI_data/inputs_downloads/Precip/LTM/mean/Mean_LTM_1.tif
	Processing deficit run length for year 2001




	Processing deficit run length for year 2002
	Processing deficit run length for year 2003
	Processing deficit run length for year 2004
	Processing deficit run length for year 2005
	Processing deficit run length for year 2006
	Processing deficit run length for year 2007
	Processing deficit run length for year 2008
	Processing deficit run length for year 2009
	Processing deficit run length for year 2010
	Processing deficit run length for year 2011
	Processing deficit run length for year 2012
	Processing deficit run length for year 2013
	Processing deficit run length for year 2014
	Processing deficit run length for year 2015
	Processing deficit run length for year 2016
	Processing deficit run length for year 2017
	Processing deficit run length for year 2018
	Processing deficit run length for year 2019
	Processing deficit run length for year 2020
	Processing deficit run length for year 2021
	Processing deficit run length for year 2022
	saving result to CDI_data/inputs_downloads/Precip/LTM/
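
Conceptually, the monthly long-term mean is a per-calendar-month average across all years. The sketch below shows this on a small synthetic cube with NumPy. It is not the code of `cdi.compute_ltm`, and it assumes that `unit_scaler=1000` acts as a simple multiplicative factor (here converting precipitation from metres to millimetres).

```python
import numpy as np
import pandas as pd

# Synthetic monthly cube (time, y, x): 3 years x 12 months on a 2 x 2 grid
time_index = pd.date_range("2001-01-01", "2003-12-01", freq="MS")
rng = np.random.default_rng(0)
cube_m = rng.uniform(0.0, 0.2, size=(len(time_index), 2, 2))  # metres

unit_scaler = 1000  # assumed to convert metres to millimetres
cube_mm = cube_m * unit_scaler

# Long-term mean per calendar month: average all Januaries, all Februaries, ...
ltm_by_month = {
    month: np.nanmean(cube_mm[time_index.month == month], axis=0)
    for month in range(1, 13)
}
print(ltm_by_month[1].shape)
```

Each entry of `ltm_by_month` is a 2D map, which mirrors the one-raster-per-month output written to the `mean` folder above.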

#### 1.2 calculation of Precipitation during IP
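Once the LTM data are produced, we can calculate the actual conditions for the different interest periods (IP). The mask in the next cell (years > 1999) keeps every year of the 2001-2022 series; to limit processing time, raise the threshold so that only the years of interest are processed.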

In [8]:
mask = years > 1999
# set output folder for IP
outfolder_ip = Path() / 'CDI_data' / 'inputs_downloads' / 'Precip' / 'IP'


Again, if the IP results were already produced, you can import them directly by setting ip_produced=True. If this is the first run, or you want to recalculate them, set ip_produced=False.

In [9]:
ip_produced = False

if ip_produced:
    # get mean IP
    mean_ip_folder = outfolder_ip / 'mean'
    mean_ip_list = sorted(list(mean_ip_folder.glob('*.tif')))
    P_ip_ds = gu.rasterlist2dict(mean_ip_list)

    # remove any dates outside of interest period
    dates_to_remove = [key for key in P_ip_ds if not (pd.to_datetime(start_date) <= key <= pd.to_datetime(end_date))]
    # Removing the keys
    for date in dates_to_remove:
        del P_ip_ds[date]
    
    # get RL data
    rl_ip_folder = outfolder_ip / 'RL'
    mean_rl_list = sorted(list(rl_ip_folder.glob('*.tif')))
    P_rl_ip_ds = gu.rasterlist2dict(mean_rl_list)

    # remove any dates outside of interest period
    dates_to_remove = [key for key in P_rl_ip_ds if not (pd.to_datetime(start_date) <= key <= pd.to_datetime(end_date))]
    # Removing the keys
    for date in dates_to_remove:
        del P_rl_ip_ds[date]
    

else:
    P_ip_ds, P_rl_ip_ds = cdi.compute_ip(da_var[mask, :, :], time_steps[mask], gt, proj, outfolder_ip, unit_scaler=1000, deficit=True, gee=True)


Processing IP periods for year 2001
	month 1
	saving result to CDI_data/inputs_downloads/Precip/IP/mean/Mean_IP_2001_1.tif
	saving result to CDI_data/inputs_downloads/Precip/IP/RL/RL_IP_2001_1.tif

	month 2
	saving result to CDI_data/inputs_downloads/Precip/IP/mean/Mean_IP_2001_2.tif
	saving result to CDI_data/inputs_downloads/Precip/IP/RL/RL_IP_2001_2.tif

	month 3
	saving result to CDI_data/inputs_downloads/Precip/IP/mean/Mean_IP_2001_3.tif
	saving result to CDI_data/inputs_downloads/Precip/IP/RL/RL_IP_2001_3.tif

	month 4
	saving result to CDI_data/inputs_downloads/Precip/IP/mean/Mean_IP_2001_4.tif
	saving result to CDI_data/inputs_downloads/Precip/IP/RL/RL_IP_2001_4.tif

	month 5
	saving result to CDI_data/inputs_downloads/Precip/IP/mean/Mean_IP_2001_5.tif
	saving result to CDI_data/inputs_downloads/Precip/IP/RL/RL_IP_2001_5.tif

	month 6
	saving result to CDI_data/inputs_downloads/Precip/IP/mean/Mean_IP_2001_6.tif
	saving result to CDI_data/inputs_downloads/Precip/IP/RL/RL_IP_2001
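
When saved rasters are reloaded, the cell above deletes dictionary entries whose date keys fall outside the study period. The same filtering can be written as a dictionary comprehension. This sketch assumes that `gu.rasterlist2dict` returns a dictionary keyed by timestamps, as the key comparisons in the cell suggest; `filter_by_date` is a hypothetical helper, not part of `gdal_utils`.

```python
import pandas as pd


def filter_by_date(rasters: dict, start: str, end: str) -> dict:
    """Keep only entries whose timestamp key lies within [start, end]."""
    t0, t1 = pd.to_datetime(start), pd.to_datetime(end)
    return {date: arr for date, arr in rasters.items() if t0 <= date <= t1}


# Placeholder strings stand in for raster arrays
rasters = {
    pd.Timestamp("2000-12-01"): "outside",
    pd.Timestamp("2001-01-01"): "inside",
    pd.Timestamp("2022-12-01"): "inside",
    pd.Timestamp("2023-01-01"): "outside",
}
kept = filter_by_date(rasters, "2001-01-01", "2022-12-31")
print(sorted(kept))
```

Building a new dictionary avoids mutating the original while iterating over it, and the helper can be reused for the temperature and NDVI dictionaries.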

#### 1.3 PDI calculation
Once we have both the LTM and IP values for precipitation, we can calculate the PDI.

In [10]:
## 1.3 calculate PDI
outfolder = Path() / 'CDI_data' / 'drought_indices'
pdi = cdi.calc_pdi(P_ip_ds, P_rl_ip_ds, P_ltm_ds, P_rl_ltm_ds, gt, proj, outfolder)

2001
	Month: 1
	saving PDI result for 2001-1 to CDI_data/drought_indices/PDI/PDI_IP_2001_1.tif

	Month: 2
	saving PDI result for 2001-2 to CDI_data/drought_indices/PDI/PDI_IP_2001_2.tif

	Month: 3
	saving PDI result for 2001-3 to CDI_data/drought_indices/PDI/PDI_IP_2001_3.tif

	Month: 4
	saving PDI result for 2001-4 to CDI_data/drought_indices/PDI/PDI_IP_2001_4.tif

	Month: 5
	saving PDI result for 2001-5 to CDI_data/drought_indices/PDI/PDI_IP_2001_5.tif

	Month: 6
	saving PDI result for 2001-6 to CDI_data/drought_indices/PDI/PDI_IP_2001_6.tif

	Month: 7
	saving PDI result for 2001-7 to CDI_data/drought_indices/PDI/PDI_IP_2001_7.tif

	Month: 8
	saving PDI result for 2001-8 to CDI_data/drought_indices/PDI/PDI_IP_2001_8.tif

	Month: 9
	saving PDI result for 2001-9 to CDI_data/drought_indices/PDI/PDI_IP_2001_9.tif

	Month: 10
	saving PDI result for 2001-10 to CDI_data/drought_indices/PDI/PDI_IP_2001_10.tif

	Month: 11
	saving PDI result for 2001-11 to CDI_data/drought_indices/PDI/PDI_IP_2

### 2. TDI estimation 
Now, we do a very similar procedure with the air temperature data

#### 2.1 calculation of Long-term-mean (LTM) of air temperature
We will use the same xarray dataset already opened for precipitation but slice it for air temperature


In [11]:
# select air temperature and slice the xarray
var = "temperature_2m"
da_var = da.sel(band=var)
# get all time steps
time_steps = pd.to_datetime(da['time'].values)
years = np.array(time_steps.year)

# select output folder for Ta
outfolder_ltm = Path() / 'CDI_data' / 'inputs_downloads' / 'Ta' / 'LTM'

Since calculating the LTM may take some time, you can import previously produced results directly by setting ltm_produced=True. If this is the first run, or you want to recalculate the LTM, set ltm_produced=False.

In [15]:
ltm_produced = True
if ltm_produced:
    # get mean
    mean_ltm_folder = outfolder_ltm / 'mean'
    mean_img_list = sorted(list(mean_ltm_folder.glob('*.tif')))
    Ta_ltm_ds = gu.rasterlist2dict(mean_img_list)
    
    # get RL data
    rl_img_folder = outfolder_ltm / 'RL'
    rl_img_list = sorted(list(rl_img_folder.glob('*.tif')))
    Ta_rl_ltm_ds = gu.rasterlist2dict(rl_img_list)

else:
    Ta_ltm_ds, Ta_rl_ltm_ds = cdi.compute_ltm(da_var, time_steps, gt, proj, outfolder_ltm, unit_scaler=1, deficit=False, gee=True)


#### 2.2 Air Temperature during IP
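Once the LTM data are produced, we can calculate the actual conditions for the different interest periods (IP). The mask in the next cell (years > 1999) keeps every year of the 2001-2022 series; to limit processing time, raise the threshold so that only the years of interest are processed.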

In [12]:
mask = years > 1999
outfolder_ip = Path() / 'CDI_data' / 'inputs_downloads' / 'Ta' / 'IP'

Again, if the IP results were already produced, you can import them directly by setting ip_produced=True. If this is the first run, or you want to recalculate them, set ip_produced=False.

In [13]:
ip_produced = False

if ip_produced:
    # get mean IP
    mean_ip_folder = outfolder_ip / 'mean'
    mean_ip_list = sorted(list(mean_ip_folder.glob('*.tif')))
    Ta_ip_ds = gu.rasterlist2dict(mean_ip_list)
    
    # remove any dates outside of interest period
    dates_to_remove = [key for key in Ta_ip_ds if not (pd.to_datetime(start_date) <= key <= pd.to_datetime(end_date))]
    # Removing the keys
    for date in dates_to_remove:
        del Ta_ip_ds[date]

    # get RL data
    rl_ip_folder = outfolder_ip / 'RL'
    mean_rl_list = sorted(list(rl_ip_folder.glob('*.tif')))
    Ta_rl_ip_ds = gu.rasterlist2dict(mean_rl_list)

    # remove any dates outside of interest period
    dates_to_remove = [key for key in Ta_rl_ip_ds if not (pd.to_datetime(start_date) <= key <= pd.to_datetime(end_date))]
    # Removing the keys
    for date in dates_to_remove:
        del Ta_rl_ip_ds[date]

else:
    Ta_ip_ds, Ta_rl_ip_ds = cdi.compute_ip(da_var[mask, :, :], time_steps[mask], gt, proj, outfolder_ip, unit_scaler=1, deficit=False, gee=True)


Processing IP periods for year 2001
	month 1
	saving result to CDI_data/inputs_downloads/Ta/IP/mean/Mean_IP_2001_1.tif
	saving result to CDI_data/inputs_downloads/Ta/IP/RL/RL_IP_2001_1.tif

	month 2
	saving result to CDI_data/inputs_downloads/Ta/IP/mean/Mean_IP_2001_2.tif
	saving result to CDI_data/inputs_downloads/Ta/IP/RL/RL_IP_2001_2.tif

	month 3
	saving result to CDI_data/inputs_downloads/Ta/IP/mean/Mean_IP_2001_3.tif
	saving result to CDI_data/inputs_downloads/Ta/IP/RL/RL_IP_2001_3.tif

	month 4
	saving result to CDI_data/inputs_downloads/Ta/IP/mean/Mean_IP_2001_4.tif
	saving result to CDI_data/inputs_downloads/Ta/IP/RL/RL_IP_2001_4.tif

	month 5
	saving result to CDI_data/inputs_downloads/Ta/IP/mean/Mean_IP_2001_5.tif
	saving result to CDI_data/inputs_downloads/Ta/IP/RL/RL_IP_2001_5.tif

	month 6
	saving result to CDI_data/inputs_downloads/Ta/IP/mean/Mean_IP_2001_6.tif
	saving result to CDI_data/inputs_downloads/Ta/IP/RL/RL_IP_2001_6.tif

	month 7
	saving result to CDI_data/inpu

#### 2.3 Calculate TDI
Once we have both the LTM and IP values for air temperature, we can calculate the TDI.

In [16]:
# get maximum air temperature during whole period to normalize air temperature when computing TDI
Ta_ar = da_var.values
Ta_max_ar = np.nanmax(Ta_ar)

outfolder = Path() / 'CDI_data' / 'drought_indices'
tdi = cdi.calc_tdi(Ta_ip_ds, Ta_rl_ip_ds, Ta_ltm_ds, Ta_rl_ltm_ds, Ta_max_ar, gt, proj, outfolder)

2001
	Month: 1
	saving TDI result for 2001-1 to CDI_data/drought_indices/TDI/TDI_IP_2001_1.tif

	Month: 2
	saving TDI result for 2001-2 to CDI_data/drought_indices/TDI/TDI_IP_2001_2.tif

	Month: 3
	saving TDI result for 2001-3 to CDI_data/drought_indices/TDI/TDI_IP_2001_3.tif

	Month: 4
	saving TDI result for 2001-4 to CDI_data/drought_indices/TDI/TDI_IP_2001_4.tif

	Month: 5
	saving TDI result for 2001-5 to CDI_data/drought_indices/TDI/TDI_IP_2001_5.tif

	Month: 6
	saving TDI result for 2001-6 to CDI_data/drought_indices/TDI/TDI_IP_2001_6.tif

	Month: 7
	saving TDI result for 2001-7 to CDI_data/drought_indices/TDI/TDI_IP_2001_7.tif

	Month: 8
	saving TDI result for 2001-8 to CDI_data/drought_indices/TDI/TDI_IP_2001_8.tif

	Month: 9
	saving TDI result for 2001-9 to CDI_data/drought_indices/TDI/TDI_IP_2001_9.tif

	Month: 10
	saving TDI result for 2001-10 to CDI_data/drought_indices/TDI/TDI_IP_2001_10.tif

	Month: 11
	saving TDI result for 2001-11 to CDI_data/drought_indices/TDI/TDI_IP_2

### 3. VDI estimation

For the VDI, we will use NDVI as a proxy for vegetation vigor. We will process monthly values from the MODIS 8-day surface reflectance product, accessed through the STAC API of the [Microsoft Planetary Computer](https://planetarycomputer.microsoft.com/dataset/modis-09Q1-061).

#### 3.1 Deriving monthly mean NDVI

As a first step we need to get monthly mean NDVI from these 8-day products

In [17]:
# select start and end date
start_date = '2001-01-01'
end_date = '2022-12-31'

# for the NDVI product we use the Microsoft Planetary Computer STAC
stac = 'https://planetarycomputer.microsoft.com/api/stac/v1'

# 8-day surface reflectance (250m)
collection = 'modis-09Q1-061'
da = cubo.create(lat, lon,
                 collection,
                 start_date,
                 end_date,
                 bands=['sur_refl_b01', 'sur_refl_b02', 'sur_refl_qc_250m'],
                 edge_size=1400, # 1400 roughly size of roi
                 resolution=250,
                 stac=stac,
                 )

da = da.assign_coords(epsg=da.attrs['epsg'])
da = da.rio.write_crs(f"EPSG:{da['epsg'].data}")
# get georeferencing metadata
epsg_code = da.attrs['epsg']
center_x = da.attrs['central_x']
center_y = da.attrs['central_y']
width = da.attrs['edge_size']
pixel_size = da.attrs['resolution']

gt = gu.calculate_geotransform(center_x, center_y, pixel_size, width, width)
proj = CRS.from_epsg(epsg_code).wkt
# use start datetime as time (some issues with original time domain)
da['time'] = da['start_datetime']



With the reflectance cube loaded, we can now derive the monthly mean NDVI from the 8-day products. If these data were already produced, simply set ndvi_produced=True.

In [18]:
ndvi_produced = True

outfolder_ndvi = Path() / 'CDI_data' / 'inputs_downloads' / 'ndvi'
if ndvi_produced:
    mean_folder = outfolder_ndvi / 'monthly_rasters'
    mean_list = sorted(list(mean_folder.glob('*.tif')))
    ndvi_monthly = gu.rasterlist2xarray(mean_list)
else:
    ndvi_monthly = cdi.compute_ndvi_monthly(da, start_date, end_date, gt, proj, outfolder_ndvi)

# convert xarray to DataArray to match cubo array type
# Define dimensions and coordinates
dims = ('time', 'x', 'y')  # Example dimensions
coords = {'time': ndvi_monthly['time'].values, 'x': ndvi_monthly['x'].values, 'y': ndvi_monthly['y'].values}
# store in xarray DataArray
da_ndvi = xr.DataArray(data=ndvi_monthly.to_array(), dims=dims, coords=coords)
date_mask = np.logical_and(da_ndvi['time'] >= pd.to_datetime(start_date), da_ndvi['time'] <= pd.to_datetime(end_date))
da_ndvi = da_ndvi[date_mask, :, :]
# sort data array by time
da_ndvi = da_ndvi.sortby('time')

time_steps = pd.to_datetime(da_ndvi['time'])
years = time_steps.year
months = time_steps.month
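
NDVI is computed from the red and near-infrared reflectances as (NIR - red) / (NIR + red); for this MODIS product, `sur_refl_b01` is the red band and `sur_refl_b02` the near-infrared band. The sketch below illustrates the index and a monthly average on a single synthetic pixel. It is not the code of `cdi.compute_ndvi_monthly`, it applies no quality-flag filtering, and all reflectance values are made up.

```python
import numpy as np
import pandas as pd


def ndvi(red, nir):
    """Normalized Difference Vegetation Index; NaN where the denominator is zero."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out


# Synthetic 8-day reflectance for one pixel (values are made up)
dates = pd.date_range("2021-01-01", periods=8, freq="8D")
red = np.array([0.10, 0.12, 0.10, 0.08, 0.10, 0.10, 0.00, 0.05])
nir = np.array([0.30, 0.36, 0.30, 0.24, 0.40, 0.40, 0.00, 0.15])

ndvi_8day = pd.Series(ndvi(red, nir), index=dates)
# Average all 8-day composites that start in the same calendar month
ndvi_month = ndvi_8day.groupby(ndvi_8day.index.to_period("M")).mean()
print(ndvi_month)
```

Because NDVI is a ratio, any constant reflectance scale factor cancels out, but fill values and cloud-affected pixels would still need to be masked before averaging.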

#### 3.2 calculation of Long-term-mean (LTM) of NDVI
We will use the monthly mean NDVI processed above to calculate the LTM. It was stored in an xarray DataArray to stay consistent with the processing of precipitation and air temperature.

In [19]:
# set output folder
outfolder_ndvi_ltm = Path() / 'CDI_data' / 'inputs_downloads' / 'ndvi'/'LTM'

ltm_produced = False
if ltm_produced:
    # get mean data

    mean_ltm_folder = outfolder_ndvi_ltm / 'mean'
    mean_img_list = sorted(list(mean_ltm_folder.glob('*.tif')))
    ndvi_ltm_ds = gu.rasterlist2dict(mean_img_list)
    
    # get RL data
    rl_img_folder = outfolder_ndvi_ltm / 'RL'
    rl_img_list = sorted(list(rl_img_folder.glob('*.tif')))
    ndvi_rl_ltm_ds = gu.rasterlist2dict(rl_img_list)

else:
    ndvi_ltm_ds, ndvi_rl_ltm_ds = cdi.compute_ltm(da_ndvi, time_steps, gt, proj, outfolder_ndvi_ltm, unit_scaler=1, deficit=True)


Processing Long-Term-Mean (LTM) for Month 1
	saving result to CDI_data/inputs_downloads/ndvi/LTM/mean/Mean_LTM_1.tif
	Processing deficit run length for year 2001
	Processing deficit run length for year 2002
	Processing deficit run length for year 2003
	Processing deficit run length for year 2004
	Processing deficit run length for year 2005
	Processing deficit run length for year 2006
	Processing deficit run length for year 2007
	Processing deficit run length for year 2008
	Processing deficit run length for year 2009
	Processing deficit run length for year 2010
	Processing deficit run length for year 2011
	Processing deficit run length for year 2012
	Processing deficit run length for year 2013
	Processing deficit run length for year 2014
	Processing deficit run length for year 2015
	Processing deficit run length for year 2016
	Processing deficit run length for year 2017
	Processing deficit run length for year 2018
	Processing deficit run length for year 2019
	Processing deficit run leng

#### 3.3 NDVI during IP
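Once the LTM data are produced, we can calculate the actual conditions for the different interest periods (IP). The mask in the cell below (years > 1999) keeps every year of the 2001-2022 series; to limit processing time, raise the threshold so that only the years of interest are processed.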

In [20]:
# select output folder
outfolder_ndvi_ip = Path() / 'CDI_data' / 'inputs_downloads' / 'ndvi' / 'IP'

ip_produced = False
# select years to produce
mask = years > 1999

if ip_produced:
    # get mean IP
    mean_ip_folder = outfolder_ndvi_ip / 'mean'
    mean_ip_list = sorted(list(mean_ip_folder.glob('*.tif')))
    ndvi_ip_ds = gu.rasterlist2dict(mean_ip_list)
    
    # remove any dates outside of interest period
    dates_to_remove = [key for key in ndvi_ip_ds if not (pd.to_datetime(start_date) <= key <= pd.to_datetime(end_date))]
    # Removing the keys
    for date in dates_to_remove:
        del ndvi_ip_ds[date]

    
    # get RL data
    rl_ip_folder = outfolder_ndvi_ip / 'RL'
    mean_rl_list = sorted(list(rl_ip_folder.glob('*.tif')))
    ndvi_rl_ip_ds = gu.rasterlist2dict(mean_rl_list)

    # remove any dates outside of interest period
    dates_to_remove = [key for key in ndvi_rl_ip_ds if not (pd.to_datetime(start_date) <= key <= pd.to_datetime(end_date))]
    # Removing the keys
    for date in dates_to_remove:
        del ndvi_rl_ip_ds[date]


else:
    ndvi_ip_ds, ndvi_rl_ip_ds = cdi.compute_ip(da_ndvi[mask, :, :], time_steps[mask], gt, proj, outfolder_ndvi_ip, deficit=True, unit_scaler=1)


Processing IP periods for year 2001
	month 1
	saving result to CDI_data/inputs_downloads/ndvi/IP/mean/Mean_IP_2001_1.tif
	saving result to CDI_data/inputs_downloads/ndvi/IP/RL/RL_IP_2001_1.tif

	month 2
	saving result to CDI_data/inputs_downloads/ndvi/IP/mean/Mean_IP_2001_2.tif
	saving result to CDI_data/inputs_downloads/ndvi/IP/RL/RL_IP_2001_2.tif

	month 3
	saving result to CDI_data/inputs_downloads/ndvi/IP/mean/Mean_IP_2001_3.tif
	saving result to CDI_data/inputs_downloads/ndvi/IP/RL/RL_IP_2001_3.tif

	month 4
	saving result to CDI_data/inputs_downloads/ndvi/IP/mean/Mean_IP_2001_4.tif
	saving result to CDI_data/inputs_downloads/ndvi/IP/RL/RL_IP_2001_4.tif

	month 5
	saving result to CDI_data/inputs_downloads/ndvi/IP/mean/Mean_IP_2001_5.tif
	saving result to CDI_data/inputs_downloads/ndvi/IP/RL/RL_IP_2001_5.tif

	month 6
	saving result to CDI_data/inputs_downloads/ndvi/IP/mean/Mean_IP_2001_6.tif
	saving result to CDI_data/inputs_downloads/ndvi/IP/RL/RL_IP_2001_6.tif

	month 7
	saving

#### 3.4 Calculate VDI

In [22]:
# observed minimum NDVI over the entire study period (computed for reference only)
ndvi_ar = da_ndvi.values
ndvi_min_ar = np.nanmin(ndvi_ar)
# fixed minimum NDVI value that is actually passed to calc_vdi
ndvi_min = 0.15
# select output folder
outfolder = Path() / 'CDI_data' / 'drought_indices'

# calculate VDI
vdi = cdi.calc_vdi(ndvi_ip_ds, ndvi_rl_ip_ds, ndvi_ltm_ds, ndvi_rl_ltm_ds, ndvi_min, gt, proj, outfolder)


2001
	Month: 1
	saving VDI result for 2001-1 to CDI_data/drought_indices/VDI/VDI_IP_2001_1.tif

	Month: 2
	saving VDI result for 2001-2 to CDI_data/drought_indices/VDI/VDI_IP_2001_2.tif

	Month: 3
	saving VDI result for 2001-3 to CDI_data/drought_indices/VDI/VDI_IP_2001_3.tif

	Month: 4
	saving VDI result for 2001-4 to CDI_data/drought_indices/VDI/VDI_IP_2001_4.tif

	Month: 5
	saving VDI result for 2001-5 to CDI_data/drought_indices/VDI/VDI_IP_2001_5.tif

	Month: 6
	saving VDI result for 2001-6 to CDI_data/drought_indices/VDI/VDI_IP_2001_6.tif

	Month: 7
	saving VDI result for 2001-7 to CDI_data/drought_indices/VDI/VDI_IP_2001_7.tif

	Month: 8
	saving VDI result for 2001-8 to CDI_data/drought_indices/VDI/VDI_IP_2001_8.tif

	Month: 9
	saving VDI result for 2001-9 to CDI_data/drought_indices/VDI/VDI_IP_2001_9.tif

	Month: 10
	saving VDI result for 2001-10 to CDI_data/drought_indices/VDI/VDI_IP_2001_10.tif

	Month: 11
	saving VDI result for 2001-11 to CDI_data/drought_indices/VDI/VDI_IP_2

### 4. CDI calculation
Now that we have produced the three components of the CDI (PDI, TDI and VDI), we can merge them and obtain the final CDI for each of the IPs.

First, we need to resample all of the datasets to a common pixel size and extent. We will use the NDVI files as the template for resampling. At the same step, we clip the datasets to the region of interest (ROI) over the Borena region.

In [23]:
vdi_folder = outfolder / 'VDI'
template_file = list(vdi_folder.glob('*.tif'))[0]

# file pathname to the Borena ROI outline (GeoJSON)
roi_file = Path() / 'CDI_data' / 'roi_info' / 'Borena_outline_utm.geojson'

pdi_ds, tdi_ds, vdi_ds = cdi.resample_indices(outfolder, roi_file, template_file, start_date, end_date, indices = ['PDI', 'TDI', 'VDI'])

resampling PDI images
2002-01-01 00:00:00
Saving as CDI_data/drought_indices/PDI/ROI/PDI_IP_2002_1_roi.tif
2005-12-01 00:00:00
Saving as CDI_data/drought_indices/PDI/ROI/PDI_IP_2005_12_roi.tif
2013-09-01 00:00:00
Saving as CDI_data/drought_indices/PDI/ROI/PDI_IP_2013_9_roi.tif
2006-05-01 00:00:00
Saving as CDI_data/drought_indices/PDI/ROI/PDI_IP_2006_5_roi.tif
2020-06-01 00:00:00
Saving as CDI_data/drought_indices/PDI/ROI/PDI_IP_2020_6_roi.tif
2017-12-01 00:00:00
Saving as CDI_data/drought_indices/PDI/ROI/PDI_IP_2017_12_roi.tif
2004-07-01 00:00:00
Saving as CDI_data/drought_indices/PDI/ROI/PDI_IP_2004_7_roi.tif
2022-04-01 00:00:00
Saving as CDI_data/drought_indices/PDI/ROI/PDI_IP_2022_4_roi.tif
2009-12-01 00:00:00
Saving as CDI_data/drought_indices/PDI/ROI/PDI_IP_2009_12_roi.tif
2019-02-01 00:00:00
Saving as CDI_data/drought_indices/PDI/ROI/PDI_IP_2019_2_roi.tif
2019-03-01 00:00:00
Saving as CDI_data/drought_indices/PDI/ROI/PDI_IP_2019_3_roi.tif
2022-05-01 00:00:00
Saving as CDI_data/d
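
The internals of `cdi.resample_indices` are not shown in this notebook. To give an idea of what bringing a coarse index onto the finer NDVI grid involves, the sketch below performs a nearest-neighbour upsampling with NumPy. It assumes the two grids are perfectly aligned and that the pixel sizes differ by an integer factor; a real workflow would rely on a warping tool that also handles reprojection and ROI clipping. The function name and values are hypothetical.

```python
import numpy as np


def upsample_nearest(coarse: np.ndarray, factor: int) -> np.ndarray:
    """Repeat every coarse pixel factor x factor times (nearest neighbour)."""
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)


coarse = np.array([[0.8, 1.1],
                   [0.6, 1.0]])      # made-up index values on a 10 km grid
factor = 10000 // 250                # 40 fine pixels per coarse pixel edge
fine = upsample_nearest(coarse, factor)
print(fine.shape)
```

Nearest-neighbour resampling keeps the original index values unchanged, at the cost of a blocky appearance at the 250 m scale.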

Now we can finally merge all drought indices to compute the CDI.

In [24]:
outfolder = Path() / 'CDI_data' / 'drought_indices'

# use template of the ROI image
template_folder = outfolder/'VDI'/'ROI'
template_file = list(template_folder.glob('*.tif'))[0]
# get geotransform and projection data from ROI image
proj, gt, _ , _ , _ , _ , _ = gu.raster_info(str(template_file))

# compute CDI
#cdi_ds = cdi.calc_cdi(pdi_ds, tdi_ds, vdi_ds, gt, proj, outfolder)
cdi_ds = cdi.calc_cdi_img(start_date, end_date, gt, proj, outfolder)


2001-01-01 00:00:00
	saving final CDI image for 2001-1 to CDI_data/drought_indices/CDI/CDI_IP_2001_1.tif

2001-02-01 00:00:00
	saving final CDI image for 2001-2 to CDI_data/drought_indices/CDI/CDI_IP_2001_2.tif

2001-03-01 00:00:00
	saving final CDI image for 2001-3 to CDI_data/drought_indices/CDI/CDI_IP_2001_3.tif

2001-04-01 00:00:00
	saving final CDI image for 2001-4 to CDI_data/drought_indices/CDI/CDI_IP_2001_4.tif

2001-05-01 00:00:00
	saving final CDI image for 2001-5 to CDI_data/drought_indices/CDI/CDI_IP_2001_5.tif

2001-06-01 00:00:00
	saving final CDI image for 2001-6 to CDI_data/drought_indices/CDI/CDI_IP_2001_6.tif

2001-07-01 00:00:00
	saving final CDI image for 2001-7 to CDI_data/drought_indices/CDI/CDI_IP_2001_7.tif

2001-08-01 00:00:00
	saving final CDI image for 2001-8 to CDI_data/drought_indices/CDI/CDI_IP_2001_8.tif

2001-09-01 00:00:00
	saving final CDI image for 2001-9 to CDI_data/drought_indices/CDI/CDI_IP_2001_9.tif

2001-10-01 00:00:00
	saving final CDI image fo
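
One common way to merge several indices into a composite is a weighted sum of the co-registered rasters. Whether `cdi.calc_cdi_img` does exactly this, and with which weights, is not shown in this notebook, so the function below and its default weights are illustrative placeholders only.

```python
import numpy as np


def composite_index(pdi, tdi, vdi, weights=(0.5, 0.25, 0.25)):
    """Weighted sum of three co-registered index arrays.

    The default weights are illustrative placeholders, not values read from pyCDI.
    """
    w_p, w_t, w_v = weights
    if not np.isclose(w_p + w_t + w_v, 1.0):
        raise ValueError("weights must sum to 1")
    return w_p * np.asarray(pdi) + w_t * np.asarray(tdi) + w_v * np.asarray(vdi)


pdi_px = np.array([[0.6, 1.2]])
tdi_px = np.array([[0.8, 1.0]])
vdi_px = np.array([[1.0, np.nan]])   # NaN propagates to the composite
cdi_px = composite_index(pdi_px, tdi_px, vdi_px)
print(cdi_px)
```

Letting NaN propagate means a pixel only receives a composite value when all three components are available.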

#### 4.2 Visualizing results
Here we visualize some of the results using Leafmap.

CDI values higher than 1 indicate no drought warning, while any CDI value below 1 should be considered a possible drought warning:

![CDI values](images/cdi_value_range.png)
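
The threshold described above can also be applied numerically, for example to estimate the share of the region under a possible drought warning. The sketch uses a small made-up array in place of a CDI raster.

```python
import numpy as np

cdi_map = np.array([[0.45, 0.90, 1.05],
                    [1.20, np.nan, 0.99]])   # made-up CDI values

valid = ~np.isnan(cdi_map)
warning = valid & (cdi_map < 1)              # CDI below 1: possible drought warning
warning_fraction = warning.sum() / valid.sum()
print(f"{warning_fraction:.0%} of valid pixels under possible drought warning")
```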


In [24]:
from leafmap import leafmap
import rasterio

# output CDI folder
cdi_folder =  outfolder/'CDI'

# choose image to show
# choose year and month
year = 2021
month = 3 

img = cdi_folder / f'CDI_IP_{year}_{month}.tif'
m = leafmap.Map(center=(lat, lon), zoom=8)

# add CDI raster for March
fid = rasterio.open(str(img))
m.add_raster(fid, colormap="Spectral", layer_name=f"{year} month {month}", vmin=0.4, vmax=1)

# add CDI raster for September
month = 9
img = cdi_folder / f'CDI_IP_{year}_{month}.tif'
fid2 = rasterio.open(str(img))
m.add_raster(fid2, colormap="Spectral", layer_name=f"{year} month {month}", vmin=0.4, vmax=1)

params = {
    "width": 4.0,
    "height": 0.3,
    "vmin": 0.4,
    "vmax": 1,
    "cmap": "Spectral",
    "label": "CDI ()",
    "orientation": "horizontal",
}
m.add_colormap(position='bottomright', **params)
m

ModuleNotFoundError: No module named 'leafmap'