# M2.3 - Climate and Drought Indices

**Contents:**

- [Studying drought with climate data](#Studying-drought-with-climate-data)
  - [Quantifying drought](#Quantifying-drought)
- [Organizing our project files](#Organizing-our-project-files)
- [Reading a climate time series](#Reading-a-climate-time-series)
- [A simple bucket model](#A-simple-bucket-model)

## Studying drought with climate data

We may think we have an intuitive understanding of drought, but drought takes many forms (Wilhite and Glantz 1985).

- **Meteorological drought** is a period in which precipitation (rainfall) is less than some expected amount. The expected amount varies from place to place and also depends on the time of year (Palmer 1965).

- **Agricultural drought** occurs when plant water demand, particularly crop water demand, exceeds water supply, whatever the source.

- **Hydrologic drought** describes the effects of dry conditions on surface or sub-surface hydrology; i.e., it can be used to describe low streamflow or low reservoir conditions. Because of the potential time lag between a moisture deficit and a change in hydrology, hydrologic drought is often out of phase with other kinds of drought.

- **Socio-economic drought** is defined in terms of the socio-economic effects of dry conditions: a change in crop prices or in animal feed or forage; or the loss of farm or fishery livelihoods.

To this list, we might add a kind of drought that has been recognized more recently as the technology for monitoring soil moisture has improved: a **soil-moisture drought,** or deficit of soil moisture in particular. We previously introduced **flash drought,** a kind of soil-moisture drought characterized by its quick onset and rapid decrease in soil moisture.

### Quantifying drought

**There are several approaches to quantifying the impacts of drought from climate data.** **Drought indices,** such as the Palmer Drought Severity Index (PDSI, Palmer 1965) and the Standardized Precipitation-Evapotranspiration Index (SPEI, Vicente-Serrano et al. 2010), are commonly used, as they provide a dimensionless measure of the severity of drought that is easy to interpret. **Percentiles or ranking** of hydrological conditions can also be used; for example, it is common to describe snowpack conditions (and "snowpack drought") in terms of the percentage of the median historical snowpack depth, on a given date.
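As a rough illustration of the percentile-or-ranking approach, the sketch below expresses a snowpack depth as a percentage of the historical median and as a percentile rank. All of the numbers are made up for illustration; they are not taken from any dataset used in this lesson.

```python
import numpy as np

# Hypothetical snowpack depths (cm) on the same calendar date in ten past years
historical_depth = np.array([110.0, 95.0, 130.0, 70.0, 150.0,
                             105.0, 88.0, 120.0, 99.0, 140.0])
# Hypothetical depth on that date in the current year
current_depth = 66.0

# Percentage of the median historical depth
median_depth = np.median(historical_depth)
percent_of_median = 100 * current_depth / median_depth

# Percentile rank: the share of past years with a depth at or below this year's
percentile_rank = 100 * np.mean(historical_depth <= current_depth)

print(f'{percent_of_median:.1f}% of median; percentile rank: {percentile_rank:.0f}')
```

A value well below 100% of the median, or a very low percentile rank, would indicate unusually poor snowpack for that date.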

A hydrological or water-balance approach can also be used, though it requires good data on the components of a **basin-scale water budget.** We'll talk more about water budgets later. For now, we can imagine a simple "bucket model:" water enters the environment as **precipitation** and leaves the environment as **evapotranspiration (ET)** (the sum of evaporation from wet surfaces and transpiration from plants). Mathematically, we might represent the bucket model as:
$$
\text{Available water} = \text{Precipitation} - \text{ET}
$$

We'll use **potential evapotranspiration (PET)** as our measure of ET, as it represents the amount of water that would be evaporated (and transpired) given the amount of energy (primarily heat and solar radiation) that is available to vaporize water. One way to define drought, consistent with the meteorological, agricultural, and hydrologic drought definitions, is as a period of time during which precipitation is much less than the amount of water that could be lost as ET.
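The bucket model can be sketched in a few lines of Python. The monthly precipitation and PET values below are hypothetical (in millimeters) and are only meant to show the arithmetic.

```python
import numpy as np

# Hypothetical monthly totals (mm), for illustration only
precip = np.array([80.0, 60.0, 45.0, 20.0, 5.0, 0.0])
pet = np.array([30.0, 40.0, 65.0, 90.0, 130.0, 160.0])

# Bucket model: available water = precipitation - ET (here, PET stands in for ET)
available_water = precip - pet

# Negative values mean that the atmosphere could remove more water than
# precipitation supplied in that month
deficit_months = available_water < 0

print(available_water)
print(deficit_months)
```

In this made-up series, the first two months have a water surplus and the remaining four have a growing deficit.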

---

## Organizing our project files

Today we'll be working with a dataset called [**Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS).**](https://www.chc.ucsb.edu/data/chirps) CHIRPS combines remote sensing data with global weather station datasets to produce a quasi-global (50°S to 50°N) gridded record of precipitation. CHIRPS is not produced by NASA, but it is one of the better global records of precipitation available.

CHIRPS data can be downloaded from a variety of sources, but there isn't an interface like EarthData Search's cloud access. Instead, individual data files can be [downloaded manually from a server](https://data.chc.ucsb.edu/products/CHIRPS-2.0/) and [this README](https://data.chc.ucsb.edu/products/CHIRPS-2.0/README-CHIRPS.txt) explains how to download the files.

#### &#x1F6A9; <span style="color:red">Pay Attention</span>

**Because we're still learning, instead of downloading all those files manually, we'll use a prepared dataset:** 

- [Click to download `CHIRPS-v2_Africa_monthly_2014-2024.nc`](http://files.ntsg.umt.edu/data/ScienceCore/CHIRPS-v2_Africa_monthly_2014-2024.nc)

This dataset was produced by merging together [the individual, monthly CHIRPS files for Africa](https://data.chc.ucsb.edu/products/CHIRPS-2.0/africa_monthly/tifs/) from January 2014 through May 2024. You can view [the script that was used to merge the files at this link.](https://github.com/OpenClimateScience/M2-Computational-Climate-Science/blob/main/scripts/20240611_process_CHIRPS_monthly_into_stack.py) This record of just over 10 years is shorter than we would typically like to use to infer climate variability, but we're trying to keep the dataset small.

#### &#x1F3AF; Best Practice

**We're starting a new analysis. Let's take a moment to organize our project's file system.** Take a look at the example file tree, below, and use it as inspiration to organize your file system.

![](assets/M2_file_tree_CHIRPS.png)

**As we start to gain more sophistication with writing scientific Python code, one of our goals should be to write code that also serves as documentation of our workflow.** A key challenge for open, reproducible science is linking scientific results to the computer code they were created from. 

One way to link scientific results to our Python script(s) is to use a **consistent naming scheme with a unique identifier that groups files together.** One approach to this is to use the current date, in `YYYYMMDD` (Year, Month, Day) format. This 8-digit number will always be unique, because every day is a new day. If you use today's 8-digit date in the filename of your Python script and any output file(s) it generates, you'll have a way of associating those files together.
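Here is a minimal sketch of this naming scheme, using only the Python standard library. The file names (`precip_analysis.py`, `mean_annual_precip.png`) are hypothetical examples.

```python
from datetime import date
from pathlib import Path

# Today's date as an 8-digit YYYYMMDD identifier
stamp = date.today().strftime('%Y%m%d')

# Hypothetical file names; the shared prefix ties the script to its outputs
script_name = f'{stamp}_precip_analysis.py'
figure_path = Path('results') / f'{stamp}_mean_annual_precip.png'

print(script_name)
print(figure_path)
```

If the script is re-run on a later date, it may be better to hard-code the identifier once, so that new outputs still match the script's file name.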

---

## Reading a climate time series

As before, we'll use `xr.open_mfdataset()` ("open multiple-file dataset") to open this dataset. Even though there is only one file, `xr.open_mfdataset()` allows us to access some useful features in the `dask` library.

For instance, note that the output associated with the `"precip"` DataArray includes information about the total size of the array (1.12 GiB, or gibibytes, which is about 1.2 gigabytes).

In [None]:
import xarray as xr
import numpy as np

ds = xr.open_mfdataset('data_raw/CHIRPS/CHIRPS-v2_Africa_monthly_2014-2024.nc')
ds['precip']

Those 1.12 GiB haven't been allocated in memory yet; rather, the variable `ds` points to a *representation* of the dataset that is stored on the hard disk. This is another example of **lazy evaluation,** which sounds bad but is actually a good thing: `xarray` won't read the data into memory until we're actually ready to perform some kind of computation. And because we used `xr.open_mfdataset()`, the data are backed by `dask` arrays, which can be split into chunks: if the entire dataset is larger than our computer's memory, `dask` can load smaller pieces of it, processing one or more pieces at a time.

In [None]:
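# Replace the fill value (-9999) with NaN, so that missing data are skipped
#    in sums and means instead of being treated as real precipitation values

ds['precip'] = xr.where(ds['precip'] == -9999, np.nan, ds['precip'])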
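To see why the fill value has to be replaced, consider the toy example below. It uses a made-up 2-by-2 grid rather than the CHIRPS data, and it uses the `DataArray.where()` method, which is an alternative to `xr.where()` that keeps values where the condition is true and sets everything else to NaN.

```python
import numpy as np
import xarray as xr

# Toy 2x2 precipitation grid (mm); -9999 marks a missing value
toy = xr.DataArray(
    np.array([[10.0, -9999.0], [30.0, 20.0]]),
    dims = ('y', 'x'))

# Keep values where the condition is True; everything else becomes NaN
toy_masked = toy.where(toy != -9999)

# Without masking, the fill value drags the mean far below zero
print(float(toy.mean()))
# With masking, mean() skips the NaN and averages the 3 valid values
print(float(toy_masked.mean()))
```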

In [None]:
ds['precip'].sel(time = '2023-07-01').plot()

#### &#x1F3C1; Challenge: Calculate mean annual precipitation

Our dataset has monthly precipitation for a little more than ten years (2014 through early 2024). Make a plot of mean annual precipitation. **You should do this in two steps:**

- Use the `resample()` method, followed by `sum()` to calculate the *total precipitation* in each year.
- Then, calculate the mean annual precipitation.

In [None]:
# The record ends partway through 2024, so keep only the complete years;
#    otherwise, the partial total for 2024 would bias the mean
precip_complete = ds['precip'].sel(time = slice('2014', '2023'))
# First, add up monthly precipitation (over 12 months) in each year
annual_precip = precip_complete.resample(time = 'YS').sum()
# Then, calculate the average (mean) annual precipitation
mean_annual_precip = annual_precip.mean('time')
mean_annual_precip.plot()
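Annual totals are only meaningful for years with all 12 months of data, and a later cell notes that the CHIRPS record extends through May 2024. The sketch below shows one possible way to detect and drop incomplete years. It uses a synthetic series, not the CHIRPS data, and counting months per year is just one approach.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly series: 2 full years plus 5 months of a third year,
# with 50 mm of precipitation in every month
time = pd.date_range('2022-01-01', periods = 29, freq = 'MS')
monthly = xr.DataArray(np.full(29, 50.0), coords = {'time': time}, dims = 'time')

# Annual totals, and the number of months that contributed to each total
annual_total = monthly.resample(time = 'YS').sum()
months_per_year = monthly.resample(time = 'YS').count()

# Keep only the years with all 12 months before averaging
complete = annual_total.where(months_per_year == 12, drop = True)

print(float(annual_total.mean('time')))  # Biased low by the partial year
print(float(complete.mean('time')))      # Mean of the complete years only
```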

In [None]:
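# Add up monthly precipitation within each quarter ("season") of each year,
#    giving 4 seasonal totals per year; the last quarter of the record
#    (starting April 2024) is incomplete, so we keep only 2014 through 2023

seasonal_precip = ds['precip'].sel(time = slice('2014', '2023')).resample(time = 'QS').sum()
seasonal_precip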

In [None]:
# Each quarter is labeled by the first day of its starting month
#    (January, April, July, or October)

seasonal_precip.coords['time']

In [None]:
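# Split-apply-combine: groupby() splits the seasonal totals by starting month,
#    mean() is applied to each group, and the results are combined into one array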

mean_seasonal_precip = seasonal_precip.groupby('time.month').mean()
mean_seasonal_precip

In [None]:
mean_seasonal_precip.sel(month = 1).plot()
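The split-apply-combine workflow may be easier to follow on a small example. The sketch below uses made-up quarterly totals for two years instead of the CHIRPS data.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic quarterly totals (mm) for 2 years; the second year is twice as wet
time = pd.date_range('2022-01-01', periods = 8, freq = 'QS')
quarterly = xr.DataArray(
    np.array([10.0, 20.0, 30.0, 40.0, 20.0, 40.0, 60.0, 80.0]),
    coords = {'time': time}, dims = 'time')

# Split by each quarter's starting month, apply mean() to each group,
# then combine the results along a new "month" dimension
mean_by_quarter = quarterly.groupby('time.month').mean()

print(mean_by_quarter['month'].values)
print(mean_by_quarter.values)
```

The result has one value per quarter, indexed by the starting month, which is why `sel(month = 1)` selects the January-to-March quarter.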

In [None]:
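# NOTE: The y (latitude) coordinate is stored from north to south, so the
#    slice() bounds must also go from the larger to the smaller latitude
# This box covers Tiaret, Algeria, the site of a recent drought:
# https://apnews.com/article/algeria-drought-rain-tebboune-tiaret-riots-09ce23f4ba235aaf1e3afecc7bfe3574

ds_tiaret = ds.sel(x = slice(0.8, 1.8), y = slice(36.1, 35.1))

# Total precipitation in each complete year (the record ends partway through
#    2024, so a total for that year would be misleadingly low)
tiaret_precip = ds_tiaret['precip'].sel(time = slice('2014', '2023')).resample(time = 'YS').sum()
tiaret_precip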
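The order of the `slice()` bounds matters because a label-based slice follows the order of the coordinate. The toy example below uses a made-up latitude coordinate to show what happens when the bounds are reversed.

```python
import numpy as np
import xarray as xr

# Toy latitude coordinate in descending order (north to south)
lat = np.array([37.0, 36.0, 35.0, 34.0])
toy = xr.DataArray(np.arange(4.0), coords = {'y': lat}, dims = 'y')

# Bounds given in the same order as the coordinate: selects 36 and 35 degrees
north_to_south = toy.sel(y = slice(36.1, 34.9))

# Bounds given in the opposite order: no error, but nothing is selected
south_to_north = toy.sel(y = slice(34.9, 36.1))

print(north_to_south['y'].values)
print(south_to_north.sizes['y'])
```

Because the reversed slice returns an empty result instead of raising an error, it is worth checking the size of a subset before computing with it.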

In [None]:
tiaret_anomaly = tiaret_precip - tiaret_precip.mean('time')
tiaret_anomaly.mean(['x', 'y']).plot()

In [None]:
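# Subtract each calendar month's long-term mean (over time, at each pixel)
#    to obtain monthly precipitation anomalies

precip_anomaly = ds_tiaret['precip'].groupby('time.month').map(lambda group: group - group.mean('time'))
precip_anomaly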
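Monthly anomalies can also be written as a subtraction between the grouped data and a monthly climatology, which some readers may find easier to follow. The sketch below uses a synthetic two-year series; the seasonal cycle values are invented.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly series (mm): the same seasonal cycle in both years,
# except that the second year is 20 mm wetter in every month
time = pd.date_range('2022-01-01', periods = 24, freq = 'MS')
seasonal_cycle = np.array([100.0, 80.0, 60.0, 40.0, 20.0, 5.0,
                           0.0, 5.0, 20.0, 40.0, 60.0, 80.0])
values = np.concatenate([seasonal_cycle, seasonal_cycle + 20])
monthly = xr.DataArray(values, coords = {'time': time}, dims = 'time')

# Monthly climatology: the mean of each calendar month over all years
climatology = monthly.groupby('time.month').mean('time')

# Subtracting the climatology from the grouped data matches each value
# with the mean for its own calendar month
anomaly = monthly.groupby('time.month') - climatology

print(anomaly.sel(time = '2022').values)
print(anomaly.sel(time = '2023').values)
```

Removing the seasonal cycle leaves only the departure from normal: every month of the first year is 10 mm drier than its monthly mean, and every month of the second year is 10 mm wetter.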

In [None]:
precip_anomaly.mean(['x', 'y']).plot()

In [None]:
precip_anomaly.rolling(time = 6).mean().mean(['x', 'y']).plot()
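The rolling mean above smooths the anomaly series with a 6-month moving window. A short, made-up series shows how the window behaves, including the missing values at the start where the window is not yet full.

```python
import numpy as np
import xarray as xr

# Made-up series of 5 time steps
series = xr.DataArray(np.array([0.0, 3.0, 6.0, 9.0, 12.0]), dims = 'time')

# 3-step rolling mean: each value averages the current and 2 previous steps
smoothed = series.rolling(time = 3).mean()

# The first 2 values are NaN because fewer than 3 steps are available
print(smoothed.values)
```

With a 6-month window, the first 5 months of the smoothed anomaly series will likewise be missing.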

---

## A simple bucket model

Although there are [many different ways to calculate PET (Pimentel et al. 2023)](https://doi.org/10.1029/2022WR033447)...

- https://climatedataguide.ucar.edu/climate-data/terraclimate-global-high-resolution-gridded-temperature-precipitation-and-other-water
- https://climate.northwestknowledge.net/NWTOOLBOX/formattedDownloads.php

In [None]:
import pandas as pd

pet = pd.read_csv('data_raw/terraclimate_35.3709N_-1.3218W.csv', skiprows = 11)
pet = pet[pet['Year'] >= 2014]
pet

In [None]:
ds_tiaret['precip']

In [None]:
from matplotlib import pyplot

# NOTE: The CHIRPS data extend monthly through May 2024, but the PET data
#    do not, so we have to subset the CHIRPS data to the first 120 months
precip_tiaret = ds_tiaret['precip'].isel(time = slice(0, 120)).mean(['x', 'y'])
# Divide by the PET values as a plain array, so that the two series are matched
#    by position (first month with first month, and so on), not by index label
precip_pet_ratio = precip_tiaret / pet['pet(mm)'].to_numpy()
precip_pet_ratio.plot()
pyplot.ylabel('Precipitation-to-PET Ratio')
pyplot.title('Precipitation-to-PET Ratio for Tiaret, Algeria')
pyplot.savefig('results/20240610_Tiaret_precip-to-PET_ratio.png', dpi = 172)

---

### References

Ault, T. R. 2020. On the essentials of drought in a changing climate. *Science* 368 (6488):256–260.

Palmer, W. C. 1965. *Meteorological drought.* Vol. 30. US Department of Commerce, Weather Bureau.

Vicente-Serrano, S. M., S. Beguería, and J. I. López-Moreno. 2010. A multiscalar drought index sensitive to global warming: The Standardized Precipitation Evapotranspiration Index. *Journal of Climate* 23 (7):1696–1718.

Wilhite, D. A., and M. H. Glantz. 1985. Understanding the drought phenomenon: The role of definitions. *Water International* 10 (3):111–120.