# Precipitation exercises
***

## <font color=steelblue>Exercise 5 - Intensity-duration-frequency curves</font>

<font color=steelblue>Build an IDF (intensity-duration-frequency) curve from the data in _hourly_precipitation_Oviedo.csv_.</font>

In [None]:
import numpy as np

import pandas as pd
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

from matplotlib import pyplot as plt
import seaborn as sns
sns.set()

**Intensity-Duration-Frequency (IDF) curves** are a common approach for defining a design storm in a hydrologic project. IDF curves relate rainfall intensity, storm duration and frequency (expressed as return period).
 
<img src="img/IDF curves.JPG" alt="IDF curves" style="width:500px">

> <font color=gray>Intensity-duration-frequency curve for Oklahoma City (Applied Hydrology. Chow, 1988).</font>

When designing a structure, the objective is to know the precipitation intensity for a given return period and duration. The return period is known beforehand, since it is usually defined by laws or standards. For the duration, we must find the worst-case scenario, which is usually the time of concentration of the structure's basin.

**Empirical IDF curves**. To build an IDF curve from local data, we must carry out a frequency analysis. As input, we need an annual series of maximum precipitation intensity for each of several storm durations. We fit the series for each storm duration to an extreme value distribution in order to estimate the precipitation intensity for a given return period. The resulting set of points is the empirical IDF curve.

**Analytical IDF curves** are equations that define continuous curves from which we can extract the intensity for any duration and return period. They overcome the limitation of empirical curves, which only provide values at the durations and return periods that were analysed. The parameters of the equations must be fitted to the observations, i.e., the points of the empirical IDF curves.

Steps to solve the exercise:
1. Import the data: hourly precipitation series.
2. Generate __series of annual maximum__ precipitation for several storm durations.
3. Fit a __GEV distribution__ to the series of annual maxima.
4. Estimate the points of the __empirical IDF__.
5. Fit the __analytical IDF__.

### Import data

The input data is the hourly rainfall series recorded at the meteorological station that AEMET (the Spanish Meteorological Agency) manages in Oviedo.

In [None]:
# load precipitation data


In [None]:
# visualize the hourly series


### Generate series of annual maxima
To perform a frequency analysis, we need a series with the maximum rainfall intensity in each year of the original data. Since our final goal is to derive an intensity-duration-frequency curve, we must repeat this process for several storm durations.
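As a rough illustration of this step, the sketch below builds the annual maxima of 2-h mean intensity from a synthetic hourly series. The gamma-distributed rainfall and the names `pcp`, `int_2h` and `annual_max_2h` are assumptions made for the example; they are not the Oviedo data or the expected solution.

```python
import numpy as np
import pandas as pd

# synthetic hourly rainfall depth in mm (illustrative only, NOT the Oviedo record)
rng = np.random.default_rng(0)
index = pd.date_range('2001-01-01', '2004-12-31 23:00', freq=pd.Timedelta(hours=1))
pcp = pd.Series(rng.gamma(0.1, 5.0, size=len(index)), index=index)

# mean intensity (mm/h) over a moving window of 2 hours
duration = 2
int_2h = pcp.rolling(window=duration).sum() / duration

# annual maximum of the 2-h intensity: one value per calendar year
annual_max_2h = int_2h.groupby(int_2h.index.year).max()
print(annual_max_2h)
```

Dividing the rolling sum by the window length converts accumulated depth (mm) into mean intensity (mm/h), which is the variable the IDF curve needs.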

#### Example for 2h storm duration

In [None]:
# series of 2h rainfall intensity


In [None]:
# series of annual maximum intensity for 2-h precipitation


In [None]:
# visualize the data


#### Loop for storm durations from 1 to 24 h
We can repeat all the previous steps for several durations in a loop, and save the series in a single data frame.
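One possible way to organise that loop is sketched below, again on synthetic data. The list `durations` and the data frame name `annual_max` are assumptions for the example, not values prescribed by the exercise.

```python
import numpy as np
import pandas as pd

# synthetic hourly rainfall depth in mm covering three non-leap years (illustrative only)
rng = np.random.default_rng(1)
index = pd.date_range('2001-01-01', periods=3 * 365 * 24, freq=pd.Timedelta(hours=1))
pcp = pd.Series(rng.gamma(0.1, 5.0, size=len(index)), index=index)

# storm durations to analyse, in hours (hypothetical selection)
durations = [1, 2, 4, 8, 12, 24]

# one column per duration, one row per year
years = pcp.index.year
annual_max = pd.DataFrame({
    D: (pcp.rolling(window=D).sum() / D).groupby(years).max()
    for D in durations
})
annual_max.index.name = 'year'
print(annual_max.round(2))
```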

In [None]:
# durations to study


In [None]:
# series of annual maximum intensity for different storm durations


### Fit a GEV distribution to the series of annual maxima

We must fit an extreme value distribution to the series of annual maxima. From the fitted distribution, we will be able to estimate the intensity for any return period.

We will use the **GEV distribution (generalized extreme value)**. When applied to exclusively positive values such as precipitation, the GEV distribution is:

$$F(s,\xi)=e^{-(1+\xi s)^{-1/\xi}}  \quad \forall \xi>0$$
$$ s = \frac{x-\mu}{\sigma} \quad \sigma>0$$

Where $s$ is the study variable standardised by the location parameter $\mu$ and the scale parameter $\sigma$, and $\xi$ is the shape parameter. So the GEV distribution has three parameters to be fitted.

<img src="img/Frechet.png" alt="Frechet distribution" style="width:600px">

> <font color=grey>Density function and cumulative distribution function of the GEV type II (Frechet distribution) for several values of scale and shape.</font><br>

To fit the GEV distribution, we will use the function `genextreme.fit` in the package `scipy.stats`. The outputs of this function are the values of the three GEV parameters (shape, location and scale) that best fit the input data.
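A minimal sketch of such a fit is shown below, using a synthetic sample instead of the annual maxima. One detail to keep in mind: `scipy.stats.genextreme` defines its shape parameter `c` with the opposite sign to the $\xi$ used in the equation above, so $\xi = -c$. The sample and its generating parameters are made up for the example.

```python
import numpy as np
from scipy.stats import genextreme

# synthetic "annual maxima" drawn from a GEV with known, made-up parameters
sample = genextreme.rvs(c=-0.1, loc=10, scale=3, size=200, random_state=42)

# maximum likelihood fit; scipy returns (shape, location, scale)
shape, loc, scale = genextreme.fit(sample)

# scipy's shape has the opposite sign to xi in the notation of this notebook
xi = -shape
print(f'xi = {xi:.3f}, mu = {loc:.3f}, sigma = {scale:.3f}')

# fitted cumulative distribution function evaluated at the sorted observations
cdf = genextreme.cdf(np.sort(sample), shape, loc=loc, scale=scale)
```

The fitted CDF can then be compared against the empirical distribution (for instance with `ECDF`) to judge the quality of the fit.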

In [None]:
from scipy.stats import genextreme
from statsmodels.distributions.empirical_distribution import ECDF

#### Example: 2-hour storm

In [None]:
# visualize the data


In [None]:
# fit a GEV to the data


In [None]:
# fit the empirical distribution


In [None]:
# visualize the fit


#### Loop for all storm durations
We'll repeat the process in a loop for all the storm durations. We'll save the results in a data frame called _parameters_.
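The sketch below shows one way to fill such a data frame. The annual maxima are synthetic, and the row labels `shape`, `loc` and `scale` (in scipy's convention) are an assumption about the layout, not the required one.

```python
import pandas as pd
from scipy.stats import genextreme

# synthetic annual maxima for three durations (illustrative only)
durations = [1, 2, 4]
annual_max = pd.DataFrame({
    D: genextreme.rvs(c=-0.1, loc=10 / D**0.5, scale=3 / D**0.5,
                      size=40, random_state=D)
    for D in durations
})

# one column per duration; rows hold the fitted parameters as returned by scipy
parameters = pd.DataFrame(index=['shape', 'loc', 'scale'], columns=durations, dtype=float)
for D in durations:
    parameters[D] = list(genextreme.fit(annual_max[D]))
print(parameters.round(3))
```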

In [None]:
# fit parameters for each duration


In [None]:
# visualize the fit


### Empirical IDF

To calculate the empirical IDF we need to know the value of rainfall intensity for a given storm duration and return period. The storm duration defines which of the previously fitted GEV distributions to apply, whereas the return period defines the non-exceedance probability with which to enter the GEV distribution.

The **non-exceedance probability** (i.e., the value of the cumulative distribution function) and the **return period** are related by the equation: 

$$R = \frac{1}{1-CDF(x)}$$

Where $R$ is the return period and $CDF(x)$ the cumulative distribution function (or non-exceedance probability). From this expression, we can estimate the **non-exceedance probability** for a given **return period**:

$$CDF(x) = \frac{R-1}{R} = 1 - \frac{1}{R}$$
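As a quick numerical check of this relation, the sketch below converts a few return periods into non-exceedance probabilities and reads the corresponding quantiles from a GEV distribution with `genextreme.ppf` (the inverse of the CDF). The GEV parameters are hypothetical values chosen for the example, not results fitted to the Oviedo data.

```python
import numpy as np
from scipy.stats import genextreme

# return periods in years
R = np.array([2, 10, 30])

# non-exceedance probability: CDF(x) = 1 - 1 / R
Pne = 1 - 1 / R

# hypothetical GEV parameters in scipy's convention (shape, location, scale)
shape, loc, scale = -0.1, 10.0, 3.0

# intensity whose non-exceedance probability equals Pne
intensity = genextreme.ppf(Pne, shape, loc=loc, scale=scale)
for r, p, i in zip(R, Pne, intensity):
    print(f'R = {r:2d} years   CDF = {p:.3f}   I = {i:.2f} mm/h')
```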

#### Example: 2-hour storm and 10 year return period
As an example, we will estimate the rainfall intensity of a 2-h storm with a return period of 10 years. We'll use the `genextreme` distribution in the package `scipy.stats`.

In [None]:
# set duration and return period

# non-exceedance probability associated with the return period


In [None]:
# rainfall intensity for a 2-h storm with 10 year return period


In [None]:
# visualize the fit


#### Loop through all storm duration and all return periods
We can iterate the procedure across durations and return periods. Results will be saved in a *data frame*.

We will analyze return periods of 2, 10 and 30 years. Since our records span only 10 years, we should not estimate larger return periods; as a rule of thumb, return periods can be estimated up to 3 times the length of the original record.
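The double iteration could look like the sketch below. The `parameters` data frame holds hypothetical GEV parameters (scipy convention) rather than fitted ones, and the name `IDFe` for the table of empirical IDF points is an assumption.

```python
import numpy as np
import pandas as pd
from scipy.stats import genextreme

# return periods (years) and their non-exceedance probabilities
R = np.array([2, 10, 30])
Pne = 1 - 1 / R

# hypothetical GEV parameters for each duration (h); NOT fitted to real data
durations = [1, 2, 4, 8]
parameters = pd.DataFrame(
    {D: [-0.1, 10 / D**0.7, 3 / D**0.7] for D in durations},
    index=['shape', 'loc', 'scale'],
)

# empirical IDF: rows are return periods, columns are durations
IDFe = pd.DataFrame(index=R, columns=durations, dtype=float)
IDFe.index.name = 'R'
for D in durations:
    shape, loc, scale = parameters[D]
    # all return periods are evaluated at once for this duration
    IDFe[D] = genextreme.ppf(Pne, shape, loc=loc, scale=scale)
print(IDFe.round(2))
```

Because `ppf` accepts an array of probabilities, only the loop over durations is explicit; the return periods are handled in a single vectorised call.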

In [None]:
# return periods


In [None]:
# non-exceedance probability


We can calculate the 2-h rainfall intensity for all return periods at once.

In [None]:
# non-exceedance probability


In [None]:
# rainfall intensity


And iterate this step through a loop for all storm durations.

In [None]:
# data frame with values of the IDF curve


In [None]:
# save results


Scatter plot that shows, for each return period, rainfall intensity as a function of storm duration.

In [None]:
# plot configuration


### Analytical IDF curves

Up to now, we have calculated points of the IDF curves corresponding to paired values of duration and return period. We could iterate the process to get points for each storm duration in full hours, but we would still have to interpolate between points to get intensity values for a storm duration of 2.5 h, for instance. To avoid that, several analytical expressions of the IDF curve exist, such as:

$$I = \frac{a \cdot R + b}{(D + c)^d}$$

$$I = \frac{a \cdot R + b}{D^c + d}$$

$$I = \frac{a \cdot R^b}{(D + c)^d}$$

$$I = \frac{a \cdot R^b}{D^c + d}$$

where $I$ is the precipitation intensity, $D$ is the storm duration, $R$ is the return period, and $a$, $b$, $c$ and $d$ are location-specific parameters. We must optimize these parameters so that the analytical IDF curves fit the empirical IDF points.

In [None]:
def IDF_type_I(x, a, b, c, d):
    """Estimate precipitation intensity given a return period and a storm duration using the analytical IDF curve type I:
    
    I = (a * R + b) / (D + c)**d
    
    Input:
    ------
    x:         list [2x1]. Values of return period (years) and duration (h)
    a:         float. Parameter of the IDF curve
    b:         float. Parameter of the IDF curve
    c:         float. Parameter of the IDF curve
    d:         float. Parameter of the IDF curve
    
    Output:
    -------
    I:         float. Precipitation intensity (mm/h)"""
    
    I = (a * x[0] + b) / (x[1] + c)**d
    
    return I

def IDF_type_II(x, a, b, c, d):
    """Estimate precipitation intensity given a return period and a storm duration using the analytical IDF curve type II:
    
    I = (a * R + b) / (D**c + d)   
    
    Input:
    ------
    x:         list [2x1]. Values of return period (years) and duration (h)
    a:         float. Parameter of the IDF curve
    b:         float. Parameter of the IDF curve
    c:         float. Parameter of the IDF curve
    d:         float. Parameter of the IDF curve
    
    Output:
    -------
    I:         float. Precipitation intensity (mm/h)"""
    
    I = (a * x[0] + b) / (x[1]**c + d)
    
    return I

def IDF_type_III(x, a, b, c, d):
    """Estimate precipitation intensity given a return period and a storm duration using the analytical IDF curve type III:
    
    I = a * R**b / (D + c)**d
    
    Input:
    ------
    x:         list [2x1]. Values of return period (years) and duration (h)
    a:         float. Parameter of the IDF curve
    b:         float. Parameter of the IDF curve
    c:         float. Parameter of the IDF curve
    d:         float. Parameter of the IDF curve
    
    Output:
    -------
    I:         float. Precipitation intensity (mm/h)"""
    
    I = a * x[0]**b  / (x[1] + c)**d
    
    return I

def IDF_type_IV(x, a, b, c, d):
    """Estimate precipitation intensity given a return period and a storm duration using the analytical IDF curve type IV:
    
    I = a * R**b / (D**c + d)
    
    Input:
    ------
    x:         list [2x1]. Values of return period (years) and duration (h)
    a:         float. Parameter of the IDF curve
    b:         float. Parameter of the IDF curve
    c:         float. Parameter of the IDF curve
    d:         float. Parameter of the IDF curve
    
    Output:
    -------
    I:         float. Precipitation intensity (mm/h)"""
    
    I = (a * x[0]**b) / (x[1]**c + d)
    
    return I 

#### Fit the analytical IDF
To fit the analytical IDF we will use the function `curve_fit` in `scipy.optimize`. We must provide `curve_fit` with a function representing the curve to be fitted, the independent variable (paired values of return period and duration) and the dependent variable (intensity associated with those pairs). `curve_fit` returns a vector with the optimized parameters and the covariance matrix of those parameters.
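The sketch below illustrates the call on synthetic, noise-free points generated from the type III equation with made-up parameters, so it shows the mechanics rather than a real result. The function `idf_type_iii`, the initial guess `p0` and the non-negativity bounds are assumptions for the example; the independent variable is arranged as an array of shape (2, N) so that `x[0]` holds return periods and `x[1]` holds durations.

```python
import numpy as np
from scipy.optimize import curve_fit

def idf_type_iii(x, a, b, c, d):
    """Analytical IDF type III: I = a * R**b / (D + c)**d, with x = [R, D]."""
    return a * x[0]**b / (x[1] + c)**d

# return periods (years) and durations (h)
R = np.array([2, 10, 30])
D = np.array([1, 2, 4, 8, 12, 24])

# all combinations of R and D, flattened into an array of shape (2, N)
RR, DD = np.meshgrid(R, D)
RD = np.vstack([RR.flatten(), DD.flatten()]).astype(float)

# synthetic intensities from made-up parameters (stand-in for the empirical IDF points)
true_params = (30.0, 0.2, 0.5, 0.7)
I = idf_type_iii(RD, *true_params)

# fit the curve; non-negative bounds keep (D + c) positive during the search
p0 = [10.0, 0.5, 1.0, 1.0]
popt, pcov = curve_fit(idf_type_iii, RD, I, p0=p0, bounds=(0, np.inf), max_nfev=10000)

# root mean square error of the fitted curve against the input points
rmse = np.sqrt(np.mean((idf_type_iii(RD, *popt) - I)**2))
print('optimized parameters:', popt.round(3))
print('RMSE:', rmse)
```

With real data, the 1D array of intensities must follow the same ordering as the flattened grid, otherwise each intensity would be paired with the wrong return period and duration.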

In [None]:
from scipy.optimize import curve_fit

**Dependent variable in the IDF curve: intensity**

In [None]:
# 1D array of intensity for each pair of values in 'R' and 'D'


**Independent variable in the IDF curve: paired values of return period and duration**

In [None]:
# grid with all possible combinations of duration and return period


In [None]:
# convert the grid ('RR' and 'DD') into a 1D array


In [None]:
# join 'RR' and 'DD' as columns of a 2D array


**Fit the curve**

In [None]:
# set type of curve


In [None]:
# fit the curve


In [None]:
# save the optimized parameters


In [None]:
# export results


**Visualize the fit**

In [None]:
# plot the analytical IDF curves

    
# save figure
