---
## **Bivariate Frequency Analysis (BFA)**
**Tolga Barış Terzi – 2025**

This section demonstrates **Bivariate Frequency Analysis (BFA)** of drought events, focusing on **drought duration and severity**.  
BFA uses **copulas** to model the dependence structure between two variables, allowing the calculation of **joint return periods** for extreme events.  

---

## **BFA Methodology Overview**

1. **Data Preparation**:  
   - Extract **Duration** and **Severity** from the `DChar` output.  
   - Ensure **Interarrival times** are included to scale the return periods.  

2. **Fit Marginal Distributions**:  
   - Fit a univariate distribution to **Duration** (`dist_x`).  
   - Fit a univariate distribution to **Severity** (`dist_y`).  
   - Common choices include **Exponential**, **Gamma**, or **Weibull** distributions.  

3. **Select Copula Family**:  
   - Copulas capture the dependence between Duration and Severity.  
   - Supported families: **Frank**, **Clayton**, **Gumbel**, **Gaussian**, **Plackett**, **Galambos**.  

4. **Joint Return Period Calculation**:  
   - **OR return period**: at least one variable exceeds its threshold.  
   - **AND return period**: both variables exceed thresholds simultaneously.  
   - Return periods can be scaled using the **average interarrival time**.  
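
The two joint return periods in step 4 can be written out explicitly. The expressions below follow Shiau (2006), with $F_D$ and $F_S$ the marginal CDFs of duration and severity, $C$ the copula, $d$ and $s$ the thresholds, and $E(L)$ the average interarrival time. That `pydrght.BFA` evaluates exactly these expressions is an assumption here; the notebook itself does not show the implementation.

$$
T_{\mathrm{OR}} = \frac{E(L)}{1 - C\big(F_D(d),\, F_S(s)\big)},
\qquad
T_{\mathrm{AND}} = \frac{E(L)}{1 - F_D(d) - F_S(s) + C\big(F_D(d),\, F_S(s)\big)}
$$

The denominators are the probabilities $P(D \ge d \ \text{or}\ S \ge s)$ and $P(D \ge d \ \text{and}\ S \ge s)$, so the OR return period is never longer than the AND return period.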

---

## **Required Packages**


In [1]:
import pandas as pd
import numpy as np
import scipy.stats as stats
import pydrght

---
## **Load the Data**

The example dataset contains one row per drought event, with the following columns:  

- **Duration**: Number of months a drought lasted.  
- **Severity**: Cumulative SPI deficit over the drought period.  
- **Interarrival**: Months between the end of a drought and the start of the next.  

These drought characteristics are calculated from the **SPI-12 series** derived in the `example_SI` notebook and are stored in `dchar.csv`. For the **Bivariate Frequency Analysis (BFA)**, we fit a **univariate distribution** to each variable and model their dependence with a **copula**, which allows the **joint return periods** of extreme drought events to be calculated.

In [3]:
df = pd.read_csv("dchar.csv")
df.head()

| Unnamed: 0 | Event | Duration | Severity | Interarrival |
|---|---|---|---|---|
| 0 | 0 | 2 | -2.673279 | 11.0 |
| 1 | 1 | 4 | -5.845878 | 12.0 |
| 2 | 2 | 6 | -7.267712 | 3.0 |
| 3 | 3 | 9 | -16.629503 | 119.0 |
| 4 | 4 | 11 | -20.029517 | 2.0 |


---
## **Fit Univariate Distributions**

In this example, we fit **one-parameter Exponential distributions** (location fixed at zero via `floc0=True`) to both **drought duration** and **drought severity** using `pydrght.Dist`.  

Before fitting, the data are prepared as follows:  

- **Severities** are converted to **absolute values**, because they are stored as negative cumulative SPI deficits and the Exponential distribution is defined only for non-negative values.  
- **NaN values** in the `Interarrival` column are removed, since the last interarrival value calculated by `pydrght.DChar` is typically `NaN`.  

These fitted distributions serve as the **marginal distributions** for the subsequent bivariate frequency analysis with copulas.
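
To make the one-parameter fit concrete, the sketch below uses `scipy.stats.expon.fit` directly on the five durations displayed by `df.head()`. It is only an illustration: whether `pydrght.Dist(..., floc0=True)` wraps the SciPy call in this way is an assumption, and five events are far too few for a real fit.

```python
import numpy as np
from scipy import stats

# Durations (months) of the five events shown by df.head(); illustrative only
durations = np.array([2, 4, 6, 9, 11], dtype=float)

# Fix the location at zero so that only the scale parameter is estimated
loc, scale = stats.expon.fit(durations, floc=0)

# With loc = 0, the maximum-likelihood scale equals the sample mean
print(f"loc = {loc}, scale = {scale:.2f}, sample mean = {durations.mean():.2f}")

# Non-exceedance probability of a 6-month drought under the fitted marginal
print(f"F(6) = {stats.expon.cdf(6, loc=loc, scale=scale):.3f}")
```

Fixing the location keeps the support of the distribution at zero and above, which matches durations and absolute severities.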


In [4]:
dur = df["Duration"]                        # Drought durations (months)
sev = df["Severity"].abs()                  # Severities as positive magnitudes
interarrival = df["Interarrival"].dropna()  # Interarrival times (months), trailing NaN removed

# Fit the marginal distributions (Exponential, location fixed at zero)
dist_dur = pydrght.Dist(dur, stats.expon, floc0=True)  # duration
dist_sev = pydrght.Dist(sev, stats.expon, floc0=True)  # severity

---
## **Bivariate Return Periods**

With the **marginal distributions** of drought duration and severity fitted, we can now perform **bivariate frequency analysis** using the `pydrght.BFA` class.  

The analysis requires a **copula** to model the dependence between the two variables. In this example, we use the **Frank Copula** (`pydrght.copulas.FrankCopula`) to capture the joint behavior of drought duration and severity.  
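
Before choosing a copula, it can help to check how strongly the two variables move together. The sketch below computes Kendall's tau with `scipy.stats.kendalltau` on the five events displayed by `df.head()`. This check is an addition for illustration and is not part of the `pydrght` workflow shown in this notebook.

```python
import numpy as np
from scipy import stats

# First five events shown by df.head(); illustrative only
duration = np.array([2, 4, 6, 9, 11])
severity = np.abs([-2.673279, -5.845878, -7.267712, -16.629503, -20.029517])

# Rank-based measure of dependence between duration and severity
tau, p_value = stats.kendalltau(duration, severity)
print(f"Kendall's tau = {tau:.2f}")
```

In these five rows severity increases with every increase in duration, so all pairs are concordant. The full event set should be used for any real assessment.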

- `T` specifies the return period of interest (here, **200 years**).  
- `E_L` is the **average interarrival time** in years, calculated from the interarrival series.  

Finally, we compute the joint return periods for the **T-year** event with the `joint_return_period` method, which returns both:

- **OR return period**: the return period of events in which **either** duration or severity exceeds its threshold.  
- **AND return period**: the return period of events in which **both** duration and severity exceed their thresholds simultaneously.


In [5]:
# Initialize BFA with Frank copula
bivar = pydrght.BFA(dist_dur, dist_sev, copula_family=pydrght.copulas.FrankCopula)

# Compute average interarrival in years
E_L = interarrival.mean() / 12

# Define the return period of interest
T = 200

# Calculate T-year joint return period
rp = bivar.joint_return_period(T=T, interarrival=E_L)

# Display results
print("=== Bivariate Frequency Analysis (BFA) ===")
display(rp)


=== Bivariate Frequency Analysis (BFA) ===


OR      110.422033
AND    1059.505735
dtype: float64
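
The sketch below shows how OR and AND values of this kind can be computed by hand with a Frank copula, using the joint return period expressions of Shiau (2006). Several parts are assumptions rather than the `pydrght` implementation: the helper `frank_cdf` is written here for illustration, `theta` is a hypothetical dependence parameter (not a fitted value), `E_L` is taken from the five rows displayed by `df.head()`, and the thresholds are assumed to be the marginal quantiles of the T-year event, with non-exceedance probability `1 - E_L / T`.

```python
import numpy as np

def frank_cdf(u, v, theta):
    """Frank copula CDF C(u, v) for theta != 0 (illustrative helper)."""
    num = np.expm1(-theta * u) * np.expm1(-theta * v)
    return -np.log1p(num / np.expm1(-theta)) / theta

T = 200.0        # univariate return period of interest (years)
E_L = 29.4 / 12  # mean interarrival of the five displayed rows, in years (illustrative)
theta = 5.0      # hypothetical dependence parameter, not a fitted value

# Marginal non-exceedance probability of the T-year duration and severity
u = v = 1.0 - E_L / T

c_uv = frank_cdf(u, v, theta)
T_or = E_L / (1.0 - c_uv)           # duration OR severity exceeds its threshold
T_and = E_L / (1.0 - u - v + c_uv)  # duration AND severity exceed their thresholds

print(f"OR: {T_or:.1f} years, AND: {T_and:.1f} years")
```

Under these assumptions the OR return period cannot exceed `T` and the AND return period cannot fall below it, which matches the ordering of the values reported above (110.4 < 200 < 1059.5).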

---
## **References**

- Shiau, J. T. (2006). *Fitting drought duration and severity with two-dimensional copulas*. Water Resources Management, 20(5), 795–815. https://doi.org/10.1007/s11269-005-9008-9  