# Simulating Daily Precipitation Data
This notebook simulates 100 years of daily precipitation data and analyzes the statistical properties of the resulting rainfall.
- Uses a binomial distribution with a single trial per day (a Bernoulli draw) to determine wet/dry days.
- Uses a gamma distribution to simulate rainfall amounts on wet days.
- Computes wet-day probabilities, return periods, and exceedance probabilities (the complementary cumulative distribution function).

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from itables import init_notebook_mode, show

init_notebook_mode(all_interactive=False)

## Generating Synthetic Precipitation Data
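We generate 100 years of daily precipitation data (365 days per year, leap days ignored) using:
- A binomial distribution with one trial and success probability 0.3 to decide whether it rains on a given day.
- A gamma distribution with shape 1 and scale 5 for rainfall amounts (in mm) on wet days.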

In [None]:
years = 100
size = 365 * years

np.random.seed(42)
wetday = np.random.binomial(1, 0.3, size)
rainfall = np.random.gamma(1, 5, size)
precipitation = wetday * rainfall

df = pd.DataFrame({'day': range(1, size + 1), 'precipitation': precipitation})
show(df)
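
As a quick sanity check on the generator, the parameters above imply a mean of shape × scale = 5 mm on wet days and 0.3 × 5 = 1.5 mm over all days. The sketch below re-creates the simulation on its own and compares the sample means with those values. It uses `np.random.default_rng` rather than `np.random.seed`, which is a substitution made here for self-containment, so the draws differ from the cell above; the names `p_wet`, `shape`, and `scale` are illustrative.

```python
import numpy as np

# Assumption: a stand-alone re-creation of the simulation, not the notebook's own arrays.
rng = np.random.default_rng(42)
size = 365 * 100
p_wet, shape, scale = 0.3, 1.0, 5.0

wetday = rng.binomial(1, p_wet, size)      # 1 = wet day, 0 = dry day
rainfall = rng.gamma(shape, scale, size)   # candidate rainfall amount for every day
precipitation = wetday * rainfall          # zero on dry days

mean_wet = precipitation[precipitation > 0].mean()
mean_all = precipitation.mean()

print(f"mean on wet days: {mean_wet:.2f} mm (theory: {shape * scale:.2f} mm)")
print(f"mean on all days: {mean_all:.2f} mm (theory: {p_wet * shape * scale:.2f} mm)")
```

With 36,500 simulated days, both sample means should land close to the theoretical values, though not exactly on them.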

## Histogram of Precipitation
This histogram shows the distribution of daily precipitation amounts.

In [None]:
plt.figure(figsize=(8, 6))
plt.hist(df['precipitation'], bins=50, alpha=0.7, color='skyblue', edgecolor='black')
plt.xlabel('Precipitation (mm)')
plt.ylabel(f'Number of days (in {years} years)')
plt.title('Histogram of Daily Precipitation')
plt.grid(True)
plt.show()

## Extracting Rainfall Days and Computing Probabilities
We extract the days with nonzero precipitation and estimate the probability of a wet day as the fraction of wet days among all simulated days.

In [None]:
dfwet = df[df.precipitation > 0]
n_days = df.day.size
n_wet_days = dfwet.day.size
p_wetdays = n_wet_days / n_days
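
The estimated wet-day probability is a sample proportion, so it carries sampling uncertainty. The sketch below shows one way to report it together with the usual binomial standard error, sqrt(p(1 - p)/n). It regenerates a wet/dry indicator on its own instead of reusing `df`, and the names `wet`, `p_hat`, and `std_err` are illustrative.

```python
import numpy as np

# Assumption: a stand-alone wet/dry indicator with the same parameters as the simulation.
rng = np.random.default_rng(42)
n_days = 365 * 100
wet = rng.binomial(1, 0.3, n_days)

p_hat = wet.mean()                                # fraction of wet days
std_err = np.sqrt(p_hat * (1 - p_hat) / n_days)   # binomial standard error of a proportion

print(f"estimated p_wet = {p_hat:.4f} +/- {std_err:.4f}")
```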

## Histogram of Rainfall Days
This histogram includes only days with precipitation > 0.

In [None]:
plt.figure(figsize=(8, 6))
plt.hist(dfwet['precipitation'], bins=50, alpha=0.7, color='skyblue', edgecolor='black')
plt.xlabel('Precipitation (mm)')
plt.ylabel(f'Number of wet days (in {years} years)')
plt.title('Histogram of Precipitation (Rain Days Only)')
plt.grid(True)
plt.show()

## Return Period Analysis
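The return period of a rainfall event is estimated from its rank:

$$\tau = \frac{N + 1}{m}$$

where N is the total number of wet days and m is the rank of the event when wet days are sorted by precipitation in descending order (m = 1 for the largest event). Because N counts wet days, the return period is expressed in wet days. The probability that rainfall on a wet day reaches or exceeds that of the event is calculated as:

$$P_{\rm event} = \frac{1}{\tau}$$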

In [None]:
df_sorted = dfwet.sort_values('precipitation', ascending=False)
rank = np.arange(1, n_wet_days+1)
df_sorted.insert(2, 'rank', rank)
return_period = (n_wet_days + 1) / df_sorted['rank']
probability = 1 / return_period
df_sorted.insert(3, 'return period', return_period)
df_sorted.insert(4, 'probability', probability)
show(df_sorted)
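
To make the ranking formula concrete, here is a minimal worked example on five made-up wet-day amounts. The values in `amounts` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical wet-day rainfall amounts in mm (not taken from the simulation).
amounts = np.array([2.0, 12.5, 0.4, 7.1, 3.3])
n = amounts.size

sorted_amounts = np.sort(amounts)[::-1]   # largest event first
rank = np.arange(1, n + 1)                # m = 1 for the largest event
return_period = (n + 1) / rank            # tau = (N + 1) / m, in wet days
probability = 1 / return_period           # P_event = 1 / tau

for a, m, t, p in zip(sorted_amounts, rank, return_period, probability):
    print(f"{a:5.1f} mm  rank {m}  tau = {t:.2f} wet days  P = {p:.3f}")
```

The largest event (12.5 mm) has rank 1, so its return period is (5 + 1) / 1 = 6 wet days and its probability is 1/6. The smallest event has rank 5, giving a return period of 1.2 wet days and a probability of 5/6.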

## Complementary Cumulative Distribution Function (CCDF)
The CCDF shows the probability that precipitation on a wet day exceeds a given threshold, estimated here from the rank-based event probabilities.

In [None]:
plt.figure(figsize=(8, 6))
plt.plot(df_sorted['precipitation'], df_sorted['probability'])
plt.xlabel('Precipitation (mm)')
plt.ylabel('Probability of exceedance')
plt.title('Complementary Cumulative Distribution Function (p(X>x))')
plt.grid(True)
plt.show()
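
Because a gamma distribution with shape 1 and scale 5 is an exponential distribution with mean 5 mm, the empirical CCDF can be checked against the closed form exp(-x/5). The sketch below does this on a freshly drawn sample. The sample size of 10,950 (30% of 36,500 days) and the names `wet_amounts`, `empirical`, and `theoretical` are assumptions made for this illustration.

```python
import numpy as np

# Assumption: wet-day amounts drawn directly, instead of filtering the notebook's `df`.
rng = np.random.default_rng(42)
scale = 5.0
wet_amounts = rng.gamma(1.0, scale, 10_950)

x = np.sort(wet_amounts)[::-1]              # descending, so rank 1 is the largest
n = x.size
empirical = np.arange(1, n + 1) / (n + 1)   # m / (N + 1), the rank-based probability
theoretical = np.exp(-x / scale)            # exponential survival function

max_gap = np.max(np.abs(empirical - theoretical))
print(f"largest gap between empirical and theoretical CCDF: {max_gap:.4f}")
```

A small gap suggests the rank-based estimate tracks the underlying distribution well. The agreement is typically weakest in relative terms in the far tail, where only a handful of events inform the estimate.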

## Annual Exceedance Probability
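Assuming that days are independent, the probability that a given rainfall amount is exceeded at least once within a year is calculated using:
$$
P_{\rm annual} = 1 - (1 - P_{\rm event} \times p_{\rm wet})^{365}
$$
where $p_{\rm wet}$ is the probability of a wet day, so $P_{\rm event} \times p_{\rm wet}$ is the exceedance probability on any single day.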

In [None]:
probability_annual_exceedance = 1 - (1 - (df_sorted['probability'] * p_wetdays)) ** 365
plt.figure(figsize=(8, 6))
plt.plot(df_sorted['precipitation'], probability_annual_exceedance)
plt.xlabel('Precipitation (mm)')
plt.ylabel('Probability of exceedance within a year')
plt.title('Annual Exceedance Probability 1-(1-p(X>x)*p_wet)^365')
plt.grid(True)
plt.show()
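
As a worked example of the annual formula, take an event that is exceeded on 1% of wet days, with a wet-day probability of 0.3. The helper `annual_exceedance` below is a hypothetical name introduced for this sketch, and the input values are illustrative.

```python
import numpy as np

def annual_exceedance(p_event, p_wet, days_per_year=365):
    """Probability of at least one exceedance in a year, assuming independent days."""
    p_daily = np.asarray(p_event) * p_wet        # exceedance probability on any single day
    return 1 - (1 - p_daily) ** days_per_year    # complement of "no exceedance all year"

# Illustrative inputs: P_event = 0.01 on wet days, p_wet = 0.3.
p_annual = annual_exceedance(0.01, 0.3)
print(f"annual exceedance probability: {p_annual:.3f}")
```

The daily exceedance probability here is 0.003, and the chance of at least one exceedance in 365 independent days comes out at roughly 0.67. The function also accepts an array of event probabilities, such as a whole column of rank-based values.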