# Overview

## <span class="custom-tabset-title">Introduction</span>

Frequency analysis of floods is a statistical method used in hydrology
to estimate the probability of occurrence of extreme flow events, such
as floods, over a given period. It is typically based on the study of
observed annual maximum flood discharges over several years. An
alternative approach is the peaks-over-threshold (POT) method, which
considers all flood peaks that exceed a certain threshold, rather than
only the annual maxima. This method allows for a more detailed analysis
of extreme flood events by using more data points above the chosen
threshold.
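The two sampling approaches can be sketched in base R as follows. This is a minimal illustration under stated assumptions: the daily series is synthetic (drawn from a gamma distribution, not real discharge data), the variable names are hypothetical, and the 99th-percentile threshold is an arbitrary choice rather than a recommended one.

``` r
# Synthetic daily discharge series (made-up values, for illustration only)
set.seed(42)
dates <- seq(as.Date("2001-01-01"), as.Date("2020-12-31"), by = "day")
flow  <- rgamma(length(dates), shape = 2, scale = 50)  # m3/s, hypothetical

# Annual maxima: one value per calendar year
years      <- format(dates, "%Y")
annual_max <- tapply(flow, years, max)

# Peaks over threshold: every daily value above a high quantile.
# No declustering is done here, so consecutive exceedances belonging
# to the same flood event would each be counted.
threshold <- quantile(flow, 0.99)
pot       <- flow[flow > threshold]

length(annual_max)  # one value per year
length(pot)         # typically more values than years
```

With 20 years of data, the annual maxima series has 20 values, while the POT sample keeps every exceedance and is therefore usually larger.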

The objective is to determine the discharge associated with a return
period, that is, the discharge that has a certain probability (e.g., 1%,
10%) of being exceeded in any given year. For example, a 100-year flood
has a 1% chance of being exceeded in any year.
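The link between return period and exceedance probability can be checked with a few lines of R. The sketch below assumes independent years and a constant annual exceedance probability (a stationary setting); the 30-year horizon is an arbitrary example value.

``` r
# Annual exceedance probability p for a return period of T years: p = 1 / T
return_period <- c(10, 100)
p_exceed      <- 1 / return_period   # 0.10 and 0.01

# Probability of at least one exceedance during an n-year horizon,
# assuming independent years with a constant p
n_years <- 30
risk    <- 1 - (1 - p_exceed)^n_years

round(risk, 3)
```

Under these assumptions, a 100-year flood has roughly a 26% chance of being exceeded at least once in 30 years, even though its chance in any single year is only 1%.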

To achieve this, the data are fitted to a probability distribution (such
as the Gumbel distribution), and a frequency curve is constructed
linking discharges to return periods. This curve is then used to design
hydraulic structures (dams, bridges, levees) and to manage flood risks.
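As a rough sketch of this workflow, the following base R code fits a Gumbel distribution by the method of moments and evaluates the resulting frequency curve. The annual maxima are simulated, the function name `gumbel_quantile` is hypothetical, and the method of moments is used here only because it is easy to write out; dedicated packages typically fit by maximum likelihood instead.

``` r
# Hypothetical annual maximum discharges (m3/s), simulated from a Gumbel law
set.seed(1)
mu_true    <- 300
beta_true  <- 80
annual_max <- mu_true - beta_true * log(-log(runif(50)))

# Method-of-moments estimates of the Gumbel location (mu) and scale (beta):
# sd = beta * pi / sqrt(6) and mean = mu + euler_gamma * beta
euler_gamma <- 0.5772157
beta_hat    <- sd(annual_max) * sqrt(6) / pi
mu_hat      <- mean(annual_max) - euler_gamma * beta_hat

# Gumbel quantile for a return period rp (non-exceedance probability 1 - 1/rp)
gumbel_quantile <- function(rp, mu, beta) mu - beta * log(-log(1 - 1 / rp))

# Frequency curve: discharge as a function of return period
rp         <- c(2, 10, 50, 100)
freq_curve <- data.frame(return_period = rp,
                         discharge     = gumbel_quantile(rp, mu_hat, beta_hat))
freq_curve
```

The estimated discharge increases with the return period, which is the relationship a frequency curve is meant to display.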

Frequency analysis thus allows for an objective assessment of
flood-related risks and supports informed decision-making in land-use
planning and flood protection.

## <span class="custom-tabset-title">Extreme Value Theory (EVT)</span>

Imagine we have many years of daily observations of a random variable
(e.g., river discharge or precipitation).
[EVT](https://en.wikipedia.org/wiki/Extreme_value_theory) states that
the suitably rescaled block maxima of this variable (for example, annual
maxima) converge asymptotically to one of three types of extreme value
distribution (Gumbel, Fréchet, or Weibull), regardless of the original
distribution of the daily values.

The extreme value distribution (EVD) helps us estimate the probability
of rare, high-flow events such as floods.

<table style="width:100%;">
<colgroup>
<col style="width: 33%" />
<col style="width: 33%" />
<col style="width: 33%" />
</colgroup>
<tbody>
<tr class="odd">
<td style="text-align: left;"><div width="33.3%"
data-layout-align="left">
<figure>
<img src="attachment:overview_files/figure-ipynb/unnamed-chunk-1-1.png"
alt="(a) Normal distribution of the random variable" />
<figcaption aria-hidden="true">(a) Normal distribution of the random
variable</figcaption>
</figure>
</div></td>
<td style="text-align: left;"><div width="33.3%"
data-layout-align="left">
<figure>
<img src="attachment:overview_files/figure-ipynb/unnamed-chunk-1-2.png"
alt="(b) Sampling distribution of the mean" />
<figcaption aria-hidden="true">(b) Sampling distribution of the
mean</figcaption>
</figure>
</div></td>
<td style="text-align: left;"><div width="33.3%"
data-layout-align="left">
<figure>
<img src="attachment:overview_files/figure-ipynb/unnamed-chunk-1-3.png"
alt="(c) Extreme value distribution (maxima)" />
<figcaption aria-hidden="true">(c) Extreme value distribution
(maxima)</figcaption>
</figure>
</div></td>
</tr>
</tbody>
</table>

<table style="width:33%;">
<colgroup>
<col style="width: 33%" />
</colgroup>
<tbody>
<tr class="odd">
<td style="text-align: left;"><div width="33.3%"
data-layout-align="left">
<p>Illustration of the Extreme Value Theory (EVT)</p>
</div></td>
</tr>
</tbody>
</table>
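A figure of this kind could be simulated along the following lines. This is only a sketch under assumptions: the parent distribution is taken to be standard normal, each "year" is a block of 365 independent values, and the variable names are hypothetical. It is not necessarily how the panels above were produced.

``` r
set.seed(123)
n_years <- 1000   # number of simulated blocks ("years")
n_days  <- 365    # block size

# One row per year of simulated daily values (standard normal parent)
daily <- matrix(rnorm(n_years * n_days), nrow = n_years)

block_mean <- rowMeans(daily)        # approximately normal (central limit theorem)
block_max  <- apply(daily, 1, max)   # approximately Gumbel type for a normal parent

# Three panels: parent values, block means, block maxima
op <- par(mfrow = c(1, 3))
hist(daily,      breaks = 50, main = "(a) Daily values",  xlab = "Value")
hist(block_mean, breaks = 50, main = "(b) Block means",   xlab = "Mean")
hist(block_max,  breaks = 50, main = "(c) Block maxima",  xlab = "Maximum")
par(op)
```

The histogram of block means is symmetric, while the histogram of block maxima is shifted to the right and skewed, which is the behaviour that extreme value distributions describe.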

## <span class="custom-tabset-title">Practical Objectives</span>

📊 **Apply sampling approaches for extreme events**

Use annual maxima and peaks-over-threshold (POT) methods, including
threshold selection and declustering to identify independent events.

📈 **Understand trend detection in extreme rainfall or flood series**

Gain insight into statistical methods for identifying trends in
hydrometeorological extremes.

🧪 **Compare non-parametric and parametric techniques**

Implement and contrast non-parametric tests (e.g., Mann-Kendall) with
parametric (distribution-based) models such as the Generalized Extreme
Value (GEV) distribution and the Generalized Pareto Distribution (GPD)
in a non-stationary framework.
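To make the non-parametric side concrete, here is a minimal base R sketch of the Mann-Kendall test statistic. The function name `mann_kendall` is hypothetical, the example series is simulated, and the sketch assumes no tied values and no serial correlation. The {trend} package provides `mk.test()` for practical use.

``` r
# Mann-Kendall trend test (no correction for ties or autocorrelation)
mann_kendall <- function(x) {
  n <- length(x)
  # S: sum of the signs of all pairwise differences x[j] - x[i] with j > i
  d <- outer(x, x, "-")               # d[j, i] = x[j] - x[i]
  s <- sum(sign(d[lower.tri(d)]))     # lower triangle: row j > column i
  # Variance of S under the null hypothesis of no trend (no ties)
  var_s <- n * (n - 1) * (2 * n + 5) / 18
  # Continuity-corrected standard normal score
  z <- if (s > 0) (s - 1) / sqrt(var_s) else if (s < 0) (s + 1) / sqrt(var_s) else 0
  # Two-sided p-value
  c(S = s, Z = z, p_value = 2 * pnorm(-abs(z)))
}

# Hypothetical series of 40 annual maxima with an imposed upward trend
set.seed(7)
x <- 100 + 0.8 * (1:40) + rnorm(40, sd = 10)
mann_kendall(x)
```

A positive `S` with a small p-value points to an increasing trend. Because the test uses only the signs of the differences, it makes no assumption about the distribution of the data, unlike the GEV or GPD models.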

🔍 **Interpret statistical significance**

Evaluate the reliability and meaning of trend detection results under
various statistical assumptions.

💻 **Use open-source R packages for implementation**

Apply all methods in practice using reproducible, open-source tools
within the [R environment](https://www.r-project.org/).

## <span class="custom-tabset-title">Materials</span>

📦 [{extRemes}](https://www.jstatsoft.org/article/view/v072i08): for
fitting and analyzing extreme value distributions.

``` r
install.packages("extRemes")
```

📦 [{trend}](https://cran.r-project.org/web/packages/trend/index.html):
provides non-parametric tests for trend detection.

``` r
install.packages("trend")
```

📦 [{EXstat}](https://github.com/super-lou/EXstat): an efficient and
simple solution to aggregate and analyze the stationarity of time
series.

``` r
# install.packages("remotes")  # uncomment if {remotes} is not yet installed
remotes::install_github("super-lou/EXstat")
```

📦 [{dplyr}](https://cran.r-project.org/web/packages/dplyr/index.html):
for data manipulation and tidy workflows.

``` r
install.packages("dplyr")
```

📦
[{ggplot2}](https://cran.r-project.org/web/packages/ggplot2/index.html):
for creating clear, publication-quality plots.

``` r
install.packages("ggplot2")
```

📦
[{lubridate}](https://cran.r-project.org/web/packages/lubridate/index.html):
for working with date-times and time spans.

``` r
install.packages("lubridate")
```

> **Important**
>
> Must-haves
>
> ``` r
> library(dplyr)
> library(tidyr)
> library(ggplot2)
> library(lubridate)
> library(trend)
> library(EXstat)
> library(extRemes)
> ```

> **Note**
>
> The packages `dplyr`, `tidyr`, `ggplot2` and `lubridate` are part of
> the [`tidyverse`](https://www.tidyverse.org/packages/) collection of R
> packages. The core tidyverse includes the packages that you're likely
> to use in everyday data analyses. It is advisable to install this set
> of packages together by running:
>
> ``` r
> install.packages("tidyverse")
> ```