<a href="https://colab.research.google.com/github/laura-turnbull-lloyd/STDH_teaching/blob/main/STDH2324_Lecture6_recurenceintervals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Exercise 1: Recurrence intervals
Have a go at calculating open and closed recurrence intervals for the four earthquakes in the Personius paper. Recall that:

> $\text{Open RI} = \frac{\text{oldest age to present}}{\text{number of events}}$

> $\text{Closed RI} = \frac{\text{oldest age} - \text{youngest age}}{\text{number of intervals}}$

We can directly enter the earthquake dates:

In [None]:
# read in the dates of the four earthquakes from the Personius paper
dateBP = [5610, 4510, 3490, 2430]



Now that we have our earthquake dates, we can calculate the open recurrence interval.

In [None]:
openRI = max(dateBP)/(len(dateBP)) # here we use len(dateBP) to give us the length of the dataset (i.e. how many events there are)

print(openRI)


***Now, calculate the closed recurrence interval and print out the result.***

In [None]:
# You can enter your code here.

Your answer should be 1060
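
If you want to check your working, both formulas can be wrapped in small helper functions. This is only a minimal sketch: the function names `open_ri` and `closed_ri` are illustrative (they do not come from the Personius paper), and the ages are assumed to be in years before present.

```python
def open_ri(ages_bp):
    """Open recurrence interval: oldest age to present / number of events."""
    return max(ages_bp) / len(ages_bp)


def closed_ri(ages_bp):
    """Closed recurrence interval: (oldest - youngest age) / number of intervals."""
    # n events bound n - 1 intervals
    return (max(ages_bp) - min(ages_bp)) / (len(ages_bp) - 1)


dateBP = [5610, 4510, 3490, 2430]
print(open_ri(dateBP))    # 1402.5
print(closed_ri(dateBP))  # 1060.0
```

Note that the closed interval divides by the number of intervals (one fewer than the number of events), whereas the open interval divides by the number of events.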

# Exercise 2: Bringing magnitude into the equation

Let's generalise and say that we know the expected rate, per year, of events of at least a given size.

If events continue to occur at that same unchanged rate, we can express the probability of at least one such event occurring within $n$ years (the exceedance probability) using this equation, where $p$ is the probability of the event occurring in any one year:
> $ p_{n,1} = {1-(1-p)^n}$

Let's have a go at calculating the exceedance probability for an event with an expected rate of 0.1 per year, over 5 years.


In [None]:
# m is the expected number of occurrences of an event of at least a given size per year
m = 0.1

# T is the expected recurrence time (in years) for events of at least that size
T = 1/m

# p is the probability of such an event occurring in any one year
p = 1/T

# n is the number of years we're interested in for the sake of these calculations
n = 5

***Have a go at entering the equation to calculate the probability of exceedance over 5 years below and print out your result (recall that in the first week we covered how to raise a number to a power).***

In [None]:
# calculate the exceedance probability here:




If you did this correctly, you should get 0.410
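
To check your result, the equation can be written as a small reusable function. This is a sketch only: the function name `exceedance_probability` is an illustrative choice, and it assumes `p` is the annual probability and `n` a whole number of years.

```python
def exceedance_probability(p, n):
    """Probability of at least one event in n years, given annual probability p."""
    return 1 - (1 - p) ** n


# Event with an expected rate of 0.1 per year, over 5 years
print(f"{exceedance_probability(0.1, 5):.3f}")  # 0.410
```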

To summarise, what you've done here is calculate the exceedance probability of at least one event of a specific size occurring, and this depends on the rate at which that event occurs, and the time period of interest.

***Now, have a go at calculating the probability of no events occurring over the same time period.***

In [None]:
# calculate the probability of no events occurring over the same time period


As you increase the number of years in question, the exceedance probability (i.e. the likelihood of at least one event happening) will get closer to 1. Conversely, the probability of no event happening will get closer to 0 over a longer time span.
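
This trend can be seen by repeating the calculation for several time spans. The sketch below assumes the same annual probability of 0.1 used earlier; the list of time spans is an arbitrary choice for illustration.

```python
p = 0.1  # annual probability, as in the exercise above

results = []
for n in [1, 5, 10, 50, 100]:
    p_none = (1 - p) ** n          # no event in any of the n years
    p_at_least_one = 1 - p_none    # complement: one or more events
    results.append((n, p_at_least_one, p_none))
    print(f"n = {n:3d}   P(at least one) = {p_at_least_one:.3f}   P(none) = {p_none:.3f}")
```

The two probabilities always sum to 1, because either no event occurs or at least one does.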

***Now, have a go at answering the following:***

***What is the exceedance probability of a 100-yr event occurring over a 30-yr period?***

***What is the probability of non-occurrence?***


# Exercise 3

Let's have a go at calculating recurrence intervals for annual maximum river flow in the UK. We can download the max river flow data for any gauge in the UK from the NRFA (https://nrfa.ceh.ac.uk/data/).

You can access the data we're using here: https://nrfa.ceh.ac.uk/data/station/peakflow/24001.


In [None]:
# import modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


To open the data:


In [None]:
Filename = 'Wear_at_Sunderland.csv' # specifying the name of the file to be read in
flowmax = pd.read_csv(Filename)

Now that we've got our data, let's have a quick look at it:

In [None]:
flowmax

You'll see that the data already come ranked, from largest to smallest. This is handy, but data don't always come this well prepared. So let's go through manually ranking the data, and check that we get the same results!





In [None]:
rank_flow = sorted(list(flowmax.Flow_m3s), reverse=True)

In [None]:
rank_flow

In [None]:
number_years = len(flowmax)

In [None]:
number_years

In [None]:
rank_data = list(range(1,number_years+1,1))

In [None]:
rank_df = pd.DataFrame({'rank_flow' :rank_flow, 'rank_data':rank_data})


Let's view our new dataframe containing the annual maximum flows, sorted in descending order, and their rank.

In [None]:
rank_df
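
The same ranking can also be done within pandas instead of with Python lists. The sketch below is one possible alternative, not the approach this exercise requires: the flow values are made up purely for illustration, and only the column name `Flow_m3s` is taken from the data file.

```python
import pandas as pd

# Hypothetical annual maximum flows (m3/s), for illustration only
demo = pd.DataFrame({"Flow_m3s": [120.0, 310.5, 95.2, 210.0]})

# Sort from largest to smallest and rebuild the index
ranked = demo.sort_values("Flow_m3s", ascending=False).reset_index(drop=True)

# Rank 1 = largest flow; method="first" breaks ties by order of appearance
ranked["rank"] = ranked["Flow_m3s"].rank(ascending=False, method="first").astype(int)
print(ranked)
```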

Now, we can finally calculate the exceedance probability and add this to the dataframe in a new column called 'pe'.


In [None]:
rank_df['pe'] = rank_df.rank_data/(number_years+1)
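
To see what this formula does, here is a small worked example in plain Python. The record length of 9 years is hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical record length, for illustration only
number_years = 9
ranks = list(range(1, number_years + 1))

# Exceedance probability: rank / (n + 1)
pe = [r / (number_years + 1) for r in ranks]

print(pe[0])   # rank 1 (largest flow): 0.1
print(pe[-1])  # rank 9 (smallest flow): 0.9
```

Dividing by n + 1 rather than n keeps every probability strictly between 0 and 1, so even the smallest flow on record is not assigned a probability of exactly 1.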

***Can you now calculate the Weibull return time and add it to the dataframe?***

In [None]:
# Add the equation for the Weibull return time here in a new column called 'WRT':



Let's plot out the results:

In [None]:
from matplotlib.ticker import ScalarFormatter


fig, ax = plt.subplots()
ax.plot(rank_df.pe,rank_df.rank_flow, 'o', color='red')
plt.ylabel("Annual maximum flow")
plt.xlabel("Exceedance probability")
plt.tight_layout() # makes the plot look nicer!
plt.xlim(1,0.01) # to have probabilities going from largest to smallest
ax.set_xscale('log')
ax.set_yscale('log')
for axis in [ax.xaxis, ax.yaxis]:
    axis.set_major_formatter(ScalarFormatter())

fig, ax = plt.subplots()
ax.plot(rank_df.WRT, rank_df.rank_flow, 'o', color='red')
plt.xlabel("Return period")
plt.ylabel("Annual maximum flow")
plt.tight_layout() # makes the plot look nicer!
ax.set_xscale('log')
ax.set_yscale('log')
for axis in [ax.xaxis, ax.yaxis]:
    axis.set_major_formatter(ScalarFormatter())