<a href="https://colab.research.google.com/github/fdavenport/CIVE480A6-climate-change-impacts/blob/main/lectures/06_Analyzing_Extreme_Events.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CIVE 480A6: Climate Change Risks and Impacts
## Week 9: Analyzing Extreme Events

This week's objectives:
1. Analyze the distribution of daily maximum temperature.
2. Calculate how often daily temperatures exceed 90°F and 100°F.
3. Calculate "block maxima" (in this case, the hottest day of the year).
4. Learn how to fit a Generalized Extreme Value (GEV) distribution to the time series of block maxima.

## Part 1: Daily Temperature Data

Today we will be looking at daily temperature data. We will again be using data from the [Global Historical Climatology Network daily (GHCN-D)](https://www.ncei.noaa.gov/products/land-based-station/global-historical-climatology-network-daily).

We will be looking specifically at data from a weather station near Atlanta, Georgia. The data file contains the high and low (maximum and minimum) temperatures for each day.
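
As a rough sketch of the preparation steps we will go through below, here is the same workflow on a tiny made-up table. The column names `DATE`, `TMAX`, and `TMIN` are assumptions (they may not match the headers in the real file), and the values are invented for illustration only.

```python
import pandas as pd

# Tiny synthetic stand-in for the station file. The column names DATE, TMAX
# and TMIN are assumptions and may differ from the real CSV headers.
temp_data = pd.DataFrame({
    "DATE": ["1994-06-30", "1994-07-01", "1995-01-15", "1995-07-04"],
    "TMAX": [91.0, 95.0, 48.0, 99.0],
    "TMIN": [70.0, 73.0, 30.0, 75.0],
})
# With the real data, the table would instead come from pd.read_csv(...)
# pointed at the course data file.

# Convert the date strings to datetimes, then pull out the year and month
temp_data["DATE"] = pd.to_datetime(temp_data["DATE"])
temp_data["Year"] = temp_data["DATE"].dt.year
temp_data["Month"] = temp_data["DATE"].dt.month

# Look at the first rows and a statistical summary of the temperatures
print(temp_data.head())
print(temp_data[["TMAX", "TMIN"]].describe())
```

Having separate `Year` and `Month` columns makes it easy to subset by season and to loop over years later on.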

<img src="https://raw.githubusercontent.com/fdavenport/CIVE480A6-climate-change-impacts/main/lectures/img/atlanta_map.png" width="400">
<img src="https://raw.githubusercontent.com/fdavenport/CIVE480A6-climate-change-impacts/main/lectures/img/heatwave.jpg" width="300">

In [None]:
# The data has already been added to the course github page at the following link:

atl_temp_data_url = "https://raw.githubusercontent.com/fdavenport/CIVE480A6-climate-change-impacts/refs/heads/main/lectures/data/USW00013874_atlanta_temp.csv"


In [None]:
# we are working with tabular data in a .csv file, so we need to import the pandas library



In [None]:
# read in the data


In [None]:
# look at the data


In [None]:
# add Date information


In [None]:
# get a summary of the data


In [None]:
# import matplotlib.pyplot to make graphs



In [None]:
# make a plot of the data



In [None]:
# create a histogram of the daily temperature data



## Part 2: Calculating 95th percentile threshold exceedance

In this section, we are going to look at the 95th percentile of daily minimum and maximum temperature during the summer months (June, July, and August) in Atlanta, GA. This will give us a sense of what a rare or "extreme" hot temperature looks like at this particular location.

We often use percentiles to define extremes, because what is extreme in one place might not be extreme in another. For example, 100°F is very extreme in Alaska, but not so extreme in Phoenix, AZ.

First let's look at daily maximum summer temperatures in Atlanta:

In [None]:
## subset the data for summer months



We will use the [quantile()](https://numpy.org/doc/2.0/reference/generated/numpy.quantile.html) function from the numpy package to calculate the 95th percentile.
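
Here is a minimal sketch of that calculation on synthetic data. The seasonal cycle, the noise level, and the column names `DATE` and `TMAX` are all assumptions made up for this example, so the numbers it prints say nothing about Atlanta.

```python
import numpy as np
import pandas as pd

# Synthetic daily maximum temperatures (deg F): a seasonal cycle plus noise.
# These values are invented; DATE and TMAX are assumed column names.
rng = np.random.default_rng(0)
dates = pd.date_range("1990-01-01", "1999-12-31", freq="D")
doy = dates.dayofyear.to_numpy()
seasonal = 70 - 20 * np.cos(2 * np.pi * (doy - 15) / 365.25)
daily = pd.DataFrame({"DATE": dates,
                      "TMAX": seasonal + rng.normal(0, 5, len(dates))})

# Keep only June, July and August
summer = daily[daily["DATE"].dt.month.isin([6, 7, 8])]

# np.quantile takes a fraction between 0 and 1, so the 95th percentile is 0.95
tmax_p95 = np.quantile(summer["TMAX"], 0.95)

# Count the summer days that are hotter than the threshold
n_above = (summer["TMAX"] > tmax_p95).sum()
print(f"95th percentile: {tmax_p95:.1f} F, days above: {n_above} of {len(summer)}")
```

By construction, about 5% of the summer days fall above the threshold, so the total count is not very informative on its own. What matters is how those days are spread across the years.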

In [None]:
# import numpy


How many days have maximum temperatures above the 95th percentile?

Let's look at the frequency of extreme hot days over time to see if it has changed:
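
One possible way to do the year-by-year count is with an explicit loop. The sketch below again uses synthetic data with assumed column names (`DATE`, `TMAX`, `Year`), so it only illustrates the mechanics.

```python
import numpy as np
import pandas as pd

# Synthetic daily maximum temperatures (invented values, assumed column names)
rng = np.random.default_rng(1)
dates = pd.date_range("1990-01-01", "1999-12-31", freq="D")
doy = dates.dayofyear.to_numpy()
daily = pd.DataFrame({
    "DATE": dates,
    "TMAX": 70 - 20 * np.cos(2 * np.pi * (doy - 15) / 365.25)
            + rng.normal(0, 5, len(dates)),
})
daily["Year"] = daily["DATE"].dt.year

# One fixed threshold, computed from all summer days in the record
summer = daily[daily["DATE"].dt.month.isin([6, 7, 8])]
tmax_p95 = np.quantile(summer["TMAX"], 0.95)

# Loop through the years and count the days above the threshold in each one
years = np.arange(1990, 2000)
hot_days_per_year = np.zeros(len(years), dtype=int)
for i, year in enumerate(years):
    this_year = summer[summer["Year"] == year]
    hot_days_per_year[i] = (this_year["TMAX"] > tmax_p95).sum()

for year, n in zip(years, hot_days_per_year):
    print(year, n)
```

The threshold is computed once from the whole record and then held fixed, so a change in the yearly counts reflects a change in how often that same temperature is exceeded.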

In [None]:
## how many days are there with maximum temperatures above the 95th percentile in each year?
## Hint: let's use our code from previous lectures to loop through all of the years



In [None]:
## make a time series plot:



To practice, let's apply the same analysis to daily minimum temperatures. Daily minimum temperatures are an especially important metric for understanding the human health consequences of heat waves. If it stays very warm at night, people are unable to cool down, and sustained hot temperatures are more likely to cause negative health impacts.
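
As an alternative to the loop, the yearly counts can also be computed with pandas `groupby`. The sketch below does this for minimum temperatures, using synthetic data and the assumed column names `DATE`, `TMIN`, and `Year`.

```python
import numpy as np
import pandas as pd

# Synthetic daily minimum temperatures (invented values, assumed column names)
rng = np.random.default_rng(2)
dates = pd.date_range("1990-01-01", "1999-12-31", freq="D")
doy = dates.dayofyear.to_numpy()
daily = pd.DataFrame({
    "DATE": dates,
    "TMIN": 52 - 18 * np.cos(2 * np.pi * (doy - 15) / 365.25)
            + rng.normal(0, 4, len(dates)),
})
daily["Year"] = daily["DATE"].dt.year
summer = daily[daily["DATE"].dt.month.isin([6, 7, 8])]

# 95th percentile of summer minimum temperature
tmin_p95 = np.quantile(summer["TMIN"], 0.95)

# True/False for each day, then sum the True values within each year.
# Years with no exceedances still appear, with a count of 0.
warm_nights_per_year = (summer["TMIN"] > tmin_p95).groupby(summer["Year"]).sum()
print(warm_nights_per_year)
```

The result is a pandas Series indexed by year, which can be passed straight to a plotting call alongside the maximum-temperature counts.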

In [None]:
## calculate the 95th percentile of daily minimum temperature for summer months (June, July, and August)



In [None]:
## how many days have minimum temperatures above this threshold in each year?



In [None]:
## add this to our figure



What are some other ways we could assess changes in the intensity or frequency of extreme hot events?

## Part 3: Using Extreme Value statistics to analyze very rare temperature events

In this section, we will use extreme value statistics to estimate the probability of very rare extreme events, including those that may be more rare than anything in the historical data.

Recall from class that extreme value theory applies to "block maxima", or the maximum value in each block of time. For this case, we will consider each calendar year as a block of time. This means we need to calculate the maximum temperature value within each year.

Second, recall from class that one of the assumptions of extreme value theory is that our data is "stationary", meaning that its statistical properties are not changing over time because of some external factor (such as climate change!). Clearly, our data is changing over time!
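
One simple (and far from complete) way to check for non-stationarity is to fit a straight line to the annual maxima and look at the slope. The sketch below uses synthetic annual maxima with a warming trend built in on purpose. The years, trend size, and noise level are all made-up assumptions.

```python
import numpy as np
from scipy.stats import linregress

# Synthetic annual maxima (deg F) with a built-in trend of 0.05 deg F per year.
# All numbers here are invented for illustration.
rng = np.random.default_rng(3)
years = np.arange(1930, 2024)
annual_max = 95 + 0.05 * (years - years[0]) + rng.normal(0, 2, len(years))

# Least-squares linear fit of annual maximum against year
trend = linregress(years, annual_max)
print(f"slope: {trend.slope:.3f} deg F per year, p-value: {trend.pvalue:.3g}")
```

A slope that is clearly different from zero is a sign that the mean of the block maxima is shifting, which conflicts with the stationarity assumption.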

For the purposes of this exercise, we will analyze data from the most recent 30-year period. While there are still changes over this period, these changes will be smaller than those over the entire period, so this will get us closer to stationary conditions.
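
A minimal sketch of the block-maxima step, assuming a table with `Year` and `TMAX` columns (assumed names) and using synthetic temperatures in place of the station record:

```python
import numpy as np
import pandas as pd

# Synthetic daily maximum temperatures (invented values, assumed column names)
rng = np.random.default_rng(4)
dates = pd.date_range("1990-01-01", "2023-12-31", freq="D")
doy = dates.dayofyear.to_numpy()
daily = pd.DataFrame({
    "DATE": dates,
    "TMAX": 70 - 20 * np.cos(2 * np.pi * (doy - 15) / 365.25)
            + rng.normal(0, 5, len(dates)),
})
daily["Year"] = daily["DATE"].dt.year

# Keep the 30-year period 1994 through 2023 (both end years included)
recent = daily[(daily["Year"] >= 1994) & (daily["Year"] <= 2023)]

# Block maxima: the hottest daily maximum within each calendar year
annual_max = recent.groupby("Year")["TMAX"].max()
print(annual_max.head())
print("number of blocks:", len(annual_max))
```

Each year contributes exactly one value, so a 30-year period gives 30 block maxima to fit.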

In [None]:
## calculate annual maxima for the period from 1994 through 2023



The Fisher–Tippett–Gnedenko theorem tells us that the distribution of block maxima can be described by the Generalized Extreme Value (GEV) distribution.

The GEV distribution is described by three parameters (location, scale, and shape) and the following equation:
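
For reference, one common way to write the GEV is through its cumulative distribution function. The symbols $\mu$ (location), $\sigma$ (scale), and $\xi$ (shape) are a notation choice made here, and the version shown in class may use a different but equivalent form:

$$
F(x;\mu,\sigma,\xi) = \exp\left\{-\left[1 + \xi\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi}\right\},
\qquad 1 + \xi\,\frac{x-\mu}{\sigma} > 0,\quad \sigma > 0 .
$$

In the limit $\xi \to 0$ this becomes the Gumbel distribution, $F(x) = \exp\left\{-\exp\left[-\left(x-\mu\right)/\sigma\right]\right\}$. Note that scipy's `genextreme` uses the opposite sign for the shape parameter, so its `c` corresponds to $-\xi$ in the notation above.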


GEV plot:

Just like we can calculate mean and standard deviation for a sample dataset, we can figure out which location, scale, and shape parameters best match our data.

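We will use [genextreme](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.genextreme.html#scipy.stats.genextreme) from the scipy package to estimate the GEV parameters for our data. Strictly speaking, `genextreme` is a distribution object rather than a function, and the fitting is done with its `fit()` method.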
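
Here is a hedged sketch of the fitting step and of turning the fitted parameters into return levels. The annual maxima are synthetic, drawn from a GEV with made-up parameters, so the printed values are for illustration only. A T-year event is taken here to be the value exceeded with probability 1/T in any given year.

```python
import numpy as np
from scipy.stats import genextreme

# Synthetic stand-in for 30 annual maxima (deg F), drawn from a GEV with
# invented parameters. scipy's shape parameter c has the opposite sign to
# the shape parameter used in many textbooks.
true_c, true_loc, true_scale = 0.2, 96.0, 2.5
annual_max = genextreme.rvs(true_c, loc=true_loc, scale=true_scale,
                            size=30, random_state=42)

# Maximum-likelihood fit; returns (shape, location, scale) in that order
c, loc, scale = genextreme.fit(annual_max)
print(f"shape c = {c:.3f}, location = {loc:.2f}, scale = {scale:.2f}")

# T-year return level: the quantile with non-exceedance probability 1 - 1/T
return_levels = {}
for T in [10, 20, 100]:
    return_levels[T] = genextreme.ppf(1 - 1 / T, c, loc=loc, scale=scale)
    print(f"{T}-year event: {return_levels[T]:.1f} F")
```

With only 30 values, the fitted parameters (especially the shape) can differ noticeably from the ones used to generate the data, which is one reason to look at uncertainty ranges before trusting a 100-year estimate.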

In [None]:
## load genextreme() function



In [None]:
## fit our data



In [None]:
## make a graph of a GEV distribution with these parameters



In [None]:
## what is the magnitude of a 10-year event? a 20-year event? a 100-year event?


In [None]:
## what is the uncertainty range for a 10-year event? what about a 100-year event?


In [None]:
## make a Quantile-Quantile plot



In [None]:
## make a Return Period plot

