# Calculating Extreme Heat data using daily projections from Cal-Adapt API

This notebook walks through how we calculate Extreme Heat Days for an area. We begin by getting observed daily maximum temperature (`tasmax`) data for the area. Using a baseline period of 1961-1990, we calculate the 98th percentile of all daily maximum temperatures from April through October. This value is the Extreme Heat Threshold for the area. We then get projected daily data for the same area; the example below uses one model (HadGEM2-ES) under one scenario (RCP 8.5), and the same steps apply to other models. In the projected data, every day with tasmax above the Extreme Heat Threshold counts as an Extreme Heat Day. You can modify this approach for your needs.

**We recommend using decadal or multi-decadal averages when reporting Extreme Heat Day counts for future time periods.**

Begin by importing the Python modules we will need.

In [1]:
import requests 
import numpy as np
import pandas as pd
from datetime import datetime

Define some functions and values that we use later in the code.

In [2]:
# Convert value from degrees Celsius to degrees Fahrenheit (used for observed data)
def celsius_to_F(val):
    return val * 9/5 + 32 

# Convert value from Kelvin to degrees Fahrenheit (used for modeled data)
def kelvin_to_F(val):
    return  (val - 273.15) * 9/5 + 32

# Request header
headers = {'ContentType': 'json'}
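As a quick sanity check, the two conversion helpers can be tested against well-known reference points (the freezing and boiling points of water). This is only an illustrative sketch; the helpers are repeated here so the snippet runs on its own.

```python
# Conversion helpers repeated from the cell above so this snippet is self-contained
def celsius_to_F(val):
    return val * 9 / 5 + 32

def kelvin_to_F(val):
    return (val - 273.15) * 9 / 5 + 32

# Reference points: water freezes at 0 C / 273.15 K and boils at 100 C / 373.15 K
for celsius in (0.0, 100.0):
    kelvin = celsius + 273.15
    from_c = celsius_to_F(celsius)
    from_k = kelvin_to_F(kelvin)
    # Both routes should agree up to floating point rounding
    print(celsius, round(from_c, 2), round(from_k, 2))
```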

### You can get data for a point or an area of your interest

If you are requesting data for a point or polygon, use the `g` parameter to specify the geometry. Geometry can be written in various formats, including WKT, GeoJSON, and KML. The examples below use WKT (Well Known Text) format.

In [3]:
# Uncomment the following lines to get data for a point location
#point = 'POINT(-121.4687 38.5938)'
#params = {'g': point}

# Uncomment the following lines to get data for a polygon
#polygon = 'POLYGON ((-123.35449 39.09596, -122.27783 39.09596, -122.27783 39.97712, -123.35449 39.97712, -123.35449 39.09596))' 
#params = {'g': polygon, 'stat': 'mean'}
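If you have coordinates rather than a ready-made WKT string, a small helper can build one. The sketch below is one possible way to do it; `point_wkt` and `bbox_wkt` are hypothetical helper names, not part of the Cal-Adapt API. Note that WKT lists longitude before latitude, and a polygon ring must end on the same coordinate it starts with.

```python
def point_wkt(lon, lat):
    # WKT order is x (longitude) then y (latitude)
    return 'POINT(%s %s)' % (lon, lat)

def bbox_wkt(min_lon, min_lat, max_lon, max_lat):
    # Walk the four corners and repeat the first one to close the ring
    corners = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),
    ]
    ring = ', '.join('%s %s' % (lon, lat) for lon, lat in corners)
    return 'POLYGON ((%s))' % ring

# Rebuild the point and polygon used in the commented-out examples above
point = point_wkt(-121.4687, 38.5938)
polygon = bbox_wkt(-123.35449, 39.09596, -122.27783, 39.97712)
print(point)
print(polygon)
```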

### If you want to use polygon geometry from Cal-Adapt API, it's a 2-step process

- First get a polygon from a boundary layer in the API (e.g. counties, census tracts, places, etc.) that intersects your point of interest. Build a string that references the id of the polygon.
- Then use the `ref` param instead of the `g` param to request data.

[Complete list of boundaries in Cal-Adapt API](https://berkeley-gif.github.io/caladapt-docs/data-catalog.html#vector-data). **Note: Requests might time out if the polygon is too large**. Subsetting the daily data has been tested with counties, census tracts, places and hydrounits. If you need data for a large boundary, we recommend downloading the daily rasters and processing the data locally. 

In [4]:
# Your point of interest
point = 'POINT(-121.4687 38.5938)'
# Name of boundary layer in API
resource = 'counties'
# Request url
url = 'http://api.cal-adapt.org/api/%s/' % resource
# Request params to find intersecting boundaries
params = {'intersects': point, 'srs': 4326, 'simplify': .0001, 'precision': 4}
ref = ''

# Get geometry
response = requests.get(url, params=params, headers=headers)
if response.ok:
    data = response.json()
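    features = data['features']
    # Check for an empty list before indexing to avoid an IndexError
    if features:
        # Use the first intersecting polygon
        ref = '/api/%s/%s/' % (resource, features[0]['id'])
        print(ref)
    else:
        print('Did not find any polygons that intersect your point')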

params = {'ref': ref, 'stat': 'mean'}

/api/counties/34/
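The step that turns the response into a `ref` string can be separated from the network call, which makes it easy to try without hitting the API. The sketch below assumes the response has the shape used in the cell above (a `features` list whose items carry an `id`); `build_ref` is a hypothetical helper name and the sample dictionaries are made up for illustration.

```python
def build_ref(resource, feature_collection):
    # Return a reference string for the first feature, or None if there is none
    features = feature_collection.get('features', [])
    if not features:
        return None
    return '/api/%s/%s/' % (resource, features[0]['id'])

# Made-up responses: one with a single intersecting polygon, one with none
sample_hit = {'features': [{'id': 34, 'properties': {}}]}
sample_miss = {'features': []}

print(build_ref('counties', sample_hit))
print(build_ref('counties', sample_miss))
```

Returning `None` when nothing intersects lets the caller decide whether to stop or fall back to the `g` parameter, instead of sending a request with an empty `ref`.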


### 1. Get observed daily tasmax

Daily data is stored as a multiband raster, so the observed daily timeseries (1950-2013) has 23376 bands, with each band corresponding to one day, starting from 1950-01-01.
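The band count follows directly from the calendar: 64 years of 365 days plus 16 leap days. A short check with `pd.date_range`, which is also how the cell below maps bands to dates:

```python
import pandas as pd

# One entry per day of the observed record, both endpoints included
dates = pd.date_range('1950-01-01', '2013-12-31', freq='D')
print(len(dates))

# Zero-based band position of a given day, e.g. the first day of the baseline warm season
band_index = dates.get_loc(pd.Timestamp('1961-04-01'))
print(band_index, dates[band_index].date())
```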

In [5]:
# Request url
url = 'http://api.cal-adapt.org/api/series/tasmax_day_livneh/' + 'rasters/'

# Make request
response = requests.get(url, params=params, headers=headers)

# Variable stores observed daily data in a Pandas dataframe
observedDF = None

if response.ok:
    json = response.json()
    data = json['results'][0]
    
    # Multiband raster data is returned by the API as a 3D array having a shape (23376, 1, 1)
    # Flatten the 3D array into a 1D array
    values_arr = np.array(data['image'])
    values_arr = values_arr.flatten()
    
    # Get start date of timeseries
    start_date = datetime.strptime(data['event'], '%Y-%m-%d')
    
    # Get total number of values -> number of days
    length = len(values_arr)
    
    # Create new pandas dataframe and map each value in list to a date index
    observedDF = pd.DataFrame(values_arr,
        index=pd.date_range(start_date, freq='1D', periods=length),
        columns=['value'])
    
    # Convert celsius to Fahrenheit
    observedDF.value = observedDF.value.apply(lambda x: celsius_to_F(x))

Explore the data. The API returns observed data in degrees Celsius; the values below have already been converted to degrees Fahrenheit.

In [6]:
print(observedDF.head())
print()
print(observedDF.tail())

                value
1950-01-01  47.537341
1950-01-02  43.417568
1950-01-03  46.940557
1950-01-04  43.705381
1950-01-05  45.215155

                value
2013-12-27  61.920082
2013-12-28  65.755937
2013-12-29  64.220372
2013-12-30  64.782641
2013-12-31  64.359176


### 2. Calculate Extreme Heat Threshold

For the Cal-Adapt Extreme Heat Tool, Extreme Heat Threshold is the 98th percentile of historical maximum temperatures between April 1 and October 31 based on observed daily temperature data from 1961–1990.

In [7]:
# Filter years
baselineDF = observedDF.loc['1961-01-01':'1990-12-31']

# Filter months
baselineDF = baselineDF[(baselineDF.index.month >= 4) & (baselineDF.index.month <= 10)]

print(baselineDF.head())
print()
print(baselineDF.tail())

                value
1961-04-01  73.500947
1961-04-02  78.862910
1961-04-03  83.878230
1961-04-04  83.873029
1961-04-05  79.941237

                value
1990-10-27  83.352703
1990-10-28  82.617298
1990-10-29  77.272040
1990-10-30  73.994929
1990-10-31  71.551565


In [8]:
threshold = baselineDF['value'].quantile(0.98, interpolation='linear')
print('Extreme Heat Threshold value is', round(threshold, 1), 'degrees Fahrenheit', sep = ' ')

Extreme Heat Threshold value is 102.7 degrees Fahrenheit
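To see what the 98th percentile implies, the sketch below repeats the same filtering and `quantile` call on synthetic data (random values, not Cal-Adapt data). By construction, about 2% of baseline warm-season days exceed the threshold, which is roughly 4.3 days per year given 214 days from April 1 through October 31.

```python
import numpy as np
import pandas as pd

# Synthetic daily "temperatures" for the baseline period (assumption: normal noise)
rng = np.random.default_rng(0)
idx = pd.date_range('1961-01-01', '1990-12-31', freq='D')
synthetic = pd.Series(rng.normal(80, 12, len(idx)), index=idx)

# Keep April through October, as in the cells above
warm = synthetic[(synthetic.index.month >= 4) & (synthetic.index.month <= 10)]

# Same threshold definition as the notebook
demo_threshold = warm.quantile(0.98, interpolation='linear')

# Share of baseline warm-season days above the threshold, and days per year
share_above = (warm > demo_threshold).mean()
days_per_year = (warm > demo_threshold).sum() / 30
print(len(warm), round(share_above, 4), round(days_per_year, 1))
```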


### 3. Get modeled data

For this example we will get modeled projections for one scenario (RCP 8.5) and one model (HadGEM2-ES).
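To repeat the analysis for other models or scenarios, the request URL can be built from its parts. The sketch below assumes that other series follow the same naming pattern as the HadGEM2-ES slug used in this notebook; only that one slug comes from this notebook, and the other model names are assumptions that should be checked against the Cal-Adapt data catalog. `series_url` is a hypothetical helper name.

```python
BASE_URL = 'http://api.cal-adapt.org/api/series/'

def series_url(variable, period, model, scenario):
    # Pattern inferred from 'tasmax_day_HadGEM2-ES_rcp85' (an assumption for other series)
    slug = '%s_%s_%s_%s' % (variable, period, model, scenario)
    return BASE_URL + slug + '/rasters/'

# Only HadGEM2-ES is confirmed by this notebook; the rest are assumed names
models = ['HadGEM2-ES', 'CNRM-CM5', 'CanESM2', 'MIROC5']
urls = {model: series_url('tasmax', 'day', model, 'rcp85') for model in models}

for model, model_url in urls.items():
    print(model, model_url)
```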

In [9]:
# Request url
url = 'http://api.cal-adapt.org/api/series/tasmax_day_HadGEM2-ES_rcp85/' + 'rasters/'

# Make request
response = requests.get(url, params=params, headers=headers)

# Variable stores modeled daily data in a Pandas dataframe
modeledDF = None

if response.ok:
    json = response.json()
    data = json['results'][0]
    
    # Multiband raster data is returned by the API as a 3D array having a shape (number of days, 1, 1)
    # Flatten the 3D array into a 1D array
    values_arr = np.array(data['image'])
    values_arr = values_arr.flatten()
    
    # Get start date of timeseries
    start_date = datetime.strptime(data['event'], '%Y-%m-%d')
    
    # Get total number of values -> number of days
    length = len(values_arr)
    
    # Create new pandas dataframe and map each value in list to a date index
    modeledDF = pd.DataFrame(values_arr,
        index=pd.date_range(start_date, freq='1D', periods=length),
        columns=['value'])
    
    # Convert Kelvin to Fahrenheit
    modeledDF.value = modeledDF.value.apply(lambda x: kelvin_to_F(x))
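The parsing steps for observed and modeled data are the same apart from the unit conversion, so they could be factored into one function. The sketch below is one way to do that; `result_to_frame` is a hypothetical helper name, and `fake_result` is a made-up three-day result that mimics the `event` and `image` keys used above.

```python
import numpy as np
import pandas as pd

def kelvin_to_F(val):
    return (val - 273.15) * 9 / 5 + 32

def result_to_frame(result, convert):
    # Flatten the (n_days, 1, 1) array into a 1D array of daily values
    values = np.array(result['image'], dtype=float).flatten()
    # One date per value, starting at the series start date
    index = pd.date_range(result['event'], freq='D', periods=len(values))
    # Apply the unit conversion to the whole array at once
    return pd.DataFrame({'value': convert(values)}, index=index)

# Made-up result: three days of values in Kelvin
fake_result = {'event': '2006-01-01', 'image': [[[273.15]], [[283.15]], [[293.15]]]}
demo_df = result_to_frame(fake_result, kelvin_to_F)
print(demo_df)
```

Passing the conversion function as an argument keeps the helper usable for both the observed series (Celsius) and the modeled series (Kelvin).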

In [10]:
print(modeledDF.head())
print()
print(modeledDF.tail())

                value
2006-01-01  54.075602
2006-01-02  51.549680
2006-01-03  51.462877
2006-01-04  51.992432
2006-01-05  45.609325

                value
2099-12-27  56.904231
2099-12-28  61.793922
2099-12-29  61.170569
2099-12-30  62.885722
2099-12-31  64.597578


### 4. Calculate extreme heat days

In [11]:
# Filter years
filteredDF = modeledDF.loc['2070-01-01':'2099-12-31']

# Filter months
filteredDF = filteredDF[(filteredDF.index.month >= 4) & (filteredDF.index.month <= 10)]

# Filter days > threshold
filteredDF = filteredDF[filteredDF.value > threshold]

print(filteredDF.head())
print()
print(filteredDF.tail())

                 value
2070-05-13  104.988558
2070-05-14  104.404348
2070-05-15  103.318491
2070-05-20  107.368962
2070-05-21  107.779928

                 value
2099-09-16  108.722550
2099-09-17  107.535428
2099-09-18  105.261943
2099-09-19  103.908101
2099-09-20  104.712273


Count of extreme heat days by year (April through October)

In [12]:
filteredDF.value.resample('1AS').count()

2070-01-01    55
2071-01-01    32
2072-01-01    67
2073-01-01    36
2074-01-01    33
2075-01-01    53
2076-01-01    49
2077-01-01    63
2078-01-01    65
2079-01-01    41
2080-01-01    57
2081-01-01    44
2082-01-01    52
2083-01-01    63
2084-01-01    71
2085-01-01    57
2086-01-01    49
2087-01-01    53
2088-01-01    54
2089-01-01    53
2090-01-01    70
2091-01-01    51
2092-01-01    70
2093-01-01    71
2094-01-01    78
2095-01-01    72
2096-01-01    68
2097-01-01    41
2098-01-01    77
2099-01-01    43
Freq: AS-JAN, Name: value, dtype: int64
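Because the count above is taken after non-extreme days have been filtered out, a year with no extreme heat days at the very start or end of the period would not appear in the result, and a period average computed from it would be too high. One possible variation is to count a boolean flag per year instead of filtering first. The sketch below uses made-up data to show the idea.

```python
import pandas as pd

# Made-up daily series for 2070-2073: mostly 90 F, with a few hot days
idx = pd.date_range('2070-01-01', '2073-12-31', freq='D')
series = pd.Series(90.0, index=idx)
series.loc['2071-07-01':'2071-07-03'] = 105.0
series.loc[pd.Timestamp('2072-08-10')] = 110.0
demo_threshold = 102.7

# Keep April through October, then flag days above the threshold
warm = series[(series.index.month >= 4) & (series.index.month <= 10)]
is_extreme = warm > demo_threshold

# Summing the flags per year keeps years with zero extreme heat days
counts = is_extreme.groupby(is_extreme.index.year).sum()
print(counts)
print(counts.mean())
```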

Count of extreme heat days averaged over 2070-2099 (April through October)

In [13]:
filteredDF.value.resample('1AS').count().mean()

56.266666666666666
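Following the recommendation at the top of this notebook, the yearly counts can also be summarized by decade. The sketch below reuses the yearly counts printed above and groups them by decade with plain Python.

```python
# Yearly extreme heat day counts for 2070-2099, copied from the output above
counts = [55, 32, 67, 36, 33, 53, 49, 63, 65, 41,
          57, 44, 52, 63, 71, 57, 49, 53, 54, 53,
          70, 51, 70, 71, 78, 72, 68, 41, 77, 43]
yearly_counts = dict(zip(range(2070, 2100), counts))

# Group counts by the first year of their decade (2070, 2080, 2090)
by_decade = {}
for year, count in yearly_counts.items():
    by_decade.setdefault(year // 10 * 10, []).append(count)

# Average count per decade
decadal_means = {decade: sum(vals) / len(vals) for decade, vals in by_decade.items()}
for decade in sorted(decadal_means):
    print('%ds: %.1f' % (decade, decadal_means[decade]))
# prints:
# 2070s: 49.4
# 2080s: 55.3
# 2090s: 64.1
```

The three decadal means average to the 30-year mean of about 56.3 days shown above.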