# Cal-Adapt Analytics Engine: Threshold Tools Application Examples

This notebook demonstrates threshold-related applications of the *climakitae* package and its *threshold_tools* module. It walks through both basic and advanced topics, including evaluating goodness of fit, calculating return values and return periods, subsetting and filtering data, and observing trends through time.

The notebook focuses on two major *threshold_tools* applications: **updating design standards** and **asset-by-asset vulnerability assessments**, which are highlighted as application examples throughout the notebook.

### Step 0: Import

#### Import necessary packages before running analysis

In [None]:
import xarray as xr
import panel as pn
pn.extension()

In [None]:
import climakitae as ck
from climakitae import threshold_tools

## Threshold Basics: Exploring Applications At County-Level

### Step 1: Select

#### Load a new application, which provides the *select* interface used to choose location, variables, and scenarios, and to designate warming levels of interest

In [None]:
app = ck.Application()

#### Call *select* to display an interface from which to select the data to examine

For this section, please select:
- timescale: "monthly"
- variable: "Air Temperature at 2m"
- units: "degC"
- resolution: "9 km"
- scenario: "SSP 3-7.0 -- Business as Usual" and "Historical Climate" 
- location subsets: "CA counties" area subset with "Sacramento County" cached area

To learn more about the data available on the Analytics Engine, [see our data catalog](https://analytics.cal-adapt.org/data/). 

In [None]:
app.select()

### Step 2: Retrieve

#### Call *app.retrieve()* to load the subset and combination of data specified

In [None]:
sacramento_ds = app.retrieve()
sacramento_ds

#### Subset data by scenario and simulation to prepare it for *threshold_tools* functions

Note: *threshold_tools* currently requires a DataArray in which only one simulation and one scenario are selected.

In [None]:
sacramento_da = sacramento_ds.sel(simulation='WRF_CNRM-ESM2-1_r1i1p1f2')

### Step 3: Transform

#### Pull Annual Maximum Series (AMS) for all grid cells

In [None]:
sacramento_ams = threshold_tools.get_ams(sacramento_da, extremes_type='max')
sacramento_ams = app.load(sacramento_ams)
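
To make the idea of an annual maximum series concrete, the following is a minimal pure-Python sketch for a single grid cell. It is not how *threshold_tools.get_ams* is implemented; the function name `annual_maxima` and the sample temperatures are hypothetical and used only for illustration.

```python
from collections import defaultdict


def annual_maxima(records):
    """Return {year: maximum value} from (year, month, value) records."""
    by_year = defaultdict(list)
    for year, _month, value in records:
        by_year[year].append(value)
    # One value per year: the largest monthly value observed in that year
    return {year: max(values) for year, values in sorted(by_year.items())}


# Hypothetical monthly temperatures (degC) for one grid cell
monthly = [
    (1980, 6, 31.2), (1980, 7, 34.8), (1980, 8, 33.9),
    (1981, 6, 30.1), (1981, 7, 33.0), (1981, 8, 35.4),
]
ams = annual_maxima(monthly)
print(ams)  # {1980: 34.8, 1981: 35.4}
```

The notebook applies the same reduction to every grid cell at once, which is why the result keeps its spatial dimensions.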

#### Subset data by time to prepare it for specific application

In [None]:
sacramento_1980_ams = sacramento_ams.sel(time=slice('1980-01-01', '2010-01-01'))

#### Calculate goodness of fit of selected distribution

<span style="color:#E47704">

**Application Example:** An electric utility in Sacramento wants to ensure that the return value and return period results they calculate are "statistically sound" and appropriate to use for their asset vulnerability assessments and design standards. 
    
For instance, is the GEV probability distribution a good fit for the data, and is it the right probability distribution for calculating return values and return periods? The example code applies a statistical goodness-of-fit test, the Kolmogorov-Smirnov (K-S) test.

In [None]:
sacramento_1980_ks = threshold_tools.get_ks_stat(sacramento_1980_ams, distr='gev', multiple_points=True)
sacramento_1980_ks
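
As a rough illustration of what the K-S test measures, the sketch below computes the K-S statistic (the largest gap between the empirical CDF of a sample and a candidate CDF) for one grid cell. It is a simplified stand-in, not the *get_ks_stat* implementation: it does not fit the distribution or compute a p-value, and the GEV parameters and sample values are hypothetical. The shape parameter `xi` follows the common climate-science sign convention, which may differ from the one used by a given library.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV cumulative distribution function (xi = 0 is the Gumbel case)."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF of sample and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # Compare the model CDF with the empirical CDF just after and just before x
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Hypothetical annual maxima (degC) and assumed GEV parameters
hypothetical_ams = [33.1, 34.0, 34.6, 35.2, 35.9, 36.4, 37.8]
d = ks_statistic(hypothetical_ams, lambda x: gev_cdf(x, mu=34.0, sigma=1.5, xi=-0.1))
print(round(d, 3))
```

A small statistic (and correspondingly large p-value) indicates that the sample is consistent with the candidate distribution.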

#### Calculate return value for a selected return period

<span style="color:#E47704">
    
**Application Example:** An electric utility planning on building new electrical equipment in Sacramento wants to calculate the value of a 1-in-20-year extreme temperature event that occurred historically (during the 1980-2010 time period) as a benchmark input for updating the design standards of new equipment.

In [None]:
sacramento_1980_rv = threshold_tools.get_return_value(sacramento_1980_ams, return_period=20, 
                                                      distr='gev', bootstrap_runs=100, 
                                                      conf_int_lower_bound=2.5, 
                                                      conf_int_upper_bound=97.5, 
                                                      multiple_points=True)
sacramento_1980_rv
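
For intuition, a return value can be read directly off a fitted GEV distribution: it is the quantile that is exceeded with probability 1/T in any given year. The sketch below uses the standard closed-form GEV quantile. It assumes the parameters are already known (the values shown are hypothetical) and is not a reproduction of *get_return_value*, which also fits the distribution and bootstraps confidence intervals.

```python
import math


def gev_return_value(return_period, mu, sigma, xi):
    """Value exceeded on average once every return_period years under a GEV."""
    if return_period <= 1:
        raise ValueError("return_period must be greater than 1 year")
    # y = -ln(F), where F = 1 - 1/T is the annual non-exceedance probability
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV quantile function
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)


# Hypothetical GEV parameters for one grid cell (degC)
rv_20 = gev_return_value(20, mu=34.0, sigma=1.5, xi=-0.1)
rv_100 = gev_return_value(100, mu=34.0, sigma=1.5, xi=-0.1)
print(round(rv_20, 2), round(rv_100, 2))
```

Longer return periods always map to larger return values, which is a quick sanity check on any result.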

#### Calculate return period for a selected return value

<span style="color:#E47704">
    
**Application Example:** An electric utility with existing electrical infrastructure in Sacramento wants to calculate the return period of a 35 degrees C temperature event that occurred historically (during the 1980-2010 time period) as a benchmark input for their recurring asset vulnerability assessment.

In [None]:
sacramento_1980_rp = threshold_tools.get_return_period(sacramento_1980_ams, return_value=35, 
                                                       distr='gev', bootstrap_runs=100, 
                                                       conf_int_lower_bound=2.5, 
                                                       conf_int_upper_bound=97.5, 
                                                       multiple_points=True)
sacramento_1980_rp
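
The return period is the inverse relationship: given a threshold, it is one over the annual probability of exceeding that threshold. The following sketch shows this for a GEV with known parameters. The parameter values are hypothetical, and the code illustrates the concept rather than the internals of *get_return_period*.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV cumulative distribution function (xi = 0 is the Gumbel case)."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support of the distribution
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_return_period(return_value, mu, sigma, xi):
    """Average number of years between exceedances of return_value."""
    exceedance = 1.0 - gev_cdf(return_value, mu, sigma, xi)
    if exceedance <= 0.0:
        # The threshold lies at or above the distribution's upper bound
        return math.inf
    return 1.0 / exceedance


# Hypothetical GEV parameters for one grid cell (degC)
rp_35 = gev_return_period(35.0, mu=34.0, sigma=1.5, xi=-0.1)
print(round(rp_35, 1))
```

Thresholds that can never be exceeded under the fitted distribution yield an infinite return period in this sketch.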

### Step 4: Visualize

#### Visualize goodness of fit of distribution

In [None]:
threshold_tools.get_geospatial_plot(sacramento_1980_ks, data_variable='p_value')

#### Visualize return value

In [None]:
threshold_tools.get_geospatial_plot(sacramento_1980_rv, data_variable='return_value')

#### Visualize return period

In [None]:
threshold_tools.get_geospatial_plot(sacramento_1980_rp, data_variable='return_period', bar_max=100)

## Threshold Advanced: Exploring Variations At County-Level

### Step 3: Transform

#### Subset data by time to prepare it for specific application

In [None]:
sacramento_2020_ams = sacramento_ams.sel(time=slice('2020-01-01', '2050-01-01'))
sacramento_2050_ams = sacramento_ams.sel(time=slice('2050-01-01', '2080-01-01'))

#### Calculate return value for a selected return period

<span style="color:#E47704">
    
**Application Example:** An electric utility planning on building new electrical equipment in Sacramento wants to calculate the value of a 1-in-20-year extreme temperature event that will occur in the future (during the 2020-50 and 2050-80 time periods) in order to:
- ensure that the planned equipment has the appropriate design standards to withstand extreme temperature events in the future,
- update the design standards for any future equipment built.

In [None]:
sacramento_2020_rv = threshold_tools.get_return_value(sacramento_2020_ams, return_period=20,
                                                distr='gev', bootstrap_runs=100, 
                                                conf_int_lower_bound=2.5, 
                                                conf_int_upper_bound=97.5, 
                                                multiple_points=True)
sacramento_2020_rv

In [None]:
sacramento_2050_rv = threshold_tools.get_return_value(sacramento_2050_ams, return_period=20,
                                                distr='gev', bootstrap_runs=100, 
                                                conf_int_lower_bound=2.5, 
                                                conf_int_upper_bound=97.5, 
                                                multiple_points=True)
sacramento_2050_rv
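
The *bootstrap_runs*, *conf_int_lower_bound*, and *conf_int_upper_bound* arguments suggest a percentile bootstrap for the confidence interval. The sketch below shows that general technique under stated assumptions: it resamples the annual maxima with replacement and takes percentiles of the resulting estimates. To stay dependency-free it uses an empirical 95th percentile as the statistic, whereas the package presumably refits the chosen distribution on each resample. All names and data here are hypothetical.

```python
import random


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in 0..100) of pre-sorted values."""
    if not sorted_values:
        raise ValueError("percentile of an empty sequence is undefined")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def bootstrap_interval(sample, statistic, runs=100, lower=2.5, upper=97.5, seed=0):
    """Percentile bootstrap confidence interval for statistic(sample)."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    estimates = sorted(
        statistic(rng.choices(sample, k=len(sample))) for _ in range(runs)
    )
    return percentile(estimates, lower), percentile(estimates, upper)


def p95(sample):
    """Stand-in statistic: empirical 95th percentile of the sample."""
    return percentile(sorted(sample), 95)


# Hypothetical annual maxima (degC) for one grid cell
hypothetical_ams = [33.1, 34.0, 34.6, 35.2, 35.9, 36.4, 37.8, 35.0, 34.2, 36.9]
ci_low, ci_high = bootstrap_interval(hypothetical_ams, p95, runs=100)
print(round(ci_low, 2), round(ci_high, 2))
```

With only 100 runs the interval bounds are themselves noisy; increasing the number of runs trades computation time for more stable bounds.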

#### Calculate return period for a selected return value

<span style="color:#E47704">
    
**Application Example:** An electric utility with existing electrical infrastructure in Sacramento wants to calculate the return period of a 33 degrees C temperature event that will occur in the future (during the 2020-50 and 2050-80 time periods) in order to:
- understand whether the existing infrastructure will be impacted by more frequently occurring extreme temperature events in the future,
- complete a more robust asset-by-asset vulnerability assessment.
    
For the utility, 33 degrees C (hypothetically) represents a county-wide historic average for a 1-in-20-year extreme temperature event (during the 1980-2010 time period).

In [None]:
sacramento_1980_rp = threshold_tools.get_return_period(sacramento_1980_ams, return_value=33,
                                                       distr='gev', bootstrap_runs=100, 
                                                       conf_int_lower_bound=2.5, 
                                                       conf_int_upper_bound=97.5, 
                                                       multiple_points=True)
sacramento_1980_rp

In [None]:
sacramento_2020_rp = threshold_tools.get_return_period(sacramento_2020_ams, return_value=33,
                                                       distr='gev', bootstrap_runs=100, 
                                                       conf_int_lower_bound=2.5, 
                                                       conf_int_upper_bound=97.5, 
                                                       multiple_points=True)
sacramento_2020_rp

In [None]:
sacramento_2050_rp = threshold_tools.get_return_period(sacramento_2050_ams, return_value=33,
                                                       distr='gev', bootstrap_runs=100, 
                                                       conf_int_lower_bound=2.5, 
                                                       conf_int_upper_bound=97.5, 
                                                       multiple_points=True)
sacramento_2050_rp
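
When comparing return periods across time windows, it can help to convert them into probabilities. The sketch below turns a return period into an annual exceedance probability and into the chance of at least one exceedance over an asset's service life. It assumes exceedances are independent from year to year and that the return period is constant within the window. The return periods in the example are hypothetical placeholders, not results from this notebook.

```python
def annual_exceedance_probability(return_period):
    """Probability that the threshold is exceeded in any single year."""
    return 1.0 / return_period


def lifetime_exceedance_probability(return_period, years):
    """Probability of at least one exceedance in `years` independent years."""
    return 1.0 - (1.0 - 1.0 / return_period) ** years


# Hypothetical return periods (years) for the same threshold in three windows
hypothetical_periods = {"1980-2010": 20.0, "2020-2050": 8.0, "2050-2080": 3.0}
for window, rp in hypothetical_periods.items():
    aep = annual_exceedance_probability(rp)
    risk_30 = lifetime_exceedance_probability(rp, years=30)
    print(f"{window}: annual {aep:.1%}, over 30 years {risk_30:.1%}")
```

A shrinking return period for a fixed threshold means the same event is expected more often, which is the signal an asset vulnerability assessment is looking for.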

### Step 4: Visualize

#### Visualize return value

In [None]:
threshold_tools.get_geospatial_plot(sacramento_1980_rv, data_variable='return_value',
                                    bar_min=25, bar_max=50)

In [None]:
threshold_tools.get_geospatial_plot(sacramento_2020_rv, data_variable='return_value',
                                    bar_min=25, bar_max=50)

In [None]:
threshold_tools.get_geospatial_plot(sacramento_2050_rv, data_variable='return_value',
                                    bar_min=25, bar_max=50)

#### Visualize return period

In [None]:
threshold_tools.get_geospatial_plot(sacramento_1980_rp, data_variable='return_period', 
                                    bar_min=1, bar_max=100)

In [None]:
threshold_tools.get_geospatial_plot(sacramento_2020_rp, data_variable='return_period',
                                    bar_min=1, bar_max=100)

In [None]:
threshold_tools.get_geospatial_plot(sacramento_2050_rp, data_variable='return_period',
                                    bar_min=1, bar_max=100)

### Step 5: Export

To export the threshold tools variables, we recommend NetCDF file format, which will work with any number of variables and dimensions in your dataset. 
If you would like to save data as a GeoTIFF or CSV file and the dataset contains scenarios or simulations, additionally provide arguments specifying the scenario (scenario="historical") and the simulation (simulation="cesm2").
- CSV and GeoTIFF can only be used for data arrays with one variable
- CSV works best for up to 2-dimensional data (e.g., lon x lat), and will be compressed and exported with a separate metadata file
- GeoTIFF can accept 3 dimensions total:
    - X and Y dimensions are required
    - The third dimension is flexible and will be a "band" in the file: time, simulation, or scenario could go here
    - Metadata will be accessible as "tags" in the .tif

To export as a GeoTIFF or CSV file, please subset the data with your desired variable first, then select either CSV or GeoTIFF as your format (NetCDF will also work).
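
The format rules above can be summarized in a small helper. This is a hypothetical sketch and not part of *climakitae*: the function name `allowed_formats` and the dimension names "x" and "y" are assumptions, and CSV is treated as limited to two dimensions even though the guidance only says it works best there.

```python
def allowed_formats(n_variables, dims):
    """Suggest export formats for a dataset, following the rules listed above."""
    formats = ["NetCDF"]  # works with any number of variables and dimensions
    if n_variables == 1:
        if len(dims) <= 2:
            formats.append("CSV")  # best suited to at most 2-D data
        if len(dims) <= 3 and {"x", "y"} <= set(dims):
            formats.append("GeoTIFF")  # x and y required, plus one optional band
    return formats


print(allowed_formats(2, ("x", "y")))          # ['NetCDF']
print(allowed_formats(1, ("x", "y")))          # ['NetCDF', 'CSV', 'GeoTIFF']
print(allowed_formats(1, ("time", "x", "y")))  # ['NetCDF', 'GeoTIFF']
```

Selecting a single variable first, as described above, is what moves a multi-variable dataset from the first case to the second.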

In [None]:
app.export_as()

Next, pass the object you wish to export and your desired filename (in single or double quotation marks) to *app.export_dataset*.

In [None]:
app.export_dataset(sacramento_2050_rp, 'my_filename_1')

An example of subsetting is below, for exporting to a CSV or GeoTIFF.

In [None]:
sacramento_1980_rp_variable = sacramento_1980_rp['return_period']

In [None]:
app.export_as()

In [None]:
app.export_dataset(sacramento_1980_rp_variable, 'my_filename_2')