# Vulnerability Assessment Pilot
This notebook demonstrates ongoing development of climate adaptation vulnerability assessment (CAVA) support using climate data in the Analytics Engine (AE).

To execute a given 'cell' of this notebook, place the cursor in the cell and press the 'play' icon, or simply press Shift+Enter. Some cells take longer to run; while AE is still working, you will see a [$\ast$] to the left of the cell.

**Intended Application**: As a user, I want to **<span style="color:#FF0000">access climate projections data for my vulnerability assessment report</span>** by:
1. Retrieving the data metrics required for planning needs

**Runtime**: With the default settings, this notebook takes **less than 10 minutes** to run from start to finish. Modifications to selections may increase the runtime. <br>*This notebook is currently in progress; the runtime will change as improvements and further analyses are added.*

### Step 0: Set-up

First, we'll import the Python library [climakitae](https://github.com/cal-adapt/climakitae), our AE toolkit for climate data analysis, along with the specific functions from that library that we'll use in this notebook and any other Python libraries needed for the analysis.

In [None]:
import climakitae as ck
import pandas as pd

from climakitae.explore.vulnerability import cava_data

### Step 1: Import locations
Now we'll read in the point-based locations that we want to retrieve data for. For custom inputs, there are two options: (1) input a single latitude-longitude pair; or (2) import a csv file of locations, each of which will be processed. The code below shows what each option looks like.

Functionality to assess over a gridded area (region) is in the works as well!

In [None]:
## select a single custom location
# your_lat = LAT
# your_lon = LON

To import your own custom locations, we recommend putting your csv file in the same folder as this notebook for ease:
1. Drag and drop a csv file into the file tree on the left hand side; or
2. Use the `upload` button (the "up arrow" symbol next to the large blue plus symbol above the file tree). 

<span style="color:#FF0000">**Formatting note**</span>: For the code cells below to work, there must be **2 columns labeled `lat` and `lon`**. Functionality to accept different labeling is forthcoming!
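
If your file uses different headers (for example `Latitude` and `Longitude`), one option is to rename them before running the cells below. The following is a minimal sketch, not part of climakitae: the helper name `build_latlon_rename` and the alias sets are hypothetical and should be adapted to your own files.

```python
# Hypothetical helper (not part of climakitae): map common coordinate
# header spellings onto the `lat` / `lon` names this notebook expects.
LAT_ALIASES = {"lat", "latitude", "lat_y"}
LON_ALIASES = {"lon", "long", "longitude", "lon_x"}


def build_latlon_rename(columns):
    """Return a {original_name: 'lat' | 'lon'} mapping for the given headers.

    Raises ValueError unless exactly one latitude and one longitude
    column are recognized.
    """
    lat_cols = [c for c in columns if c.strip().lower() in LAT_ALIASES]
    lon_cols = [c for c in columns if c.strip().lower() in LON_ALIASES]
    if len(lat_cols) != 1 or len(lon_cols) != 1:
        raise ValueError(
            "Expected exactly one latitude and one longitude column, "
            f"found {lat_cols} and {lon_cols}"
        )
    return {lat_cols[0]: "lat", lon_cols[0]: "lon"}


# Example with made-up headers
rename_map = build_latlon_rename(["Station", "Latitude", "Longitude"])
print(rename_map)
```

With a pandas DataFrame `df` read from your csv file, the mapping could then be applied as `df.rename(columns=rename_map)[["lat", "lon"]]`.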

In the cell below, we read the csv file in. As an example, we use locations from the HadISD station list bundled with climakitae; you'll want to replace it with your own locations file!

In [None]:
# Read in dummy locations from `stations_csv` file
from climakitae.core.paths import stations_csv_path
from climakitae.util.utils import read_csv_file
example_locs = read_csv_file(stations_csv_path, index_col=0)[['LAT_Y', 'LON_X']].rename(columns={'LAT_Y': 'lat', 'LON_X': 'lon'})

In [None]:
# select a location from the location list
one_loc = example_locs.loc[example_locs.index == 0]
loc_lat = one_loc.lat.values[0]
loc_lon = one_loc.lon.values[0]
print(loc_lat, loc_lon)

### Step 2: Retrieve metric data

Here, we list some of the options accepted by selected parameters of the `cava_data` function.

In [None]:
available_variables = [
    "Air Temperature at 2m",
    "Precipitation (total)",
    "NOAA Heat Index",
]
available_metrics = ["min", "max", "mean", "median"]
ssps = ["SSP2-4.5", "SSP3-7.0", "SSP5-8.5"]
export_method = ['off-ramp', 'calculate', 'both']
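
Before calling `cava_data`, it can help to check your selections against the option lists above. The sketch below uses a hypothetical helper, `check_selection`, which is not a climakitae function. It assumes the lists correspond to the `variable`, `metric_calc`, `ssp_data`, and `export_method` arguments, and it makes no claim about what `cava_data` itself validates.

```python
# Option lists copied from the cell above so this sketch runs on its own.
# The keys are assumed to match the `cava_data` argument names.
OPTIONS = {
    "variable": ["Air Temperature at 2m", "Precipitation (total)", "NOAA Heat Index"],
    "metric_calc": ["min", "max", "mean", "median"],
    "ssp_data": ["SSP2-4.5", "SSP3-7.0", "SSP5-8.5"],
    "export_method": ["off-ramp", "calculate", "both"],
}


def check_selection(**selections):
    """Raise ValueError if any selection is not in the listed options."""
    for name, value in selections.items():
        allowed = OPTIONS[name]
        # `ssp_data` is passed as a list, so treat every value as a list.
        values = value if isinstance(value, list) else [value]
        bad = [v for v in values if v not in allowed]
        if bad:
            raise ValueError(f"{name}: {bad} not in {allowed}")


check_selection(
    variable="Air Temperature at 2m",
    metric_calc="max",
    ssp_data=["SSP3-7.0"],
    export_method="calculate",
)
print("All selections are in the listed options.")
```

A typo such as `metric_calc="maximum"` would raise a `ValueError` naming the offending argument before any data retrieval starts.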

Now we'll run the `cava_data` function with the arguments set in the cell below; edit them to match your needs.

In [None]:
data = cava_data(
    example_locs.iloc[:1],
    time_start_year=2030,
    time_end_year=2050,
    units="degF",
    downscaling_method="Dynamical",  # default for now ## mandatory
    approach="time",  
    warming_level='3.0',
    wrf_bc=True,
    historical_data="Historical Climate",  # or "historical reconstruction"
    ssp_data=["SSP3-7.0"],
    variable="Air Temperature at 2m",  ## mandatory, must eventually accept temp, precip, or heat index
    metric_calc="max", 
    heat_idx_threshold=None, # Heat Index Threshold
    one_in_x=2, # One-in-X
    percentile=None, # Likeliness
    season="summer",
    export_method="calculate",  # off-ramp, full calculate, both
    separate_files=True, # Toggle to determine whether or not the user wants to separate climate variable information into separate files
    file_format="NetCDF",
)

---

### Appendix: Table Generation Sample Code

In [None]:
# Params dict
table_vars = {
    'Likely summer day high': {
        'variable': 'Air Temperature at 2m',
        'metric_calc': 'max',
        'season': 'summer',
        'percentile': 50,
        'heat_idx_threshold': None,
        'one_in_x': None,
    },
    'Likely summer night low': {
        'variable': 'Air Temperature at 2m',
        'metric_calc': 'min',
        'season': 'summer',
        'percentile': 50,
        'heat_idx_threshold': None,
        'one_in_x': None,
    },
    'Likely winter day high': {
        'variable': 'Air Temperature at 2m',
        'metric_calc': 'max',
        'season': 'winter',
        'percentile': 50,
        'heat_idx_threshold': None,
        'one_in_x': None,
    },
    'Likely winter night low': {
        'variable': 'Air Temperature at 2m',
        'metric_calc': 'min',
        'season': 'winter',
        'percentile': 50,
        'heat_idx_threshold': None,
        'one_in_x': None,
    },
    '1-in-2 year maximum': {
        'variable': 'Air Temperature at 2m',
        'metric_calc': 'max',
        'season': 'all',
        'percentile': None,
        'heat_idx_threshold': None,
        'one_in_x': 2,
    },
    '1-in-10 year maximum': {
        'variable': 'Air Temperature at 2m',
        'metric_calc': 'max',
        'season': 'all',
        'percentile': None,
        'heat_idx_threshold': None,
        'one_in_x': 10,
    },
    '1-in-10 year minimum': {
        'variable': 'Air Temperature at 2m',
        'metric_calc': 'min',
        'season': 'all',
        'percentile': None,
        'heat_idx_threshold': None,
        'one_in_x': 10,
    },
    'High/Extreme Heat Index': {
        'variable': 'NOAA Heat Index',
        'metric_calc': 'max',
        'season': 'all',
        'percentile': None,
        'heat_idx_threshold': 91,
        'one_in_x': None,
    }
}
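
The entries above repeat the same six keys, so a small helper can cut the repetition when you add your own rows. This is only a sketch: `make_entry` is a hypothetical name, and its defaults are taken from the most common values in the dictionary above.

```python
# Hypothetical helper (not part of climakitae): fill in the keys shared by
# every table entry so each row only states what differs.
def make_entry(metric_calc, season="all", variable="Air Temperature at 2m",
               percentile=None, heat_idx_threshold=None, one_in_x=None):
    return {
        "variable": variable,
        "metric_calc": metric_calc,
        "season": season,
        "percentile": percentile,
        "heat_idx_threshold": heat_idx_threshold,
        "one_in_x": one_in_x,
    }


# Three of the rows from the dictionary above, written compactly
compact_table_vars = {
    "Likely summer day high": make_entry("max", season="summer", percentile=50),
    "1-in-10 year minimum": make_entry("min", one_in_x=10),
    "High/Extreme Heat Index": make_entry(
        "max", variable="NOAA Heat Index", heat_idx_threshold=91
    ),
}
print(compact_table_vars["1-in-10 year minimum"])
```

Each generated entry has the same keys as the hand-written ones, so it can be used by the table-generation loop without other changes.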

In [None]:
%%time
import numpy as np
import pandas as pd
from climakitae.explore import warming_levels as WarmingLevels
from climakitae.core.data_interface import DataParameters
from climakitae.core.data_load import load
import contextlib
import io

loc_idx = 10
suppress_output = True

# Create empty df and instantiate variables
df = pd.DataFrame(columns=table_vars.keys())
lat, lon = example_locs.iloc[loc_idx]  # Oakland
months_map = {
    "winter": [12, 1, 2],
    "summer": [6, 7, 8],
    "all": np.arange(1, 13)
}
warming_levels = ['0.0', '1.5', '2.0', '3.0', '4.0'] # 0.0 = Historical Period (1980-2010)

# Create each column in the table, which is the historical period (1980-2010) and each WL (1.5, 2.0, 3.0, 4.0).
for warming_level in warming_levels:
    
    metrics = []
    preloaded_data_by_season = {}
    
    if warming_level != '0.0':
    
        # Retrieving warming level data once for each season so that it's not repeated constantly
        wl = WarmingLevels()
        wl.wl_params.timescale = "hourly"
        wl.wl_params.downscaling_method = "Dynamical"
        # wl.wl_params.variable_type = "Derived Index" if variable == "NOAA Heat Index" else "Variable"
        wl.wl_params.variable_type = "Variable"
        wl.wl_params.variable = "Air Temperature at 2m"
        wl.wl_params.latitude = (lat - 0.02, lat + 0.02)
        wl.wl_params.longitude = (lon - 0.02, lon + 0.02)
        wl.wl_params.warming_levels = [
            warming_level
        ]  # Calvin- default, only allow for 1 warming level to be passed in.
        wl.wl_params.units = "degF"
        wl.wl_params.resolution = "3 km"
        wl.wl_params.anom = "No"  # Q: When do we want this anomaly to be 'Yes'?

        print(f"\nRetrieving all warming level data by season to be cached and used in `cava_data`...\n")

        for season in months_map.keys():

            # Calculating warming levels by season
            wl.wl_params.months = months_map[season]

            print(f"\nRetrieving warming level data for {warming_level}°C in season {season}...\n") 

            wl.calculate()
    
            # Appending data to cached data variable
            preloaded_data_by_season[season] = wl.sliced_data[warming_level]      
    
    else: 
        
        # Retrieving time-based data once for each season so that it's not repeated constantly
        selections = DataParameters()
        selections.data_type = "Gridded"
        selections.downscaling_method = "Dynamical"
        selections.scenario_historical = ["Historical Climate"]
        selections.scenario_ssp = ["SSP3-7.0"]
        selections.timescale = "hourly"
        selections.variable = "Air Temperature at 2m"
        selections.variable_type = "Variable"
        selections.latitude = (
            lat - 0.02,
            lat + 0.02,
        )
        selections.longitude = (lon - 0.02, lon + 0.02)
        selections.time_slice = (1980, 2010)
        selections.resolution = "3 km"
        selections.units = 'degF'

        print(f"\nRetrieving all time-based data by season to be cached and used in `cava_data`...\n")
        
        data = load(selections.retrieve(), progress_bar=True)

        for season in months_map.keys():

            # Appending data to cached data variable
            preloaded_data_by_season[season] = data.sel(time=data.time.dt.month.isin(months_map[season]))
        
    # Calculate each variable in the table
    for key in table_vars:
        params = table_vars[key]
        
        wl_or_hist_str = "Historical Period (1980-2010)" if warming_level == '0.0' else warming_level + '°C'
        print(f"\nRetrieving {key} for Warming Level {wl_or_hist_str}...\n")

        # Optionally suppress the printed output of `cava_data`. Using a no-op
        # context when `suppress_output` is False ensures the call still runs.
        output_context = (
            contextlib.redirect_stdout(io.StringIO())
            if suppress_output
            else contextlib.nullcontext()
        )
        with output_context:
            data = cava_data(
                example_locs.iloc[loc_idx: loc_idx + 1],
                time_start_year=1980, # For historical period
                time_end_year=2010, # For historical period
                units="degF",
                downscaling_method="Dynamical",  # default for now ## mandatory
                approach="warming_level" if warming_level != '0.0' else "time",  
                warming_level=warming_level,
                wrf_bc=False,
                historical_data="Historical Climate",  # or "historical reconstruction"
                ssp_data=["SSP3-7.0"],
                variable=params['variable'],  ## mandatory, must eventually accept temp, precip, or heat index
                metric_calc=params['metric_calc'],
                heat_idx_threshold=params['heat_idx_threshold'], # Heat index
                one_in_x=params['one_in_x'], # Thresholds tools freq. counts
                percentile=params['percentile'],
                season=params['season'],
                export_method='None',  # off-ramp, full calculate, both
                separate_files=True, # Toggle to determine whether or not the user wants to separate climate variable information into separate files
                file_format="NetCDF",
                preloaded_data=preloaded_data_by_season,
            )

        # Retrieve data and average across simulation dimension
        val = data[0].mean(dim='simulation').item()
        
        # Add val to metrics to be added into row
        metrics.append(val)
        
    # Create dictionary of values to be input into DataFrame
    df.loc[warming_level] = pd.Series(dict(zip(table_vars.keys(), metrics)))
    
# Make slight modifications to DataFrame
df = df.T.rename(columns={'0.0': 'Hist. Period (1980-2010)'})
    
# Write out dataframe, keeping the index so the metric names (row labels) are saved
df.to_csv(f"final_table_{loc_idx}.csv")
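
In the loop above, the printed output of `cava_data` is hidden with `contextlib.redirect_stdout`. A general way to make that behavior switchable, so the wrapped call runs whether or not output is hidden, is sketched below using only the standard library. The names `maybe_silenced` and `noisy_step` are hypothetical, and `noisy_step` is just a stand-in for a verbose call.

```python
import contextlib
import io


def maybe_silenced(suppress):
    """Return a context manager that hides stdout only when `suppress` is True."""
    if suppress:
        return contextlib.redirect_stdout(io.StringIO())
    # nullcontext does nothing, so the wrapped call still runs and prints.
    return contextlib.nullcontext()


def noisy_step():
    # Stand-in for a verbose call such as `cava_data`.
    print("working...")
    return 42


for suppress in (True, False):
    with maybe_silenced(suppress):
        result = noisy_step()
    print(f"suppress={suppress}: result={result}")
```

Because the call sits inside a single `with` block in both cases, toggling the flag changes only what is printed, never whether the result is computed.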

In [None]:
# Look at the table
oakland_df = df
oakland_df

In [None]:
# data_history: {
#     metric_applied, 
#     threshold_applied,
# }