## Cal-CRAI Supplementary Layer: Sea Level Rise Impact Area

In the SLR-focused scenario, inland census tracts and counties must be flagged as "improbable" or "invulnerable" to this particular risk. For our purposes, we define the area vulnerable to SLR using the Cal-CRAI metric `percent change in wetland habitat under RCP 4.5 at median model SLR per county`. From this metric we create a binary layer, **At risk to SLR = 1** and **Not at risk to SLR = 0**, in which a census tract is at risk when its county has a non-missing percent change in wetland habitat due to SLR.

This "improbability" layer masks all inland counties in the SLR-focused scenario analysis so that inland values do not skew the overall distribution of values. The first half of this notebook mirrors the metric calculation for the input layer.
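
To make the masking idea concrete, here is a minimal sketch on toy data. The column `hypothetical_slr_score` and all values are made up for illustration, and the way the scenario analysis actually applies the mask may differ from this.

```python
import pandas as pd

# Toy tract-level table; the score column and its values are hypothetical.
tracts = pd.DataFrame({
    "GEOID": ["06001000100", "06001000200", "06019000100", "06019000200"],
    "slr_impacted": [1, 1, 0, 0],
    "hypothetical_slr_score": [0.8, 0.6, 0.0, 0.0],
})

# Keep scores only where the mask is 1; inland tracts become NaN.
masked = tracts["hypothetical_slr_score"].where(tracts["slr_impacted"] == 1)

# pandas skips NaN by default, so inland zeros no longer pull the mean down.
unmasked_mean = tracts["hypothetical_slr_score"].mean()
masked_mean = masked.mean()
print(unmasked_mean, masked_mean)
```

In this toy case the inland zeros halve the unmasked mean, which is the kind of skew the mask is meant to prevent.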

In [2]:
import os
import sys
import pandas as pd
import io
import numpy as np
import geopandas as gpd

sys.path.append(os.path.expanduser('../../'))
from scripts.utils.file_helpers import pull_csv_from_directory, upload_csv_aws
from scripts.utils.calculate_index import add_census_tracts

## Step 1: Pull data and process metric

In [None]:
bucket_name = 'ca-climate-index'
aws_dir = '1_pull_data/climate_risk/sea_level_rise/loss/climate_central/'
output_folder = '../data_metric_calc'

pull_csv_from_directory(bucket_name, aws_dir, output_folder=output_folder, search_zipped=False)

In [4]:
wetland_data = pd.read_csv('RCP_wetland_data.csv')

In [None]:
# Drop the first 22 rows, which precede the header row
adjusted_wetland_data = wetland_data[22:].copy()

# Promote the first remaining row to the header, then drop it and reset the index
adjusted_wetland_data.columns = adjusted_wetland_data.iloc[0]
adjusted_wetland_data = adjusted_wetland_data[1:].reset_index(drop=True)

# Clear the column-axis name left over from the promoted row
adjusted_wetland_data.columns.name = None

# Filter columns explicitly
columns_to_keep = [col for col in adjusted_wetland_data.columns 
                    if 'County' in col or '2000' in col or '2100' in col]
adjusted_wetland_data = adjusted_wetland_data[columns_to_keep]
adjusted_wetland_data.columns
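
As an aside, when a CSV has a fixed-length preamble, `pd.read_csv` can skip it at read time so the header row is parsed directly and numeric columns get inferred dtypes. The sketch below uses a made-up three-line preamble and made-up column names; the layout of `RCP_wetland_data.csv` is not assumed here, so the offset would need to be checked against the real file.

```python
import io
import pandas as pd

# Hypothetical file contents: three preamble lines, then the real header.
raw = (
    "title line\n"
    "note a\n"
    "note b\n"
    "County,RCP_2000,RCP_2100\n"
    "Alameda,10,8\n"
    "Marin,5,4\n"
)

# skiprows=3 discards the preamble; the next line becomes the header.
toy = pd.read_csv(io.StringIO(raw), skiprows=3)
print(toy.dtypes)
```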

In [6]:
# Function to calculate percent change between 2000 and 2100 columns
def calculate_percent_change(data, leave_alone=()):
    # Convert columns to numeric, forcing non-numeric to NaN (skip columns in leave_alone)
    numeric_data = data.copy()
    for col in data.columns:
        if col not in leave_alone:
            numeric_data[col] = pd.to_numeric(data[col], errors='coerce')
    
    # Define columns for 2000 and 2100
    cols_2000 = [col for col in numeric_data.columns if '2000' in col]
    cols_2100 = [col for col in numeric_data.columns if '2100' in col]
    
    # Calculate percent change
    percent_change = pd.DataFrame()

    for col_2000 in cols_2000:
        # Find the matching 2100 column
        col_2100 = col_2000.replace('2000', '2100')

        if col_2100 in cols_2100:
            # Percent change relative to 2000; NaN inputs propagate to NaN
            percent_change[col_2000 + '_to_' + col_2100] = (
                (numeric_data[col_2100] - numeric_data[col_2000]) / numeric_data[col_2000]
            ) * 100
    
    # Concatenate the percent change DataFrame with the original numeric data
    result = pd.concat([numeric_data, percent_change], axis=1)
    
    return result

# Function to rename columns, allowing some to be left unchanged
def rename_columns(data, leave_alone=()):
    def rename_column(col):
        if col in leave_alone:
            return col
        words = col.split('_')
        return '_'.join(words[:4]) + '_percent_change'
    
    # Apply renaming function to columns
    data.columns = [rename_column(col) for col in data.columns]
    return data

# List of columns to leave unchanged
column_leave_alone = ['County']
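
Because the binary layer later depends on whether the percent change is missing, it helps to see how the formula behaves at the edges. The sketch below repeats the same arithmetic on a toy frame. The column names are inferred from the final metric name used in this notebook and the values are invented. Whether zero-baseline rows occur in the real data is not known here.

```python
import numpy as np
import pandas as pd

# Toy input as strings, as it would be after promoting a data row to the header.
toy = pd.DataFrame({
    "County": ["A", "B", "C", "D"],
    "RCP_4.5__50th_2000": ["100", "0", "n/a", "0"],
    "RCP_4.5__50th_2100": ["150", "0", "20", "10"],
})

start = pd.to_numeric(toy["RCP_4.5__50th_2000"], errors="coerce")
end = pd.to_numeric(toy["RCP_4.5__50th_2100"], errors="coerce")

# Same formula as calculate_percent_change: (2100 - 2000) / 2000 * 100
pct = (end - start) / start * 100
print(pct)
```

County A gives an ordinary value, B (0 to 0) and C (unparseable baseline) give NaN, and D (0 to 10) gives `inf`. Under pandas' default settings `inf` is not treated as missing, so a row like D would still count as non-missing in a null check, while a row like B would not.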

In [None]:
# Run the calculation and renaming
adjusted_wetland_metric = calculate_percent_change(adjusted_wetland_data, leave_alone=column_leave_alone)

# Filter for the 'County' column and the percent-change columns (named '<2000 col>_to_<2100 col>')
filtered_columns = [col for col in adjusted_wetland_metric.columns if 'County' in col or '_to_' in col]

# Create a new DataFrame with only the filtered columns
filtered_wetland_metric = adjusted_wetland_metric[filtered_columns]
# Remove duplicate columns
filtered_wetland_metric = filtered_wetland_metric.loc[:, ~filtered_wetland_metric.columns.duplicated()]

wetland_metric_percent_change = rename_columns(filtered_wetland_metric, leave_alone=column_leave_alone)

wetland_metric_percent_change.columns = wetland_metric_percent_change.columns.str.lower()
wetland_metric_percent_change = wetland_metric_percent_change.applymap(lambda s: s.lower() if isinstance(s, str) else s)

# Display the resulting DataFrame
wetland_metric_percent_change.head(3)

In [None]:
# Read in the California tract-to-county lookup table
ca_tract_county_path = "s3://ca-climate-index/0_map_data/ca_tracts_county.csv"
ca_tract_county = gpd.read_file(ca_tract_county_path)
ca_tract_county = ca_tract_county.drop(columns=['field_1'])
ca_tract_county.columns = ca_tract_county.columns.str.lower()
ca_tract_county = ca_tract_county.applymap(lambda s: s.lower() if isinstance(s, str) else s)

ca_tract_county.head(3)

In [None]:
# Left join keeps every tract; counties without a wetland value get NaN
wetland_metric_merge = pd.merge(ca_tract_county, wetland_metric_percent_change, on='county', how='left')
final_columns = ['tract', 'county', 'rcp_4.5__50th_percent_change']
# Rename 'tract' to 'GEOID' (rename returns a new DataFrame, avoiding assignment on a slice)
wetland_metric_final = wetland_metric_merge[final_columns].rename(columns={'tract': 'GEOID'})
wetland_metric_final
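
Since a left join turns every unmatched county into NaN, a county-name mismatch (stray whitespace, different spelling) would be indistinguishable from a genuinely inland county. One way to check is the `indicator` option of `pd.merge`. The sketch below uses invented tracts and values.

```python
import pandas as pd

# Toy lookup and metric tables; GEOIDs and percent changes are made up.
tract_county = pd.DataFrame({
    "tract": ["06001000100", "06019000100", "06075000100"],
    "county": ["alameda", "fresno", "san francisco"],
})
county_metric = pd.DataFrame({
    "county": ["alameda", "san francisco"],
    "rcp_4.5__50th_percent_change": [-12.5, -30.0],
})

# indicator=True adds a '_merge' column recording where each row came from.
merged = pd.merge(tract_county, county_metric, on="county", how="left", indicator=True)
unmatched = merged.loc[merged["_merge"] == "left_only", "county"].unique().tolist()
print(unmatched)
```

Reviewing the unmatched list against the expected set of inland counties is a cheap sanity check before the mask is built.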

In [None]:
gdf = add_census_tracts(wetland_metric_final)
gdf.head(5)

## Step 2: Identify SLR regions and create binary layer

In [None]:
# counties not impacted by SLR
no_impact_counties = gdf.loc[gdf['rcp_4.5__50th_percent_change'].isnull()]['county'].unique().tolist()
print('# of non-impacted census tracts: ', len(gdf.loc[gdf['rcp_4.5__50th_percent_change'].isnull()]))
no_impact_counties

In [None]:
# SLR-impacted counties: non-missing percent change, matching the binary layer definition
impact_counties = gdf.loc[gdf['rcp_4.5__50th_percent_change'].notnull()]['county'].unique().tolist()
print('# of SLR impacted census tracts: ', len(gdf.loc[gdf['rcp_4.5__50th_percent_change'].notnull()]))
impact_counties

Create new binary layer of SLR impact

In [None]:
# Create the binary SLR impact layer: 1 where the metric is non-missing, else 0
gdf['slr_impacted'] = gdf['rcp_4.5__50th_percent_change'].apply(lambda x: 1 if not pd.isnull(x) else 0)
gdf.head(10)
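
The same flag can be written without a row-wise `apply`. The sketch below, on an invented series, checks that `notna().astype(int)` gives the same result as the lambda used above.

```python
import numpy as np
import pandas as pd

# Toy metric values; NaN stands for counties with no wetland value.
metric = pd.Series([-12.5, np.nan, 0.0, np.nan, 40.0],
                   name="rcp_4.5__50th_percent_change")

via_apply = metric.apply(lambda x: 1 if not pd.isnull(x) else 0)
via_notna = metric.notna().astype(int)
print(via_notna.tolist())
```

Note that a percent change of exactly 0.0 is non-missing, so it is flagged as 1 under either form.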

In [None]:
# confirming count of impacted slr tracts
gdf.slr_impacted.value_counts()

In [15]:
# clean up before export
gdf = gdf[['GEOID', 'county', 'geometry', 'COUNTYFP', 'slr_impacted']]

In [None]:
# visually confirm coastal areas have value of 1, inland areas have value of 0 
gdf.plot(column='slr_impacted', legend=True)

## Step 3: Export binary layer

In [17]:
# save layer as csv file
gdf.to_csv('../utils/slr_mask_layer.csv')
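
One thing to watch when this CSV is read back: identifiers that are digit strings are parsed as integers by default, which drops any leading zero. The sketch below uses an in-memory buffer and invented GEOIDs. Whether `GEOID` in this layer carries a leading zero depends on the upstream lookup table, which is not verified here.

```python
import io
import pandas as pd

# Toy layer with string GEOIDs that start with '0'.
layer = pd.DataFrame({
    "GEOID": ["06001400100", "06075010100"],
    "slr_impacted": [1, 1],
})

buffer = io.StringIO()
layer.to_csv(buffer, index=False)

# Default parsing infers an integer column and loses the leading zero.
buffer.seek(0)
as_default = pd.read_csv(buffer)

# Forcing the dtype keeps the identifier intact.
buffer.seek(0)
as_text = pd.read_csv(buffer, dtype={"GEOID": str})
print(as_default["GEOID"].iloc[0], as_text["GEOID"].iloc[0])
```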

In [None]:
# upload to AWS
bucket_name = 'ca-climate-index'
directory = '0_map_data'
export_filename = ['slr_mask_layer.csv']

upload_csv_aws(export_filename, bucket_name, directory) 