## Absolute change in 99th percentile 1-day accumulated precipitation
This notebook generates the text metadata files for the inland flooding exposure metric `absolute change in 99th percentile 1-day accumulated precipitation`, using data from the Cal-Adapt: Analytics Engine (AE).
Please see the processing notebook `climate_ae_precipitation_accumulation_metrics.ipynb` for the full methodology. Note that this notebook can only be run on the AE Jupyter Hub or in a computing environment with enough memory (at least 30 GB).

### Step 1: Generate metadata

In [8]:
import pandas as pd
import os
import sys

sys.path.append(os.path.expanduser('../../'))
from scripts.utils.file_helpers import upload_csv_aws, pull_csv_from_directory
from scripts.utils.write_metadata import append_metadata

In [9]:
bucket_name = 'ca-climate-index'
aws_dir = '3_fair_data/index_data/climate_flood_exposure_precipitation_metric.csv'
folder = 'csv_folder'

pull_csv_from_directory(bucket_name, aws_dir, folder, search_zipped=False)

Saved DataFrame as 'csv_folder\climate_flood_exposure_precipitation_metric.csv'


In [10]:
# read in the precipitation metric csv pulled from AWS in the previous cell
df_in = pd.read_csv(r'csv_folder/climate_flood_exposure_precipitation_metric.csv') # make sure this is in the same folder!
df_in # check

Unnamed: 0,GEOID,precip_99percentile,precip_99percentile_min,precip_99percentile_max,precip_99percentile_min_max_standardized
0,6001401700,1.009346,-1.807552,12.977818,0.190519
1,6001401800,1.009346,-1.807552,12.977818,0.190519
2,6001402200,1.009346,-1.807552,12.977818,0.190519
3,6001402500,1.009346,-1.807552,12.977818,0.190519
4,6001402600,1.009346,-1.807552,12.977818,0.190519
...,...,...,...,...,...
9124,6111008900,1.700077,-1.807552,12.977818,0.237236
9125,6111009100,1.700077,-1.807552,12.977818,0.237236
9126,6111009200,1.700077,-1.807552,12.977818,0.237236
9127,6111009600,1.373455,-1.807552,12.977818,0.215146
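
The `precip_99percentile_min_max_standardized` column above is consistent with an ordinary min-max rescaling of `precip_99percentile` using the statewide minimum and maximum. The sketch below is a minimal illustration of that arithmetic, not the project's `min_max_standardize` function: the helper name `min_max_standardize_sketch` and the four-row toy frame are assumptions, and only the numeric values are taken from the output above.

```python
import pandas as pd

def min_max_standardize_sketch(df, col):
    """Hypothetical helper: rescale `col` to [0, 1] and record the min and max used."""
    out = df.copy()
    col_min = out[col].min()
    col_max = out[col].max()
    out[f'{col}_min'] = col_min
    out[f'{col}_max'] = col_max
    out[f'{col}_min_max_standardized'] = (out[col] - col_min) / (col_max - col_min)
    return out

# Toy frame: the statewide min and max plus two tract values from the table above
toy = pd.DataFrame(
    {'precip_99percentile': [-1.807552, 1.009346, 1.700077, 12.977818]}
)
standardized = min_max_standardize_sketch(toy, 'precip_99percentile')
print(standardized)
```

For the first tract shown, (1.009346 + 1.807552) / (12.977818 + 1.807552) is approximately 0.1905, which agrees with the value in the table.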


In [11]:
# Move a specific column to the end of the DataFrame
column_to_move = 'precip_99percentile'  # Replace with the actual column name
columns = [col for col in df_in.columns if col != column_to_move]  # Keep all other columns
columns.append(column_to_move)  # Add the column to move to the end

# Reassign the DataFrame with the new column order
df_in = df_in[columns]

In [12]:
df_in.to_csv('climate_flood_exposure_precipitation_metric.csv', index=False)

In [13]:
@append_metadata
def precip_ae_data_process(df, export=False, export_filename=None, varname=''):
    '''
    Reduces the size of the initial daily raw precipitation data in order to streamline compute time.
    Transforms the raw data into the following baseline metrics:
    * Absolute change in 99th percentile 1-day accumulated precipitation
    
    Methods
    -------
    Metric is calculated by pooling data across models and calculating the 99th percentile. 
    See https://github.com/cal-adapt/cae-notebooks/blob/main/exploratory/internal_variability.ipynb
    for reasoning behind data pooling for precipitation model data. 
    
    Parameters
    ----------
    df: string
        Name of the local csv file containing the metric data.
    export: True/False boolean
        False = will not upload resulting df containing CAL CRAI flooding metric to AWS
        True = will upload resulting df containing CAL CRAI flooding metric to AWS
    export_filename: string
        name of csv file to be uploaded to AWS
    varname: string
        Final metric name, for metadata generation
        
    Script
    ------
    Metric calculation: climate_ae_precipitation_accumulation_metric.py via pcluster run
    Example metric calculation for Alameda county: climate_ae_precipitation_accumulation_metric_example.ipynb
    Metadata generation: climate_ae_precipitation_accumulation_metadata.ipynb
    
    Note
    ----
    Because the climate projections data is on the order of 2.4 TB in size, intermediary
    processed files are not produced for each stage of the metric calculation. All processing
    occurs in a single complete run in the notebook listed above.
    '''
        
    # calculate with 2°C WL
    print('Data transformation: raw projections data retrieved for warming level of 2.0°C, by manually subsetting based on GWL for parent GCM and calculating 30-year average.')
    print("Data transformation: dynamically-downscaled climate data subsetted for a-priori bias-corrected models.")

    # historical baseline
    print("Data transformation: historical baseline data retrieved for 1981-2010, averaging across models.")
    print("Data transformation: dynamically-downscaled climate data subsetted for a-priori bias-corrected models.")

    # calculate delta signal
    print("Data transformation: snowfall signal removed from precipitation data to isolate liquid precipitation from total precipitation.")
    print("Data transformation: data clipped at 0.1 mm to remove trace precipitation.")
    print("Data transformation: leap days removed from historical data to match time periods.")
    print("Data transformation: data pooled across models to increase sample size, and all singleton dimensions (scenario) dropped.")
    print("Data transformation: 99th percentile calculated from pooled data.")
    print("Data transformation: delta signal calculated by taking difference between chronic (2.0°C) and historical baseline.")
    print("Data transformation: non-CA grid points removed from data.")

    # reprojection to census tracts
    print("Data transformation: data transformed from xarray dataset into pandas dataframe.")
    print("Data transformation: data reprojected from Lambert Conformal Conic CRS to CRS 3310.")
        
    # min-max standardization
    print("Data transformation: data min-max standardized with min_max_standardize function.")
    
    if export:
        bucket_name = 'ca-climate-index'
        directory = '3_fair_data/index_data'
        export_filename = [df]
        upload_csv_aws(export_filename, bucket_name, directory)
        # only report the upload when it actually happened
        print(f'{df} uploaded to AWS.')

    # remove the local copy of the csv once processing is complete
    if os.path.exists(df):
        os.remove(df)
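
The pooling, trace clipping, percentile, and delta steps listed in the transformation messages above can be sketched on synthetic data. This is only an illustration under stated assumptions: the helper `pooled_p99`, the gamma-distributed toy precipitation, the four-model ensemble, and the reading of "clipped" as dropping values below 0.1 mm are all assumptions, and the actual calculation is performed by the processing script named in the docstring.

```python
import numpy as np

def pooled_p99(daily_precip, trace_mm=0.1):
    """99th percentile of daily precipitation pooled across models.

    daily_precip: array of shape (n_models, n_days), in mm/day.
    Values below trace_mm are dropped as trace precipitation
    (an assumption about how the clipping step is applied).
    """
    pooled = np.asarray(daily_precip, dtype=float).ravel()  # pool all models into one sample
    wet = pooled[pooled >= trace_mm]
    return np.percentile(wet, 99)

# Synthetic stand-ins for one grid cell: 4 models x 30 years of daily values
rng = np.random.default_rng(0)
n_models, n_days = 4, 30 * 365
historical = rng.gamma(shape=0.5, scale=8.0, size=(n_models, n_days))
warming_level = rng.gamma(shape=0.5, scale=9.0, size=(n_models, n_days))

# Delta signal: warming-level percentile minus historical-baseline percentile
delta = pooled_p99(warming_level) - pooled_p99(historical)
print(f'Absolute change in pooled 99th percentile: {delta:.2f} mm')
```

Pooling before taking the percentile gives one estimate from a sample `n_models` times larger, rather than an average of several noisier per-model estimates.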

In [14]:
varname = 'climate_caladapt_flood_exposure_precipitation'
filename = 'climate_flood_exposure_precipitation_metric.csv'
precip_ae_data_process(filename, export=True, export_filename=None, varname=varname)
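
The body of `precip_ae_data_process` is mostly `print` statements, which suggests that the `@append_metadata` decorator collects those messages, presumably together with the docstring and `varname`, into the metadata text file. The real implementation lives in `scripts/utils/write_metadata.py` and is not shown in this notebook. The sketch below is a hypothetical stand-in showing one way such a decorator could work; `capture_transformations`, `demo_process`, and the `last_metadata` attribute are illustrative names, not part of the project.

```python
import contextlib
import functools
import io

def capture_transformations(func):
    """Hypothetical decorator: collect a function's printed messages as metadata text."""
    @functools.wraps(func)
    def wrapper(*args, varname='', **kwargs):
        buffer = io.StringIO()
        # Redirect stdout so every print() inside func is captured, not displayed
        with contextlib.redirect_stdout(buffer):
            result = func(*args, varname=varname, **kwargs)
        lines = [
            f'Variable: {varname}',
            f'Function: {func.__name__}',
            (func.__doc__ or '').strip(),
            buffer.getvalue().strip(),
        ]
        # Kept in memory here; a real version would likely write this to a file
        wrapper.last_metadata = '\n'.join(lines)
        return result
    return wrapper

@capture_transformations
def demo_process(df, varname=''):
    """Toy processing step used only for this illustration."""
    print('Data transformation: example step.')

demo_process('example.csv', varname='demo_metric')
metadata_text = demo_process.last_metadata
```

With this pattern, each transformation is documented where it is described in the code, and the metadata can be regenerated by rerunning the function.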