# Build flood hazard suitability layers for GRIDCERF

The following code was used to build the flood hazard suitability layers for GRIDCERF. GRIDCERF does not provide the source data directly because license restrictions limit redistribution of the unaltered source data. However, the following details the provenance of each source dataset and how it was processed.

## 1. Setup environment

### 1.1 Download GRIDCERF

If you have not yet done so, download the GRIDCERF package from https://doi.org/10.57931/2281697. Extract GRIDCERF inside the `data` directory of this repository, as the paths in this notebook assume that location.


### 1.2 Data description

- **Title**: FEMA National Flood Hazard Layer
- **Description from Source**: The National Flood Hazard Layer (NFHL) is a geospatial database that contains current effective flood hazard data. FEMA provides the flood hazard data to support the National Flood Insurance Program. You can use the information to better understand your level of flood risk and type of flooding.
- **Source URL**:  https://msc.fema.gov/portal/advanceSearch
- **Date Accessed**:  03/23/23
- **Citation**
> US Federal Emergency Management Agency, 2023. FEMA National Flood Hazard Layer. https://msc.fema.gov/portal/advanceSearch
- **Application**: FEMA provides flood risk data for regions where it has conducted detailed or approximate studies. The flood risk assessment information is categorized by the following zone types:

    * A: 1-percent annual chance flood event (a.k.a. base flood or 100-year flood) determined using approximate methodologies; no detailed hydraulic analyses have been performed.
    * AE: 1-percent annual chance flood event determined through detailed hydraulic analyses.
    * A99: Areas with a 1% annual chance of flooding that will be protected by a Federal flood control system.
    * AH: Areas with a 1% annual chance of shallow flooding, usually in the form of a pond, with an average depth ranging from 1 to 3 feet.
    * AO: River or stream flood hazard areas, and areas with a 1% or greater chance of shallow flooding each year, usually in the form of sheet flow, with an average depth ranging from 1 to 3 feet.
    * V or VE: Coastal areas with a 1% or greater chance of flooding and an additional hazard associated with storm waves.
    * X: Outside the 500-year flood and protected by levee from the 100-year flood.
    * D: Undetermined flood hazards.

State-level data is directly downloadable from the FEMA flood hazard website: https://msc.fema.gov/portal/advanceSearch. To access the appropriate state-level file, select a state, then a county, then a community (it does not matter which county or community), and click "Search". Then click "Effective Products", select "NFHL Data-State", and download the file.

For the GRIDCERF flood hazard exclusion layers, the following risk categories are used to create separate suitability layers: A/AE and V/VE. This notebook assumes that the A/AE and V/VE data have been downloaded separately for each state.
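
The grouping of zone codes into the two categories can be sketched as a small lookup. This is an illustration only: the helper `classify_zone` and the labels `"inland"` and `"coastal"` are hypothetical and are not used elsewhere in this notebook, which expects the A/AE and V/VE files to have been prepared before it runs.

```python
# Hypothetical helper: groups FEMA flood zone codes into the two
# categories used for the GRIDCERF flood hazard layers.
INLAND_ZONES = {"A", "AE"}    # assumed label for the A/AE category
COASTAL_ZONES = {"V", "VE"}   # assumed label for the V/VE category


def classify_zone(zone_code):
    """Return 'inland', 'coastal', or None for a FEMA flood zone code."""
    code = str(zone_code).strip().upper()
    if code in INLAND_ZONES:
        return "inland"
    if code in COASTAL_ZONES:
        return "coastal"
    # every other zone (A99, AH, AO, X, D, ...) is not used here
    return None


zones = ["AE", "X", "ve", "A99", "D", "A"]
groups = {zone: classify_zone(zone) for zone in zones}
print(groups)
```

Zones such as A99, AH, and AO return `None` in this sketch because only exact A/AE and V/VE matches are grouped.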

### 1.3 Import modules

In [8]:
import os

import numpy as np
import pandas as pd
import geopandas as gpd
import rasterio
from rasterio.plot import show
from rasterio import features

## 2. Configuration

In [2]:
# get the parent directory path to where this notebook is currently stored
root_dir = os.path.dirname(os.getcwd())

# data directory in repository
data_dir = os.path.join(root_dir, "data")

# GRIDCERF data directory from downloaded archive absolute path
gridcerf_dir = os.path.join(data_dir, "gridcerf")

# GRIDCERF local technology-specific data directory
technology_specific_dir = os.path.join(gridcerf_dir, 'technology_specific')

# GRIDCERF local common data directory
common_dir = os.path.join(gridcerf_dir, 'common')

# GRIDCERF reference data directory
reference_dir = os.path.join(gridcerf_dir, 'reference')

# GRIDCERF compiled final suitability data directory
compiled_dir = os.path.join(gridcerf_dir, "compiled")

# template land mask raster
template_raster = os.path.join(reference_dir, "gridcerf_sitingmask.tif")

# template ocean mask raster
ocean_template_raster = os.path.join(reference_dir, "gridcerf_oceanmask.tif")

# source data directory
source_dir = os.path.join(gridcerf_dir,  "source", 'technology_specific', 'flood_hazard_zones')

# V-VE source data file
vve_source_file = os.path.join(source_dir, "usfema_zone_v-ve_shp", "V-VE.shp")

# A-AE source data directory containing the state-level shapefiles
aae_dir = os.path.join(source_dir, "usfema_zone_a-ae_shp")

# temporary output raster for processing
temp_output_raster = os.path.join(source_dir, "temporary_raster.tif")

# temporary output shapefile for processing
temp_output_shp = os.path.join(source_dir, "gridcerf_flood_risk_shp" ,"temporary_shp.shp")
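
Because every later cell depends on these paths, it can help to confirm that the inputs exist before starting the long-running steps. The following is a minimal sketch; `find_missing_paths` is a hypothetical helper, not part of GRIDCERF.

```python
import os


def find_missing_paths(paths):
    """Return the subset of `paths` that does not exist on disk."""
    return [path for path in paths if not os.path.exists(path)]


# Demonstration with one path that exists and one that does not
existing = os.getcwd()
bogus = os.path.join(existing, "definitely_missing_gridcerf_demo_path")
print(find_missing_paths([existing, bogus]))

# Hypothetical usage with the variables from the configuration cell:
# missing = find_missing_paths([template_raster, vve_source_file, aae_dir])
# if missing:
#     raise FileNotFoundError(f"Missing inputs: {missing}")
```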


## 3. Generate Raster(s)

### 3.1 Functions to build files

In [3]:
def process_coastal_flood_shp_file(source_file = vve_source_file):
    """ 
    Process shapefile and prepare a rasterization field.
    """
    gdf = gpd.read_file(source_file)
    
    # dissolve into single polygon
    gdf['value'] = 1
    gdf = gdf.dissolve(by="value", as_index=False)
    
    # reproject shapefile
    gdf.to_crs("ESRI:102003", inplace=True)

    return gdf

def process_state_inland_floor_shp_file(source_file):
    
    gdf = gpd.read_file(source_file)
    
    # reproject shapefile
    gdf.to_crs("ESRI:102003", inplace=True)
    
    # flood zone polygons get a burn value of 0 so they zero out cells in the state-wise raster multiplication
    gdf['value'] = 0

    return gdf

def combine_state_inland_flood_risk_data(flood_dir = aae_dir):
    """ Iterates through all state-level flood risk shapefiles, reprojects them, rasterizes them, and 
    combines the rasters into one CONUS raster.

    Note: `template_raster` must already hold the land mask array read in section 3.3, not the file path.
    """

    # temporary raster written by each GDAL call and read back below
    output_raster = os.path.join(source_dir, "gridcerf_flood_risk_shp", "temporary_raster.tif")

    combined_raster = None

    # loop through shapefile directory
    for file in os.listdir(flood_dir):
        if file.endswith('.shp'):

            print(f'Processing {file}...')

            file_path = os.path.join(flood_dir, file)

            # read in and process state-level shapefile
            gdf = process_state_inland_floor_shp_file(file_path)

            print(f'Saving {file} to shp...')
            # save state-level file to temp shapefile
            gdf.to_file(temp_output_shp)

            print(f'Rasterizing {file}...')
            # construct the GDAL raster command; cells start at 1 (-init 1) and flood polygons burn in 0
            gdal_rasterize_cmd = f"gdal_rasterize -a value -tr 1000.0 1000.0 -init 1 -te -2831615.228 -1539013.3223 2628318.0948 1690434.1707 -ot Int16 -of GTiff {temp_output_shp} {output_raster}"

            # execute the GDAL command via the system terminal
            os.system(gdal_rasterize_cmd)

            # read in temp raster with rasterio and close the file afterwards
            with rasterio.open(output_raster) as temporary_flood_raster:
                flood_array = temporary_flood_raster.read(1)

            # cross multiply with the merged set so a 0 from any state stays 0
            if combined_raster is None:
                combined_raster = flood_array
            else:
                combined_raster = combined_raster*flood_array

    # apply the land mask array
    combined_raster *= template_raster

    return combined_raster

def vector_to_raster(template_raster, land_mask_raster, gdf, value_field, output_raster, include):
    """
    Rasterize a GeoDataFrame onto the template grid and write a binary int16 raster.

    Cells with a value of 1 are excluded and cells with a value of 0 are suitable. Burned
    polygons are excluded unless `include` is True, in which case the burned area is the
    suitable area. Cells where the land mask is 0 are always set to 1.
    """

    # open the template raster and extract metadata
    with rasterio.open(template_raster) as template:
        metadata = template.meta.copy()

    # update raster data type
    metadata.update(dtype=np.int16)

    # extract land mask; cells equal to 0 become NaN so they can be flagged after masking
    with rasterio.open(land_mask_raster) as land_mask_file:
        land_mask = land_mask_file.read(1)
    land_mask = np.where(land_mask == 0, np.nan, 1)

    # write output raster
    with rasterio.open(output_raster, 'w+', **metadata) as out:

        out_arr = out.read(1)

        # build shapes to rasterize from target geometry and field
        shapes = ((geom, value) for geom, value in zip(gdf.geometry, gdf[value_field]))

        # burn features
        burned = features.rasterize(shapes=shapes,
                                    fill=0,
                                    out=out_arr,
                                    transform=out.transform)

        burned = np.where(burned == 1, 1, 0).astype(np.float64)

        # invert suitability for inclusion layer
        if include:
            burned = np.where(burned == 1, 0, 1).astype(np.float64)

        # apply land mask
        burned *= land_mask

        # make nan excluded
        burned = np.where(np.isnan(burned), 1, burned)

        out.write(burned.astype(np.int16), 1)
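
The `gdal_rasterize` call in `combine_state_inland_flood_risk_data` is assembled as a single shell string. As an alternative sketch, the same flags can be built as an argument list, which avoids quoting problems when paths contain spaces. `build_rasterize_args` is a hypothetical helper; the flags and extent values are copied from the command used above.

```python
import shlex


def build_rasterize_args(src_shp, dst_tif, attribute="value", resolution=1000.0,
                         extent=(-2831615.228, -1539013.3223, 2628318.0948, 1690434.1707),
                         init_value=1):
    """Build the gdal_rasterize argument list used for each state shapefile."""
    xmin, ymin, xmax, ymax = extent
    return [
        "gdal_rasterize",
        "-a", attribute,                                  # attribute to burn
        "-tr", str(resolution), str(resolution),          # cell size in map units
        "-init", str(init_value),                         # value for unburned cells
        "-te", str(xmin), str(ymin), str(xmax), str(ymax),
        "-ot", "Int16",
        "-of", "GTiff",
        src_shp,
        dst_tif,
    ]


args = build_rasterize_args("temporary_shp.shp", "temporary_raster.tif")
print(shlex.join(args))

# To execute it (requires GDAL on the PATH):
# import subprocess
# subprocess.run(args, check=True)
```

With `check=True`, `subprocess.run` raises an exception if the command fails, whereas `os.system` only returns the exit status.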

### 3.2 Generate shapefiles

#### Coastal flood risk

In [5]:
%%time 

# preprocess shapefile
print("Preprocessing shapefile data...")
gdf = process_coastal_flood_shp_file()

# construct temporary shapefile output file path
coastal_output_shp_file_name = "gridcerf_fema_1pct_or_greater_coastal_flood_risk.shp"
output_shp = os.path.join(source_dir, "gridcerf_flood_risk_shp", coastal_output_shp_file_name)

# write output shapefile
gdf.to_file(output_shp)
print('Preprocessing complete')

Preprocessing shapefile data...
Preprocessing complete
CPU times: user 3min 11s, sys: 8.66 s, total: 3min 19s
Wall time: 3min 19s


### 3.3 Generate rasters

#### 100-year inland flood risk

In [None]:
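# read the template land mask into an array; note that this rebinds `template_raster` from a file path to the array used by combine_state_inland_flood_risk_data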
template_raster = rasterio.open(template_raster).read(1)

In [21]:
%%time
# preprocess shapefile
print("Preprocessing shapefile data...")

combined_array = combine_state_inland_flood_risk_data()

print('Preprocessing complete')

# swap the 1 and 0 values in the combined raster
combined_array = np.where(combined_array == 1, 0, 1)


Preprocessing shapefile data...
Processing NFHL_12_20230302.shp...
Saving NFHL_12_20230302.shp to shp...
Rasterizing NFHL_12_20230302.shp...
0...10...20...30...40...50...60...70...80...90...100 - done.
Processing NFHL_31_20221026.shp...
Saving NFHL_31_20221026.shp to shp...
Rasterizing NFHL_31_20221026.shp...
0...10...20...30...40...50...60...70...80...90...100 - done.
Processing NFHL_55_20230302.shp...
Saving NFHL_55_20230302.shp to shp...
Rasterizing NFHL_55_20230302.shp...
0...10...20...30...40...50...60...70...80...90...100 - done.
Processing NFHL_33_20221013.shp...
Saving NFHL_33_20221013.shp to shp...
Rasterizing NFHL_33_20221013.shp...
0...10...20...30...40...50...60...70...80...90...100 - done.
Processing NFHL_02_20220910.shp...
Saving NFHL_02_20220910.shp to shp...
Rasterizing NFHL_02_20220910.shp...
0...10...20...30...40...50...60...70...80...90...100 - done.
Processing NFHL_34_20230122.shp...
Saving NFHL_34_20230122.shp to shp...
Rasterizing NFHL_34_20230122.shp...
0...10...
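
The multiplication inside `combine_state_inland_flood_risk_data` and the value swap that follows it can be illustrated on toy arrays. This is a sketch with made-up 2x3 grids, assuming that a land mask value of 1 means land and 0 means not land.

```python
import numpy as np

# Toy grids standing in for two state rasters:
# 1 = no mapped flood zone (the -init value), 0 = burned flood zone
state_a = np.array([[1, 0, 1],
                    [1, 1, 1]], dtype=np.int16)
state_b = np.array([[1, 1, 1],
                    [0, 1, 1]], dtype=np.int16)

# Toy land mask (assumption: 1 = land, 0 = not land)
land_mask = np.array([[1, 1, 0],
                      [1, 1, 1]], dtype=np.int16)

# Multiplying keeps a 0 wherever any state burned a flood zone
# or wherever the cell falls outside the land mask
combined = state_a * state_b * land_mask

# Swap the values so that 1 marks excluded cells and 0 marks suitable cells
excluded = np.where(combined == 1, 0, 1)
print(excluded)
```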

#### Coastal flood risk

In [16]:
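# construct local directory paths
output_tif_file_name = "gridcerf_fema_1pct_or_greater_coastal_flood_risk.tif"
output_raster = os.path.join(technology_specific_dir, output_tif_file_name)

# rebuild the template path explicitly because `template_raster` may have been rebound to an array in the inland cells
template_raster_path = os.path.join(reference_dir, "gridcerf_sitingmask.tif")

# generate the exclusion raster from the coastal GeoDataFrame built in section 3.2
vector_to_raster(template_raster=template_raster_path,
                 land_mask_raster=template_raster_path,
                 gdf=gdf,
                 value_field="value",
                 output_raster=output_raster,
                 include=False)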
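
The masking steps that `vector_to_raster` applies after burning the polygons can be reproduced on toy arrays. This sketch mirrors the `include` switch and the NaN-based land mask; `apply_mask` is a hypothetical stand-in and does not read or write any raster files.

```python
import numpy as np


def apply_mask(burned, land, include):
    """Mirror the post-rasterization logic: 1 = excluded, 0 = suitable."""
    out = np.where(burned == 1, 1, 0).astype(np.float64)

    # for an inclusion layer the burned area is the suitable area
    if include:
        out = np.where(out == 1, 0, 1).astype(np.float64)

    # cells where the land mask is 0 become NaN, then are marked excluded
    mask = np.where(land == 0, np.nan, 1)
    out = out * mask
    out = np.where(np.isnan(out), 1, out)

    return out.astype(np.int16)


burned = np.array([[1, 0],
                   [0, 0]])   # 1 where a polygon was burned
land = np.array([[1, 1],
                 [0, 1]])     # 0 = outside the land mask

print(apply_mask(burned, land, include=False))
print(apply_mask(burned, land, include=True))
```

With `include=False`, as used for the coastal layer, the burned cell and the off-mask cell are both marked 1.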

#### 100-year inland flood risk

In [22]:
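# save raster file to technology_specific folder
output_tif_file_name = "gridcerf_fema_1pct_or_greater_inland_flood_risk.tif"
output_raster_path = os.path.join(technology_specific_dir, output_tif_file_name)

# read the template metadata from the file path because `template_raster` was rebound to an array above
with rasterio.open(os.path.join(reference_dir, "gridcerf_sitingmask.tif")) as template:
    metadata = template.meta.copy()

# store the binary layer as int16, matching the coastal layer
metadata.update(dtype="int16")

# write file
with rasterio.open(output_raster_path, 'w', **metadata) as dest:
    dest.write(combined_array.astype("int16"), 1)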