# Wildcat Creek 2018 Labor Day Flood Inundation Mapping

This notebook uses the 2018 Labor Day flood event on Wildcat Creek (near Manhattan, KS) to demonstrate how to map a flood event using a FLDPLN tiled library.

## Import Modules

Import necessary modules.

In [1]:
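# os and pandas are used throughout the notebook; import them explicitly
# rather than relying on the star imports below
import os
import sys
import time
import pandas as pd
from dask.distributed import Client, LocalCluster
from dask import visualize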

### Import FLDPLN Modules

In [2]:
# Tool/script folder
fldplnToolFolder = r'Z:\FLDPLN\tools_os' # tool development folder, has the latest version

# Add the tool/script folder to sys.path to access fldpln modules
sys.path.append(fldplnToolFolder) 
# fldpln module
from fldpln import *
from fldpln_library import *
from fldpln_gauge import *

## Setup Input Tiled Library and Output Folders

Here we set up the folder under which the tiled libraries (organized as folders) are located. We also set up the output folder (i.e., outputFolder), under which a map folder and a 'scratch' folder are created. The map folder, which is specified later, contains all inundation depth maps. The scratch folder stores temporary files.

In [3]:
# tiled library folder
libFolder =  r'E:\fldpln\sites\wildcat_10m_3dep\tiled_snz_library'

# libraries to be mapped
allLibNames = ['lib2']

# Set output folder
outputFolder = r'E:\fldpln\sites\wildcat_10m_3dep\maps'

## Prepare Gauge Stage and Calculate Gauge Depth of Flow (DOF)

Here we obtain and prepare flood event stages from stream gauges. The stage at a gauge is typically measured relative to the gauge's datum, which does not necessarily coincide with the stream bed elevation, which in turn is based on a certain vertical datum. To use a gauge stage with a FLDPLN library, we need to make sure that the gauge stage elevation (gauge datum + stage) and the filled elevation of the flood source pixel (FSP) are based on the same vertical datum. The depth of flow (DOF) at the FSP can then be calculated as the difference between the two. The Wildcat Creek DEM and FLDPLN library are based on the NAVD88 vertical datum, so gauge stage elevations also need to be in NAVD88 before the DOFs at those gauges are calculated.
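
The arithmetic described above can be sketched as follows. This is a minimal illustration, not part of the FLDPLN tools: the helper names and all numeric values are hypothetical, and it assumes the gauge datum and stage are given in feet while the library elevations are in meters.

```python
FEET_TO_METERS = 0.3048  # exact, by definition of the international foot

def stage_elevation_m(gauge_datum_ft, stage_ft):
    """Gauge datum elevation plus stage, converted from feet to meters."""
    return (gauge_datum_ft + stage_ft) * FEET_TO_METERS

def depth_of_flow(stage_elevation, filled_elevation):
    """DOF at an FSP: stage elevation minus the FSP's filled elevation."""
    return stage_elevation - filled_elevation

# Hypothetical values for illustration only (not taken from a real gauge)
datum_ft = 1000.0           # gauge datum, feet above the vertical datum
stage_ft = 20.0             # observed stage, feet above the gauge datum
fsp_filled_elev_m = 305.0   # FSP filled elevation from the library, meters

elev_m = stage_elevation_m(datum_ft, stage_ft)
dof_m = depth_of_flow(elev_m, fsp_filled_elev_m)
print(f'stage elevation: {elev_m:.3f} m, DOF: {dof_m:.3f} m')
```

Both elevations must refer to the same vertical datum and unit before they are subtracted; otherwise the resulting DOF is meaningless.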

### Gauge Stage from AHPS and USGS

Both USGS and NWS AHPS maintain stream gauges that record past flood stages. There are three AHPS and USGS gauges ([WKCK1](https://water.weather.gov/ahps2/hydrograph.php?wfo=top&gage=wkck1), [MWCK1](https://water.weather.gov/ahps2/hydrograph.php?wfo=top&gage=MWCK1), [MSTK1](https://water.weather.gov/ahps2/hydrograph.php?wfo=top&gage=MSTK1)) on Wildcat Creek that recorded the 2018 Labor Day flood event. Here we use the maximum stages of the Labor Day flood event at those gauges to map the maximum inundation extent and depth of the event.

#### Event Stage from AHPS Historic Crests

The flood stages for the 2018 Labor Day flood event are available as AHPS historic crests at those gauges: [WKCK1](https://water.noaa.gov/gauges/WKCK1), [MWCK1](https://water.weather.gov/ahps2/hydrograph.php?wfo=top&gage=MWCK1) and [MSTK1](https://water.weather.gov/ahps2/hydrograph.php?wfo=top&gage=MSTK1). The Excel file wildcat_gauges_albers_meters.xlsx has several sheets that store both gauge information (for example, gauge datum) and the event stages for different gauge combinations. The key fields needed for those gauges are: stationid, x, y, and stage_elevation.

Note that most USGS and AHPS gauge stages are measured in feet. **Make sure that gauge coordinates are in the same coordinate system as the library and that gauge stages are in the same vertical unit as the library.**

In [4]:
# # Two downstream gauges 
# gaugeStageFileName = 'wildcat_gauges.xlsx' # KS LiDAR DEM in UTM with vertical unit in feet
gaugeStageFileName = 'wildcat_gauges_albers_meters.xlsx' # 3DEP DEM in Albers with vertical unit in meters
sheetName = 'ThreeGauges' # all 3 gauges
# sheetName = 'TwoDsGauges' # 2 downstream gauges
# sheetName = 'MSTK1' # the last downstream gauge used in HEC-RAS model

# read gauge file
gaugeStages = pd.read_excel(gaugeStageFileName, sheet_name=sheetName) 
# print(gaugeStages)

# Need to calculate gauge stage elevation if necessary!

# keep only necessary fields from gauges
keptFields = ['stationid','x','y','stage_elevation']
gaugeWithStageElevations = gaugeStages[keptFields]
print(gaugeWithStageElevations)

        stationid           x             y  stage_elevation
0  06879805,WKCK1 -60735.0580  1.799635e+06       343.911936
1  06879810,MWCK1 -54988.6141  1.796210e+06       325.907400
2  06879815,MSTK1 -52277.2352  1.795783e+06       317.851536
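
Before moving on, it can help to confirm that a gauge table actually carries the four key fields. The sketch below is one possible check, not part of the FLDPLN tools: `check_gauge_table` is a hypothetical helper, the gauge values are copied from the table printed above, and the `note` column is made up to show that extra columns are dropped.

```python
import pandas as pd

REQUIRED_FIELDS = ['stationid', 'x', 'y', 'stage_elevation']
NUMERIC_FIELDS = ['x', 'y', 'stage_elevation']

def check_gauge_table(df):
    """Return only the required columns; raise if any is missing or empty."""
    missing = [c for c in REQUIRED_FIELDS if c not in df.columns]
    if missing:
        raise ValueError(f'Missing gauge fields: {missing}')
    out = df[REQUIRED_FIELDS].copy()
    for col in NUMERIC_FIELDS:
        # Raises if a value cannot be interpreted as a number
        out[col] = pd.to_numeric(out[col], errors='raise')
    if out[NUMERIC_FIELDS].isna().any().any():
        raise ValueError('Gauge table has missing coordinates or stage elevations')
    return out

gauges = pd.DataFrame({
    'stationid': ['06879805,WKCK1', '06879810,MWCK1', '06879815,MSTK1'],
    'x': [-60735.0580, -54988.6141, -52277.2352],
    'y': [1.799635e+06, 1.796210e+06, 1.795783e+06],
    'stage_elevation': [343.911936, 325.907400, 317.851536],
    'note': ['a', 'b', 'c'],  # hypothetical extra column, dropped by the check
})

checked = check_gauge_table(gauges)
print(checked.columns.tolist())
```

A check like this cannot detect a wrong coordinate system or vertical unit; those still have to be verified against the library's metadata.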


#### Event Stage from USGS NWIS

We can also get the event maximum stage directly from USGS NWIS to check the historic crests from AHPS. Note that the stages are in feet, and we need to convert them to stage elevations before using them in flood inundation mapping.

In [5]:
# Wildcat Creek 3 USGS gauges (in the order from upstream to downstream)
usgsIds = ['06879805','06879810','06879815'] 
ahpsIds = ['WKCK1','MWCK1','MSTK1']

# A period between two dates: Wildcat Creek Sep.3 2018 flood event
instStages = GetUsgsGaugeStageFromWebService(usgsIds,startDate='2018-09-02',endDate='2018-09-04')
print(instStages)

# find the max stage within the time period
maxStages = instStages.groupby(['stationid'],as_index=False).agg({'stage_ft':'max'})
# find the most recent time with the max stage
tdf = pd.merge(instStages, maxStages, how='inner', on=['stationid','stage_ft'])
gaugeStagesFromNwis = tdf.groupby(['stationid'], as_index=False).agg({'stationid':'first','stage_ft':'first','stage_time':'max'})
print(gaugeStagesFromNwis)



    stationid  stage_ft                     stage_time
0    06879805      6.87  2018-09-02T00:00:00.000-05:00
1    06879805      6.87  2018-09-02T00:15:00.000-05:00
2    06879805      6.87  2018-09-02T00:30:00.000-05:00
3    06879805      6.87  2018-09-02T00:45:00.000-05:00
4    06879805      6.87  2018-09-02T01:00:00.000-05:00
..        ...       ...                            ...
812  06879815      5.73  2018-09-04T22:45:00.000-05:00
813  06879815      5.73  2018-09-04T23:00:00.000-05:00
814  06879815      5.72  2018-09-04T23:15:00.000-05:00
815  06879815      5.72  2018-09-04T23:30:00.000-05:00
816  06879815      5.71  2018-09-04T23:45:00.000-05:00

[817 rows x 3 columns]
  stationid  stage_ft                     stage_time
0  06879805     25.97  2018-09-03T04:45:00.000-05:00
1  06879810     28.29  2018-09-03T07:00:00.000-05:00
2  06879815     25.18  2018-09-03T08:30:00.000-05:00
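
The "maximum stage and its most recent time" step can also be written as a sort followed by taking the last row per station. The sketch below uses a small synthetic table with hypothetical readings (only the column names match the output above) and assumes all timestamps share one ISO 8601 format and UTC offset, so that sorting them as strings is chronological.

```python
import pandas as pd

# Synthetic readings for two hypothetical stations 'A' and 'B'
inst = pd.DataFrame({
    'stationid': ['A', 'A', 'A', 'B', 'B'],
    'stage_ft': [6.87, 25.97, 25.97, 5.73, 28.29],
    'stage_time': ['2018-09-02T00:00', '2018-09-03T04:30', '2018-09-03T04:45',
                   '2018-09-02T00:00', '2018-09-03T07:00'],
})

# Within each station, order rows by stage and then by time, so the last row
# is the highest stage at its most recent occurrence
peak = (inst.sort_values(['stationid', 'stage_ft', 'stage_time'])
            .groupby('stationid')
            .tail(1)
            .reset_index(drop=True))
print(peak)
```

Station 'A' reaches its maximum twice; the sort order ensures the later of the two readings is the one that is kept.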


### Synthetic Gauge Stage from the National Water Model and HAND

HAND FIM uses NWM discharge and turns it into stage. Here we use HAND reach stages to run FLDPLN for the event. Conceptually, we turn each reach stage into a synthetic gauge located at either the mid-point or the outlet of the reach. For the Wildcat Creek example, the HAND reaches and synthetic gauge locations were selected manually by graduate student David Weiss. Those synthetic gauges can be treated as USGS/AHPS gauges. The key fields needed are: stationid, x, y, and stage_elevation. Note that we assume the HAND reach stage elevation is the same as the FLDPLN library DEM.

In [None]:
# Synthetic FSP gauges from NWC reach stage
# gaugeStageFileName = 'wildcat_gauges.xlsx'
# sheetName = 'ReachStageAsDof' 
gaugeStageFileName = 'wildcat_gauges_albers_meters.xlsx'
# sheetName = 'ReachMedianStage' # HAND reach median stage as DOF
sheetName = 'ReachOutletStage' # HAND reach outlet stage as DOF

# read gauge file
gaugeStages = pd.read_excel(gaugeStageFileName, sheet_name=sheetName) # 3 gauges
print(gaugeStages)

# Need to calculate gauge stage elevation if necessary!

# keep only necessary fields from gauges
keptFields = ['stationid','x','y','stage_elevation']
gaugeWithStageElevations = gaugeStages[keptFields]
print(gaugeWithStageElevations)

### Snap Gauges to FSPs and Calculate Gauge DOF

Here we snap gauges (with their stage elevations) to FLDPLN flood source pixels (FSPs), which are the stream pixels. Each snapped gauge FSP has a stream bed elevation (the filled elevation, FilledElev), which is used to calculate the depth of flow/flood (DOF) at that FSP.

This process also identifies the FLDPLN libraries that the gauges belong to. Note that the same gauge can be snapped to more than one library, because FLDPLN libraries may overlap and the overlapping FSPs may have different coordinates!
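
To make the idea of snapping concrete, here is a minimal nearest-neighbor sketch. It is not the implementation of `SnapGauges2Fsps`: `snap_to_nearest` is a hypothetical helper, the coordinates and elevations are made up, and it assumes that a gauge farther than the snap distance from every FSP is simply discarded.

```python
import numpy as np
import pandas as pd

def snap_to_nearest(gauges, fsps, snap_dist):
    """Attach the nearest FSP to each gauge; drop gauges beyond snap_dist."""
    gxy = gauges[['x', 'y']].to_numpy(dtype=float)
    fxy = fsps[['FspX', 'FspY']].to_numpy(dtype=float)
    # Pairwise Euclidean distances, shape (number of gauges, number of FSPs)
    d = np.hypot(gxy[:, None, 0] - fxy[None, :, 0],
                 gxy[:, None, 1] - fxy[None, :, 1])
    nearest = d.argmin(axis=1)
    out = gauges.reset_index(drop=True).copy()
    out['d2NearestFsp'] = d[np.arange(len(out)), nearest]
    matched = fsps.iloc[nearest].reset_index(drop=True)
    out = pd.concat([out, matched], axis=1)
    return out[out['d2NearestFsp'] <= snap_dist].reset_index(drop=True)

# Hypothetical FSPs and gauges in a projected coordinate system (meters)
fsps = pd.DataFrame({'FspX': [0.0, 10.0, 20.0], 'FspY': [0.0, 0.0, 0.0],
                     'FilledElev': [100.0, 99.5, 99.0]})
gauges = pd.DataFrame({'stationid': ['g1', 'g2'],
                       'x': [11.0, 500.0], 'y': [3.0, 0.0],
                       'stage_elevation': [102.0, 101.0]})

snapped = snap_to_nearest(gauges, fsps, snap_dist=350)
# DOF at the snapped FSP: stage elevation minus filled elevation
snapped['Dof'] = snapped['stage_elevation'] - snapped['FilledElev']
print(snapped)
```

The brute-force distance matrix is fine for a handful of gauges; a spatial index would be a better fit for many gauges or very long streams.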

In [6]:
# snap gauges to FSPs on-the-fly
print('Snap gauges to FSPs ...')
print(f'Number of gauges: {len(gaugeWithStageElevations.index)}')

# FLDPLN libraries to whose FSPs gauges are snapped. All the libraries by default but can be a subset
libs2Map = ['lib2']

# snap the gauges to FSPs. 
# Fields 'StrOrd','DsDist','SegId','FilledElev' are used for interpolating other FSP DOF
# Note that 'lib_name','FspX', 'FspY' together uniquely identify a FSP (as there are overlapping FSPs between libraries)!
gaugeFspDf = SnapGauges2Fsps(libFolder,libs2Map,gaugeWithStageElevations,snapDist=350,gaugeXField='x',gaugeYField='y',fspColumns=['FspX','FspY','StrOrd','DsDist','SegId','FilledElev']) 
print(gaugeFspDf)

# calculate gauge FSP's DOF
gaugeFspDf['Dof'] = gaugeFspDf['stage_elevation'] - gaugeFspDf['FilledElev']

# keep only necessary columns for gauge FSPs
gaugeFspDf = gaugeFspDf[['lib_name','FspX','FspY','StrOrd','DsDist','SegId','FilledElev','Dof']] # Note that 'lib_name','FspX', 'FspY' together uniquely identify a FSP!!!

# show info
print(f'Number of snapped gauge FSPs: {len(gaugeFspDf)}')
# Find libs where the gauges are snapped to, and they are the actual libs to map
libs2Map = gaugeFspDf['lib_name'].drop_duplicates().tolist()
print(f'Libraries gauges snapped to: {libs2Map}')
print(gaugeFspDf)

#
# save snapped gauges to CSV file for checking
# gaugeFspDf.to_csv(os.path.join(outputFolder, 'SnappedGauges.csv'), index=False)

Snap gauges to FSPs ...
Number of gauges: 3
   index       stationid           x             y  stage_elevation  \
0      0  06879805,WKCK1 -60735.0580  1.799635e+06       343.911936   
1      1  06879810,MWCK1 -54988.6141  1.796210e+06       325.907400   
2      2  06879815,MSTK1 -52277.2352  1.795783e+06       317.851536   

   d2NearestFsp          FspX          FspY  StrOrd        DsDist  SegId  \
0      7.437232 -60738.205794  1.799628e+06     1.0  29037.232304    3.0   
1      2.043149 -54988.205794  1.796208e+06     1.0  16424.448763    9.0   
2      4.761831 -52278.205794  1.795788e+06     1.0   9202.762620   12.0   

   FilledElev lib_name  
0  338.434052     lib2  
1  318.953552     lib2  
2  311.347992     lib2  
Number of snapped gauge FSPs: 3
Libraries gauges snapped to: ['lib2']
  lib_name          FspX          FspY  StrOrd        DsDist  SegId  \
0     lib2 -60738.205794  1.799628e+06     1.0  29037.232304    3.0   
1     lib2 -54988.205794  1.796208e+06     1.0  16424.


## Interpolate FSP's DOF

Here we interpolate the DOF for all the FSPs between the gauge FSPs using the DOFs calculated in the previous step. The interpolation proceeds by stream order, from low stream orders (i.e., main streams) to high stream orders (i.e., tributaries). Either horizontal or vertical (the default) interpolation can be used; the cell below uses horizontal interpolation (weightingType='H').
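
As a rough illustration only, the sketch below interpolates DOF linearly by downstream distance between gauge FSPs on a single stream. It assumes that "horizontal" weighting means weighting by distance along the stream, which this notebook does not confirm, and it is not the implementation of `InterpolateFspDofFromGauge`, which also handles stream orders and segments. The gauge values come from the snapping output earlier in this notebook; `interpolate_dof` is a hypothetical helper.

```python
import numpy as np

# Gauge FSPs from the snapping output: downstream distance (DsDist) and
# DOF computed as stage_elevation - FilledElev
gauge_ds_dist = np.array([29037.232304, 16424.448763, 9202.762620])
gauge_dof = np.array([343.911936 - 338.434052,
                      325.907400 - 318.953552,
                      317.851536 - 311.347992])

def interpolate_dof(ds_dist, gauge_ds_dist, gauge_dof):
    """Linearly interpolate DOF by downstream distance between gauges."""
    order = np.argsort(gauge_ds_dist)  # np.interp needs increasing x values
    # Outside the gauge range np.interp holds the end values constant,
    # which is not necessarily how the library treats those FSPs
    return np.interp(ds_dist, gauge_ds_dist[order], gauge_dof[order])

# Hypothetical FSP halfway (by distance) between the two downstream gauges
mid_dist = (16424.448763 + 9202.762620) / 2
mid_dof = interpolate_dof(np.array([mid_dist]), gauge_ds_dist, gauge_dof)
print(mid_dof)
```

Halfway between two gauges the interpolated DOF is the mean of their DOFs, and at a gauge location the gauge's own DOF is returned.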

In [7]:
# Find libs with snapped gauges. They are the actual libs to map
libs2Map = gaugeFspDf['lib_name'].drop_duplicates().tolist()

# prepare the DF for storing interpolated FSP DOF
fspDof = pd.DataFrame(columns=['LibName','FspId','Dof'])

# prepare DFs for saving interpolated FSPs and their segment IDs
fspCols = fspInfoColumnNames + ['Dof']
segIdCols = ['SegId','LibName']
fsps = pd.DataFrame(columns=fspCols)
segIds =pd.DataFrame(columns=segIdCols)

# map each library
for libName in libs2Map:
    # interpolate DOF for the gauges
    # print('Interpolate FSP DOF using gauge DOF ...')
    # fspIdDof = InterpolateFspDofFromGauge(libFolder,libName,gaugeFspDf) # 'V' by default
    fspIdDof = InterpolateFspDofFromGauge(libFolder,libName,gaugeFspDf,weightingType='H') # horizontal interpolation
    fspIdDof['LibName'] = libName
    # fspDof = fspDof.append(fspIdDof[['LibName','FspId','Dof']], ignore_index=True)
    fspDof = pd.concat([fspDof,fspIdDof[['LibName','FspId','Dof']]], ignore_index=True)

    # Keep interpolated FSP DOF for saving later
    fspFile = os.path.join(libFolder, libName, fspInfoFileName)
    fspDf = pd.read_csv(fspFile) 
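    # merge with this library's DOF only; fspDof accumulates FSPs from all libraries and FspId alone is not unique across libraries
    fspDf = pd.merge(fspDf,fspIdDof[['LibName','FspId','Dof']],how='inner',on=['FspId'])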
    # fsps = fsps.append(fspDf, ignore_index=True)
    fsps = pd.concat([fsps,fspDf], ignore_index=True)
    
    # Keep FSP segment IDs for saving later
    t =  pd.DataFrame(fspDf['SegId'].drop_duplicates().sort_values())
    t['LibName'] = libName
    # segIds = segIds.append(t, ignore_index=True)
    segIds = pd.concat([segIds,t], ignore_index=True)

# show interpolated FSPs with Dof
print(fspDof)

#
# save interpolated FSP DOF and their segments for checking. This block of code should be commented out if no-checking needed
#
# Save DOF and segment IDs to CSV files
FspDofFile = os.path.join(outputFolder, 'Interpolated_FSP_DOF.csv')
SegIdFile = os.path.join(outputFolder, 'Interpolated_SegIds.csv')
fsps.to_csv(FspDofFile, index=False)
segIds.to_csv(SegIdFile, index=False)

# # turn interpolated segments into a shapefile
# for libName in libs2Map:
#     segShp = os.path.join(libFolder, libName, 'stream_orders.shp')
#     segs = gpd.read_file(segShp)
#     segs['LibName'] = libName
#     # print(segs)
#     # join by two fields: SegId and LibName
#     segDf = pd.merge(segs,segIds,how='inner',on=['SegId','LibName'])
#     # print(segDf)
#     # write segments as a shapefile
#     segDf.to_file(os.path.join(outputFolder, 'Interpolated_Segements.shp'))

     LibName FspId       Dof
0       lib2   150  5.230906
1       lib2   151  5.237311
2       lib2   152  5.241841
3       lib2   153  5.248247
4       lib2   154  5.252776
...      ...   ...       ...
1762    lib2  1912  0.629674
1763    lib2  1913  0.445247
1764    lib2  1914  0.314837
1765    lib2  1915  0.184427
1766    lib2  1916  0.000000

[1767 rows x 3 columns]



## Map Flood Inundation Depth


### Set Mapping Parameters

Set up the map folder (i.e., outMapFolderName), which is under the output folder and contains all inundation depth maps. Additional settings include whether to mosaic the tiles into a single COG file and whether to use a Dask local cluster to speed up the mapping.

In [9]:
# set up map folder
outMapFolderName = 'labor_day_2018_3_gauges'

# Create folders for storing temp and output map files
outMapFolder,scratchFolder = CreateFolders(outputFolder,'scratch',outMapFolderName)

# whether to mosaic tiles into a single COG
mosaicTiles = True #True #False

# Whether to use a Dask LocalCluster (disabled here)
useLocalCluster = False # This doesn't work on my office desktop though it works fine on KBS server
numOfWorkers = round(0.8*os.cpu_count()) # default: 80% of the logical CPUs
numOfWorkers = 6 # override with a fixed number of workers
print(f'Number of workers: {numOfWorkers}')

Number of workers: 6
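
The worker count above is computed from the CPU count and then overridden by hand. One way to fold both steps into a single rule is sketched below; `pick_worker_count` is a hypothetical helper, not part of the FLDPLN tools, and the 80% fraction and the cap of 6 simply mirror the values used in the cell above.

```python
import os

def pick_worker_count(fraction=0.8, max_workers=None):
    """Use a fraction of the logical CPUs, at least 1, optionally capped."""
    cpus = os.cpu_count() or 1  # os.cpu_count() can return None
    n = max(1, round(fraction * cpus))
    if max_workers is not None:
        n = min(n, max_workers)
    return n

numOfWorkers = pick_worker_count(max_workers=6)
print(f'Number of workers: {numOfWorkers}')
```

On a machine with few cores this yields fewer than six workers, which avoids oversubscribing the CPU.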


### Map Inundation Depth

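This cell generates the inundation depth maps. For each library with interpolated FSP DOFs, it maps the flood depth tile by tile (optionally through a Dask LocalCluster), optionally mosaics the tile maps into a single GeoTIFF, and records the processing time.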

In [10]:
# show mapping info
print(f'Tiled FLDPLN library folder: {libFolder}')
print(f'Map folder: {outMapFolder}')
# Find libs that need mapping
libs2Map = fspDof['LibName'].drop_duplicates().tolist()
print(f'Libraries to map: {libs2Map}')

# check running time
startTimeAllLibs = time.time()

# create a local cluster to speed up the mapping. Must be run inside "if __name__ == '__main__'"!!!
if useLocalCluster:
    # cluster = LocalCluster(n_workers=4,processes=False)
    try:
        print('Start a LocalCluster ...')
        # NOTE: set the worker space (i.e., local_dir) to a folder that the LocalCluster can access. When the script is run through a scheduled task,
        # the system uses C:\Windows\system32 by default, which a typical user cannot access!
        # cluster = LocalCluster(n_workers=numOfWorkers,memory_limit='32GB',local_dir="D:/projects_new/fldpln/tools") # for KARS production server (192G RAM & 8 cores)
        # cluster = LocalCluster(n_workers=numOfWorkers,processes=False) # for KARS production server (192G RAM & 8 cores)
        # Use a raw string for the Windows path; in a normal string "\t" is a tab character
        cluster = LocalCluster(n_workers=numOfWorkers,memory_limit='8GB',local_dir=r"E:\temp") # for office desktop (64G RAM & 8 cores)
        # print('Watch workers at: ',cluster.dashboard_link)
        print(f'Number of workers: {numOfWorkers}')
        client = Client(cluster)
        # print scheduler info
        # print(client.scheduler_info())
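    except Exception as e:
        # report why the cluster could not be created, then fall back to serial mapping
        print(f'Cannot create a LocalCluster: {e}')
        useLocalCluster = False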

# dict to store lib processing time
libTime={}

# map each library
for libName in libs2Map:
    # check running time
    startTime = time.time()
    
    # select the FSPs within the lib
    fspIdDof = fspDof[fspDof['LibName']==libName][['FspId','Dof']]

    # mapping flood depth
    if useLocalCluster:
        print(f'Map [{libName}] using LocalCluster ...')
        # generate a DAG
        dag,dagRoot=MapFloodDepthWithTilesAsDag(libFolder,libName,'snappy',outMapFolder,fspIdDof,aoiExtent=None)
        if dag is None:
            tileTifs = None
        else:
            # visualize DAG
            # visualize(dag)
            # Compute DAG
            tileTifs = client.get(dag, dagRoot)
            if not tileTifs: # list is empty
                tileTifs =  None
    else:
        print(f'Map {libName} ...')
        tileTifs = MapFloodDepthWithTiles(libFolder,libName,'snappy',outMapFolder,fspIdDof,aoiExtent=None)
    print(f'Actual mapped tiles: {tileTifs}')

    # Mosaic all the tiles from a library into one tif file
    if mosaicTiles and tileTifs is not None:
        print('Mosaic tile maps ...')
        mosaicTifName = libName+'_'+outMapFolderName+'.tif'
        # Simplest implementation, may crash with very large raster
        MosaicGtifs(outMapFolder,tileTifs,mosaicTifName,keepTifs=False)
    
    # check time
    endTime = time.time()
    usedTime = round((endTime-startTime)/60,3)
    libTime[libName] = usedTime
    # print(f'{libName} processing time (minutes):', usedTime)

# Show processing time
# Individual lib processing time
print('Individual library mapping time:', libTime)
# total time
endTimeAllLibs = time.time()
print('Total processing time (minutes):', round((endTimeAllLibs-startTimeAllLibs)/60,3))

#
# Shutdown local clusters
#
if useLocalCluster:
    print('Shutdown LocalCluster ...')
    # Close the client first, then the cluster it is connected to
    client.close()
    cluster.close()
    useLocalCluster = False

Tiled FLDPLN library folder: E:\fldpln\sites\wildcat_10m_3dep\tiled_snz_library
Map folder: E:\fldpln\sites\wildcat_10m_3dep\maps\labor_day_2018_3_gauges
Libraries to map: ['lib2']
Map lib2 ...
Tiles need to be mapped: [1]
Actual mapped tiles: ['E:\\fldpln\\sites\\wildcat_10m_3dep\\maps\\labor_day_2018_3_gauges\\lib2_tile_1.tif']
Mosaic tile maps ...
Individual library mapping time: {'lib2': 0.011}
Total processing time (minutes): 0.011
