# Fluvial Hazard Zone Prediction - Machine Learning Approaches 
# 1. Introduction

<img src="https://www.denverpost.com/wp-content/uploads/2016/04/20131102__helifloodp1.jpg?w=600" width="839" height="292">

# Background
In 2013, the Colorado Front Range experienced historic flooding caused by an 8-day rainfall event.  Rainfall totals were approximated as a 1,000-year event.  Flooding forced the evacuation of thousands of residents, destroyed hundreds of homes, and claimed 4 lives. 

http://mediacenter.dailycamera.com/2013/09/12/photos-massive-flash-flooding-along-front-range-of-colorado/#29

Floods of this scale produce the natural phenomenon of channel migration, the movement of rivers from one location to another within their valleys.  Although flood maps produced by the Federal Emergency Management Agency (FEMA) predict the extent of high water during the 100-year flood, they do not capture the potential channel migration hazards associated with erosion, deposition, degradation, lateral migration, and avulsion.  These processes can move flood waters far beyond expected limits, further endangering infrastructure and people. 

A number of state and federal agencies are investigating methodologies to identify these Fluvial Hazard Zones (FHZs) using geomorphic, geologic, hydraulic, and physical data along with a healthy dose of expert knowledge. The <b>Colorado Water Conservation Board (CWCB)</b> has recently released an excellent guidance document for use in these delineation efforts.  

http://coloradohazardmapping.com/hazardMapping/fluvialMapping


# Problem Statement
The recently released guidance from the CWCB provides the critical foundation for an emerging science of Fluvial Hazard Zone prediction.  A pilot study is also underway to map FHZs for a small number of rivers on Colorado's Front Range. This author wholly supports the adoption and implementation of this expert-based approach to identifying unmapped fluvial hazards.  However, the need is great, and mapping individual rivers will take years if not decades.  This raises an important question:   
  
<p style="margin-left: 40px"><i><b>- Are there methods that can extend the CWCB's Fluvial Hazard Zone mapping efforts?</b></i></p>  

The availability, level of detail, and abundance of predictive data from before and after the 2013 floods provide an opportunity to apply a new set of tools to the problem. 

<p style="margin-left: 40px"><i><b>- What additional insights can the 2013 flood provide for hazard mitigation purposes?</b></i></p>

This project will explore the ability of <b> MACHINE LEARNING</b> and <b>ARTIFICIAL INTELLIGENCE ALGORITHMS</b> to predict the 2013 flood impacts (FHZs specifically) using 2011 data.  While this exploration is in its early stages, the results show promise.  Although this effort is isolated from the CWCB's current efforts, future integration of these methods is both possible and mutually beneficial.  The algorithms could be improved with expert-level guidance, and CWCB's pilot study could be scaled up to the many stream miles where people live in harm's way.   

### What exactly are machine learning and artificial intelligence?

Put simply, machine learning and artificial intelligence employ a wide range of algorithms that learn patterns from data and use those patterns to make predictions.  Here's some more reading for reference:

https://medium.com/machine-learning-for-humans/why-machine-learning-matters-6164faf1df12


#  The South St. Vrain River.  Lyons, Colorado.  
This reach of the river experienced significant channel migration, which damaged roads and infrastructure.  Detailed pre- and post-flood topography is available along this reach, and the Colorado Water Conservation Board is conducting a Fluvial Hazard Zone delineation here using an expert-based geomorphic analysis.

# The Data
We've gridded the problem area into 3ft by 3ft cells within a 500 foot buffer of the South Saint Vrain. Within that grid, we've queried the source datasets for information on topographic characteristics, infrastructure, spatial relationships, and the change in ground elevation after the 2013 floods.  Latitude and longitude also uniquely identify each 3ft by 3ft cell. 

The results of that geoprocessing have been exported to a .txt file, where each row represents the characteristics of one 3ft by 3ft cell. Take a look at the table below to see the features of the dataset. Note that it shows only a small subset of the entire dataset.

In [21]:
import pandas as pd
import numpy as np

# Import data from .txt file
txt = r'/Users/Daniel/Documents/Programming/Project_Scripts/CMZ/data/SSV_FINAL.txt'
df = pd.read_csv(txt, sep=",", header=0)

# Reorder the columns and drop FID field (it's duplicated by dataframe index)
reordered_columns = ['long_WGS84', 'lat_WGS84', 'topo2011', 'ground_slope', 'ground_curve', 'near_crossing',
                     'near_road', 'near_stream', 'stream_slope', 'relative_elevation', 'ground_delta' ]
df = df[reordered_columns]

# print the end of the dataframe
df.tail()

| | long_WGS84 | lat_WGS84 | topo2011 | ground_slope | ground_curve | near_crossing | near_road | near_stream | stream_slope | relative_elevation | ground_delta |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 906125 | -105.284159 | 40.207955 | 5494.540039 | 21.440399 | 6.477864 | 1791.599976 | 334.218994 | 602 | 1.42103 | -9999.0 | -0.838379 |
| 906126 | -105.284148 | 40.207955 | 5492.620117 | 18.818701 | -4.72819 | 1789.109985 | 334.148987 | 602 | 1.42103 | -9999.0 | -0.500488 |
| 906127 | -105.284137 | 40.207955 | 5491.149902 | 13.4372 | -2.76964 | 1786.630005 | 334.078003 | 603 | 1.42103 | -9999.0 | 0.287109 |
| 906128 | -105.284127 | 40.207955 | 5490.379883 | 12.5884 | 3.78418 | 1784.160034 | 334.007996 | 603 | 1.42103 | -9999.0 | 0.397949 |
| 906129 | -105.284116 | 40.207955 | 5489.629883 | 14.8452 | 6.857639 | 1781.680054 | 333.937012 | 604 | 1.42103 | -9999.0 | -0.269531 |
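
The `relative_elevation` values in the rows above are all -9999.0, which looks like a NoData sentinel rather than a real height above the stream. That reading is an assumption and should be checked against the source raster. If it holds, a minimal sketch for masking the sentinel before any statistics or model fitting could look like this (the helper name `mask_nodata` and the small `demo` frame are illustrative, not part of the original notebook):

```python
import numpy as np
import pandas as pd

NODATA = -9999.0  # assumed NoData sentinel; confirm against the source raster metadata


def mask_nodata(frame, columns, sentinel=NODATA):
    """Return a copy of `frame` with `sentinel` replaced by NaN in `columns`."""
    cleaned = frame.copy()
    cleaned[columns] = cleaned[columns].replace(sentinel, np.nan)
    return cleaned


# Small stand-in for df, built from values in the rows shown above
demo = pd.DataFrame({
    "near_stream": [602, 602, 603],
    "relative_elevation": [-9999.0, -9999.0, -9999.0],
    "ground_delta": [-0.838379, -0.500488, 0.287109],
})

demo_clean = mask_nodata(demo, ["relative_elevation"])

# Count the missing values per column after masking
print(demo_clean.isna().sum())
```

Masking to NaN keeps the rows in place, so the decision to drop or impute them can be made later and separately for each feature.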


# The Data Features

Descriptions for each feature are below.  Features were generated by a suite of geoprocessing methods, primarily using ESRI's ArcGIS Pro version 1.3.  
Links to publicly available data sources can be found at the end of this notebook.

    long_WGS84:         X coordinate of the cell in WORLD GEODETIC SYSTEM 1984 (WGS84) coordinate system  
    lat_WGS84:          Y coordinate of the cell in WORLD GEODETIC SYSTEM 1984 (WGS84) coordinate system   
    topo2011:           Elevation of cell from LiDAR flight performed in 2011 (ft)
    ground_slope:       Slope of the ground at each cell, averaged over a 2 cell radius (%)  
    ground_curve:       Planform curvature of the ground at each cell, averaged over a 2 cell radius (%)
    near_crossing:      Distance from the nearest bridge or culvert crossing  (ft)
    near_road:          Distance from the nearest roadway (ft)  
    near_stream:        Distance from the nearest stream (ft)  
    stream_slope:       Slope of the nearest stream over a 500 foot reach (%)
    relative_elevation: The vertical distance above the nearest stream for each cell (ft)  
    ground_delta:       The difference between 2011 pre-flood and 2013 post-flood elevations from LiDAR data (ft)  
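
Since `ground_delta` records how much each cell's elevation changed between the two LiDAR surveys, one way to turn it into a prediction target is a binary label marking cells that changed by more than some threshold. The sketch below assumes a 1 ft cutoff and a column named `changed`. Both are hypothetical choices for illustration, not values from the original analysis, and the last two rows of the demo frame are made up.

```python
import pandas as pd

CHANGE_THRESHOLD_FT = 1.0  # hypothetical cutoff, not taken from the original analysis


def label_changed_cells(frame, threshold=CHANGE_THRESHOLD_FT):
    """Add a boolean 'changed' column: True where |ground_delta| exceeds threshold."""
    labeled = frame.copy()
    labeled["changed"] = labeled["ground_delta"].abs() > threshold
    return labeled


# First two values come from the table above; the last two are invented for illustration
demo_delta = pd.DataFrame({"ground_delta": [-0.838379, 0.287109, -2.5, 1.75]})
demo_labeled = label_changed_cells(demo_delta)

print(demo_labeled["changed"].tolist())  # [False, False, True, True]
```

Taking the absolute value treats elevation gains and losses the same way. Keeping the sign instead would let the two directions of change be labeled separately.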

# Visualizing The Data

The South Saint Vrain study area contains over 900,000 points, each representing an individual 3ft by 3ft cell.  We can import, clean, fit, and predict on this number of points with relative ease.  However, to keep the visualizations fast, the dataset has been downsampled to approximately 0.5% of the total.  
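
A random sample drawn without a seed changes every time the notebook runs, so the plotted points shift between sessions. A minimal sketch of a reproducible downsample follows. The helper name `downsample`, the seed value, and the synthetic `demo_points` frame are assumptions for illustration; `frac=0.005` mirrors the value used in the plotting cell further down.

```python
import numpy as np
import pandas as pd


def downsample(frame, frac, seed=42):
    """Return a random subset of rows, drawn without replacement and reproducible via `seed`."""
    return frame.sample(frac=frac, replace=False, random_state=seed)


# Synthetic stand-in for the full dataset (coordinates are random, not real cells)
rng = np.random.default_rng(0)
demo_points = pd.DataFrame({
    "long_WGS84": rng.uniform(-105.30, -105.28, size=10_000),
    "lat_WGS84": rng.uniform(40.20, 40.22, size=10_000),
})

demo_sample = downsample(demo_points, frac=0.005)
print(len(demo_sample))  # 50 rows out of 10,000
```

Calling `downsample` again with the same seed returns the same rows, which makes figures comparable across runs.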

<b>EXPLORE THE MAP BELOW</b> by selecting the map widgets on the right side of the figures.  

Can you decipher any patterns in this data using the visualization? Are there any problems that jump out? How noisy is this data?  We'll explore these questions and others in the analysis.

In [2]:
# Google Mapping API:
# A Google Maps API key is required to render the map; keys are managed in the Google developers console.
# https://console.developers.google.com/home/activity?project=caramel-spot-199603

### Preparing the line geometry for plotting

In [64]:
import geopandas as gpd

# File path
SSV = r"/Users/Daniel/Documents/Programming/Project_Scripts/CMZ/shp/stvrain.shp"

# Read the data
SSV_river = gpd.read_file(SSV)

# Define function for capturing x and y coordinates
def getLineCoords(row, geom, coord_type):
    """Returns a list of coordinates ('x' or 'y') of a LineString geometry"""
    if coord_type == 'x':
        return list( row[geom].coords.xy[0] )
    elif coord_type == 'y':
        return list( row[geom].coords.xy[1] )
    
# Calculate x coordinates of the line
SSV_river['x'] = SSV_river.apply(getLineCoords, geom='geometry', coord_type='x', axis=1)

# Calculate y coordinates of the line
SSV_river['y'] = SSV_river.apply(getLineCoords, geom='geometry', coord_type='y', axis=1)

# Make a copy and drop the geometry column
SSV_river_df = SSV_river.drop('geometry', axis=1).copy()

# print the South Saint Vrain dataframe
SSV_river_df

| | OBJECTID | CommonName | DecreeName | Alias | FeatureTyp | RuleID | WaterSourc | Division | District | StructureI | ... | Comments | CreatedDat | CreatedBy | UpdatedDat | UpdatedBy | DataSource | Shape_STLe | ShapeSTLen | x | y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3991031 | South Saint Vrain Creek | S ST VRAIN CR MIN FLOW 3 | | Perennial Stream | 11 | ST VRAIN CREEK | 1 | 5 | 2129 | ... | | 2014-10-16T00:00:00.000Z | ndattels | 2014-10-16T00:00:00.000Z | ndattels | 2012 DRAPP | 54178.452168 | 53968.01484 | [-105.29008190112451, -105.28994501972471, -10... | [40.20883341918259, 40.20891264874306, 40.2090... |


### Setup Interactive Plot

In [94]:
# Visualize the data using bokeh and the Google Maps API

# Import modules
import os

from bokeh.io import show, output_notebook
from bokeh.plotting import ColumnDataSource, figure, gmap
from bokeh.layouts import row
from bokeh.models import GMapOptions, LinearColorMapper, ColorBar, BasicTicker, HoverTool
 
# Create a downsampled version of the full dataframe for plotting (avoids data limit restrictions)
df_sample = df.sample(frac=0.005, replace=False)

# Create ColumnDataSources for the sampled points and the river line
source = ColumnDataSource(df_sample)
river_source = ColumnDataSource(SSV_river_df)

# Set the mapping options, location and zoom level
map_options = GMapOptions(
    lat=np.mean(df['lat_WGS84']), 
    lng=np.mean(df['long_WGS84']),
    map_type="hybrid", zoom=14)

# Read the Google Maps API key from the environment instead of hardcoding it
# (GOOGLE_API_KEY is an assumed variable name; set it before running this cell)
google_api_key = os.environ["GOOGLE_API_KEY"]

# Create the google maps figure: p
p = gmap(
    google_api_key, 
    map_options, title="South Saint Vrain, 2011 FHZ prediction points (colored by elevation)", 
    tools='pan, wheel_zoom, box_select,lasso_select, reset, save',
    plot_width=900)

# Develop a color gradient for plotting, and color bar for legend
color_mapper = LinearColorMapper(
    palette='Plasma256',
    low=5350,
    high=df_sample['topo2011'].max())

# Use a linear ticker so the tick marks match the linear color mapper
color_bar = ColorBar(color_mapper=color_mapper, ticker=BasicTicker(), major_label_text_align='right',
                 label_standoff=15, border_line_color=None, location=(0,0))

# Add circle glyphs to figure p
p.circle(
    x="long_WGS84", 
    y="lat_WGS84", 
    size=8, 
    source=source, 
    color=dict(field='topo2011', transform=color_mapper), fill_alpha=0.4)

# Create a HoverTool object: hover
hover = HoverTool(tooltips=[
    ('topo2011', '@topo2011{(0.00)}'),
    ('ground_delta', '@ground_delta{0.00}')])

# Add the HoverTool object to figure p
p.add_tools(hover)

# Add the color bar to the left side of figure p
p.add_layout(color_bar, 'left')

# Add the river line to the map from the river_source ColumnDataSource
p.multi_line('x', 'y', source=river_source, color='cyan', line_width=3, legend="South Saint Vrain")

# Label the axes
p.xaxis.axis_label = 'longitude WGS84'
p.yaxis.axis_label = 'latitude WGS84'


# display the plot
output_notebook()
show(p)

# References:

### 2013 Flood and Fluvial Hazard Mapping References:  
https://www.denverpost.com/wp-content/uploads/2016/04/20131102__helifloodp1.jpg?w=600 : Photo credit 
http://mediacenter.dailycamera.com/2013/09/12/photos-massive-flash-flooding-along-front-range-of-colorado/#29 : Local news coverage 
http://coloradohazardmapping.com/hazardMapping/fluvialMapping : CWCB Fluvial Hazard Mapping Delineation Guide

### Technical References:   
https://developers.google.com/maps/documentation/javascript/maptypes  
https://developers.google.com/maps/documentation/javascript/basics  
https://bokeh.pydata.org/en/latest/docs/user_guide/geo.html  
http://bokeh.pydata.org/en/latest/docs/reference/palettes.html  
https://stackoverflow.com/questions/46060899/python-bokeh-histogram-adjusting-x-scale-and-chart-style : bokeh histograms  
https://automating-gis-processes.github.io/2016/Lesson5-interactive-map-bokeh.html : importing shp files to bokeh
https://github.com/bokeh/datashader/blob/master/datashader/utils.py : Convert WGS to Mercator coordinates  
https://medium.com/machine-learning-for-humans/why-machine-learning-matters-6164faf1df12 : Article on ML and AI  
https://console.developers.google.com/home/activity?project=caramel-spot-199603 : Google maps API
