**Extract Flow Accumulation at Bogs**

Adrian Wiegman

2023-08-12

-------

## Description

This notebook turns raw data from LiDAR DEM, NHD flowlines, and polygons of cranberry bogs, into a layer of polygons representing the topographic catchments draining into each cranberry bog in southeast Massachusetts. 

Even though the lidar data as been hydro flattened/enforced, there are still a number of flowpaths that are not detected underneath highways (e.g. interstate I-495, I-195)

The solution is to "burn" in stream flowlines from the national hydrography dataset. This is done by putting a buffer around the stream flowline network, then assigning an arbitary large value to the stream network polygon, then converting to raster and subtracting the stream elevations from the DEM. 

Once D8 flow direction and flow accumulation rasters have been made the primary objectives can be completed. 

Data Sources:

1. Mass GIS Lidar DEM (1ft vertical resolution, ~1m horizontal resolution)
2. USGS National Hydrography Dataset
- Flow lines
3. Cranberry bogs layer

Steps of Processing: 

1. Prepare Lidar
   - 1M Lidar Elevation -> Clip to study area -> 
   - Resample to 10m resolution using aggregate minimum cell value
   - Fill Sinks 

2. Prepare Flowlines
   - combine flowlines into one layer
   - dissolve flowlines
   - buffer flowlines to 3x the resolution of processed lidar. 
       - buffering width of 15m 
   - convert to raster
3. Burn in flow lines
    - assign a value of -100 feet to flow lines
    - add flowlines to existing elevation (subtract 100 feet)
    
4. fill sinks (again)
5. D8 Flow Direction 
6. Flow Accumulation
7. Generate Bog Pour Points
    - find maximum flow accumulation value inside each bog.
    - generate a point at each bog maximum value.
    - identity to get cranberry bog attributes at each point.
8. Delineate Basins for each point
    - with the bog cranberry points loop through each point
        - delineate watershed using the bog pour point and the D8 flow direction
        - save the output to a temp file with the feature ID ('FID') of the cranberry bog. 
    - merge all cranberry bog basins to one polygon layer, containing the FID of the cranberry bog. 

## Setup Environment

In [102]:
# iphython options
# delete variables in workspace
%reset -f
#places plots inline
%matplotlib inline
#automatically reloads modules if they are changed
%load_ext autoreload 
%autoreload 2
# this codeblock sets up the environment from jupyter notebooks
setup_notebook = "C:/Users/Adrian.Wiegman/Documents/GitHub/Wiegman_USDA_ARS/MEP/_Setup.ipynb"
%run $setup_notebook # magic command to run the notebook 

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
***
loading python modules...

  `module_list` contains names of all loaded modules

...module loading complete

***
loading user defined functions...

type `fn_`+TAB to for autocomplete suggestions

 the object `def_list` contains user defined function names:
   fn_get_info
   fn_arcgis_table_to_df
   fn_arcgis_table_to_np_to_pd_df
   fn_run_script_w_propy_bat
   fn_try_mkdir
   fn_hello
   fn_recursive_glob_search
   fn_regex_search_replace
   fn_regex_search_0
   fn_arcpy_table_to_excel
   fn_agg_sum_df_on_group
   fn_add_prefix_suffix_to_selected_cols
   fn_calc_pct_cover_within_groups
   fn_buildWhereClauseFromList

 use ??{insert fn name} to inspect
 for example running `??fn_get_info` returns:
[1;31mSignature:[0m [0mfn_get_info[0m[1;33m([0m[0mname[0m[1;33m=[0m[1;34m'fn_get_info'[0m[1;33m)[0m[1;33m[0m[1;33m[0m[0m
[1;31mSource:[0m   
[1;32mdef[0m [0mfn_get_info[0m[1;33

In [103]:
wdr = r"C:\Workspace\Geodata\Verify_Discharge"
print(wdr)

C:\Workspace\Geodata\Verify_Discharge


In [104]:
# Create new file geodatabase to store results\n",
gdb = "Verify_Discharge.gdb"

ap.env.workspace = os.path.join(wdr,gdb)

In [105]:
files = ["C:\Workspace\Geodata\MEP\gwbogsheds.gdb\FA_D8_gwe_bf",
"C:\Workspace\Geodata\MEP\gwbogsheds.gdb\FA_D8_gwe_bf_lt1m",
"C:\Workspace\Geodata\MEP\Default.gdb\LidAg10BF_FlowAcc",
"C:\Workspace\Geodata\MEP\gwbogsheds.gdb\FA_Dinf_gwe_bf",
"C:\Workspace\Geodata\MEP\gwbogsheds.gdb\FA_Dinf_gwe_bf_lt1m",
"C:\Workspace\Geodata\MEP\gwbogsheds.gdb\FA_MDF_gwe_bf"]
import re
names = [re.search("\.gdb\\\(.*)",x)[1] for x in files]

In [106]:
# make copy of bogs
arcpy.management.CopyFeatures(
    in_features=r"C:\Workspace\Geodata\MEP\gwbogsheds.gdb\bogs_split",
    out_feature_class=r"C:\Workspace\Geodata\Verify_Discharge\Verify_Discharge.gdb\bogs")

# aggregate polygons of bogs within 20 meters and sharing the same FID. 
arcpy.cartography.AggregatePolygons(
    in_features="bogs",
    out_feature_class=r"C:\Workspace\Geodata\MEP\Default.gdb\bogs_aggregated",
    aggregation_distance="20 Meters",
    minimum_area=None,
    minimum_hole_size="0 SquareMeters",
    orthogonality_option="NON_ORTHOGONAL",
    barrier_features=None,
    out_table=r"C:\Workspace\Geodata\MEP\Default.gdb\bogs_aggregated_Tbl",
    aggregate_field="ORIG_FID"
)

arcpy.analysis.Identity(
    in_features="bogs",
    identity_features="bogs_aggregated",
    out_feature_class=r"C:\Workspace\Geodata\MEP\Default.gdb\bogs_Identity",
    join_attributes="ONLY_FID",
    cluster_tolerance=None,
    relationship="NO_RELATIONSHIPS"
)

arcpy.management.Dissolve(
    in_features="bogs_Identity",
    out_feature_class=r"C:\Workspace\Geodata\MEP\Default.gdb\bogs_Identity_Dissolve",
    dissolve_field="FID_bogs_aggregated",
    statistics_fields=None,
    multi_part="MULTI_PART",
    unsplit_lines="DISSOLVE_LINES",
    concatenation_separator=""
)

In [107]:
in_feat = "bogs"
for i in range(len(files)):
    out_raster = arcpy.ia.ZonalStatistics(
        in_zone_data=in_feat,
        zone_field="OBJECTID",
        in_value_raster=files[i],
        statistics_type="MAXIMUM",
        ignore_nodata="DATA",
        process_as_multidimensional="CURRENT_SLICE",
        percentile_value=90,
        percentile_interpolation_type="AUTO_DETECT",
        circular_calculation="ARITHMETIC",
        circular_wrap_value=360)
    out_rast_name = "ZS_MAX_"+names[i]
    out_raster.save(out_rast_name)
    

In [108]:
# generate one point inside each bog
arcpy.management.FeatureToPoint(
    in_features="bogs",
    out_feature_class=r"bogs_points",
    point_location="INSIDE")

In [109]:
arcpy.management.CalculateGeometryAttributes(
    in_features="bogs_points",
    geometry_property="Lat POINT_X;Long POINT_Y",
    length_unit="",
    area_unit="",
    coordinate_system='PROJCS["NAD_1983_UTM_Zone_19N",GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",-69.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]]',
    coordinate_format="DD"
)

In [110]:
# generate string of rasters to 
long_names = ["ZS_MAX_"+n for n in names]
long_names
_ = ["{} {}".format(os.path.join(wdr,gdb,n),n) for n in long_names]
in_rasters = ";".join(_)
print(in_rasters)

C:\Workspace\Geodata\Verify_Discharge\Verify_Discharge.gdb\ZS_MAX_FA_D8_gwe_bf ZS_MAX_FA_D8_gwe_bf;C:\Workspace\Geodata\Verify_Discharge\Verify_Discharge.gdb\ZS_MAX_FA_D8_gwe_bf_lt1m ZS_MAX_FA_D8_gwe_bf_lt1m;C:\Workspace\Geodata\Verify_Discharge\Verify_Discharge.gdb\ZS_MAX_LidAg10BF_FlowAcc ZS_MAX_LidAg10BF_FlowAcc;C:\Workspace\Geodata\Verify_Discharge\Verify_Discharge.gdb\ZS_MAX_FA_Dinf_gwe_bf ZS_MAX_FA_Dinf_gwe_bf;C:\Workspace\Geodata\Verify_Discharge\Verify_Discharge.gdb\ZS_MAX_FA_Dinf_gwe_bf_lt1m ZS_MAX_FA_Dinf_gwe_bf_lt1m;C:\Workspace\Geodata\Verify_Discharge\Verify_Discharge.gdb\ZS_MAX_FA_MDF_gwe_bf ZS_MAX_FA_MDF_gwe_bf


In [111]:
arcpy.sa.ExtractMultiValuesToPoints(
    in_point_features="bogs_points",
    in_rasters=in_rasters,
    bilinear_interpolate_values="NONE")

FA = Flow accumulation (number of cells draining to a point)
w = width of cell in distance units (10 meters)
l = length of cell in distance units (10 meters)

A = FA * cell_size^2

cell size is the side length of the grid cells

A = FA * 100 m^2/cell

Q = A * r

where r is recharge rate 
27.25 in/yr 

m3/d =  27.25 in/yr * 2.54 cm/in * 1/100 m/cm *  1/365.25 yr/d



In [112]:
# Compute conversion factor to translate flow accumulation into Q
cell_size = 13.778565541850835
FA_to_Q = (cell_size)**2 * 27.25 * 2.54 * (1/100) * (1/365.25)
FA_to_Q

0.35976425532343725

In [114]:
#Get extract the cell size of raster
def fn_FA_to_Q (rasterpath=None,recharge_rate_in_yr = 27.25):
    _ = arcpy.GetRasterProperties_management(rasterpath, "CELLSIZEX")
    #Get the elevation standard deviation value from geoprocessing result object
    cellsize_x = _.getOutput(0)
    _ = arcpy.GetRasterProperties_management(rasterpath, "CELLSIZEY")
    cellsize_y = _.getOutput(0)
    # calculate cell area in meters
    cell_area_meters = float(cellsize_x) * float(cellsize_y)
    print(cell_area_meters)
    FA_to_Q = cell_area_meters * recharge_rate_in_yr * 2.54 * (1/100) * (1/365.25)
    print(FA_to_Q)
fn_FA_to_Q(rasterpath=os.path.join(wdr,gdb,"ZS_MAX_FA_D8_gwe_bf"))

100.0
0.1895003422313484


In [None]:
arcpy.management.SelectLayerByAttribute(
    in_layer_or_view=r"C:\Workspace\Geodata\Verify_Discharge\Verify_Discharge.gdb\df_Q_bogs_streams_XYTableToPoint",
    selection_type="NEW_SELECTION",
    where_clause="{} IS NOT NULL",
    invert_where_clause=None
)

In [118]:
# calculate flow for all flow accumulation layers
# r"C:\Workspace\Geodata\Verify_Discharge\Verify_Discharge.gdb\bogs_points"
in_table = os.path.join(wdr,gdb,"bogs_points")
for long_name in long_names:
    print(long_name)
    FA_to_Q = fn_FA_to_Q(rasterpath=os.path.join(wdr,gdb,long_name))
    _ = re.search("ZS_MAX_(.*)",long_name)
    short_name = re.sub("FA","",re.sub("_","",_[1]))
    print(short_name)
    arcpy.management.SelectLayerByAttribute(
        in_layer_or_view=in_table,
        selection_type="NEW_SELECTION",
        where_clause="{} IS NOT NULL".format(long_name))
    arcpy.management.CalculateField(
        in_table=in_table,
        field="Q_m3d_{}".format(short_name),
        expression="!{}!*{}".format(long_name,FA_to_Q),
        expression_type="PYTHON3",
        code_block="",
        field_type="DOUBLE",
        enforce_domains="NO_ENFORCE_DOMAINS")
    arcpy.management.SelectLayerByAttribute(
        in_layer_or_view=in_table,
        selection_type="CLEAR_SELECTION")

ZS_MAX_FA_D8_gwe_bf
100.0
0.1895003422313484
D8gwebf


Traceback (most recent call last):
  File "<expression>", line 1, in <module>
TypeError: unsupported operand type(s) for *: 'float' and 'NoneType'


ExecuteError: ERROR 000539: Traceback (most recent call last):
  File "<expression>", line 1, in <module>
TypeError: unsupported operand type(s) for *: 'float' and 'NoneType'

Failed to execute (CalculateField).


In [None]:



# Start new notebook for this. 10/24/2023
# RESUME HERE EXTRACT NITRATE DATA

In [23]:
arcpy.management.XYTableToPoint(
    in_table=r"C:\Users\Adrian.Wiegman\Documents\GitHub\Wiegman_USDA_ARS\MEP\data\df_NO3_merged_XY_rivers_streams.csv",
    out_feature_class=r"df_NO3_rivers_streams_merged_XY",
    x_field="Long",
    y_field="Lat")

In [24]:
arcpy.management.CalculateGeometryAttributes(
    in_features="df_NO3_rivers_streams_merged_XY",
    geometry_property="Lat POINT_X;Long POINT_Y",
    length_unit="",
    area_unit="",
    coordinate_system='PROJCS["NAD_1983_UTM_Zone_19N",GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",-69.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]]',
    coordinate_format="DD"
)

In [13]:
arcpy.management.CopyFeatures(
    in_features=r"C:\Workspace\Geodata\Massachusetts\MassGIS_PointData\CENSUS2010BLOCKS_POINT.shp",
    out_feature_class=r"CENSUS2010BLOCKS_POINT",)

In [14]:
arcpy.management.CalculateField(
    in_table="CENSUS2010BLOCKS_POINT",
    field="POP_km2",
    expression="!POP100_RE!/((!AREA_ACRES!)*0.00404686)",
    expression_type="PYTHON3",
    code_block="",
    field_type="FLOAT",
    enforce_domains="NO_ENFORCE_DOMAINS")

In [None]:
with arcpy.EnvManager(coincidentPoints="MEAN", outputCoordinateSystem='PROJCS["NAD_1983_UTM_Zone_19N",GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",-69.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]]', snapRaster="gw_elev_meters", extent='239786.1488 4558611.5999 422661.4389 4746900.7902 PROJCS["NAD_1983_UTM_Zone_19N",GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",-69.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]]', cellSize="gw_elev_meters", mask="gw_elev_meters"):
    arcpy.ga.EmpiricalBayesianKriging(
        in_features="df_NO3_rivers_streams_merged_XY",
        z_field="NO3",
        out_ga_layer="bayes_krig_NO3_rs",
        out_raster=r"C:\Workspace\Geodata\MEP\Default.gdb\bayes_krig_NO3_rs",
        cell_size=684.791010400005,
        transformation_type="NONE",
        max_local_points=100,
        overlap_factor=1,
        number_semivariograms=100,
        search_neighborhood="NBRTYPE=StandardCircular RADIUS=84261.4359376343 ANGLE=0 NBR_MAX=15 NBR_MIN=10 SECTOR_TYPE=ONE_SECTOR",
        output_type="PREDICTION",
        quantile_value=0.5,
        threshold_type="EXCEED",
        probability_threshold=None,
        semivariogram_model_type="POWER"
    )

In [None]:
with arcpy.EnvManager(coincidentPoints="MEAN", outputCoordinateSystem='PROJCS["NAD_1983_UTM_Zone_19N",GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",-69.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]]', snapRaster=None, extent='320794.421171825 4565485.30428723 424591.552503007 4681489.77085835 PROJCS["NAD_1983_UTM_Zone_19N",GEOGCS["GCS_North_American_1983",DATUM["D_North_American_1983",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",-69.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]]', cellSize="MAXOF", mask=None):
    arcpy.ga.EmpiricalBayesianKriging(
        in_features="df_NO3_rivers_streams_merged_XY",
        z_field="NO3",
        out_ga_layer="bayes_krig_NO3_rs_kb_100m_MA",
        out_raster=r"C:\Workspace\Geodata\MEP\Default.gdb\bayes_krig_NO3_rs_kb_100m_MA",
        cell_size=250,
        transformation_type="LOGEMPIRICAL",
        max_local_points=100,
        overlap_factor=1,
        number_semivariograms=100,
        search_neighborhood="NBRTYPE=StandardCircular RADIUS=84261.4359376343 ANGLE=0 NBR_MAX=15 NBR_MIN=5 SECTOR_TYPE=ONE_SECTOR",
        output_type="PREDICTION",
        quantile_value=0.5,
        threshold_type="EXCEED",
        probability_threshold=None,
        semivariogram_model_type="K_BESSEL"
    )

In [None]:
arcpy.sa.ExtractMultiValuesToPoints(
    in_point_features="df_NO3_rivers_streams_merged_XY",
    in_rasters="density_SW density_SW;density_GWD density_GWD;density_POP density_POP;density_HU density_HU;bayes_krig_NO3_rs_kb_100m bayes_krig_NO3_rs_kb_250m",
    bilinear_interpolate_values="NONE"
)

In [None]:
arcpy.sa.ExtractMultiValuesToPoints(
    in_point_features="bogs_points",
    in_rasters="density_SW density_SW;density_GWD density_GWD;density_POP density_POP;density_HU density_HU;bayes_krig_NO3_rs_kb_100m bayes_krig_NO3_rs_kb_250m",
    bilinear_interpolate_values="NONE"
)

In [None]:
arcpy.gapro.Forest(
    prediction_type="TRAIN_AND_PREDICT",
    in_features="df_NO3_rivers_streams_merged_XY",
    output_trained_features=r"C:\Workspace\Geodata\MEP\Default.gdb\NO3_RF_trained",
    variable_predict="NO3",
    treat_variable_as_categorical=None,
    explanatory_variables="density_SW false;density_GWD false;density_HU false;density_POP false;bayes_krig_NO3_rs_kb_250m false",
    features_to_predict="bogs_points",
    variable_of_importance=None,
    output_predicted=r"C:\Workspace\Geodata\MEP\Default.gdb\bog_points_predicted_NO3_RF",
    explanatory_variable_matching="density_SW density_SW;density_GWD density_GWD;density_HU density_HU;density_POP density_POP;bayes_krig_NO3_rs_kb_250m bayes_krig_NO3_rs_kb_250m; density_structure_SQ_FT density_structure_SQ_FT",
    number_of_trees=100,
    minimum_leaf_size=None,
    maximum_tree_depth=None,
    sample_size=100,
    random_variables=None,
    percentage_for_validation=20
)

In [16]:
# EXPORT TO EXCEL
arcpy.conversion.TableToExcel(
    Input_Table=r"C:\Workspace\Geodata\MEP\Default.gdb\NO3_RF_trained",
    Output_Excel_File=r"C:\Workspace\Geodata\Verify_Discharge\outputs\NO3_RF_trained_TableToExcel.xls",
    Use_field_alias_as_column_header="NAME",
    Use_domain_and_subtype_description="CODE"
)

arcpy.conversion.TableToExcel(
    Input_Table=r"C:\Workspace\Geodata\MEP\Default.gdb\bog_points_predicted_NO3_RF",
    Output_Excel_File=r"C:\Workspace\Geodata\Verify_Discharge\outputs\bog_points_predicted_NO3_RF_TableToExcel.xls",
    Use_field_alias_as_column_header="NAME",
    Use_domain_and_subtype_description="CODE"
)

ExecuteError: Failed to execute. Parameters are not valid.
ERROR 000732: Input Table: Dataset C:\Workspace\Geodata\MEP\Default.gdb\NO3_RF_trained does not exist or is not supported
WARNING 000725: Output Excel File (.xls or .xlsx): Dataset C:\Workspace\Geodata\Verify_Discharge\outputs\NO3_RF_trained_TableToExcel.xls already exists.
Failed to execute (TableToExcel).
