BARC updates to cap xsec area and allow user to choose input bankfull…

… geometry BARC updates to cap the bathy calculated xsec area in bathy_rc_adjust.py and allow user to choose input bankfull geometry. - Added new env variable to control which input file is used for the bankfull geometry input to bathy estimation workflow. - Modified the bathymetry cross section area calculation to cap the additional area value so that it cannot exceed the bankfull cross section area value for each stream segment (bankfull value obtained from regression equation dataset). - Modified the rating_curve_comparison.py plot output to always put the FIM rating curve on top of the USGS rating curve (avoids USGS points covering FIM). - Created a new aggregate csv file (aggregates for all hucs) for all of the usgs_elev_table.csv files (one per huc). - Evaluate the FIM Bathymetry Adjusted Rating Curve (BARC) tool performance using the estimated bankfull geometry dataset derived for the NWM route link dataset. This resolves #342, resolves #343, resolves #313, resolves #401, and resolves #402.
NOAA-OWP · Jun 4, 2021 · b9167d5 · b9167d5
1 parent ac25186
commit b9167d5
Show file tree

Hide file tree

Showing 6 changed files with 44 additions and 14 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,6 +2,18 @@ All notable changes to this project will be documented in this file.
 We follow the [Semantic Versioning 2.0.0](http://semver.org/) format.
 <br/><br/>
 
+## v3.0.17.0 - 2021-06-04 - PR #393
+BARC updates to cap the bathy calculated xsec area in `bathy_rc_adjust.py` and allow user to choose input bankfull geometry.
+
+## Changes
+
+- Added new env variable to control which input file is used for the bankfull geometry input to bathy estimation workflow.
+- Modified the bathymetry cross section area calculation to cap the additional area value so that it cannot exceed the bankfull cross section area value for each stream segment (bankfull value obtained from regression equation dataset).
+- Modified the `rating_curve_comparison.py` plot output to always put the FIM rating curve on top of the USGS rating curve (avoids USGS points covering FIM).
+- Created a new aggregate csv file (aggregates for all hucs) for all of the `usgs_elev_table.csv` files (one per huc).
+- Evaluate the FIM Bathymetry Adjusted Rating Curve (BARC) tool performance using the estimated bankfull geometry dataset derived for the NWM route link dataset.
+
+<br/><br/>
 ## v3.0.16.3 - 2021-05-21 - [PR #388](https://github.com/NOAA-OWP/cahaba/pull/388)
 
 Enhancement and bug fixes to `synthesize_test_cases.py`.
@@ -59,7 +71,7 @@ Remove Great Lakes coastlines from WBD buffer.
 Generate `nws_lid.gpkg`.
 
 ## Additions
-- Generate `nws_lid.gpkg` with attributes indicating if site is a headwater `nws_lid` as well as if it is co-located with another `nws_lid` which is referenced to the same `nwm_feature_id` segment. 
+- Generate `nws_lid.gpkg` with attributes indicating if site is a headwater `nws_lid` as well as if it is co-located with another `nws_lid` which is referenced to the same `nwm_feature_id` segment.
 
 <br/><br/>
 ## v3.0.15.8 - 2021-04-29 - [PR #371](https://github.com/NOAA-OWP/cahaba/pull/371)
@@ -69,7 +81,7 @@ Refactor NHDPlus HR preprocessing workflow. Resolves issue #238
 ## Changes
 - Consolidate NHD streams, NWM catchments, and headwaters MS and FR layers with `mainstem` column.
 - HUC8 intersections are included in the input headwaters layer.
-- `clip_vectors_to_wbd.py` removes incoming stream segment from the selected layers. 
+- `clip_vectors_to_wbd.py` removes incoming stream segment from the selected layers.
 
 <br/><br/>
 ## v3.0.15.7 - 2021-04-28 - [PR #367](https://github.com/NOAA-OWP/cahaba/pull/367)
@@ -124,7 +136,7 @@ Preprocess NHDPlus HR rasters for consistent projections, nodata values, and con
 <br/><br/>
 ## v3.0.15.2 - 2021-04-16 - [PR #359](https://github.com/NOAA-OWP/cahaba/pull/359)
 
-Hotfix to preserve desired files when production flag used in `fim_run.sh`. 
+Hotfix to preserve desired files when production flag used in `fim_run.sh`.
 
 ## Changes
 
@@ -133,17 +145,17 @@ Hotfix to preserve desired files when production flag used in `fim_run.sh`.
 <br/><br/>
 ## v3.0.15.1 - 2021-04-13 - [PR #355](https://github.com/NOAA-OWP/cahaba/pull/355)
 
-Sierra test considered all USGS gage locations to be mainstems even though many actually occurred with tributaries. This resulted in unrealistic comparisons as incorrect gages were assigned to mainstems segments. This feature branch identifies gages that are on mainstems via attribute field. 
+Sierra test considered all USGS gage locations to be mainstems even though many actually occurred with tributaries. This resulted in unrealistic comparisons as incorrect gages were assigned to mainstems segments. This feature branch identifies gages that are on mainstems via attribute field.
 
 ## Changes
 
-- Modifies `usgs_gage_crosswalk.py` to filter out gages from the `usgs_gages.gpkg` layer such that for a "MS" run, only consider gages that contain rating curve information (via `curve` attribute) and are also mainstems gages (via `mainstems` attribute). 
-- Modifies `usgs_gage_crosswalk.py` to filter out gages from the `usgs_gages.gpkg` layer such that for a "FR" run, only consider gages that contain rating curve information (via `curve` attribute) and are not mainstems gages (via `mainstems` attribute). 
-- Modifies how mainstems segments are determined by using the `nwm_flows_ms.gpkg` as a lookup to determine if the NWM segment specified by WRDS for a gage site is a mainstems gage. 
+- Modifies `usgs_gage_crosswalk.py` to filter out gages from the `usgs_gages.gpkg` layer such that for a "MS" run, only consider gages that contain rating curve information (via `curve` attribute) and are also mainstems gages (via `mainstems` attribute).
+- Modifies `usgs_gage_crosswalk.py` to filter out gages from the `usgs_gages.gpkg` layer such that for a "FR" run, only consider gages that contain rating curve information (via `curve` attribute) and are not mainstems gages (via `mainstems` attribute).
+- Modifies how mainstems segments are determined by using the `nwm_flows_ms.gpkg` as a lookup to determine if the NWM segment specified by WRDS for a gage site is a mainstems gage.
 
 ## Additions
 
-- Adds a `mainstem` attribute field to `usgs_gages.gpkg` that indicates whether a gage is located on a mainstems river. 
+- Adds a `mainstem` attribute field to `usgs_gages.gpkg` that indicates whether a gage is located on a mainstems river.
 - Adds `NWM_FLOWS_MS` variable to the `.env` and `.env.template` files.
 - Adds the `extent` argument specified by user when running `fim_run.sh` to `usgs_gage_crosswalk.py`.
 

diff --git a/config/params_calibrated.env b/config/params_calibrated.env
@@ -20,10 +20,12 @@ export min_catchment_area=0.25
 export min_stream_length=0.5
 
 #### bathy SRC estimation parameters ####
+export bankfull_input_table="/bathymetry/BANKFULL_CONUS.txt" # Option 1: Wieczorek (2018) bankfull geometry (Bieger 2015 regression)
+#export bankfull_input_table="/bathymetry/nwm_route_link_geom_BED.csv" # Option 2: NWM Route Link bankfull geometry (Blackburn-Lynch regression)
 export bathy_src_modification=True
 export surf_area_thalweg_ratio_flag=10
 export thalweg_stg_search_max_limit=3
-export bathy_xs_area_chg_flag=5
+export bathy_xs_area_chg_flag=1
 export bankful_xs_area_ratio_flag=10
 export thalweg_hyd_radius_flag=10
 

diff --git a/config/params_template.env b/config/params_template.env
@@ -20,10 +20,12 @@ export min_catchment_area=0.25
 export min_stream_length=0.5
 
 #### bathy SRC estimation parameters ####
+export bankfull_input_table="/bathymetry/BANKFULL_CONUS.txt" # Option 1: Wieczorek (2018) bankfull geometry (Bieger 2015 regression)
+#export bankfull_input_table="/bathymetry/nwm_route_link_geom_BED.csv" # Option 2: NWM Route Link bankfull geometry (Blackburn-Lynch regression)
 export bathy_src_modification=True
 export surf_area_thalweg_ratio_flag=10
 export thalweg_stg_search_max_limit=3
-export bathy_xs_area_chg_flag=5
+export bathy_xs_area_chg_flag=1
 export bankful_xs_area_ratio_flag=10
 export thalweg_hyd_radius_flag=10
 

diff --git a/src/bathy_rc_adjust.py b/src/bathy_rc_adjust.py
@@ -8,7 +8,7 @@
 sa_ratio_flag             = float(environ['surf_area_thalweg_ratio_flag']) #10x
 thal_stg_limit            = float(environ['thalweg_stg_search_max_limit']) #3m
 bankful_xs_ratio_flag     = float(environ['bankful_xs_area_ratio_flag']) #10x
-bathy_xsarea_flag         = float(environ['bathy_xs_area_chg_flag']) #5x
+bathy_xsarea_flag         = float(environ['bathy_xs_area_chg_flag']) #1x
 thal_hyd_radius_flag      = float(environ['thalweg_hyd_radius_flag']) #10x
 
 def bathy_rc_lookup(input_src_base,input_bathy_fileName,output_bathy_fileName,output_bathy_streamorder_fileName,output_bathy_thalweg_fileName,output_bathy_xs_lookup_fileName,):
@@ -117,7 +117,8 @@ def bathy_rc_lookup(input_src_base,input_bathy_fileName,output_bathy_fileName,ou
         ## Calculate the ratio btw the lookup SRC XS_Area and the Bankfull_XSEC_AREA --> use this as a flag for potentially bad XS data
         xs_area_hydroid_lookup['bankfull_XS_ratio_flag'] = (xs_area_hydroid_lookup['bathy_calc_xs_area'] / xs_area_hydroid_lookup['BANKFULL_XSEC_AREA (m2)'])
         ## Set bath_cal_xs_area to 0 if the bankfull_XS_ratio_flag is > threshold --> 5x (assuming too large of difference to be a reliable bankfull calculation)
-        xs_area_hydroid_lookup['bathy_calc_xs_area'].mask((xs_area_hydroid_lookup['bankfull_XS_ratio_flag']>bathy_xsarea_flag) | (xs_area_hydroid_lookup['bankfull_XS_ratio_flag'].isnull()),0,inplace=True)
+        xs_area_hydroid_lookup['bathy_calc_xs_area'].mask(xs_area_hydroid_lookup['bankfull_XS_ratio_flag']>bathy_xsarea_flag,xs_area_hydroid_lookup['BANKFULL_XSEC_AREA (m2)'],inplace=True)
+        xs_area_hydroid_lookup['bathy_calc_xs_area'].mask(xs_area_hydroid_lookup['bankfull_XS_ratio_flag'].isnull(),0,inplace=True)
 
         ## Merge bathy_calc_xs_area to the modified_src_base
         modified_src_base = modified_src_base.merge(xs_area_hydroid_lookup.loc[:,['HydroID','bathy_calc_xs_area']],how='left',on='HydroID')

diff --git a/src/run_by_unit.sh b/src/run_by_unit.sh
@@ -34,6 +34,7 @@ huc2Identifier=${hucNumber:0:2}
 input_NHD_WBHD_layer=WBDHU$hucUnitLength
 input_DEM=$inputDataDir/nhdplus_rasters/HRNHDPlusRasters"$huc4Identifier"/elev_m.tif
 input_NLD=$inputDataDir/nld_vectors/huc2_levee_lines/nld_preprocessed_"$huc2Identifier".gpkg
+input_bathy_bankfull=$inputDataDir/$bankfull_input_table
 
 # Define the landsea water body mask using either Great Lakes or Ocean polygon input #
 if [[ $huc2Identifier == "04" ]] ; then
@@ -400,7 +401,7 @@ echo -e $startDiv"Finalize catchments and model streams $hucNumber"$stopDiv outp
 date -u
 Tstart
 [ ! -f $outputHucDataDir/gw_catchments_reaches_filtered_addedAttributes_crosswalked.gpkg ] && \
-$srcDir/add_crosswalk.py -d $outputHucDataDir/gw_catchments_reaches_filtered_addedAttributes.gpkg -a $outputHucDataDir/demDerived_reaches_split_filtered.gpkg -s $outputHucDataDir/src_base.csv -u $inputDataDir/bathymetry/BANKFULL_CONUS.txt -v $outputHucDataDir/bathy_crosswalk_calcs.csv -e $outputHucDataDir/bathy_stream_order_calcs.csv -g $outputHucDataDir/bathy_thalweg_flag.csv -i $outputHucDataDir/bathy_xs_area_hydroid_lookup.csv -l $outputHucDataDir/gw_catchments_reaches_filtered_addedAttributes_crosswalked.gpkg -f $outputHucDataDir/demDerived_reaches_split_filtered_addedAttributes_crosswalked.gpkg -r $outputHucDataDir/src_full_crosswalked.csv -j $outputHucDataDir/src.json -x $outputHucDataDir/crosswalk_table.csv -t $outputHucDataDir/hydroTable.csv -w $outputHucDataDir/wbd8_clp.gpkg -b $outputHucDataDir/nwm_subset_streams.gpkg -y $outputHucDataDir/nwm_catchments_proj_subset.tif -m $manning_n -z $input_nwm_catchments -p $extent -k $outputHucDataDir/small_segments.csv
+$srcDir/add_crosswalk.py -d $outputHucDataDir/gw_catchments_reaches_filtered_addedAttributes.gpkg -a $outputHucDataDir/demDerived_reaches_split_filtered.gpkg -s $outputHucDataDir/src_base.csv -u $input_bathy_bankfull -v $outputHucDataDir/bathy_crosswalk_calcs.csv -e $outputHucDataDir/bathy_stream_order_calcs.csv -g $outputHucDataDir/bathy_thalweg_flag.csv -i $outputHucDataDir/bathy_xs_area_hydroid_lookup.csv -l $outputHucDataDir/gw_catchments_reaches_filtered_addedAttributes_crosswalked.gpkg -f $outputHucDataDir/demDerived_reaches_split_filtered_addedAttributes_crosswalked.gpkg -r $outputHucDataDir/src_full_crosswalked.csv -j $outputHucDataDir/src.json -x $outputHucDataDir/crosswalk_table.csv -t $outputHucDataDir/hydroTable.csv -w $outputHucDataDir/wbd8_clp.gpkg -b $outputHucDataDir/nwm_subset_streams.gpkg -y $outputHucDataDir/nwm_catchments_proj_subset.tif -m $manning_n -z $input_nwm_catchments -p $extent -k $outputHucDataDir/small_segments.csv
 Tcount
 
 ## USGS CROSSWALK ##

diff --git a/tools/rating_curve_comparison.py b/tools/rating_curve_comparison.py
@@ -287,7 +287,7 @@ def generate_facet_plot(rc, plot_filename):
         columns = 1
 
     sns.set(style="ticks")
-    g = sns.FacetGrid(rc, col="USGS Gage", hue="source",sharex=False, sharey=False,col_wrap=columns)
+    g = sns.FacetGrid(rc, col="USGS Gage", hue="source", hue_order=['USGS','FIM'], sharex=False, sharey=False,col_wrap=columns)
     g.map(sns.scatterplot, "discharge_cfs", "elevation_ft", palette="tab20c", marker="o")
     g.set_axis_labels(x_var="Discharge (cfs)", y_var="Elevation (ft)")
 
@@ -414,6 +414,7 @@ def calculate_rc_stats_elev(rc,stat_groups=None):
     log_file = open(join(output_dir,'rating_curve_comparison.log'),"w")
     sys.stdout = log_file
 
+    merged_elev_table = []
     huc_list  = os.listdir(fim_dir)
     for huc in huc_list:
 
@@ -426,6 +427,17 @@ def calculate_rc_stats_elev(rc,stat_groups=None):
 
             if isfile(elev_table_filename):
                 procs_list.append([elev_table_filename, hydrotable_filename, usgs_gages_filename, usgs_recurr_stats_filename, nwm_recurr_data_filename, rc_comparison_plot_filename,nwm_flow_dir, catfim_flows_filename, huc])
+                # Aggregate all of the individual huc elev_tables into one aggregate for accessing all data in one csv
+                read_elev_table = pd.read_csv(elev_table_filename)
+                read_elev_table['huc'] = huc
+                merged_elev_table.append(read_elev_table)
+
+    # Output a concatenated elev_table to_csv
+    if merged_elev_table:
+        print(f"Creating aggregate elev table csv")
+        concat_elev_table = pd.concat(merged_elev_table)
+        concat_elev_table['thal_burn_depth_meters'] = concat_elev_table['dem_elevation'] - concat_elev_table['dem_adj_elevation']
+        concat_elev_table.to_csv(join(output_dir,'agg_usgs_elev_table.csv'),index=False)
 
     # Initiate multiprocessing
     print(f"Generating rating curve metrics for {len(procs_list)} hucs using {number_of_jobs} jobs")