[1pt] Tool to compare synthetic rating curve with benchmark rating curve (sierra test) (#332)

* `usgs_gage_crosswalk.py`: generates `usgs_elev_table.csv` in `run_by_unit.py` with elevation at USGS gages
* `rating_curve_comparison.py`: post-processing script to plot and calculate metrics between FIM and USGS rating curve data
* updates `aggregate_fim_outputs.py` call argument in `fim_run.sh` from 4 jobs to 6 jobs (optimizing API performance)
* reroutes median elevation data from `add_crosswalk.py` and `rem.py` to a new file (deprecating `hand_ref_elev_table.csv`)
* adds new files to `viz_whitelist` in `output_cleanup.py`
BrianAvant committed Apr 1, 2021
1 parent 4df6d1d commit 44b6333
Showing 8 changed files with 590 additions and 31 deletions.
42 changes: 28 additions & 14 deletions CHANGELOG.md
@@ -1,68 +1,82 @@
All notable changes to this project will be documented in this file.
We follow the [Semantic Versioning 2.0.0](http://semver.org/) format.

## v3.0.12.2 - 2021-04-01 - [PR #332](https://github.com/NOAA-OWP/cahaba/pull/332)

Created tool to compare synthetic rating curves with benchmark rating curves (sierra test). Resolves issues #293 and #308.

### Changes
- Updates `aggregate_fim_outputs.py` call argument in `fim_run.sh` from 4 jobs to 6 jobs (optimizing API performance).
- Reroutes median elevation data from `add_crosswalk.py` and `rem.py` to a new file (deprecating `hand_ref_elev_table.csv`).
- Adds new files to `viz_whitelist` in `output_cleanup.py`.

### Additions
- `usgs_gage_crosswalk.py`: generates `usgs_elev_table.csv` in `run_by_unit.py` with elevation and additional attributes at USGS gages.
- `rating_curve_comparison.py`: post-processing script to plot and calculate metrics between synthetic rating curves and USGS rating curve data.
<br/><br/>
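The comparison logic of `rating_curve_comparison.py` is not shown in this diff; a minimal sketch of the underlying idea — interpolating the USGS stage at each synthetic flow and summarizing the differences — might look like the following (all names hypothetical, not the script's actual API):

```python
def interp_stage(flow, curve):
    """Linearly interpolate stage at `flow` from a rating curve given
    as a list of (discharge, stage) pairs sorted by discharge."""
    for (q0, s0), (q1, s1) in zip(curve, curve[1:]):
        if q0 <= flow <= q1:
            t = (flow - q0) / (q1 - q0)
            return s0 + t * (s1 - s0)
    raise ValueError("flow outside the curve's range")


def mean_abs_stage_diff(synthetic, usgs):
    """Mean absolute stage difference, evaluated at the synthetic
    curve's flows that fall inside the USGS curve's range."""
    diffs = [abs(stage - interp_stage(q, usgs))
             for q, stage in synthetic
             if usgs[0][0] <= q <= usgs[-1][0]]
    return sum(diffs) / len(diffs)
```

The real script computes several metrics; this sketch only illustrates the interpolate-then-difference pattern common to all of them.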

## v3.0.12.1 - 2021-03-31 - [PR #336](https://github.com/NOAA-OWP/cahaba/pull/336)

Fix spatial option in `eval_plots.py` when creating plots and spatial outputs.

### Changes
- Removes file dependencies from the spatial option. It still requires the WBD layer, which should be specified in the `.env` file.
- Produces outputs in a format consistent with requirements needed for publishing.
- Preserves leading zeros in huc information for all outputs from `eval_plots.py`.
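Preserving leading zeros generally means treating HUC codes as zero-padded strings rather than integers; a minimal sketch (the helper name is hypothetical, not from `eval_plots.py`):

```python
def format_huc(huc, digits=8):
    """Render a HUC code as a zero-padded string; an integer
    round-trip (e.g. through a CSV read) drops the leading zero."""
    return str(huc).zfill(digits)
```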

### Additions
- Creates `fim_performance_points.shp`: this layer consists of all evaluated AHPS points (with metrics). Spatial data retrieved from WRDS on the fly.
- Creates `fim_performance_polys.shp`: this layer consists of all evaluated HUC8s (with metrics). Spatial data retrieved from WBD layer.
<br/><br/>

## v3.0.12.0 - 2021-03-26 - [PR #327](https://github.com/NOAA-OWP/cahaba/pull/327)

Add more detail/information to plotting capabilities.

### Changes
- Merge `plot_functions.py` into `eval_plots.py` and move `eval_plots.py` into the tools directory.
- Remove `plots` subdirectory.
### Additions
- Optional argument to create barplots of CSI for each individual site.
- Create a csv containing the data used to create the scatterplots.
<br/><br/>

## v3.0.11.0 - 2021-03-22 - [PR #319](https://github.com/NOAA-OWP/cahaba/pull/319)

Improvements to CatFIM service source data generation.

### Changes
- Renamed `generate_categorical_fim.py` to `generate_categorical_fim_mapping.py`.
- Updated the status outputs of the `nws_lid_sites` layer and saved it in the same directory as the merged `catfim_library` layer.
- Additional stability fixes (such as improved compatibility with WRDS updates).

### Additions
- Added `generate_categorical_fim.py` to wrap `generate_categorical_fim_flows.py` and `generate_categorical_fim_mapping.py`.
- Create new `nws_lid_sites` shapefile located in same directory as the `catfim_library` shapefile.

<br/><br/>

## v3.0.10.1 - 2021-03-24 - [PR #320](https://github.com/NOAA-OWP/cahaba/pull/320)

Patch to `synthesize_test_cases.py`.

### Changes
- Bug fix to `synthesize_test_cases.py` to allow comparison between `testing` version and `official` versions.

<br/><br/>

## v3.0.10.0 - 2021-03-12 - [PR #298](https://github.com/NOAA-OWP/cahaba/pull/298)

Preprocessing of flow files for Categorical FIM.

### Additions
- Generate Categorical FIM flow files for each category (action, minor, moderate, major).
- Generate point shapefile of Categorical FIM sites.
- Generate csv of attribute data in shapefile.
- Aggregate all shapefiles and csv files into one file in parent directory.
- Add flood of record category.

### Changes
- Stability fixes to `generate_categorical_fim.py`.

<br/><br/>

## v3.0.9.0 - 2021-03-12 - [PR #297](https://github.com/NOAA-OWP/cahaba/pull/297)
2 changes: 1 addition & 1 deletion fim_run.sh
@@ -152,5 +152,5 @@ fi
echo "$viz"
if [[ "$viz" -eq 1 ]]; then
# aggregate outputs
python3 /foss_fim/src/aggregate_fim_outputs.py -d $outputRunDataDir -j 4
time python3 /foss_fim/src/aggregate_fim_outputs.py -d $outputRunDataDir -j 6
fi
3 changes: 1 addition & 2 deletions src/add_crosswalk.py
@@ -220,9 +220,8 @@ def add_crosswalk(input_catchments_fileName,input_flows_fileName,input_srcbase_f
output_hydro_table = output_hydro_table.merge(input_huc.loc[:,[FIM_ID,'HUC8']],how='left',on=FIM_ID)

if output_flows.HydroID.dtype != 'str': output_flows.HydroID = output_flows.HydroID.astype(str)
output_hydro_table = output_hydro_table.merge(output_flows.loc[:,['HydroID','LakeID','Median_Thal_Elev_m']],how='left',on='HydroID')
output_hydro_table = output_hydro_table.merge(output_flows.loc[:,['HydroID','LakeID']],how='left',on='HydroID')
output_hydro_table['LakeID'] = output_hydro_table['LakeID'].astype(int)
output_hydro_table['Median_Thal_Elev_m'] = output_hydro_table['Median_Thal_Elev_m'].astype(float).round(2)
output_hydro_table = output_hydro_table.rename(columns={'HUC8':'HUC'})
if output_hydro_table.HUC.dtype != 'str': output_hydro_table.HUC = output_hydro_table.HUC.astype(str)

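The dtype-guard-then-merge pattern in the `add_crosswalk.py` hunk above is easy to get wrong (pandas merges silently produce NaNs when key dtypes differ). A minimal sketch with hypothetical data, reusing the column names from the diff:

```python
import pandas as pd

flows = pd.DataFrame({"HydroID": [1, 2], "LakeID": [-999, 4]})
hydro_table = pd.DataFrame({"HydroID": ["1", "1", "2"],
                            "stage": [0.0, 0.3, 0.0]})

# Merge keys must share a dtype; cast HydroID to str before merging,
# mirroring the `if output_flows.HydroID.dtype != 'str'` guard above
if flows.HydroID.dtype != "object":
    flows.HydroID = flows.HydroID.astype(str)

hydro_table = hydro_table.merge(flows[["HydroID", "LakeID"]],
                                how="left", on="HydroID")
hydro_table["LakeID"] = hydro_table["LakeID"].astype(int)
```

Without the cast, every row of the merged `LakeID` column would be NaN and the `astype(int)` call would raise.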
9 changes: 7 additions & 2 deletions src/output_cleanup.py
@@ -31,15 +31,20 @@ def output_cleanup(huc_number, output_folder_path, additional_whitelist, is_prod
'gw_catchments_reaches_filtered_addedAttributes.tif',
'hydroTable.csv',
'src.json',
'small_segments.csv'
'small_segments.csv',
'usgs_elev_table.csv',
'hand_ref_elev_table.csv'
]

# List of files that will be saved during a viz run
viz_whitelist = [
'rem_zeroed_masked.tif',
'gw_catchments_reaches_filtered_addedAttributes_crosswalked.gpkg',
'demDerived_reaches_split_filtered_addedAttributes_crosswalked.gpkg',
'gw_catchments_reaches_filtered_addedAttributes.tif',
'hydroTable.csv',
'src.json'
'src.json',
'small_segments.csv'
]

# If "production" run, only keep whitelisted files
23 changes: 12 additions & 11 deletions src/rem.py
@@ -11,7 +11,7 @@
from utils.shared_functions import getDriver


def rel_dem(dem_fileName, pixel_watersheds_fileName, rem_fileName, thalweg_raster, hydroid_fileName, hand_ref_elev_fileName, dem_reaches_filename):
def rel_dem(dem_fileName, pixel_watersheds_fileName, rem_fileName, thalweg_raster, hydroid_fileName, dem_reaches_filename):
"""
Calculates REM/HAND/Detrended DEM
@@ -25,8 +25,6 @@ def rel_dem(dem_fileName, pixel_watersheds_fileName, rem_fileName, thalweg_raste
File name of output relative elevation raster.
hydroid_fileName : str
File name of the hydroid raster (i.e. gw_catchments_reaches.tif)
hand_ref_elev_fileName
File name of the output csv containing list of hydroid values and HAND zero/reference elev
dem_reaches_filename
File name of the reaches layer to populate HAND elevation attribute values and overwrite as output
@@ -108,20 +106,25 @@ def make_catchment_min_dict(flat_dem, catchment_min_dict, flat_catchments, thalw
gw_catchments_pixels_masked_object.close()
thalweg_raster_object.close()

###############################################
# Merge and export dictionary to to_csv
catchment_min_dict_df = pd.DataFrame.from_dict(catchment_min_dict, orient='index') # convert dict to dataframe
catchment_min_dict_df.columns = ['Median_Thal_Elev_m']
catchment_hydroid_dict_df = pd.DataFrame.from_dict(catchment_hydroid_dict, orient='index') # convert dict to dataframe
catchment_hydroid_dict_df.columns = ['HydroID']
merge_df = catchment_hydroid_dict_df.merge(catchment_min_dict_df, left_index=True, right_index=True)
merge_df.index.name = 'pixelcatch_id'
merge_df.to_csv(hand_ref_elev_fileName,index=True) # export dataframe to csv file

# Merge the HAND reference elvation by HydroID dataframe with the demDerived_reaches layer (add new layer attribute)
merge_df = merge_df.groupby(['HydroID']).median() # median value of all Median_Thal_Elev_m for pixel catchments in each HydroID reach
# Merge the HAND reference elevation by HydroID dataframe with the demDerived_reaches layer (add new layer attribute)
min_by_hydroid = merge_df.groupby(['HydroID']).min() # min value of all med_thal_elev for pixel catchments in each HydroID reach
min_by_hydroid.columns = ['min_thal_elev']
med_by_hydroid = merge_df.groupby(['HydroID']).median() # median value of all med_thal_elev for pixel catchments in each HydroID reach
med_by_hydroid.columns = ['med_thal_elev']
max_by_hydroid = merge_df.groupby(['HydroID']).max() # max value of all med_thal_elev for pixel catchments in each HydroID reach
max_by_hydroid.columns = ['max_thal_elev']
input_reaches = gpd.read_file(dem_reaches_filename)
input_reaches = input_reaches.merge(merge_df, on='HydroID') # merge dataframes by HydroID variable
input_reaches = input_reaches.merge(min_by_hydroid, on='HydroID') # merge dataframes by HydroID variable
input_reaches = input_reaches.merge(med_by_hydroid, on='HydroID') # merge dataframes by HydroID variable
input_reaches = input_reaches.merge(max_by_hydroid, on='HydroID') # merge dataframes by HydroID variable
input_reaches.to_file(dem_reaches_filename,driver=getDriver(dem_reaches_filename),index=False)
# ------------------------------------------------------------------------------------------------------------------------ #

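The three separate groupby/merge passes in the hunk above (min, median, max of the pixel-catchment elevations per reach) can be expressed in one pass with named aggregation; a sketch with hypothetical data, not a drop-in replacement for `rem.py`:

```python
import pandas as pd

merge_df = pd.DataFrame({
    "HydroID": [1, 1, 2],
    "Median_Thal_Elev_m": [10.0, 12.0, 8.0],
})

# One groupby with named aggregation instead of three groupby calls
stats = (merge_df.groupby("HydroID")["Median_Thal_Elev_m"]
         .agg(min_thal_elev="min",
              med_thal_elev="median",
              max_thal_elev="max")
         .reset_index())
```

A single merge of `stats` onto the reaches layer would then replace the three `input_reaches.merge(...)` calls.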
@@ -171,7 +174,6 @@ def calculate_rem(flat_dem,catchmentMinDict,flat_catchments,ndv):
parser.add_argument('-t','--thalweg-raster',help='A binary raster representing the thalweg. 1 for thalweg, 0 for non-thalweg.',required=True)
parser.add_argument('-o','--rem',help='Output REM raster',required=True)
parser.add_argument('-i','--hydroid', help='HydroID raster to use within project path', required=True)
parser.add_argument('-r','--hand_ref_elev_table',help='Output table of HAND reference elev by catchment',required=True)
parser.add_argument('-s','--dem_reaches_in_out',help='DEM derived reach layer to join HAND reference elevation attribute',required=True)


@@ -184,7 +186,6 @@ def calculate_rem(flat_dem,catchmentMinDict,flat_catchments,ndv):
rem_fileName = args['rem']
thalweg_raster = args['thalweg_raster']
hydroid_fileName = args['hydroid']
hand_ref_elev_fileName = args['hand_ref_elev_table']
dem_reaches_filename = args['dem_reaches_in_out']

rel_dem(dem_fileName, pixel_watersheds_fileName, rem_fileName, thalweg_raster, hydroid_fileName, hand_ref_elev_fileName, dem_reaches_filename)
rel_dem(dem_fileName, pixel_watersheds_fileName, rem_fileName, thalweg_raster, hydroid_fileName, dem_reaches_filename)
10 changes: 9 additions & 1 deletion src/run_by_unit.sh
@@ -321,7 +321,7 @@ echo -e $startDiv"D8 REM $hucNumber"$stopDiv
date -u
Tstart
[ ! -f $outputHucDataDir/rem.tif ] && \
$srcDir/rem.py -d $dem_thalwegCond -w $outputHucDataDir/gw_catchments_pixels.tif -o $outputHucDataDir/rem.tif -t $demDerived_streamPixels -i $outputHucDataDir/gw_catchments_reaches.tif -r $outputHucDataDir/hand_ref_elev_table.csv -s $outputHucDataDir/demDerived_reaches_split.gpkg
$srcDir/rem.py -d $dem_thalwegCond -w $outputHucDataDir/gw_catchments_pixels.tif -o $outputHucDataDir/rem.tif -t $demDerived_streamPixels -i $outputHucDataDir/gw_catchments_reaches.tif -s $outputHucDataDir/demDerived_reaches_split.gpkg
Tcount

## DINF DISTANCE DOWN ##
@@ -432,6 +432,14 @@ Tstart
$srcDir/add_crosswalk.py -d $outputHucDataDir/gw_catchments_reaches_filtered_addedAttributes.gpkg -a $outputHucDataDir/demDerived_reaches_split_filtered.gpkg -s $outputHucDataDir/src_base.csv -l $outputHucDataDir/gw_catchments_reaches_filtered_addedAttributes_crosswalked.gpkg -f $outputHucDataDir/demDerived_reaches_split_filtered_addedAttributes_crosswalked.gpkg -r $outputHucDataDir/src_full_crosswalked.csv -j $outputHucDataDir/src.json -x $outputHucDataDir/crosswalk_table.csv -t $outputHucDataDir/hydroTable.csv -w $outputHucDataDir/wbd8_clp.gpkg -b $outputHucDataDir/nwm_subset_streams.gpkg -y $outputHucDataDir/nwm_catchments_proj_subset.tif -m $manning_n -z $input_NWM_Catchments -p $extent -k $outputHucDataDir/small_segments.csv
Tcount


## USGS CROSSWALK ##
echo -e $startDiv"USGS Crosswalk $hucNumber"$stopDiv
date -u
Tstart
$srcDir/usgs_gage_crosswalk.py -gages $inputDataDir/usgs_gages/usgs_gages.gpkg -dem $outputHucDataDir/dem_meters.tif -flows $outputHucDataDir/demDerived_reaches_split_filtered_addedAttributes_crosswalked.gpkg -cat $outputHucDataDir/gw_catchments_reaches_filtered_addedAttributes_crosswalked.gpkg -wbd $outputHucDataDir/wbd_buffered.gpkg -dem_adj $dem_thalwegCond -outtable $outputHucDataDir/usgs_elev_table.csv
Tcount

## CLEANUP OUTPUTS ##
echo -e $startDiv"Cleaning up outputs $hucNumber"$stopDiv
args=()
124 changes: 124 additions & 0 deletions src/usgs_gage_crosswalk.py
@@ -0,0 +1,124 @@
#!/usr/bin/env python3

import os
import geopandas as gpd
import pandas as pd
import numpy as np
import rasterio
import argparse
import pygeos
from shapely.wkb import dumps, loads
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

''' Get elevation at adjusted USGS gage locations
Parameters
----------
usgs_gages_filename : str
File name of USGS stations layer.
dem_filename : str
File name of original DEM.
input_flows_filename : str
File name of FIM streams layer.
input_catchment_filename : str
File name of FIM catchment layer.
wbd_buffer_filename : str
File name of buffered wbd.
dem_adj_filename : str
File name of thalweg adjusted DEM.
output_table_filename : str
File name of output table.
'''


def crosswalk_usgs_gage(usgs_gages_filename,dem_filename,input_flows_filename,input_catchment_filename,wbd_buffer_filename,dem_adj_filename,output_table_filename):

wbd_buffer = gpd.read_file(wbd_buffer_filename)
usgs_gages = gpd.read_file(usgs_gages_filename, mask=wbd_buffer)
dem_m = rasterio.open(dem_filename,'r')
input_flows = gpd.read_file(input_flows_filename)
input_catchment = gpd.read_file(input_catchment_filename)
dem_adj = rasterio.open(dem_adj_filename,'r')

if input_flows.HydroID.dtype != 'int': input_flows.HydroID = input_flows.HydroID.astype(int)

# Identify closest HydroID
closest_catchment = gpd.sjoin(usgs_gages, input_catchment, how='left', op='within').reset_index(drop=True)
closest_hydro_id = closest_catchment.filter(items=['site_no','HydroID','min_thal_elev','med_thal_elev','max_thal_elev', 'order_'])
closest_hydro_id = closest_hydro_id.dropna()

# Get USGS gages that are within catchment boundaries
usgs_gages = usgs_gages.loc[usgs_gages.site_no.isin(list(closest_hydro_id.site_no))]

columns = ['location_id','HydroID','dem_elevation','dem_adj_elevation','min_thal_elev', 'med_thal_elev','max_thal_elev','str_order']
gage_data = []

# Move USGS gage to stream
for index, gage in usgs_gages.iterrows():

# Get stream attributes
hydro_id = closest_hydro_id.loc[closest_hydro_id.site_no==gage.site_no].HydroID.item()
str_order = str(int(closest_hydro_id.loc[closest_hydro_id.site_no==gage.site_no].order_.item()))
min_thal_elev = round(closest_hydro_id.loc[closest_hydro_id.site_no==gage.site_no].min_thal_elev.item(),2)
med_thal_elev = round(closest_hydro_id.loc[closest_hydro_id.site_no==gage.site_no].med_thal_elev.item(),2)
max_thal_elev = round(closest_hydro_id.loc[closest_hydro_id.site_no==gage.site_no].max_thal_elev.item(),2)

# Convert headwater point geometries to WKB representation
wkb_gages = dumps(gage.geometry)

# Create pygeos headwater point geometries from WKB representation
gage_bin_geom = pygeos.io.from_wkb(wkb_gages)

# Closest segment to headwater
closest_stream = input_flows.loc[input_flows.HydroID==hydro_id]
wkb_closest_stream = dumps(closest_stream.geometry.item())
stream_bin_geom = pygeos.io.from_wkb(wkb_closest_stream)

# Linear reference headwater to closest stream segment
gage_distance_to_line = pygeos.linear.line_locate_point(stream_bin_geom, gage_bin_geom)
referenced_gage = pygeos.linear.line_interpolate_point(stream_bin_geom, gage_distance_to_line)

# Convert geometries to wkb representation
bin_referenced_gage = pygeos.io.to_wkb(referenced_gage)

# Convert to shapely geometries
shply_referenced_gage = loads(bin_referenced_gage)

# Sample rasters at adjusted gage
dem_m_elev = round(list(rasterio.sample.sample_gen(dem_m,shply_referenced_gage.coords))[0].item(),2)
dem_adj_elev = round(list(rasterio.sample.sample_gen(dem_adj,shply_referenced_gage.coords))[0].item(),2)

# Append dem_m_elev, dem_adj_elev, hydro_id, and gage number to table
site_elevations = [str(gage.site_no), str(hydro_id), dem_m_elev, dem_adj_elev, min_thal_elev, med_thal_elev, max_thal_elev,str(str_order)]
gage_data.append(site_elevations)


elev_table = pd.DataFrame(gage_data, columns=columns)

if not elev_table.empty:
elev_table.to_csv(output_table_filename,index=False)


if __name__ == '__main__':

parser = argparse.ArgumentParser(description='Crosswalk USGS sites to HydroID and get elevations')
parser.add_argument('-gages','--usgs-gages-filename', help='USGS gages', required=True)
parser.add_argument('-dem','--dem-filename',help='DEM',required=True)
parser.add_argument('-flows','--input-flows-filename', help='DEM derived streams', required=True)
parser.add_argument('-cat','--input-catchment-filename', help='DEM derived catchments', required=True)
parser.add_argument('-wbd','--wbd-buffer-filename', help='WBD buffer', required=True)
parser.add_argument('-dem_adj','--dem-adj-filename', help='Thalweg adjusted DEM', required=True)
parser.add_argument('-outtable','--output-table-filename', help='Table to append data', required=True)

args = vars(parser.parse_args())

usgs_gages_filename = args['usgs_gages_filename']
dem_filename = args['dem_filename']
input_flows_filename = args['input_flows_filename']
input_catchment_filename = args['input_catchment_filename']
wbd_buffer_filename = args['wbd_buffer_filename']
dem_adj_filename = args['dem_adj_filename']
output_table_filename = args['output_table_filename']

crosswalk_usgs_gage(usgs_gages_filename,dem_filename,input_flows_filename,input_catchment_filename,wbd_buffer_filename, dem_adj_filename,output_table_filename)
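The pygeos `line_locate_point`/`line_interpolate_point` pair in the loop above snaps each gage onto its stream segment. For a single straight segment, that operation is a plain point-to-segment projection; a stdlib sketch of the geometry (not the script's code, which handles full polylines):

```python
def snap_to_segment(p, a, b):
    """Project point p onto segment a-b and return the snapped point,
    clamped to the segment's endpoints."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment
        return a
    # Projection parameter along a-b, clamped to [0, 1]
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)
```

`line_locate_point` generalizes the clamped `t` to a distance along a multi-vertex line, and `line_interpolate_point` maps that distance back to coordinates.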
