## Automated StreamStats API Scraper

__Description__: Tool to automatically run the [USGS StreamStats tool](https://www.usgs.gov/mission-areas/water-resources/science/streamstats-streamflow-statistics-and-spatial-analysis-tools?qt-science_center_objects=0#qt-science_center_objects) for multiple points within a catchment and return the flow frequency curves and subcatchment boundaries.

__Input__: A shapefile of points (confluence and main stem locations) located on the stream grid for the specified state. The notebook indexes the points by a `num` attribute, which the shapefile must contain.

__Output__: A GeoJSON file containing the delineated catchment boundary and flow frequency data for each point, and a CSV file containing the flow frequency data.

*Authors*: sputnam@Dewberry.com & slawler@Dewberry.com

### Load libraries and Python options:

In [1]:
import os
import sys
sys.path.append('../USGStools')
from StreamStats_API_Scraper import *
import geopandas as gpd
from geojson import dump

### Specify the state abbreviation and location of the shapefile: 

In [2]:
state='NY' #The state abbreviation in uppercase

path=r'C:\Users\sputnam\Documents\GitHub\usgs-tools\StreamStats\results\SalmonCreek' #Specify the location of the shapefile containing the lat/lon of points on the stream grid
name='Confluences.shp' #The name of the shapefile
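
A shapefile is a set of files that share a base name, so a missing sidecar file is a common reason for the load step to fail. The following is a minimal sketch of a pre-check, not part of `StreamStats_API_Scraper`; the helper name `missing_shapefile_parts` is hypothetical.

```python
from pathlib import Path

# Files that must accompany a .shp file. The .prj file is optional in the
# format, but without it the coordinate reference system is unknown.
REQUIRED_SUFFIXES = ('.shp', '.shx', '.dbf')
RECOMMENDED_SUFFIXES = ('.prj',)


def missing_shapefile_parts(shp_path):
    """Return (missing_required, missing_recommended) lists of suffixes."""
    base = Path(shp_path)
    missing_required = [s for s in REQUIRED_SUFFIXES
                        if not base.with_suffix(s).exists()]
    missing_recommended = [s for s in RECOMMENDED_SUFFIXES
                           if not base.with_suffix(s).exists()]
    return missing_required, missing_recommended
```

With the variables defined in this cell, the check could be called as `missing_shapefile_parts(os.path.join(path, name))`; two empty lists mean all four files were found.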

##### Load the shapefile:

In [3]:
use_epsg='4326' #Specify a consistent coordinate reference system

gdf=gpd.read_file(os.path.join(path, name)) #Read the shapefile as a geopandas dataframe
gdf=gdf.set_index('num').copy(deep=True) #Set the index to the confluence number

gdf=gdf.to_crs(epsg=int(use_epsg)) #Transform the coordinate reference system of the geodataframe (the {'init': ...} form is deprecated in recent geopandas/pyproj)

geom=gdf.geometry #Extract the shapely geometry for the outlets in the shapefile

print(geom.head(2))

num
0    POINT (-74.54546735718218 44.99859678269318)
1    POINT (-74.54704371705137 44.99176157571224)
Name: geometry, dtype: object
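
After reprojecting to EPSG:4326, the x value of each point is a longitude and the y value is a latitude, both in degrees. As a rough sanity check, the sketch below flags coordinates outside the valid degree ranges, which usually means the points are still in a projected system (metres or feet). The helper name `out_of_range_points` is hypothetical, and the check does not detect swapped latitude and longitude.

```python
def out_of_range_points(points):
    """Return the (index, lon, lat) entries that are not valid degrees.

    points: iterable of (index, lon, lat) tuples.
    """
    bad = []
    for idx, lon, lat in points:
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            bad.append((idx, lon, lat))
    return bad


# The two outlets printed above, as (num, lon, lat)
outlets = [
    (0, -74.54546735718218, 44.99859678269318),
    (1, -74.54704371705137, 44.99176157571224),
]
assert out_of_range_points(outlets) == []
```

For the full geometry series, the same check could be run as `out_of_range_points((i, p.x, p.y) for i, p in geom.items())`.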


### Run the API tool for each point:

In [None]:
pp_dic = {} #Dictionary to store the outlet flow frequency data dictionaries

watershed_poly_dic= {} #Dictionary to store the catchment polygons (catchment boundaries)

pp_fail=[] #List to store outlet locations whose flow frequency/catchment polygons were not calculated

pp_dic, watershed_poly_dic=snappoint_analysis(geom, state, status=True) #Run the snappoint function for all catchment outlets within the shapefile and for the specified state. Option: set status=False to hide print statements

-74.54546735718218 44.998596782693184
Fetched Peak Flows
-74.54704371705137 44.99176157571224
Fetched Peak Flows
-74.5534966416888 44.993947452052396
Fetched Peak Flows
-74.54824694774688 44.98393464881284
Fetched Peak Flows
-74.57079689624692 44.98771242150379
Line 28: Expecting value: line 1 column 1 (char 0
while loop: watershed_data count: 1
Fetched Peak Flows
-74.55104795706318 44.98259540136499
Fetched Peak Flows
-74.54016232617454 44.979761483211284
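
The message `Expecting value: line 1 column 1 (char 0` in the log above is the error Python's `json` module raises when a response body is empty or is not JSON, and the following `while loop` line suggests the request was retried. The sketch below shows one generic way to retry such a parse. It is not the implementation used inside `snappoint_analysis`; `parse_with_retry` and `fetch_text` are hypothetical names.

```python
import json
import time


def parse_with_retry(fetch_text, max_attempts=3, wait_seconds=0.0):
    """Call fetch_text() until its result parses as JSON.

    fetch_text is a hypothetical zero-argument callable that returns the
    raw response body as a string. Returns the parsed object, or None if
    every attempt fails, so the caller can record the outlet as failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return json.loads(fetch_text())
        except json.JSONDecodeError as err:
            print('attempt {0} failed: {1}'.format(attempt, err))
            if attempt < max_attempts:
                time.sleep(wait_seconds)
    return None
```

A `None` result is the kind of case the `pp_fail` list declared in the cell above is meant to hold: the caller could append the outlet's index to it and continue with the next point.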


### Construct a summary table of the flow frequency data for each outlet:

In [None]:
ffdata=ff_summary(pp_dic) #Run this function to construct the summary table for all outlet locations

ffdata.head()    
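
The structure of `pp_dic` and the exact layout returned by `ff_summary` are defined in `StreamStats_API_Scraper` and are not shown here. Purely as an illustration of what a summary table for several outlets involves, the sketch below assumes each outlet maps recurrence intervals to flows and pivots that into one row per recurrence interval and one column per outlet. The function name, the `RI` header, and the numbers are all made up.

```python
def summary_rows(curves):
    """Pivot {outlet_id: {recurrence_interval: flow}} into table rows.

    Returns (header, rows): one row per recurrence interval and one
    column per outlet. Values missing for an outlet are left as None.
    """
    outlet_ids = sorted(curves)
    intervals = sorted({ri for curve in curves.values() for ri in curve})
    header = ['RI'] + outlet_ids
    rows = [[ri] + [curves[o].get(ri) for o in outlet_ids]
            for ri in intervals]
    return header, rows


# Made-up numbers for illustration only; these are not StreamStats results
example = {0: {2: 100.0, 10: 250.0}, 1: {2: 80.0}}
header, rows = summary_rows(example)
```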

### Save the results:

##### As a CSV:

In [None]:
ffdata.to_csv(os.path.join(path,'StreamStats_FlowFrequency.csv')) #Save the results as a csv

##### As a geojson:

In [None]:
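for i in pp_dic.keys():
    watershed_poly_dic[i]['features'][0]['ffcurve']=pp_dic[i] #Attach the flow frequency data to the first feature of each outlet

with open(os.path.join(path,'StreamStats_Polygons.geojson'), 'w') as f:
    dump(watershed_poly_dic, f) #Write the catchment polygons and flow frequency data to a single file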
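
The file written above is a JSON object keyed by outlet, with `ffcurve` stored as an extra member on each feature. Some GIS software may not open that layout directly, because GeoJSON readers typically expect a single `FeatureCollection` at the top level and attributes under each feature's `properties` member. The sketch below shows one possible way to produce that layout. It assumes, as the cell above does, that each entry of `watershed_poly_dic` has a `features` list whose first item is the catchment feature; `merge_feature_collections` and the output file name are hypothetical.

```python
import json


def merge_feature_collections(polygons, curves):
    """Combine per-outlet feature collections into one FeatureCollection.

    polygons: {outlet_id: {'features': [feature, ...], ...}}
    curves:   {outlet_id: flow frequency data for that outlet}
    The first feature of each collection is copied, and the outlet ID and
    curve are stored under its 'properties' member. Inputs are not modified.
    """
    features = []
    for outlet_id, collection in polygons.items():
        feature = dict(collection['features'][0])  # shallow copy
        props = dict(feature.get('properties') or {})
        props['outlet_id'] = outlet_id
        props['ffcurve'] = curves.get(outlet_id)
        feature['properties'] = props
        features.append(feature)
    return {'type': 'FeatureCollection', 'features': features}


# Usage with the names from the cells above (hypothetical file name):
# with open(os.path.join(path, 'StreamStats_Merged.geojson'), 'w') as f:
#     json.dump(merge_feature_collections(watershed_poly_dic, pp_dic), f)
```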

# END