## StreamStats API Scraper Automatic

__Description__: Tool to automatically run the [USGS StreamStats tool](https://www.usgs.gov/mission-areas/water-resources/science/streamstats-streamflow-statistics-and-spatial-analysis-tools?qt-science_center_objects=0#qt-science_center_objects) for multiple points within a catchment and return the flow frequency curves and subcatchment boundaries.

__Input__: A shapefile containing the latitude and longitude of points on the stream grid for the specified state (confluence and main stem locations).

__Output__: GeoJSON file containing the delineated catchment boundary and flow frequency data for each point, as well as a CSV file containing the flow frequency data.

*Authors*: sputnam@Dewberry.com & slawler@Dewberry.com

### Load libraries and Python options:

In [1]:
import os
import re
import sys
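sys.path.append('../USGStools') #Make the local USGStools folder importable
from StreamStats_API_Scraper import * #Expected to provide SS_scrape, get_peaks, load_all_results, and ff_summary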
import geopandas as gpd
from geojson import dump

### Specify the state abbreviation and location of the shapefile: 

##### Specify:

In [2]:
state='NY' #The state abbreviation in uppercase

path=r'C:\Users\sputnam\Documents\GitHub\usgs-tools\StreamStats\results\04150303' #Specify the location of the shapefile containing the lat/lon of points on the stream grid

name='04150303_Confluences_Scoped.shp' #The name of the shapefile

use_epsg='4326' #Specify a consistent coordinate reference system

allresults=os.path.join(path,'AllStreamStats') #Location to save the StreamStats results for each polygon

os.makedirs(allresults, exist_ok=True) #Create the results directory if it does not already exist

##### Load the shapefile:

In [3]:
gdf=gpd.read_file(os.path.join(path, name)) #Read the shapefile as a geopandas dataframe

gdf=gdf.set_index('num').copy(deep=True) #Set the index to the confluence number

gdf=gdf.to_crs(epsg=int(use_epsg)) #Transform the coordinate reference system of the geodataframe (the {'init': ...} form is deprecated)

geom=gdf.geometry #Extract the shapely geometry for the outlets in the shapefile

print(geom.head(2))

num
0    POINT (-75.47503709579541 44.62285051746835)
1    POINT (-75.48063360291154 44.61364452521327)
Name: geometry, dtype: object


### Run the API tool for each point:

In [None]:
polyg={} #Dictionary to store the catchment polygons (catchment boundaries) 

ffdata={} #Dictionary to store the outlet flow frequency data dictionaries

get_flow=True
print_status=True

if state=='WI': get_flow=False 

start=296 #Position of the first point to process. Set to 0 unless resuming after an interrupted run
for j, xy in enumerate(geom[start:], start=start): #Loop over the outlet points; j is the position in gdf
    lon, lat = xy.x, xy.y #Longitude and latitude for each shapely point
    if print_status: print("Lat/Lon/Index:", lat, lon, j)
    polyg[j], ff_json = SS_scrape(state, lon, lat, use_epsg, print_status) #Run the SS_scrape function. Option: set print_status=False above to hide print statements
    if get_flow:
        ffdata[j]=get_peaks(ff_json) #Extract the flow frequency data from the returned json
        polyg[j]['features'][0]['ffcurve']=ffdata[j] #Attach the flow frequency data to the catchment feature
    with open(os.path.join(allresults,'StreamStats_Polygons_{0}.geojson'.format(gdf.index[j])), 'w') as f:
        dump(polyg[j], f) #Save the result for this point

Lat/Lon/Index: 44.38826996361243 -75.58500088589655 296
Fetched Peak Flows
Lat/Lon/Index: 44.553540387226015 -75.45042505997195 297
Fetched Peak Flows
Lat/Lon/Index: 44.55291171124255 -75.45004249996724 298
Fetched Peak Flows
Lat/Lon/Index: 44.54156376458063 -75.45121385985014 299
Line 28: Expecting value: line 1 column 1 (char 0
while loop: watershed_data count: 1
Fetched Peak Flows
Lat/Lon/Index: 44.54183234326578 -75.45159357341575 300
Fetched Peak Flows
Lat/Lon/Index: 44.444189300797106 -75.52712223866489 301
Fetched Peak Flows
Lat/Lon/Index: 44.41606538094875 -75.74379708621055 302
Fetched Peak Flows
Lat/Lon/Index: 44.54663419492093 -75.44382551563056 303
Fetched Peak Flows
Lat/Lon/Index: 44.51274451554173 -75.4760290255593 304
Fetched Peak Flows
Lat/Lon/Index: 44.55992807826284 -75.45148168989718 305
Fetched Peak Flows
Lat/Lon/Index: 44.555166759539944 -75.44892670438938 306
Fetched Peak Flows
Lat/Lon/Index: 44.547559683296576 -75.4372861621211 307
Fetched Peak Flows
Lat/Lon/Inde
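
The `start` value above has to be edited by hand after an interrupted run. As a minimal sketch of an alternative, the positions still to be processed can be derived from the files already saved in `allresults`. The helper name `pending_positions` and the throwaway demo directory are hypothetical and not part of `StreamStats_API_Scraper`; only the file-name pattern is taken from the loop above.

```python
import os
import tempfile

# Same file-name pattern as the loop above.
TEMPLATE = 'StreamStats_Polygons_{0}.geojson'

def pending_positions(ids, results_dir, template=TEMPLATE):
    """Return the positions in `ids` that have no saved result file yet."""
    return [j for j, point_id in enumerate(ids)
            if not os.path.isfile(os.path.join(results_dir, template.format(point_id)))]

# Demonstration with a throwaway directory standing in for `allresults`.
with tempfile.TemporaryDirectory() as demo_dir:
    for done_id in (0, 1):  # pretend points 0 and 1 were already processed
        with open(os.path.join(demo_dir, TEMPLATE.format(done_id)), 'w') as f:
            f.write('{}')
    todo = pending_positions([0, 1, 2, 3], demo_dir)

print(todo)  # [2, 3]
```

In the notebook, `pending_positions(list(gdf.index), allresults)` would give the positions to loop over. Note that `ffdata` is reset at the top of the cell, so it only holds the points fetched in the current session.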

### Load the results:

In [None]:
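import pandas as pd

poly_files=load_all_results(allresults) #List of paths to the saved geojson files

rows=[] #List to collect the catchment polygon from each file

for filename in poly_files:
    num=re.findall(r'\d+', filename) #All runs of digits in the path; the last one is the outlet ID
    temp_df=gpd.read_file(filename)
    temp_df['ID_Num']=int(num[-1])
    rows.append(temp_df.iloc[[0]]) #Keep only the first feature

gdf=gpd.GeoDataFrame(pd.concat(rows, ignore_index=True), crs='epsg:{}'.format(use_epsg)) #DataFrame.append was removed in pandas 2.0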
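
The loop above takes the last run of digits in each path as the outlet ID. Because the directory names also contain digits (for example `04150303`), it may be safer to search only the file name. The following is a small sketch of that idea; `id_from_filename` is a hypothetical helper, and the example path simply reuses names that appear earlier in this notebook.

```python
import os
import re

def id_from_filename(filename):
    """Return the last run of digits in the file name (not the whole path) as an int."""
    digits = re.findall(r'\d+', os.path.basename(filename))
    if not digits:
        raise ValueError('No digits found in {0!r}'.format(filename))
    return int(digits[-1])

# Example path assembled from names used earlier in this notebook.
example = os.path.join('results', '04150303', 'AllStreamStats',
                       'StreamStats_Polygons_296.geojson')
point_id = id_from_filename(example)
print(point_id)  # 296
```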

### Save:

##### The flow frequency data as a CSV:

In [None]:
if get_flow:
    ffdata_df=ff_summary(ffdata) #Construct the summary table for all outlet locations
    ffdata_df.to_csv(os.path.join(path,'StreamStats_FlowFrequency.csv')) #Save the results as a csv
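
The layout that `ff_summary` produces is not shown in this notebook. As an illustration only, the sketch below assumes that each outlet maps recurrence intervals to peak flows and writes one row per recurrence interval and one column per outlet. The structure of `demo_ffdata`, the helper `ff_table`, the `RI` column name, and all numbers are placeholders, not StreamStats output.

```python
import csv
import io

# Hypothetical structure: {outlet position: {recurrence interval in years: peak flow}}.
# The numbers are placeholders for illustration, not StreamStats results.
demo_ffdata = {
    296: {2: 10.0, 10: 20.0, 100: 40.0},
    297: {2: 5.0, 10: 12.5, 100: 30.0},
}

def ff_table(ffdata):
    """Return a header and rows: one row per recurrence interval, one column per outlet."""
    outlets = sorted(ffdata)
    intervals = sorted({ri for curve in ffdata.values() for ri in curve})
    header = ['RI'] + outlets
    # Leave a cell empty when an outlet has no value for that interval.
    rows = [[ri] + [ffdata[o].get(ri, '') for o in outlets] for ri in intervals]
    return header, rows

header, rows = ff_table(demo_ffdata)

# Write to an in-memory buffer; a real run would open a file instead.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator='\n')
writer.writerow(header)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```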

##### The catchment polygons as a Shapefile:

In [None]:
gdf.to_file(filename = os.path.join(path,'StreamStats_Polygons.shp')) #Export the geodataframe as a shapefile

##### The catchment polygons as a geojson:

In [None]:
with open(os.path.join(path,'StreamStats_Polygons.geojson'), 'w') as f:
    dump(gdf, f)
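
For readers without geopandas, the combined file can also be assembled directly from the per-point GeoJSON dictionaries. The sketch below uses only the standard `json` module and mirrors the steps above: take the first feature of each result and tag it with `ID_Num`. The function name `merge_first_features` and the unit-square geometry are hypothetical stand-ins.

```python
import json

def merge_first_features(collections):
    """Build one FeatureCollection from the first feature of each (id, collection) pair."""
    features = []
    for point_id, collection in collections:
        feature = dict(collection['features'][0])      # shallow copy of the first feature
        props = dict(feature.get('properties') or {})  # copy so the input is not modified
        props['ID_Num'] = point_id
        feature['properties'] = props
        features.append(feature)
    return {'type': 'FeatureCollection', 'features': features}

# Tiny placeholder polygon standing in for a delineated catchment.
square = {'type': 'Polygon',
          'coordinates': [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]}
demo = [(i, {'type': 'FeatureCollection',
             'features': [{'type': 'Feature', 'geometry': square, 'properties': {}}]})
        for i in (296, 297)]

merged = merge_first_features(demo)
merged_text = json.dumps(merged)  # text that could be written to a .geojson file
```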

# END