## Identify Points for StreamStats

__Description__: Tool to identify confluence pair points for tributaries of a specific length, add points to the main stem of a stream network at a specific distance interval, and export a shapefile of the points. For additional details, see the [StreamStats Automation Wiki](https://github.com/Dewberry/usgs-tools/wiki/StreamStats-Automation).

__Input__: A stream grid from the [StreamStats Repository](https://streamstatsags.cr.usgs.gov/StreamGrids/directoryBrowsing.asp), masked using `ClipRaster_withMask.ipynb`, and the latitude and longitude of the catchment outlet.

__Output__: A shapefile containing the latitude and longitude of points (confluence and main stem locations) that will be used as the input to `StreamStats_API_Scraper_Auto.ipynb`.


*Authors*: sputnam@Dewberry.com & slawler@Dewberry.com

### Load libraries and Python options:

In [1]:
import os
import collections 
import pandas as pd
import numpy as np
import geopandas as gpd
from osgeo import gdal, ogr, osr
from shapely.geometry import Point
from StreamStats_Points import *

### Load the masked stream grid:

##### Specify:

In [2]:
path=r'C:\Users\sputnam\Documents\GitHub\usgs-tools\results\Rock_Creek.tif' #Load the stream grid raster which was masked by the catchment polygon

##### Load:

In [3]:
sg = StreamGrid(path) #Open the stream grid raster and create an object

crs=sg.crs_value() #Extract the coordinate reference system value (epsg) for the raster
print("epsg:",crs) 

df = sg.dataframe() #Create a dataframe from the stream grid data
df.replace(255, 0, inplace=True) #Replace 255 with 0, where 255 corresponds to the non-stream cells
df.head(n=2) 

epsg: 5070


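|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 1205 | 1206 | 1207 | 1208 | 1209 | 1210 | 1211 | 1212 | 1213 | 1214 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |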


### Specify the pour point to set the start location of the search:

##### Specify:

In [4]:
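lat=1925315.186 #y-coordinate ("latitude") of the pour point at the catchment outlet, in the raster's coordinate reference system
lon=1616784.964 #x-coordinate ("longitude") of the pour point, in the raster's coordinate reference system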

##### Convert the lat/lon to row/column in the stream grid dataframe and extract the cell size:

In [5]:
pix_x, pix_y = coord2index(sg, lat, lon) #Transform the lat and lon values to the row/column location within the stream grid dataframe
pourpoint=[(pix_x, pix_y)] #Add these values to a list as a tuple
print("Pourpoint XY:", pourpoint)

cellsize=sg.cell_size() #Raster cell size in meters
print("The Cell Size:", cellsize)

Pourpoint XY: [(877, 1848)]
The Cell Size: 10.0
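
`coord2index` is imported from `StreamStats_Points`, and its implementation is not shown here. As a rough illustration of the idea only, the sketch below assumes a north-up raster described by a GDAL-style geotransform. The function name `coord_to_index` and the geotransform values are hypothetical and are not taken from the Rock Creek grid.

```python
def coord_to_index(geotransform, x, y):
    """Map a projected (x, y) coordinate to a (row, col) raster index.

    geotransform follows the GDAL convention for a north-up raster:
    (origin_x, pixel_width, 0, origin_y, 0, pixel_height), where
    pixel_height is negative because rows increase southward.
    """
    origin_x, pixel_width, _, origin_y, _, pixel_height = geotransform
    col = int((x - origin_x) / pixel_width)
    row = int((y - origin_y) / pixel_height)
    return row, col


# Hypothetical 10 m grid whose upper-left corner sits at (1000.0, 5000.0)
gt = (1000.0, 10.0, 0.0, 5000.0, 0.0, -10.0)
row, col = coord_to_index(gt, 1025.0, 4965.0)
print(row, col)
```

The point lies 25 m east and 35 m south of the upper-left corner, so with 10 m cells it falls in the cell at row 3, column 2.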


### Move up the stream and identify the confluences:

##### Specify parameters and initialize objects:

In [6]:
nogo=[] #Empty list to store the stream cells that we do not want to return to since we have already searched them
confluence_pairs=[] #Empty list to store the identified confluence pairs
save_confluence=[] #Empty list to store the location of confluences that are three cells away from the original confluence location
cnum=0 #The confluence number or ID
count=0 #Counting variable. The number of times we have looped over the while loop below

starting_point=pourpoint[0]+(cnum,) #The starting point of the stream network where we want to start searching for confluences. Add the confluence number

nogo.append(starting_point) #Add the starting point to the no go list

##### Identify confluences:

In [7]:
while len(starting_point)>0:
    count+=1
    cnum=count
    
    next_cell=MoveUpstream(df, starting_point, nogo, cnum)  
    
    if len(next_cell) == 1:
        nogo.append(next_cell[0])
        starting_point = next_cell[0]
        
    else:
        if len(next_cell)>1:
            nogo=nogo+next_cell
            confluence_pairs=confluence_pairs+next_cell
        if len(confluence_pairs)>0:
            starting_point=confluence_pairs[0]
            confluence_pairs.remove(starting_point)
            cnum=starting_point[2]
            
            i=0
            while i<2:
                next_cell=MoveUpstream(df, starting_point, nogo, cnum)
                
                if len(next_cell) == 1:
                    nogo.append(next_cell[0])
                    starting_point = next_cell[0]
                    i+=1
                    continue
                elif len(next_cell)>1:
                    confluence_pairs=confluence_pairs+next_cell
                    i=2
                else:
                    i=2
                
            if len(next_cell) == 1:
                save_confluence.append(starting_point)
        else:
            starting_point=[]
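
The loop above relies on `MoveUpstream`, which is defined in `StreamStats_Points` and not reproduced here. A minimal sketch of the kind of neighbor search it appears to perform is given below, assuming an 8-cell neighborhood, a grid where 1 marks a stream cell, and `(row, col)` tuples without a confluence number. The name `upstream_neighbors` and the toy grid are hypothetical.

```python
def upstream_neighbors(grid, cell, visited):
    """Return the stream cells adjacent to `cell` that have not been visited.

    grid is a list of rows where 1 marks a stream cell and 0 anything else;
    cell is a (row, col) tuple; visited is a set of (row, col) tuples.
    """
    row, col = cell
    found = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue  # skip the cell itself
            r, c = row + d_row, col + d_col
            inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
            if inside and grid[r][c] == 1 and (r, c) not in visited:
                found.append((r, c))
    return found


# Toy Y-shaped network with the outlet at the bottom center
grid = [
    [1, 0, 1],
    [0, 1, 0],
    [0, 1, 0],
]
visited = {(2, 1)}
step = upstream_neighbors(grid, (2, 1), visited)  # one neighbor: keep walking
visited.update(step)
fork = upstream_neighbors(grid, (1, 1), visited)  # two neighbors: a confluence
```

One unvisited neighbor means the walk continues along the same reach, two or more mark a candidate confluence, and none marks the head of a stream.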

### Remove superfluous confluences:

In [8]:
true_confluence=[] #Empty list to store the true confluences, i.e., those that are not just two stream cells next to each other
confl_num=[] #List to store the extracted confluence numbers

for cell in save_confluence: #For each stream cell, add the confluence number to a list
    confl_num.append(cell[2])

for cell in save_confluence: #For each stream cell, if two or more stream cells share its confluence number (i.e., it is a confluence), add it to the true_confluence list
    if confl_num.count(cell[2])>=2:
        true_confluence.append(cell)
        
false_confluence=list(set(save_confluence)-set(true_confluence)) #List of cells that were identified as confluences but do not have tributaries

print("All Points:", len(save_confluence), "True Confluences:", len(true_confluence))    

All Points: 614 True Confluences: 475
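
Calling `list.count` inside the loop rescans the whole list for every cell. As an alternative sketch, `collections.Counter` can tally the confluence numbers once. This assumes each cell is a three-element tuple whose last element is the confluence number, as in the cell above; the function name `split_confluences` and the sample data are hypothetical.

```python
from collections import Counter


def split_confluences(cells):
    """Split (x, y, cnum) tuples by whether their cnum occurs at least twice."""
    counts = Counter(cnum for _, _, cnum in cells)
    paired = [cell for cell in cells if counts[cell[2]] >= 2]
    unpaired = [cell for cell in cells if counts[cell[2]] < 2]
    return paired, unpaired


# Confluence 1 has two saved cells, confluence 2 only one
cells = [(10, 4, 1), (12, 4, 1), (20, 9, 2)]
paired, unpaired = split_confluences(cells)
```

Here `paired` plays the role of `true_confluence` and `unpaired` the role of `false_confluence`.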


##### Identify the original superfluous confluence location:

In [10]:
false_cnum=[]
false_points=[]

for cell in false_confluence:
    false_cnum.append(cell[2])

for cell in nogo:
    if cell[2] in false_cnum:
        false_points.append(cell)
        
false_points=list(set(false_points)-set(false_confluence))        

### Calculate the tributary length:

##### Specify parameters and initialize objects:

In [11]:
tributary=[] 
mainstem=[]

walk_confluence=true_confluence.copy() #Copy the true_confluence list, since we want to walk upstream of these to calculate the length to the end of the trib or next confluence
nogo=[walk_confluence[0]]
starting_point=walk_confluence[0] #Assign the first confluence point to the starting_point

total_dis=0.0 #The total distance from the confluence point
count=1
repeat=0

false_pointswocnum=remove_cnum(false_points)
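
`remove_cnum` comes from `StreamStats_Points` and is not defined in this notebook. Judging from how its result is used, it appears to drop the confluence number so that cells can be compared by location alone. A minimal sketch under that assumption, with the hypothetical name `strip_cnum`:

```python
def strip_cnum(cells):
    """Drop the trailing confluence number from each (x, y, cnum) tuple."""
    return [cell[:2] for cell in cells]


# Two cells that share confluence number 7
locations = strip_cnum([(3, 4, 7), (5, 6, 7)])
```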

##### Calculate length:

In [12]:
nogoabs=nogo.copy()

while len(walk_confluence)>0:
        next_cell=MoveUpstream(df, starting_point, nogo)
        
        if len(next_cell)==1 or (len(next_cell)>1 and count==1):
            step_dis=TrueDistance(starting_point, next_cell[0], cellsize)
            total_dis=step_dis+total_dis
            nogo.append(next_cell[0])
            starting_point = next_cell[0]
            count+=1
            continue
 
        elif len(next_cell)>1 and 1<count<=4:
            next_cellwocnum=remove_cnum(next_cell)
            if any(x in next_cellwocnum for x in false_pointswocnum): #If any of the cells in next_cell are in false_confluence, then find that cell and assign it to next cell
                nogo=nogo+next_cell
                for cell in list(next_cell): #Iterate over a copy, since cells are removed from next_cell inside the loop
                    test_cell=MoveUpstream(df, cell, nogo)
                    if len(test_cell)==0:
                        next_cell.remove(cell)
                step_dis=TrueDistance(starting_point, next_cell[0], cellsize)
                total_dis=step_dis+total_dis
                starting_point=next_cell[0]
                count+=1
                continue
            else:
                if repeat==0:
                    total_dis=0.0
                    starting_point = walk_confluence[0]
                    count=1
                    repeat=1
                else:
                    step_dis=TrueDistance(starting_point, next_cell[0], cellsize)
                    total_dis=step_dis+total_dis
                    mainstem.append(walk_confluence[0]+(total_dis,))
                    walk_confluence.remove(walk_confluence[0])
                    if len(walk_confluence)>0:
                        total_dis=0.0
                        nogoabs=nogoabs+nogo
                        starting_point=walk_confluence[0]                        
                        nogo=[starting_point]
                        repeat=0
                        count=1
                    else:
                        walk_confluence=[] 
            
        elif len(next_cell)>1 and count>4:    
            next_cellwocnum=remove_cnum(next_cell)
            if any(x in next_cellwocnum for x in false_pointswocnum): #If any of the cells in next_cell are in false_confluence, then find that cell and assign it to next cell
                nogo=nogo+next_cell
                for cell in list(next_cell): #Iterate over a copy, since cells are removed from next_cell inside the loop
                    test_cell=MoveUpstream(df, cell, nogo)
                    if len(test_cell)==0:
                        next_cell.remove(cell)
                step_dis=TrueDistance(starting_point, next_cell[0], cellsize)
                total_dis=step_dis+total_dis
                starting_point=next_cell[0]
                count+=1
                continue
            else:
                step_dis=TrueDistance(starting_point, next_cell[0], cellsize)
                total_dis=step_dis+total_dis
                mainstem.append(walk_confluence[0]+(total_dis,))
                walk_confluence.remove(walk_confluence[0])
                if len(walk_confluence)>0:
                    total_dis=0.0
                    nogoabs=nogoabs+nogo
                    starting_point=walk_confluence[0]
                    nogo=[starting_point]
                    repeat=0
                    count=1
                else:
                    walk_confluence=[] 
                    
        elif len(next_cell)==0:
            tributary.append(walk_confluence[0]+(total_dis,))
            walk_confluence.remove(walk_confluence[0])
            if len(walk_confluence)>0:
                total_dis=0.0
                nogoabs=nogoabs+nogo
                starting_point=walk_confluence[0]
                nogo=[starting_point]
                repeat=0
                count=1
            else:
                walk_confluence=[]
                
nogoabs=list(set(nogoabs))                
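
The length calculation accumulates the output of `TrueDistance`, which is defined in `StreamStats_Points` and not shown here. One plausible reading is that a step between adjacent cells covers one cell size when the move is orthogonal and the cell size times the square root of two when it is diagonal. The sketch below illustrates that assumption only, using the hypothetical name `step_distance` and cells given as `(row, col)` tuples.

```python
import math


def step_distance(cell_a, cell_b, cellsize):
    """Center-to-center distance between two grid cells, in the units of cellsize."""
    d_row = cell_b[0] - cell_a[0]
    d_col = cell_b[1] - cell_a[1]
    return cellsize * math.hypot(d_row, d_col)


straight = step_distance((5, 5), (5, 6), 10.0)  # orthogonal neighbor
diagonal = step_distance((5, 5), (4, 6), 10.0)  # diagonal neighbor
```

With 10 m cells an orthogonal step adds 10 m to the running total and a diagonal step adds about 14.14 m.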

In [13]:
len(mainstem)+len(tributary)

475

In [14]:
len(mainstem)

243

##### Remove confluences with tributaries shorter than a specified length:

In [19]:
disexl=(5280/2.0)*(0.3048) #Minimum tributary length: half a mile (2640 ft) converted to meters

incl_tribs=[]

for cell in tributary:
    if cell[3]>=disexl:
        incl_tribs.append(cell) 

print(len(incl_tribs))        

18
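
The threshold above is half a mile expressed in meters (2640 ft × 0.3048 m/ft = 804.672 m). As a sketch, the conversion and the filter can be written with named constants so that the cutoff is easier to change. The names `miles_to_meters` and `filter_by_length` and the sample tuples are hypothetical; the tuples are assumed to follow the `(x, y, cnum, length)` layout used in `tributary`.

```python
FEET_PER_MILE = 5280
METERS_PER_FOOT = 0.3048


def miles_to_meters(miles):
    """Convert a length in miles to meters."""
    return miles * FEET_PER_MILE * METERS_PER_FOOT


def filter_by_length(cells, min_length_m):
    """Keep (x, y, cnum, length_m) tuples at least min_length_m long."""
    return [cell for cell in cells if cell[3] >= min_length_m]


threshold = miles_to_meters(0.5)
sample = [(1, 2, 7, 120.0), (3, 4, 8, 950.0)]
kept = filter_by_length(sample, threshold)
```

Only the second sample tributary, at 950 m, exceeds the half-mile cutoff.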


### Save the results:

##### Extract the confluence number:

In [20]:
cnum_mainstem=[]
cnum_tributary=[]

for cell in mainstem:
    cnum_mainstem.append(cell[2])

for cell in tributary:
    cnum_tributary.append(cell[2])

##### Extract the distance:

In [21]:
dis_tribs=[]

for cell in tributary:
    dis_tribs.append(cell[3])    

##### Transform and save as a shapefile:

In [22]:
lists=[save_confluence, true_confluence, mainstem, tributary, false_points, nogoabs, incl_tribs]
names=['save_confluence', 'true_confluence', 'mainstem', 'tributary', 'false_points', 'nogo', 'incl_tribs']
distance=[[],[],[],dis_tribs,[],[],[]]
cnums=[[],[],cnum_mainstem,cnum_tributary,[],[],[]]

for i in range(len(lists)):
    longitude, latitude=index2coord(sg, lists[i])  #Transform the row/column value from the stream grid dataframe to latitude/longitude for each confluence
    gdf=geodataframe(longitude, latitude, crs, distance[i],cnums[i]) #Store the longitude/latitude for each confluence in a geodataframe
    gdf.to_file(filename = r'C:\Users\sputnam\Documents\GitHub\usgs-tools\results\{}.shp'.format(names[i])) #Export the geodataframe as a shapefile
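
`index2coord` and `geodataframe` are helpers from `StreamStats_Points` whose code is not shown here. For the first of these, a minimal sketch of converting a row/column index back to a projected coordinate is given below. It assumes a north-up raster with a GDAL-style geotransform and that the coordinate of the cell center is wanted; the actual helper may use a different convention. The name `index_to_coord` and the geotransform values are hypothetical.

```python
def index_to_coord(geotransform, row, col):
    """Return the projected (x, y) coordinate of the center of cell (row, col).

    geotransform follows the GDAL convention for a north-up raster:
    (origin_x, pixel_width, 0, origin_y, 0, pixel_height), where
    pixel_height is negative because rows increase southward.
    """
    origin_x, pixel_width, _, origin_y, _, pixel_height = geotransform
    x = origin_x + (col + 0.5) * pixel_width
    y = origin_y + (row + 0.5) * pixel_height
    return x, y


# Hypothetical 10 m grid whose upper-left corner sits at (1000.0, 5000.0)
gt = (1000.0, 10.0, 0.0, 5000.0, 0.0, -10.0)
x, y = index_to_coord(gt, 3, 2)
```

The half-cell offset places the point at the middle of the cell instead of its upper-left corner.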

# End