# Create watersheds and catchments

The AKSSF project has ~500 sites that have been shifted onto the flow networks.
We need to create a watershed for each site. This will make extracting spatial
and climatic covariates for modeling much faster.

Dustin pointed out that the NHDPlus has FromNode and ToNode fields that can be used to navigate
upstream: save all the NHDPlusIDs, then select and merge the catchments to create a watershed for each site.
This works in R, so it just needs to be transferred to Python. The premise is to use a while loop that keeps selecting
new stream segments whose ToNode matches the FromNode of the last segment(s).
Logic for stopping the while loop:
Stop when the sum of the selected IDs is not greater than 0. I'm not sure why this works, but it does
for some watersheds, although I think it may be running infinitely on other watersheds.
Alternatively, if sum(StartFlag) == count(rec), then all NHDPlusIDs in the current selection are headwater streams.
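
The upstream navigation described above can be sketched without arcpy or pandas. This is only an illustration under stated assumptions, not the code used in this notebook: the function name `upstream_ids` and the toy `segments` table are hypothetical, and each record is assumed to carry an ID, a FromNode, and a ToNode as in the VAA table. Instead of a sum-based stopping rule, it stops when a pass finds no new segments and keeps a set of visited IDs, so it should also terminate if the network contains a loop.

```python
from collections import defaultdict

def upstream_ids(segments, start_id):
    """Return the set of segment IDs at or upstream of start_id.

    segments: iterable of (seg_id, from_node, to_node) tuples (hypothetical layout).
    """
    from_node_of = {}
    ids_by_to_node = defaultdict(list)
    for seg_id, from_node, to_node in segments:
        from_node_of[seg_id] = from_node
        ids_by_to_node[to_node].append(seg_id)

    seen = {start_id}
    frontier = [start_id]
    while frontier:
        next_frontier = []
        for seg_id in frontier:
            # Segments whose ToNode equals this segment's FromNode flow into it.
            for up_id in ids_by_to_node.get(from_node_of.get(seg_id), []):
                if up_id not in seen:
                    seen.add(up_id)
                    next_frontier.append(up_id)
        frontier = next_frontier
    return seen

# Toy network: 3 and 4 join into 2, which flows into 1; 5 is a separate stream.
segments = [(1, 10, 11), (2, 20, 10), (3, 30, 20), (4, 40, 20), (5, 50, 51)]
print(sorted(upstream_ids(segments, 1)))  # [1, 2, 3, 4]
print(sorted(upstream_ids(segments, 3)))  # [3]
```

The visited set is the design choice that matters here: a segment is expanded at most once, so the loop ends once nothing new is found.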

NOTE: creating an attribute index on the field that we are selecting on in the cats_merge
feature class is really important for creating large watersheds. This vastly reduced the run time
for the Cook Inlet watersheds, like the mainstem Susitna. The index can only be created once, so
the code checks whether it already exists. This has been added to the Cook Inlet watersheds steps 4-7 code chunk.
Update 20211122: watershed creation now includes an Eliminate Polygon Part step, and the attribute index is limited to
streams/rivers and artificial paths when building watersheds.

Create a loop and process watersheds for all the points, starting with Cook Inlet.
Note that the folders, geodatabases, and merged catchments are created in the merge_grids script.
1. select catchments that intersect points to get NHDPlusID
2. create list of IDs
3. use loop to create watersheds
4. first get list of all upstream NHDPlusIDs
5. create temporary layer of catchments
6. select catchments that match the upstream IDs
7. dissolve those catchments and save to the Watersheds feature dataset in the Cook Inlet gdb
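
Step 6 depends on a SQL `IN` expression built from the list of upstream IDs. Below is a minimal sketch of that string-building step on its own. The helper name `build_in_clause` is hypothetical, and it assumes the IDs are whole numbers stored as doubles, so casting to `int` only drops the trailing `.0`. It also removes duplicate IDs before building the expression.

```python
def build_in_clause(field, ids, null_token="NULL"):
    """Build a SQL IN expression such as '"NHDPlusID" IN (1, 2)'."""
    # De-duplicate and cast to int so 75004300006312.0 is written as 75004300006312.
    unique_ids = sorted({int(i) for i in ids})
    # Fall back to NULL so the expression stays valid when there are no IDs.
    id_text = ", ".join(str(i) for i in unique_ids) or null_token
    return '"{0}" IN ({1})'.format(field, id_text)

print(build_in_clause("NHDPlusID", [75004300006312.0, 75004300001906.0, 75004300006312.0]))
print(build_in_clause("catID", []))
```

Whether integer literals are accepted for a given field type is something to confirm against the actual feature class before relying on this.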

In [1]:
#Set paths to gdbs
cr_gdb = r"D:\\GIS\\AKSSF\\Copper_River\\Copper_River.gdb"
ci_gdb = r"D:\\GIS\\AKSSF\\Cook_Inlet\\Cook_Inlet.gdb"
bbay_gdb = r"D:\\GIS\\AKSSF\\Bristol_Bay\\Bristol_Bay.gdb"
pws_gdb = r"D:\\GIS\\AKSSF\\Prince_William_Sound\\Prince_William_Sound.gdb"
kod_gdb = r"D:\\GIS\\AKSSF\\Kodiak\\Kodiak.gdb"
points = r"C:\\Users\\dwmerrigan\\Documents\\GitHub\\AKSSF\\hydrography\\AKSSF_Hydrography.gdb\\AKSSF_Sites_Shifted_2021202"

In [2]:
# COOK INLET
# steps 1 and 2
# intersect points with catchments and create list of NHDPlusIDs
import arcpy, os
arcpy.env.workspace = ci_gdb
arcpy.env.overwriteOutput = True
#Updated point layer - save to T: and change path once run.
points = r"C:\\Users\\dwmerrigan\\Documents\\GitHub\\AKSSF\\hydrography\\AKSSF_Hydrography.gdb\\AKSSF_Sites_Shifted_2021202"
cats = os.path.join(ci_gdb,'cats_merge')
#List to store catchments to create watersheds from
idList = []
outcats = "cats_intersect"

arcpy.MakeFeatureLayer_management(cats, "tempLayer")
#intersect cats merge with verified points
arcpy.management.SelectLayerByLocation("tempLayer", "INTERSECT", points)
arcpy.CopyFeatures_management("tempLayer", outcats)

fields = arcpy.ListFields("tempLayer")
for field in fields:
    print("{0}".format(field.name))
#Populate idList from cats_intersect
with arcpy.da.SearchCursor("tempLayer", ["NHDPlusID"]) as cursor:
    for row in cursor:
        idList.append(row[0])

print(len(idList))

#check if duplicate catchments in the idList
idset = set(idList)
print(idset)
print(len(idset))


OBJECTID
Shape
NHDPlusID
SourceFC
GridCode
AreaSqKm
VPUID
Shape_Length
Shape_Area
240
{75004400010754.0, 75000200013319.0, 75004300001288.0, 75000200004615.0, 75004300002314.0, 75000200000010.0, 75000200000011.0, 75000200015883.0, 75000200015882.0, 75000300025361.0, 75000200000021.0, 75000200003094.0, 75004300002332.0, 75000200015900.0, 75000200008734.0, 75004400001568.0, 75000200003107.0, 75004300006440.0, 75004400009260.0, 75004400009261.0, 75005300022831.0, 75004300002352.0, 75004300000311.0, 75004400010810.0, 75004400011322.0, 75004300005437.0, 75004400000576.0, 75000200014914.0, 75000400014403.0, 75000200013892.0, 75000400014405.0, 75000700036166.0, 75000200011336.0, 75000200005532.0, 75004300008012.0, 75004400010320.0, 75004300003409.0, 75004400003154.0, 75004300000856.0, 75004400009308.0, 75004300004452.0, 75000200010344.0, 75000100003947.0, 75000200015469.0, 75000200017518.0, 75000500010609.0, 75004300001906.0, 75004400000627.0, 75004400009331.0, 75000200002162.0, 7500430000498
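
Comparing `len(idList)` with `len(idset)` shows whether duplicates exist, but not which catchments are repeated (for example, when several sites fall in the same catchment). A small sketch that lists them, using a hypothetical helper `duplicated_ids` and made-up IDs:

```python
from collections import Counter

def duplicated_ids(ids):
    """Return {id: count} for every ID that occurs more than once."""
    counts = Counter(ids)
    return {i: n for i, n in counts.items() if n > 1}

# Made-up IDs for illustration only.
print(duplicated_ids([101.0, 102.0, 101.0, 103.0]))  # {101.0: 2}
```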

In [3]:
# COOK INLET
# steps 4-7

import arcpy, time, datetime, os
import pandas as pd
from functools import reduce

# steps 4-7: for loop to create watersheds
arcpy.env.workspace = ci_gdb
arcpy.env.overwriteOutput = True
arcpy.env.qualifiedFieldNames = False
sr = arcpy.SpatialReference(3338)  #'NAD_1983_Alaska_Albers'
arcpy.env.outputCoordinateSystem = sr

# Start timing function
processStart = time.time()
processStartdt = datetime.datetime.now()

vaa = os.path.join(ci_gdb, "vaa_merge")
cats = os.path.join(ci_gdb, "cats_merge")
streams = os.path.join(ci_gdb,'NHDFlowline_merge')

# Get list of index names for cats merge and add index if not already created
index_names = [i.name for i in arcpy.ListIndexes(cats)]
print(index_names)
if 'NHDPlusID_index' not in index_names:
    print (f'Creating index for {cats}')
    arcpy.AddIndex_management(cats,'NHDPlusID','NHDPlusID_index')
else:
    print(f'{cats} Indexed')

#watersheds feature dataset for storing fcs
fdat = os.path.join(ci_gdb, 'Watersheds')
if not arcpy.Exists(fdat):
    print(f'Creating watershed feature dataset {fdat}')
    arcpy.management.CreateFeatureDataset(ci_gdb, "Watersheds", sr)
else:
    print(f'{fdat} exists for Cook Inlet')

vaa_df1 = pd.DataFrame(arcpy.da.TableToNumPyArray(vaa, ("NHDPlusID", "FromNode", "ToNode", "StartFlag")))
stream_df = pd.DataFrame(arcpy.da.TableToNumPyArray(streams, ("NHDPlusID", "FType")))
dfs = [vaa_df1, stream_df]
vaa_df = reduce(lambda left,right: pd.merge(left,right,on='NHDPlusID',how="outer"), dfs)
# remove pipelines
vaa_df = vaa_df[(vaa_df['FType'] != 428 )]
vaa_df

c=1
for id in idList:
    print(f"{c}. Starting watershed for: " + str(id))
    rec = [id]
    up_ids = []
    up_ids.append(rec)
    rec_len = len(rec)
    hws_sum = 0

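    # Walk upstream until every newly selected segment is a headwater (StartFlag == 1)
    while rec_len != hws_sum:
        # FromNodes of the current segment(s)
        fromnode = vaa_df.loc[vaa_df["NHDPlusID"].isin(rec), "FromNode"]
        # segments that drain into those nodes (their ToNode matches a FromNode)
        rec = vaa_df.loc[vaa_df["ToNode"].isin(fromnode), "NHDPlusID"]
        rec_len = len(rec)
        rec_hws = vaa_df.loc[vaa_df["ToNode"].isin(fromnode), "StartFlag"]
        hws_sum = sum(rec_hws)
        up_ids.append(rec)

    # up_ids is a list of lists/Series; flatten it into one list of NHDPlusIDs
    newup_ids = []
    for x in up_ids:
        newup_ids.extend(x)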
    tempLayer = "catsLyr"
    expression = '"NHDPlusID" IN ({0})'.format(', '.join(map(str, newup_ids)) or 'NULL')
    arcpy.MakeFeatureLayer_management(cats, tempLayer, where_clause=expression)
    outdis = "memory/wtd_" + str(round(id))
    outwtd = "Watersheds\\wtd_" + str(round(id))
    print(outwtd)
    print('----------')
    dis = arcpy.Dissolve_management(tempLayer, outdis)
    watershed = arcpy.EliminatePolygonPart_management(dis, outwtd,"PERCENT", "0 SquareKilometers", 90, "CONTAINED_ONLY")
    c=c+1

# End timing
processEnd = time.time()
processElapsed = int(processEnd - processStart)
processSuccess_time = datetime.datetime.now()

# Report success
print(f'Process completed at {processSuccess_time.strftime("%Y-%m-%d %H:%M")} '
      f'(Elapsed time: {datetime.timedelta(seconds=processElapsed)})')
print('----------')

['FDO_OBJECTID', 'FDO_Shape']
Creating index for D:\\GIS\\AKSSF\\Cook_Inlet\\Cook_Inlet.gdb\cats_merge
Creating watershed feature dataset D:\\GIS\\AKSSF\\Cook_Inlet\\Cook_Inlet.gdb\Watersheds
1. Starting watershed for: 75004300006312.0
Watersheds\wtd_75004300006312
----------
2. Starting watershed for: 75004300001906.0
Watersheds\wtd_75004300001906
----------
3. Starting watershed for: 75004300000100.0
Watersheds\wtd_75004300000100
----------
4. Starting watershed for: 75004300004983.0
Watersheds\wtd_75004300004983
----------
5. Starting watershed for: 75004300004332.0
Watersheds\wtd_75004300004332
----------
6. Starting watershed for: 75004300006239.0
Watersheds\wtd_75004300006239
----------
7. Starting watershed for: 75004300004304.0
Watersheds\wtd_75004300004304
----------
8. Starting watershed for: 75004300002332.0
Watersheds\wtd_75004300002332
----------
9. Starting watershed for: 75004300005437.0
Watersheds\wtd_75004300005437
----------
10. Starting watershed for: 75004300003464.

In [4]:
# COPPER RIVER
# steps 1 and 2
# intersect points with catchments and create list of NHDPlusIDs
import arcpy, os
arcpy.env.workspace = cr_gdb
arcpy.env.overwriteOutput = True
sr = arcpy.SpatialReference(3338)  #'NAD_1983_Alaska_Albers'
arcpy.env.outputCoordinateSystem = sr


cats = os.path.join(cr_gdb,'cats_merge')
idList = []
outcats = "cats_intersect"

arcpy.MakeFeatureLayer_management(cats, "tempLayer")
arcpy.management.SelectLayerByLocation("tempLayer", "INTERSECT", points)
arcpy.CopyFeatures_management("tempLayer", outcats)

fields = arcpy.ListFields("tempLayer")
for field in fields:
    print("{0}".format(field.name))
with arcpy.da.SearchCursor("tempLayer", ["NHDPlusID"]) as cursor:
    for row in cursor:
        idList.append(row[0])

print(len(idList))

OBJECTID
Shape
NHDPlusID
SourceFC
GridCode
AreaSqKm
VPUID
Shape_Length
Shape_Area
28


In [None]:
# COPPER RIVER
# steps 4-7

import arcpy, time, datetime, os
import pandas as pd
from functools import reduce

# steps 4-7: for loop to create watersheds
arcpy.env.workspace = cr_gdb
arcpy.env.overwriteOutput = True
arcpy.env.qualifiedFieldNames = False
sr = arcpy.SpatialReference(3338)  #'NAD_1983_Alaska_Albers'
arcpy.env.outputCoordinateSystem = sr

# Start timing function
processStart = time.time()
processStartdt = datetime.datetime.now()

vaa = os.path.join(cr_gdb, "vaa_merge")
cats = os.path.join(cr_gdb, "cats_merge")
streams = os.path.join(cr_gdb, "NHDFlowline_merge")

# Get list of index names for cats merge and add index if not already created
index_names = [i.name for i in arcpy.ListIndexes(cats)]
print(index_names)
if 'NHDPlusID_index' not in index_names:
    print (f'Creating index for {cats}')
    arcpy.AddIndex_management(cats,'NHDPlusID','NHDPlusID_index')
else:
    print(f'{cats} Indexed')

#watersheds feature dataset for storing fcs
fdat = os.path.join(cr_gdb, 'Watersheds')
if not arcpy.Exists(fdat):
    arcpy.management.CreateFeatureDataset(cr_gdb, "Watersheds", sr)
else:
    print(f'{fdat} exists for Copper River')

vaa_df1 = pd.DataFrame(arcpy.da.TableToNumPyArray(vaa, ("NHDPlusID", "FromNode", "ToNode", "StartFlag")))
stream_df = pd.DataFrame(arcpy.da.TableToNumPyArray(streams, ("NHDPlusID", "FType")))
dfs = [vaa_df1, stream_df]
vaa_df = reduce(lambda left,right: pd.merge(left,right,on='NHDPlusID',how="outer"), dfs)
# remove pipelines
vaa_df = vaa_df[(vaa_df['FType'] != 428 )]
vaa_df

c=1
for id in idList:
    print(f"{c}. Starting watershed for: " + str(id))
    rec = [id]
    up_ids = []
    up_ids.append(rec)
    rec_len = len(rec)
    hws_sum = 0

    while rec_len != hws_sum:
        fromnode = vaa_df.loc[vaa_df["NHDPlusID"].isin(rec), "FromNode"]
        rec = vaa_df.loc[vaa_df["ToNode"].isin(fromnode), "NHDPlusID"]
        rec_len = len(rec)
        rec_hws = vaa_df.loc[vaa_df["ToNode"].isin(fromnode), "StartFlag"]
        hws_sum = sum(rec_hws)
        up_ids.append(rec)
    # up_ids is a list of lists/Series; flatten it into one list of NHDPlusIDs
    newup_ids = []
    for x in up_ids:
        newup_ids.extend(x)

    tempLayer = "catsLyr"
    expression = '"NHDPlusID" IN ({0})'.format(', '.join(map(str, newup_ids)) or 'NULL')
    arcpy.MakeFeatureLayer_management(cats, tempLayer, where_clause=expression)
    outdis = "memory/wtd_" + str(round(id))
    outwtd = "Watersheds\\wtd_" + str(round(id))
    print(outwtd)
    print('----------')
    dis = arcpy.Dissolve_management(tempLayer, outdis)
    watershed = arcpy.EliminatePolygonPart_management(dis, outwtd,"PERCENT", "0 SquareKilometers", 90, "CONTAINED_ONLY")
    c=c+1

# End timing
processEnd = time.time()
processElapsed = int(processEnd - processStart)
processSuccess_time = datetime.datetime.now()

# Report success
print(f'Process completed at {processSuccess_time.strftime("%Y-%m-%d %H:%M")} '
      f'(Elapsed time: {datetime.timedelta(seconds=processElapsed)})')
print('----------')

## If the operation hangs, suspend it and run the following chunk to check which watersheds were missed

In [6]:
arcpy.env.workspace = cr_gdb
wtd_lst = []
for ds in arcpy.ListDatasets():
    arcpy.env.workspace = ds
    for dfc in arcpy.ListFeatureClasses():
        print(dfc)
        # strip the "wtd_" prefix; store as float to match the IDs in idList
        wtd_lst.append(float(dfc[4:]))
cr_missed = (set(idList).difference(wtd_lst))
print (f'{len(cr_missed)} Watersheds not created')
print(f'Missing values in first list: {cr_missed}')

list1_as_set = set(idList)
intersection = list1_as_set.intersection(wtd_lst)
intersection_as_list = list(intersection)
print(f'List intersection {intersection_as_list}')
print (f'Length of list inter {len(intersection_as_list)}')

wtd_75019800000406
wtd_75019800010313
wtd_75019800014348
wtd_75019800001957
wtd_75019800019692
wtd_75019600118138
wtd_75019700004190
wtd_75019700004084
wtd_75019700017692
wtd_75019700001794
wtd_75019700003889
wtd_75003900062338
wtd_75003900033524
wtd_75003900054316
wtd_75003900055039
wtd_75003900023942
wtd_75003900058380
wtd_75003900028507
wtd_75003900027489
wtd_75003900044936
wtd_75003900023855
wtd_75003900044738
wtd_75003900055694
wtd_75003900023674
wtd_75003900062264
wtd_75003900055316
wtd_75003900039073
wtd_75003900027771
0 Watersheds not created
Missing values in first list: set()
List intersection [75019700001794.0, 75003900062338.0, 75003900023942.0, 75003900044936.0, 75019800014348.0, 75003900058380.0, 75003900055694.0, 75003900055316.0, 75019800000406.0, 75019700017692.0, 75003900039073.0, 75019800001957.0, 75003900054316.0, 75003900023855.0, 75019700003889.0, 75003900062264.0, 75003900044738.0, 75019800010313.0, 75003900028507.0, 75019700004190.0, 75003900027489.0, 7501980001
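
The same check can be written as a small function that compares integer IDs instead of floats. This is a sketch, not the chunk above: the helper name `missing_watersheds` is hypothetical, the names follow the `wtd_<ID>` pattern used in this notebook, and the sample lists are a made-up subset for illustration.

```python
def missing_watersheds(site_ids, fc_names, prefix="wtd_"):
    """Return the sorted site IDs that have no matching watershed feature class."""
    # Parse the numeric part of names such as "wtd_75019800000406".
    done = {int(name[len(prefix):]) for name in fc_names if name.startswith(prefix)}
    # Site IDs may be stored as floats; round to the nearest whole number first.
    wanted = {int(round(i)) for i in site_ids}
    return sorted(wanted - done)

# Made-up subset: the second site has no watershed yet.
site_ids = [75019800000406.0, 75019800010313.0, 75003900062338.0]
fc_names = ["wtd_75019800000406", "wtd_75003900062338"]
print(missing_watersheds(site_ids, fc_names))  # [75019800010313]
```

The returned list could then be used in place of `idList` to rerun only the watersheds that are still missing.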

In [7]:
# BRISTOL BAY WATERSHEDS

import arcpy, os
import pandas as pd

arcpy.env.workspace = bbay_gdb
arcpy.env.overwriteOutput = True
sr = arcpy.SpatialReference(3338)  #'NAD_1983_Alaska_Albers'
arcpy.env.outputCoordinateSystem = sr

cats = os.path.join(bbay_gdb, 'cats_merge')
idList = []
outcats = "cats_intersect"

arcpy.MakeFeatureLayer_management(cats, "tempLayer")
arcpy.management.SelectLayerByLocation("tempLayer", "INTERSECT", points)
arcpy.CopyFeatures_management("tempLayer", outcats)

fields = arcpy.ListFields("tempLayer")
for field in fields:
    print("{0}".format(field.name))
with arcpy.da.SearchCursor("tempLayer", ["catID"]) as cursor:
    for row in cursor:
        idList.append(row[0])

print(len(idList))

OBJECTID
Shape
gridcode
catID
Shape_Length
Shape_Area
114


In [8]:
# BRISTOL BAY
# steps 4-7

import arcpy, os, time, datetime
import pandas as pd

# steps 4-7: for loop to create watersheds
arcpy.env.workspace = bbay_gdb
arcpy.env.overwriteOutput = True
arcpy.env.qualifiedFieldNames = False
sr = arcpy.SpatialReference(3338)  #'NAD_1983_Alaska_Albers'
arcpy.env.outputCoordinateSystem = sr

# Start timing function
processStart = time.time()
processStartdt = datetime.datetime.now()
streams = "streams_merge"
cats = "cats_merge"

# Get list of index names for cats merge and add index if not already created
index_names = [i.name for i in arcpy.ListIndexes(cats)]
print(index_names)
if 'catid_index' not in index_names:
    print (f'Creating index for {cats}')
    arcpy.AddIndex_management(cats, "catID", "catid_index")
else:
    print(f'{cats} Indexed')

#watersheds feature dataset for storing fcs
fdat = os.path.join(bbay_gdb, 'Watersheds')
if not arcpy.Exists(fdat):
    arcpy.management.CreateFeatureDataset(bbay_gdb, "Watersheds", sr)
else:
    print(f'{fdat} exists for Bristol Bay')

str_df = pd.DataFrame(arcpy.da.FeatureClassToNumPyArray(streams, ("catID", "upCatID1", "upCatID2")))
hws_codes = [999999, 1999999, 2999999, 3999999, 4999999]

#idList if doing ALL watersheds.
c=1
for id in idList:
    print(f"{c}. Starting watershed for: " + str(id))
    rec = [id]
    up_ids = []
    sum_rec = sum(rec)
    timeout = time.time() + 60*15 # 15 minutes from this point

    while(sum_rec > 0):
        if time.time() > timeout:
            break
        up_ids.append(rec)
        rec = str_df.loc[str_df["catID"].isin(rec), ("upCatID1", "upCatID2")]
        rec = rec.replace(hws_codes, 0)
        rec = pd.concat([rec['upCatID1'], rec['upCatID2']])

        sum_rec = sum(rec)
    newup_ids = []
    for x in up_ids:
        newup_ids.extend(x)
    tempLayer = "catsLyr"
    expression = '"catID" IN ({0})'.format(', '.join(map(str, newup_ids)) or 'NULL')
    arcpy.MakeFeatureLayer_management(cats, tempLayer)
    arcpy.management.SelectLayerByAttribute(tempLayer, "NEW_SELECTION", expression, None)
    print("Starting dissolve")
    outdis = "memory/wtd_" + str(round(id))
    outwtd = "Watersheds\\wtd_" + str(round(id))
    dis = arcpy.Dissolve_management(tempLayer, outdis)
    watershed = arcpy.EliminatePolygonPart_management(dis, outwtd,"PERCENT", "0 SquareKilometers", 90, "CONTAINED_ONLY")
    print("Watershed created at:" + outwtd)
    c=c+1

# End timing
processEnd = time.time()
processElapsed = int(processEnd - processStart)
processSuccess_time = datetime.datetime.now()

# Report success
print(f'Process completed at {processSuccess_time.strftime("%Y-%m-%d %H:%M")} '
      f'(Elapsed time: {datetime.timedelta(seconds=processElapsed)})')

['FDO_OBJECTID', 'FDO_Shape']
Creating index for cats_merge
1. Starting watershed for: 1023044
Starting dissolve
Watershed created at:Watersheds\wtd_1023044
2. Starting watershed for: 2041471
Starting dissolve
Watershed created at:Watersheds\wtd_2041471
3. Starting watershed for: 2065755
Starting dissolve
Watershed created at:Watersheds\wtd_2065755
4. Starting watershed for: 2065914
Starting dissolve
Watershed created at:Watersheds\wtd_2065914
5. Starting watershed for: 2066924
Starting dissolve
Watershed created at:Watersheds\wtd_2066924
6. Starting watershed for: 2066955
Starting dissolve
Watershed created at:Watersheds\wtd_2066955
7. Starting watershed for: 2067494
Starting dissolve
Watershed created at:Watersheds\wtd_2067494
8. Starting watershed for: 2068072
Starting dissolve
Watershed created at:Watersheds\wtd_2068072
9. Starting watershed for: 2068584
Starting dissolve
Watershed created at:Watersheds\wtd_2068584
10. Starting watershed for: 2070402
Starting dissolve
Watershed cre
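
For stream tables that store up to two upstream IDs per record (`upCatID1`/`upCatID2` here, `USLINKNO1`/`USLINKNO2` in the Prince William Sound and Kodiak cells), the upstream walk can also be written with an explicit set of visited IDs, so that it ends without a timeout. This is a sketch under assumptions: `upstream_links` is a hypothetical name, the `links` dictionary stands in for the stream table, and headwaters are assumed to be marked with sentinel codes such as -1 or 999999.

```python
def upstream_links(links, start_id, headwater_codes=(-1,)):
    """Return the set of IDs at or upstream of start_id.

    links: dict mapping an ID to a tuple of its upstream IDs (hypothetical layout).
    headwater_codes: sentinel values meaning "nothing upstream".
    """
    seen = set()
    frontier = [start_id]
    while frontier:
        next_frontier = []
        for link_id in frontier:
            # Skip sentinels and anything already visited.
            if link_id in seen or link_id in headwater_codes:
                continue
            seen.add(link_id)
            # IDs missing from the table simply contribute no upstream links.
            next_frontier.extend(links.get(link_id, ()))
        frontier = next_frontier
    return seen

# Toy table: 1 has upstream links 2 and 3; 3 has upstream links 4 and 5.
links = {1: (2, 3), 2: (-1, -1), 3: (4, 5), 4: (-1, -1), 5: (-1, -1)}
print(sorted(upstream_links(links, 1)))  # [1, 2, 3, 4, 5]
print(sorted(upstream_links(links, 3)))  # [3, 4, 5]
```

For Bristol Bay, the codes in `hws_codes` could be passed as `headwater_codes` rather than being replaced with 0.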

In [9]:
# code when trouble-shooting bb above.
import arcpy, os

# Got through 95 watersheds and all other programs froze, restarted and finding which watersheds remain.
arcpy.env.workspace = os.path.join(bbay_gdb,'Watersheds')
wtds = arcpy.ListFeatureClasses()
#just get numeric part
wtds = [x[4:20] for x in wtds]
#convert to numeric
wtds = [int(i) for i in wtds]
print(wtds)
print(len(wtds))
print(len(idList))
#
# idFilter = [x for x in idList if x not in wtds]
# print(idFilter)
# print("Original list of sites in BB: " + str(len(idList)))
# print("Watersheds completed: " + str(len(wtds)))
# print("Watersheds remaining: " + str(len(idFilter)))


[1023044, 2041471, 2065755, 2065914, 2066924, 2066955, 2067494, 2068072, 2068584, 2070402, 2071934, 2072993, 2073464, 2074424, 2078232, 2078282, 2082373, 2082393, 2084713, 2085642, 2085962, 2087913, 2088163, 2088253, 2088326, 2088586, 3023044, 3024343, 3030955, 3033666, 3033956, 3034126, 4046617, 4048046, 4051055, 4051056, 4051616, 4052076, 4054120, 4054200, 4055337, 4059016, 4059736, 4060036, 4061136, 4063606, 4063775, 4064055, 4064845, 4068584, 4069455, 4069495, 4069586, 4071036, 4071256, 4071625, 4073375, 4074184, 4074505, 4074726, 4075225, 4075474, 4075715, 4076505, 4076575, 4076675, 4078235, 4079234, 4080045, 4080875, 4082225, 4084145, 4084351, 4084915, 4086815, 4087015, 4087224, 4088675, 4088904, 4089074, 4089384, 4089394, 4089574, 4091164, 4091424, 4092244, 4093084, 4095301, 4095374, 4096064, 4096454, 4098114, 4098704, 4099594, 4099694, 4101123, 4101274, 4102124, 4104574, 4105204, 4105594, 4105844, 4106904, 4107464, 4115672, 5008707, 5020796, 5021476, 5030363, 5030704, 5032474, 

In [22]:
# PRINCE WILLIAM SOUND WATERSHEDS

import arcpy,os
arcpy.env.overwriteOutput = True
sr = arcpy.SpatialReference(3338)  #'NAD_1983_Alaska_Albers'
arcpy.env.outputCoordinateSystem = sr
arcpy.env.workspace = pws_gdb
arcpy.env.overwriteOutput = True

idList = []
cats = os.path.join(pws_gdb,'cats_merge')
outcats = os.path.join(pws_gdb, "cats_intersect")
arcpy.MakeFeatureLayer_management(cats, "tempLayer")
arcpy.management.SelectLayerByLocation("tempLayer", "INTERSECT", points)
arcpy.CopyFeatures_management("tempLayer", outcats)

fields = arcpy.ListFields("tempLayer")
for field in fields:
    print("{0}".format(field.name))
with arcpy.da.SearchCursor("tempLayer", ["gridcode"]) as cursor:
    for row in cursor:
        idList.append(row[0])

print(len(idList))

OBJECTID
Shape
gridcode
Shape_Length
Shape_Area
catID
20


In [23]:
# Prince_William_Sound
# steps 4-7

import arcpy, os, time, datetime
import pandas as pd
arcpy.env.workspace = pws_gdb
arcpy.env.overwriteOutput = True
arcpy.env.qualifiedFieldNames = False
sr = arcpy.SpatialReference(3338)  #'NAD_1983_Alaska_Albers'
arcpy.env.outputCoordinateSystem = sr

# Start timing function
processStart = time.time()
processStartdt = datetime.datetime.now()
streams = "streams_merge"
cats = "cats_merge"

# Get list of index names for cats merge and add index if not already created
# NOTE: this index is on catID, but the selection further down filters on gridcode
index_names = [i.name for i in arcpy.ListIndexes(cats)]
print(index_names)
if 'catid_index' not in index_names:
    print (f'Creating index for {cats}')
    arcpy.AddIndex_management(cats, "catID", "catid_index")
else:
    print(f'{cats} Indexed')

#watersheds feature dataset for storing fcs
fdat = os.path.join(pws_gdb, 'Watersheds')
if not arcpy.Exists(fdat):
    arcpy.management.CreateFeatureDataset(pws_gdb, "Watersheds", sr)
else:
    print(f'{fdat} exists for PWS')

fields = arcpy.ListFields(streams)
for field in fields:
    print("{0}".format(field.name))

str_df = pd.DataFrame(arcpy.da.FeatureClassToNumPyArray(streams, ("LINKNO", "USLINKNO1", "USLINKNO2")))
hws_codes = [-1]

# idList = [46055]

#idList if doing ALL watersheds.
c=1
for id in idList:
    print(f"{c}. Starting watershed for: " + str(id))
    rec = [id]
    up_ids = []
    sum_rec = sum(rec)

    while(sum_rec > 0):
        up_ids.append(rec)
        rec = str_df.loc[str_df["LINKNO"].isin(rec), ("USLINKNO1", "USLINKNO2")]
        rec = pd.concat([rec['USLINKNO1'], rec['USLINKNO2']])
        sum_rec = sum(rec)
        print(sum_rec)

    # up_ids is a list of lists/Series; flatten it into one list of IDs
    newup_ids = []
    for x in up_ids:
        newup_ids.extend(x)

    tempLayer = "catsLyr"
    expression = '"gridcode" IN ({0})'.format(', '.join(map(str, newup_ids)) or 'NULL')
    arcpy.MakeFeatureLayer_management(cats, tempLayer)
    arcpy.management.SelectLayerByAttribute(tempLayer, "NEW_SELECTION", expression, None)
    print("Starting dissolve")
    outdis = "memory/wtd_" + str(round(id))
    outwtd = "Watersheds\\wtd_" + str(round(id))
    dis = arcpy.Dissolve_management(tempLayer, outdis)
    watershed = arcpy.EliminatePolygonPart_management(dis, outwtd,"PERCENT", "0 SquareKilometers", 90, "CONTAINED_ONLY")
    print("Watershed created at:" + outwtd)
    c=c+1

# End timing
processEnd = time.time()
processElapsed = int(processEnd - processStart)
processSuccess_time = datetime.datetime.now()

# Report success
print(f'Process completed at {processSuccess_time.strftime("%Y-%m-%d %H:%M")} '
      f'(Elapsed time: {datetime.timedelta(seconds=processElapsed)})')

['FDO_OBJECTID', 'FDO_Shape', 'catid_index']
cats_merge Indexed
OBJECTID
Shape
LINKNO
DSLINKNO
USLINKNO1
USLINKNO2
DSNODEID
strmOrder
Length
Magnitude
DSContArea
strmDrop
Slope
StraightL
USContArea
WSNO
DOUTEND
DOUTSTART
DOUTMID
Shape_Length
1. Starting watershed for: 18457
27744
28132
28222
28392
31982
55458
70970
51980
46384
32378
49918
46944
27748
27942
30852
45518
23998
25882
41028
19208
-4
Starting dissolve
Watershed created at:Watersheds\wtd_18457
2. Starting watershed for: 26464
31298
29766
29746
27746
26276
25536
11766
-4
Starting dissolve
Watershed created at:Watersheds\wtd_26464
3. Starting watershed for: 28086
33752
31730
30390
28450
11820
-4
Starting dissolve
Watershed created at:Watersheds\wtd_28086
4. Starting watershed for: 29854
57928
92356
136652
140448
114430
116608
71984
36578
50646
66676
74932
72752
47198
26782
27316
15926
-4
Starting dissolve
Watershed created at:Watersheds\wtd_29854
5. Starting watershed for: 30884
49888
78696
58952
100026
123672
90698
94972
12424

In [24]:
# KODIAK WATERSHEDS

import arcpy, os
sr = arcpy.SpatialReference(3338)  #'NAD_1983_Alaska_Albers'
arcpy.env.outputCoordinateSystem = sr
arcpy.env.workspace = kod_gdb
arcpy.env.overwriteOutput = True

idList = []
cats = os.path.join(kod_gdb,'cats_merge')
outcats = "cats_intersect"
arcpy.MakeFeatureLayer_management(cats, "tempLayer")
arcpy.management.SelectLayerByLocation("tempLayer", "INTERSECT", points)
arcpy.CopyFeatures_management("tempLayer", outcats)
fields = arcpy.ListFields("tempLayer")
for field in fields:
    print("{0}".format(field.name))
with arcpy.da.SearchCursor("tempLayer", ["gridcode"]) as cursor:
    for row in cursor:
        idList.append(row[0])

print(len(idList))

OBJECTID
Shape
gridcode
Shape_Length
Shape_Area
catID
28


In [25]:
# Kodiak
# steps 4-7

import arcpy, os, time, datetime
import pandas as pd
arcpy.env.workspace = kod_gdb
arcpy.env.overwriteOutput = True
arcpy.env.qualifiedFieldNames = False
sr = arcpy.SpatialReference(3338)  #'NAD_1983_Alaska_Albers'
arcpy.env.outputCoordinateSystem = sr

# Start timing function
processStart = time.time()
processStartdt = datetime.datetime.now()
streams = "streams_merge"
cats = "cats_merge"

# Get list of index names for cats merge and add index if not already created
# NOTE: this index is on catID, but the selection further down filters on gridcode
index_names = [i.name for i in arcpy.ListIndexes(cats)]
print(index_names)
if 'catid_index' not in index_names:
    print (f'Creating index for {cats}')
    arcpy.AddIndex_management(cats, "catID", "catid_index")
else:
    print(f'{cats} Indexed')

#watersheds feature dataset for storing fcs
fdat = os.path.join(kod_gdb, 'Watersheds')
if not arcpy.Exists(fdat):
    arcpy.management.CreateFeatureDataset(kod_gdb, "Watersheds", sr)
else:
    print(f'{fdat} exists for Kodiak')


fields = arcpy.ListFields(streams)
for field in fields:
    print("{0}".format(field.name))

str_df = pd.DataFrame(arcpy.da.FeatureClassToNumPyArray(streams, ("LINKNO", "USLINKNO1", "USLINKNO2")))
hws_codes = [-1]

#idList if doing ALL watersheds.
c = 1
for id in idList:
    print(f"{c}. Starting watershed for: " + str(id))
    rec = [id]
    up_ids = []
    sum_rec = sum(rec)

    while(sum_rec > 0):
        up_ids.append(rec)
        rec = str_df.loc[str_df["LINKNO"].isin(rec), ("USLINKNO1", "USLINKNO2")]
        rec = pd.concat([rec['USLINKNO1'], rec['USLINKNO2']])
        sum_rec = sum(rec)
 
    # up_ids is a list of lists/Series; flatten it into one list of IDs
    newup_ids = []
    for x in up_ids:
        newup_ids.extend(x)

    tempLayer = "catsLyr"
    expression = '"gridcode" IN ({0})'.format(', '.join(map(str, newup_ids)) or 'NULL')
    arcpy.MakeFeatureLayer_management(cats, tempLayer)
    arcpy.management.SelectLayerByAttribute(tempLayer, "NEW_SELECTION", expression, None)
    print("Starting dissolve")
    outdis = "memory/wtd_" + str(round(id))
    outwtd = "Watersheds\\wtd_" + str(round(id))
    dis = arcpy.Dissolve_management(tempLayer, outdis)
    watershed = arcpy.EliminatePolygonPart_management(dis, outwtd,"PERCENT", "0 SquareKilometers", 90, "CONTAINED_ONLY")
    print("Watershed created at:" + outwtd)
    c=c+1

# End timing
processEnd = time.time()
processElapsed = int(processEnd - processStart)
processSuccess_time = datetime.datetime.now()

# Report success
print(f'Process completed at {processSuccess_time.strftime("%Y-%m-%d %H:%M")} '
      f'(Elapsed time: {datetime.timedelta(seconds=processElapsed)})')


['FDO_OBJECTID', 'FDO_Shape', 'catid_index']
cats_merge Indexed
OBJECTID
Shape
LINKNO
DSLINKNO
USLINKNO1
USLINKNO2
DSNODEID
strmOrder
Length
Magnitude
DSContArea
strmDrop
Slope
StraightL
USContArea
WSNO
DOUTEND
DOUTSTART
DOUTMID
proc_reg
Shape_Length
1. Starting watershed for: 48267
Starting dissolve
Watershed created at:Watersheds\wtd_48267
2. Starting watershed for: 49617
Starting dissolve
Watershed created at:Watersheds\wtd_49617
3. Starting watershed for: 50197
Starting dissolve
Watershed created at:Watersheds\wtd_50197
4. Starting watershed for: 64593
Starting dissolve
Watershed created at:Watersheds\wtd_64593
5. Starting watershed for: 72144
Starting dissolve
Watershed created at:Watersheds\wtd_72144
6. Starting watershed for: 76954
Starting dissolve
Watershed created at:Watersheds\wtd_76954
7. Starting watershed for: 77794
Starting dissolve
Watershed created at:Watersheds\wtd_77794
8. Starting watershed for: 90346
Starting dissolve
Watershed created at:Watersheds\wtd_90346
9. St