## NHD Network Analysis Demo
___TL;DR___: *We are trying to parallelize hydraulic calculations for dynamic subsets of the U.S. river and stream network*<br><br>
The following was developed while preparing a method for forecasting flows on the US network of rivers and streams as represented in the National Hydrography Dataset (NHD). The NHD is a continuously evolving characterization of a fractal system, so we wanted to build in some flexibility. We hope to identify the complexity inherent in the network at different levels of resolution, and to do so dynamically. A further goal is to manage the complexity calculation for arbitrary collections of headwater points, such as might be obtained from a list of named streams or during a major flood event in a particular region.<br>
As a point of terminology, we use the word 'routing' as shorthand to refer to the computation of the translation of a particular flow condition, high or low, to downstream (or in some cases upstream) areas of influence.
The network complexity is related to the potential for parallelization of a serial analysis of the network. We have identified three levels of parallelization that may be implemented: 
1. Network-level parallelization of independent systems -- the routing computations for the Mississippi River have little (nothing, except conceptual similarity and a shared existence on Earth) to do with the computations for the Columbia River for any practical level of analysis.
1. System-level parallelization of interconnected reaches -- There is a need to consider the computations for adjacent branches within a system of confluent streams, but with proper ordering, some of the computations may be considered in parallel. For example, the Illinois River headwaters and the Mississippi River headwaters are related within their broader Mississippi system, but most of the routing calculations for those headwaters are practically agnostic to one another.
1. Reach-level parallelization of the specific routing computation -- the numerical work of routing water downstream is a matrix computation and consists of exploring solutions to differential equations, all of which may potentially be examined in parallel, under the proper conditions and with suitable assumptions.<br>
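
The first of these levels can be sketched with the Python standard library alone. This is a minimal illustration, not the repository's implementation: the toy `upstreams` mapping, the terminal IDs `0` and `8`, and the `count_segments` helper are all hypothetical. Each independent network is handed to its own worker because no network shares a segment with another.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data: segment ID -> set of immediately upstream segment IDs.
# Two independent networks drain to terminal segments 0 and 8.
upstreams = {
    0: {2, 4}, 2: {3}, 3: set(), 4: set(),
    8: {11, 12}, 11: set(), 12: set(),
}

def count_segments(terminal):
    """Count the segments draining to one terminal, using an explicit stack."""
    count, stack = 0, [terminal]
    while stack:
        key = stack.pop()
        count += 1
        stack.extend(upstreams[key])
    return terminal, count

# Independent networks share no segments, so each one can go to its own worker.
with ThreadPoolExecutor() as pool:
    segment_counts = dict(pool.map(count_segments, [0, 8]))

print(segment_counts)  # {0: 4, 8: 3}
```

Threads keep the sketch short. For CPU-bound routing work in CPython, a process pool would likely be the better fit.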


### Import The Git Repo Including Test Data
This humble* git repo is a branch of the National Water Model public repository hosted by UCAR. The UCAR repo is the basis for the WRF-Hydro model, which is presently the modeling engine of the [US National Water Model](https://water.noaa.gov/about/nwm).<br>

The network analysis code assumes that the downstream neighbor is identified in the table for each stream segment as is the case for the test datasets. 
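
To make that assumption concrete, here is a small sketch of turning such a table into a lookup keyed by segment ID. The helper `build_down_connections` and the sample rows are hypothetical stand-ins; this is not the code behind `networkbuilder.get_down_connections`, only an illustration of the kind of structure a downstream-neighbor column makes possible.

```python
def build_down_connections(rows, key_col, downstream_col, length_col):
    """Map each segment ID to its downstream neighbor and its length."""
    return {
        row[key_col]: {
            "downstream": row[downstream_col],
            "length": row[length_col],
        }
        for row in rows
    }

# Hypothetical rows: [segment ID, length, downstream segment ID]
rows = [
    [1, 678, 4],
    [5, 679, 4],
    [4, 798, 0],
    [0, 456, None],  # None marks the outlet in this toy table
]

connections = build_down_connections(rows, key_col=0, downstream_col=2, length_col=1)
print(connections[1])  # {'downstream': 4, 'length': 678}
```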

*The humility is prompted by the realization that others have done similar work and may possibly have done it better. We are working on being able to respond more nimbly to suggestions and opportunities for improvement. Please let us know if you see something (and be patient if you don't feel like we heard you).

In [None]:
# # !pip install tensorflow  # or tensorflow-gpu
# # !pip install ray[rllib]  # also recommended: ray[debug]
# # !pip uninstall -y pyarrow
# print("Setting up colab environment")
# !pip uninstall -y -q pyarrow
# !pip install -q ray[debug]

# # A hack to force the runtime to restart, needed to include the above dependencies.
# print("Done installing! Restarting via forced crash (this is not an issue).")
# import os
# os._exit(0)

##### Turn Autosave off
Autosave generates additional work for the version control. Remember to manually save any changes after clearing kernel and output for more effective version control.

In [None]:
# %autosave 0 

##### Google Colab execution
With this Chrome extension, GitHub-hosted Jupyter notebooks can be opened directly in Google Colaboratory:
https://chrome.google.com/webstore/detail/open-in-colab/iogfkhleblhcpcekbiedikdehleodpjo
More info here:
https://colab.research.google.com/github/googlecolab/colabtools/blob/master/notebooks/colab-github-demo.ipynb

The following two StackOverflow posts helped with managing the dependencies
https://stackoverflow.com/questions/53581278/test-if-notebook-is-running-on-google-colab
https://stackoverflow.com/questions/53793731/using-custom-packages-on-google-colaboratory

[![Open This Notebook In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/jameshalgren/wrf_hydro_nwm_public/blob/dynamic_channel_routing/trunk/NDHMS/dynamic_channel_routing/notebooks/NHD_Network_Density_Analysis.ipynb)

### Create Some General Functions
The next three blocks define interaction with the `networkbuilder` module in the git repo, which is the tool for creating the `connections` object to characterize the network.
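
As a rough picture of what such a characterization involves, the sketch below derives upstream sets, headwaters, terminals, and junctions from a downstream table using plain set operations. The `classify_keys` helper and the toy `downstream` mapping are hypothetical; the actual `networkbuilder` functions may organize this differently.

```python
from collections import defaultdict

# Hypothetical downstream table: segment ID -> downstream segment ID (None at an outlet).
downstream = {1: 4, 5: 4, 4: 0, 2: 0, 3: 2, 0: None}

def classify_keys(downstream):
    """Derive upstream sets, headwaters, terminals, and junctions."""
    upstreams = defaultdict(set)
    for key, down in downstream.items():
        if down is not None:
            upstreams[down].add(key)
    all_keys = set(downstream)
    headwaters = all_keys - set(upstreams)  # nothing flows into these
    terminals = {key for key, down in downstream.items() if down is None}
    junctions = {key for key, ups in upstreams.items() if len(ups) >= 2}
    return dict(upstreams), headwaters, terminals, junctions

upstreams, headwaters, terminals, junctions = classify_keys(downstream)
print(sorted(headwaters), sorted(terminals), sorted(junctions))
# [1, 3, 5] [0] [0, 4]
```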

In [1]:
import sys
try:
    import google.colab
    ENV_IS_CL = True
    !git clone --single-branch --branch network https://github.com/jameshalgren/wrf_hydro_nwm_public.git
    sys.path.append('/content/wrf_hydro_nwm_public/trunk/NDHMS/dynamic_channel_routing/src/python_framework')
    !pip install geopandas
    !pip install netcdf4
    #default recursion limit (~1000) is slightly too small for the deepest branches of the network
    sys.setrecursionlimit(6000) 
    #TODO: convert recursive functions to stack-based functions
except ImportError:
    ENV_IS_CL = False
    sys.path.append(r'../src/python_framework')


In [2]:
import networkbuilder as networkbuilder
import recursive_print
import os
import geopandas as gpd
import pandas as pd
import xarray as xr
# -*- coding: utf-8 -*-
"""NHD Network traversal

A demonstration version of this code is stored in this Colaboratory notebook:
https://colab.research.google.com/github/jameshalgren/wrf_hydro_nwm_public/blob/network/trunk/NDHMS/dynamic_channel_routing/notebooks/NHD_Network_Density_Analysis.ipynb#scrollTo=h_BEdl4LID34

"""
def do_network(
        geofile_path = None
        , title_string = None
        , layer_string = None
        , driver_string = None
        , key_col = None
        , downstream_col = None
        , length_col = None
        , terminal_code = None
        , verbose = False
        , debuglevel = 0
        ):

    # NOTE: these methods can lose the "connections" and "rows" arguments when
    # implemented as class methods where those arrays are members of the class.
    if verbose: print(title_string)
    if driver_string == 'NetCDF':
        geofile = xr.open_dataset(geofile_path)
        geofile_rows = (geofile.to_dataframe()).values
        # The xarray method for NetCDFs was implemented after the geopandas method for 
        # GIS source files. It's possible (probable?) that we are doing something 
        # inefficient by converting away from the Pandas dataframe.
        # TODO: Check the optimal use of the Pandas dataframe
        if debuglevel <= -1: print(f'reading -- dataset: {geofile_path}; layer: {layer_string}; driver: {driver_string}')
    else:
        if debuglevel <= -1: print(f'reading -- dataset: {geofile_path}; layer: {layer_string}; fiona driver: {driver_string}')
        geofile = gpd.read_file(geofile_path, driver=driver_string, layer=layer_string)
        geofile_rows = geofile.to_numpy()
    if debuglevel <= -2: geofile.plot() #TODO: WILL THIS WORK WITH NetCDF???
    if debuglevel <= -1: print(geofile.head()) #TODO: WILL THIS WORK WITH NetCDF???
    # Kick off recursive call for all connections and keys
    (connections) = networkbuilder.get_down_connections(
                    rows = geofile_rows
                    , key_col = key_col
                    , downstream_col = downstream_col
                    , length_col = length_col
                    , verbose = verbose
                    , debuglevel = debuglevel)
    
    (all_keys, ref_keys, headwater_keys
        , terminal_keys
        , terminal_ref_keys
        , circular_keys) = networkbuilder.determine_keys(
                    connections = connections
#                     , rows = geofile_rows
                    , key_col = key_col
                    , downstream_col = downstream_col
                    , terminal_code = terminal_code
                    , verbose = verbose
                    , debuglevel = debuglevel)
    
    (junction_keys) = networkbuilder.get_up_connections(
                    connections = connections
                    , terminal_code = terminal_code
                    , headwater_keys = headwater_keys
                    , terminal_keys = terminal_keys
                    , verbose = verbose
                    , debuglevel = debuglevel)
    return connections, all_keys, ref_keys, headwater_keys \
        , terminal_keys, terminal_ref_keys \
        , circular_keys, junction_keys

In [3]:
def do_print():    
    recursive_print.print_basic_network_info(
                    connections = connections_NHD
                    , headwater_keys = headwater_keys_NHD
                    , junction_keys = junction_keys_NHD
                    , terminal_keys = terminal_keys_NHD
                    , terminal_code = terminal_code_NHD
                    , verbose = True
                    )
    
    if 1 == 0: #THE RECURSIVE PRINT IS NOT A GOOD IDEA WITH LARGE NETWORKS!!!
        recursive_print.print_connections(
                    headwater_keys = headwater_keys_NHD
                    , down_connections = connections_NHD
                    , up_connections = connections_NHD
                    , terminal_code = terminal_code_NHD
                    , terminal_keys = terminal_keys_NHD
                    , terminal_ref_keys = terminal_ref_keys_NHD
                    , debuglevel = -2
                    )
    


### Two Real Networks


In [4]:
if ENV_IS_CL:
    root = '/content/wrf_hydro_nwm_public/trunk/NDHMS/dynamic_channel_routing/'
else:
    root = os.path.dirname(os.path.abspath(''))
test_folder = os.path.join(root, r'test')
geo_input_folder = os.path.join(test_folder, r'input', r'geo', r'Channels')

"""##NHD Subset (Brazos/Lower Colorado)"""
Brazos_LowerColorado_ge5 = True
"""##NHD CONUS order 5 and greater"""
CONUS_ge5 = True
"""These are large -- be careful"""
CONUS_FULL_RES = True
CONUS_Named_Streams = False #create a subset of the full resolution by reading the GNIS field
CONUS_Named_combined = False #process the Named streams through the Full-Res paths to join the many hanging reaches

debuglevel = -1
verbose = True

# The CONUS_ge5 and Brazos_LowerColorado_ge5 datasets are included
# in the github test folder and are extracts from the NHD version 1.2 datasets
# from https://www.nohrsc.noaa.gov/pub/staff/keicher/NWM_live/web/data_tools/
#  
# The CONUS_FULL_RES file was generated from the RouteLink file in the parameter
# archive and converted to a compressed NetCDF via the following command:
# nccopy -d1 -s RouteLink_NWMv2.0_20190517_cheyenne_pull.nc RouteLink_NWMv2.0_20190517_cheyenne_pull.nc4s
# TODO: Explain CONUS_Named_Streams
# CONUS_Named_Streams was generated by intersecting the FULL_RES file ...
# of the data in the nohrsc-hosted archive but are too large to efficiently 
# package inside of the repository. 

if Brazos_LowerColorado_ge5:
    nhd_conus_file_path = os.path.join(geo_input_folder
            , r'NHD_BrazosLowerColorado_Channels.shp')
    key_col_NHD = 2
    downstream_col_NHD = 7
    length_col_NHD = 6
    terminal_code_NHD = 0
    title_string = 'Brazos + Lower Colorado\nNHD stream orders 5 and greater\n'
    driver_string = 'ESRI Shapefile'
    layer_string = 0

    Brazos_LowerColorado_ge5_values = do_network (nhd_conus_file_path
                , title_string = title_string
                , layer_string = layer_string
                , driver_string = driver_string
                , key_col = key_col_NHD
                , downstream_col = downstream_col_NHD
                , length_col = length_col_NHD
                , terminal_code = terminal_code_NHD
                , verbose = verbose
                , debuglevel = debuglevel)

if CONUS_ge5:
    nhd_conus_file_path = os.path.join(geo_input_folder
            , r'NHD_Conus_Channels.shp')
    key_col_NHD = 1
    downstream_col_NHD = 6
    length_col_NHD = 5
    terminal_code_NHD = 0
    title_string = 'CONUS Order 5 and Greater '
    driver_string = 'ESRI Shapefile'
    layer_string = 0

    CONUS_ge5_values = do_network (nhd_conus_file_path
                , title_string = title_string
                , layer_string = layer_string
                , driver_string = driver_string
                , key_col = key_col_NHD
                , downstream_col = downstream_col_NHD
                , length_col = length_col_NHD
                , terminal_code = terminal_code_NHD
                , verbose = verbose
                , debuglevel = debuglevel)

if CONUS_FULL_RES:
    # nhd_conus_file_path = '../../../../../../GISTemp/nwm_v12.gdb'
    nhd_conus_file_path = os.path.join(geo_input_folder
            , r'RouteLink_NWMv2.0_20190517_cheyenne_pull.nc')
    key_col_NHD = 0
    downstream_col_NHD = 2
    length_col_NHD = 10
    terminal_code_NHD = 0
    title_string = 'CONUS Full Resolution NWM v2.0'
    # driver_string = 'FileGDB'
    driver_string = 'NetCDF'
    # layer_string = 'channels_nwm_v12_routeLink'
    layer_string = 0

    CONUS_FULL_RES_values = do_network (nhd_conus_file_path
                , title_string = title_string
                , layer_string = layer_string
                , driver_string = driver_string
                , key_col = key_col_NHD
                , downstream_col = downstream_col_NHD
                , length_col = length_col_NHD
                , terminal_code = terminal_code_NHD
                , verbose = verbose
                , debuglevel = debuglevel)

CONUS Order 5 and Greater 
reading -- dataset: /home/jacob.hreha/nwm/trunk/NDHMS/dynamic_channel_routing/test/input/geo/Channels/NHD_BrazosLowerColorado_Channels.shp; layer: 0; fiona driver: ESRI Shapefile
   OBJECTID_1  OBJECTID  featureID  linkDim     link  order_  Length       to  \
0        1499     25442    3764288   460086  3764288       6  2150.0  3764296   
1        1500     25443    3764296   460088  3764296       6  1277.0  3765756   
2        1501     25444    3766380   460092  3766380       6   431.0  3766382   
3        1502     25445    3765756   460090  3765756       6  1274.0  3766380   
4        1503     25447    3765796   460180  3765796       6   219.0  3765798   

     MusK  MusX  ...  gages  NHDWaterbo      lat      lon   alt Kchan  \
0  3600.0   0.2  ...   None       -9999  29.0176 -96.0119  9.59     0   
1  3600.0   0.2  ...   None       -9999  29.0059 -96.0029  9.59     0   
2  3600.0   0.2  ...   None       -9999  28.9876 -95.9992  9.59     0   
3  3600.0   0.2

In [5]:
# if ENV_IS_CL: root = '/content/wrf_hydro_nwm_public/trunk/NDHMS/dynamic_channel_routing/'
# elif not ENV_IS_CL: root = os.path.dirname(os.path.abspath(''))
# test_folder = os.path.join(root, r'test')
# geo_input_folder = os.path.join(test_folder, r'input', r'geo', r'Channels')

# """##NHD CONUS order 5 and greater"""
# CONUS_ge5 = True
# """##NHD Subset (Brazos/Lower Colorado)"""
# Brazos_LowerColorado_ge5 = True
# CONUS_FULL_RES = True

# debuglevel = -1
# verbose = True

# # The following datasets are extracts from the feature datasets available
# # from https://www.nohrsc.noaa.gov/pub/staff/keicher/NWM_live/web/data_tools/
# # the CONUS_ge5 and Brazos_LowerColorado_ge5 datasets are included
# # in the github test folder

# if CONUS_ge5:
#     nhd_conus_file_path = os.path.join(geo_input_folder
#             , r'NHD_Conus_Channels.shp')
#     key_col_NHD = 1
#     downstream_col_NHD = 6
#     length_col_NHD = 5
#     terminal_code_NHD = 0
#     title_string = 'CONUS Order 5 and Greater '
#     driver_string = 'ESRI Shapefile'
#     layer_string = 0

#     CONUS_ge5_values = do_network (nhd_conus_file_path
#                 , title_string = title_string
#                 , layer_string = layer_string
#                 , driver_string = driver_string
#                 , key_col = key_col_NHD
#                 , downstream_col = downstream_col_NHD
#                 , length_col = length_col_NHD
#                 , terminal_code = terminal_code_NHD
#                 , verbose = verbose
#                 , debuglevel = debuglevel)
    

# if CONUS_FULL_RES:
#     nhd_conus_file_path = '../../../../../../GISTemp/nwm_v12.gdb'
#     key_col_NHD = 0
#     downstream_col_NHD = 5
#     length_col_NHD = 4
#     terminal_code_NHD = 0
#     title_string = 'CONUS Full Resolution NWM v1.2'
#     driver_string = 'FileGDB'
#     layer_string = 'channels_nwm_v12_routeLink'

#     do_network (nhd_conus_file_path
#                 , title_string = title_string
#                 , layer_string = layer_string
#                 , driver_string = driver_string
#                 , key_col = key_col_NHD
#                 , downstream_col = downstream_col_NHD
#                 , length_col = length_col_NHD
#                 , terminal_code = terminal_code_NHD
#                 , verbose = verbose
#                 , debuglevel = debuglevel)

# if Brazos_LowerColorado_ge5:
#     nhd_conus_file_path = os.path.join(geo_input_folder
#             , r'NHD_BrazosLowerColorado_Channels.shp')
#     key_col_NHD = 2
#     downstream_col_NHD = 7
#     length_col_NHD = 6
#     terminal_code_NHD = 0
#     title_string = 'Brazos + Lower Colorado\nNHD stream orders 5 and greater\n'
#     title_string = 'CONUS Order 5 and Greater '
#     driver_string = 'ESRI Shapefile'
#     layer_string = 0

#     Brazos_LowerColorado_ge5_values = do_network (nhd_conus_file_path
#                 , title_string = title_string
#                 , layer_string = layer_string
#                 , driver_string = driver_string
#                 , key_col = key_col_NHD
#                 , downstream_col = downstream_col_NHD
#                 , length_col = length_col_NHD
#                 , terminal_code = terminal_code_NHD
#                 , verbose = verbose
#                 , debuglevel = debuglevel)
    
    



In [6]:
from importlib import reload
reload(networkbuilder)
import pickle
import zipfile # TODO: Incorporate Named streams as zip into test datasets

# CONUS_Named_Streams = True # This variable is set in the previous cell
# CONUS_Named_combined = True # This variable is set in the previous cell

if CONUS_Named_Streams:
    nhd_conus_file_path = os.path.join(geo_input_folder
            , r'channels_nwm_v12_routeLink_NamedOnly.shp')
    key_col_Named_Streams = 0
    downstream_col_Named_Streams = 5
    length_col_Named_Streams = 4
    terminal_code_Named_Streams = 0
    title_string = 'NHD v1.2 segments corresponding to NHD 2.0 GNIS labeled streams\n'
    # driver_string = 'FileGDB'
    driver_string = 'ESRI Shapefile'
    # layer_string = 'named_streams_v12'
    layer_string = 0

    CONUS_Named_Streams_values = do_network (nhd_conus_file_path
                , title_string = title_string
                , layer_string = layer_string
                , driver_string = driver_string
                , key_col = key_col_Named_Streams
                , downstream_col = downstream_col_Named_Streams
                , length_col = length_col_Named_Streams
                , terminal_code = terminal_code_Named_Streams
                , verbose = verbose
                , debuglevel = debuglevel)


if CONUS_Named_combined:
    ''' NOW Combine the two CONUS analyses by starting with the Named Headwaters
        but trace the network down the Full Resolution NHD. It should only work
        if the other two datasets have been computed.
        ANY OTHER Set of Headwaters could be substituted'''
    
    if not (CONUS_Named_Streams and CONUS_FULL_RES):
        print('\n\nWARNING: If this works, you are using old data...')

    
    # Use only headwater keys that are in the full dataset.
    headwater_keys_combined = CONUS_FULL_RES_values[3] & \
                                CONUS_Named_Streams_values[3]
    # Need to make sure that these objects are independent -- we will modify them a bit.
    connections_combined = pickle.loads(pickle.dumps(CONUS_FULL_RES_values[0]))
    terminal_keys_combined = pickle.loads(pickle.dumps(CONUS_FULL_RES_values[4]))
    terminal_code_combined = terminal_code_NHD
    
    for key in connections_combined: #Clear the upstreams and rebuild it with just named streams
        connections_combined[key].pop('upstreams',None)

    (junction_keys_combined
     , visited_keys_combined
     , visited_terminal_keys_combined
     , junction_count_combined) = networkbuilder.get_up_connections(
                 connections = connections_combined
                 , terminal_code = terminal_code_combined
                 , headwater_keys = headwater_keys_combined
                 , terminal_keys = terminal_keys_combined
                 , verbose = verbose
                 , debuglevel = debuglevel)
    
# # Useful for debugging the combined calculation
#     print(len(junction_keys_combined)
#      , len(visited_keys_combined)
#      , len(visited_terminal_keys_combined)
#      , junction_count_combined)
    
#     print(len(terminal_keys_combined - visited_terminal_keys_combined))

### Build A Test Case
The `test_rows` object simulates a river network dataset such as we receive from the National Hydrography Dataset. Each data row has a node ID, a 'to' node ID, and some other relevant data. For this test dataset, the second data column is a dummy length (the last column could hold some other value, but we haven't tried anything yet... stay tuned). In our traversals, we can add up the lengths as a surrogate for the more complex water routing functions we eventually need to manage.
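
The idea of summing lengths along a traversal can be sketched as follows. The rows are a subset of the test rows defined in the cell below; the helper `length_to_outlet` is hypothetical and only illustrates the surrogate calculation.

```python
# A subset of the notebook's test rows: [node ID, length, 'to' node ID, unused]
test_rows = [
    [0, 456, None, 0],
    [1, 678, 4, 0],
    [2, 394, 0, 0],
    [3, 815, 2, 0],
    [4, 798, 0, 0],
]

def length_to_outlet(rows, start, key_col=0, length_col=1, downstream_col=2):
    """Sum segment lengths from `start` down to the outlet."""
    by_key = {row[key_col]: row for row in rows}
    total, key = 0, start
    while key is not None:  # compare against None, since 0 is a valid node ID
        row = by_key[key]
        total += row[length_col]
        key = row[downstream_col]
    return total

print(length_to_outlet(test_rows, start=3))  # 815 + 394 + 456 = 1665
```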

In [7]:
# # def main():
# if 1 == 1:
#     """##TEST"""
#     print("")
#     print ('Executing Test')
#     # Test data
#     test_rows = [
#     [0,456,None,0],
#     [1,678,4,0],
#     [2,394,0,0],
#     [3,815,2,0],
#     [4,798,0,0],
#     [5,679,4,0],
#     [6,394,0,0],
#     [7,815,2,0],
#     [8,841,None,0],
#     [9,524,12,0],
#     [10,458,9,0],
#     [11,548,8,0],
#     [12,543,8,0],
#     [13,458,14,0],
#     [14,548,10,0],
#     [15,543,14,0],
# ]

#     test_key_col = 0
#     test_downstream_col = 2
#     test_length_col = 1
#     test_terminal_code = -999
#     debuglevel = -2
#     verbose = True

#     (test_connections) = networkbuilder.get_down_connections(
#                 rows = test_rows
#                 , key_col = test_key_col
#                 , downstream_col = test_downstream_col
#                 , length_col = test_length_col
#                 , verbose = True
#                 , debuglevel = debuglevel
#                 )

#     (test_all_keys, test_ref_keys, test_headwater_keys
#      , test_terminal_keys
#      , test_terminal_ref_keys
#      , test_circular_keys) = networkbuilder.determine_keys(
#                 connections = test_connections
                
#                 , key_col = test_key_col
#                 , downstream_col = test_downstream_col
#                 , terminal_code = test_terminal_code
#                 , verbose = True
#                 , debuglevel = debuglevel
#                 )

#     test_junction_keys = networkbuilder.get_up_connections(
#                 connections = test_connections
#                 , terminal_code = test_terminal_code
#                 , headwater_keys = test_headwater_keys
#                 , terminal_keys = test_terminal_keys
#                 , verbose = True
#                 , debuglevel = debuglevel
#                 )

#     recursive_print.print_connections(
#                 headwater_keys = test_headwater_keys
#                 , down_connections = test_connections
#                 , up_connections = test_connections
#                 , terminal_code = test_terminal_code
#                 , terminal_keys = test_terminal_keys
#                 , terminal_ref_keys = test_terminal_ref_keys
#                 , debuglevel = debuglevel
#                 )

#     recursive_print.print_basic_network_info(
#                 connections = test_connections
#                 , headwater_keys = test_headwater_keys
#                 , junction_keys = test_junction_keys
#                 , terminal_keys = test_terminal_keys
#                 , terminal_code = test_terminal_code
#                 , verbose = True
#                 , debuglevel = debuglevel
#                 )


# # if __name__ == "__main__":
# #     main()


# Recursive network builder with ordering
**Recursive functions capable of constructing the network from the given terminal, upstream, and downstream keys.**
*   Segments between nodes are tallied.
*   Junctions are recorded in `junc_dict`.
*   Each node is assigned a computational order for parallel processing, 
    recorded in `order_dict`.
*   The input network keys (CONUS, Brazos, test) are selected just below the imports in the cell.
*   IDs are passed to these functions from `super_networks.items()` to step through from the initial river outlets to the headwaters while labeling each order.
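
The ordering idea can also be expressed without recursion, which avoids the recursion-limit concern noted earlier in the notebook. The sketch below is an illustrative alternative, not the notebook's implementation: the `assign_orders` helper and the toy `upstreams` mapping are hypothetical, and `-999` is borrowed from the notebook's test data as the "no upstream segment" code. The order increases by one each time the traversal crosses a junction on its way from the outlet to the headwaters.

```python
TERMINAL_CODE = -999  # marks "no upstream segment", as in the notebook's test data

# Hypothetical toy network: segment ID -> set of upstream segment IDs.
upstreams = {
    0: {2, 4},
    2: {3},
    3: {TERMINAL_CODE},
    4: {1, 5},
    1: {TERMINAL_CODE},
    5: {TERMINAL_CODE},
}

def assign_orders(outlet, upstreams, terminal_code=TERMINAL_CODE):
    """Label each segment with the number of junctions between it and the outlet."""
    orders = {}
    stack = [(outlet, 0)]
    while stack:
        key, order = stack.pop()
        orders[key] = order
        ups = upstreams[key] - {terminal_code}
        # Crossing a junction (two or more upstream branches) starts a new order.
        next_order = order + 1 if len(ups) >= 2 else order
        stack.extend((up, next_order) for up in ups)
    return orders

orders = assign_orders(0, upstreams)
print(sorted(orders.items()))
# [(0, 0), (1, 2), (2, 1), (3, 1), (4, 1), (5, 2)]
```

Segments that share an order value and sit on different branches are candidates for being computed in the same parallel pass.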


---

    Ignore the 0s printed by the exception handler.
    Run every cell prior to this one first.



    
    


In [8]:

import time

#list of the different key sets
#
# terminal_keys = Brazos_LowerColorado_ge5_values[4] 
# circular_keys = Brazos_LowerColorado_ge5_values[6]
# terminal_keys_super = terminal_keys - circular_keys
# con = Brazos_LowerColorado_ge5_values[0]
# terminal_code = terminal_code_NHD
# #
# terminal_keys = CONUS_ge5_values[4] 
# circular_keys = CONUS_ge5_values[6]
# terminal_keys_super = terminal_keys - circular_keys
# con = CONUS_ge5_values[0]
terminal_code = terminal_code_NHD
terminal_keys = CONUS_FULL_RES_values[4] 
circular_keys = CONUS_FULL_RES_values[6]
terminal_keys_super = terminal_keys - circular_keys
con = CONUS_FULL_RES_values[0]

# terminal_keys_super = test_terminal_keys - test_circular_keys
# con = test_connections
# terminal_code = test_terminal_code


#orders of everything for computation in dictionary format
order_dict={}
junc_dict={}
head_dict={}
terminal_avg = {}
def recursive_junction_read (keys, iterator, con, network, terminal_code, verbose = False, debuglevel = 0):
    # print(keys)
    for key in keys:
        result = 'node' + str(key) 
        order_dict[result] = iterator, nid
        ckey = key
        ukeys = con[key]['upstreams']
        con[key].update({'nid': nid})
        # con[key].update({'node_order' : iterator})
        # print(f"segs at ckey {ckey}: {network['segment_count']}")
        while not len(ukeys) >= 2 and not (ukeys == {terminal_code}):
            # the terminal code will indicate a headwater
            if debuglevel <= -2: print(ukeys)
            (ckey,) = ukeys
            ukeys = con[ckey]['upstreams']
            network['segment_count'] += 1
            # print(f"segs at ckey {ckey}: {network['segment_count']}")
            #adds ordering for all nodes
            result = 'node' + str(ckey)
            order_dict[result] = iterator, nid
            con[ckey].update({'nid': nid})
            # con[ckey].update({'node_order' : iterator})
            
        if len(ukeys) >= 2:
            if debuglevel <= -1: print(f"junction found at {ckey} with upstreams {ukeys}")
            network['segment_count'] += 1
            # print(f"segs at ckey {ckey}: {network['segment_count']}")
            network['junction_count'] += 1 #the Terminal Segment
            #iterator adds 1 each iteration to provide a new order of computation for each junction section not each node or group of segments
            result_junc = 'junc' + str(key)
            junc_dict[result_junc] = iterator
            
            recursive_junction_read (ukeys, iterator+1, con, network, terminal_code, verbose, debuglevel)
            
        elif ukeys == {terminal_code}:
            # print(f"headwater found at {ckey}")
            network['segment_count'] += 1
            # print(f"segs at ckey {ckey}: {network['segment_count']}")
            #below adds headwaters to the headwater list
            result_head = 'head' + str(key)
            head_dict[result_head] = iterator
            
      
def super_network_trace(nid, iterator, con, network, terminal_code, debuglevel = 0):
    # print(f'\ntraversing upstream on network {nid}:')
    try:
        network.update({'junction_count': 0})
        network.update({'segment_count': 0}) #the Terminal Segment
        
        recursive_junction_read([nid], iterator , con, network, terminal_code, debuglevel = debuglevel)
        
        # print(f"junctions: {network['junction_count']}")
        # print(f"segments: {network['segment_count']}")
        
    except Exception as exc:
        print(exc)

super_networks = {terminal_key:{}
                        for terminal_key in terminal_keys_super}
debuglevel = 0

start_time = time.time()
for nid, network in super_networks.items():
    super_network_trace(nid, 0, con, network, terminal_code, debuglevel = debuglevel)
    
# print(con)
# print(super_networks)
print(super_networks)
print("--- %s seconds ---" % (time.time() - start_time))



{10715136: {'junction_count': 1, 'segment_count': 4}, 943030272: {'junction_count': 4, 'segment_count': 11}, 15106053: {'junction_count': 1, 'segment_count': 3}, 10715144: {'junction_count': 3, 'segment_count': 12}, 12025870: {'junction_count': 12, 'segment_count': 38}, 943030292: {'junction_count': 70, 'segment_count': 212}, 942080032: {'junction_count': 2, 'segment_count': 7}, 943030306: {'junction_count': 9, 'segment_count': 24}, 166756387: {'junction_count': 1, 'segment_count': 5}, 942080036: {'junction_count': 4, 'segment_count': 12}, 10715174: {'junction_count': 47, 'segment_count': 100}, 7733287: {'junction_count': 0, 'segment_count': 6}, 3178536: {'junction_count': 0, 'segment_count': 3}, 7733307: {'junction_count': 0, 'segment_count': 1}, 7733311: {'junction_count': 0, 'segment_count': 1}, 17170500: {'junction_count': 81, 'segment_count': 165}, 7733327: {'junction_count': 9, 'segment_count': 37}, 2556005: {'junction_count': 0, 'segment_count': 1}, 943030374: {'junction_count':

In [9]:
# d = {0: {'downstream': None, 'length': 456, 'upstreams': {2, 4, 6}}, 1: {'downstream': 4, 'length': 678, 'upstreams': {-999}}, 2: {'downstream': 0, 'length': 394, 'upstreams': {3, 7}}, 3: {'downstream': 2, 'length': 815, 'upstreams': {-999}}, 4: {'downstream': 0, 'length': 798, 'upstreams': {1, 5}}, 5: {'downstream': 4, 'length': 679, 'upstreams': {-999}}, 6: {'downstream': 0, 'length': 394, 'upstreams': {-999}}, 7: {'downstream': 2, 'length': 815, 'upstreams': {-999}}, 8: {'downstream': None, 'length': 841, 'upstreams': {11, 12}}, 9: {'downstream': 12, 'length': 524, 'upstreams': {10}}, 10: {'downstream': 9, 'length': 458, 'upstreams': {14}}, 11: {'downstream': 8, 'length': 548, 'upstreams': {-999}}, 12: {'downstream': 8, 'length': 543, 'upstreams': {9}}, 13: {'downstream': 14, 'length': 458, 'upstreams': {-999}}, 14: {'downstream': 10, 'length': 548, 'upstreams': {13, 15}}, 15: {'downstream': 14, 'length': 543, 'upstreams': {-999}}}
# # d2 = {}
# # for x, y in d.items():
# #   print(x, y)
# #   if d[x]['length'] > 800:
# #     d2.update({x:y})


# '''
# # Maybe do this?
# for terminalkey in terminalkeys:
#   network[nid].update({generator for related network})
# '''
# print('\n\n')
# d2 = ({x:y} for x, y in d.items() if d[x]['length'] > 800)
# for i in d2:
#   print(i)


### Now for the next step
Once a `connections` object has been created to represent the river network, we can traverse that object and perform calculations. In the example below, we parallelize the traversal of the independent portions of the network and then serially compute the number of junctions. We could instead compute total upstream length or (and this is the real goal) the flow due to incoming lateral contributions from the land, accumulated over the entire upstream network. That second calculation can also be parallelized, but we have to figure out how to do so intelligently so that the collective calculation is network-aware. The upstream length depends on the number of upstream branches and their configuration, so some concept of stream order and topology has to be built into the parallelization method.
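
One way to picture a network-aware accumulation is to process segments in "waves", starting at the headwaters and releasing a segment only after all of its upstream neighbors are done. The sketch below is a hedged illustration of that idea (a topological ordering in the style of Kahn's algorithm), run serially; the `segments` mapping and the `accumulate_upstream_length` helper are hypothetical and not part of the repository.

```python
# Hypothetical toy network: segment ID -> (downstream ID or None, length)
segments = {
    1: (4, 678), 5: (4, 679), 3: (2, 815),
    2: (0, 394), 4: (0, 798), 0: (None, 456),
}

def accumulate_upstream_length(segments):
    """Return the total length at and above each segment, plus the processing waves."""
    pending = {key: 0 for key in segments}  # count of unprocessed upstream neighbors
    for down, _ in segments.values():
        if down is not None:
            pending[down] += 1
    totals = {key: length for key, (_, length) in segments.items()}
    wave = sorted(key for key, n in pending.items() if n == 0)  # headwaters
    waves = []
    while wave:
        waves.append(wave)
        next_wave = []
        for key in wave:  # no segment in a wave is upstream of another
            down = segments[key][0]
            if down is None:
                continue
            totals[down] += totals[key]
            pending[down] -= 1
            if pending[down] == 0:
                next_wave.append(down)
        wave = sorted(next_wave)
    return totals, waves

totals, waves = accumulate_upstream_length(segments)
print(waves)      # [[1, 3, 5], [2, 4], [0]]
print(totals[0])  # 3820
```

Each wave is a set of segments that could in principle be handled concurrently. Two segments in one wave may still feed the same downstream segment, so a truly parallel version would need a reduction step when combining their totals.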

In [None]:

# #parallel compute
# import time
# import multiprocessing

# terminal_keys = CONUS_ge5_values[4] 
# circular_keys = CONUS_ge5_values[6]
# terminal_keys_super = terminal_keys - circular_keys
# con = CONUS_ge5_values[0]
# terminal_code = terminal_code_NHD
# # terminal_keys = test_terminal_keys 
# # circular_keys = test_circular_keys
# # terminal_keys_super = terminal_keys - circular_keys
# # con = test_connections
# # teminal_code = test_terminal_code
# # terminal_keys = Brazos_LowerColorado_ge5_values[4] 
# # circular_keys = Brazos_LowerColorado_ge5_values[6]
# # terminal_keys_super = terminal_keys - circular_keys
# # con = Brazos_LowerColorado_ge5_values[0]
# # terminal_code = terminal_code_NHD

# super_networks = {terminal_key:{}
#                         for terminal_key in terminal_keys_super}
# debuglevel = 0
# verbose = False

# def recursive_junction_read (
#                              keys
#                              , network
#                              , terminal_code = 0
#                              , verbose = False
#                              , debuglevel = 0
#                             ):
#     global con
#     for key in keys:
#         ckey = key
#         ukeys = con[key]['upstreams']
#         while not len(ukeys) >= 2 and not (ukeys == [terminal_code]):
#             if debuglevel <= -2: print(f"segs at ckey {ckey}: {network['segment_count']}")
#             # the terminal code will indicate a headwater
#             if debuglevel <= -3: print(ukeys)
#             (ckey,) = ukeys
#             ukeys = con[ckey]['upstreams']
#         if ukeys == [terminal_code]:
#             if debuglevel <= -1: print(f"headwater found at {ckey}")
#             network['segment_count'] += 1
#             if debuglevel <= -2: print(f"segs at ckey {ckey}: {network['segment_count']}")
#         elif len(ukeys) >= 2:
#             network['segment_count'] += 1
#             if debuglevel <= -1: print(f"junction found at {ckey} with upstreams {ukeys}")
#             network['segment_count'] += 1
#             if debuglevel <= -2: print(f"segs at ckey {ckey}: {network['segment_count']}")
#             network['junction_count'] += 1 #the Terminal Segment
#             recursive_junction_read (ukeys, network, verbose = verbose, debuglevel = debuglevel) 
#             # print(ukeys)
#             ukeys = con[ckey]['upstreams']
#             ckey = ukeys

# def super_network_trace(
#                         nid
#                         , verbose= False
#                         , terminal_code = 0
#                         , debuglevel = 0
#                         ):

#     network = {}
#     global con
    
#     if verbose: print(f'\ntraversing upstream on network {nid}:')
#     # try:
#     if 1 == 1:
#         network.update({'junction_count': 0})
#         network.update({'segment_count': 0}) #the Terminal Segment
#         recursive_junction_read([nid], network, verbose = verbose, terminal_code = terminal_code, debuglevel = debuglevel)
#         if verbose: print(f"junctions: {network['junction_count']}")
#         if verbose: print(f"segments: {network['segment_count']}")
#     # except Exception as exc:
#     #     print(exc)
#     return network


# start_time = time.time()
# results_serial = {}
# for nid, network in super_networks.items():
#     network.update({nid: super_network_trace(nid, verbose = verbose, debuglevel = debuglevel)})
#     # super_network_trace(nid, con, network, terminal_code, debuglevel = debuglevel
# print("--- %s seconds ---" % (time.time() - start_time))

# print(super_networks.items())
# print(len(super_networks.items()))
# nids = (nid for nid in super_networks)
# # networks = ((nid, network) for nid, network in super_networks.items())
# start_time = time.time()
# with multiprocessing.Pool() as pool:
#     # results = pool.starmap(super_network_trace, [(nid, network) for nid, network in super_networks.items()])
#     results = pool.map(super_network_trace, nids)
#     # results = pool.starmap(super_network_trace, networks)
#     print("--- %s seconds ---" % (time.time() - start_time))

# print(results)




# Parallel VS Serial Computation Comparison
*   The recursive functions can be called in parallel
*   This allows network creation to be achieved in parallel
*   Dictionary keys are lost during parallel computation (`pool.map` returns a plain list of results)
*   Node ordering can still be achieved
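
One way to recover the lost keys, sketched here under stated assumptions: `map` returns its results in the same order as its inputs, so each result can be paired back with the key that produced it. The example uses `multiprocessing.pool.ThreadPool`, which exposes the same `map` interface as `multiprocessing.Pool`, only so that it runs anywhere without pickling concerns. The function `trace` and the key values are hypothetical stand-ins for `super_network_trace` and the real terminal keys.

```python
from multiprocessing.pool import ThreadPool  # same map() interface as multiprocessing.Pool

def trace(nid):
    """Hypothetical stand-in for super_network_trace: returns a result without its key."""
    return {'junction_count': nid % 3, 'segment_count': nid * 2}

# Made-up terminal keys, materialized as a list so they can be reused after map().
nids = [101, 205, 309]

with ThreadPool(2) as pool:
    results = pool.map(trace, nids)  # a plain list, in the same order as `nids`

# Re-attach the keys by pairing each result with the key that produced it.
super_networks = dict(zip(nids, results))
print(super_networks)
```

Note that the keys are held in a list rather than a generator: a generator would be consumed by `map` and could not be zipped with the results afterwards.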




In [None]:
# #works for serial computation ordering
# #parallel compute
# import time
# import multiprocessing as mp


# # terminal_keys = CONUS_ge5_values[4] 
# # circular_keys = CONUS_ge5_values[6]
# # terminal_keys_super = terminal_keys - circular_keys
# # con = CONUS_ge5_values[0]
# # terminal_code = terminal_code_NHD
# # terminal_keys = test_terminal_keys 
# # circular_keys = test_circular_keys
# # terminal_keys_super = terminal_keys - circular_keys
# # con = test_connections
# # teminal_code = test_terminal_code
# terminal_keys = Brazos_LowerColorado_ge5_values[4] 
# circular_keys = Brazos_LowerColorado_ge5_values[6]
# terminal_keys_super = terminal_keys - circular_keys
# con = Brazos_LowerColorado_ge5_values[0]
# terminal_code = terminal_code_NHD

# super_networks = {terminal_key:{}
#                         for terminal_key in terminal_keys_super}
# debuglevel = 0
# verbose = False

# order_dict={}
# junc_dict={}
# head_dict={}

# def recursive_junction_read (
#                              keys
#                              , iterator
#                              , network
#                              , terminal_code = 0
#                              , verbose = False
#                              , debuglevel = 0
#                             ):
#     global con
#     for key in keys:
#         result = 'node' + str(key)
#         order_dict[result] = iterator
#         ckey = key
#         ukeys = con[key]['upstreams']
#         while not len(ukeys) >= 2 and not (ukeys == [terminal_code]):
#             if debuglevel <= -2: print(f"segs at ckey {ckey}: {network['segment_count']}")
#             # the terminal code will indicate a headwater
#             if debuglevel <= -3: print(ukeys)
#             (ckey,) = ukeys
#             ukeys = con[ckey]['upstreams']
#             result = 'node' + str(ckey)
#             order_dict[result] = iterator
#         if ukeys == [terminal_code]:
#             if debuglevel <= -1: print(f"headwater found at {ckey}")
#             result_head = 'head' + str(key)
#             head_dict[result_head] = iterator
#             network['segment_count'] += 1
#             if debuglevel <= -2: print(f"segs at ckey {ckey}: {network['segment_count']}")
#         elif len(ukeys) >= 2:
#             # result_junc = 'junc' + str(key)
#             # junc_dict[result_junc] = iterator
#             network['segment_count'] += 1
#             if debuglevel <= -1: print(f"junction found at {ckey} with upstreams {ukeys}")
#             network['segment_count'] += 1
#             if debuglevel <= -2: print(f"segs at ckey {ckey}: {network['segment_count']}")
#             network['junction_count'] += 1 #the Terminal Segment
#             result_junc = 'junc' + str(key)
#             junc_dict[result_junc] = iterator
#             recursive_junction_read (ukeys, iterator+1, network, verbose = verbose, debuglevel = debuglevel) 
#             # print(ukeys)
#             ukeys = con[ckey]['upstreams']
#             ckey = ukeys
#             # result_head = 'head' + str(key)
#             # head_dict[result_head] = iterator

# def super_network_trace(
#                         nid
#                         , iterator
#                         , verbose= False
#                         , terminal_code = 0
#                         , debuglevel = 0
#                         ):

#     network = {}
#     global con
    
#     if verbose: print(f'\ntraversing upstream on network {nid}:')
#     # try:
#     if 1 == 1:
#         network.update({'junction_count': 0})
#         network.update({'segment_count': 0}) #the Terminal Segment
#         recursive_junction_read([nid], 0, network, verbose = verbose, terminal_code = terminal_code, debuglevel = debuglevel)
#         if verbose: print(f"junctions: {network['junction_count']}")
#         if verbose: print(f"segments: {network['segment_count']}")
#     # except Exception as exc:
#     #     print(exc)
    
#     return network
    
    

# start_time = time.time()
# results_serial = {}
# for nid, network in super_networks.items():
#     network.update({nid: super_network_trace(nid, 0, verbose = verbose, debuglevel = debuglevel)})
#     # super_network_trace(nid, con, network, terminal_code, debuglevel = debuglevel
# print("--- %s seconds ---" % (time.time() - start_time))

# print(super_networks.items())
# # print(len(super_networks.items()))
# # nids = (nid for nid in super_networks)
# # # networks = ((nid, network) for nid, network in super_networks.items())
# # start_time = time.time()
# # with multiprocessing.Pool() as pool:
# #     # results = pool.starmap(super_network_trace, [(nid, network) for nid, network in super_networks.items()])
# #     results = pool.map(super_network_trace, nids, 0)
# #     # results = pool.starmap(super_network_trace, networks)
# #     print("--- %s seconds ---" % (time.time() - start_time))

# # print(results)
# # 

# #working 
# # pool = mp.Pool(mp.cpu_count())

# # results = pool.starmap(super_network_trace, [(nid, network) for nid, network in super_networks.items()])
# # network.update({nid: super_network_trace(nid, 0, verbose = verbose, debuglevel = debuglevel)})

# # pool.close()

# print(results)
# # start_time = time.time()
# # results_serial = {}
# # for nid, network in super_networks.items():
# #     network.update({nid: super_network_trace(nid, 0, verbose = verbose, debuglevel = debuglevel)})
# #     # super_network_trace(nid, con, network, terminal_code, debuglevel = debuglevel
# # print("--- %s seconds ---" % (time.time() - start_time))

# # print(super_networks.items())
# # print(len(super_networks.items()))

# # nids = (nid for nid in super_networks)
# # # networks = ((nid, network) for nid, network in super_networks.items())
# # start_time = time.time()
# # with multiprocessing.Pool() as pool:
# #     # results = pool.starmap(super_network_trace, [(nid, network) for nid, network in super_networks.items()])
# #     results = pool.map(super_network_trace, nids, 0)
# #     # results = pool.starmap(super_network_trace, networks)
# #     print("--- %s seconds ---" % (time.time() - start_time))

# # print(results)
# print(order_dict)
# print(junc_dict)
# print(head_dict)
# # print(junc_dict)
# # print(head_dict)

# New Lengths Working


In [None]:
import math
new_con = con.copy()
def grow(p):
    for x , y in p.items():
        if y['length'] < 400:
          
            # Case 1: short segment below a junction (several upstreams) whose
            # downstream neighbor exists: merge it into that downstream neighbor.
            # A short terminal segment (no downstream in p) is left unchanged.
            if len(y['upstreams']) > 1 and y['downstream'] in p:
                w = y['downstream']
                q = p[w]
                # the downstream neighbor inherits the short segment's upstreams
                i = q['upstreams'].union(y['upstreams'])
                i.discard(x)
                new_con.update({ w : {'downstream' : q['downstream'], 'length' : (q['length']+y['length']), 'upstreams' : i, 'nid': q['nid']}})
                # re-point the short segment's upstream neighbors to the downstream neighbor
                for e in y['upstreams']:
                    if e in p:
                        new_con.update({ e : {'downstream' : w, 'length' : p[e]['length'], 'upstreams' : p[e]['upstreams'], 'nid': p[e]['nid']}})
                del new_con[x]


#for 1 upstream with 1 downstream and upstream is not a headwater 
            if len(([y['downstream']])) == 1 and len(list(y['upstreams'])) == 1 and (y['upstreams']) != {-999}:
                for d in list(y['upstreams']):
                    for f , g in p.items():
                        if d == f:
                            if x in list(g['upstreams']):
                                o = (g['upstreams'])
                                o.remove(x)
                            else:
                                o = (g['upstreams'])
                            new_con.update({ d : {'downstream' : y['downstream'], 'length' : ((g['length'])+y['length']), 'upstreams' : o, 'nid': g['nid']}})       
                        
                      
                del new_con[x]


#for junction at both ends should split the short segment between the junctions 
            # if len(list(y['downstream'])) > 1 and len(list(y['upstreams'])) > 1:
            #     for c in list(y['upstreams']):
            #         for a , b in p.items():
            #             if c == a:
            #                 new_con.update({ c : {'downstream' : y['downstream'], 'length' : ((b['length'])+y['length']/len(list(y['upstreams']))), 'upstreams' : b['upstreams'], 'nid': b['nid']}})           
            #     for l in list(y['downstream']):
            #         if y['downstream'] == w:
            #             print(w)
            #             new_con.update({ w : {'downstream' : q['downstream'], 'length' : (q['length']+y['length']/len(list(y['downstream']))), 'upstreams' : q['upstreams'], 'nid': q['nid']}})
            #     del new_con[x]
            
                #            new_con.update({x : {'downstream' : y['downstream'], 'length' : int(y['length']/div), 'upstreams' : temp, 'nid': y['nid']}}) #downstream
          
                # for f in range(1,div):
                #     if f == 1 and f != div-1:
                #         con.update({(max(con.keys())+f) :{'downstream' : x, 'length' : int(y['length']/div), 'upstreams' :  {(max(con.keys())+2)}, 'nid': y['nid'] }}) # first node if only new node
                #     if  f == div-1:
                #         con.update({(max(con.keys())+f) :{'downstream' : x, 'length' : int(y['length']/div), 'upstreams' : y['upstreams'], 'nid': y['nid'] }}) #last node 
                #      #new nodes first
                #     if f != 1 and f != div-1:
                #         con.update({(max(con.keys())+1) :{'downstream' : (max(con.keys())), 'length' : int(y['length']/div), 'upstreams' :{(max(con.keys())+2)}, 'nid': y['nid'] }}) #middle nodes
                    
                    #     con.update({(max(con.keys())+f-1) :{'downstream' : (max(con.keys())+f-2) , 'length' : int(y['length']/div), 'upstreams' : {(max(con.keys())+f+1)}, 'nid': y['nid'] }}) #new nodes middle

   #######     #this ignores headwaters that are too short    #####
            
grow(con)

for x,y in new_con.items():
    print(x,y)

In [None]:
import math
con = new_con.copy()
def shrink(p, max_length = 800):
    # Split every segment longer than max_length into equal pieces chained from
    # downstream to upstream. The original key stays on the downstream-most piece.
    next_key = max(con.keys()) + 1
    for x , y in p.items():
        if y['length'] > max_length:
            div = math.ceil(y['length']/max_length)
            piece = int(y['length']/div)
            # keys of the chain, downstream-most first: the original key, then div-1 new keys
            chain = [x] + list(range(next_key, next_key + div - 1))
            next_key += div - 1
            for n, key in enumerate(chain):
                # read the downstream from con, since an earlier split may have re-pointed it
                downstream = con[x]['downstream'] if n == 0 else chain[n-1]
                upstreams = y['upstreams'] if n == div-1 else {chain[n+1]}
                con.update({key : {'downstream' : downstream, 'length' : piece, 'upstreams' : upstreams, 'nid': y['nid']}})
            # re-point the original upstream neighbors to the new upstream-most piece
            for z in y['upstreams']:
                if z in con:
                    con[z] = dict(con[z], downstream = chain[-1])
            
shrink(new_con)

for x,y in con.items():
    print(x,y)


# Visualization of New Network

In [None]:
import recursive_print
# recursive_print.print_connections(test_headwater_keys, test_terminal_keys, )
resized_rows = []
for x , y in con.items():
    resized_rows.append([x,y['length'],y['downstream'],0])
print(resized_rows)



In [None]:
# def main():
if 1 == 1:
    """##TEST"""
    print("")
    print ('Executing Test')
    # Test data
    test_rows = resized_rows.copy()

    test_key_col = 0
    test_downstream_col = 2
    test_length_col = 1
    test_terminal_code = -999
    debuglevel = -2
    verbose = True

    (test_connections) = networkbuilder.get_down_connections(
                rows = test_rows
                , key_col = test_key_col
                , downstream_col = test_downstream_col
                , length_col = test_length_col
                , verbose = True
                , debuglevel = debuglevel
                )

    (test_all_keys, test_ref_keys, test_headwater_keys
     , test_terminal_keys
     , test_terminal_ref_keys
     , test_circular_keys) = networkbuilder.determine_keys(
                connections = test_connections
                
                , key_col = test_key_col
                , downstream_col = test_downstream_col
                , terminal_code = test_terminal_code
                , verbose = True
                , debuglevel = debuglevel
                )

    test_junction_keys = networkbuilder.get_up_connections(
                connections = test_connections
                , terminal_code = test_terminal_code
                , headwater_keys = test_headwater_keys
                , terminal_keys = test_terminal_keys
                , verbose = True
                , debuglevel = debuglevel
                )

    recursive_print.print_connections(
                headwater_keys = test_headwater_keys
                , down_connections = test_connections
                , up_connections = test_connections
                , terminal_code = test_terminal_code
                , terminal_keys = test_terminal_keys
                , terminal_ref_keys = test_terminal_ref_keys
                , debuglevel = debuglevel
                )

    recursive_print.print_basic_network_info(
                connections = test_connections
                , headwater_keys = test_headwater_keys
                , junction_keys = test_junction_keys
                , terminal_keys = test_terminal_keys
                , terminal_code = test_terminal_code
                , verbose = True
                , debuglevel = debuglevel
                )


# if __name__ == "__main__":
#     main()


In [None]:
#  con.update({6 : {'downstream' : {3,4}, 'length' : 394, 'upstreams' : {18}, 'nid': 0}})
# con.update({3 : {'downstream' : 2, 'length' : 815, 'upstreams' : {-999}, 'nid': 8}})
# con.update({7 : {'downstream' : 2, 'length' : 815, 'upstreams' : {-999}, 'nid': 8}})
# del con[16]
# del con[17]
# del con[18]
# del con[19]
# del con[20]
# del con[21]
# del con[22]
# del con[23]
# del con[24]
# del con[25]

# del con[26]
# del con[27]

# Assign Order IDs

In [80]:
#original 
import time

#list of the different key sets
#
# terminal_keys = Brazos_LowerColorado_ge5_values[4] 
# circular_keys = Brazos_LowerColorado_ge5_values[6]
# terminal_keys_super = terminal_keys - circular_keys
# con = Brazos_LowerColorado_ge5_values[0]
# terminal_code = terminal_code_NHD
# #
# terminal_keys = CONUS_ge5_values[4] 
# circular_keys = CONUS_ge5_values[6]
# terminal_keys_super = terminal_keys - circular_keys
# con = CONUS_ge5_values[0]
# #
# terminal_keys_super = test_terminal_keys - test_circular_keys
# # con = test_connections
# terminal_code = test_terminal_code

terminal_keys = CONUS_FULL_RES_values[4] 
circular_keys = CONUS_FULL_RES_values[6]
terminal_keys_super = terminal_keys - circular_keys
con = CONUS_FULL_RES_values[0]
terminal_code = terminal_code_NHD
#orders of everything for computation in dictionary format
order_dict={}
junc_dict={}
head_dict={}
terminal_avg = {}

def recursive_junction_read (keys, iterator, con, network, terminal_code, verbose = False, debuglevel = 0):
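    # NOTE: `nid` is not a parameter of this function; it is read from the module-level
    # loop variable set by the `for nid, network in super_networks.items()` loop below.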
    for key in keys:
        
        result = 'node' + str(key) 
        order_dict[result] = iterator, nid
        ckey = key
        ukeys = con[key]['upstreams']
        
        con[key].update({'nid': nid})
        con[key].update({'node_order' : iterator})
        
        # print(f"segs at ckey {ckey}: {network['segment_count']}")
        n = []
        while not len(ukeys) >= 2 and not (ukeys == {terminal_code}):
            # the terminal code will indicate a headwater
            if debuglevel <= -2: print(ukeys)
            (ckey,) = ukeys
            ukeys = con[ckey]['upstreams']
            network['segment_count'] += 1
            # print(f"segs at ckey {ckey}: {network['segment_count']}")
            #adds ordering for all nodes
            result = 'node' + str(ckey)
            order_dict[result] = iterator, nid
            
#             print(ckey)
            con[ckey].update({'nid': nid})
            con[ckey].update({'node_order' : iterator+1+sum(n)})
            n.append(1)
            
        if len(ukeys) >= 2:
            if debuglevel <= -1: print(f"junction found at {ckey} with upstreams {ukeys}")
            network['segment_count'] += 1
            # print(f"segs at ckey {ckey}: {network['segment_count']}")
            network['junction_count'] += 1 #the Terminal Segment
            #iterator adds 1 each iteration to provide a new order of computation for each junction section not each node or group of segments
            result_junc = 'junc' + str(key)
            junc_dict[result_junc] = iterator
            con[ckey].update({'nid': nid})
            
            recursive_junction_read (ukeys, iterator+1+sum(n), con, network, terminal_code, verbose, debuglevel)
            n.clear()
        elif ukeys == {terminal_code}:
            # print(f"headwater found at {ckey}")
            network['segment_count'] += 1
            # print(f"segs at ckey {ckey}: {network['segment_count']}")
            #below adds headwaters to the headwater list
            result_head = 'head' + str(key)
            head_dict[result_head] = iterator
            con[ckey].update({'nid': nid})
            
      
def super_network_trace(nid, iterator, con, network, terminal_code, debuglevel = 0):
    # print(f'\ntraversing upstream on network {nid}:')
    try:
        network.update({'junction_count': 0})
        network.update({'segment_count': 0}) #the Terminal Segment
        
        recursive_junction_read([nid], iterator , con, network, terminal_code, debuglevel = debuglevel)
        
        # print(f"junctions: {network['junction_count']}")
        # print(f"segments: {network['segment_count']}")
        
    except Exception as exc:
        print(exc)

super_networks = {terminal_key:{}
                        for terminal_key in terminal_keys_super}
debuglevel = 0

start_time = time.time()
for nid, network in super_networks.items():
    super_network_trace(nid, 0, con, network, terminal_code, debuglevel = debuglevel)
    

print("--- %s seconds ---" % (time.time() - start_time))

# for x,y in con.items():
#     print(x,y)
# print(super_networks.items())
# print(network)

--- 8.87982964515686 seconds ---


# Collect Unique Node Ordering Characters

In [None]:

# unique node_order values, from greatest to smallest
new_order = sorted({f['node_order'] for f in con.values()}, reverse = True)
print(new_order)


# Order of the keys for computation
*   A list of node keys is created, sorted from greatest to smallest `node_order`, so that upstream segments are computed before the segments they drain into
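
A possible refinement of this ordering, sketched with made-up data: because `node_order` grows with every step upstream in the ordering cell above, two segments that share a value should never lie on the same flow path, so each group of equal-order keys could be treated as one batch. The dictionary `toy_con` and the name `batches` are illustrative assumptions, not part of the notebook's data.

```python
from itertools import groupby

# Hypothetical miniature of `con` after order assignment (only 'node_order' is used here).
toy_con = {
    10: {'node_order': 0},
    11: {'node_order': 1},
    12: {'node_order': 2},
    13: {'node_order': 2},
    14: {'node_order': 1},
}

def order_of(key):
    return toy_con[key]['node_order']

# Keys sorted from greatest to smallest order; ties keep dictionary order.
ordered_keys = sorted(toy_con, key=order_of, reverse=True)

# Consecutive keys with the same order form one batch.
batches = [(order, list(keys)) for order, keys in groupby(ordered_keys, key=order_of)]
print(batches)
```

Processing the batches in sequence preserves the upstream-to-downstream dependency, while the keys inside one batch are candidates for parallel evaluation.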




In [None]:
# keys sorted from greatest to smallest node_order (ties keep dictionary order)
g2l = sorted(con, key = lambda k: con[k]['node_order'], reverse = True)
print(g2l)

# the same ordering, as a list of single-entry {key: attributes} dictionaries
reordered = [{k: con[k]} for k in g2l]
print(reordered)




# Adding Names to Rivers

In [81]:
import pandas as pd
# Read the reach table, which includes the GNIS names
data = pd.read_csv("/home/jacob.hreha/Desktop/nwm_reaches_conus_20_wGNIS.csv")
data_rows = data.to_numpy()

# Map each reach key (column 1) to its GNIS name (column 5)
name_con = {}
for row in data_rows:
    name_con.update({row[1]: {'gnis_name': row[5]}})


In [82]:
# Python program to get N key:value pairs in given dictionary 
# using itertools.islice() method 
  
import itertools  
   
# Initialize limit  
N = 10
    
# Using islice() + items()  
# Get first N items in dictionary  
out = dict(itertools.islice(name_con.items(), N))  
        
# printing result   
print("Dictionary limited by K is : ")
for o in out:
    print(str(o), str(out[o]))

Dictionary limited by K is : 
83050 {'gnis_name': 'South Fork Miami River'}
82592 {'gnis_name': 'Oleta River'}
79368 {'gnis_name': 'South Fork New River'}
84356 {'gnis_name': 'South Fork Miami River'}
82932 {'gnis_name': 'Little River'}
84374 {'gnis_name': 'Little River'}
82974 {'gnis_name': 'Little River'}
82702 {'gnis_name': 'Oleta River'}
82698 {'gnis_name': 'Oleta River'}
84104 {'gnis_name': 'Oleta River'}


In [93]:
for x,y in con.items():
    if y['nid'] == None:
        print(x)

In [100]:
terminal_keys = CONUS_FULL_RES_values[4] 

new_dict= {}

for z,y in con.items():
    if y['gnis_name'] == None:
        new_dict.update({ y['nid'] :{'gnis_name':None}})
print(new_dict[24778867])


{'gnis_name': None}


In [117]:
# found = []
# nfound = []
for x,y in con.items():
    if x in (name_con):
#         found.append(x, data.gnis_name[x])
        con[x].update(name_con[x]) #updates current con key with the values of name_con for that key
#         found.append(x)
    
    else:
        con[x].update({'gnis_name': None})
#         nfound.append(x, "None")
#         nfound.append(x)


def chk(b, j):
    # b: upstream keys to examine; j: the unnamed segment that receives a name
    ups = [i for i in b if i in con]  # drops the terminal code, which is not a key in con
    if not ups:
        # nothing upstream to inherit a name from
        con[j].update({'gnis_name': 'Unknown'})
        return
    m = max(con[i]['node_order'] for i in ups)
    for i in ups:
        if con[i]['node_order'] == m:
            if con[i]['gnis_name'] is not None:
                con[j].update({'gnis_name': con[i]['gnis_name']})
            elif con[i]['upstreams'] == {terminal_code}:
                # unnamed headwater: no name can be found further upstream
                con[j].update({'gnis_name': 'Unknown'})
            else:
                chk(list(con[i]['upstreams']), j)

for j,k in con.items():
    if k['gnis_name'] is None:
        chk(list(k['upstreams']), j)

# for j,k in con.items():
#     if j == 6640886:
#         print(k)
# for i,f in new_dict.items():
#     if i == 6640886:
#         print(f)

In [114]:
print(con[11160653]['length'])

5471.0


In [112]:
for x,y in con.items():
    print(x,y)

6635572 {'downstream': 6635570, 'length': 1070.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 188, 'gnis_name': None}
6635590 {'downstream': 6635600, 'length': 1117.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 197, 'gnis_name': None}
6635598 {'downstream': 6635636, 'length': 2303.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 196, 'gnis_name': None}
6635622 {'downstream': 6635620, 'length': 1119.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 188, 'gnis_name': None}
6635626 {'downstream': 6635624, 'length': 3171.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 189, 'gnis_name': None}
6635640 {'downstream': 6635660, 'length': 5338.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 198, 'gnis_name': None}
6635654 {'downstream': 6635660, 'length': 2344.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 198, 'gnis_name': None}
6635658 {'downstream': 6635620, 'length': 2558.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 188, 'gnis_name': None}
6635672 {'downstream': 6

7077362 {'downstream': 7076862, 'length': 13132.0, 'upstreams': {0}, 'nid': 7071696, 'node_order': 3, 'gnis_name': None}
7077364 {'downstream': 7077482, 'length': 12774.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 21, 'gnis_name': None}
7077366 {'downstream': 7077490, 'length': 12038.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 23, 'gnis_name': None}
7077368 {'downstream': 7077072, 'length': 1527.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 26, 'gnis_name': None}
7077370 {'downstream': 7077072, 'length': 8315.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 26, 'gnis_name': None}
7077374 {'downstream': 7077084, 'length': 17491.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 43, 'gnis_name': None}
7077378 {'downstream': 7077112, 'length': 3914.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 36, 'gnis_name': None}
7077380 {'downstream': 7077126, 'length': 20701.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 40, 'gnis_name': None}
7077536 {'downstream': 70770

7743292 {'downstream': 7743124, 'length': 2622.0, 'upstreams': {0}, 'nid': 167200737, 'node_order': 71, 'gnis_name': None}
7743548 {'downstream': 7743132, 'length': 3177.0, 'upstreams': {0}, 'nid': 167200737, 'node_order': 76, 'gnis_name': None}
7743598 {'downstream': 7743764, 'length': 1127.0, 'upstreams': {0}, 'nid': 167200737, 'node_order': 78, 'gnis_name': None}
7743640 {'downstream': 7743642, 'length': 1100.0, 'upstreams': {0}, 'nid': 167200737, 'node_order': 81, 'gnis_name': 'Rapid River'}
7743650 {'downstream': 7743642, 'length': 870.0, 'upstreams': {0}, 'nid': 167200737, 'node_order': 81, 'gnis_name': None}
7746085 {'downstream': 7746589, 'length': 1594.0, 'upstreams': {0}, 'nid': 167200737, 'node_order': 29, 'gnis_name': None}
7746101 {'downstream': 7746099, 'length': 2400.0, 'upstreams': {0}, 'nid': 167200737, 'node_order': 38, 'gnis_name': None}
7746111 {'downstream': 7746107, 'length': 1515.0, 'upstreams': {0}, 'nid': 7746107, 'node_order': 1, 'gnis_name': None}
7746113 {'d

14415771 {'downstream': 14415769, 'length': 1612.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 89, 'gnis_name': None}
14415775 {'downstream': 14415773, 'length': 1352.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 98, 'gnis_name': None}
14415779 {'downstream': 14415755, 'length': 2532.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 60, 'gnis_name': None}
14415781 {'downstream': 14415769, 'length': 2381.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 89, 'gnis_name': None}
14415785 {'downstream': 14415759, 'length': 4568.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 97, 'gnis_name': None}
14415789 {'downstream': 14415783, 'length': 2062.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 98, 'gnis_name': None}
14415791 {'downstream': 14415961, 'length': 3436.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 100, 'gnis_name': None}
14415795 {'downstream': 14415787, 'length': 1176.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 100, 'gnis_name': None}
14415799 {'dow

7030223 {'downstream': 7030633, 'length': 1584.0, 'upstreams': {7030829}, 'nid': 7077392, 'node_order': 182, 'gnis_name': None}
7030231 {'downstream': 7030749, 'length': 1743.0, 'upstreams': {7030831}, 'nid': 7077392, 'node_order': 185, 'gnis_name': None}
7030233 {'downstream': 7030751, 'length': 362.0, 'upstreams': {7030827}, 'nid': 7077392, 'node_order': 203, 'gnis_name': None}
7030239 {'downstream': 7030249, 'length': 788.0, 'upstreams': {7030227, 7030229}, 'nid': 7077392, 'node_order': 172, 'gnis_name': None}
7030243 {'downstream': 7030251, 'length': 433.0, 'upstreams': {7030833}, 'nid': 7077392, 'node_order': 172, 'gnis_name': None}
7030279 {'downstream': 7030275, 'length': 1296.0, 'upstreams': {7030835}, 'nid': 7077392, 'node_order': 181, 'gnis_name': None}
7030293 {'downstream': 7030761, 'length': 4734.0, 'upstreams': {7030837}, 'nid': 7077392, 'node_order': 197, 'gnis_name': 'Little Elbow Creek'}
7030297 {'downstream': 7030777, 'length': 670.0, 'upstreams': {7030295}, 'nid': 70

9398246 {'downstream': 9398222, 'length': 306.0, 'upstreams': {9398248, 9398220}, 'nid': 7077392, 'node_order': 79, 'gnis_name': 'Elm Coulee'}
9398536 {'downstream': 9398528, 'length': 4797.0, 'upstreams': {9398546, 9398548}, 'nid': 7077392, 'node_order': 74, 'gnis_name': None}
9398568 {'downstream': 9398566, 'length': 2283.0, 'upstreams': {9398610}, 'nid': 7077392, 'node_order': 81, 'gnis_name': None}
9398626 {'downstream': 9398588, 'length': 4540.0, 'upstreams': {9398666, 9398662}, 'nid': 7077392, 'node_order': 84, 'gnis_name': None}
9398688 {'downstream': 9398644, 'length': 1602.0, 'upstreams': {9398722}, 'nid': 7077392, 'node_order': 96, 'gnis_name': None}
9398748 {'downstream': 9399154, 'length': 1808.0, 'upstreams': {9399178}, 'nid': 7077392, 'node_order': 125, 'gnis_name': None}
9398758 {'downstream': 9398772, 'length': 2845.0, 'upstreams': {9398724, 9398718}, 'nid': 7077392, 'node_order': 101, 'gnis_name': 'Maple Creek'}
9398760 {'downstream': 9399158, 'length': 1901.0, 'upstre

7098227 {'downstream': 7097123, 'length': 152.0, 'upstreams': {7097117}, 'nid': 167200737, 'node_order': 289, 'gnis_name': None}
7098245 {'downstream': 7097835, 'length': 532.0, 'upstreams': {7097207}, 'nid': 167200737, 'node_order': 391, 'gnis_name': None}
7098249 {'downstream': 7098259, 'length': 117.0, 'upstreams': {7098241}, 'nid': 167200737, 'node_order': 388, 'gnis_name': None}
7098307 {'downstream': 7097267, 'length': 255.0, 'upstreams': {7097265}, 'nid': 167200737, 'node_order': 389, 'gnis_name': None}
7098333 {'downstream': 7098329, 'length': 92.0, 'upstreams': {7097307}, 'nid': 167200737, 'node_order': 453, 'gnis_name': None}
7098355 {'downstream': 7097305, 'length': 182.0, 'upstreams': {7097293}, 'nid': 167200737, 'node_order': 291, 'gnis_name': None}
7098363 {'downstream': 7097339, 'length': 61.0, 'upstreams': {7097367}, 'nid': 167200737, 'node_order': 355, 'gnis_name': None}
7098369 {'downstream': 7098351, 'length': 341.0, 'upstreams': {7097375}, 'nid': 167200737, 'node_or

939021215 {'downstream': 14267830, 'length': 791.0, 'upstreams': {939021213, 939021214}, 'nid': 7077392, 'node_order': 354, 'gnis_name': None}
939021236 {'downstream': 939021235, 'length': 50.0, 'upstreams': {939021230}, 'nid': 7077392, 'node_order': 91, 'gnis_name': None}
939021238 {'downstream': 939021239, 'length': 947.0, 'upstreams': {939021237}, 'nid': 7077392, 'node_order': 188, 'gnis_name': None}
939021247 {'downstream': 6688853, 'length': 2387.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 144, 'gnis_name': None}
939021289 {'downstream': 939021311, 'length': 945.0, 'upstreams': {14428798}, 'nid': 7077392, 'node_order': 71, 'gnis_name': None}
939021298 {'downstream': 939021297, 'length': 70.0, 'upstreams': {7048787}, 'nid': 7077392, 'node_order': 151, 'gnis_name': None}
939021305 {'downstream': 939021306, 'length': 8853.0, 'upstreams': {14293961, 14293959}, 'nid': 14301853, 'node_order': 2, 'gnis_name': None}
939021332 {'downstream': 939021333, 'length': 2020.0, 'upstreams':

7106903 {'downstream': 7106907, 'length': 1482.0, 'upstreams': {7106899, 7106901}, 'nid': 167200737, 'node_order': 417, 'gnis_name': 'Maggie Creek'}
7106995 {'downstream': 7106993, 'length': 417.0, 'upstreams': {7107203}, 'nid': 167200737, 'node_order': 404, 'gnis_name': None}
7107091 {'downstream': 7106789, 'length': 870.0, 'upstreams': {7107089, 7107085, 7107087}, 'nid': 167200737, 'node_order': 403, 'gnis_name': None}
7107159 {'downstream': 7106847, 'length': 364.0, 'upstreams': {7106833}, 'nid': 167200737, 'node_order': 407, 'gnis_name': None}
7107173 {'downstream': 7106853, 'length': 556.0, 'upstreams': {7106843}, 'nid': 167200737, 'node_order': 407, 'gnis_name': None}
7109549 {'downstream': 7111443, 'length': 166.0, 'upstreams': {7097709}, 'nid': 167200737, 'node_order': 315, 'gnis_name': None}
7109569 {'downstream': 7111451, 'length': 826.0, 'upstreams': {7111411}, 'nid': 167200737, 'node_order': 314, 'gnis_name': None}
7109593 {'downstream': 7111415, 'length': 1867.0, 'upstream

14426318 {'downstream': 14426334, 'length': 1150.0, 'upstreams': {14425728}, 'nid': 7077392, 'node_order': 87, 'gnis_name': None}
14426508 {'downstream': 14426544, 'length': 930.0, 'upstreams': {14426382, 14426398}, 'nid': 7077392, 'node_order': 83, 'gnis_name': 'Middle Branch Park River'}
14426536 {'downstream': 14426548, 'length': 132.0, 'upstreams': {14426530}, 'nid': 7077392, 'node_order': 82, 'gnis_name': None}
14428262 {'downstream': 14428274, 'length': 5041.0, 'upstreams': {14430172}, 'nid': 7077392, 'node_order': 85, 'gnis_name': None}
14428508 {'downstream': 909021072, 'length': 3103.0, 'upstreams': {14426458}, 'nid': 7077392, 'node_order': 81, 'gnis_name': None}
14428634 {'downstream': 14428862, 'length': 8040.0, 'upstreams': {14428642}, 'nid': 7077392, 'node_order': 60, 'gnis_name': None}
14428650 {'downstream': 14428868, 'length': 4486.0, 'upstreams': {0}, 'nid': 7077392, 'node_order': 61, 'gnis_name': None}
14428754 {'downstream': 14428720, 'length': 2682.0, 'upstreams': {


 {'downstream': 1090012071, 'length': 1302.0, 'upstreams': {0}, 'nid': 167200737, 'node_order': 187, 'gnis_name': None}
1090011546 {'downstream': 1090011555, 'length': 661.4000244140625, 'upstreams': {1090011537, 1090011538}, 'nid': 167200737, 'node_order': 187, 'gnis_name': None}
1090011545 {'downstream': 1090011555, 'length': 74.80000305175781, 'upstreams': {0}, 'nid': 167200737, 'node_order': 187, 'gnis_name': None}
1090011457 {'downstream': 1090011439, 'length': 1904.800048828125, 'upstreams': {0}, 'nid': 167200737, 'node_order': 187, 'gnis_name': None}
1090011443 {'downstream': 1090011439, 'length': 2991.39990234375, 'upstreams': {0}, 'nid': 167200737, 'node_order': 187, 'gnis_name': None}
1090011426 {'downstream': 1090011437, 'length': 2166.5, 'upstreams': {1090011427, 1090011428}, 'nid': 167200737, 'node_order': 187, 'gnis_name': None}
1090011425 {'downstream': 1090011437, 'length': 360.29998779296875, 'upstreams': {0}, 'nid': 167200737, 'node_order': 187, 'gnis_name': None}
109

1090010152 {'downstream': 1090010072, 'length': 2476.60009765625, 'upstreams': {0}, 'nid': 167200737, 'node_order': 165, 'gnis_name': None}
1090010639 {'downstream': 1090010674, 'length': 2035.5, 'upstreams': {1090010613, 1090010614}, 'nid': 167200737, 'node_order': 165, 'gnis_name': None}
1090010072 {'downstream': 1090010100, 'length': 84.9000015258789, 'upstreams': {1090010152, 1090010098}, 'nid': 167200737, 'node_order': 164, 'gnis_name': None}
1090010069 {'downstream': 1090010100, 'length': 1473.9000244140625, 'upstreams': {1090009995, 1090009996}, 'nid': 167200737, 'node_order': 164, 'gnis_name': None}
1090010381 {'downstream': 1090010534, 'length': 1854.0, 'upstreams': {0}, 'nid': 167200737, 'node_order': 164, 'gnis_name': None}
1090010380 {'downstream': 1090010534, 'length': 500.6000061035156, 'upstreams': {0}, 'nid': 167200737, 'node_order': 164, 'gnis_name': None}
1090009339 {'downstream': 1090009445, 'length': 3861.89990234375, 'upstreams': {1090009161, 1090009146}, 'nid': 16

KeyboardInterrupt: 

# Graphical Examples

In [None]:
import matplotlib.pyplot as plt
import numpy as np

test_rows = [
    [0,456,None,0],
    [1,678,4,0],
    [2,394,0,0],
    [3,815,2,0],
    [4,798,0,0],
    [5,679,4,0],
    [6,394,0,0],
    [7,815,2,0],
    [8,841,None,0],
    [9,524,12,0],
    [10,458,9,0],
    [11,548,8,0],
    [12,543,8,0],
    [13,458,14,0],
    [14,548,10,0],
    [15,543,14,0],
]

label = []
length = []
for x in test_rows:
    label.append(x[0])
    length.append(x[1])
print(label)
print(length)

def plot_bar_x():
    # this is for plotting purpose
    index = np.arange(len(label))
    plt.bar(index, length)
    plt.xlabel('Terminal Number', fontsize=10)
    plt.ylabel('Segment Length', fontsize=10)
    plt.xticks(index, label, fontsize=10, rotation=30)
    plt.title('Segment Lengths by Terminal Code')
    plt.show()
plot_bar_x()

In [None]:
import matplotlib.pyplot as plt
import numpy as np


test_rows = [
    [0,456,None,0],
    [1,678,4,0],
    [2,394,0,0],
    [3,815,2,0],
    [4,798,0,0],
    [5,679,4,0],
    [6,394,0,0],
    [7,815,2,0],
    [8,841,None,0],
    [9,524,12,0],
    [10,458,9,0],
    [11,548,8,0],
    [12,543,8,0],
    [13,458,14,0],
    [14,548,10,0],
    [15,543,14,0],
]

for x in test_rows:
    if x[2] is None:
        x[2] = -1
print(test_rows[0][2])



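# Column 0 is the segment index; column 2 is the index that the segment
# references (-1 where the original value was None).
label = []
length = []  # NOTE: despite the name, this holds the column-2 references
for x in test_rows:
    label.append(x[0])
    length.append(x[2])

# For each segment, count how many rows reference it in column 2.
occ_count = []
for x in label:
    occ_count.append(length.count(x))
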
print(label)
print(length)
print(occ_count)
def plot_bar_x():
    # this is for plotting purpose
    index = np.arange(len(label))
    plt.bar(index, occ_count)
    plt.xlabel('Terminal Number', fontsize=10)
    plt.ylabel('Number of Connections', fontsize=10)
    plt.xticks(index, label, fontsize=10, rotation=30)
    plt.title('Occurrences by Terminal Code')
    plt.show()
plot_bar_x()

### Segment Lengths by Terminal Code

In [None]:
import matplotlib.pyplot as plt
import numpy as np

test_rows = [
    [0,456,None,0],
    [1,678,4,0],
    [2,394,0,0],
    [3,815,2,0],
    [4,798,0,0],
    [5,679,4,0],
    [6,394,0,0],
    [7,815,2,0],
    [8,841,None,0],
    [9,524,12,0],
    [10,458,9,0],
    [11,548,8,0],
    [12,543,8,0],
    [13,458,14,0],
    [14,548,10,0],
    [15,543,14,0],
]

label = []
length = []
for x in test_rows:
    label.append(x[0])
    length.append(x[1])
print(label)
print(length)


length, label = zip(*sorted(zip(length, label)))


def plot_bar_x():
    # this is for plotting purpose
    index = np.arange(len(label))
    plt.bar(index, length)
    plt.xlabel('Terminal Number', fontsize=10)
    plt.ylabel('Segment Length', fontsize=10)
    plt.xticks(index, label, fontsize=10, rotation=30)
    plt.title('Segment Lengths by Terminal Code')
    plt.show()
plot_bar_x()

print(label)
print(length)

In [None]:
terminal_keys = Brazos_LowerColorado_ge5_values[4] 
circular_keys = Brazos_LowerColorado_ge5_values[6]
terminal_keys_super = terminal_keys - circular_keys
con = Brazos_LowerColorado_ge5_values[0]
terminal_code = terminal_code_NHD

# Segment ids become the bar labels; segment lengths become the bar heights.
B_labels = list(con.keys())
B_lengths = list(con.values())
new_coor = [int(segment['length']) for segment in B_lengths]

print(new_coor)
print(B_labels)
# The lines above print the lengths and labels used to graph the Brazos segments.
# for x in list(con.values()):
#     print(dict(x)[1])
# for downstream, length, upstreams in con.items(): 
#     print (downstream, length)
# print((B_lengths[0]))
# fx = ((B_lengths[0]))
# print(fx)
# print(fx['length'])
# print(con)
# print(B_labels)
# print(B_lengths)

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.pyplot import figure


figure(num=None, figsize=(20, 5), dpi=80, facecolor='w', edgecolor='k')

new_coor, B_labels = zip(*sorted(zip(new_coor, B_labels)))

def plot_bar_x():
    # this is for plotting purpose
    index = np.arange(len(B_labels))
    plt.bar(index, new_coor)
    plt.xlabel('Terminal Number', fontsize=10)
    plt.ylabel('Segment Length', fontsize=10)
    plt.xticks(index, B_labels, fontsize=5, rotation=10)
    plt.title('Segment Lengths by Terminal Code')
    plt.show()
plot_bar_x()

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.pyplot import figure


figure(num=None, figsize=(20, 5), dpi=80, facecolor='w', edgecolor='k')

new_coor, B_labels = zip(*sorted(zip(new_coor, B_labels)))


def plot_bar_x():
    # this is for plotting purpose
    index = np.arange(len(B_labels[-20:]))
    plt.bar(index, new_coor[-20:])
    plt.xlabel('Terminal Number', fontsize=10)
    plt.ylabel('Segment Length', fontsize=10)
    plt.xticks(index, B_labels[-20:], fontsize=5, rotation=10)
    plt.title('Segment Lengths by Terminal Code')
    plt.show()
plot_bar_x()



### Occurrences by Terminal Code

In [None]:
# import matplotlib.pyplot as plt
# import numpy as np

# terminal_keys = Brazos_LowerColorado_ge5_values[4] 
# circular_keys = Brazos_LowerColorado_ge5_values[6]
# terminal_keys_super = terminal_keys - circular_keys
# con = Brazos_LowerColorado_ge5_values[0]
# terminal_code = terminal_code_NHD

# figure(num=None, figsize=(20, 5), dpi=80, facecolor='w', edgecolor='k')

# B_labels = []
# B_lengths = []
# dict1 = []
# new_coor = []
# for x in con.keys():
#     B_labels.append(int(x))
    
# B_lengths = list(con.values())
# print(B_lengths[0])
# for x in range(0,len(B_lengths)):
#     fx = ((B_lengths[x]))
    
#     e = next(iter(fx['upstreams']))
#     new_coor.append((e))
# print((new_coor))
# print((B_labels))


# occ_count = []

# for x in B_labels:
#     occ_count.append(new_coor.count(x))



# def plot_bar_x():
#     # this is for plotting purpose
#     index = np.arange(len(B_labels))
#     plt.bar(index, occ_count)
#     plt.xlabel('Terminal Number', fontsize=10)
#     plt.ylabel('Number_of_Connections', fontsize=10)
#     plt.xticks(index, B_labels, fontsize=10, rotation=30)
#     plt.title('Occurences by Terminal Code')
#     plt.show()
# plot_bar_x()

### Averages by Terminal Code - min, max, std, var, lengths

In [None]:
import numpy as np

# terminal_keys = Brazos_LowerColorado_ge5_values[4] 
# circular_keys = Brazos_LowerColorado_ge5_values[6]
# terminal_keys_super = terminal_keys - circular_keys
# con = Brazos_LowerColorado_ge5_values[0]
# terminal_code = terminal_code_NHD
# terminal_keys = CONUS_ge5_values[4] 
# circular_keys = CONUS_ge5_values[6]
# terminal_keys_super = terminal_keys - circular_keys
# con = CONUS_ge5_values[0]

# terminal_keys_super = test_terminal_keys - test_circular_keys
# con = test_connections
# terminal_code = test_terminal_code
# terminal_keys = test_terminal_keys
g2l = []

for x,y in con.items():
    print(x,y)

# Collect segment keys in the sequence given by new_order.
for x in new_order:
    for y, z in con.items():
        if x == z['node_order']:
            g2l.append(y)
# print(g2l)
# print(con.items())
# print(list(test_terminal_keys))

totals_list = []  # mean segment length per terminal key
max_list = []
min_list = []
std_dev = []
var = []

print(list(terminal_keys))
for x in list(terminal_keys):
    # Collect the lengths of every segment assigned to this terminal key.
    # The list is rebuilt for each key so that the statistics do not
    # accumulate across terminal keys.
    count = [g['length'] for g in con.values() if g['nid'] == x]

    totals_list.append(int(sum(count) / len(count)))
    max_list.append(int(max(count)))
    min_list.append(int(min(count)))
    std_dev.append(int(np.std(count)))
    var.append(int(np.var(count)))

print(totals_list)
print(terminal_keys)
print(max_list)
print(min_list)
print(std_dev)
print(var)
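
The per-terminal statistics above scan all of `con` once for every terminal key. A minimal sketch of a single-pass alternative follows, assuming each value in `con` carries `'nid'` and `'length'` entries as in the printed output. The function name `length_stats_by_terminal` and the toy `con_sample` data are hypothetical and used only for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev, pvariance

def length_stats_by_terminal(connections):
    """Group segment lengths by 'nid' and summarize each group."""
    lengths = defaultdict(list)
    for segment in connections.values():
        lengths[segment['nid']].append(segment['length'])

    stats = {}
    for nid, values in lengths.items():
        stats[nid] = {
            'mean': mean(values),
            'max': max(values),
            'min': min(values),
            # Population statistics, matching the defaults of np.std / np.var
            'std': pstdev(values),
            'var': pvariance(values),
        }
    return stats

# Toy stand-in for `con` (hypothetical values, not taken from the NHD).
con_sample = {
    10: {'length': 100.0, 'nid': 1},
    11: {'length': 300.0, 'nid': 1},
    20: {'length': 50.0, 'nid': 2},
}
stats = length_stats_by_terminal(con_sample)
print(stats[1]['mean'], stats[1]['std'])
```

Keeping the results in one dictionary keyed by terminal id also means the plots can sort each statistic independently without reordering a shared list of keys.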

In [None]:
import matplotlib.pyplot as plt
import numpy as np


# Sort into new names so terminal_keys keeps its original order and stays
# aligned with max_list, min_list, std_dev and var in the cells that follow.
sorted_totals, sorted_keys = zip(*sorted(zip(totals_list, terminal_keys)))

plt.figure(num=None, figsize=(10, 5), dpi=80, facecolor='w', edgecolor='k')
def plot_bar_x():
    # Bar chart of the 20 largest average segment lengths
    index = np.arange(len(sorted_keys[-20:]))
    plt.bar(index, sorted_totals[-20:])
    plt.xlabel('Terminal Number', fontsize=10)
    plt.ylabel('Segment Length', fontsize=10)
    plt.xticks(index, sorted_keys[-20:], fontsize=10, rotation=30)
    plt.title('Average Segment Lengths by Terminal Code')
    plt.show()
plot_bar_x()

print(sorted_keys)

In [None]:
import matplotlib.pyplot as plt
import numpy as np


# Sort into new names so terminal_keys stays aligned with the other lists.
sorted_max, sorted_keys = zip(*sorted(zip(max_list, terminal_keys)))

plt.figure(num=None, figsize=(10, 5), dpi=80, facecolor='w', edgecolor='k')
def plot_bar_x():
    # Bar chart of the maximum segment length for each terminal key
    index = np.arange(len(sorted_keys))
    plt.bar(index, sorted_max)
    plt.xlabel('Terminal Number', fontsize=10)
    plt.ylabel('Segment Length', fontsize=10)
    plt.xticks(index, sorted_keys, fontsize=10, rotation=30)
    plt.title('Maximum Segment Length by Terminal Code')
    plt.show()
plot_bar_x()

print(sorted_keys)

In [None]:
import matplotlib.pyplot as plt
import numpy as np


# Sort into new names so terminal_keys stays aligned with the other lists.
sorted_min, sorted_keys = zip(*sorted(zip(min_list, terminal_keys)))

plt.figure(num=None, figsize=(10, 5), dpi=80, facecolor='w', edgecolor='k')
def plot_bar_x():
    # Bar chart of the minimum segment length for each terminal key
    index = np.arange(len(sorted_keys))
    plt.bar(index, sorted_min)
    plt.xlabel('Terminal Number', fontsize=10)
    plt.ylabel('Segment Length', fontsize=10)
    plt.xticks(index, sorted_keys, fontsize=10, rotation=30)
    plt.title('Minimum Segment Length by Terminal Code')
    plt.show()
plot_bar_x()

print(sorted_keys)

In [None]:
import matplotlib.pyplot as plt
import numpy as np


# Sort into new names so terminal_keys stays aligned with the other lists.
sorted_std, sorted_keys = zip(*sorted(zip(std_dev, terminal_keys)))

plt.figure(num=None, figsize=(10, 5), dpi=80, facecolor='w', edgecolor='k')
def plot_bar_x():
    # Bar chart of the standard deviation of segment length for each terminal key
    index = np.arange(len(sorted_keys))
    plt.bar(index, sorted_std)
    plt.xlabel('Terminal Number', fontsize=10)
    plt.ylabel('Standard Deviation of Segment Length', fontsize=10)
    plt.xticks(index, sorted_keys, fontsize=10, rotation=30)
    plt.title('Standard Deviation by Terminal Code')
    plt.show()
plot_bar_x()

print(sorted_keys)

In [None]:
import matplotlib.pyplot as plt
import numpy as np


# Sort into new names so terminal_keys stays aligned with the other lists.
sorted_var, sorted_keys = zip(*sorted(zip(var, terminal_keys)))

plt.figure(num=None, figsize=(10, 5), dpi=80, facecolor='w', edgecolor='k')
def plot_bar_x():
    # Bar chart of the variance of segment length for each terminal key
    index = np.arange(len(sorted_keys))
    plt.bar(index, sorted_var)
    plt.xlabel('Terminal Number', fontsize=10)
    plt.ylabel('Variance', fontsize=10)
    plt.xticks(index, sorted_keys, fontsize=10, rotation=30)
    plt.title('Variance by Terminal Code')
    plt.show()
plot_bar_x()

print(sorted_keys)

### Average Lengths by Bin

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure


figure(num=None, figsize=(20, 5), dpi=80, facecolor='w', edgecolor='k')
x = length
num_bins = 10
n, bins, patches = plt.hist(x, num_bins, facecolor='green', alpha=0.5,edgecolor='black', linewidth=1.2)
plt.xlabel('Segment Length Bins', fontsize=10)
plt.ylabel('Count', fontsize=10)

plt.title('Occurrences by Bin')
plt.show()

In [None]:
data.feature_id

In [None]:
for x,y in con.items():
    print(x,y)
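
Printing every entry of `con` for a large network is the kind of output that can exceed the notebook's IOPub data rate limit. A minimal sketch of printing only a sample instead follows, assuming `con` is a dictionary keyed by segment id as in the output above. The helper `preview_connections` and the toy `con_sample` data are hypothetical names used only for illustration.

```python
from itertools import islice

def preview_connections(connections, limit=5):
    """Return the first `limit` (key, value) pairs of a connections dict."""
    return list(islice(connections.items(), limit))

# Toy stand-in for `con`; the real dictionary comes from the network build step.
con_sample = {
    1: {'downstream': 0, 'length': 456.0, 'upstreams': {2, 3}},
    2: {'downstream': 1, 'length': 678.0, 'upstreams': {0}},
    3: {'downstream': 1, 'length': 394.0, 'upstreams': {0}},
}

for key, value in preview_connections(con_sample, limit=2):
    print(key, value)
print(f"... {len(con_sample)} segments total")
```

`islice` stops after `limit` items, so the rest of the dictionary is never formatted or sent to the client.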

In [None]:
from numba import jit
import numpy as np

x = np.arange(100).reshape(10, 10)

# @jit(nopython=True)  # Uncomment to compile in "nopython" mode (equivalent to @njit)
def go_fast(a):  # Runs as plain Python while the decorator above is commented out
    trace = 0.0
    for i in range(a.shape[0]):  # Numba likes loops
        print(i)
        trace += np.tanh(a[i, i])  # Numba likes NumPy functions
        print(trace)
    return a + trace  # Numba likes NumPy broadcasting

print(go_fast(x))
print(x)

In [None]:
from numba import jit
import numpy as np
import time
temp = []

for x , y in con.items():
    temp.append(y['length'])
new_temp = np.array(temp)

# new_temp = np.reshape(new_temp, (38,53)).T



# print(temp[0:10])    
@jit(nopython=True) 
def test(a):
    for i in range(a.shape[0]):
        x = 1000
    return a + x 
# The first call to a @jit function includes the compilation time.
start = time.time()
print(test(new_temp))
end = time.time()
print("Elapsed (with compilation) = %s" % (end - start))
print(temp[0:10]) 
print(temp[-1:])
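
Because a jitted function is compiled on its first call, a single timed call mixes compilation and execution. A minimal sketch of timing repeated calls separately follows. The helper `time_calls` and the stand-in workload `add_offset` are hypothetical; in the notebook the jitted `test` function and `new_temp` would take their place.

```python
import time

def time_calls(func, arg, repeats=2):
    """Call func(arg) `repeats` times; return the elapsed seconds of each call."""
    elapsed = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(arg)
        elapsed.append(time.perf_counter() - start)
    return elapsed

def add_offset(values):
    # Stand-in workload; swap in the jitted function to compare calls
    return [v + 1000 for v in values]

first, second = time_calls(add_offset, list(range(10000)))
print("First call  = %s" % first)
print("Second call = %s" % second)
```

With a jitted function, the gap between the first and second timings gives a rough estimate of the compilation cost.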

In [None]:
import time

def go_fast(values):
    # Pure-Python baseline: add 1000 to each length in a loop
    r = []
    for v in values:
        r.append(v + 1000)
    print(r)
    return r


start = time.time()
go_fast(temp)
end = time.time()
print("Elapsed (pure Python) = %s" % (end - start))