### Clean Network
In this process developed by Charles Fox, we move from a GOSTnets raw graph object (see Extract from osm.pbf) to a routable network. This process is fairly bespoke, with several parameters and opportunities for significant network simplification. 

In [15]:
import geopandas as gpd
import os, sys, time
import pandas as pd
sys.path.append(r'C:\Users\charl\Documents\GitHub\GOST_PublicGoods\GOSTNets\GOSTNets')
import GOSTnet as gn
import importlib
import networkx as nx
import osmnx as ox
from shapely.ops import unary_union
from shapely.wkt import loads
from shapely.geometry import LineString, MultiLineString, Point

This function defines the order in which GOSTnet functions are called on the input network object. The verbose flag makes the process save intermediate files, which helps with troubleshooting.

In [16]:
def CleanNetwork(G, wpath, country, UTM, WGS = {'init': 'epsg:4326'}, junctdist = 50, verbose = False):
    
    ### Topologically simplifies an input graph object by collapsing junctions and removing interstitial nodes
    # REQUIRED - G: a graph object containing nodes and edges. edges should have a property 
    #               called 'Wkt' containing geometry objects describing the roads
    #            wpath: the write path - a drive directory for inputs and output
    #            country: this parameter allows for the sequential processing of multiple countries
    #            UTM: the projected CRS (units in metres) in which junctdist is applied, e.g. {'init': 'epsg:32627'}
    # OPTIONAL - junctdist: distance within which to collapse neighboring nodes. simplifies junctions. 
    #            Set to 0.1 if no simplification is desired. 50m is good for national (primary / secondary) networks
    #            verbose: if True, saves down intermediate stages for dissection
    ################################################################################################
    
    # Squeezes clusters of nodes down to a single node if they are within the snapping tolerance
    a = gn.simplify_junctions(G, UTM, WGS, junctdist)

    # ensures all streets are two-way
    a = gn.add_missing_reflected_edges(a)
    
    #save progress
    if verbose:
        gn.save(a, 'a', wpath)
    
    # Finds and deletes interstitial nodes based on node degree
    b = gn.custom_simplify(a)
    
    # rectify geometry: edges whose 'Wkt' is a list are passed through unbundle_geometry
    for u, v, data in b.edges(data = True):
        if isinstance(data['Wkt'], list):
            data['Wkt'] = gn.unbundle_geometry(data['Wkt'])
    
    # save progress
    if verbose:
        gn.save(b, 'b', wpath)
    
    # custom_simplify does not return a MultiDiGraph, so convert the result back here
    c = gn.convert_to_MultiDiGraph(b)

    # This is the most controversial step: it removes duplicated edges. This takes care of highways mapped as
    # two separate carriageways, BUT destroys internal loops within roads. The process can be run with or without this line
    c = gn.remove_duplicate_edges(c)

    # Run this again after removing duplicated edges
    c = gn.custom_simplify(c)

    # Ensure all remaining edges are duplicated (two-way streets)
    c = gn.add_missing_reflected_edges(c)
    
    # save final
    gn.save(c, '%s_processed' % country, wpath)
    
    print('Edge reduction: %s to %s (%d percent)' % (G.number_of_edges(), 
                                               c.number_of_edges(), 
                                               ((G.number_of_edges() - c.number_of_edges())/G.number_of_edges()*100)))
    return c
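
To make the "two-way" step concrete, here is a minimal sketch of what adding missing reflected edges means on a toy edge list. It uses plain Python rather than GOSTnets, and the helper name `add_reflected_edges` is hypothetical; it illustrates the idea only and is not the GOSTnets implementation.

```python
def add_reflected_edges(edges):
    """Return the edges plus the reverse of every edge that lacks one.

    `edges` is an iterable of (u, v) pairs describing directed edges.
    Hypothetical helper for illustration; not part of GOSTnets.
    """
    edges = list(edges)
    existing = set(edges)
    result = list(edges)
    for u, v in edges:
        # Only add the reverse edge if it is not already present
        if (v, u) not in existing:
            result.append((v, u))
            existing.add((v, u))
    return result


# A -> B is one-way; B <-> C is already two-way
one_way = [("A", "B"), ("B", "C"), ("C", "B")]
two_way = add_reflected_edges(one_way)
print(len(one_way), "->", len(two_way))  # 3 -> 4
```

Because every one-way segment gains a reverse edge, a step like this can increase the total edge count even while other steps in the chain simplify the network.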

This is the main process; its only job is to call CleanNetwork. G objects can either be loaded from pickled graph objects or passed in from extraction or other processing chains. 

WARNING: expect this step to take a while. It will produce a pickled graph object, a dataframe of the edges, and a dataframe of the nodes. The expectation is that this will only have to be run once.
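
The next cell sets the EPSG code by hand and notes that "formulaic options exist". As a rough sketch of one such option, the standard UTM zone formula can be applied to a representative longitude and latitude. The function name `utm_epsg_from_lonlat` and the sample coordinates are assumptions for illustration; the formula ignores the special zone exceptions around Norway and Svalbard, and a country that spans several zones still needs a human decision.

```python
import math


def utm_epsg_from_lonlat(lon, lat):
    """Guess a WGS 84 / UTM EPSG code from a longitude/latitude in degrees.

    Hypothetical helper: uses the standard 6-degree zone formula and
    ignores the Norway / Svalbard zone exceptions.
    """
    zone = int(math.floor((lon + 180.0) / 6.0)) + 1
    zone = min(zone, 60)  # lon == 180 would otherwise give zone 61
    base = 32600 if lat >= 0 else 32700  # 326xx = north, 327xx = south
    return base + zone


# Approximate point in central Iceland (assumed coordinates)
print(utm_epsg_from_lonlat(-19.0, 65.0))  # 32627
```

For this sample point the formula agrees with the code used below for ISL, but treat it as a starting suggestion rather than a replacement for checking the projection yourself.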

In [18]:
UTMZs = {'ISL':32627} # Here, we set the EPSG code for the country we are working with (ISL).
# Though formulaic options exist, the choice of EPSG code should rest with the user, so this needs to be changed manually each time.

WGS = {'init': 'epsg:4326'} # do not adjust. OSM data natively comes in EPSG 4326

countries = ['ISL'] # this process can clean multiple networks in sequence by looping over them. Here, we only have one!

base_pth = r'C:\Users\charl\Documents\T' # adjust this input to your filepath. 
data_pth = os.path.join(base_pth, 'tutorial_outputs')

for country in countries:
    
    print('\n--- processing for: %s ---\n' % country)
    print('start: %s\n' % time.ctime())

    print('Outputs can be found at: %s\n' % (data_pth))
        
    UTM = {'init': 'epsg:%d' % UTMZs[country]}
    
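    # Note: nx.read_gpickle was removed in NetworkX 3.0; on newer versions, load the file with the standard pickle module instead
    G = nx.read_gpickle(os.path.join(data_pth, 'Iceland_unclean.pickle'))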
    
    G = CleanNetwork(G, data_pth, country, UTM, WGS, 0.5, verbose = False)
    print('\nend: %s' % time.ctime())
    print('\n--- processing complete for: %s ---' % country)


--- processing for: ISL ---

start: Fri Jun  7 15:40:09 2019

Outputs can be found at: C:\Users\charl\Documents\T

19456
38540
18095
35497
Edge reduction: 19463 to 35497 (-82 percent)

end: Fri Jun  7 15:41:18 2019

--- processing complete for: ISL ---
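
The "Edge reduction" line in the log above reports a negative percentage, which means the edge count went up rather than down. A plausible reason, going by the comments in CleanNetwork, is that reverse edges were added for one-way segments with this small junction distance, though that is an interpretation rather than something the log states. The arithmetic behind the printed figure can be reproduced as follows; the helper name `edge_change_percent` is hypothetical.

```python
def edge_change_percent(before, after):
    """Percentage reduction in edge count; negative means edges were added."""
    return (before - after) / before * 100


# Edge counts taken from the log output above
before, after = 19463, 35497
pct = edge_change_percent(before, after)

# %d truncates the float toward zero, as in CleanNetwork's print statement
print('Edge reduction: %s to %s (%d percent)' % (before, after, pct))
```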


At this point, our road network is fully prepped. 
Move on to Step 3 to see how we can use this network for some travel time analysis!