# Network Processing
---

#### Run this notebook block by block to convert one or more road networks in ESRI Shapefile or GeoJSON format into a routable, conflated network graph for use in BikewaySim.

Note that three networks were used in this project. Code to obtain OSM GeoJSONs is included in the osm_processing notebook, but the ABM and HERE networks need to be sourced from the Atlanta Regional Commission and HERE, respectively. You can run most of this code with only OSM data, but you'll need to skip and/or modify some sections.

## Import/install the following packages:


In [None]:
import os
from pathlib import Path
import time
import pandas as pd #needed later for pd.merge
import geopandas as gpd

## Import Network Filter Module:

In [None]:
from network_filter import *

## Set Directory:
### Modify this directory to where you stored your network shapefiles.

In [None]:
#make directory/pathing more intuitive later
file_dir = r"C:\Users\tpassmore6\Documents\BikewaySimData" #directory of bikewaysim network processing code

#change this to where you stored this folder
os.chdir(file_dir)

## Choose study area:
#### Specify the area used to mask the spatial data. Only network links that are at least partially within the study area will be imported. Note: Network links are NOT clipped or split apart.
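
A minimal sketch of this kind of masking, assuming shapely is installed. The toy geometries and the name `links_in_area` are hypothetical, and `import_study_area` and `filter_networks` may implement the step differently. A link is kept whole if any part of it touches the study area polygon and is dropped otherwise.

```python
from shapely.geometry import LineString, Polygon

# hypothetical study area: a 10 x 10 square
study_area = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])

# hypothetical links keyed by name
links = {
    "inside": LineString([(1, 1), (5, 5)]),
    "crossing": LineString([(8, 8), (15, 15)]),
    "outside": LineString([(20, 20), (30, 30)]),
}

# keep every link that intersects the study area; geometries are left untouched
links_in_area = {
    name: geom for name, geom in links.items() if geom.intersects(study_area)
}

print(sorted(links_in_area))
```

With geopandas, a similar effect is available at read time through the `mask` argument of `gpd.read_file`, which keeps features that intersect the given geometry.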

In [None]:
studyareafp = r'base_shapefiles/bikewaysim_study_area/bikewaysim_study_area.shp'
studyarea_name = 'bikewaysim'
#city_of_atlantafp = r'base_shapefiles/coa/Atlanta_City_Limits.shp'
#atlanta_regional_commissionfp = r'base_shapefiles/arc/arc_bounds.shp'

desired_crs = "EPSG:2240"

#add new study areas if desired
studyarea = import_study_area(studyareafp, studyarea_name, desired_crs)

## Network Mapper
This is how network node IDs will be identified and coded:
- the first number in the node ID represents its origin network
- all numbers after that are the original network ID 
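
As a rough illustration of the coding scheme, the sketch below builds a node ID by string concatenation. The helper `make_node_id` is hypothetical and is not part of network_filter.py. It also assumes, based on IDs such as 2017289087 in the outputs further down, that the second digit flags original versus generated nodes using the `"original"` and `"generated"` codes from the mapper.

```python
# copy of the mapper used in this notebook
network_mapper = {
    "abm": "1",
    "here": "2",
    "osm": "3",
    "original": "0",
    "generated": "1",
}

def make_node_id(network_name, original_id, origin="original"):
    """Hypothetical helper: prefix an ID with its network and origin codes."""
    return network_mapper[network_name] + network_mapper[origin] + str(original_id)

print(make_node_id("here", 17289087))
```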

In [None]:
network_mapper = {
    "abm": "1",
    "here": "2",
    "osm": "3",
    "original": "0",
    "generated": "1"
}

## Network Data Filepaths:

In [None]:
abmfp = r'base_shapefiles/arc/ABM2020-TIP20-2020-150kShapefiles-outputs.gdb'
herefp = r'base_shapefiles/here/Streets.shp'
osmfp = r'base_shapefiles/osm/osm_links.geojson'

## Set Network Import Settings
#### Create a dictionary for use in filter networks function. 
#### For new networks follow this format:
```python
network = {
       "studyarea": studyarea, #geodataframe of the study area
       "studyarea_name": studyarea_name, #name for the study area
       "networkfp": networkfp, #filepath for the network, specified earlier
       "network_name": 'abm', #name for the network being evaluated
       "network_mapper": network_mapper, #leave this, edit in the block before
       "A": "A", #column with the starting node id; replace with None if there isn't a column
       "B": "B", #column with the ending node id; replace with None if there isn't a column
       "layer": 0 #if network has layers, then specify which layer to look at; if no layers then leave as 0 
       }
```
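
Because the settings dictionary is unpacked into `filter_networks` with `**`, a misspelled or missing key is easy to overlook. The following is a small sketch of a sanity check. The helper `check_network_settings` is hypothetical, and the key list is taken from the template above.

```python
# keys listed in the settings template above
REQUIRED_KEYS = {
    "studyarea", "studyarea_name", "networkfp", "network_name",
    "network_mapper", "A", "B", "layer",
}

def check_network_settings(settings):
    """Hypothetical helper: return (missing, unexpected) key sets."""
    keys = set(settings)
    return REQUIRED_KEYS - keys, keys - REQUIRED_KEYS

# example with a misspelled key ("networkfp" written as "network_fp")
example = {
    "studyarea": None,
    "studyarea_name": "bikewaysim",
    "network_fp": "base_shapefiles/osm/osm_links.geojson",
    "network_name": "osm",
    "network_mapper": {},
    "A": "A",
    "B": "B",
    "layer": 0,
}

missing, unexpected = check_network_settings(example)
print("missing:", missing, "unexpected:", unexpected)
```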

In [None]:
#abm inputs
abm = {
       "studyarea": studyarea,
       "studyarea_name": studyarea_name,
       "networkfp": abmfp,
       "network_name": 'abm',
       "network_mapper": network_mapper,
       "A": "A",
       "B": "B",
       "layer": "DAILY_LINK"
       }

#here inputs
here = {
       "studyarea": studyarea,
       "studyarea_name": studyarea_name,
       "networkfp": herefp,
       "network_name": 'here',
       "network_mapper": network_mapper,
       "A": "REF_IN_ID",
       "B": "NREF_IN_ID",
       "layer": 0
       }

#osm inputs
osm = {
       "studyarea": studyarea,
       "studyarea_name": studyarea_name,
       "networkfp": osmfp,
       "network_name": 'osm',
       "network_mapper": network_mapper,
       "A": 'A',
       "B": 'B',
       "layer": 0,
       }

## Run Network Filter Module to Create Initial Subnetworks
From the network_filter.py file, run the filter_networks function. This first imports the spatial data and then filters the links into base, road, bike, or service layers. 
**Note: If this is a new network that is not OSM, HERE, or ABM, specify a new filter method in the network_filter.py file.** Otherwise, none of the links will be filtered into road/bike/service links. **Also note: all spatial files are currently projected to EPSG:2240.** This needs to be modified later.
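
To give a sense of what a filter method for a new network might look like, here is a sketch using pandas. It is not the implementation in network_filter.py. The function `classify_link`, the column names, and the choice of tag sets are all assumptions, although the `highway` values shown are standard OSM tags. Links that match no category stay in the base layer only.

```python
import pandas as pd

# hypothetical OSM-style links with a 'highway' tag
links = pd.DataFrame({
    "A_B": ["1_2", "2_3", "3_4", "4_5"],
    "highway": ["residential", "cycleway", "service", "motorway"],
})

# assumed groupings; adjust to the attributes of the new network
ROAD_TAGS = {"primary", "secondary", "tertiary", "residential"}
BIKE_TAGS = {"cycleway", "path"}
SERVICE_TAGS = {"service"}

def classify_link(highway):
    """Hypothetical classifier: map a highway tag to a link type."""
    if highway in BIKE_TAGS:
        return "bike"
    if highway in SERVICE_TAGS:
        return "service"
    if highway in ROAD_TAGS:
        return "road"
    return "base_only"

links["link_type"] = links["highway"].map(classify_link)
print(links)
```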


## Filter ABM

In [None]:
filter_networks(**abm)

## Filter HERE

In [None]:
filter_networks(**here)

## Filter OSM

In [None]:
filter_networks(**osm)

## Summarize filtered networks
#### Prints out:
- Number of nodes
- Number of links
- Total length of all links
- Average link length
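
The statistics listed above can be sketched with pandas as follows. This is not the code in network_summary_stats.py. The function `summarize_network` and the `length_ft` column are hypothetical, and the toy table stands in for a links layer.

```python
import pandas as pd

# hypothetical links table with start node, end node, and length in feet
links = pd.DataFrame({
    "A": ["1", "2", "3"],
    "B": ["2", "3", "4"],
    "length_ft": [100.0, 250.0, 400.0],
})

def summarize_network(links):
    """Hypothetical helper: node count, link count, total and mean length."""
    nodes = set(links["A"]) | set(links["B"])
    return {
        "num_nodes": len(nodes),
        "num_links": len(links),
        "total_length": links["length_ft"].sum(),
        "avg_length": links["length_ft"].mean(),
    }

summary = summarize_network(links)
print(summary)
```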

In [None]:
from network_summary_stats import * 

#network names to look for, will search your directory for network name
networks = ["abm","here","osm"]
studyarea_name = "bikewaysim"

#summarize networks and export summary as "network_summary.csv" in the working directory
sum_all_networks(networks, studyarea_name)

## Conflation Process
In this step, the networks are conflated to each other using functions from the conflation_tools.py module.

Before conflating, remove all columns that aren't related to node IDs or geometry. To preserve link information, also create an A_B column.

There are three main functions in the conflation module:
- The first matches each node to its nearest node in the other network.
- The second finds the nearest location on a link and creates a new node there.
- The third removes links/nodes that have already been considered.

Steps to complete before running the conflation functions:
- Select only the columns that correspond to A, B, and geometry.
- Create an A_B column.
- Decide which network is the base network and the order in which the other networks are conflated to it. For this project, ABM was the base, with HERE joined first and OSM second.
- Filter the joining network (do this step outside of the function). This ensures that the only joining nodes the base nodes can match to represent real intersections:

<code>joining_nodes = joining_nodes[joining_nodes[f'{joining_name}_num_links'] != 2 ].reset_index(drop=True)</code>
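
The reasoning behind the `!= 2` filter is that a node touched by exactly two links sits mid-block, while any other count indicates a dead end or an intersection. The sketch below shows the idea on toy data with pandas. The column name `num_links` and the node labels are hypothetical stand-ins for the `{network}_num_links` columns used in this notebook.

```python
import pandas as pd

# hypothetical links: n2 is a mid-block node, n3 is a three-way intersection
links = pd.DataFrame({
    "A": ["n1", "n2", "n3", "n3"],
    "B": ["n2", "n3", "n4", "n5"],
})

# count how many link ends touch each node
counts = pd.concat([links["A"], links["B"]]).value_counts()
nodes = pd.DataFrame({"node_id": counts.index, "num_links": counts.values})

# drop nodes with exactly two links, keeping dead ends and intersections
intersections = nodes[nodes["num_links"] != 2].reset_index(drop=True)
print(sorted(intersections["node_id"]))
```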


In [None]:
from conflation_tools import *

#import importlib
#importlib.reload(conflation_tools)

### Links and Nodes to Conflate

In [None]:
base_name = "abm"
join_name = "here"

#road layers
base_links = gpd.read_file(r"processed_shapefiles/abm/abm_bikewaysim_road_links.geojson")
base_nodes = gpd.read_file(r"processed_shapefiles/abm/abm_bikewaysim_road_nodes.geojson")
join_links = gpd.read_file(r"processed_shapefiles/here/here_bikewaysim_road_links.geojson")
join_nodes = gpd.read_file(r"processed_shapefiles/here/here_bikewaysim_road_nodes.geojson")

#bike layers
bike_links = gpd.read_file(r'processed_shapefiles/here/here_bikewaysim_bike_links.geojson')
bike_nodes = gpd.read_file(r'processed_shapefiles/here/here_bikewaysim_bike_nodes.geojson')
bike_name = 'here'


### Cleaning to get rid of excess columns

In [None]:
base_links, base_nodes = cleaning_process(base_links,base_nodes,base_name)
join_links, join_nodes = cleaning_process(join_links,join_nodes,join_name)

#clean excess columns
bike_links, bike_nodes = cleaning_process(bike_links,bike_nodes,bike_name)

### Node Matching
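
As an illustration of tolerance-based node matching, the sketch below pairs each base node with its nearest joining node when the two are within the tolerance, using each joining node at most once. It is a brute-force toy version in plain Python and not the `match_nodes` implementation. The function `match_nearest` and the node IDs are hypothetical.

```python
import math

def match_nearest(base_nodes, join_nodes, tolerance_ft):
    """Hypothetical helper: return {base_id: join_id} for pairs within tolerance."""
    candidates = []
    for b_id, (bx, by) in base_nodes.items():
        for j_id, (jx, jy) in join_nodes.items():
            dist = math.hypot(bx - jx, by - jy)
            if dist <= tolerance_ft:
                candidates.append((dist, b_id, j_id))

    matched, used = {}, set()
    # accept the closest pairs first so each node is matched at most once
    for dist, b_id, j_id in sorted(candidates):
        if b_id not in matched and j_id not in used:
            matched[b_id] = j_id
            used.add(j_id)
    return matched

base_nodes = {"abm_1": (0.0, 0.0), "abm_2": (100.0, 0.0)}
join_nodes = {"here_1": (3.0, 4.0), "here_2": (100.0, 60.0)}

matched = match_nearest(base_nodes, join_nodes, tolerance_ft=25)
print(matched)
```

Here only `abm_1` and `here_1` are 5 ft apart and get matched; the other pair is 60 ft apart and stays unmatched. For full networks, a spatial index is usually preferable to comparing every pair.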

In [None]:
#first match the nodes, can repeat this by adding in previously matched_nodes
tolerance_ft = 25
matched_nodes, unmatched_base_nodes, unmatched_join_nodes = match_nodes(base_nodes, base_name, join_nodes, join_name, tolerance_ft, prev_matched_nodes=None)

#join the matched nodes to the base nodes once done with matching
matched_nodes_final = pd.merge(base_nodes, matched_nodes, on = f'{base_name}_ID', how = "left")

### Link Splitting and Add New Links and Nodes
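
The splitting step can be pictured with shapely's linear referencing, assuming shapely is installed. This is a sketch of the geometry only and not the `split_lines_create_points` implementation. The toy link, node, and variable names are hypothetical.

```python
from shapely.geometry import LineString, Point
from shapely.ops import substring

# hypothetical base link and an unmatched joining node 10 ft away from it
base_link = LineString([(0, 0), (100, 0)])
join_node = Point(40, 10)
tolerance_ft = 25

if base_link.distance(join_node) <= tolerance_ft:
    # distance along the link of the point closest to the joining node
    dist_along = base_link.project(join_node)
    # new node located on the base link
    new_node = base_link.interpolate(dist_along)
    # split the base link into two pieces at the new node
    first_half = substring(base_link, 0, dist_along)
    second_half = substring(base_link, dist_along, base_link.length)
    print(new_node, first_half.length, second_half.length)
```

If the nearest location falls exactly on an end of the link, one of the pieces collapses to a point, so a real implementation would need to handle that case.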

In [None]:
#create new nodes and lines by splitting the base links; can be repeated after the add_new_links_nodes function
tolerance_ft = 25
split_lines, split_nodes, unmatched_join_nodes = split_lines_create_points(unmatched_join_nodes, join_name, base_links, base_name, tolerance_ft, export = False)
split_lines.head()

In [None]:
#add new links and nodes to the base links and nodes created from split_lines_create_points function
new_links, new_nodes = add_new_links_nodes(base_links, matched_nodes_final, split_lines, split_nodes, base_name)
new_links.head()

### Attribute Add
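
One way to read "greatest overlap" is the joining link with the longest portion inside a buffer around the base link. The sketch below follows that interpretation, which is an assumption and may differ from what `add_attributes` does. It requires shapely, and the link names are hypothetical.

```python
from shapely.geometry import LineString

# hypothetical base link and candidate joining links
base_link = LineString([(0, 0), (100, 0)])
join_links = {
    "here_a": LineString([(0, 5), (30, 5)]),
    "here_b": LineString([(20, -5), (100, -5)]),
    "here_c": LineString([(0, 200), (100, 200)]),
}
tolerance_ft = 25

# buffer the base link, then measure how much of each joining link falls inside
search_area = base_link.buffer(tolerance_ft)
overlap = {
    name: geom.intersection(search_area).length
    for name, geom in join_links.items()
}

# the joining link with the longest overlap supplies the attributes
best_match = max(overlap, key=overlap.get)
print(best_match)
```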

In [None]:
#match attribute information with greatest overlap from joining links
new_base_links_w_attr = add_attributes(new_links, base_name, join_links, join_name)
new_base_links_w_attr.head()

### Add rest of features

In [None]:
#add unrepresented features from the joining network by looking at the attributes added in the previous step for links and the list of matched nodes
added_base_links, added_base_nodes = add_rest_of_features(new_base_links_w_attr,new_nodes,base_name,join_links,join_nodes,join_name)

### Merge with other networks

In [None]:
tolerance_ft = 25
merged_links, merged_nodes = merge_diff_networks(added_base_links, added_base_nodes, 'road', bike_links, bike_nodes, 'bike', tolerance_ft)

### Repeat for OSM

In [None]:
base_name = "abm"
join_name = "osm"

#road layers
#the base is now the merged ABM + HERE network from the previous step
base_links = merged_links
base_nodes = merged_nodes
#join layers are the OSM road layers (assumes the same file naming pattern as ABM and HERE)
join_links = gpd.read_file(r"processed_shapefiles/osm/osm_bikewaysim_road_links.geojson")
join_nodes = gpd.read_file(r"processed_shapefiles/osm/osm_bikewaysim_road_nodes.geojson")

#bike layers
bike_links = gpd.read_file(r'processed_shapefiles/here/here_bikewaysim_bike_links.geojson')
bike_nodes = gpd.read_file(r'processed_shapefiles/here/here_bikewaysim_bike_nodes.geojson')
bike_name = 'here'

### Add reference

In [None]:
# match reference IDs based on all the id in the nodes
refid_base_links = add_reference_ids(merged_links, merged_nodes)

In [None]:
refid_base_links.head()

### Export

In [None]:
refid_base_links.to_file(r'processed_shapefiles\conflation\final_links.geojson', driver = 'GeoJSON')
merged_nodes.to_file(r'processed_shapefiles\conflation\final_nodes.geojson', driver = 'GeoJSON')

## Convert for use in BikewaySim

This last section focuses on making sure that the conflated network is readable by BikewaySim. After this is completed, you can run the Running BikewaySim notebook.

In [23]:
import os
from pathlib import Path
import time
import pandas as pd
import geopandas as gpd
import pickle

#make directory/pathing more intuitive later
file_dir = r"C:\Users\tpassmore6\Documents\BikewaySimData" #directory of bikewaysim network processing code

#change this to where you stored this folder
os.chdir(file_dir)

### Specify filepaths

In [24]:
#filepath for conflated network
conflated_linksfp = r'processed_shapefiles\conflation\final_links.geojson'
conflated_nodesfp = r'processed_shapefiles\conflation\final_nodes.geojson'

#filepaths for network attribute data (doesn't have to be a shapefile)
abm_linksfp = r'processed_shapefiles\abm\abm_bikewaysim_base_links.geojson'
here_linksfp = r'processed_shapefiles\here\here_bikewaysim_base_links.geojson'
osm_linksfp = r'base_shapefiles\osm\osm_links_attr.p'

#### Node cleaning and export

In [37]:
#import conflated nodes
conflated_nodes = gpd.read_file(conflated_nodesfp)

#drop the num links columns
conflated_nodes = conflated_nodes.drop(columns=['abm_num_links','here_num_links'])

#create an N column that takes the abm_ID if available, followed by the here_ID
#pd.isna catches both None and NaN, unlike an "== None" comparison
func = lambda row: row['here_ID'] if pd.isna(row['abm_ID']) else row['abm_ID']
conflated_nodes['N'] = conflated_nodes.apply(func,axis=1)

#create UTM coords columns
conflated_nodes['X'] = conflated_nodes.geometry.x
conflated_nodes['Y'] = conflated_nodes.geometry.y

#reproject and find latlon
conflated_nodes = conflated_nodes.to_crs(epsg=4326)
conflated_nodes['lon'] = conflated_nodes.geometry.x
conflated_nodes['lat'] = conflated_nodes.geometry.y

#filter
conflated_nodes = conflated_nodes[['N','X','Y','lon','lat','geometry']]

#export
conflated_nodes.to_file(r'processed_shapefiles\prepared_network\nodes\nodes.geojson',driver='GeoJSON')
conflated_nodes = conflated_nodes.drop(columns=['geometry'])
conflated_nodes.to_csv(r'processed_shapefiles\prepared_network\nodes\nodes.csv')

Unnamed: 0,N,X,Y,lon,lat,geometry
0,1010848,2233266.0,1384663.0,-84.375115,33.806352,POINT (-84.37512 33.80635)
1,1010851,2230980.0,1375203.0,-84.382575,33.780342,POINT (-84.38258 33.78034)
2,1010854,2231248.0,1375719.0,-84.381695,33.781762,POINT (-84.38170 33.78176)
3,1010856,2231166.0,1375716.0,-84.381965,33.781752,POINT (-84.38197 33.78175)
4,10125456,2226988.0,1374225.0,-84.395705,33.777632,POINT (-84.39570 33.77763)


### Link cleaning and export

In [26]:
#import conflated network
conflated_links = gpd.read_file(conflated_linksfp)

#### Merging function

In [27]:
def merge_network_and_attributes(conflated_links,attr_network,cols_to_keep):
    #find the shared columns between conflated network and attribute network
    shared_cols = list(conflated_links.columns[conflated_links.columns.isin(attr_network.columns)])

    if len(shared_cols) > 2:
        #merge based on shared columns
        conflated_links = pd.merge(conflated_links,attr_network[cols_to_keep + shared_cols],on=shared_cols,how='left')
        print(conflated_links.head(20))
    else:
        print('Attr_network columns not in conflated network')
    return conflated_links

In [28]:
#import data with attributes, don't bring in geometry
abm_links = gpd.read_file(abm_linksfp,ignore_geometry=True)

#specify which columns you need
cols_to_keep = ['NAME','SPEEDLIMIT','two_way']

#perform the merge
conflated_links = merge_network_and_attributes(conflated_links,abm_links,cols_to_keep)

#delete data with attributes to free up memory
del(abm_links)

             abm_A_B                  here_A_B    abm_A      here_A     abm_B  \
0   1010848_10340018   201063859187_2017289087  1010848  2017289087  10340018   
1    1010854_1010856     2017289109_2017289100  1010854  2017289100   1010856   
2    1013524_1053435     2044584013_2030636889  1013524  2030636889   1053435   
3   1013524_10299911     2030636889_2044584017  1013524  2030636889  10299911   
4    1013525_1053438   201181585260_2044584017  1013525  2030636890   1053438   
5   1013594_10299904    2030638752_20979715708  1013594  2030638752  10299904   
6   1013595_10300685    20979791869_2030638844  1013595  2030638844  10300685   
7   1013595_10300687    2030638844_20979791871  1013595  2030638844  10300687   
8   1048680_10300690    20979792205_2044571513  1048680  2044571513  10300690   
9   1048681_10300690    2044571517_20979792205  1048681  2044571517  10300690   
10   1048684_1048699     2044571548_2044571524  1048684  2044571524   1048699   
11  1048684_10202745    2044

In [29]:
here_links = gpd.read_file(here_linksfp,ignore_geometry=True)

cols_to_keep = ['ST_NAME','DIR_TRAVEL']

conflated_links = merge_network_and_attributes(conflated_links,here_links,cols_to_keep)
del(here_links)

             abm_A_B                  here_A_B    abm_A      here_A     abm_B  \
0   1010848_10340018   201063859187_2017289087  1010848  2017289087  10340018   
1    1010854_1010856     2017289109_2017289100  1010854  2017289100   1010856   
2    1013524_1053435     2044584013_2030636889  1013524  2030636889   1053435   
3   1013524_10299911     2030636889_2044584017  1013524  2030636889  10299911   
4    1013525_1053438   201181585260_2044584017  1013525  2030636890   1053438   
5   1013594_10299904    2030638752_20979715708  1013594  2030638752  10299904   
6   1013595_10300685    20979791869_2030638844  1013595  2030638844  10300685   
7   1013595_10300687    2030638844_20979791871  1013595  2030638844  10300687   
8   1048680_10300690    20979792205_2044571513  1048680  2044571513  10300690   
9   1048681_10300690    2044571517_20979792205  1048681  2044571517  10300690   
10   1048684_1048699     2044571548_2044571524  1048684  2044571524   1048699   
11  1048684_10202745    2044

In [None]:
#open the pickle in a with block so the file is closed after loading
with open(osm_linksfp,"rb") as fh:
    osm_links = pickle.load(fh)

cols_to_keep = ['name']

conflated_links = merge_network_and_attributes(conflated_links,osm_links,cols_to_keep)
del(osm_links)

In [30]:
conflated_links.head()

Unnamed: 0,abm_A_B,here_A_B,abm_A,here_A,abm_B,here_B,geometry,NAME,SPEEDLIMIT,two_way,ST_NAME,DIR_TRAVEL
0,1010848_10340018,201063859187_2017289087,1010848,2017289087,10340018,201063859187.0,"LINESTRING (2233265.865 1384662.528, 2233329.3...",MONROE DR NE,30.0,True,,
1,1010854_1010856,2017289109_2017289100,1010854,2017289100,1010856,2017289109.0,"LINESTRING (2231248.476 1375719.070, 2231166.4...",10TH ST NE,35.0,True,,
2,1013524_1053435,2044584013_2030636889,1013524,2030636889,1053435,2044584013.0,"LINESTRING (2239431.313 1375903.192, 2239373.5...",VIRGINIA AVE NE,30.0,True,,
3,1013524_10299911,2030636889_2044584017,1013524,2030636889,10299911,,"LINESTRING (2239431.313 1375903.192, 2239452.5...",VIRGINIA AVE NE,30.0,True,,
4,1013525_1053438,201181585260_2044584017,1013525,2030636890,1053438,2044584017.0,"LINESTRING (2239619.442 1375764.578, 2239568.0...",N HIGHLAND AVE NE,30.0,True,,


### Data Merging
In this case we're just using street names and speed limits, but this section is dedicated to dealing with duplicate and/or missing data.
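
A small sketch of the coalescing pattern on toy data, using pandas `fillna`. The column names mirror the ones in this notebook, but the rows are made up for illustration. Missing values read from file usually arrive as NaN, and a comparison such as `value == None` is False for NaN, so `fillna` (or `pd.isna`) is the safer test.

```python
import numpy as np
import pandas as pd

# hypothetical merged attributes: ABM name/speed limit and HERE street name
links = pd.DataFrame({
    "NAME": ["MONROE DR NE", np.nan, np.nan],
    "ST_NAME": [np.nan, "10TH ST NE", np.nan],
    "SPEEDLIMIT": [35.0, np.nan, np.nan],
})

# prefer the ABM name, fall back to the HERE name, then to a placeholder
links["name"] = links["NAME"].fillna(links["ST_NAME"]).fillna("placeholder")

# use the ABM speed limit, assuming 30 mph where none is present
links["speedlimit"] = links["SPEEDLIMIT"].fillna(30)

print(links[["name", "speedlimit"]])
```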

In [31]:
#streetnames
#if abm name is present use that, else use the HERE name
#if no streetname exists then put in "placeholder" as the streetname
#fillna treats both None and NaN as missing, unlike an "== None" comparison
conflated_links['name'] = conflated_links['NAME'].fillna(conflated_links['ST_NAME']).fillna('placeholder')

#speed limits
#use the ABM speed limit, if none present assume 30mph
conflated_links['speedlimit'] = conflated_links['SPEEDLIMIT'].fillna(30)

#drop old columns
conflated_links = conflated_links.drop(columns=['NAME','SPEEDLIMIT','ST_NAME'])

### Create A and B column
If ABM ID in A column then go with that, else go with HERE ID.

In [32]:
#fillna treats both None and NaN as missing, unlike an "== None" comparison
conflated_links['A'] = conflated_links['abm_A'].fillna(conflated_links['here_A'])
conflated_links['B'] = conflated_links['abm_B'].fillna(conflated_links['here_B'])
conflated_links.head()

Unnamed: 0,abm_A_B,here_A_B,abm_A,here_A,abm_B,here_B,geometry,two_way,DIR_TRAVEL,name,speedlimit,A,B
0,1010848_10340018,201063859187_2017289087,1010848,2017289087,10340018,201063859187.0,"LINESTRING (2233265.865 1384662.528, 2233329.3...",True,,MONROE DR NE,30.0,1010848,10340018
1,1010854_1010856,2017289109_2017289100,1010854,2017289100,1010856,2017289109.0,"LINESTRING (2231248.476 1375719.070, 2231166.4...",True,,10TH ST NE,35.0,1010854,1010856
2,1013524_1053435,2044584013_2030636889,1013524,2030636889,1053435,2044584013.0,"LINESTRING (2239431.313 1375903.192, 2239373.5...",True,,VIRGINIA AVE NE,30.0,1013524,1053435
3,1013524_10299911,2030636889_2044584017,1013524,2030636889,10299911,,"LINESTRING (2239431.313 1375903.192, 2239452.5...",True,,VIRGINIA AVE NE,30.0,1013524,10299911
4,1013525_1053438,201181585260_2044584017,1013525,2030636890,1053438,2044584017.0,"LINESTRING (2239619.442 1375764.578, 2239568.0...",True,,N HIGHLAND AVE NE,30.0,1013525,1053438


### Create reverse links for two way streets and calculate distance
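
The idea of this step, sketched on toy data with pandas: copy the two-way links, swap their A and B columns, and stack the copies under the originals. The table below is hypothetical. `pd.concat` is used because `DataFrame.append` was removed in pandas 2.0.

```python
import pandas as pd

# hypothetical links: the first is two-way, the second is one-way
links = pd.DataFrame({
    "A": ["1", "2"],
    "B": ["2", "3"],
    "two_way": [True, False],
})

# reverse copies of the two-way links, with A and B swapped
reverse = links[links["two_way"]].rename(columns={"A": "B", "B": "A"})

# stack originals and reversed copies; concat aligns columns by name
all_links = pd.concat([links, reverse], ignore_index=True)
all_links["A_B"] = all_links["A"] + "_" + all_links["B"]

print(list(all_links["A_B"]))
```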

In [33]:
conflated_links_rev = conflated_links.copy().rename(columns={'A':'B','B':'A'})

#filter to those that are two way
conflated_links_rev = conflated_links_rev[(conflated_links_rev['two_way'] != False) &
                                            (conflated_links_rev['DIR_TRAVEL'] != 'F') &
                                            (conflated_links_rev['DIR_TRAVEL'] != 'T')                            
                                            ]

#DataFrame.append was removed in pandas 2.0, so use pd.concat
conflated_links = pd.concat([conflated_links, conflated_links_rev]).reset_index()

#create A_B column
conflated_links['A_B'] = conflated_links['A'] + '_' + conflated_links['B']

#drop unneeded cols
conflated_links = conflated_links.drop(columns=['two_way','DIR_TRAVEL'])

#calculate distance
conflated_links['distance'] = conflated_links.length

conflated_links.head()

Unnamed: 0,index,abm_A_B,here_A_B,abm_A,here_A,abm_B,here_B,geometry,name,speedlimit,A,B,A_B,distance
0,0,1010848_10340018,201063859187_2017289087,1010848,2017289087,10340018,201063859187.0,"LINESTRING (2233265.865 1384662.528, 2233329.3...",MONROE DR NE,30.0,1010848,10340018,1010848_10340018,148.97967
1,1,1010854_1010856,2017289109_2017289100,1010854,2017289100,1010856,2017289109.0,"LINESTRING (2231248.476 1375719.070, 2231166.4...",10TH ST NE,35.0,1010854,1010856,1010854_1010856,82.118032
2,2,1013524_1053435,2044584013_2030636889,1013524,2030636889,1053435,2044584013.0,"LINESTRING (2239431.313 1375903.192, 2239373.5...",VIRGINIA AVE NE,30.0,1013524,1053435,1013524_1053435,57.844133
3,3,1013524_10299911,2030636889_2044584017,1013524,2030636889,10299911,,"LINESTRING (2239431.313 1375903.192, 2239452.5...",VIRGINIA AVE NE,30.0,1013524,10299911,1013524_10299911,21.577605
4,4,1013525_1053438,201181585260_2044584017,1013525,2030636890,1053438,2044584017.0,"LINESTRING (2239619.442 1375764.578, 2239568.0...",N HIGHLAND AVE NE,30.0,1013525,1053438,1013525_1053438,168.229104


### Export

In [34]:
conflated_links.to_file(r'processed_shapefiles\prepared_network\links\links.geojson',driver='GeoJSON')