# Step 1: Creating graph from Osmium

## This series of notebooks relies on the 'load-traffic' branch of the GOSTnets repo, which has not yet been merged into master. The code is here: https://github.com/worldbank/GOSTnets/tree/load_traffic

### In addition to GOSTnets and its common dependencies, this notebook also requires Osmium and OSMNX to be installed

In [1]:
import os, sys, time, importlib

import geopandas as gpd
import pandas as pd
import networkx as nx
import numpy as np

# make sure osmium is installed (pip install osmium)
# An internal function called when creating the OSM_to_Network object will import osmium

from shapely.geometry import LineString, Point
import osmnx as ox

### Append the path to your local GOSTnets clone to sys.path, and make sure the 'load-traffic' branch is checked out

In [4]:
sys.path.append(r"C:\Users\user1\Documents\GitHub\GOSTnets")
import GOSTnets as gn

In [5]:
# This is a Jupyter Notebook extension which reloads all of the modules whenever you run the code
# This is optional but good if you are modifying and testing source code
%load_ext autoreload
%autoreload 2

In [6]:
from GOSTnets.load_traffic2 import *

### Load all three MapBox traffic files into a single merged DataFrame

In [7]:
traffic_simplified_df = gn.load_traffic2.generate_traffic_metrics("./1233300-Asia-Colombo.csv", "./1233300-Asia-Colombo.csv", "./1233303-Asia-Colombo.csv")

2
./1233300-Asia-Colombo.csv
finished reading ./1233300-Asia-Colombo.csv into dataframe
FILE ./1233303-Asia-Colombo.csv
1233303-Asia-Colombo_df
finished merging 1233303-Asia-Colombo_df into combined dataframe
calculating min, max, and mean values.
finished calculating min, max, and mean values. Printing traffic_simplified head
    FROM_NODE     TO_NODE  min_speed  max_speed  mean_speed
0  1148494884  4177608798       31.0       38.0   36.866071
1  1148495298  4137314867       57.0       57.0   57.000000
2  1242700523  6537570627       60.0       60.0   60.000000
3  1242730766  3377418986       46.0       46.0   46.000000
4  1243299175  3805435746       40.0       40.0   40.000000
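
The log above suggests that each file is read, merged into one table, and then reduced to minimum, maximum, and mean speeds per node pair. A minimal pandas sketch of that reduction, assuming each file holds one row per `FROM_NODE`/`TO_NODE` pair followed by speed readings. The function name `summarize_speeds` and the `t0`/`t1` columns are hypothetical and are not part of GOSTnets:

```python
import pandas as pd


def summarize_speeds(frames):
    """Stack per-file speed tables and compute min/max/mean speed per edge.

    Assumes (hypothetically) that every column other than FROM_NODE and
    TO_NODE holds a speed reading.
    """
    combined = pd.concat(frames, ignore_index=True)
    # Reshape to one row per (edge, reading) so repeated edges are pooled.
    long = combined.melt(id_vars=["FROM_NODE", "TO_NODE"], value_name="speed")
    return (
        long.groupby(["FROM_NODE", "TO_NODE"])["speed"]
        .agg(min_speed="min", max_speed="max", mean_speed="mean")
        .reset_index()
    )


# Toy stand-ins for two traffic files (values are illustrative only).
file_a = pd.DataFrame(
    {"FROM_NODE": [1, 2], "TO_NODE": [10, 20], "t0": [30.0, 50.0], "t1": [40.0, 50.0]}
)
file_b = pd.DataFrame({"FROM_NODE": [1], "TO_NODE": [10], "t0": [20.0], "t1": [30.0]})

summary = summarize_speeds([file_a, file_b])
print(summary)
```

Pooling readings per node pair before aggregating means an edge that appears in more than one file still ends up as a single row.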


In [8]:
traffic_simplified_df

        FROM_NODE     TO_NODE  min_speed  max_speed  mean_speed
0      1148494884  4177608798       31.0       38.0   36.866071
1      1148495298  4137314867       57.0       57.0   57.000000
2      1242700523  6537570627       60.0       60.0   60.000000
3      1242730766  3377418986       46.0       46.0   46.000000
4      1243299175  3805435746       40.0       40.0   40.000000
...           ...         ...        ...        ...         ...
34172  6495299720  6495344002       36.0       60.0   46.133929
34173  6495398243  6495398223       20.0       26.0   25.888393
34174  6899782766  3317976522       10.0       11.0   10.977679
34175  7082570748  7082570747       64.0       70.0   66.014881


### You can download an OpenStreetMap PBF file from GeoFabrik

In [9]:
# set file
input_OSM_file = './sri-lanka-latest.osm.pbf'

### The GOSTnets OSM_to_network object is created by using Osmium to extract roads from the OSM file and attaching traffic data to the edges where it exists. In the 'load-traffic' branch, the OSM_to_network code is modified to build a graph that mirrors how OSMNX creates NetworkX graphs.

In [10]:
sri_lanka = OSM_to_network(input_OSM_file,traffic_simplified_df)



hit exception




hit exception
way 812734222 may not have nodes
hit exception
way 813407157 may not have nodes
finished with Osmium data extraction
234726
1
Error adding edge between nodes 5770930353 and 5770924651
{'osmid': 609259140, 'nodes': [2907768383, 5770930354, 5770930356, 7591229329, 5770930353, 5770924651, 2907768384], 'shp': <shapely.geometry.linestring.LineString object at 0x00000227216A38C8>, 'highway': 'residential', 'maxspeed': '30'}
try to list coords
[(81.8253258, 7.4266025), (81.8253447, 7.4265667), (81.8255643, 7.4262145), (81.8257891, 7.4258475), (81.8258106, 7.4257995)]
length of nodes
7
print index
4
(81.8258106, 7.4257995)
checkpoint reached
Error adding edge between nodes 5770924651 and 2907768384
{'osmid': 609259140, 'nodes': [2907768383, 5770930354, 5770930356, 7591229329, 5770930353, 5770924651, 2907768384], 'shp': <shapely.geometry.linestring.LineString object at 0x00000227216A38C8>, 'highway': 'residential', 'maxspeed': '30'}
try to list coords
[(81.8253258, 7.4266025), (81

In [11]:
len(sri_lanka.network.edges())

8829416

### Using OSMNX, add a length attribute (great-circle distance between nodes, in meters) to each edge

In [12]:
G = ox.utils_graph.add_edge_lengths(sri_lanka.network)
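
For reference, a great-circle distance between two nodes can be computed with the haversine formula. The sketch below is a standalone illustration and not the OSMNX implementation. The function name `great_circle_m` and the Earth radius constant are assumptions, and the two points are the first two (longitude, latitude) pairs from the coordinate list printed in the log above:

```python
import math

EARTH_RADIUS_M = 6_371_009  # assumed mean Earth radius in meters


def great_circle_m(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_M):
    """Haversine distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * radius * math.asin(math.sqrt(a))


# The log prints coordinates as (lon, lat); the function expects lat first.
lon_a, lat_a = 81.8253258, 7.4266025
lon_b, lat_b = 81.8253447, 7.4265667
segment_length = great_circle_m(lat_a, lon_a, lat_b, lon_b)
print(f"{segment_length:.2f} m")
```

The two points are only a few meters apart, which is typical of consecutive nodes along an OSM way.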

In [13]:
len(G.nodes())

4354795

In [14]:
#fig, ax = ox.plot_graph(G, node_zorder=2, node_color='w', bgcolor='k')

### Take the largest connected subgraph

In [15]:
G = ox.utils_graph.get_largest_component(G)
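
The same idea can be sketched with plain NetworkX. This assumes weak connectivity (edge direction ignored), which may or may not match the arguments used in the OSMNX call above. The helper name `largest_weak_component` is hypothetical:

```python
import networkx as nx


def largest_weak_component(graph):
    """Return a copy of the largest weakly connected component of a directed graph."""
    nodes = max(nx.weakly_connected_components(graph), key=len)
    return graph.subgraph(nodes).copy()


# Toy directed multigraph: a 3-node chain plus a disconnected 2-node pair.
toy = nx.MultiDiGraph()
toy.add_edges_from([(1, 2), (2, 3), (10, 11)])

largest = largest_weak_component(toy)
print(sorted(largest.nodes))
```

Dropping the smaller components keeps later routing from failing on node pairs that have no path between them.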

In [16]:
len(G.edges())

8660210

In [17]:
# save graph for now

In [18]:
gn.save(G,'sri_lanka_processed_graph_uncleaned','./', pickle = True, edges = False, nodes = False)

### Remove all of the in-between nodes
We use the GOSTnets clean_network function, which has been heavily modified in the 'load-traffic' branch: most of the functions it previously called internally have been removed, leaving only the custom_simplify function. The custom_simplify function in turn mirrors the OSMNX simplification module (https://github.com/gboeing/osmnx/blob/5da49157161c5b1d2de69238536e95173d215da0/osmnx/simplification.py), but is modified to remove the in-between nodes only on edges that **do not** have traffic data.
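
To illustrate the idea, here is a much-simplified sketch on an undirected NetworkX graph. It is not the GOSTnets custom_simplify code. The function name `remove_in_between_nodes` and the use of a `mean_speed` edge attribute to mark edges with traffic are assumptions made for this example:

```python
import networkx as nx


def remove_in_between_nodes(graph, traffic_attr="mean_speed"):
    """Collapse degree-2 nodes unless one of their edges carries traffic data."""
    simplified = graph.copy()
    for node in list(simplified.nodes):
        if simplified.degree(node) != 2:
            continue  # endpoints and intersections are kept
        neighbors = list(simplified.neighbors(node))
        if len(neighbors) != 2:
            continue  # a self-loop also gives degree 2; leave it alone
        u, v = neighbors
        if simplified.has_edge(u, v):
            continue  # merging would overwrite an existing edge
        edge_u = simplified.edges[u, node]
        edge_v = simplified.edges[node, v]
        if traffic_attr in edge_u or traffic_attr in edge_v:
            continue  # keep nodes on edges that have traffic
        merged_length = edge_u.get("length", 0) + edge_v.get("length", 0)
        simplified.add_edge(u, v, length=merged_length)
        simplified.remove_node(node)
    return simplified


# Toy path a-b-c-d-e where only the c-d edge has traffic data.
toy = nx.Graph()
toy.add_edge("a", "b", length=1)
toy.add_edge("b", "c", length=2)
toy.add_edge("c", "d", length=3, mean_speed=40.0)
toy.add_edge("d", "e", length=4)

result = remove_in_between_nodes(toy)
print(sorted(result.edges(data="length")))
```

In the toy path, node `b` is merged away while `c` and `d` survive, so the node pair that carries the traffic attribute stays intact.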

#### This function is very resource-intensive. It took over 230 hours to run on an i7-7800 CPU with 64 GB of RAM.

In [21]:
start = time.time()
G_clean = gn.clean_network(G, UTM = 'epsg:32644', WGS = 'epsg:4326', junctdist = 10, verbose = False)
end = time.time()
print(end - start)

finished with simplify_junctions
print node_list
17320420
print no_traffic_node_list
17252920
reached 100 nodes
count is 5000
count is 10000
count is 15000
count is 20000
count is 25000
count is 30000
count is 35000
count is 40000
count is 45000
count is 50000
count is 55000
count is 60000
count is 65000
count is 70000
count is 75000
count is 80000
count is 85000
count is 90000
count is 95000
count is 100000
count is 105000
count is 110000
count is 115000
count is 120000
count is 125000
count is 130000
count is 135000
count is 140000
count is 145000
count is 150000
count is 155000
count is 160000
count is 165000
count is 170000
count is 175000
count is 180000
count is 185000
count is 190000
count is 195000
count is 200000
count is 205000
count is 210000
count is 215000
count is 220000
count is 225000
count is 230000
count is 235000
count is 240000
count is 245000
count is 250000
count is 255000
count is 260000
count is 265000
count is 270000
count is 275000
count is 280000
count is 285

In [22]:
#save
gn.save(G_clean,'sri_lanka_processed_graph_cleaned_part1','./', pickle = True, edges = False, nodes = False)

In [None]:
#fig, ax = ox.plot_graph(G_clean, node_zorder=2, node_color='w', bgcolor='k')

In [None]:
#len(G_clean.edges())

In [None]:
#len(G_clean)

### Project Graph

In [23]:
start = time.time()
G_proj = ox.project_graph(G_clean)
end = time.time()
print(end - start)

156.83470916748047


In [None]:
# save the projected graph (G_proj, not G_clean)
gn.save(G_proj,'sri_lanka_processed_graph_cleaned_part1_proj','./', pickle = True, edges = False, nodes = False)

### Note: although it is possible to run the OSMNX consolidate_intersections function, do not do so, because it creates new nodes in a format that is incompatible with the Step 3 notebook.