# Extract the detailed OSM road network

Code for extracting and storing the simplified and detailed (unsimplified) OSM road networks.

The process is: load the network with OSMnx, convert it to data frames, remove non-highway entities, and store the results as GeoJSON.

The results can then be combined back into a NetworkX graph.

In [145]:
%reload_kedro
%config IPCompleter.use_jedi = False
import geopandas as gpd
import networkx as nx
import osmnx as ox
import pandas as pd

from utils.process_gdf import process_edges

2022-04-05 00:10:45,334 - kedro.framework.session.store - INFO - `read()` not implemented for `BaseSessionStore`. Assuming empty store.
2022-04-05 00:10:45,444 - root - INFO - ** Kedro project Demand estimation and waste collection routing optimisation for the City of Cape Town
2022-04-05 00:10:45,445 - root - INFO - Defined global variable `context`, `session` and `catalog`
2022-04-05 00:10:45,454 - root - INFO - Registered line magic `run_viz`


## Load Cape Town sample road-network

The custom filter below works for Singapore, whose road network is quite porous.

Here, however, it seems to return non-highway boundaries and other entities, which is a problem.

A subsample of the full network is loaded for test purposes:

In [128]:
city = "City of Cape Town"

In [129]:
drive_service_all = '["area"!~"yes"]["highway"!~"cycleway|footway|path|pedestrian|steps|track|corridor|elevator|escalator|proposed|construction|bridleway|abandoned|platform|raceway"]["motor_vehicle"!~"no"]["motorcar"!~"no"]'
exclude_emergency_services = '["service"!~"emergency_access"]'
exclude_custom_services = (
    '["access"!~"private"]["service"!~"parking|parking_aisle|private|emergency_access"]'
)
exclude_parking_private_emergency = (
    '["access"!~"private"]["service"!~"parking|private|emergency_access"]'
)
exclude_private = '["access"!~"private"]["service"!~"private"]'
custom_filter = drive_service_all + exclude_emergency_services

The filter relies too heavily on `["motor_vehicle"!~"no"]["motorcar"!~"no"]`. The easiest way to filter roads seems to be to use the `highway` tag.
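As an illustration of filtering on the `highway` tag directly, the sketch below builds a positive Overpass tag filter from an allow-list of road classes. The list `DRIVABLE_HIGHWAY_TYPES` and the helper `build_highway_filter` are hypothetical names introduced here, and the chosen classes are an assumption that would need checking against the study area.

```python
# Hypothetical allow-list of drivable OSM highway classes (an assumption;
# adjust it to the study area and to the vehicles being routed).
DRIVABLE_HIGHWAY_TYPES = [
    "motorway", "motorway_link",
    "trunk", "trunk_link",
    "primary", "primary_link",
    "secondary", "secondary_link",
    "tertiary", "tertiary_link",
    "unclassified", "residential", "living_street", "service",
]


def build_highway_filter(highway_types):
    """Return an Overpass tag filter that keeps only the listed highway values."""
    pattern = "|".join(highway_types)
    # Anchor the regex so that each value must match in full.
    return f'["highway"~"^({pattern})$"]'


highway_only_filter = build_highway_filter(DRIVABLE_HIGHWAY_TYPES)
print(highway_only_filter)
```

The resulting string could be passed as `custom_filter=highway_only_filter` to `ox.graph_from_polygon`, in place of the exclusion-based filter above.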

In [130]:
custom_filter

'["area"!~"yes"]["highway"!~"cycleway|footway|path|pedestrian|steps|track|corridor|elevator|escalator|proposed|construction|bridleway|abandoned|platform|raceway"]["motor_vehicle"!~"no"]["motorcar"!~"no"]["service"!~"emergency_access"]'

In [131]:
gap_zones = catalog.load("gap_zones")
gap_zones_sample = gap_zones.loc[gap_zones["OBJECTID"].isin([24645])]
geometry = gap_zones_sample.geometry.values[0]

2022-04-04 23:49:14,257 - kedro.io.data_catalog - INFO - Loading data from `gap_zones` (GeoJSONDataSet)...


In [132]:
%%time
G_full = ox.graph_from_polygon(geometry, custom_filter=custom_filter, simplify=False)

CPU times: user 16.2 s, sys: 258 ms, total: 16.4 s
Wall time: 16.6 s


### Write files

In [133]:
catalog.save("road_network_full_24645", G_full)

2022-04-04 23:49:31,438 - kedro.io.data_catalog - INFO - Saving data to `road_network_full_24645` (NetworkXDataSet)...


## Unpack road elements

In [134]:
G_full = catalog.load("road_network_full_24645")
node_list_full, edge_list_full = ox.graph_to_gdfs(G_full)
node_list_full = node_list_full.reset_index()
edge_list_full = edge_list_full.reset_index()
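# NOTE: EPSG:3414 is SVY21 / Singapore TM. A projected CRS local to Cape Town
# is probably more appropriate here.
# The second reset_index() adds an extra "index" column to the *_xy frames.
node_list_full_xy = node_list_full.to_crs("EPSG:3414").reset_index()
edge_list_full_xy = edge_list_full.to_crs("EPSG:3414").reset_index()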

2022-04-04 23:49:33,762 - kedro.io.data_catalog - INFO - Loading data from `road_network_full_24645` (NetworkXDataSet)...
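The projected frames above use `EPSG:3414`, which is the SVY21 / Singapore TM system. For a Cape Town extract, a local projected CRS may be a better fit. The sketch below derives the WGS 84 UTM zone code from a longitude/latitude pair. The helper `utm_epsg_for` is a hypothetical name, the coordinates are approximate, and the special-case zones around Norway and Svalbard are ignored.

```python
import math


def utm_epsg_for(lon, lat):
    """Return the EPSG code of the WGS 84 / UTM zone containing (lon, lat)."""
    # UTM zones are 6 degrees wide and numbered 1..60 starting at 180 W.
    zone = int(math.floor((lon + 180.0) / 6.0)) + 1
    zone = min(max(zone, 1), 60)
    # EPSG 326xx covers the northern hemisphere, 327xx the southern.
    base = 32600 if lat >= 0 else 32700
    return base + zone


# Approximate coordinates of central Cape Town (assumption for illustration).
CAPE_TOWN_LON, CAPE_TOWN_LAT = 18.42, -33.92

cape_town_epsg = utm_epsg_for(CAPE_TOWN_LON, CAPE_TOWN_LAT)
print(f"EPSG:{cape_town_epsg}")
```

The code could then be used as `node_list_full.to_crs(f"EPSG:{cape_town_epsg}")`. Whether UTM is the right choice for this project, rather than a national grid, is a decision for the authors.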


In [135]:
node_list_full["index"] = node_list_full["osmid"]
node_list_full_xy["index"] = node_list_full_xy["osmid"]

Remove edges without a `highway` tag, together with any nodes that no remaining edge refers to:

In [136]:
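def filter_highway(nodes, edges):
    """Drop edges without a `highway` tag and nodes no remaining edge references."""
    print("BEFORE:", nodes.shape, edges.shape)
    edges = edges.copy()
    nodes = nodes.copy()
    # Copy the node ID into a "u" column so it can be matched against edge endpoints.
    nodes["u"] = nodes["index"]
    # Keep only edges that carry a `highway` tag.
    edges = edges.loc[~edges["highway"].isna()]
    # Keep only nodes that are the start (u) or end (v) of a remaining edge.
    nodes = nodes.loc[(nodes["u"].isin(edges["u"])) | (nodes["u"].isin(edges["v"]))]
    print("AFTER:", nodes.shape, edges.shape)
    return nodes, edges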


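# NOTE: the results are bound to `nodes_list_full`/`edges_list_full` (extra "s"),
# so `node_list_full` and `edge_list_full` themselves remain unfiltered.
nodes_list_full, edges_list_full = filter_highway(node_list_full, edge_list_full)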
node_list_full_xy, edge_list_full_xy = filter_highway(
    node_list_full_xy, edge_list_full_xy
)

BEFORE: (33665, 8) (67613, 19)
AFTER: (26008, 9) (48941, 19)
BEFORE: (33665, 8) (67613, 20)
AFTER: (26008, 9) (48941, 20)
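Since the filtered frames are meant to be recombined into a graph later, it can be worth checking that every edge endpoint still has a matching node row. The following is a minimal sketch on toy data. The helper name `dangling_endpoints` and the toy frames are hypothetical, and the column names `u`, `v` and `osmid` are taken from the cells above.

```python
import pandas as pd


def dangling_endpoints(nodes, edges, node_id="osmid"):
    """Return the edge endpoint IDs that have no matching row in `nodes`."""
    endpoint_ids = set(edges["u"]) | set(edges["v"])
    return endpoint_ids - set(nodes[node_id])


# Toy data: node 4 is used by an edge but is missing from the node table.
toy_nodes = pd.DataFrame({"osmid": [1, 2, 3]})
toy_edges = pd.DataFrame(
    {"u": [1, 2], "v": [2, 4], "highway": ["residential", "service"]}
)

missing = dangling_endpoints(toy_nodes, toy_edges)
print(missing)
```

An empty set indicates that the node and edge tables are consistent with each other.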


### Write files

In [110]:
%reload_kedro
catalog.save("road_network_full_24645_nodes", node_list_full)
catalog.save("road_network_full_24645_edges", edge_list_full)
catalog.save("road_network_full_24645_nodes_xy", node_list_full_xy)
catalog.save("road_network_full_24645_edges_xy", edge_list_full_xy)

2022-04-04 23:33:57,955 - kedro.io.data_catalog - INFO - Saving data to `road_network_full_24645_nodes` (CSVDataSet)...
2022-04-04 23:33:58,804 - kedro.io.data_catalog - INFO - Saving data to `road_network_full_24645_edges` (CSVDataSet)...
2022-04-04 23:34:00,116 - kedro.io.data_catalog - INFO - Saving data to `road_network_full_24645_nodes_xy` (CSVDataSet)...
2022-04-04 23:34:00,640 - kedro.io.data_catalog - INFO - Saving data to `road_network_full_24645_edges_xy` (CSVDataSet)...


## Load and save simplified road network

In [111]:
%%time
# `custom_filter_SG` is not defined in this notebook; use `custom_filter` from above.
G_simplified = ox.graph_from_polygon(
    geometry, custom_filter=custom_filter, simplify=True
)

CPU times: user 14.5 s, sys: 259 ms, total: 14.7 s
Wall time: 14.9 s


### Write file

In [112]:
%reload_kedro
# catalog.save("road_network_simplified_24645", G_simplified)
# DataSetError: Failed while saving data to data set NetworkXDataSet(filepath=/Users/ejwillemse/dev/waste_labs_dev/project_rdi_cpt/data/02_intermediate/road_network/road_network_simplified_24645.json, load_args={}, protocol=file, save_args={}).
# Object of type LineString is not JSON serializable

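The `Object of type LineString is not JSON serializable` error quoted above arises because edges in the simplified graph carry geometry objects as attributes. One possible workaround, not used in this notebook, is to replace each geometry with its WKT string before saving. The sketch below works on plain attribute dictionaries. `geometries_to_wkt` is a hypothetical helper, and `FakeLineString` is a stand-in so the example runs without shapely; real shapely geometries expose the same `wkt` attribute.

```python
import json


class FakeLineString:
    """Stand-in for shapely.geometry.LineString (illustration only)."""

    def __init__(self, coords):
        self.coords = coords

    @property
    def wkt(self):
        points = ", ".join(f"{x} {y}" for x, y in self.coords)
        return f"LINESTRING ({points})"


def geometries_to_wkt(attribute_dicts, key="geometry"):
    """Replace geometry objects with WKT strings in place; return the count."""
    converted = 0
    for data in attribute_dicts:
        geom = data.get(key)
        # Anything exposing a `wkt` attribute is treated as a geometry.
        if geom is not None and hasattr(geom, "wkt"):
            data[key] = geom.wkt
            converted += 1
    return converted


edge_attributes = [
    {"length": 1.4, "geometry": FakeLineString([(0, 0), (1, 1)])},
    {"length": 2.0},  # edges without a geometry are left untouched
]
n_converted = geometries_to_wkt(edge_attributes)
print(n_converted, json.dumps(edge_attributes))
```

For the graph in this notebook, the call would be along the lines of `geometries_to_wkt(data for _, _, data in G_simplified.edges(data=True))`, followed by the `catalog.save` call. The WKT strings would need to be parsed back into geometries after loading.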


In [113]:
node_list, edge_list = ox.graph_to_gdfs(G_simplified)
node_list = node_list.reset_index()
edge_list = edge_list.reset_index()
node_list_xy = node_list.to_crs("EPSG:3414")
edge_list_xy = edge_list.to_crs("EPSG:3414")

In [114]:
node_list["index"] = node_list["osmid"]
node_list_xy["index"] = node_list_xy["osmid"]

In [115]:
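# NOTE: the first result is bound to `nodes_list_full`/`edges_list_full`, and the
# second overwrites the detailed `node_list_full_xy`/`edge_list_full_xy` with
# simplified data. `node_list`, `edge_list` and their `_xy` versions stay unfiltered.
nodes_list_full, edges_list_full = filter_highway(node_list, edge_list)
node_list_full_xy, edge_list_full_xy = filter_highway(node_list_xy, edge_list_xy)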

BEFORE: (8138, 8) (22165, 19)
AFTER: (7088, 9) (16769, 19)
BEFORE: (8138, 8) (22165, 19)
AFTER: (7088, 9) (16769, 19)


### Write files

In [116]:
%reload_kedro
catalog.save("road_network_simplified_24645_nodes", node_list)
catalog.save("road_network_simplified_24645_edges", edge_list)
catalog.save("road_network_simplified_24645_nodes_xy", node_list_xy)
catalog.save("road_network_simplified_24645_edges_xy", edge_list_xy)

2022-04-04 23:34:20,614 - kedro.io.data_catalog - INFO - Saving data to `road_network_simplified_24645_nodes` (CSVDataSet)...
2022-04-04 23:34:20,755 - kedro.io.data_catalog - INFO - Saving data to `road_network_simplified_24645_edges` (CSVDataSet)...
2022-04-04 23:34:21,182 - kedro.io.data_catalog - INFO - Saving data to `road_network_simplified_24645_nodes_xy` (CSVDataSet)...
2022-04-04 23:34:21,314 - kedro.io.data_catalog - INFO - Saving data to `road_network_simplified_24645_edges_xy` (CSVDataSet)...


## Save simplified directed edge list


This step requires `process_edges` from the project's `utils.process_gdf` module (imported at the top of the notebook):

In [147]:
edge_list_xy_direct = process_edges(edge_list_xy)
edge_list_xy_direct = edge_list_xy_direct.drop_duplicates(["geom_id_order"])
edge_list_xy_direct.shape

(11163, 23)

In [148]:
edge_list.shape

(22165, 19)

In [149]:
edge_list_direct = process_edges(edge_list)
edge_list_direct = edge_list_direct.drop_duplicates(["geom_id_order"])
edge_list_direct.shape

(11163, 23)

### Write file

In [151]:
%reload_kedro
catalog.save("road_network_simplified_24645_edges_directed", edge_list_direct)
catalog.save("road_network_simplified_24645_edges_xy_directed", edge_list_xy_direct)

2022-04-05 00:13:35,372 - kedro.io.data_catalog - INFO - Saving data to `road_network_simplified_24645_edges_directed` (GeoJSONDataSet)...


DataSetError: Failed while saving data to data set GeoJSONDataSet(filepath=/Users/ejwillemse/dev/waste_labs_dev/project_rdi_cpt/data/02_intermediate/road_network_edges_directed/road_network_simplified_24645.geojson, load_args={}, protocol=file, save_args={'driver': GeoJSON}).
Invalid field type <class 'list'>
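The `Invalid field type <class 'list'>` error above suggests that at least one column holds Python lists. In simplified OSMnx graphs, merged edges can carry lists in attributes such as `osmid` or `highway`. A possible fix, sketched below on toy data, is to convert such columns to delimited strings before saving. `stringify_list_columns` is a hypothetical helper and the `|` separator is an arbitrary choice.

```python
import pandas as pd


def _to_text(value, sep):
    """Join lists with `sep`; cast other non-null values to str."""
    if isinstance(value, list):
        return sep.join(str(item) for item in value)
    if pd.isna(value):
        return None
    return str(value)


def stringify_list_columns(frame, sep="|", skip=("geometry",)):
    """Return a copy in which every column containing lists holds strings."""
    out = frame.copy()
    for column in out.columns:
        if column in skip:
            continue
        if out[column].apply(lambda v: isinstance(v, list)).any():
            out[column] = out[column].apply(lambda v: _to_text(v, sep))
    return out


# Toy edge table: `osmid` and `highway` mix scalars and lists.
toy_edges = pd.DataFrame(
    {
        "u": [1, 2],
        "osmid": [[10, 11], 12],
        "highway": ["residential", ["primary", "secondary"]],
    }
)
flat_edges = stringify_list_columns(toy_edges)
print(flat_edges)
```

Applied here, this would mean saving `stringify_list_columns(edge_list_direct)` rather than `edge_list_direct`. This assumes that the list-valued columns are the only cause of the failure.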

## ANNEX: Future code for converting back into a graph

In [117]:
node_list_full = gpd.GeoDataFrame(node_list_full, geometry="geometry", crs="EPSG:4326")
node_list_full = node_list_full.set_index("index")
edge_list_full = gpd.GeoDataFrame(edge_list_full, geometry="geometry", crs="EPSG:4326")

In [121]:
edge_list_full = edge_list_full.set_index(["u", "v", "key"])

In [123]:
G_full_red = ox.graph_from_gdfs(
    node_list_full,
    edge_list_full,
)

In [124]:
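# Count the nodes in the rebuilt graph. Node IDs are never dicts, so this
# equals G_full_red.number_of_nodes().
len([node for node in G_full_red.nodes if type(node) is not dict])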

33665