For our Sierra Leone example, we have IRI (International Roughness Index) data for some of the roads, as collected by Road Lab Pro. Here, we match this data onto our much more detailed (and topologically correct) OSM road network.

In [1]:
import os, sys
import pandas as pd
import geopandas as gpd
import networkx as nx
from shapely.geometry import Point, MultiPoint
from shapely.wkt import loads
from scipy import spatial
from functools import partial
import pyproj
from shapely.ops import transform
sys.path.append(r'C:\Users\charl\Documents\GitHub\GOST_PublicGoods\GOSTNets\GOSTNets')
import GOSTnet as gn

peartree version: 0.6.1 
networkx version: 2.3 
matplotlib version: 3.0.3 
osmnx version: 0.9 


Set the EPSG code for the projection. This is the projected coordinate system in which real-world distances (in metres) are measured.

In [2]:
code = 2161
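
To see why distances need a projected CRS rather than WGS 84 degrees, the following is a rough sketch of how long one degree is on the ground. It assumes a spherical Earth and a latitude of 8.5 degrees north as an approximate value for Sierra Leone; it is not how pyproj computes anything, and the function name is purely illustrative.

```python
import math

def metres_per_degree(lat_deg):
    """Approximate ground length of one degree of latitude and of longitude
    at a given latitude, assuming a spherical Earth (radius 6,371 km)."""
    earth_radius_m = 6_371_000
    per_deg_lat = math.pi * earth_radius_m / 180
    per_deg_lon = per_deg_lat * math.cos(math.radians(lat_deg))
    return per_deg_lat, per_deg_lon

# 8.5 degrees north is an assumed, approximate latitude for Sierra Leone
lat_m, lon_m = metres_per_degree(8.5)

# a 10 metre buffer expressed in degrees of longitude at that latitude
buffer_deg = 10 / lon_m
print(round(lat_m), round(lon_m), buffer_deg)
```

A buffer of "10" applied to unprojected coordinates would mean 10 degrees, not 10 metres, which is why the buffering further down is done in the projected system.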

Set paths to your graph object and import

In [3]:
net_pth = r'C:\Users\charl\Documents\GOST\SierraLeone\RoadNet'
net_name = r'largest_G.pickle'
G = nx.read_gpickle(os.path.join(net_pth, net_name))

Set paths to your IRI dataset and import it. Ensure the projection is WGS 84

In [4]:
iri_pth = r'C:\Users\charl\Documents\GOST\SierraLeone\IRI_data'
iri_name = r'road_network_condition_vCombo.shp'
iri_df = gpd.read_file(os.path.join(iri_pth, iri_name))
iri_df = iri_df.to_crs({'init':'epsg:4326'})

Remove any records in the IRI dataframe whose average IRI is not greater than 0 - we only want to match on valid information

In [5]:
iri_df = iri_df.loc[iri_df.Avg_iri > 0]

Convert each LineString to a list of its constituent point coordinates. We do this because LineString-to-LineString intersections are slow, painful and unpredictable. It is easier to conceptualize the intersect as line to point, or polygon to point. We pursue the latter here, buffering each road edge into a polygon.

In [6]:
iri_df['point_bag'] = iri_df.geometry.apply(lambda x: list(x.coords))

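Create a dictionary object of IRI:MultiPoint pairs. Several road segments can share the same IRI value, so points are accumulated per IRI rather than overwritten

In [7]:
bag = {}
for index, row in iri_df.iterrows():
    # accumulate the coordinates of every segment that has this IRI value
    bag.setdefault(row.Avg_iri, []).extend(row['point_bag'])

# convert each list of coordinates to a single MultiPoint
bag = {iri: MultiPoint(coords) for iri, coords in bag.items()}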

Iterate out the points into their own list, with corresponding IRI list

In [8]:
points = []
iris = []
for b in bag:
    for c in bag[b].geoms:
        points.append(c)
        iris.append(b)
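
A dictionary keyed on IRI can hold only one entry per distinct value, so segments that share an IRI need care. One alternative is to skip the dictionary and build the two lists in a single pass. This is a minimal sketch with plain coordinate tuples standing in for the geometries; the `segments` data is invented for illustration.

```python
# Hypothetical stand-in data: (Avg_iri, vertex coordinates) per surveyed segment
segments = [
    (4.2, [(-13.20, 8.40), (-13.19, 8.41)]),
    (4.2, [(-12.90, 8.10), (-12.89, 8.11)]),  # same IRI as the first segment
    (7.5, [(-11.70, 7.90)]),
]

# Build the point list and the matching IRI list directly, one pair per vertex
points, iris = [], []
for avg_iri, coords in segments:
    for xy in coords:
        points.append(xy)
        iris.append(avg_iri)

# For comparison: assigning into a dict keyed on IRI keeps only the last
# segment seen for each distinct value
keyed = {avg_iri: coords for avg_iri, coords in segments}
kept_vertices = sum(len(coords) for coords in keyed.values())
```

Here `points` holds all five vertices, while `keyed` retains only three of them because the second 4.2 segment replaces the first.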

Generate a new dataframe composed only of geometry:IRI pairs

In [9]:
points_df = pd.DataFrame({'IRIs':iris, 'Points':points})

Convert to GeoDataFrame using known projection (WGS 84)

In [10]:
points_gdf = gpd.GeoDataFrame(points_df, crs = {'init':'epsg:4326'}, geometry = 'Points')

Project over to metres to allow for binding on to graph

In [11]:
points_gdf_proj = points_gdf.to_crs({'init':'epsg:%s' % code})

Save down as required

In [12]:
#points_gdf.to_file(os.path.join(net_pth, 'IRIpoints.shp'), driver = 'ESRI Shapefile')

Generate a spatial index. This will allow us to do faster intersections later

In [13]:
sindex = points_gdf_proj.sindex
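
To illustrate what a spatial index buys us, here is a toy uniform-grid index in plain Python. It is only a sketch of the general idea (look in a few buckets instead of scanning every point) and is not the structure geopandas uses internally; `build_grid_index`, `query_bounds` and the sample points are all hypothetical.

```python
from collections import defaultdict

def build_grid_index(points, cell_size):
    """Bucket point indices by the grid cell each point falls in."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(int(x // cell_size), int(y // cell_size))].append(i)
    return grid

def query_bounds(grid, cell_size, bounds):
    """Return indices of points in every cell that overlaps the box
    (minx, miny, maxx, maxy). These are candidates only: a cell can
    contain points that lie outside the box itself."""
    minx, miny, maxx, maxy = bounds
    hits = []
    for cx in range(int(minx // cell_size), int(maxx // cell_size) + 1):
        for cy in range(int(miny // cell_size), int(maxy // cell_size) + 1):
            hits.extend(grid.get((cx, cy), []))
    return sorted(hits)

# Invented sample points in projected (metre) coordinates
sample_points = [(5, 5), (95, 5), (250, 250), (12, 18)]
grid = build_grid_index(sample_points, cell_size=100)
candidates = query_bounds(grid, 100, (0, 0, 50, 50))
```

The query only touches the cells that overlap the box. Because the result is a set of candidates, an exact geometric test still has to follow.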

Define projection method. This will be called many times in the next loop

In [14]:
source_crs = 'epsg:4326'
target_crs = 'epsg:%s' % code

project_WGS_to_UTM = partial(
                    pyproj.transform,
                    pyproj.Proj(init=source_crs),
                    pyproj.Proj(init=target_crs))

Iterate over all graph edges, perform fast spatial intersection, add on IRI data

In [15]:
# define a counter
c = 0

# iterate over the edges in the graph
for u, v, data in G.edges(data = True):
    
    # convert string object to shapely object
    if isinstance(data['Wkt'], str):
        polygon = loads(data['Wkt'])
    
    # if geometry appears to be a list, unbundle it first
    elif isinstance(data['Wkt'], list):
        data['Wkt'] = gn.unbundle_geometry(data['Wkt'])
        polygon = data['Wkt']
    
    # otherwise, assume the attribute already holds a shapely geometry
    else:
        polygon = data['Wkt']
    
    # project shapely object to UTM zone of choice
    polygon_proj = transform(project_WGS_to_UTM, polygon)
    
    # buffer by 10 metres to capture nearby points
    polygon_proj = polygon_proj.buffer(10)
    
    # generate the list of possible matches - the index of the points that intersect the 
    # bounding box of the buffered, projected polygon
    possible_matches_index = list(sindex.intersection(polygon_proj.bounds))
    
    # use this to .iloc the actual points GeoDataFrame
    possible_matches = points_gdf_proj.iloc[possible_matches_index]
    
    # intersect this smaller dataframe with the projected polygon (same CRS as the points)
    # to get an accurate intersection
    precise_matches = possible_matches[possible_matches.intersects(polygon_proj)]
    
    # match on mean IRI as a data dictionary object if more than 3 points detected
    if len(precise_matches) > 3:
        data['iri'] = precise_matches.IRIs.mean()
    else:
        data['iri'] = 0
    
    c+=1
    
    if c % 10000 == 0:
        print('edges completed: ',c)

edges completed:  10000
edges completed:  20000
edges completed:  30000
edges completed:  40000
edges completed:  50000
edges completed:  60000
edges completed:  70000
edges completed:  80000
edges completed:  90000
edges completed:  100000
edges completed:  110000
edges completed:  120000
edges completed:  130000
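
The loop above follows a two-stage pattern: a cheap bounding-box lookup to find candidate points, then an exact test against the buffered geometry. The sketch below reproduces that pattern without any GIS libraries, for a single straight edge in projected coordinates. All names and sample values are hypothetical, and the distance test stands in for intersecting with a 10 metre buffer.

```python
import math
from statistics import mean

def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # position of the closest point along the segment, clamped to its ends
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def edge_iri(a, b, points, iris, buffer_m=10, min_points=3):
    """Mean IRI of the points within buffer_m of edge a-b, or 0 if there
    are not more than min_points such points."""
    minx, maxx = min(a[0], b[0]) - buffer_m, max(a[0], b[0]) + buffer_m
    miny, maxy = min(a[1], b[1]) - buffer_m, max(a[1], b[1]) + buffer_m
    # stage 1: coarse filter on the bounding box of the buffered edge
    candidates = [i for i, (x, y) in enumerate(points)
                  if minx <= x <= maxx and miny <= y <= maxy]
    # stage 2: exact filter on distance to the edge itself
    precise = [i for i in candidates
               if point_segment_distance(points[i], a, b) <= buffer_m]
    if len(precise) > min_points:
        return mean(iris[i] for i in precise)
    return 0

# Invented data: four points along a diagonal edge, plus one that sits
# inside the bounding box but far from the edge
pts = [(10, 10), (30, 30), (50, 52), (70, 68), (95, 5)]
vals = [4.0, 6.0, 5.0, 5.0, 20.0]
result = edge_iri((0, 0), (100, 100), pts, vals)
```

The last point passes the bounding-box stage but is rejected by the distance test, so it does not distort the mean. This is why the exact filter matters for diagonal edges, whose bounding boxes cover far more area than their buffers.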


Save down the new graph

In [16]:
gn.save(G, 'IRI_adj', net_pth, pickle = True, nodes = False,  edges = False)