In [42]:
#load required packages
import networkx as nx
import osmnx as ox
import pandas as pd
import numpy as np
import folium
import json
import scipy.stats as stats

## Simple Demonstration of Reliability Measures

In this notebook, we explain what reliability means in terms of travel time in urban road networks. We consider the map of Champaign-Urbana, consider a simple model for travel time distribution on different edges of the road network, generate synthetic data parameters for the travel time distribution, and finally measure the reliability of a route for a given travel time budget.

### Stochasticity of Travel Times

In urban roads, travel times vary a lot due to a lot of factors such as varying traffic density, pedestrain crossings, etc. In such cases, it makes sense to model travel time on each edge as a random variable rather than a deterministic quantity. There are many ways we can model the travel time as a random variable, some of which consider the existing dependency between travel time of different edges. For now, we assume that the travel time between different edges are independent. 

We can obtain the distribution of the random variable modeling the travel time on an edge using collected data. For now, lets assume that the travel time on any edge can be modeled as a Gaussian random variable which is parametrized by a mean $\mu$ and variance $\sigma^2$. We can estimate the mean and variance from travel time samples on an edge. Note that, because of the independence assumption we made in the above paragraph, all these random variables are independent. 

There are a bunch of ways we could obtain raw data to estimate the $\mu$ and $\sigma^2$ of the travel time distribution on an edge. Some of them are not easy, and some other might not have data specific to Champaign Urbana. To get started, we generate synthetic data for mean and variance of travel time on each edge below and store it in the original networkx file.

In [53]:
## Import the CU data
filepath = 'champaign_processed.graphml'
G = ox.load_graphml(filepath)
for _,data in G.nodes(data=True):
    data['pop_score_tract'] = float(data['pop_score_tract'])
    data['pop_score_block'] = float(data['pop_score_block'])    
for i in G.nodes:
    Neighbour = list(G.neighbors(i))
    for item in Neighbour:
        route = nx.shortest_path_length(G,i,item,weight = 'length')
        if route <=10 and G.nodes[i]['pop_score_block'] != -1:
            G.nodes[item]['pop_score_block'] = -1
 #change data types of some attributes
DePop = []
for i in G.nodes:
    if G.nodes[i]['pop_score_block'] == -1:
        G.nodes[i]['pop_score_block'] = 0
        DePop.append(i)
for _,data in G.nodes(data=True):
    data['pop_score_tract'] = float(data['pop_score_tract'])
    data['pop_score_block'] = float(data['pop_score_block'])    
## Add travel time mean and variance on each edge
for u,v,data in G.edges(data=True):
    travel_time = data['travel_time']
    data['traveltime_mean'] = travel_time
    data['traveltime_var'] = 0.3*travel_time

### Reliability of Travel Time

For a given travel time budget $\tau$, the reliability of a route $R$ from origin to destination can be given as the probability of reaching the destination within the given travel time budget using that route. Recall that we have distributions of travel time on individual edges and a route is basically a series of edges. Since we assumed that the random variables are independent, the travel time distribution of a route can be defined as the sum of random variables defining the travel time on the edges that constitute the route, $T_R = T_{e_1} + T_{e_2} + ... + T_{e_n}$. Given that we have independent Gaussian random variables modeling the travel time on each edge, the parameters of the travel time distribution of the route can be given as 

$$
\mu_R = \mu_{e_1} + \mu_{e_2} + ... + \mu_{e_n}
$$

and 

$$
\sigma^2_R = \sigma^2_{e_1} + \sigma^2_{e_2} + ... + \sigma^2_{e_n}
$$

Note that we were able to obtain such a simple expression only because of our assumptions. Now, given the paramters of the normal distribution modeling the travel time on a route $R$, the reliability of that route can be formally given as

$$
Reli(R) = P(T_R \leq \tau) = \Phi(\frac{\tau - \mu_R}{\sigma^2_R})
$$

where $\Phi(t)$ denotes the cumulative distribution function of the standard normal variable. Reliability, which is a probability measure, takes values between 0 and 1. Below is a function to calculate the reliability of a route for a given travel time budget and a quick demonstration of route reliability.

In [54]:
# function to calculate reliability of a route
def calc_reliability(route, time_budget):

    edges_nodes = zip(route[:-1], route[1:])
    edges_mean = []
    edges_var = []
    
    for pair in edges_nodes:  
        edge_data = G.get_edge_data(*pair)
        edges_mean.append(edge_data[0]['traveltime_mean'])
        edges_var.append(edge_data[0]['traveltime_var'])
    
    route_mean = np.sum(edges_mean)
    route_var = np.sum(edges_var)
    
    snorm_rv = stats.norm()
    reliability = snorm_rv.cdf((time_budget-route_mean)/route_var)
    return reliability

In [55]:
# testing reliability

route = nx.shortest_path(G, 5428059287, 4470454133)

# reliability
reliability = calc_reliability(route,354)
reliability

0.5022710401887467

In [112]:
G.edges[(5912707072, 2424357566, 0)]

{'osmid': 626288554,
 'highway': 'service',
 'oneway': False,
 'length': 160.77800000000002,
 'geometry': <shapely.geometry.linestring.LineString at 0x1c1a0d76220>,
 'speed_kph': 59.677940884057975,
 'travel_time': 9.698739457591065,
 'traveltime_mean': 9.698739457591065,
 'traveltime_var': 2.9096218372773195}

In [117]:
marker = 0
fh = open('dataChampaign.osm.json','w+')
for i in G.edges:
    tempData = {}
    geom = {}
    if G.edges[i]['length'] < G.edges[i]['speed_kph']*1000/3600:
        G.edges[i]['length'] = float('%.3f'%(G.edges[i]['speed_kph']*1000/3600))
    iD = G.edges[i]['osmid']
    if isinstance(iD, list):
        iD = [iD[0],1]
    elif isinstance(iD, int):
        iD = [iD,i[2]]
    tempData['id'] = iD
    tempData['length'] = float('%.3f'%(G.edges[i]['length']))
    tempData['startNodeId'] = [i[0],0]
    tempData['endNodeId'] = [i[1],0]
    p1 = {}
    p2 = {}
    p1['lat'] = float('%.7f'%G.nodes[i[1]]['y'])
    p1['lon'] = float('%.7f'%G.nodes[i[1]]['x'])
    p2['lat'] = float('%.7f'%G.nodes[i[0]]['y'])
    p2['lon'] = float('%.7f'%G.nodes[i[0]]['x'])
    geom['points'] = [p1,p2]
    tempData['geom'] = geom
    tempData['speedLimit'] = float('%.3f'%(G.edges[i]['speed_kph']*1000/3600))
    tempData['lane'] = 1
    hmm = [{"mode": "go", "mean": float('%.3f'%(G.edges[i]['traveltime_mean'])), "cov":float('%.3f'%( G.edges[i]['traveltime_var'])), "prob": 0.85},{"mode": "stop", "mean":float('%.3f'%(G.edges[i]['traveltime_mean']*5)), "cov": float('%.3f'%(G.edges[i]['traveltime_var']/15)), "prob": 1.5E-1}]
    tempData["hmm"]=hmm
    if i[0]!=i[1]:
        fh.write('%s \n' %json.dumps(tempData))
fh.close()

In [118]:
tempData

{'id': [5344445, 0],
 'length': 197.917,
 'startNodeId': [38060031, 0],
 'endNodeId': [38132560, 0],
 'geom': {'points': [{'lat': 40.1361376, 'lon': -88.2771606},
   {'lat': 40.135401, 'lon': -88.275042}]},
 'speedLimit': 13.411,
 'lane': 1,
 'hmm': [{'mode': 'go', 'mean': 14.758, 'cov': 4.427, 'prob': 0.85},
  {'mode': 'stop', 'mean': 73.788, 'cov': 0.295, 'prob': 0.15}]}

In [107]:
python Main.py --source=38098327.0 --dest=38014800.0 --budget=300 --network="data/dataChampaign.osm.json"

SyntaxError: invalid syntax (<ipython-input-107-5dcd44d6a5e2>, line 1)