In [None]:
import numpy as np
import pandas as pd
import geopandas as gpd
import osmnx
import pyrosm
import os
%matplotlib inline

# Network routing

In the previous exercise, we generated a trip table with origin and destination coordinates. In this exercise, we use OpenStreetMap data to route the individual trips on the road network and show which roads the generated trips use.

We proceed as follows:
- We download OpenStreetMap data for Île-de-France
- We use a tool called `osmium` to cut out only the city of Paris (to speed up routing) and to bring the raw data into the right format
- We use the library `osmnx` to prepare the network data for routing
- We use the same library to route all generated trips from the previous exercise on the network
- We plot the flow along the road network that is created by the trips

First, we load in the trip information and the municipality data:

In [None]:
df_trips = gpd.read_parquet("data/trips.parquet")
df_municipalities = gpd.read_parquet("data/municipalities.parquet")

**Task**: Filter the municipalities such that only Paris remains in the data set.

In [None]:
### Insert code here
# df_perimeter = 


Now, we merge all municipalities into one polygon and save the dataframe in GeoJSON format. We'll need it later for cutting the OpenStreetMap data.

In [None]:
# Merge multiple polygons into one
df_perimeter = df_perimeter.dissolve()

# Write the perimeter as a GeoJSON file. You can have a look at it in QGIS, for instance.
df_perimeter.to_crs("EPSG:4326").to_file("data/perimeter.geojson")

## Preparing OpenStreetMap data

In this part, we will download the OpenStreetMap data and use `osmium` (installed as a command-line utility in your `conda` environment) to cut and convert it for further processing. Please follow the steps below.

### OpenStreetMap data

The whole OpenStreetMap data set is large, so there are providers of smaller cut-outs. One useful source of such cut-outs is GeoFabrik, which provides per-region data sets for France: http://download.geofabrik.de/europe/france.html

- Download the latest data for Île-de-France in `.osm.pbf` format and put the file into the `data` directory next to this notebook.

Linux users may execute the following cell:

In [None]:
if not os.path.exists("data/ile-de-france-latest.osm.pbf"):
    !cd data && wget http://download.geofabrik.de/europe/france/ile-de-france-latest.osm.pbf

Next, we (1) cut the OpenStreetMap data to the selected perimeter (Paris) and (2) retain only road geometries in the file:

In [None]:
!osmium extract data/ile-de-france-latest.osm.pbf -p data/perimeter.geojson --overwrite -o data/cut.osm.pbf
!osmium tags-filter --overwrite -o data/perimeter.osm.pbf data/cut.osm.pbf w/highway

## Loading the OpenStreetMap data

We are now ready to read the data and use it in this notebook. For that, we use the `pyrosm` library. Further processing of the data happens using `osmnx` and `networkx`.

In [None]:
# Load our cut perimeter data
osm = pyrosm.OSM("data/perimeter.osm.pbf")

# Extract nodes and edges from the road network
nodes, edges = osm.get_network(nodes = True, network_type = "driving")

# Convert the data into a graph that can be used with the networkx library
graph = osm.to_graph(nodes, edges, graph_type = "networkx")

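Next, we add link speeds and travel times to the network based on OSM information. Speeds are taken from the `maxspeed` tag where it is available and imputed otherwise; they are stored in km/h as `speed_kph`, while travel times are stored in seconds as `travel_time`.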
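
To make the units concrete, here is a minimal sketch of a free-flow travel time calculation for a single link. It assumes a length in meters and a speed in km/h; the helper name `travel_time_seconds` is made up for illustration and is not part of `osmnx`.

```python
def travel_time_seconds(length_m: float, speed_kph: float) -> float:
    """Free-flow travel time in seconds for a link of the given length and speed."""
    if speed_kph <= 0:
        raise ValueError("speed must be positive")
    # Convert km/h to m/s, then divide distance by speed
    speed_mps = speed_kph / 3.6
    return length_m / speed_mps

# A hypothetical 500 m link with a speed of 30 km/h
example_time = travel_time_seconds(500.0, 30.0)
print(round(example_time, 1))
```

Using travel time rather than length as the routing weight lets fast roads be preferred over short but slow ones.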

In [None]:
# Both functions return the graph, so we keep the returned object
graph = osmnx.add_edge_speeds(graph)
graph = osmnx.add_edge_travel_times(graph)

**Task**: Use `osmnx` to transform the network into a `GeoDataFrame` that you can easily manipulate and visualize. Remove unnecessary columns such that you arrive at the structure below:

In [None]:
pd.DataFrame({ "u": [], "v": [], "geometry": [] })

In [None]:
df_network = osmnx.graph_to_gdfs(graph, nodes = False, edges = True).reset_index()

### Insert your code here


**Task**: Write out the network in GeoPackage format and have a look at it in QGIS. 

In [None]:
# Insert your code here


### Routing

In order to route the trips in our *trips* dataframe through the road network, we need to assign each origin and destination (by coordinate) to a specific network node:

In [None]:
# add origin_node column to our data frame
geometry = df_trips["origin_geometry"].to_crs("EPSG:4326")
df_trips["origin_node"] = osmnx.nearest_nodes(graph, geometry.x, geometry.y)

# add destination_node column to our data frame
geometry = df_trips["destination_geometry"].to_crs("EPSG:4326")
df_trips["destination_node"] = osmnx.nearest_nodes(graph, geometry.x, geometry.y)

**Task**: Look at the updated dataframe and note down one combination of origin and destination node. Use `osmnx.shortest_path` to perform a routing through the network:

In [None]:
### Insert your code here

# origin_node = 
# destination_node = 

# route = osmnx.shortest_path(graph, ..., weight = "travel_time")


Print the obtained route. What does it represent?

In [None]:
route

**Task**: To visualize the route, we need to select all links from the network that connect the provided node identifiers. Visualize the resulting data here (calling `plot` on the dataframe) and in QGIS.
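
The pairing of consecutive nodes can be tried out on a toy example first. The sketch below uses a made-up network (`toy_network`) and a made-up route (`toy_route`); only the `u`/`v` column convention is taken from this notebook.

```python
import pandas as pd

# Hypothetical network: one row per directed link from node u to node v
toy_network = pd.DataFrame({
    "u": [1, 2, 3, 2],
    "v": [2, 3, 4, 5],
    "name": ["a", "b", "c", "d"],
})

# Hypothetical route, given as a sequence of node identifiers
toy_route = [1, 2, 3, 4]

# Pair each node with its successor: (1, 2), (2, 3), (3, 4)
toy_selector = pd.DataFrame({"u": toy_route[:-1], "v": toy_route[1:]})

# The inner join keeps only the links that the route traverses
toy_route_links = pd.merge(toy_network, toy_selector, on=["u", "v"])
print(toy_route_links)
```

Link `d` is dropped because the route never goes from node 2 to node 5. In a real OSM network, several parallel links may exist between the same pair of nodes, in which case such a merge would return all of them.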

In [None]:
# Understand what happens in the following two lines
df_selector = pd.DataFrame({ "u": route[:-1], "v": route[1:] })
df_route = pd.merge(df_network, df_selector, on = ["u", "v"])

### Insert your code here


**Task**: Now calculate at least 200 routes from your trips data frame. You can pass a list of origin nodes and a list of destination nodes to `osmnx.shortest_path`.
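
A possible way to prepare the inputs is sketched below on a made-up trip table. The column names `origin_node` and `destination_node` are the ones added earlier in this notebook; the node identifiers and the `toy_routes` list are invented. The sketch also assumes that `osmnx.shortest_path` returns `None` for a pair of nodes without a connecting path, which is worth checking against your installed version.

```python
import pandas as pd

# Hypothetical trip table with already assigned network nodes
toy_trips = pd.DataFrame({
    "origin_node": [10, 11, 12, 13],
    "destination_node": [20, 21, 22, 23],
})

# Select a subset of trips and turn the node columns into plain lists
number_of_trips = 3
subset = toy_trips.head(number_of_trips)
origin_nodes = subset["origin_node"].tolist()
destination_nodes = subset["destination_node"].tolist()

# With a real graph, the lists would be passed along the lines of:
# routes = osmnx.shortest_path(graph, origin_nodes, destination_nodes, weight="travel_time")

# Invented result in which the second trip could not be routed
toy_routes = [[10, 15, 20], None, [12, 22]]

# Drop unroutable trips before any further processing
valid_routes = [route for route in toy_routes if route is not None]
print(origin_nodes, destination_nodes, len(valid_routes))
```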

In [None]:
### Insert your code here

#routes = ...


**Task**: We have now obtained a list of routes, one for each trip. A route is simply a list of nodes. By noting down each node in a route together with its successor, we obtain the edges that the route traverses, and from these a data frame that counts the number of traversals of each edge. Complete the code to obtain a data frame that shows the number of traversals between two nodes, with the structure shown in the next cell.
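
The counting idea can be sketched on a few made-up routes. This is one possible approach using `pandas.concat` and `groupby`, not the only one; the routes in `toy_routes` are invented.

```python
import pandas as pd

# Hypothetical routes, each a list of node identifiers
toy_routes = [[1, 2, 3], [2, 3, 4], [1, 2, 3, 4]]

# One small frame of (u, v) pairs per route
frames = [
    pd.DataFrame({"u": route[:-1], "v": route[1:]})
    for route in toy_routes
]

# Stack all pairs, then count how often each (u, v) combination occurs
toy_counts = (
    pd.concat(frames, ignore_index=True)
    .groupby(["u", "v"])
    .size()
    .reset_index(name="count")
)
print(toy_counts)
```

Here, the link from node 2 to node 3 is used by all three routes, while the other two links are each used twice.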

In [None]:
pd.DataFrame({ "u": [], "v": [], "count": [] })

In [None]:
# Here we create a list of data frames with the node-to-node traversals
df_count = []

for route in routes:
    df_count.append(pd.DataFrame({ "u": route[:-1], "v": route[1:] }))

# Complete the code to arrive at the count dataframe shown above

### Insert your code here


**Task**: Which two nodes have the largest number of traversals between each other?

In [None]:
### Insert your code here

df_count.sort_values(by = "count", ascending = False)

**Task**: Now merge your network dataframe with the counts dataframe so that the counts are attached to the network geometry. Hint: Perform a *left join* so that no network links are removed, and fill missing count values with zeros.
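
The effect of the left join can be checked on toy data. In the sketch below, the frames `toy_network` and `toy_counts` are invented, and the geometry column holds placeholder strings instead of real geometries.

```python
import pandas as pd

# Hypothetical network with three links (placeholder strings as geometry)
toy_network = pd.DataFrame({
    "u": [1, 2, 3],
    "v": [2, 3, 4],
    "geometry": ["g12", "g23", "g34"],
})

# Hypothetical counts: the link from node 3 to node 4 was never traversed
toy_counts = pd.DataFrame({"u": [1, 2], "v": [2, 3], "count": [5, 7]})

# The left join keeps every network link, including those without counts
toy_flow = pd.merge(toy_network, toy_counts, on=["u", "v"], how="left")

# Links without any traversal get a missing value, which we replace by zero
toy_flow["count"] = toy_flow["count"].fillna(0).astype(int)
print(toy_flow)
```

With an inner join, the unused link would disappear from the result and leave a gap in the plotted network.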

In [None]:
pd.DataFrame({ "u": [], "v": [], "geometry": [], "count": [] })

In [None]:
### Insert your code here


**Task**: Plot the network using the `count` column in this notebook. Find a better representation in QGIS as well.

In [None]:
### Insert code here
# ...


**Example for a representation in QGIS**

![](material/flow_example.png)

**Congratulations!** You should now be able to cut a road network for the course project (Exercise 3.2).