# Practical: RTree and NetworkX

In this practical we will look at methods for working with geospatial information represented as a graph
network.

We shall be working with Ordnance Survey MasterMap Integrated Transport Network (ITN) data that covers the
island of Mersea in Essex. The OS data has been downloaded from Edina Digimap and has been cleaned up in QGIS.
You must use this data in accordance with the Educational User Licence that you agreed to when you signed up to Edina
Digimap.

All OS data is © Crown copyright and database rights 2018 Ordnance Survey.

## RTree

Rtree is a ctypes Python wrapper of libspatialindex that provides a number of advanced spatial indexing features for
the spatially curious Python user.

To use Rtree, we first import its `index` module:

In [None]:
from rtree import index

After importing the index module, we build an index using its default constructor:

In [None]:
idx = index.Index()

After instantiating the index, we create a bounding box, given as `(minx, miny, maxx, maxy)`:

In [None]:
br = (0.0, 0.0, 1.0, 1.0)

We now insert this bounding box into the index with the id `0`:

In [None]:
idx.insert(0, br)

We also add 10,000 squares with side length 0.99 to the index. Note that the first of these squares reuses the id `0`:

In [None]:
for i in range(100):
    for j in range(100):
        idx.insert(i*100 + j, (i, j, i+0.99, j+0.99))

We can query the index using `intersection`. This will return the indexed entries that cross or are contained
within the given query window.

In [None]:
for i in idx.intersection((1.0, 1.0, 2.0, 2.0)):
    print(i)
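
To make clear what the query is testing, here is a brute-force sketch of a bounding-box intersection check over the same entries. It is not how Rtree works internally (the index exists precisely to avoid scanning every entry), and the helper `boxes_intersect` is a hypothetical name. The sketch assumes that boxes which merely touch the query window count as intersecting.

```python
def boxes_intersect(a, b):
    """Return True if two (minx, miny, maxx, maxy) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

# The same entries as in the cells above: the unit square plus the grid of squares.
entries = [(0, (0.0, 0.0, 1.0, 1.0))]
for i in range(100):
    for j in range(100):
        entries.append((i * 100 + j, (i, j, i + 0.99, j + 0.99)))

# Scan every entry and keep the ids whose box meets the query window.
window = (1.0, 1.0, 2.0, 2.0)
hits = sorted(item_id for item_id, box in entries if boxes_intersect(box, window))
print(hits)
```

The unit square is included because its corner touches the window at (1.0, 1.0).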

We can also query the index using `nearest`. This returns the indexed entries nearest to the given query point;
the second argument is the number of results requested. If several items are equally distant from the query
point, all of them are returned:

In [None]:
for i in idx.nearest((0.8, 0.8), 1):
    print(i)
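
The following sketch shows one way to measure the distance from a point to a bounding box, which helps explain why a `nearest` query can return more than one entry. The function `point_box_distance` is a hypothetical helper, and treating the distance as Euclidean is an assumption rather than a description of Rtree's internals.

```python
import math

def point_box_distance(point, box):
    """Distance from (x, y) to a (minx, miny, maxx, maxy) box; 0 if inside."""
    x, y = point
    minx, miny, maxx, maxy = box
    dx = max(minx - x, 0.0, x - maxx)  # horizontal gap, 0 if x is within the box
    dy = max(miny - y, 0.0, y - maxy)  # vertical gap, 0 if y is within the box
    return math.hypot(dx, dy)

inside = point_box_distance((0.8, 0.8), (0.0, 0.0, 1.0, 1.0))
outside = point_box_distance((0.8, 0.8), (1.0, 1.0, 1.99, 1.99))
print(inside, outside)
```

The query point (0.8, 0.8) lies inside both the unit square inserted first and the first grid square, so both are at distance zero from it.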

Note that Rtree does not enforce unique ids: if an object is inserted with an id that already exists, the first
object is not replaced by the second one and both remain in the index. If you need unique ids then you should
handle them yourself, for example using a set.
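
A minimal sketch of tracking ids with a set might look like the following. The names `seen_ids` and `insert_unique` are hypothetical and not part of Rtree; only `index.Index`, `insert` and `intersection` come from the library.

```python
from rtree import index

idx = index.Index()
seen_ids = set()  # hypothetical helper state: ids that have already been inserted

def insert_unique(item_id, bounds):
    """Insert only if item_id has not been used before; return True if inserted."""
    if item_id in seen_ids:
        return False
    seen_ids.add(item_id)
    idx.insert(item_id, bounds)
    return True

first = insert_unique(7, (0.0, 0.0, 1.0, 1.0))
second = insert_unique(7, (5.0, 5.0, 6.0, 6.0))  # rejected: id 7 is already in use
hits = list(idx.intersection((0.0, 0.0, 10.0, 10.0)))
print(first, second, hits)
```

Whether a duplicate should be rejected, as here, or should replace the earlier entry depends on the application.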

# Exercise 33

Create several Polygons using Shapely and index their Minimum Bounding Rectangles, then query the index with a
point using the `intersection` method.

## Introduction to NetworkX

NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of
complex networks.

We can import `networkx` like this:

In [None]:
import networkx as nx

We start by creating a small graph, beginning with 5 nodes. Nodes can be added either individually (`add_node`)
or from a list (`add_nodes_from`).

In [None]:
g = nx.Graph()

We add a node:

In [None]:
g.add_node(1)

Or multiple nodes by using a list:

In [None]:
g.add_nodes_from([2, 3, 4, 5])

We can then access the nodes via `nodes`:

In [None]:
g.nodes

We can also attach attributes to nodes. In this example, we iterate through the nodes in the graph and set the
`color` attribute of each to `green`. We then set the `color` attribute of node 1 to `red`.

In [None]:
for node in g.nodes:
    g.nodes[node]['color'] = 'green'

g.nodes[1]['color'] = 'red'

We can display the graph using `draw`. Before calling `draw`, we generate a list of colors that corresponds to
the node `color` attributes.

Note: The standard method for obtaining a value from a dictionary is `dictionary[key]`. An alternative method is to
use the `.get` method which allows for a default value if the key does not exist.
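
As a small illustration of the difference, using a made-up attribute dictionary:

```python
attrs = {'color': 'red'}

present = attrs.get('color', 'blue')  # key exists, so its value is returned
missing = attrs.get('size', 10)       # key is absent, so the default is returned

try:
    attrs['size']                     # square brackets raise KeyError instead
except KeyError:
    raised = True
else:
    raised = False

print(present, missing, raised)
```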

In [None]:
node_colors = []
for node in g.nodes:
    node_colors.append(g.nodes[node].get('color','blue'))
    
nx.draw(g, with_labels=True, node_color=node_colors)

Nodes are only one part of a graph. Edges are the connections between nodes. We can add edges by specifying the two
nodes that the edge connects. Attributes can be added at the same time as the edges are created.

In [None]:
g.add_edge(0, 1, color='blue')
g.add_edges_from([(0,2), (0,3), (0,4), (0,5)], color='purple')
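
Note that node 0 was never added explicitly. The sketch below rebuilds the same graph to show that `add_edge` creates any missing endpoint, and that a node created this way carries no attributes, which is why the `.get` default is useful when building the color list.

```python
import networkx as nx

g = nx.Graph()
g.add_nodes_from([1, 2, 3, 4, 5])
g.add_edge(0, 1, color='blue')  # node 0 did not exist; it is created here
g.add_edges_from([(0, 2), (0, 3), (0, 4), (0, 5)], color='purple')

print(g.number_of_nodes(), g.number_of_edges())
print(g.nodes[0])               # an empty attribute dictionary
print(g.edges[0, 2]['color'])
```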

We can now draw the graph with the edges connected. Note that an `edge_colors` list can be generated in much the same
way as the `node_colors` list.

In [None]:
node_colors = []
for node in g.nodes:
    node_colors.append(g.nodes[node].get('color','blue'))
    
edge_colors = []
for u, v in g.edges:
    edge_colors.append(g.edges[u, v].get('color', 'black'))

nx.draw(g, with_labels=True, node_color=node_colors, edge_color=edge_colors)

To avoid repeating code every time we need to draw a graph, let's define a function for creating the color lists
`node_colors` and `edge_colors`.

In [None]:
def obtain_colors(graph, default_node='blue', default_edge='black'):
    node_colors = []
    for node in graph.nodes:
        node_colors.append(graph.nodes[node].get('color', default_node))
    edge_colors = []
    for u, v in graph.edges:
        edge_colors.append(graph.edges[u, v].get('color', default_edge))
    return node_colors, edge_colors

Graphs can be joined together in networkx using the `compose` function.

To demonstrate `compose` let's make a new graph `h` with its own color scheme.

In [None]:
h = nx.Graph()

h.add_edges_from([(1, 6), (1, 7), (1, 8), (1, 9)], color='purple')
for node in h.nodes:
    h.nodes[node]['color'] = 'grey'

h.nodes[1]['color']  = 'pink'

We now display this new graph.

In [None]:
node_colors_2, edge_colors_2 = obtain_colors(h)

nx.draw(h, with_labels=True, node_color=node_colors_2, edge_color=edge_colors_2)

Now let's combine the two graphs using `compose`. We first draw the graph `gh`, where the attributes of `h`
override the attributes of `g`:

In [None]:
gh = nx.compose(g, h)

node_colors_gh, edge_colors_gh = obtain_colors(gh)
nx.draw(gh, with_labels=True, node_color=node_colors_gh, edge_color=edge_colors_gh)

We now draw the graph `hg` with `g` overriding `h`. Note how node 1 is pink in `gh` and red in `hg`:

In [None]:
hg = nx.compose(h, g)

node_colors_hg, edge_colors_hg = obtain_colors(hg)
nx.draw(hg, with_labels=True, node_color=node_colors_hg, edge_color=edge_colors_hg)
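
The precedence can also be checked without drawing. This sketch uses two single-node graphs as stand-ins for `g` and `h`:

```python
import networkx as nx

g = nx.Graph()
g.add_node(1, color='red')
h = nx.Graph()
h.add_node(1, color='pink')

# The attributes of the second argument take precedence over those of the first.
color_gh = nx.compose(g, h).nodes[1]['color']
color_hg = nx.compose(h, g).nodes[1]['color']
print(color_gh, color_hg)
```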

In `networkx` you can create graphs of a certain shape using built-in generator functions, such as a rectangular
grid or a hexagonal lattice. The code below creates a 3 by 3 rectangular grid.

In [None]:
m = nx.grid_2d_graph(3,3)
nx.draw(m, with_labels=True)
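
The nodes of the built-in grid are `(row, column)` tuples. As a sketch, the snippet below inspects them and shows one possible way to relabel them as integers using `convert_node_labels_to_integers`; choosing `ordering='sorted'` is an assumption made here so that the integer ids run row by row.

```python
import networkx as nx

m = nx.grid_2d_graph(3, 3)
print(sorted(m.nodes)[:3])   # the first few tuple-labelled nodes
print(m.number_of_edges())

# Relabel the sorted tuples as 0, 1, 2, ... so that (0, 0) -> 0, (0, 1) -> 1, (1, 0) -> 3.
m_int = nx.convert_node_labels_to_integers(m, ordering='sorted')
print(sorted(m_int.nodes))
```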

We can also build a graph with the shape of hexagons:

In [None]:
n = nx.hexagonal_lattice_graph(3, 3, with_positions=True)

pos = {}
for node in n.nodes:
    pos[node] = n.nodes[node]['pos']

nx.draw(n, pos=pos)

Dijkstra's algorithm can be used to find the shortest path between two nodes. In `networkx`, `dijkstra_path`
returns the path as a list of nodes, beginning with the source node and ending with the target node.
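
To give a feel for how the algorithm works, here is a minimal sketch of Dijkstra's algorithm using a priority queue. It is an illustration only, not the NetworkX implementation, and the toy network and the function name `dijkstra` are made up for the example.

```python
import heapq

def dijkstra(adjacency, source, target):
    """Return (distance, path) for the cheapest route from source to target."""
    queue = [(0.0, source, [source])]  # (distance so far, node, path taken)
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)  # always expand the closest node
        if node == target:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in adjacency[node].items():
            if neighbour not in visited:
                heapq.heappush(queue, (dist + weight, neighbour, path + [neighbour]))
    raise ValueError('target is not reachable from source')

# Hypothetical toy network: each node maps to {neighbour: edge weight}.
adjacency = {
    'a': {'b': 1.0, 'c': 4.0},
    'b': {'a': 1.0, 'c': 2.0},
    'c': {'a': 4.0, 'b': 2.0, 'd': 1.0},
    'd': {'c': 1.0},
}
distance, path = dijkstra(adjacency, 'a', 'd')
print(distance, path)
```

The direct edge from `a` to `c` costs 4.0, so going via `b` (1.0 + 2.0) is cheaper and the route passes through `b`.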

For the purposes of this exercise we shall be creating a rectangular grid using the functions below.
This rectangular grid differs from the built-in grid in that the node id is an integer rather than a tuple pair.

In [None]:
def get_node_id(r, c, w):
    return w * r + c 

def make_2d_grid(h, w):
    pos = {}  # maps each node id to (x, y) coordinates to assist drawing
    g = nx.Graph()
    for r in range(h):
        for c in range(w):
            if c != w-1: 
                g.add_edge(get_node_id(r, c, w), get_node_id(r, c+1, w), weight=1.0, length=1.0, color='black')
            if r != h-1: 
                g.add_edge(get_node_id(r, c, w), get_node_id(r+1, c, w), weight=1.0, length=1.0, color='black')
            pos[get_node_id(r, c, w)]= (-1 + (c * 2 / w), 1 - (r * 2 / h))
            g.nodes[get_node_id(r, c, w)]['color'] = 'red'
    return g, pos

Using the `make_2d_grid` function, create a 3x3 grid `r` and its associated position information (`pos`).
Passing `pos` to `draw` will improve the visual layout of the plotted graph.

In [None]:
r, pos = make_2d_grid(3,3)

node_colors, edge_colors = obtain_colors(r)

nx.draw(r, pos=pos, with_labels=True, font_weight='bold', edge_color=edge_colors, node_color=node_colors, font_color='white')

We can use the `dijkstra_path` function to calculate the shortest path between nodes 3 and 8. The name of the
edge attribute that holds the cost, here `weight`, is passed to the `weight` keyword argument.

In [None]:
path = nx.dijkstra_path(r, source=3, target=8, weight="weight")
path

Does the returned path connect nodes 3 and 8?
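
One way to check this programmatically is sketched below. The grid is rebuilt here from the built-in generator as an assumed stand-in for `r`, with unit weights on every edge.

```python
import networkx as nx

# A 3x3 grid with integer node ids 0..8 and unit edge weights (a stand-in for r).
grid = nx.convert_node_labels_to_integers(nx.grid_2d_graph(3, 3), ordering='sorted')
nx.set_edge_attributes(grid, 1.0, 'weight')

path = nx.dijkstra_path(grid, source=3, target=8, weight='weight')
length = nx.dijkstra_path_length(grid, source=3, target=8, weight='weight')

# The path should start and end at the right nodes, and every step should be an edge.
starts_and_ends = path[0] == 3 and path[-1] == 8
is_connected_walk = all(grid.has_edge(u, v) for u, v in zip(path, path[1:]))
print(starts_and_ends, is_connected_walk, length)
```

Several routes of equal length exist between nodes 3 and 8, so the exact list of nodes returned may differ from one of the alternatives you would pick by hand.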

Next we define a function to color the found path.

In [None]:
def color_path(g, path, color='blue'):
    res = g.copy()
    first = path[0]
    for node in path[1:]:
        res.edges[first, node]['color'] = color
        first = node
    return res

We then use the function to apply the colors to the graph and plot:

In [None]:
r_new = color_path(r, path)
node_colors, edge_colors = obtain_colors(r_new)

nx.draw(r_new, pos=pos, with_labels=True, font_weight='bold', edge_color=edge_colors, node_color=node_colors, font_color='white')

A digraph (directed graph) is a graph in which each edge is directed from a start node to an end node.

We use the `to_directed` method on the rectangular grid `r` created in the previous code cells to create a new
digraph `d`.

In [None]:
d = r.to_directed()

node_colors, edge_colors = obtain_colors(d)
nx.draw(d, pos=pos, with_labels=True, font_weight='bold', edge_color=edge_colors, font_color='white')

We use the `remove_edge` method to remove the edge directed from node 4 to node 5, then plot the graph again.
Notice that only one arrow, from node 5 to node 4, now remains between nodes 4 and 5.

In [None]:
d.remove_edge(4,5) 

In [None]:
node_colors, edge_colors = obtain_colors(d)

nx.draw(d, pos=pos, with_labels=True, font_weight='bold', edge_color=edge_colors, font_color='white')
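
The effect of `remove_edge` on a digraph can be confirmed on a two-node example:

```python
import networkx as nx

d = nx.Graph([(4, 5)]).to_directed()  # one undirected edge becomes two directed edges
edges_before = sorted(d.edges)

d.remove_edge(4, 5)                   # removes only the 4 -> 5 direction
edges_after = sorted(d.edges)
print(edges_before, edges_after)
```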

Now we recalculate the shortest path using this new grid:

In [None]:
path_1 = nx.dijkstra_path(d, source=3, target=8, weight="weight")
path_1

We now color the graph and path. However, the result is no longer ideal: `color_path` colors only the edge in the
direction of travel, so the opposing edge keeps its original color.

In [None]:
d_1 = color_path(d, path_1)

node_colors, edge_colors = obtain_colors(d_1)

nx.draw(d_1, pos=pos, with_labels=True, font_weight='bold', edge_color=edge_colors, node_color=node_colors, font_color='white')
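
One possible remedy, sketched here under the assumption that both directions of a link should share the path color, is to color the reverse edge as well whenever it exists. The function `color_path_both_ways` is a hypothetical variant of `color_path`, and the small digraph is made up for the example.

```python
import networkx as nx

def color_path_both_ways(graph, path, color='blue'):
    """Hypothetical variant of color_path that also colors any reverse edge."""
    res = graph.copy()
    for u, v in zip(path, path[1:]):
        res.edges[u, v]['color'] = color
        if res.has_edge(v, u):  # the reverse edge may have been removed
            res.edges[v, u]['color'] = color
    return res

d = nx.Graph([(3, 4), (4, 7), (7, 8)]).to_directed()
nx.set_edge_attributes(d, 'black', 'color')
d.remove_edge(8, 7)  # make one link one-way so that its reverse edge is missing

colored = color_path_both_ways(d, [3, 4, 7, 8])
print(colored.edges[3, 4]['color'], colored.edges[4, 3]['color'])
```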

# Exercise 34

Create a 5 by 5 hexagonal graph, then compute and color the shortest path from node 2 to node 20. 

## Network Analysis of Integrated Transport Network

For the final step we shall perform a Dijkstra Analysis on the road network of Mersea Island.

The ITN data has been extracted from a GML file that is downloadable from Edina Digimap. Having been cleaned up, it
has been saved as a JSON file for convenient reading in this exercise. Open the JSON file in a text editor to view
the information that has been extracted.

The information has been divided up into four sections:

- `roadlinks` - a dictionary of roadlinks indexed by feature ID containing:
    - 'start' - the feature ID of the start node;
    - 'end' - the feature ID of the end node;
    - 'natureOfRoad' - e.g. Single Carriageway;
    - 'descriptiveTerm' - e.g. Local Street;
    - 'length' - the length of the roadlink between nodes;
    - 'coords' - a linestring of BNG coordinates from start node to end node.
- `roadnodes` - a dictionary of roadnodes indexed by feature ID containing:
    - 'coords' - the BNG coordinate of the node.
- `road` - a dictionary of "roads" indexed by feature ID containing:
    - 'Primary' - True if the road is a primary route;
    - 'roadName' - the name of the road e.g. "M25" or "Gower Street";
    - 'links' - the feature IDs of roadlinks that make up the road.
- `routeinfo` - a dictionary of one-way routes indexed by feature ID containing:
    - 'oneway' - a list of tuples containing the feature ID of a roadlink and a direction indicator:
        - `'+'` indicates that the direction of permitted travel is from start node to end node of the roadlink;
        - `'-'` indicates that the direction of permitted travel is from end node to start node of the roadlink.
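
As a rough sketch of how the direction indicators could be used, the snippet below builds a directed graph from a miniature dictionary. The dictionary only imitates the structure described above: all ids are made up, the exact nesting of `routeinfo` is an assumption, and the real file should be inspected before relying on it.

```python
import networkx as nx

# Hypothetical miniature of the structure described above; all ids are made up.
toy_itn = {
    'roadlinks': {
        'link1': {'start': 'nodeA', 'end': 'nodeB', 'length': 120.0},
        'link2': {'start': 'nodeB', 'end': 'nodeC', 'length': 80.0},
    },
    'routeinfo': {
        'route1': {'oneway': [['link2', '-']]},  # travel only from end to start
    },
}

# Collect the permitted direction of each one-way link.
one_way = {}
for route in toy_itn['routeinfo'].values():
    for link_fid, direction in route['oneway']:
        one_way[link_fid] = direction

dg = nx.DiGraph()
for fid, link in toy_itn['roadlinks'].items():
    direction = one_way.get(fid)  # None means the link is two-way
    if direction != '-':          # two-way or '+': start -> end is allowed
        dg.add_edge(link['start'], link['end'], fid=fid, weight=link['length'])
    if direction != '+':          # two-way or '-': end -> start is allowed
        dg.add_edge(link['end'], link['start'], fid=fid, weight=link['length'])

print(sorted(dg.edges))
```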

In [None]:
import os
import json 

mersea_itn_json = os.path.join('8 - Material', 'itn', 'mersea_itn.json')
with open(mersea_itn_json, 'r') as f:
    mersea_itn=json.load(f)

We now create the graph from the dictionary loaded from the JSON file. For this exercise we shall be creating a
simple undirected graph. Any parallel edge will be overwritten as the graph is built up.

In [None]:
g = nx.Graph()
road_links = mersea_itn['roadlinks']
for fid, link in road_links.items():
    g.add_edge(link['start'], link['end'], fid=fid, weight=link['length'])
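
To see what overwriting a parallel edge means in practice, the sketch below adds two made-up links between the same pair of nodes to both a `Graph` and a `MultiGraph`:

```python
import networkx as nx

# Hypothetical links: two different roads joining the same pair of nodes.
links = {
    'link1': {'start': 'a', 'end': 'b', 'length': 50.0},
    'link2': {'start': 'a', 'end': 'b', 'length': 30.0},
}

simple = nx.Graph()
multi = nx.MultiGraph()
for fid, link in links.items():
    simple.add_edge(link['start'], link['end'], fid=fid, weight=link['length'])
    multi.add_edge(link['start'], link['end'], fid=fid, weight=link['length'])

print(simple.number_of_edges(), simple.edges['a', 'b']['fid'])
print(multi.number_of_edges())
```

In the simple graph the link added last wins, which is not necessarily the shorter one. A `MultiGraph` keeps both, at the cost of slightly more involved edge access.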

We now inspect the graph.

Note: This plot is unlikely to be meaningful, because no node positions are supplied and the layout is therefore
not geographic.

In [None]:
nx.draw(g, node_size=1)

We shall now find the shortest path between two arbitrarily chosen nodes on this island road network.

In [None]:
start = 'osgb5000005124619786'
end = 'osgb4000000029329827'

path = nx.dijkstra_path(g, source=start, target=end, weight='weight')
path
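
If the two nodes are not connected, `dijkstra_path` raises an exception rather than returning a path. A minimal sketch of guarding against this, using a made-up graph with two separate components:

```python
import networkx as nx

g = nx.Graph()
g.add_edge('a', 'b', weight=1.0)
g.add_edge('c', 'd', weight=1.0)  # a separate component, unreachable from 'a'

reachable = nx.has_path(g, 'a', 'd')
try:
    path = nx.dijkstra_path(g, source='a', target='d', weight='weight')
except nx.NetworkXNoPath:
    path = None                   # no route exists between the two nodes

print(reachable, path)
```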

We will use the `color_path` function that we created earlier to color the graph network and then plot it:

In [None]:
g_1 = color_path(g, path, 'red')

node_colors, edge_colors = obtain_colors(g_1)

nx.draw(g_1, node_size=1, edge_color=edge_colors, node_color=node_colors)

The final step of this part is to create a GeoDataFrame of the shortest path and then display it on top of a raster.
We shall be using the following packages and a background map of Mersea island.

In [None]:
import rasterio
import numpy as np
import geopandas as gpd
import matplotlib.pyplot as plt
from cartopy import crs
from shapely.geometry import LineString

mersea_background = os.path.join('8 - Material', 'oml', 'oml-raster_2683809.tif')

The first step is to iterate through each of the nodes on the calculated shortest path. We assign the first node to
the variable `first_node`. Then, starting with the second node, we find the `fid` of the road link that connects
`first_node` and `node`. Knowing the roadlink `fid`, we can find the coordinates and make a `shapely` LineString
object. The final step of each iteration is to set `first_node` to `node` so that it can be used in the next
iteration.

On each iteration we append the feature id and the geometry to two lists, `links` and `geom`, which are used to build
the `shortest_path_gpd` GeoDataFrame.

In [None]:
links = [] # this list will be used to populate the feature id (fid) column
geom  = [] # this list will be used to populate the geometry column

first_node = path[0]
for node in path[1:]:
    link_fid = g.edges[first_node, node]['fid']
    links.append(link_fid)
    geom.append(LineString(road_links[link_fid]['coords']))
    first_node = node

shortest_path_gpd = gpd.GeoDataFrame({'fid': links, 'geometry': geom})

Let's now check what this route looks like:

In [None]:
shortest_path_gpd.plot()

In order to view the route, load the background map and apply the colormap to the array.

In [None]:
background = rasterio.open(mersea_background)
back_array = background.read(1)
palette = np.array([value for key, value in background.colormap(1).items()])
background_image = palette[back_array]
bounds = background.bounds
extent = [bounds.left, bounds.right, bounds.bottom,  bounds.top]
display_extent = [bounds.left+200, bounds.right-200, bounds.bottom+600, bounds.top-600]

In [None]:
fig = plt.figure(figsize=(3,3), dpi=300)
ax = fig.add_subplot(1,1,1, projection=crs.OSGB())

ax.imshow(background_image, origin='upper', extent=extent, zorder=0)

shortest_path_gpd.plot(ax=ax, edgecolor='blue', linewidth=0.5, zorder=2)

ax.set_extent(display_extent, crs=crs.OSGB())

# Exercise 35

Compute and show the shortest path between the most southerly location and the most northerly location on
Mersea Island.