![Coverpage](Coverpage.png)

In [None]:
github link

__https://github.com/oscarhogan/Assignment2__

__Lab 5__

In [None]:
import pickle

with open('network_analysis/github_users.p', 'rb') as f:
    G = pickle.load(f)
G

In this instance we are importing a graph of interactions between certain GitHub users, stored under the label 'github_users.p'. This data will then be subsetted again to create a graph of the network between a small series of GitHub users, with each user represented by a node and each interaction by an edge between nodes.

In [None]:
print(len(G))
print(type(G.nodes()))
print(list(G.edges(data=True))[-1])
print(list(G.nodes(data=True))[0]) 
print(type(list(G.edges(data=True))[-1][2]))

The function 'print(len(G))' gives the size of the graph associated with the data from GitHub, that is, its number of nodes, while 'print(type(G.nodes()))' gives the data type of the node view. The print function is also used to show the last edge and the first node together with their attributes, and to confirm that the attributes of an edge are held in a dictionary.

In [None]:
list(G.degree)
list(G.edges)

Listed are the degrees and edges in the 'github_users.p' dataset. Each node represents a GitHub user and its degree is the number of edges attached to it, with the edges being indicative of interactions between users, presumably collaborations between GitHub accounts such as pushes and pulls. This is done using the basic list function.

In [None]:
import networkx as nx
import matplotlib.pyplot as plt
import pickle

with open('network_analysis/github_users.p', 'rb') as f:
    G = pickle.load(f)

degree_centrality = nx.degree_centrality(G)
list(degree_centrality)

for node, centrality in degree_centrality.items():
    print(f"Node {node}: Degree Centrality = {centrality:.3f}")

Degree centrality is a measure used to determine the relative importance of nodes within a graph. It is the number of neighbours a node has divided by the total number of neighbours it could have, which is every other node in the graph. In this case, given the sheer number of nodes, the degree centrality values are small, being overwhelmingly concentrated below 0.001. This is demonstrated by the histogram created with the imported matplotlib package: the values are largely situated in the first bin, with a slim proportion of outliers reaching degree centrality values above 0.005.

In [None]:
plt.hist(degree_centrality.values(), bins=10, alpha=0.7, color='orange')
plt.title('Degree Centrality Histogram')
plt.xlabel('Degree Centrality')
plt.ylabel('Frequency')
plt.show()

![a](a.png)
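
To make the ratio concrete, here is a minimal sketch of the same calculation in plain Python on a small made-up graph. The node labels and the `toy_degree_centrality` helper are hypothetical and only for illustration; the normalisation by n - 1 is the one NetworkX documents for `degree_centrality`.

```python
# Toy undirected graph as an adjacency mapping (hypothetical user labels).
adjacency = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a", "d"},
    "d": {"a", "c"},
    "e": set(),  # an isolated user with no interactions
}


def toy_degree_centrality(adj):
    """Return degree / (n - 1) for every node in the adjacency mapping."""
    n = len(adj)
    if n <= 1:
        # With a single node there are no possible neighbours to divide by.
        return {node: 1.0 for node in adj}
    return {node: len(neighbours) / (n - 1) for node, neighbours in adj.items()}


toy_centrality = toy_degree_centrality(adjacency)
for node, value in sorted(toy_centrality.items()):
    print(f"Node {node}: Degree Centrality = {value:.3f}")
```

With five nodes each node has four possible neighbours, so node 'a' with three neighbours scores 3 / 4. In a graph with thousands of nodes the denominator is far larger, which is why the GitHub values are so small.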

Below I am creating a new edge selection made up solely of the edges attached to the nodes 'u1', 'u10' and 'u3', under the label 'Goat'. A small subselection built around only three distinct nodes allows for graphs demonstrating the relative centrality of each, based on the number of edges linking them to nodes outside the original selection.

In [None]:
Goat=G.edges(["u1","u10","u3"])

Goat
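
Passing a list of nodes to 'G.edges' returns the edges attached to those nodes. A rough plain-Python equivalent, using a hypothetical edge list in place of the real GitHub data, might look like this:

```python
# Hypothetical edge list standing in for the edges of G.
all_edges = [
    ("u1", "u2"),
    ("u3", "u4"),
    ("u5", "u6"),
    ("u10", "u3"),
    ("u7", "u8"),
]
selected = {"u1", "u10", "u3"}

# Keep every edge that has at least one endpoint in the selection.
incident_edges = [(u, v) for u, v in all_edges if u in selected or v in selected]
print(incident_edges)
```

Edges between two unselected users, such as ('u5', 'u6'), are dropped, which is why the subgraph built from the selection is so much smaller than the full graph.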

In [None]:
Gh_sub = nx.DiGraph()
len(Gh_sub)

In [None]:
Gh_sub.add_edges_from(Goat) 
len(degree_centrality)
list(degree_centrality)

Visualised is a figure showing the edges attached to the three subsetted nodes 'u3', 'u1' and 'u10'. The graph shows that the three nodes in question have differing degrees of centrality, with the node 'u3' being responsible for the vast majority of edges represented in the figure. 'u1' has only a few edges, whereas 'u10' holds a very minor degree of centrality, having only two neighbouring nodes.

In [None]:
plt.figure(figsize=(6, 6))
nx.draw(Gh_sub, with_labels=True)
plt.show()

![b](b.png)

The next step is to import the nxviz package to create a sequence of graphs giving alternate visualisations of the relationship between the subsetted nodes. The MatrixPlot clearly demonstrates the fact that most of the edges are associated with a single node, whereas the other subsetted nodes are of relatively diminished importance.

In [None]:
import nxviz as nv
from datetime import datetime, date

nv.MatrixPlot(Gh_sub)
plt.show()

![c](c.png)
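
The idea behind a matrix plot can be sketched without nxviz: each filled cell marks an edge from the row node to the column node, so a dominant node shows up as a densely filled row. The edge list below is made up for illustration and is not taken from the GitHub data.

```python
# Hypothetical directed edges: one hub node and one minor node.
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("x", "a")]

# Fix a node order so that rows and columns line up.
toy_nodes = sorted({node for edge in toy_edges for node in edge})
index = {node: i for i, node in enumerate(toy_nodes)}

# Build the adjacency matrix: 1 where an edge runs from the row node to the column node.
matrix = [[0] * len(toy_nodes) for _ in toy_nodes]
for u, v in toy_edges:
    matrix[index[u]][index[v]] = 1

# The row sums show how many edges leave each node.
row_sums = {node: sum(matrix[index[node]]) for node in toy_nodes}

for node in toy_nodes:
    print(f"{node:>3}", matrix[index[node]])
```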

Generating an ArcPlot perhaps most pertinently showcases the relationship between the subsetted nodes and their degree centrality with a single node being responsible for the overwhelming majority of edges.

In [None]:
nv.ArcPlot(Gh_sub)
plt.show()

![d](d.png)

The CircosPlot mirrors this, but in this instance is perhaps a less useful visualisation method, given that one node holds a degree centrality far in excess of the others, which reduces readability.

In [None]:
nv.CircosPlot(Gh_sub)
plt.show()

![e](e.png)

__Challenge 2__

In [None]:
import osmnx as ox
from IPython.display import Image

In [None]:
O = ox.graph_from_place("Oxford", network_type="drive")
fig, ax = ox.plot_graph(O)

![f](f.png)

In [None]:
M = ox.utils_graph.get_undirected(O)
D = ox.utils_graph.get_digraph(O)
gdf_nodes, gdf_edges = ox.graph_to_gdfs(O)

For further analysis it is advantageous to convert our MultiDiGraph output into GeoDataFrames, such that the nodes are given point geometries in accordance with the CRS EPSG:4326.

In [None]:
gdf_nodes.head()

In [None]:
O_proj = ox.project_graph(O)
nodes_proj, edges_proj = ox.graph_to_gdfs(O_proj, edges=True, nodes=True)
graph_area_o = nodes_proj.unary_union.convex_hull.area
graph_area_o

In this instance I identify the area of the graph in square metres, labelling it as 'graph_area_o', by using the spatial extent of the nodes. More precisely, it is the area of the convex hull drawn around the projected node points.
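
As a rough illustration of what the convex hull area represents, the sketch below computes it by hand for a few made-up projected coordinates in metres. The helper names and the coordinates are assumptions for illustration; the real calculation is done by the geometry library behind geopandas.

```python
def cross(o, a, b):
    """Z component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in anticlockwise order (monotone chain method)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon using the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical projected node coordinates in metres, one of them interior.
node_xy = [(0, 0), (1000, 0), (1000, 500), (0, 500), (400, 250)]
hull = convex_hull(node_xy)
area_m2 = polygon_area(hull)
print(hull, area_m2)
```

The interior point does not affect the result, which is why the measure describes the outer extent of the network rather than its density.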

In [None]:
ox.basic_stats(O_proj, area=graph_area_o, clean_int_tol=15)

In this instance I am generating a more presentable schematic diagram of the road network in Oxford including some additional features. The figure size is set to 6 by 6 inches for ease of observation, while the edge colour is specified as green, providing a nice contrast with the dark background. Additionally I've chosen to keep nodes invisible (a node size of 0) to give a clean appearance while increasing the edge line width to 1.5, maximising legibility. In the same cell I've also added edge speeds and travel times to 'O', with the intention of demonstrating travel times between a set of specified coordinate locations, as in navigation applications such as Google Maps.

In [None]:
place = {"city": "Oxford","country": "UK"}
O = ox.graph_from_place(place, network_type="drive", truncate_by_edge=True)
fig, ax = ox.plot_graph(O, figsize=(6, 6), node_size=0, edge_color="green", edge_linewidth=1.5)

O = ox.speed.add_edge_speeds(O)
O = ox.speed.add_edge_travel_times(O)

![g](g.png)
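
The travel time attached to each edge follows from its length in metres and its speed in kilometres per hour. A minimal sketch of that conversion is given below; the function name and the example values are hypothetical.

```python
def travel_time_seconds(length_m, speed_kph):
    """Seconds needed to cover length_m metres at speed_kph kilometres per hour."""
    metres_per_second = speed_kph * 1000 / 3600
    return length_m / metres_per_second


# A hypothetical 500 m street driven at 30 km/h takes about a minute.
print(round(travel_time_seconds(500, 30), 1))
```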

Here I identify two sets of coordinates on the O graph and find the nearest node to each. One is designated as the origin and the other as the destination.

In [None]:
orig = ox.distance.nearest_nodes(O, X=-1.210433, Y=51.733963)   
dest = ox.distance.nearest_nodes(O, X=-1.274409, Y=51.750201) 

Then, using the shortest_path method and applying the travel_time weight, I can calculate the optimum path between a pair of specified points in Oxford. The outcome is given below, with the red line showing, in a rudimentary sense, the optimum pathway. It should be clarified that the route does not consider specific traffic regulations associated with the edges.

In [None]:
route = ox.shortest_path(O, orig, dest, weight="travel_time")
fig, ax = ox.plot_graph_route(O, route, node_size=0)

![h](h.png)
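
For readers curious about what happens behind the scenes, weighted shortest paths of this kind are typically found with Dijkstra's algorithm. The sketch below is a simplified stand-alone version on a made-up road graph whose weights are travel times in seconds; the function name and the graph are assumptions for illustration, not the OSMnx implementation.

```python
import heapq


def dijkstra_route(graph, origin, destination):
    """Return (path, total_cost) for the cheapest route, or (None, inf) if unreachable."""
    best = {origin: 0.0}
    previous = {}
    queue = [(0.0, origin)]
    visited = set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == destination:
            break
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if destination not in best:
        return None, float("inf")
    # Walk backwards from the destination to rebuild the route.
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[destination]


# Hypothetical one-way streets with travel times in seconds.
toy_roads = {
    "A": {"B": 60.0, "C": 20.0},
    "B": {"D": 30.0},
    "C": {"B": 25.0, "D": 90.0},
    "D": {},
}
toy_route, toy_seconds = dijkstra_route(toy_roads, "A", "D")
print(toy_route, toy_seconds)
```

Here the quickest route detours via 'C' because the direct street from 'A' to 'B' is slow, which mirrors how a travel time weight can favour a longer but faster road.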

In [None]:
import networkx as nx
import matplotlib.pyplot as plt

In [None]:
degree_centrality = nx.degree_centrality(O)

for node, centrality in degree_centrality.items():
    print(f"Node {node}: Degree Centrality = {centrality:.3f}")

Betweenness centrality is a slightly more elaborate method of graph analysis, founded on shortest paths rather than on a node's adjacent neighbours alone. It measures the frequency with which a given node is situated on the shortest possible pathway between pairs of other nodes. Thus a greater degree of variation is anticipated, based on an understanding that Oxford has a rather typical urban form in which many key routes converge toward a central point which funnels traffic circulation. As such, nodes toward this central location ought to have a higher betweenness centrality. The code layout is borrowed largely from that used prior, with the 'betweenness_centrality' method replacing 'degree_centrality'.

In [None]:
betweenness_centrality = nx.betweenness_centrality(O)

for node, centrality in betweenness_centrality.items():
    print(f"Node {node}: Betweenness Centrality = {centrality:.3f}")
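
To show how the measure behaves, here is a minimal sketch of Brandes' algorithm (the algorithm NetworkX cites for this function) for a small unweighted, undirected graph. The junction labels and the `toy_betweenness` helper are hypothetical, and the sketch ignores edge weights and parallel edges, both of which the real street network has.

```python
from collections import deque


def toy_betweenness(adj):
    """Normalised betweenness centrality for an unweighted, undirected graph."""
    scores = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search counting shortest paths from the source.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies, working back from the farthest nodes.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                scores[w] += delta[w]
    n = len(adj)
    if n > 2:
        # Every ordered pair of other nodes is counted, so divide by (n - 1)(n - 2).
        scale = 1 / ((n - 1) * (n - 2))
        scores = {v: s * scale for v, s in scores.items()}
    return scores


# Hypothetical junctions: 'c' is a central junction, 'a' and 'b' are also linked directly.
toy_junctions = {
    "c": ["a", "b", "d", "e"],
    "a": ["c", "b"],
    "b": ["c", "a"],
    "d": ["c"],
    "e": ["c"],
}
toy_bc = toy_betweenness(toy_junctions)
for node, value in sorted(toy_bc.items()):
    print(f"Node {node}: Betweenness Centrality = {value:.3f}")
```

The central junction scores highly because almost every trip between the other junctions passes through it, whereas the outer junctions score zero. This is the pattern anticipated for central Oxford.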


Here we are using the plot_figure_ground method to create a nice iconographic map of the central portion of Oxford. This could see use as symbology for an application or dashboard. Before running the main code we need to alter a few settings which determine the size, the resolution and the format the outcome is saved in. In this instance the outcome is to be saved in my images folder as a 'png' file. The display size is set to 240 pixels and a resolution of 40 dots per inch is given.

In [None]:
img_folder = "images"
extension = "png"
size = 240
dpi = 40

This code is responsible for generating the outcome itself. A limited degree of rationale has been applied in determining the street widths. In acknowledgment of Oxford's largely pedestrianised nature I've increased the width of edges not open to vehicular traffic, as they are prolific in the central area. The given point value is the centre of an area covering a core section of the high street. The image itself is rendered in accordance with the settings given prior, with the height and width both set to the aforementioned size value of 240 to give a square plot.

In [None]:
street_widths = {
    "footway": 0.9,
    "steps": 0.9,
    "pedestrian": 0.9,
    "path": 0.9,
    "track": 0.9,
    "service": 2,
    "residential": 3,
    "primary": 5,
    "motorway": 6,
}
place = "Oxford"
point = (51.751766, -1.260737)
fp = f"./{img_folder}/{place}.{extension}"
fig, ax = ox.plot_figure_ground(
    point=point,
    filepath=fp,
    network_type="all",
    street_widths=street_widths,
    dpi=dpi,
    save=True,
    show=False,
    close=True,
)
Image(fp, height=size, width=size)

![i](i.png)

In [None]:
weight = "length"
O = ox.graph_from_place("Oxford", network_type="drive")
orig = list(O.nodes)[0]
dest = list(O.nodes)[-1]
route = ox.shortest_path(O, orig, dest, weight=weight)

The aim in this instance is to produce an interactive map of the Oxford locale visualising the extent of the edges. This is achieved simply by creating a GeoDataFrame with nodes set to False such that only edges are returned. The outcome is labelled 'edges' and is passed to the explore method, with the cartodbpositron tiles being specified as they provide ease of observation.

In [None]:
edges = ox.graph_to_gdfs(O, nodes=False)
edges.explore(tiles="cartodbpositron")

![j](j.png)

A similar process is undertaken in this instance but in inverse, with edges set to False to prevent their visualisation. The same tile-set is used as prior for the sake of consistency. When plotting nodes it also helps to specify a radius size, with a value of 4 given so as to preserve visibility but also allow for distinction in areas where a large number of nodes are clustered.

In [None]:
nodes = ox.graph_to_gdfs(O, edges=False)
nodes.explore(tiles="cartodbpositron", marker_kwds={"radius": 4})

![k](k.png)

This final interactive map displays both the nodes and edges simultaneously while also visualising the betweenness centrality of each node, here weighted by edge length. Use of 'set_node_attributes' stores the values on the nodes under the name 'bc', allowing a third variable to be visualised on the spatial plane. Both edges and nodes are set to True, ensuring both are visualised, while in this instance I've changed the tileset to its dark counterpart as it gives greater contrast with the high betweenness centrality values, allowing an immediate distinction to be drawn. As hypothesised above, the higher betweenness centrality values are clustered toward a central location upon which many of the radial edges converge. Additionally, more peripheral nodes which occupy key points of convergence hold rather high values. One could see how analysis like this could be of paramount importance for transport planners, especially when considering road byelaws or routeway alterations.

In [None]:
nx.set_node_attributes(O, nx.betweenness_centrality(O, weight="length"), name="bc")
nodes, edges = ox.graph_to_gdfs(O, edges=True, nodes =True)
m = edges.explore(color="skyblue", tiles="cartodbdarkmatter")
bosh = nodes.explore(m=m, column="bc", marker_kwds={"radius": 4})
bosh

![l](l.png)

I have then gone ahead and saved the O graph onto my computer as a GeoPackage and as a GraphML file, so that it can be loaded in a different format. For example, below I load the gpkg file from my computer as a geopandas dataframe such that the node geometries could hypothetically be incorporated into a wider study involving other imported data.

In [None]:
ox.save_graph_geopackage(O, filepath="./data/mynetworkbest.gpkg")
ox.save_graphml(O, filepath="./data/mynetworkbest.graphml")

In [None]:
import pandas as pd
import geopandas as gpd

In [None]:
data = gpd.read_file("./data/mynetworkbest.gpkg")
data.head()

In [None]:
data.explore('bc')

![m](m.png)

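Intriguingly, only the nodes seem to have loaded in. Dragging the cursor over a node gives a 'street_count' value, indicative of the number of street segments meeting at that node, and the table also gives the betweenness centrality value ('bc') calculated earlier, so the network information has been preserved. The most likely explanation is that the GeoPackage stores the nodes and edges as separate layers and 'gpd.read_file' reads a single layer unless told otherwise, in which case the edges would need to be loaded separately by passing layer="edges".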

In [None]:
import osmnx as ox
place = "Oxford, UK"
tags = {"building": True}
oxfordbuld = ox.features_from_place(place, tags)
oxfordbuld.head()

The final portion of this challenge involves use of the 'plot_footprints' method, which gives a figure based on the distribution of features on OpenStreetMap. As the tags above request only features tagged as buildings, the outcome below maps the building footprints in Oxford, giving an interesting visualisation of the whole city.

In [None]:
fig, ax = ox.plot_footprints(oxfordbuld, figsize=(8, 6))

![n](n.png)