# Creating and visualizing our network graph
After all the work it took to arrive at our list of nodes and edges, this notebook will (I hope) seem surprisingly straightforward—though it will probably leave us with a lot of questions for potential future work.

## 1 - Reloading our data
Rather than running through the code from the last notebook all over again, we'll just reload the data from the .csv files we saved.

### 1a - Connect to Google Drive

In [None]:
from google.colab import drive
drive.mount('/gdrive')

### 1b - Get our nodes list
We'll open the file of nodes that you saved to your output directory and save the information in it to a simple dictionary.

In [None]:
import csv

node_labels = {}
#Uncomment the first "with open" line below and comment out the second if you
#didn't get to the point of saving data from the last notebook

# with open('/gdrive/MyDrive/rbs_digital_approaches_2021/data/2021_s1_d2_emergency_nodefile.csv', 'r') as nodefile :
with open('/gdrive/MyDrive/rbs_digital_approaches_2021/output/nodes.csv', 'r') as nodefile :
  nodereader = csv.DictReader(nodefile, delimiter=',', quotechar='"')
  for row in nodereader :
    node_id = int(row['id'])
    label = row['label']
    node_labels.setdefault(node_id, label)

for k, v in node_labels.items() :
  print(k, v)

### 1c - Get our edges table
Just as with our nodes file, except we'll save this information to a list of 3-tuples.

(Note that in both cells, I've converted numerical data to integers to avoid headaches later.)

In [None]:
edges = []

#Uncomment line 5 and comment out line 6 if you didn't get to the point of 
#saving data from the last notebook
# with open('/gdrive/MyDrive/rbs_digital_approaches_2021/data/2021_s1_d2_emergency_edgesfile.csv', 'r') as edgesfile :
with open('/gdrive/MyDrive/rbs_digital_approaches_2021/output/edges.csv', 'r') as edgesfile :
  edgesreader = csv.DictReader(edgesfile, delimiter=',', quotechar='"')
  for row in edgesreader :
    edges.append((int(row['from']), int(row['to']), int(row['weight'])))

for edge in edges :
  print(edge)
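
A small illustration of why the integer conversion matters: the `csv` module hands back every field as a string, and a node id of `'41'` would not be the same node as `41` once we build the graph. The sketch below uses a made-up, in-memory stand-in for `edges.csv` (the three rows are invented for illustration; only the column names match the real file).

```python
import csv
import io

# Hypothetical stand-in for edges.csv, with the same column names
sample_csv = io.StringIO('from,to,weight\n1,2,3\n2,3,1\n')

sample_edges = []
for row in csv.DictReader(sample_csv):
    # DictReader yields every value as a string, so convert explicitly
    sample_edges.append((int(row['from']), int(row['to']), int(row['weight'])))

print(sample_edges)
```

Each tuple follows the `(from, to, weight)` order that the weighted-edge functions in `networkx` expect.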

## 2 - Creating a network graph with `Networkx`
We import the `networkx` package, then create a new graph object (`G`). 

By using `Graph()`, I've created this as an "undirected" graph, that is, one that does not assume any kind of directionality in the relationship between the nodes: all relationships are reciprocal.

We then add our list of tuples for our edges with weights as weighted edges.

In [None]:
import networkx as nx

In [None]:
G = nx.Graph()
G.add_weighted_edges_from(edges)
#Need to connect William Bowyer to the one "Sr. and Jr. partnership"
G.add_weighted_edges_from([(41,42,1)])
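
If you'd like to see what "undirected" and "weighted" mean in practice, here is a minimal sketch on a toy graph (the node ids and weights are invented, not taken from our data). An edge added as `(1, 2)` can be looked up from either end, and the weight is stored as an attribute on the edge.

```python
import networkx as nx

toy = nx.Graph()
# Each tuple is (node, node, weight)
toy.add_weighted_edges_from([(1, 2, 3), (2, 3, 1)])

# Nodes are created automatically from the edge list
print(toy.number_of_nodes(), toy.number_of_edges())

# In an undirected graph the edge exists in both directions
print(toy.has_edge(1, 2), toy.has_edge(2, 1))

# The weight can be read starting from either endpoint
print(toy[2][1]['weight'])
```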

Believe it or not, we just created a network graph in two lines (plus, uh, however many thousands of lines of code are `Networkx`). `Networkx` can do some basic visualization of its network graphs using the `matplotlib` package, so let's just check to see that we do, in fact, have something.

In [None]:
from matplotlib.pyplot import figure
figure(figsize=(10, 10))
nx.draw_networkx(G, with_labels=False)

## 3 - Examining the network before visualizing it
Now, that's not much to look at, I'll grant you. We'll arrive at a more attractive visualization (and there's still more we could do, to be sure), but let's pause a minute before we do.

It's easy to be dazzled by visualizations of network graphs, but it's important to bear in mind that a visualization isn't the network, itself. It's just—well, a visualization of the network. We want to think about the underlying data that are being represented so that we can consider how to make our visualization provide real insight into those data.

Run the next cell, and consider just how sparse this information looks at first blush—and yet, in a real sense, *that's* the "network."

In [None]:
print(G.nodes(data=True))
print(G.edges(data=True))

### 3.a - Adding a little more information to the network
I don't know about you, but I don't feel like I'm very well equipped to draw great insight from a list of tuples, by themselves. Let's add a little bit more information to that blank dictionary that's attached to each node, beginning with the labels we brought in with our node list.

In [None]:
for id, label in node_labels.items() :
  G.add_node(id, id=id, label=label)

print(G.nodes(data=True))
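
It may not be obvious that calling `add_node()` on a node that already exists doesn't create a duplicate; it just updates that node's attribute dictionary. A quick sketch on a toy graph (the label is a made-up placeholder):

```python
import networkx as nx

toy = nx.Graph()
toy.add_edge(1, 2, weight=3)           # nodes 1 and 2 start with empty attribute dicts
before = dict(toy.nodes[1])

toy.add_node(1, label='Example, A.')   # hypothetical label; updates the existing node
after = dict(toy.nodes[1])

print(before, after, toy.number_of_nodes())
```

One thing to watch for, as far as I can tell: any node that appears in the edge list but not in the node list would end up without a `label`, and a later lookup like `data['label']` would raise a `KeyError` for it.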

### 3.b - Calculating network algorithms
`Networkx` can make a lot more sense of these tuples than I can. The package can calculate lots of different network metrics for us so that we can begin learning more about the nature of our network and the kinds of connections that we see.

**Note:** I do not consider myself very well-versed in network theory, generally, or in `networkx` specifically. I feel sure there are more elegant ways to get at and display some of these values than I've managed to figure out, but perhaps this can get us started.

#### 3.b.1 - Calculating degree
The first algorithm we'll use is "degree," which is simply a count of the number of connections a node has to any other nodes in the network. 

The first cell calculates the degree of all the nodes in the graph and then adds those values to the dictionary of attributes for each node (as we did with labels, above).

The one after that constructs a list of the degree values and the node labels, then prints them out in reverse sorted order, showing us which nodes in the graph have the greatest number of connections to other nodes.

In [None]:
degrees = dict(nx.degree(G))
nx.set_node_attributes(G, name='degree', values=degrees)
print(G.nodes(data=True))

In [None]:
show_degrees = []
for node, data in G.nodes(data=True) :
  show_degrees.append([data['degree'], data['label'], data['id']])
#Use a separate loop variable so we don't overwrite the show_degrees list
for entry in sorted(show_degrees, reverse=True) :
  print(entry)
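
Plain degree ignores the weights on our edges. `networkx` can also sum the weights of a node's edges if you pass `weight='weight'` to `degree()`, which is sometimes called weighted degree. Here is a sketch on an invented toy graph showing that the two measures can rank nodes differently:

```python
import networkx as nx

toy = nx.Graph()
toy.add_weighted_edges_from([(1, 2, 3), (1, 3, 1), (2, 3, 1), (3, 4, 1)])

plain = dict(toy.degree())                    # number of connections per node
weighted = dict(toy.degree(weight='weight'))  # sum of edge weights per node

# Node 3 has the most connections, but nodes 1 and 2 share the heaviest edge
print(sorted(plain.items(), key=lambda kv: kv[1], reverse=True))
print(sorted(weighted.items(), key=lambda kv: kv[1], reverse=True))
```

Whether the weighted version is more meaningful for our data depends on what we think the weights represent.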

#### 3.b.2 - Calculating betweenness centrality
Next we'll calculate betweenness centrality, which calculates how likely a given node is to fall on the shortest path between other nodes in the graph. Betweenness can help us to identify not just which nodes have the most connections, but which nodes are most likely to be in influential or important positions in the network.

(The next two cells take the same approach as the ones above, first calculating betweenness, then displaying betweenness measures for each node, sorted in reverse order.)


In [None]:
betweenness = dict(nx.betweenness_centrality(G))
nx.set_node_attributes(G, name='betweenness', values=betweenness)
print(G.nodes(data=True))

In [None]:
show_betweenness = []
for node, data in G.nodes(data=True) :
  # print(data['label'])
  show_betweenness.append([data['betweenness'], data['label'], data['id']])
#Use a separate loop variable so we don't overwrite the show_betweenness list
for entry in sorted(show_betweenness, reverse=True) :
  print(entry)
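
To get a feel for how betweenness differs from degree, it can help to try it on a graph small enough to reason about by hand. In this invented example, two triangles are joined only through node `D`. `D` has fewer connections than its neighbors `C` and `E`, but every path from one triangle to the other has to pass through it.

```python
import networkx as nx

toy = nx.Graph()
toy.add_edges_from([
    ('A', 'B'), ('A', 'C'), ('B', 'C'),   # first triangle
    ('E', 'F'), ('E', 'G'), ('F', 'G'),   # second triangle
    ('C', 'D'), ('D', 'E'),               # D bridges the two
])

bc = nx.betweenness_centrality(toy)
deg = dict(toy.degree())

# Node with the highest betweenness
top = max(bc, key=bc.get)
print(top, round(bc[top], 3), deg[top])
print('C', round(bc['C'], 3), deg['C'])
```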

#### 3.b.3 - Thinking about shortest paths
Betweenness is really about which nodes are most likely to connect other nodes—which nodes lie on the shortest path between other given nodes in the network. 

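`Networkx` allows us to take a look at shortest paths directly. The next two cells use two different functions related to measuring shortest paths. `shortest_path()` simply returns one shortest path, while `all_shortest_paths()` returns every path of that same minimal length. By default, both count the number of steps between nodes and ignore edge weights; weight only becomes a consideration if you pass a `weight` argument.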

I've had `Networkx` calculate the shortest path between two nodes (these happen to be G. Strahan and J. and J. Knapton, whose rankings on the above algorithms caught my eye, and whom I knew not to have been directly connected in any imprint statement in this set from 1730). The output of the first cell should come as no surprise. It seems like it might be interesting to note, though, that there are, in fact, several equally short paths between Strahan and the Knaptons in this graph.

I've included the ids for the nodes in the displays of degree and betweenness, above, so you might try plugging in other node ids to see what the shortest paths look like: how does your intuition about the meaning of somebody's betweenness centrality compare to what the shortest path calculations show?

In [None]:
print(nx.shortest_path(G, source=40, target=64))

In [None]:
print([p for p in nx.all_shortest_paths(G, source=40, target=64)])
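
Here is a sketch, on an invented four-node graph, of how the `weight` argument changes the answer. Without it, both routes from `a` to `c` are equally short; with it, `networkx` treats each weight as a cost and prefers the cheaper route.

```python
import networkx as nx

toy = nx.Graph()
toy.add_weighted_edges_from([
    ('a', 'b', 1), ('b', 'c', 1),   # light route
    ('a', 'd', 5), ('d', 'c', 5),   # heavy route
])

# Counting steps only: two equally short paths
hop_paths = sorted(nx.all_shortest_paths(toy, source='a', target='c'))
print(hop_paths)

# Treating weight as a cost: only the light route is shortest
weighted_path = nx.shortest_path(toy, source='a', target='c', weight='weight')
print(weighted_path)
```

A caution before trying this on our graph: if our weights count how often two names appear together, then a higher weight means a stronger tie, which is the opposite of a longer distance. In that case the weights would need to be inverted somehow before being used this way.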

#### 3.b.4 - Feel free to read the docs and try out some more
As I've said, I'm no expert on network theory or on `Networkx`. If you know about networks, you might find it interesting to look through the [documentation of `Networkx's` algorithms](https://networkx.org/documentation/stable/reference/algorithms/index.html), add a few code cells here, and see if you can calculate a different measure that you think might tell us something about this network from 1730.
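
As one possible starting point, here is a sketch of two other measures from the `networkx` documentation, run on an invented toy graph rather than on `G`: `density()`, which reports what fraction of all possible edges actually exist, and `connected_components()`, which finds groups of nodes that have no path to the rest of the graph.

```python
import networkx as nx

toy = nx.Graph()
# A triangle plus a separate pair of nodes
toy.add_edges_from([(1, 2), (2, 3), (3, 1), (4, 5)])

# 4 of the 10 possible edges among 5 nodes are present
density = nx.density(toy)
print(density)

# Each component is a set of nodes; sort them for a stable display
components = sorted(sorted(c) for c in nx.connected_components(toy))
print(components)
```

Swapping `toy` for `G` in a new cell should show whether our 1730 network is one connected whole or several separate pieces.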

## 4 - Okay, let's visualize the graph, already
We'll be using `Bokeh` to try to improve on the basic visualization we got with `matplotlib`, above. `Bokeh` is a *very* sophisticated visualization package for Python that I have made some shift to use for visualizing our network graph (with much Googling and use of StackOverflow, it must be said). I also benefited greatly from an [excellent guide](https://melaniewalsh.github.io/Intro-Cultural-Analytics/Network-Analysis/Making-Network-Viz-with-Bokeh.html) by Melanie Walsh (currently at Cornell University). Still, there seems to be *lots* more that `Bokeh` can do.

We'll create two visualizations. The second will be a variation on the first that will incorporate community detection to make the communities that `Networkx` detects in the graph visible through color.

The code for `Bokeh` visualization doesn't lend itself very well to being broken up into separate cells, so I've added comments to try to explain as best I'm able what each part of the code is doing.

**Note:** You may need to run the code in the cells twice to get the visualizations to display properly.

In [None]:
#Import several components of bokeh that will be used in both visualizations—
#not all components are necessary for this first one
import bokeh.io
from bokeh.io import show
bokeh.io.output_notebook()
from bokeh.plotting import figure, from_networkx
from bokeh.models import (BoxSelectTool, Circle, EdgesAndLinkedNodes, HoverTool,
                          MultiLine, NodesAndLinkedEdges, Plot, Range1d, TapTool, 
                          PanTool, WheelZoomTool)
#Spectral4 is needed for the selection and hover colors used below
from bokeh.palettes import Blues8, Reds8, Purples8, Oranges8, Viridis8, Spectral4, Spectral8
from bokeh.transform import linear_cmap

#Create a plot object with some basic settings. Note that we are including
#a toolbar and indicating that our graph should be scaled up on both axes
#to fit the available space
plot = figure(title='Bowyer 1730 network', 
              tools='', toolbar_location='above', sizing_mode='scale_both')

#Create a tool to display information about a node when the pointer hovers over
#it. 
node_hover_tool = HoverTool(tooltips=[('', '@label'),('id', '@id')])

#Add selected tools to our plot object: the node_hover_tool we just created,
#tools for selecting nodes by clicking or by dragging a box, plus tools for
#panning around the visualization and using the mouse's scroll wheel to zoom
#in and out
plot.add_tools(node_hover_tool, TapTool(), BoxSelectTool(), PanTool(), WheelZoomTool())

#Set attributes for sizing our nodes according to their degree. This has to
#happen before the graph is handed to Bokeh, because from_networkx copies the
#node attributes that exist at the moment it is called.
#This is a parameter that could probably be tweaked in an effort to 
#make a pleasing but still faithful visualization of the graph
node_size_attrs = {}
for node in G.nodes():
    node_size = G.degree[node] * .5
    node_size_attrs[node] = node_size
nx.set_node_attributes(G, node_size_attrs, 'node_size')

#Create a graph from the networkx graph we created
graph = from_networkx(G, nx.spring_layout, scale=3, center=(0,0))

#Render the nodes in the graph using the node_size attribute we created and a
#fixed fill color. Change the color of the nodes when they are selected or
#hovered over
graph.node_renderer.glyph = Circle(size="node_size", fill_color='cornflowerblue')
graph.node_renderer.selection_glyph = Circle(size="node_size", fill_color=Spectral4[2])
graph.node_renderer.hover_glyph = Circle(size="node_size", fill_color=Spectral4[1])

#Render the edges in the graph—same idea as the rendering of nodes, above.
#h/t: https://stackoverflow.com/a/49749123
graph.edge_renderer.data_source.data["line_width"] = [G.get_edge_data(a,b)['weight'] for a, b in G.edges()]
graph.edge_renderer.glyph = MultiLine(line_color="#CCCCCC", line_alpha=0.8, line_width={'field': 'line_width'})
graph.edge_renderer.selection_glyph = MultiLine(line_color=Spectral4[2], line_width={'field': 'line_width'})
graph.edge_renderer.hover_glyph = MultiLine(line_color=Spectral4[1], line_width={'field': 'line_width'})

#Set how the graph should behave when a node is selected or inspected—highlight
#the nodes and linked edges
graph.selection_policy = NodesAndLinkedEdges()
graph.inspection_policy = NodesAndLinkedEdges()

#Add our graph to the plot object
plot.renderers.append(graph)

#Show the plot object
show(plot)
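
The node sizing above simply multiplies degree by a constant. One alternative that might be worth trying is to rescale the degrees onto a fixed range of sizes, so that the least connected nodes stay visible and the most connected ones don't swamp the plot. The sketch below is just one way to do that; `scale_sizes`, `min_size`, and `max_size` are names I've made up for illustration, and the example degrees are invented.

```python
def scale_sizes(degrees, min_size=5, max_size=30):
    """Map raw degree counts onto a range of sizes (hypothetical helper)."""
    low, high = min(degrees.values()), max(degrees.values())
    span = (high - low) or 1  # avoid dividing by zero when all degrees match
    return {
        node: min_size + (deg - low) / span * (max_size - min_size)
        for node, deg in degrees.items()
    }

# Invented degrees for three nodes
example_degrees = {1: 1, 2: 3, 3: 11}
sizes = scale_sizes(example_degrees)
print(sizes)
```

The resulting dictionary could be passed to `nx.set_node_attributes(G, sizes, 'node_size')` in place of `node_size_attrs`, as long as that happens before `from_networkx` is called.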

### 4.a - With community detection
This will render the same graph, but adding a visualization of the communities that `Networkx` detects using its `community` algorithms module, which has to be imported separately.

In [None]:
#h/t https://melaniewalsh.github.io/Intro-Cultural-Analytics/Network-Analysis/Making-Network-Viz-with-Bokeh.html
from networkx.algorithms import community
communities = community.greedy_modularity_communities(G)

import bokeh.io
from bokeh.io import show
bokeh.io.output_notebook()
from bokeh.plotting import figure, from_networkx
from bokeh.models import (BoxSelectTool, Circle, EdgesAndLinkedNodes, HoverTool,
                          MultiLine, NodesAndLinkedEdges, Plot, Range1d, TapTool, 
                          PanTool, WheelZoomTool)
#Spectral4 is needed for the selection and hover colors used below
from bokeh.palettes import Blues8, Reds8, Purples8, Oranges8, Viridis8, Spectral4, Spectral8
from bokeh.transform import linear_cmap

plot = figure(title='Bowyer 1730 network', 
              tools='', toolbar_location='above', sizing_mode='scale_both')

node_hover_tool = HoverTool(tooltips=[('', '@label'),('id', '@id')])
plot.add_tools(node_hover_tool, TapTool(), BoxSelectTool(), PanTool(), WheelZoomTool())

#Handle styling of detected communities
#h/t https://melaniewalsh.github.io/Intro-Cultural-Analytics/Network-Analysis/Making-Network-Viz-with-Bokeh.html
# Create empty dictionaries
modularity_class = {}
modularity_color = {}
#Loop through each community in the network. The loop variable is called
#"members" so that it doesn't overwrite the imported community module
for community_number, members in enumerate(communities):
    #For each member of the community, add their community number and a color.
    #The modulo reuses colors if more than eight communities are detected,
    #rather than raising an IndexError
    for name in members:
        modularity_class[name] = community_number
        modularity_color[name] = Spectral8[community_number % len(Spectral8)]

# Add modularity class and color as attributes from the network above
nx.set_node_attributes(G, modularity_class, 'modularity_class')
nx.set_node_attributes(G, modularity_color, 'modularity_color')

#Name of the node attribute that holds each node's fill color
color_by_this_attribute = 'modularity_color'
#Pick a color palette: Blues8, Reds8, Purples8, Oranges8, Viridis8
#(not used below; the community colors come from Spectral8)
color_palette = Viridis8


node_size_attrs = {}
for node in G.nodes():
    node_size = G.degree[node] * .5
    node_size_attrs[node] = node_size
nx.set_node_attributes(G, node_size_attrs, 'node_size')

#Create the Bokeh graph only after all the node attributes are set, because
#from_networkx copies the attributes that exist at the moment it is called
graph = from_networkx(G, nx.spring_layout, scale=3, center=(0,0))

graph.node_renderer.glyph = Circle(size="node_size", fill_color=color_by_this_attribute)
graph.node_renderer.selection_glyph = Circle(size="node_size", fill_color=Spectral4[2])
graph.node_renderer.hover_glyph = Circle(size="node_size", fill_color=Spectral4[1])

graph.edge_renderer.data_source.data["line_width"] = [G.get_edge_data(a,b)['weight'] for a, b in G.edges()]
graph.edge_renderer.glyph = MultiLine(line_color="#CCCCCC", line_alpha=0.8, line_width={'field': 'line_width'})
graph.edge_renderer.selection_glyph = MultiLine(line_color=Spectral4[2], line_width={'field': 'line_width'})
graph.edge_renderer.hover_glyph = MultiLine(line_color=Spectral4[1], line_width={'field': 'line_width'})

graph.selection_policy = NodesAndLinkedEdges()
graph.inspection_policy = NodesAndLinkedEdges()

plot.renderers.append(graph)

show(plot)
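
If you're curious what `greedy_modularity_communities()` returns before it gets turned into colors, here is a sketch on an invented graph of two triangles joined by a single edge. The function returns a list of sets of nodes, one set per community, and we can flip that into a lookup from node to community number in the same way the cell above does.

```python
import networkx as nx
from networkx.algorithms import community

toy = nx.Graph()
toy.add_edges_from([
    (0, 1), (0, 2), (1, 2),   # first triangle
    (3, 4), (3, 5), (4, 5),   # second triangle
    (2, 3),                   # single edge joining them
])

found = community.greedy_modularity_communities(toy)
print([sorted(members) for members in found])

# Map each node to the number of the community it belongs to
membership = {
    node: number
    for number, members in enumerate(found)
    for node in members
}
print(membership)
```

On a graph this tidy the two triangles come out as the two communities; real data like ours is unlikely to divide so cleanly.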

## 5 - Next steps
As I suggested above, the visualizations here are just scratching the surface of what `Bokeh` can do. If you were preparing network graphs of this sort for a web-based project, you would probably end up using different tools. The [D3](https://d3js.org) JavaScript library might be one possibility to explore.

This notebook has only touched on the kinds of things that we might discover from network analysis, but should at least give us a starting point for discussion.