# Networks 

In this lab you'll do some exercises to familiarise yourself with network properties and concepts.

In [None]:
%matplotlib inline
!pip3 install networkx

In [None]:
import itertools
from collections import Counter
import networkx as nx
import matplotlib.pyplot as plt

**Exploring modules and classes in Python.** <br>
<p>
dir(obj) lists the attributes available on any object, such as a module, a class or a class instance. <br>
Attributes include both methods (functions) and variables.<br>
networkx is a module (a package), while Graph() and DiGraph() are classes defined inside it.
</p>
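
As a small illustration of `dir()`, the sketch below inspects a built-in list rather than networkx, so it runs without any extra packages. The variable names (`sample`, `public_names`, `methods`) are made up for this example.

```python
# dir() works on any object; a built-in list is used here as a stand-in.
sample = [3, 1, 2]

all_names = dir(sample)

# Names starting with an underscore are special or private by convention,
# so filtering them out leaves the attributes you would normally use.
public_names = [name for name in all_names if not name.startswith('_')]
print(public_names)

# callable() separates methods from plain data attributes.
methods = [name for name in public_names if callable(getattr(sample, name))]
print(methods)
```

The same filtering can be applied to `dir(nx)` or `dir(nx.Graph())` to shorten the long listings.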

Run the following cell to see which variables, functions and classes the networkx module provides:


In [None]:
dir(nx)

<p>OK, that's a lot of attributes. </p>

We will work with **Graph()** (undirected) and **DiGraph()** (directed) graph classes today. Let's see what attributes the Graph() class has:


In [None]:
graphA = nx.Graph()
dir(graphA)

We can try a few of these out to see how they work. Run the cell below.

In [None]:
print(graphA.is_directed())
print(graphA.size())

graphA.add_edge('A', 'B')
print(graphA.size())

Rather than printing out DiGraph's methods, the following cell will list the attributes which are **exclusive** to Graph or DiGraph.

In [None]:
graphA = nx.Graph()
graphB = nx.DiGraph()

graphA_attributes = set(dir(graphA))
graphB_attributes = set(dir(graphB))

# sorted() gives a stable, alphabetical listing (set ordering is arbitrary)
print('\nGraphA (Graph) exclusive attributes:')
print(sorted(graphA_attributes - graphB_attributes))

print('\nGraphB (DiGraph) exclusive attributes:')
print(sorted(graphB_attributes - graphA_attributes))


<br>We can see that the directed graph class has a few more methods than the undirected graph class, for example `successors` and `predecessors`.
<br>This is because a directed graph needs to record which way each edge points, rather than just whether an edge exists.
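
To make the difference concrete, here is a minimal sketch using two hypothetical two-node graphs (they are not the exercise graphs). It only uses methods that exist on networkx graph classes: `has_edge`, `successors`, `predecessors`, `in_degree` and `out_degree`.

```python
import networkx as nx

# Hypothetical graphs used only for illustration.
undirected = nx.Graph()
undirected.add_edge('A', 'B')

directed = nx.DiGraph()
directed.add_edge('A', 'B')  # the edge points from A to B

# An undirected edge is visible from both ends.
print(undirected.has_edge('B', 'A'))

# A directed edge only exists in the direction it was added.
print(directed.has_edge('B', 'A'))

# DiGraph-only methods expose the direction of each edge.
successors_of_a = list(directed.successors('A'))
predecessors_of_b = list(directed.predecessors('B'))
print(successors_of_a, predecessors_of_b)
print(directed.in_degree('B'), directed.out_degree('B'))
```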

<br>**Exercise 1(a):**

Do this exercise by hand, on paper. Given the undirected graph drawn below, write down the adjacency matrix.

<img src="img/small_graph_undirected.png">

**Exercise 1(b):**

Do this exercise by hand, on paper. Given the directed graph drawn below, write down the adjacency matrix.

<img src="img/small_graph_directed.png">
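
If you want to check your understanding of adjacency matrices first, the sketch below builds one in plain Python for a different, hypothetical graph: a three-node path X - Y - Z. The helper `build_adjacency_matrix` is invented for this example, and it assumes the common convention that rows are source nodes and columns are target nodes.

```python
def build_adjacency_matrix(nodes, edges, directed=False):
    """Return a list-of-lists adjacency matrix (rows = source, columns = target)."""
    index = {label: position for position, label in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
        if not directed:
            # An undirected edge is recorded in both directions,
            # which makes the matrix symmetric.
            matrix[index[target]][index[source]] = 1
    return matrix


# Hypothetical example graph, not one of the exercise graphs.
nodes = ['X', 'Y', 'Z']
edges = [('X', 'Y'), ('Y', 'Z')]

undirected_matrix = build_adjacency_matrix(nodes, edges)
directed_matrix = build_adjacency_matrix(nodes, edges, directed=True)

for row in undirected_matrix:
    print(row)
print()
for row in directed_matrix:
    print(row)
```

The undirected matrix is symmetric, while the directed one generally is not.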

**Exercise 2:**

Create the two graphs from Exercise 1 in networkx. Use the graph_object.add_edge() method to add edges.
<br>An example showing how to draw graphA, and how to display other representations of its data, is given below.

Define an **undirected** networkx graph object and add nodes/edges:

In [None]:
graphA = #???


Define a **directed** networkx graph object and add nodes/edges:

In [None]:
graphB = #???


<br>The following four cells show different representations of our graph:<br>

In [None]:
nx.draw_spring(graphA, with_labels=True, node_size=1200, node_color='#eeeeff', edge_color='red')

In [None]:
nx.adjacency_matrix(graphA)

In [None]:
print(nx.adjacency_matrix(graphA))

In [None]:
print(graphA.nodes())
# to_numpy_matrix was removed in networkx 3.0; to_numpy_array is its replacement
nx.to_numpy_array(graphA)


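<br>networkx hands graph data to 'scipy' and 'numpy', two popular Python libraries. `nx.adjacency_matrix` returns a scipy sparse matrix, which stores only the non-zero entries, while the last cell above produces a dense numpy representation.
<br>These libraries perform matrix and vector operations quickly and efficiently, which matters when our network gets very big!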
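
As one example of what the matrix form makes easy, the sketch below sums the rows of an adjacency matrix to recover node degrees. The star graph and the variable names are hypothetical. `nodelist` is a real parameter of `nx.to_numpy_array` that fixes the row and column order.

```python
import networkx as nx

# Hypothetical star graph: hub H connected to three leaves.
star = nx.Graph()
star.add_edges_from([('H', 'X'), ('H', 'Y'), ('H', 'Z')])

# Fix the node order so that rows and columns are predictable.
order = ['H', 'X', 'Y', 'Z']
adjacency = nx.to_numpy_array(star, nodelist=order)
print(adjacency)

# Each row holds a 1 for every edge at that node, so the row sum is the degree.
degrees_from_matrix = adjacency.sum(axis=1)
print(dict(zip(order, degrees_from_matrix)))
```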

Let's also check graphB to see if we created it correctly. Draw `graphB` and print its adjacency matrix (as above) in the following cell:

In [43]:
# your commands here.

# nx.draw_spring....
# nx.adjacency_matrix...


**Exercise 3:**

Complete the function below to find the degree distribution for any given graph. You can use the networkx method `graph.degree()`, which returns the number of edges connecting to each node. You should return a tuple of two lists: the first list contains all observed vertex degree values in the graph, and the second contains the counts showing how often a vertex with that degree was observed.

For instance, calling `degree_distribution()` on `graphA` above could return

```([1, 2, 3], [1, 2, 1])```

meaning that there is one vertex with degree 1 (D), two vertices with degree 2 (A and B), and one vertex with degree 3 (C).

These two lists will give us a handy form for plotting the degree distribution.
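
The notebook imports `Counter` from `collections`, which is one convenient way to tally repeated values. The sketch below applies it to a hypothetical list of numbers rather than to a graph, so turning node degrees into such a list is still left to you.

```python
from collections import Counter

# Hypothetical observations, unrelated to any graph in this lab.
observations = [2, 3, 2, 1, 2, 3]

# Counter maps each distinct value to the number of times it occurs.
tally = Counter(observations)

# Sorting the keys gives the values in increasing order;
# the second list holds the matching counts.
values = sorted(tally)
counts = [tally[value] for value in values]

print(values)
print(counts)
```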

In [None]:
# The networkx method `graph.degree()` returns the degree of a node:
graphA.degree('C')

In [None]:
def degree_distribution(graph):
    """
    For the networkx graph provided, return a tuple of lists, where
    the first list gives all observed vertex degrees, and the second list gives
    the corresponding vertex counts.
    """
    # Complete this function

Once you have this function, you can draw the degree distribution with a scatter plot:

In [None]:
# Graph A:
degrees, counts = degree_distribution(graphA)
fig, ax = plt.subplots()
ax.scatter(degrees, counts)

Here are some graphs of types described in lectures. You can generate other graph types with networkx functions described at https://networkx.github.io/documentation/stable/reference/generators.html

A random (Erdős-Rényi) graph:

In [None]:
# 600 nodes, probability of each edge 0.4
random_graph = nx.fast_gnp_random_graph(600, 0.4)
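
To see what "probability of each edge" means, here is a hand-rolled sketch of the same random graph model in plain Python. It is not how networkx implements `fast_gnp_random_graph`; the function `gnp_edges` and the smaller size of 200 nodes are assumptions made for this example. Each of the n(n-1)/2 possible edges is included independently with probability p, so the mean degree should be close to p(n-1).

```python
import itertools
import random


def gnp_edges(n, p, seed=None):
    """Include each possible undirected edge independently with probability p."""
    rng = random.Random(seed)
    return [pair for pair in itertools.combinations(range(n), 2)
            if rng.random() < p]


n, p = 200, 0.4
edges = gnp_edges(n, p, seed=1)

# Every edge contributes to the degree of two nodes.
mean_degree = 2 * len(edges) / n
expected_mean_degree = p * (n - 1)

print(len(edges), mean_degree, expected_mean_degree)
```

Because every node has roughly the same expected degree, the degree distribution of such a graph is concentrated around that mean.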

A scale-free graph:

In [None]:
# 600 nodes; note that this generator returns a directed multigraph (MultiDiGraph)
scale_free_graph = nx.scale_free_graph(600)

If you are finding the degree distribution correctly, you can plot the distributions for these different graph types:

In [None]:
degrees, counts = degree_distribution(random_graph)
fig, ax = plt.subplots()
ax.scatter(degrees, counts)

In [None]:
degrees, counts = degree_distribution(scale_free_graph)
fig, ax = plt.subplots()
ax.scatter(degrees, counts)

The plot for the scale-free graph is hard to read on linear axes, because the relationship shown in lectures only appears as a straight line on a log-log scale. Try using `ax.set_xscale('log')` and `ax.set_yscale('log')` on your plot to see this relationship more clearly.
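
The reason a log-log scale helps can be checked numerically. The sketch below uses idealised, made-up counts that follow an exact power law with a hypothetical exponent of 2.5 (real graphs are noisier). After taking logarithms, the slope between neighbouring points is constant, which is what a straight line means.

```python
import math

# Hypothetical, idealised power law: count = 1000 * degree ** -gamma.
gamma = 2.5
degrees = [1, 2, 4, 8, 16, 32]
counts = [1000 * degree ** -gamma for degree in degrees]

# Take logarithms of both axes, as a log-log plot does.
log_points = [(math.log10(degree), math.log10(count))
              for degree, count in zip(degrees, counts)]

# Slope between each pair of neighbouring points.
slopes = [(y2 - y1) / (x2 - x1)
          for (x1, y1), (x2, y2) in zip(log_points, log_points[1:])]

print(slopes)
```

Every slope equals minus the exponent, up to floating-point rounding.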

**Exercise 4:**

Complete the function below to implement the clustering coefficient calculation described in lectures. This function does exist in networkx, but don't use it; implement it yourself. You can, however, use the `graph.neighbors()` method from networkx to find all the neighbours of a given node.

You can check that your answer gives the same result as the networkx function `nx.clustering()`.
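
The definition given in the docstring below can also be written as a formula. The symbols are notation assumed here, not necessarily the notation used in lectures: $k_i$ is the number of neighbours of node $i$, and $e_i$ is the number of edges that exist between those neighbours.

$$
C_i = \frac{e_i}{\binom{k_i}{2}} = \frac{2\,e_i}{k_i\,(k_i - 1)}, \qquad k_i \ge 2
$$

For example, node C in `graphA` has degree 3, so there are $3 \cdot 2 / 2 = 3$ possible edges between its neighbours. If exactly one of them exists, then $C = 1/3 \approx 0.333$. The formula is undefined for nodes with fewer than two neighbours; `nx.clustering()` reports 0 in that case, so you may want your function to do the same.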

In [None]:
def clustering_coefficient(graph, node_label):
    """
    Calculate and return the clustering coefficient for a node in an undirected graph.
    The clustering coefficient is the number of edges between neighbors 
    divided by the possible number of edges between neighbors.
    """
    # Complete this function

In [None]:
# Should give 0.333333
clustering_coefficient(graphA, 'C')