### CS4423 - Networks
Angela Carnevale  
School of Mathematical and Statistical Sciences  
University of Galway

# Assignment 4

Provide answers to the problems in the boxes provided.  

The buttons at the top of the page can be used to **create
more boxes if needed**.
The type of box can be changed from `Code` to `Markdown`.
`Code` boxes take (and execute) `python` code.
`Markdown` boxes take (and format nicely) **text input**.
In this way, you can provide answers, ask questions, 
or raise issues, in words.

When finished, please print this notebook to a **PDF** file and submit it to
**Canvas**.

**Deadline** is Tuesday 26 March at 5pm.

## Setup

This is a `jupyter` notebook. You can open and interact
with the notebook through Binder,
or you can
install and use `jupyter` as a `python` package on your own laptop or PC.  

The following commands load the `networkx` package (together with `pandas` and `matplotlib`) into the current session
and specify some standard options that can be useful for drawing graphs.  

In order to execute the code in a box,
use the mouse or arrow keys to highlight the box and then press SHIFT-RETURN.

In [None]:
import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt
opts = { "with_labels": True, "node_color": 'y' }

Should it ever happen that the notebook becomes unusable, start again with a fresh copy.

## 1.  The Brain-of-a-Worm Network (30 marks)

The [connectome](https://en.wikipedia.org/wiki/Connectome) of an organism is a comprehensive map of
all connections between the neurons in its brain.   [*C. elegans*](https://en.wikipedia.org/wiki/Caenorhabditis_elegans) is a small worm (about 1 mm long)
whose neural network was completely mapped by a team led by the South African biologist [Sydney Brenner](https://en.wikipedia.org/wiki/Sydney_Brenner), who shared the 2002 Nobel Prize in Physiology or Medicine for his work on this organism.

An *undirected* connected version of this network on 279 nodes with 2287 connections is available from the 
book's [website](http://www.complex-networks.net/) and is copied here into a single file [`c_elegans_undir.net`](c_elegans_undir.net).  This file is in the [pajek](http://mrvar.fdv.uni-lj.si/pajek/) format,
and can be imported into this notebook with the `nx.read_pajek` command.  This command
constructs a *multigraph*, which can easily be converted into a (simple) graph by applying 
the `nx.Graph` constructor to it.

In [None]:
G = nx.read_pajek("data/c_elegans_undir.net")
G = nx.Graph(G)
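
To see what the conversion step does, here is a minimal sketch on a hypothetical three-node multigraph (not the worm data). It suggests that the `nx.Graph` constructor keeps the nodes and merges parallel edges into a single edge. The names `M` and `simple` are illustrative only.

```python
import networkx as nx

# A toy multigraph with a doubled edge between "a" and "b" (illustrative only).
M = nx.MultiGraph()
M.add_edges_from([("a", "b"), ("a", "b"), ("b", "c")])

# The Graph constructor keeps the nodes and merges parallel edges.
simple = nx.Graph(M)

print(M.number_of_edges(), simple.number_of_edges())
```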

**Don't draw this graph!** It is too big to produce a meaningful picture and could compromise your notebook.

## Tasks

Use appropriate `networkx` methods to determine:
* the number of vertices $n$ of `G`,
* the number of edges $m$ of `G`,
* the characteristic path length $L$ of `G`,
* the clustering coefficient $C$ of `G`,
* the number of triangles $n_{\Delta}$ in `G` and
* the transitivity $T = 3 n_{\Delta} / n_{\wedge}$ of `G`, where $n_{\wedge}$ is the number of triads (paths of length 2) in `G`.
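
As an illustration of the transitivity formula, the following sketch evaluates $T = 3 n_{\Delta} / n_{\wedge}$ by hand on a hypothetical toy graph `H` (a triangle with one pendant node) and prints the value reported by `nx.transitivity` next to it. Here $n_{\wedge}$ is taken to be the number of paths of length 2, counted by their centre node; this is an assumption about the intended definition.

```python
import networkx as nx

# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
H = nx.Graph([(0, 1), (1, 2), (2, 0), (2, 3)])

# nx.triangles counts, for each node, the triangles through that node,
# so every triangle appears three times in the sum.
n_triangles = sum(nx.triangles(H).values()) // 3

# A node of degree k is the centre of k*(k-1)/2 paths of length 2.
n_triads = sum(k * (k - 1) // 2 for _, k in H.degree())

T = 3 * n_triangles / n_triads
print(n_triangles, n_triads, T, nx.transitivity(H))
```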

In [None]:
## your code here

In [None]:
## your code here

## 2. Degree Distribution (20 marks)

Let's have a look at the degree distribution of our brain-of-a-worm graph $G$:

In [None]:
histG = nx.degree_histogram(G)
dfG = pd.DataFrame(histG)
dfG.plot.bar(figsize=(15, 7))

Any graph $G$ with $n$ nodes and $m$ edges (including ours above!) can be compared to a random $G(n, m)$
graph with the same parameters, or to a random $G(n, p)$ graph with parameter $p = m/\binom{n}{2}$.
One attribute of interest is the [**degree distribution**](https://en.wikipedia.org/wiki/Degree_distribution) of $G$.  We know that the degree distribution 
of a random $G(n, p)$ graph is [binomial](https://en.wikipedia.org/wiki/Binomial_distribution).  How does the degree distribution of the worm's brain, depicted above, compare to that? For a better comparison, you will need to plot the degree distribution of a suitable random graph.

For example, with  $n = 100$ and $m = 292$, one can generate a random graph and
plot its node degrees as a histogram, as follows:

In [None]:
R = nx.gnm_random_graph(100, 292)
hist = nx.degree_histogram(R)
df = pd.DataFrame(hist)
df.plot.bar()

Does that look like a binomial distribution?
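
One way to make this comparison concrete is to tabulate the observed degree counts next to the counts expected under a binomial distribution with parameters $n - 1$ and $p = m/\binom{n}{2}$. The sketch below does this for the example values $n = 100$ and $m = 292$. The seed and the column names are arbitrary choices, and the degrees of a $G(n, m)$ graph are only approximately binomial, so the two columns should be expected to agree roughly, not exactly.

```python
from math import comb

import networkx as nx
import pandas as pd

n, m = 100, 292
p = m / comb(n, 2)  # edge probability of the matching G(n, p) model

R = nx.gnm_random_graph(n, m, seed=1)  # the seed is an arbitrary choice
observed = nx.degree_histogram(R)

# Expected number of nodes of degree k under Binomial(n - 1, p).
expected = [n * comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)
            for k in range(len(observed))]

df = pd.DataFrame({"observed": observed, "expected": expected})
print(df.round(2))
```

Calling `df.plot.bar()` on this data frame would draw the two series side by side.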

## Tasks

* For parameters $n$ and $m$ chosen identical to those of the worm brain graph `G`,
construct a random $G(n, m)$ graph `R`.

* Determine and plot the degree histogram of `R`.

* In your own words, describe the difference in appearance between the first plot in this section (degree distribution of the graph $G$) and the one just produced (degree distribution of the graph $R$). Please use the comment box below.

In [None]:
## your code here

In [None]:
## your code here

... your comments here ...

## 3. Small World Models (20 marks)

Random graphs in the Watts-Strogatz model are obtained from a regular
$(n, d)$-circle graph by rewiring each edge at random with a given probability $p$.
Such a graph can be generated with the command `nx.watts_strogatz_graph(n, 2*d, p)`
(note that the second argument is actually `2*d`).

In [None]:
n, d, p = 16, 3, 0.16
G = nx.watts_strogatz_graph(n, 2*d, p)
nx.draw_circular(G, **opts)

Watts-Strogatz random graphs are supposed to be like real-world networks in the sense that they
combine relatively *short* characteristic path lengths with relatively *high* clustering coefficients.
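
The two quantities can be inspected on a small scale before running a full experiment. The sketch below compares an unrewired circle graph ($p = 0$) with a single lightly rewired graph. The values $n = 60$, $p = 0.1$ and the seed are arbitrary choices for illustration, and the connectivity check is a precaution, since rewiring may disconnect a graph, in which case the characteristic path length is not defined.

```python
import networkx as nx

n, d = 60, 3  # small values, chosen only to keep this quick

# p = 0: no rewiring, so this is the regular (n, d)-circle graph.
lattice = nx.watts_strogatz_graph(n, 2 * d, 0)
# A small rewiring probability; value and seed are arbitrary choices.
rewired = nx.watts_strogatz_graph(n, 2 * d, 0.1, seed=1)

for name, W in [("p = 0", lattice), ("p = 0.1", rewired)]:
    C = nx.average_clustering(W)
    # Rewiring can disconnect the graph, in which case L is undefined.
    L = nx.average_shortest_path_length(W) if nx.is_connected(W) else None
    print(name, round(C, 3), L)
```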

## Tasks

*  For values $n = 1000$ and $d = 6$, produce a sequence of 51 $(n, d, p)$-WS graphs
for different values of $p$ between $0$ and $1$
(including the extreme cases $p = 0$ and $p = 1$).  Compute and compare their
(graph) clustering coefficients and their characteristic path lengths.
(Use smaller values of $n$ if $n = 1000$ turns out to be too demanding on resources.)

* In your own words, in which range of values of $p$ do the generated graphs
indeed have high clustering and short paths?

<span style="color:red"> Hint: use a `for` loop and set $p$ as some simple function of the looping variable. If you create a list $V$ in which you store, say, the probability $p$, the clustering coefficient $C$ and the characteristic path length $L$, then you can visualise $V$ with `pandas` as follows:</span>
    
`pd.DataFrame(V,columns = ['p','C','L'])`

In [None]:
### your code here

... your comments here ...

## 4. Directed Networks: connectivity and examples (30 marks)

Recall that a **directed network** is $G=(X,E)$ where $X$ is a set of nodes and $E\subset X\times X$ is a set of edges. Here the edges are **ordered pairs** of nodes and can be represented as arrows. The `DiGraph` constructor in `networkx` can be used to work with directed networks. For example, we can construct (and draw) the following directed network on $4$ nodes:

In [None]:
D = nx.DiGraph()
D.add_edges_from([(0,1),(1,2),(2,0),(2,3)])

In [None]:
nx.draw(D,**opts)

Also recall the notions of **weak** and **strong** **connectivity** for directed networks. The digraph $D$ has a single weakly connected component, and two strongly connected components:

In [None]:
list(nx.weakly_connected_components(D))

In [None]:
list(nx.strongly_connected_components(D))
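
The component counts stated above can also be checked directly. The following sketch rebuilds the same digraph `D` and additionally tests reachability between node 0 and node 3, which is one way to see why node 3 ends up in a strongly connected component of its own.

```python
import networkx as nx

D = nx.DiGraph()
D.add_edges_from([(0, 1), (1, 2), (2, 0), (2, 3)])

print(nx.number_weakly_connected_components(D))
print(nx.number_strongly_connected_components(D))

# Node 3 can be reached from the cycle 0 -> 1 -> 2 -> 0, but no edge leaves it.
print(nx.has_path(D, 0, 3), nx.has_path(D, 3, 0))
```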

## Tasks

1. Give an example of a digraph of order $n=8$ which is weakly connected and whose strongly connected components all contain a single node. Explain why this is the case or perform computations to show that this is the case. (You can define/describe the graph by writing comments or implement this graph in `networkx`.)

In [None]:
### your code here

... your comments here ...

2. Give an example of a digraph of order $n=8$  which is weakly **and** strongly connected. Explain why this is the case or perform computations to show that this is the case. (You can define/describe the graph by writing comments or implement this graph in `networkx`.)

In [None]:
### your code here

... your comments here ...

3. A **source** in a digraph is a node with the property that all edges involving this node are **outgoing**. In other words, it is a node with positive out-degree and in-degree equal to $0$. In your own words, explain why a weakly connected directed network containing a source has at least two strongly connected components.
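
The definition of a source can be tried out on a small example. In the sketch below, the digraph `A` is a hypothetical example (not taken from the assignment) in which node 0 feeds into a 2-cycle. The sources are found from the in-degrees and out-degrees, and the strongly connected components are listed for inspection. This is a single instance, not an argument for the general statement.

```python
import networkx as nx

# Hypothetical example: node 0 feeds into the 2-cycle 1 <-> 2.
A = nx.DiGraph([(0, 1), (1, 2), (2, 1)])

# A source has in-degree 0 and positive out-degree.
sources = [v for v in A.nodes if A.in_degree(v) == 0 and A.out_degree(v) > 0]

print(sources)
print(nx.is_weakly_connected(A))
print(list(nx.strongly_connected_components(A)))
```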

... your comments here ...