### CS4423 - Networks
Angela Carnevale  
School of Mathematical and Statistical Sciences  
University of Galway

# Assignment 3

Provide answers to the problems in the boxes provided.  

The buttons at the top of the page can be used to **create
more boxes if needed**.
The type of box can be changed from `Code` to `Markdown`.
`Code` boxes take (and execute) `python` code.
`Markdown` boxes take (and format nicely) **text input**.
In this way, you can provide answers, ask questions, 
or raise issues, in words.

When finished, please print this notebook to a **PDF** file and submit it on
**Canvas**.

**Deadline** is Tuesday, March 12 at 5pm.

## Setup

This is a `jupyter` notebook.   You can open and interact
with the notebook through Binder.

Or, you can
install and use `jupyter` as a `python` package on your own laptop or PC.  

The following commands load the `networkx`, `pandas` and `matplotlib` packages into the current session,
and then specify some standard options that can be useful for drawing graphs.  

In order to execute the code in a box,
use the mouse or arrow keys to highlight the box and then press SHIFT-RETURN.

In [None]:
import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt
opts = { "with_labels": True, "node_color": 'y' }
opts2 = { "with_labels": True, "node_color": 'm' }

Should it ever happen that the notebook becomes unusable, start again with a fresh copy.

## 1.  ER Model $A$

**Definition (ER Model $A$: Uniform Random Graphs).**
Let $n \geq 1$, let $N = \binom{n}{2}$ and let $0 \leq m \leq N$.

The model $G(n, m)$ consists of the ensemble of graphs $G$
with $n$ nodes $X = \{0, 1, \dots, n{-}1\}$, and $m$ randomly selected
edges, chosen uniformly from the $N$ possible edges.


Model $A$ random graphs in `networkx` can be generated with the function `nx.gnm_random_graph(n, m)`,
where parameter $n$ gives the number of nodes and parameter $m$ the (exact) number of edges of the graph. For example:

In [None]:
plt.figure(figsize=(12,5))
G = nx.gnm_random_graph(16, 15)
nx.draw(G, **opts)
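
As a quick check of the claim that $m$ is the exact number of edges, the following sketch samples a model $A$ graph with a fixed `seed` and confirms its order and size. The seed value `1` is an arbitrary choice made here for illustration; passing the same seed again should reproduce the same graph, which can be handy when writing up results.

```python
import networkx as nx

# sample a model A graph twice; the seed value is arbitrary and only makes the run repeatable
G = nx.gnm_random_graph(16, 15, seed=1)
H = nx.gnm_random_graph(16, 15, seed=1)

# number of nodes and (exact) number of edges
print(G.number_of_nodes(), G.number_of_edges())

# the same seed should give the same set of edges
same = sorted(G.edges()) == sorted(H.edges())
print(same)
```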

## Tasks (25+5 marks)

1. Draw $2$ random graphs sampled from model $A$ with $n = 25$ nodes and $m = 40$ edges. For each graph:
  * Determine the number of triads and of triangles.
  * Determine the clustering coefficient $C$ and the transitivity $T$.
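
The quantities named in this task can be read off with standard `networkx` calls. The sketch below uses a small hand-made graph (a triangle with one pendant edge, chosen only for illustration) rather than a random one. It assumes that a *triad* means a path of length $2$, so that a node of degree $d$ is the centre of $\binom{d}{2}$ triads; if the course uses a different convention, the triad count needs adjusting.

```python
from math import comb
import networkx as nx

# illustrative graph: triangle 0-1-2 plus the pendant edge 2-3
G = nx.Graph([(0, 1), (1, 2), (0, 2), (2, 3)])

# nx.triangles counts each triangle once at each of its three nodes
triangles = sum(nx.triangles(G).values()) // 3

# assumption: a triad is a path of length 2, counted at its centre node
triads = sum(comb(d, 2) for _, d in G.degree())

C = nx.average_clustering(G)   # mean of the node clustering coefficients
T = nx.transitivity(G)         # 3 * triangles / triads

print(triangles, triads, C, T)
```

For this graph the two measures differ, since $C$ averages over nodes while $T$ is a single ratio over the whole graph.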


In [None]:
### your code here

In [None]:
### your code here

In [None]:
### your code here

... comments here ...

2. With $n$ and $m$ as in Task 1 above, is it possible to sample a graph that is a tree? (Justify your answer.)

.... comments here ...

## 2.  ER Model $B$

**Definition (ER Model $B$: Binomial Random Graphs).**
Let $n \geq 1$, let $N = \binom{n}{2}$ and let $0 \leq p \leq 1$.


The model $G(n, p)$ consists of the ensemble of graphs $G$
with $n$ nodes $X = \{0, 1, \dots, n{-}1\}$, and each of the $N$
possible edges chosen with probability $p$.

Model $B$ random graphs in `networkx` can be generated with the function `nx.gnp_random_graph(n, p)`,
where parameter $n$ gives the number of nodes and parameter $p \in [0, 1]$ the edge probability. For example:

In [None]:
plt.figure(figsize=(12,5))
G = nx.gnp_random_graph(16, 0.125)
print("#edges: ", G.number_of_edges())
nx.draw(G, **opts2)
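
Since each of the $N$ possible edges is present with probability $p$, the expected number of edges in a model $B$ graph is $p N$. The following sketch checks this for the example parameters $n = 16$ and $p = 0.125$ by averaging over a number of samples. The sample count of $200$ and the use of the loop index as `seed` are arbitrary choices made here for illustration.

```python
from math import comb
import networkx as nx

n, p = 16, 0.125
N = comb(n, 2)              # number of possible edges
expected_edges = p * N      # expected size of a G(n, p) graph

# average size over 200 samples (sample count and seeds are arbitrary)
sizes = [nx.gnp_random_graph(n, p, seed=s).number_of_edges() for s in range(200)]
mean_size = sum(sizes) / len(sizes)

print(N, expected_edges, mean_size)
```

Individual samples vary in size, but their average should settle close to $p N$.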

## Tasks (25+5 marks)

1. Draw $2$ random graphs sampled from model $B$ with $n = 25$ nodes and edge probability $p$ such that the _expected_ size of the graph is $40$. For each graph:
  * Determine the number of triads and of triangles.
  * Determine the clustering coefficient $C$ and the transitivity $T$.



In [None]:
### your code here

In [None]:
### your code here

In [None]:
### your code here

... comments here...

2. With $n$ and $p$ as in Task 1 above, is it possible to sample a graph that is a tree? (Justify your answer.)

.... comments here ...

## 3. Degree Distribution

The **degree distribution** of a graph $G = (X, E)$ is the probability distribution of the node degrees of the graph $G$, i.e. the function $p \colon \mathbb{N}_0 \to \mathbb{R}$ defined by
$$
p_k = \frac{n_k}{n},
$$
where $n = |X|$ is the total number of nodes in $G$, and $n_k$ is the number of nodes of degree $k$.
(Note that $\sum_k p_k = 1$.)

In `networkx`, the numbers $n_k$ can be determined by the function `nx.degree_histogram`.
Then `python` list comprehension can be used to compute the numbers $p_k$ from those.
And those numbers, turned into a `pandas` dataframe, can be plotted nicely and quickly.

In [None]:
G = nx.gnp_random_graph(16, 0.125)
n = G.number_of_nodes()
histogram = nx.degree_histogram(G)
distribution = [x/n for x in histogram]
df = pd.DataFrame(distribution)
df.plot()
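
To see what `nx.degree_histogram` returns, it can help to try it on a graph whose degrees are known in advance. The sketch below uses a path on $5$ nodes (an illustrative choice, not part of the assignment): its two end nodes have degree $1$ and its three inner nodes have degree $2$.

```python
import networkx as nx

# path 0-1-2-3-4: two nodes of degree 1, three nodes of degree 2
G = nx.path_graph(5)
n = G.number_of_nodes()

histogram = nx.degree_histogram(G)         # histogram[k] is n_k
distribution = [x / n for x in histogram]  # p_k = n_k / n

print(histogram)
print(distribution, sum(distribution))
```

The list index is the degree $k$, so the list has length one more than the highest degree in the graph.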

The degree distribution of a model $B$ random graph is known to follow a **binomial distribution** 
$\mathrm{Bin}(n-1, p)$ of the
form 
$$
p_k = \binom{n-1}{k} p^k (1-p)^{n-1-k}
$$

Using the formula
$$
\binom{n}{k} = \frac{n \cdot (n-1) \dotsm (n-k+1)}{1 \cdot 2 \dotsm k}
$$
in `python`, the binomial coefficient $\binom{n}{k}$ can be computed with the following function:

In [None]:
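def binomial(n, k):
    """Return the binomial coefficient n choose k as an exact integer."""
    prd, top, bot = 1, n, 1
    for i in range(k):
        # multiply by the next numerator factor, divide by the next denominator factor;
        # the integer division is exact, since prd * top is a multiple of bot at each step
        prd = (prd * top) // bot
        top, bot = top - 1, bot + 1
    return prd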
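
One way to gain confidence in this function is to compare it with `math.comb` from the standard library, assuming Python 3.8 or later (where `math.comb` is available). The function is repeated in the sketch so that it runs on its own; the range of test values is an arbitrary choice.

```python
from math import comb

def binomial(n, k):
    # same product formula as above
    prd, top, bot = 1, n, 1
    for i in range(k):
        prd = (prd * top) // bot
        top, bot = top - 1, bot + 1
    return prd

# compare with the library value for all 0 <= k <= n < 30
agree = all(binomial(n, k) == comb(n, k) for n in range(30) for k in range(n + 1))
print(agree, binomial(15, 3))
```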

The binomial distribution $\mathrm{Bin}(n, p)$ can then be defined as: 

In [None]:
def b_dist(n, p, k):
    """Return the probability of exactly k successes in n trials, each with success probability p."""
    return binomial(n, k) * p**k * (1-p)**(n-k)
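
As a sanity check, the values of $\mathrm{Bin}(n, p)$ over $k = 0, \dots, n$ should sum to $1$ and have mean $n p$. The sketch below verifies this numerically for $n = 15$ and $p = 0.125$ (illustrative values). To keep the snippet self-contained it uses `math.comb` (Python 3.8 or later) in place of the `binomial` function defined earlier.

```python
from math import comb

def b_dist(n, p, k):
    # same formula as above, with math.comb standing in for binomial
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 15, 0.125
probs = [b_dist(n, p, k) for k in range(n + 1)]

total = sum(probs)                               # should be 1
mean = sum(k * q for k, q in enumerate(probs))   # should be n * p

print(total, mean, n * p)
```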

In order to compare the degree distribution of a random graph $G$ on $16$ points
to the corresponding binomial distribution, one can compute and plot the values
of $\mathrm{Bin}(15, p)$ (that is, $\mathrm{Bin}(n-1, p)$ with $n = 16$) for a suitable
value of $p$, and $k$ ranging from $0$ to the highest node degree in $G$, as follows.

In [None]:
len(histogram)

In [None]:
n, p = 16, 0.125   # same n and p as used to generate G above
bb = [b_dist(n-1, p, k) for k in range(len(histogram))]
df = pd.DataFrame(bb)
df.plot()

In the limit $n \to \infty$ (keeping the expected average degree $p (n-1)$ constant), the binomial distribution $\mathrm{Bin}(n-1, p)$ is well approximated by
the **Poisson distribution** defined by
$$
p_k = e^{-\lambda} \frac{\lambda^k}{k!},
$$
where $\lambda = p (n-1)$.

Using the functions `exp` and `factorial` from `python`'s `math` library, one can
compute the Poisson distribution with the following `python` function:

In [None]:
from math import exp, factorial
def p_dist(l, k):
    """Return the Poisson probability of the value k, where l is the parameter lambda."""
    return exp(-l) * l**k / factorial(k)
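
The quality of the Poisson approximation can also be checked numerically, without drawing any graph. The sketch below compares $\mathrm{Bin}(n-1, p)$ with the Poisson distribution for $\lambda = p (n-1)$, using the illustrative values $n = 1000$ and $p = 0.005$ and the degrees $k = 0, \dots, 20$ (all arbitrary choices made here). It uses `math.comb` (Python 3.8 or later) so that it runs on its own.

```python
from math import comb, exp, factorial

def b_dist(n, p, k):
    # binomial probability, with math.comb standing in for binomial
    return comb(n, k) * p**k * (1 - p)**(n - k)

def p_dist(l, k):
    # Poisson probability with parameter l
    return exp(-l) * l**k / factorial(k)

n, p = 1000, 0.005      # illustrative values
lam = p * (n - 1)       # expected average degree

# largest pointwise gap between the two distributions for k = 0, ..., 20
diffs = [abs(b_dist(n - 1, p, k) - p_dist(lam, k)) for k in range(21)]
max_diff = max(diffs)

print(lam, max_diff)
```

A small value of `max_diff` indicates that the two curves would be hard to tell apart in a plot.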

## Tasks (40 marks)

* Create a model $B$ random graph on $n = 200$ points, 
with edge probability $p = 0.025$, and plot its degree distribution.


In [None]:
### your code here

* Create a model $B$ random graph on $n = 500$ points, 
with edge probability $p = 0.01$, and plot its degree distribution.


* Compute and plot the binomial distribution that corresponds to a random model $B$ graph
on $n = 500$ points with $p = 0.01$ and $k$ ranging from $0$
to the highest node degree in $G$.

In [None]:
### your code here

In [None]:
### your code here

* Repeat the previous two steps on more nodes: create a model $B$ random graph on $n = 2000$ points, 
with edge probability $p = 0.0025$, and plot its degree distribution.


* Compute and plot the binomial distribution that corresponds to a random model $B$ graph
on $n = 2000$ points with $p = 0.0025$ and $k$ ranging from $0$
to the highest node degree in $G$.

In [None]:
### your code here

In [None]:
### your code here

* Now compute and plot the Poisson distribution that corresponds to a random model $B$ graph
on $n = 2000$ points with $p = 0.0025$.

* Compare the plots you obtained: why do some of them show similar profiles? 

In [None]:
### your code here

... your comments here ...