### CS4423 - Networks
Prof. Götz Pfeiffer<br />
School of Mathematics, Statistics and Applied Mathematics<br />
NUI Galway

#### 7. Power Laws and Scale-Free Graphs

# Lecture 17:  Configuration Model

Many real world graphs have small world behaviour (high clustering and short paths) and
also a power law degree distribution, due to the presence of nodes of exceptionally high degree.
Random graphs do have short paths but not high clustering; their degree distribution is not described by
a power law.
The small world graphs in the Watts-Strogatz model have high clustering and short paths, but again,
not a power law degree distribution.

So how can one generate a model of a network that does have a power law degree distribution?

Here, we start with a power law and build a random graph whose degrees follow
this power law:  quite amazingly, the Configuration Model allows the generation of a random graph for
any prescribed degree sequence.

In the next lecture, we'll see a more dynamic approach to this question.


In [None]:
import random
from collections import Counter
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
opts = { "with_labels": True, "node_color": 'y' }

## The Configuration Model

* In principle, a random graph can be generated in such 
a way that it has any prescribed degree sequence.

<div class="alert alert-warning">
    
**Definition: Configuration Model.**

* Choose non-negative integers $d_i$, $i \in X$,
so that $\sum d_i = 2m$ is an even number.

* Then regard each degree $d_i$ as $d_i$ **stubs** (half-edges) attached to node $i$.

* Compute a random **matching** of pairs of stubs
and build a graph on $X$ with those (full) edges.
    
</div>

<div class="alert alert-success">

**Example**.  Suppose that $X = \{0, 1, 2, 3, 4\}$
and that we want those nodes to have degrees 
$d_0 = 3$, $d_1 = 2$ and $d_2 = d_3 = d_4 = 1$.

This gives a list of stubs $(0, 0, 0, 1, 1, 2, 3, 4)$
where each node $i$ appears as often as its degree $d_i$
requires.

A random shuffle of that list is
$(0, 2, 3, 0, 1, 0, 4, 1)$.

One way to construct a matching is to simply cut this list in half
and match entries of the first half with corresponding entries in the second half.
$$
\begin{array}{cccc}
0 & 2 & 3 & 0\\
1 & 0 & 4 & 1
\end{array}
$$
Note that $\sum d_i = 8 = 2m$ yields $m = 4$ edges $0-1$, $2-0$, $3-4$, and $0-1$; the edge $0-1$ occurs twice, so this particular matching produces a multiple edge.
    
</div>
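
The matching in the example above can be checked with a few lines of Python. This is only a minimal sketch: the names `shuffled` and `multiplicity` are introduced here for illustration, and the list is the particular shuffle quoted in the example rather than a fresh random one.

```python
from collections import Counter

shuffled = [0, 2, 3, 0, 1, 0, 4, 1]      # the shuffled stubs from the example
m = len(shuffled) // 2                    # number of edges

# cut the list in half and pair corresponding entries
edges = list(zip(shuffled[:m], shuffled[m:]))
print(edges)                              # [(0, 1), (2, 0), (3, 4), (0, 1)]

# count each edge regardless of the order of its end points
multiplicity = Counter(tuple(sorted(e)) for e in edges)
print(multiplicity[(0, 1)])               # 2
```

Counting the edges as unordered pairs shows that the pair $0-1$ is matched twice in this example.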

### A Quick Implementation

In [None]:
degrees = [3, 2, 1, 1, 1]

* Recall that, in Python, list indices start at $0$,
and `networkx` default node names do likewise.
Let's adopt this convention here.

* Now entry $3$ in position $0$ of the list `degrees` stands for
$3$ entries $0$ in the list of stubs, to be constructed.
Entry $2$ in position $1$ stands for $2$ entries $1$ in
the list of stubs and so on.
In general, entry $d$ in position $i$ stands for
$d$ entries $i$ in the list of stubs.

* Python's list arithmetic (using `m * a` for `m` *repetitions* of a list `a`
and `a + b` for the *concatenation* of lists `a` and `b`)
can be used to quickly convert a degree sequence into a list of stubs as follows.

In [None]:
stubs = [degrees[i] * [i] for i in range(len(degrees))]
stubs

In [None]:
stubs = sum(stubs, [])
stubs

* Let's call the result the **stubs list** of a list of integers and wrap its construction
  into a Python function.

In [None]:
def stubs_list(a):
    return sum([a[i] * [i] for i in range(len(a))], [])

In [None]:
stubs_list(degrees)
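
Flattening with `sum(..., [])` copies the growing list at every step, which can become slow for long degree sequences. As a possible alternative, here is a sketch that builds the same list with a nested comprehension; the name `stubs_list_flat` is hypothetical and not part of the lecture's code.

```python
def stubs_list_flat(a):
    """Return the stubs list of `a`: node i repeated a[i] times."""
    return [i for i, d in enumerate(a) for _ in range(d)]

print(stubs_list_flat([3, 2, 1, 1, 1]))   # [0, 0, 0, 1, 1, 2, 3, 4]
```

For the short lists used in this lecture either version should be fine.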

* How to randomly shuffle this list?
The Wikipedia page on [random permutations](https://en.wikipedia.org/wiki/Random_permutation#Knuth_shuffles)
recommends a simple algorithm for shuffling the elements of a list `a` in place:
loop over the positions $k$ of the entries in the list, swapping `a[k]` with `a[j]` for a randomly chosen $j \geq k$
(where possibly $j = k$ and the swap has no visible effect).

In [None]:
def knuth_shuffle(a):
    l = len(a)
    for k in range(l):
        j = random.randrange(k, l)
        a[j], a[k] = a[k], a[j]

In [None]:
a = [1,2,3]
knuth_shuffle(a)
a

* Let's test whether this shuffle produces uniformly random outcomes by applying it a large number of times
to a short list, while keeping track of which permutation occurs how often ...

In [None]:
shuffles = {}
for i in range(10000):
    a = [1,2,3]
    knuth_shuffle(a)
    key = tuple(a)
    shuffles[key] = shuffles.get(key, 0) + 1
    
print(shuffles)

* So it looks like each possible outcome is approximately equally likely.
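
The experiment only suggests uniformity. For a short list one can also enumerate every possible sequence of random choices $j$ and see which permutation each one produces. The following is a sketch under the assumption that the shuffle picks $j$ uniformly from `range(k, n)` at position $k$; the helper `shuffle_with_choices` is hypothetical and simply replays the swaps with the choices fixed in advance.

```python
from itertools import product
from collections import Counter

def shuffle_with_choices(a, choices):
    """Replay the swaps of the shuffle, using choices[k] as the index j at position k."""
    a = list(a)
    for k, j in enumerate(choices):
        a[j], a[k] = a[k], a[j]
    return tuple(a)

n = 3
# at position k, the index j may be any element of range(k, n)
all_choices = product(*[range(k, n) for k in range(n)])
outcomes = Counter(shuffle_with_choices([1, 2, 3], c) for c in all_choices)
print(outcomes)
```

There are $3 \cdot 2 \cdot 1 = 6$ equally likely choice sequences, and each of the $6$ permutations arises from exactly one of them, which is why the outcomes are uniform.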

* Python's `random` module already contains
a function `shuffle` which does exactly this.

In [None]:
a = [1,2,3]
random.shuffle(a)
a

* Let's test whether this `shuffle` produces uniformly random outcomes ...

In [None]:
shuffles = {}
for i in range(10000):
    a = [1,2,3]
    random.shuffle(a)
    key = tuple(a)
    shuffles[key] = shuffles.get(key, 0) + 1
    
print(shuffles)

* Again, equally likely outcomes.

* So we shuffle the stubs ...

In [None]:
random.shuffle(stubs)
stubs

* Then we match pairs, by cutting the list of
stubs into halves and transposing the resulting array
of 2 rows ...

In [None]:
m = len(stubs) // 2

In [None]:
edges = [stubs[:m], stubs[m:]]
edges

In [None]:
edges = list(zip(*edges))
edges

In [None]:
G = nx.Graph(edges)

In [None]:
G.number_of_edges()

In [None]:
nx.draw(G, **opts)

In [None]:
G.edges()

* All in all, a configuration model can be built as follows.

In [None]:
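def configuration(degrees):
    total = sum(degrees)
    if total % 2 != 0:
        raise ValueError("sum of degrees must be even")
    m = total // 2
    stubs = stubs_list(degrees)
    random.shuffle(stubs)
    # match stub k in the first half with stub k in the second half
    edges = list(zip(stubs[:m], stubs[m:]))
    return nx.Graph(edges)  # note: nx.Graph merges repeated edges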

In [None]:
G = configuration([3,2,1,1,1])
nx.draw(G, **opts)
print(G.edges())
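
Because `nx.Graph` stores at most one edge between any two nodes, a matching that pairs the same two nodes twice loses an edge, and the resulting degrees can be smaller than prescribed. One possible way around this, sketched here with the hypothetical name `configuration_multigraph`, is to build an `nx.MultiGraph` instead, which keeps parallel edges and counts a self-loop twice towards the degree of its node.

```python
import random
import networkx as nx

def configuration_multigraph(degrees):
    """Hypothetical variant of the configuration model that keeps every matched pair."""
    stubs = [i for i, d in enumerate(degrees) for _ in range(d)]
    if len(stubs) % 2 != 0:
        raise ValueError("sum of degrees must be even")
    random.shuffle(stubs)
    m = len(stubs) // 2
    G = nx.MultiGraph()
    G.add_nodes_from(range(len(degrees)))        # keep nodes of degree 0 as well
    G.add_edges_from(zip(stubs[:m], stubs[m:]))  # parallel edges and loops are kept
    return G

degrees_example = [3, 2, 1, 1, 1]
M = configuration_multigraph(degrees_example)
print([M.degree(i) for i in range(len(degrees_example))])  # [3, 2, 1, 1, 1]
```

Whatever the shuffle, every stub ends up in exactly one edge, so the degrees of the multigraph agree with the prescribed sequence.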

## Power Law Degree Distribution

A variant of the `powerlaw` from last time can generate the stubs directly.

In [None]:
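def powerstubs(m, p):

    # distribute 2*m stubs over nodes 0, 1, ..., l according to a power law
    l = 0          # label of the most recently created node
    x = [l]        # list of stubs, starting with one stub at node 0
    for i in range(2*m-1):
        if random.random() < p:
            # with probability p: create a new node and give it one stub
            l += 1
            x.append(l)
        else:
            # otherwise: copy a random existing stub, so that a node is
            # picked with probability proportional to its current degree
            k = random.choice(x)
            x.append(k)

    return x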

Let's create $400$ stubs for approximately $200$ nodes (with $p = \frac12$).

In [None]:
p = powerstubs(200, 0.5)
print(p)

Use Python's `collections.Counter` to count how often each node occurs, i.e., to determine the node degrees.

In [None]:
k = Counter(p)
print(k)

Using another counter on the degrees yields the numbers $n_k$ of nodes of degree $k$, and hence the degree distribution $p_k = n_k/n$.

In [None]:
nk = Counter(k.values())
print(nk)
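
The step from the counts $n_k$ to the degree distribution $p_k = n_k/n$ is a simple normalisation. Here is a small self-contained sketch on a hypothetical stubs list; the names `stubs_example`, `degree_of`, `n_k` and `p_k` are chosen for this illustration only.

```python
from collections import Counter

stubs_example = [0, 0, 0, 1, 1, 2, 3, 4]   # hypothetical small stubs list
degree_of = Counter(stubs_example)          # node -> degree
n_k = Counter(degree_of.values())           # degree k -> number of nodes of degree k
n = len(degree_of)                          # number of nodes

p_k = {k: n_k[k] / n for k in sorted(n_k)}
print(p_k)                                  # {1: 0.6, 2: 0.2, 3: 0.2}
```

The values $p_k$ sum to $1$, as they should for a probability distribution.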

Let's check whether the $n_k$ follow a power law - they really should, by construction.

In [None]:
xy = np.array(list(nk.items())).T
print(xy)

In [None]:
plt.figure(figsize=(12,8))
plt.loglog(*xy, 'oc')
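
On a log-log plot a power law $n_k \sim k^{-\gamma}$ appears as a straight line of slope $-\gamma$, so a rough estimate of the exponent can be read off from a least-squares line fit to the logarithms. The sketch below uses hypothetical counts that follow $n_k = 1024\, k^{-2}$ exactly; on real data such a fit is only a crude estimate, since the sparse counts at high degrees are noisy.

```python
import numpy as np

# Hypothetical degree counts n_k, chosen to follow n_k = 1024 * k**(-2) exactly.
nk_example = {1: 1024, 2: 256, 4: 64, 8: 16, 16: 4}

ks = np.array(sorted(nk_example))
counts = np.array([nk_example[k] for k in ks])

# Fit log(n_k) = slope * log(k) + intercept; -slope estimates the exponent.
slope, intercept = np.polyfit(np.log(ks), np.log(counts), 1)
gamma_estimate = -slope
print(gamma_estimate)
```

For this idealised input the estimate is $2$ up to rounding error.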

Finally, let's construct a graph as a configuration model with the given list of degrees.

In [None]:
G = configuration(list(k.values()))
plt.figure(figsize=(12,8))
nx.draw(G, **opts)

Maybe it's better to focus on the giant component.

In [None]:
print([len(c) for c in nx.connected_components(G)])

In [None]:
print(list(nx.connected_components(G)))

In [None]:
C = G.subgraph(max(nx.connected_components(G), key=len))
plt.figure(figsize=(12,8))
nx.draw(C, **opts)

In [None]:
print(nx.degree_histogram(C))

In `networkx`, configuration models can be generated with the function `nx.configuration_model`, which returns a `MultiGraph` that may contain parallel edges and self-loops.

In [None]:
G = nx.configuration_model(list(k.values()))
plt.figure(figsize=(12,8))
nx.draw(G, **opts)
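
The graph returned by `nx.configuration_model` is a multigraph whose degrees match the given sequence. If a simple graph is wanted, parallel edges and self-loops can be removed afterwards, at the cost of no longer matching the degree sequence exactly. A minimal sketch, with the small degree sequence and the `seed` value chosen arbitrarily for illustration:

```python
import networkx as nx

degrees_example = [3, 2, 1, 1, 1]
M = nx.configuration_model(degrees_example, seed=1)
print(type(M).__name__)                    # MultiGraph
print([d for _, d in M.degree()])          # [3, 2, 1, 1, 1]

# convert to a simple graph: merge parallel edges, then drop self-loops
S = nx.Graph(M)
S.remove_edges_from(list(nx.selfloop_edges(S)))
print(S.number_of_edges())
```

Here `M` keeps all $m = 4$ edges, while `S` may have fewer.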

## Code Corner

### `collections`

* `Counter`: [[doc]](https://docs.python.org/3/library/collections.html#collections.Counter)

### `random`

* `random`: [[doc]](https://docs.python.org/3/library/random.html#random.random)


* `randrange`: [[doc]](https://docs.python.org/3/library/random.html#random.randrange)


* `choice`: [[doc]](https://docs.python.org/3/library/random.html#random.choice)


* `shuffle`: [[doc]](https://docs.python.org/3/library/random.html#random.shuffle)

### `networkx`

* `configuration_model`: [[doc]](https://networkx.github.io/documentation/stable/reference/generated/networkx.generators.degree_seq.configuration_model.html)

## Exercises

1. In your own words, describe and justify how `stubs_list` and `Counter` are mutually inverse processes.
How exactly can you recover the stubs from a `Counter` object?  How can you recover the stubs from the
degree distribution?

1. Experiment with different values for `m` and `p` in the `powerstubs` function. In your own words, how does the 
resulting graph generated as a random configuration model from
a power law degree distribution look different from a random graph in the ER-model $G(n, p)$?

1. Generate some configuration models with power law degree distributions for various choices
of the exponent $\gamma$ between $2$ and $3$.  Compute their transitivity, clustering and average shortest path lengths.
Do these values indicate small world behaviour?