# P04-02: The Watts-Strogatz Model

*May 14 2020*

In the second unit, we implement the Watts-Strogatz model and explore how its parameters influence the characteristics associated with the small-world property of (social) networks.

In [1]:
import pathpy as pp
import numpy as np

import seaborn as sns
import matplotlib.pyplot as plt

plt.style.use('default')
sns.set_style("whitegrid")

# The Watts-Strogatz model


In `pathpy`, the Watts-Strogatz model is implemented in the function `pp.generators.Watts_Strogatz`.

With this function, we can generate and plot random realisations with $n=100$ nodes and rewiring probabilities $p=0$, $p=0.04$ and $p=1$. We use the lattice layout implemented in `pathpy`, which allows us to arrange the nodes in a ring topology.

In [2]:
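# one realisation with 100 nodes, s=3 and rewiring probability p=0.5
n = pp.generators.Watts_Strogatz(100, 3, 0.5)
# `l` must hold a layout (e.g. the ring arrangement mentioned above) before this call
n.plot(layout=l, width=800, height=800)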

RuntimeError: dictionary keys changed during iteration
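
To make the generative process itself explicit, the following is a minimal standard-library sketch of a Watts-Strogatz generator. It is not the `pathpy` implementation: the function name `watts_strogatz`, the adjacency-dictionary representation and the `seed` argument are illustrative assumptions, and we assume that `s` counts the ring neighbours on each side of a node, so that every node starts with degree $2s$.

```python
import random

def watts_strogatz(n, s, p, seed=None):
    """Return a Watts-Strogatz graph as a dict mapping node -> set of neighbours.

    Assumption: s is the number of ring neighbours on EACH side of a node.
    """
    if n <= 2 * s:
        raise ValueError('need n > 2*s for a simple ring lattice')
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}

    # Step 1: ring lattice, every node is linked to its s nearest neighbours per side
    for v in range(n):
        for d in range(1, s + 1):
            w = (v + d) % n
            adj[v].add(w)
            adj[w].add(v)

    # Step 2: visit every lattice link (v, w) once and rewire it with probability p
    for v in range(n):
        for d in range(1, s + 1):
            w = (v + d) % n
            if rng.random() < p:
                # new endpoints must not create self-loops or duplicate links
                candidates = [u for u in range(n) if u != v and u not in adj[v]]
                if candidates:
                    u = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(u)
                    adj[u].add(v)
    return adj

ring = watts_strogatz(100, 3, 0.0, seed=1)      # p = 0: unchanged ring lattice
rewired = watts_strogatz(100, 3, 0.5, seed=1)   # p = 0.5: half of the links rewired on average
```

Rewiring keeps the number of links fixed at $n \cdot s$, so networks generated for different values of $p$ differ in their topology but not in their density.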

# The small-world regime of the Watts-Strogatz model

In order to explore the parameter regime for which the Watts-Strogatz model generates small-world networks, i.e. networks that combine a small diameter with a large clustering coefficient, we generate random Watts-Strogatz networks for a fixed value of $n=200$ and ten logarithmically spaced values of the rewiring probability $p$. For each value of $p$, we generate 10 samples and calculate the average shortest path length and the average global clustering coefficient across these samples.
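
The two statistics can be sketched without any library as follows. This is an illustration under assumptions, not the code behind `pp.statistics.avg_clustering_coefficient` or `pp.algorithms.avg_path_length`: the helper names are hypothetical, the network is a plain adjacency dictionary, the clustering coefficient is taken as the mean of the local clustering coefficients, and path lengths are averaged over mutually reachable pairs only.

```python
from collections import deque
from itertools import combinations

def avg_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        # count the links that exist between pairs of neighbours of v
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def avg_path_length(adj):
    """Mean shortest path length over all ordered pairs of reachable, distinct nodes."""
    dist_sum = 0
    pairs = 0
    for source in adj:
        # breadth-first search yields shortest path lengths in unweighted networks
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        dist_sum += sum(dist.values())
        pairs += len(dist) - 1
    return dist_sum / pairs if pairs else float('nan')

# toy example: a triangle (0, 1, 2) with a pendant node 3 attached to node 2
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print('cc = ', avg_clustering(toy))
print('<l> = ', avg_path_length(toy))
```

Returning `nan` for a network without any connected pair avoids a division by zero in that degenerate case.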

In [None]:
def plot_small_world(n, p_range, samples=1):
    """Plot the average clustering coefficient and the average shortest path
    length of Watts-Strogatz networks with n nodes against each p in p_range."""
    ccs = []
    apls = []

    for p in p_range:
        cc = 0
        apl = 0
        # average both statistics over `samples` random realisations
        for _ in range(samples):
            ws = pp.generators.random_graphs.Watts_Strogatz(n, s=3, p=p)
            cc += pp.statistics.avg_clustering_coefficient(ws)
            apl += pp.algorithms.avg_path_length(ws)
        ccs.append(cc/samples)
        apls.append(apl/samples)
        print('Finished for p = {0}'.format(p))

    # one figure per statistic, each with a logarithmic x-axis
    for values, label in [(ccs, 'average clustering coefficient'),
                          (apls, 'average shortest path length')]:
        plt.clf()
        plt.plot(p_range, values)
        plt.tick_params(axis='both', which='both', labelsize=20)
        plt.grid(True)
        plt.xscale('log')
        plt.xlabel('$p$', fontsize=20)
        plt.ylabel(label, fontsize=20)
        plt.subplots_adjust(bottom=0.15, left=0.25)
        plt.show()

We now plot the average shortest path length and the average clustering coefficient (both on the y-axis) against the rewiring probability $p$ (we use a log-scale for the x-axis):

In [None]:
p_range = np.logspace(-3, 0, 10)
plot_small_world(200, p_range, samples=10)
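
As a rough reference for the clustering curve, we can use a well-known approximation for large Watts-Strogatz networks: a ring lattice in which every node has degree $k$ has clustering coefficient $C(0) = \frac{3(k-2)}{4(k-1)}$, and rewiring reduces it to approximately $C(p) \approx C(0)(1-p)^3$. The sketch below evaluates this on the same log-spaced grid. It assumes that `s=3` means three neighbours on each side, i.e. $k=2s=6$; the helper names are illustrative and not part of `pathpy`.

```python
def lattice_clustering(s):
    """Clustering coefficient of a ring lattice with s neighbours per side (degree k = 2s)."""
    k = 2 * s
    return 3 * (k - 2) / (4 * (k - 1))

def expected_clustering(s, p):
    """Approximation C(p) = C(0) * (1 - p)^3 for large Watts-Strogatz networks."""
    return lattice_clustering(s) * (1 - p) ** 3

# ten log-spaced values between 10^-3 and 10^0, the grid of np.logspace(-3, 0, 10)
p_grid = [10 ** (-3 + 3 * i / 9) for i in range(10)]
for p in p_grid:
    print('p = {0:.4f}   expected cc = {1:.3f}'.format(p, expected_clustering(3, p)))
```

The approximation illustrates why the clustering coefficient stays close to its lattice value for small $p$: a triangle survives only if none of its three links is rewired.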

# Empirical small-world networks

Finally, we test which of the three empirical networks exhibit the small-world property. For this, we compare their average shortest path lengths and clustering coefficients to those of random networks with the same macrostate. If the average shortest path length is similar to that of a random network (or even smaller) and the clustering coefficient is much larger than expected at random, we call the network a small-world network.
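
The comparison can also be sketched with analytical expectations for a $G(n,p)$ random network, which is useful as a sanity check for the randomised networks. We assume here that the macrostate consists of the number of nodes $n$ and the link density $p$. The expected clustering coefficient of $G(n,p)$ equals $p$, and $\ln n / \ln \langle k \rangle$ is a common rough approximation of its average shortest path length. The function names and the two threshold factors are hypothetical choices for illustration only.

```python
import math

def er_baseline(n, m):
    """Expected clustering and approximate average path length of G(n, p)
    with the same number of nodes n and links m as the empirical network."""
    p = 2 * m / (n * (n - 1))      # link density of an undirected network without loops
    mean_degree = 2 * m / n
    cc_r = p                        # expected clustering coefficient of G(n, p)
    apl_r = math.log(n) / math.log(mean_degree)   # rough estimate, requires mean_degree > 1
    return cc_r, apl_r

def looks_small_world(cc_e, apl_e, cc_r, apl_r, cc_factor=5.0, apl_factor=1.5):
    """Hypothetical rule of thumb: clustering much larger than at random,
    path length similar to (or smaller than) the random expectation."""
    return cc_e >= cc_factor * cc_r and apl_e <= apl_factor * apl_r

# made-up example values: a network with 100 nodes and 300 links
cc_r, apl_r = er_baseline(100, 300)
print('cc_r = ', cc_r)
print('<l_r> = ', apl_r)
print(looks_small_world(cc_e=0.5, apl_e=3.0, cc_r=cc_r, apl_r=apl_r))
```

The thresholds only turn "much larger" and "similar" into something testable; any concrete values remain a judgment call.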

In [5]:
n_gentoo = pp.io.sql.read_network('networks.db', sql='SELECT DISTINCT source, target FROM gentoo', directed=False, loops=False).largest_connected_component()
n_highschool = pp.io.sql.read_network('networks.db', sql='SELECT DISTINCT source, target FROM highschool', directed=False, loops=False).largest_connected_component()
n_lotr = pp.io.sql.read_network('networks.db', sql='SELECT DISTINCT source, target FROM lotr', directed=False, loops=False).largest_connected_component()

In [6]:
r_gentoo = pp.generators.ER_np_randomize(n_gentoo).largest_connected_component()

print('cc_e = ', pp.statistics.avg_clustering_coefficient(n_gentoo))
print('cc_r = ', pp.statistics.avg_clustering_coefficient(r_gentoo))
print('<l_e> = ', pp.algorithms.avg_path_length(n_gentoo))
print('<l_r> = ', pp.algorithms.avg_path_length(r_gentoo))

ZeroDivisionError: division by zero

In [7]:
r_highschool = pp.generators.ER_np_randomize(n_highschool).largest_connected_component()

print('cc_e = ', pp.statistics.avg_clustering_coefficient(n_highschool))
print('cc_r = ', pp.statistics.avg_clustering_coefficient(r_highschool))
print('<l_e> = ', pp.algorithms.avg_path_length(n_highschool))
print('<l_r> = ', pp.algorithms.avg_path_length(r_highschool))

ZeroDivisionError: division by zero

In [8]:
r_lotr = pp.generators.ER_np_randomize(n_lotr).largest_connected_component()

print('cc_e = ', pp.statistics.avg_clustering_coefficient(n_lotr))
print('cc_r = ', pp.statistics.avg_clustering_coefficient(r_lotr))
print('<l_e> = ', pp.algorithms.avg_path_length(n_lotr))
print('<l_r> = ', pp.algorithms.avg_path_length(r_lotr))

ZeroDivisionError: division by zero