# Small World Network

## The Small World Phenomenon

* The world is small in the sense that "short" paths exist between almost any two people.
* How short are these paths?
* How can we measure their length?
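
One way to make "how short" concrete is to count the edges on a shortest path between two people, which breadth-first search finds in an unweighted network. The following is a minimal sketch, assuming a made-up friendship graph stored as adjacency sets; the names and the `path_length` helper are illustrative and not part of the lecture.

```python
from collections import deque

# Hypothetical toy friendship network (undirected), stored as adjacency sets.
friends = {
    "Ann": {"Bob", "Cat"},
    "Bob": {"Ann", "Dan"},
    "Cat": {"Ann", "Dan"},
    "Dan": {"Bob", "Cat", "Eve"},
    "Eve": {"Dan"},
}

def path_length(graph, source, target):
    """Return the number of edges on a shortest path, or None if unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nbr in graph[node]:
            if nbr not in dist:          # first visit is the shortest distance
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return None

print(path_length(friends, "Ann", "Eve"))  # Ann -> Bob -> Dan -> Eve: 3 edges
```

Averaging this quantity over all pairs of nodes gives the average shortest path length used later in these notes.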


## Milgram Small World Experiment

Set up in the 1960s:

* 296 randomly chosen 'starters' asked to forward a letter to a 'target' person
* target was a stock broker in Boston
* Instructions for starter:
   * send letter to target if you know him on a first name basis
   * if you do not know target, send letter and instructions to someone you know on a first name basis who is more likely to know the target
* Some information about the target, such as city, and occupation, was provided.

#### Results:

* 64 out of the 296 letters reached the target.
* median chain length was 6 (consistent with the phrase "six degrees of separation")

**Key Points**

* A relatively large percentage (>20%) of letters reached target.
* Paths were relatively short.
* People were able to find these short paths.


![](images/4-3.1.png)

## Small World of Instant Messaging

* Nodes: 240 million active users on Microsoft Instant Messenger
* Edges: Users engaged in two-way communication over a one-month period.
* Estimated median path length of 7.

![](images/4-3.2.png)

## Small World of Facebook

![](images/4-3.3.png)

* Global network: average path length in 2008 was **5.28** and in 2011 it was **4.74**.
* Paths are even shorter if the network is restricted to the US only.

## Clustering Coefficient

**Local clustering coefficient of a node:**

The local clustering coefficient of a node is the fraction of pairs of the node's friends that are friends with each other. Roughly, it asks whether the node sits in many triangles or few. The average clustering coefficient (average CC) of a network is the mean of the local clustering coefficients over all of its nodes.
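
The definition can be checked by hand on a tiny network. The sketch below assumes a made-up four-node graph and a hypothetical `local_clustering` helper; it also assumes the common convention that a node with fewer than two friends gets a coefficient of 0.

```python
from itertools import combinations

# Hypothetical toy network: A's friends are B, C and D; only B and C know each other.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

def local_clustering(adj, node):
    """Fraction of pairs of `node`'s friends that are themselves friends."""
    nbrs = adj[node]
    if len(nbrs) < 2:
        return 0.0  # assumed convention: no pairs of friends means coefficient 0
    pairs = list(combinations(sorted(nbrs), 2))
    linked = sum(1 for u, v in pairs if v in adj[u])
    return linked / len(pairs)

# Average CC: mean of the local coefficients over all nodes.
average_cc = sum(local_clustering(adj, n) for n in adj) / len(adj)

print(local_clustering(adj, "A"))  # 1 of 3 friend pairs is linked
print(average_cc)
```

Here node A has three pairs of friends (B-C, B-D, C-D) and only B-C is connected, so its local coefficient is 1/3.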

* Facebook 2011: High average CC (decreases with degree)
* Microsoft Instant Messenger: Average CC of 0.13
* IMDB actor network: Average CC 0.78

In a random graph, the average clustering coefficient would be much smaller.
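
To get a feel for how small, one can generate a random graph and measure it. This is a rough sketch using `nx.gnm_random_graph` with illustrative sizes (1000 nodes and 3000 edges, chosen here for speed and not taken from the datasets above). In a graph whose edges are placed uniformly at random, the clustering coefficient is expected to be close to the edge density.

```python
import networkx as nx

n, m = 1000, 3000  # illustrative sizes: average degree 2m/n = 6
G_rand = nx.gnm_random_graph(n, m, seed=42)

# Fraction of all possible node pairs that are connected.
density = 2 * m / (n * (n - 1))
cc_rand = nx.average_clustering(G_rand)

print("edge density:", density)
print("average CC:  ", cc_rand)
```

Both numbers come out far below the 0.13 and 0.78 reported for the real networks listed above.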

* Social networks tend to have (1) a high clustering coefficient and (2) a small average path length.

![](images/4-3.4.png)

## Path Length and Clustering

* Social networks tend to have high clustering coefficient and small average path length
* Can we think of a network generative model that has these two properties?

* How about the Preferential Attachment Model?



```python
import networkx as nx

# Preferential attachment: 1000 nodes, each new node attaches with m=4 edges.
G = nx.barabasi_albert_graph(1000, 4)
print("average CC:", nx.average_clustering(G))
print("shortest path:", nx.average_shortest_path_length(G))
```

```
average CC: 0.03421348054868503
shortest path: 3.1736976976976976
```


This network has a short average shortest path (about 3.2) but a very small average CC (about 0.03).

What if we vary the number of nodes (n) or the number of edges per new node (m)?

![](images/4-3.5.png)

* Small average shortest path: high degree nodes act as hubs and connect many pairs of nodes
* For a fixed m, clustering coefficient becomes very small as the number of nodes increases
* No mechanism in the Preferential Attachment model favors triangle formation.
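
The trend for a fixed m can be reproduced with a small sweep. The sketch below is illustrative only: the values of n and the seed are assumptions chosen to keep the run short, and exact numbers will vary between seeds and NetworkX versions.

```python
import networkx as nx

m = 4  # edges attached by each new node
results = {}
for n in (100, 400, 1600):
    G = nx.barabasi_albert_graph(n, m, seed=0)
    # Store (average clustering coefficient, average shortest path length).
    results[n] = (nx.average_clustering(G), nx.average_shortest_path_length(G))
    print(n, results[n])
```

The clustering coefficient should shrink as n grows while the average shortest path stays small, which is the behaviour described in the bullets above.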

## Small World Model

**Motivation**: Real networks exhibit high clustering coefficient and small average shortest paths. Can we think of a model that achieves both of these properties?

**Small-world model**:

* Start with a ring of n nodes, where each node is connected to its k nearest neighbors
* Fix a parameter $p \in [0,1]$
* Consider each edge $(u,v)$. With probability $p$, select a node $w$ at random and rewire the edge $(u,v)$ so that it becomes $(u,w)$.
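
The three steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the NetworkX implementation: the `small_world` function name is hypothetical, k is assumed even with n > k, and the new endpoint w is restricted to nodes that would not create a self-loop or a duplicate edge.

```python
import random

def small_world(n, k, p, seed=None):
    """Sketch of the rewiring procedure; returns a dict of adjacency sets."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    ring_edges = []
    # Step 1: ring lattice, each node linked to k/2 neighbors on each side.
    for u in range(n):
        for step in range(1, k // 2 + 1):
            v = (u + step) % n
            adj[u].add(v)
            adj[v].add(u)
            ring_edges.append((u, v))
    # Steps 2-3: visit each original edge (u, v) once; rewire with probability p.
    for u, v in ring_edges:
        if rng.random() < p:
            # Assumption: w must differ from u and not already be u's neighbor.
            candidates = [w for w in range(n) if w != u and w not in adj[u]]
            if not candidates:
                continue  # u is already linked to every other node
            w = rng.choice(candidates)
            adj[u].discard(v)
            adj[v].discard(u)
            adj[u].add(w)
            adj[w].add(u)
    return adj

# Same parameters as the example that follows: 12 nodes, k=2, p=0.4.
adj = small_world(12, 2, 0.4, seed=1)
print({u: sorted(nbrs) for u, nbrs in adj.items()})
```

Because each rewiring removes one edge and adds one, the total number of edges stays at n·k/2 regardless of p.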

**Example**

k=2, p=0.4



```python
%matplotlib widget
import matplotlib.pyplot as plt
import networkx as nx

# Starting ring: 12 nodes, each connected to its 2 nearest neighbors (k=2).
plt.figure(1)
G = nx.Graph()
G.add_edges_from([('A','B'),('B','C'),('C','D'),('D','E'),('E','F'),('F','G'),('G','H'),('H','I'),('I','J'),('J','K'),('K','L'),('L','A')])
pos = nx.spring_layout(G)
nx.draw_networkx(G, pos)
```

```python
# Edge (J, K) rewired to (G, K).
plt.figure(2)
G1 = nx.Graph()
G1.add_edges_from([('A','B'),('B','C'),('C','D'),('D','E'),('E','F'),('F','G'),('G','H'),('H','I'),('I','J'),('G','K'),('K','L'),('L','A')])
pos = nx.spring_layout(G1)
nx.draw_networkx(G1, pos)
```

```python
# Edges (J, K) -> (G, K) and (H, I) -> (B, I) rewired.
plt.figure(3)
G1 = nx.Graph()
G1.add_edges_from([('A','B'),('B','C'),('C','D'),('D','E'),('E','F'),('F','G'),('G','H'),('B','I'),('I','J'),('G','K'),('K','L'),('L','A')])
pos = nx.spring_layout(G1)
nx.draw_networkx(G1, pos)
```

```python
# Edges (J, K) -> (G, K) and (E, F) -> (J, F) rewired.
plt.figure(4)
G1 = nx.Graph()
G1.add_edges_from([('A','B'),('B','C'),('C','D'),('D','E'),('J','F'),('F','G'),('G','H'),('H','I'),('I','J'),('G','K'),('K','L'),('L','A')])
pos = nx.spring_layout(G1)
nx.draw_networkx(G1, pos)
```

```python
# Edges (J, K) -> (G, K), (E, F) -> (J, F) and (C, D) -> (H, D) rewired.
plt.figure(5)
G1 = nx.Graph()
G1.add_edges_from([('A','B'),('B','C'),('H','D'),('D','E'),('J','F'),('F','G'),('G','H'),('H','I'),('I','J'),('G','K'),('K','L'),('L','A')])
pos = nx.spring_layout(G1)
nx.draw_networkx(G1, pos)
```

Rewire

```python
# Edges (J, K) -> (G, K), (E, F) -> (J, F) and (B, C) -> (J, C) rewired.
plt.figure(6)
G1 = nx.Graph()
G1.add_edges_from([('A','B'),('J','C'),('C','D'),('D','E'),('J','F'),('F','G'),('G','H'),('H','I'),('I','J'),('G','K'),('K','L'),('L','A')])
pos = nx.spring_layout(G1)
nx.draw_networkx(G1, pos)
```

## Small World Model

![](images/4-3.6.png)

**Regular Lattice** (p=0): no edge is rewired.

**Random Network** (p=1): all edges are rewired.

**Small World Network** (0 < p < 1): some edges are rewired. The network conserves some local structure but has some randomness.

![](images/4-3.7.png)

What are the average clustering coefficient and average shortest path of a small world network?

It depends on parameters $k$ and $p$.

As $p$ increases from 0 to 0.01:

* Average shortest path decreases rapidly.
* Average clustering coefficient decreases slowly.

For example, one instance of a network with 1000 nodes, k=6, and p=0.04 has:

* 8.99 average shortest path.
* 0.53 average clustering coefficient.
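
The dependence on $p$ can be explored with a short sweep. The sketch below makes several assumptions for convenience: it uses 500 nodes instead of 1000 to keep the run short, a fixed seed, and `connected_watts_strogatz_graph` so that the average shortest path is always defined. The numbers it prints will therefore differ from the instance quoted above.

```python
import networkx as nx

n, k = 500, 6
summary = {}
for p in (0.0, 0.01, 0.04, 1.0):
    # The connected variant retries until the generated graph is connected.
    G = nx.connected_watts_strogatz_graph(n, k, p, tries=100, seed=7)
    # Store (average shortest path length, average clustering coefficient).
    summary[p] = (nx.average_shortest_path_length(G), nx.average_clustering(G))
    print(p, summary[p])
```

At p=0 the graph is the regular ring lattice, whose clustering coefficient for k=6 is exactly 0.6. A few rewired edges shorten paths sharply while leaving most triangles intact, and at p=1 clustering collapses toward that of a random graph.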


## Small World Model in NetworkX

`watts_strogatz_graph(n,k,p)` returns a small world network with n nodes, starting with a lattice with each node connected to its k nearest neighbors, and rewiring probability p.



```python
import matplotlib.pyplot as plt
import networkx as nx

G = nx.watts_strogatz_graph(1000, k=6, p=0.04)

# Fraction of nodes having each degree value that occurs in the graph.
degrees = dict(G.degree())
degree_values = sorted(set(degrees.values()))
histogram = [
    list(degrees.values()).count(d) / G.number_of_nodes() for d in degree_values
]

plt.figure(figsize=(8, 8))
plt.bar(degree_values, histogram)
plt.xlabel('Degree')
plt.ylabel('Fraction of Nodes')
plt.show()
```

The rewiring probability is very small, so most edges are not rewired and most nodes keep the degree of 6 they had when the ring was first formed. And because no mechanism lets some nodes accumulate a very large degree, none of them do.

* No power law degree distribution.
* Since most edges are not rewired, most nodes have a degree of 6.
* Since edges are rewired uniformly at random, no node accumulates a very high degree, unlike in the preferential attachment model.
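
The contrast with preferential attachment can be checked numerically. In this sketch the seed and the choice of m=3 for the preferential attachment graph (which gives roughly the same number of edges as k=6) are assumptions made for illustration.

```python
from collections import Counter

import networkx as nx

n = 1000
ws = nx.watts_strogatz_graph(n, k=6, p=0.04, seed=3)
ba = nx.barabasi_albert_graph(n, 3, seed=3)  # m=3: similar edge count to k=6

# Count how many nodes have each degree in the small world graph.
ws_degrees = Counter(d for _, d in ws.degree())
ws_max = max(ws_degrees)
ba_max = max(d for _, d in ba.degree())

print("small world degree counts:", sorted(ws_degrees.items()))
print("max degree (small world, preferential attachment):", ws_max, ba_max)
```

The small world degrees cluster tightly around 6, while the preferential attachment graph contains hubs with far larger degree.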

Variants of the small world model in NetworkX:

* Small world networks can be disconnected, which is sometimes undesirable.

`connected_watts_strogatz_graph(n, k, p, t)` runs `watts_strogatz_graph(n, k, p)` up to t times, until it returns a connected small world network.

* `newman_watts_strogatz_graph(n, k, p)` runs a model similar to the small world model, but rather than rewiring edges, new edges are added with probability $p$.
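
A short usage sketch of both variants follows. The parameter values and seed are arbitrary choices for illustration. Note that in NetworkX the retry count written as t above is passed as the keyword `tries`.

```python
import networkx as nx

n, k, p = 200, 4, 0.1

# Retries up to `tries` times; raises NetworkXError if no connected graph is found.
G_conn = nx.connected_watts_strogatz_graph(n, k, p, tries=50, seed=11)

# Adds shortcut edges instead of rewiring, so no ring edge is ever removed.
G_nws = nx.newman_watts_strogatz_graph(n, k, p, seed=11)

print("connected:", nx.is_connected(G_conn))
print("edges:", G_conn.number_of_edges(), G_nws.number_of_edges())
```

Rewiring keeps the edge count at n·k/2, whereas the Newman-Watts-Strogatz variant can only add edges, so it has at least that many and always stays connected.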

# Summary

* Real social networks appear to have small shortest paths between nodes and high clustering coefficient.
* The preferential attachment model produces networks with small shortest paths but very small clustering coefficient.
* The small world model starts with a ring lattice with nodes connected to k nearest neighbors (high local clustering), and it rewires edges with probability p.
* For small values of p, small world networks have small average shortest path and high clustering coefficient, matching what we observe in real networks.
* However, the degree distribution of small world networks is not a power law.
* In NetworkX, you can use `watts_strogatz_graph(n, k, p)` and other variants to produce small world networks.
