<a href="https://colab.research.google.com/github/JordanDCunha/On-Complexity/blob/main/Chapter6.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 6.1 Introduction

- This chapter analyzes data from an **online social network**.

- A **Watts–Strogatz (WS) graph** is used as an initial model:
  - WS graphs have **small-world properties** similar to real networks
  - They exhibit:
    - Short path lengths
    - High clustering
  - However, WS graphs have **low variability in node degree**
    - Most nodes have a similar number of neighbors
    - This does not match real social network data

- This limitation motivates the **Barabási–Albert (BA) model**:
  - BA graphs capture **high variability in node degree**
  - They model the presence of **highly connected hubs**
  - BA graphs have:
    - Short path lengths
    - But **low clustering**, unlike true small-world networks

- The chapter compares:
  - WS graphs
  - BA graphs
  - Their strengths and limitations as **explanatory models** for small-world networks

- Code for this chapter is provided in:
  - `chap06.ipynb`
  - Instructions for using the code are in **Section 1.4**


## 6.2 Social Network Data

- Watts–Strogatz (WS) graphs were designed to model **real-world networks**
  - Examples from the original paper:
    - Film actors
    - Power grids
    - Neural networks (*C. elegans*)

- These networks showed:
  - **High clustering**
  - **Short average path lengths**
  - Key properties of **small-world networks**

- In this section, we analyze a **Facebook social network** dataset
  - Nodes represent users
  - Edges represent friendship connections


## Facebook Dataset (SNAP)

- Source:
  - **Stanford Network Analysis Project (SNAP)**

- Dataset details:
  - 4,039 users (nodes)
  - 88,234 friendships (edges)
  - Users labeled from `0` to `4038`

- File format:
  - One edge per line
  - Each line contains two user IDs


In [None]:
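import networkx as nx
import numpy as np

def read_graph(filename):
    """Read an edge list (one 'u v' pair per line) into an undirected graph."""
    G = nx.Graph()
    # loadtxt decompresses the file first when the name ends in .gz
    array = np.loadtxt(filename, dtype=int)
    G.add_edges_from(array)
    return G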


## read_graph Function Explanation

- Purpose:
  - Read an edge list from a file
  - Construct a NetworkX graph

- Steps:
  - Create an empty graph
  - Load the file into a NumPy array using `loadtxt`
  - Add edges to the graph using `add_edges_from`

- `dtype=int`:
  - Ensures node labels are integers
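
To make the file format concrete, here is a minimal sketch of the same steps in plain Python, without NetworkX or NumPy. The name `read_edge_list`, the adjacency-dict representation, and the sample lines are assumptions made for illustration; they are not part of the notebook.

```python
from collections import defaultdict

def read_edge_list(lines):
    """Build an undirected adjacency map from lines holding two integer IDs."""
    adj = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        u, v = int(parts[0]), int(parts[1])
        adj[u].add(v)  # store the edge in both directions
        adj[v].add(u)
    return adj

# Hypothetical four-line edge list in the same "u v" format
sample = ["0 1", "0 2", "1 2", "2 3"]
adj = read_edge_list(sample)

n_nodes = len(adj)
n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2  # each edge is stored twice
print(n_nodes, n_edges)
```

Dividing by two in the edge count reflects the fact that every undirected edge appears in the neighbor set of both of its endpoints.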


In [None]:
fb = read_graph('facebook_combined.txt.gz')
n = len(fb)
m = len(fb.edges())
n, m


## Dataset Size Verification

- `n`:
  - Number of nodes in the graph
  - Result: **4039**

- `m`:
  - Number of edges (friendships)
  - Result: **88,234**

- These values match the dataset documentation


## Small-World Properties to Measure

To determine whether the Facebook network is a small world, we compute:

- **Clustering coefficient (C)**
  - Measures how tightly nodes cluster together

- **Average path length (L)**
  - Measures how many steps it takes to connect two users


## Efficient Clustering Estimation

- Exact clustering is slow for large graphs
  - Time complexity: proportional to $n k^2$
    - `n`: number of nodes
    - `k`: average number of neighbors

- NetworkX provides an **approximation** method using random sampling
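
The cost estimate comes from the definition: for each of the `n` nodes, the exact computation checks every pair of its roughly `k` neighbors. A minimal sketch in plain Python, assuming the graph is stored as a dict of neighbor sets (`node_clustering`, `exact_average_clustering`, and the toy graph are illustrative names, not notebook code):

```python
from itertools import combinations

def node_clustering(adj, u):
    """Fraction of pairs of u's neighbors that are themselves connected."""
    nbrs = adj[u]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention used here: fewer than two neighbors counts as 0
    linked = sum(1 for v, w in combinations(nbrs, 2) if w in adj[v])
    return linked / (k * (k - 1) / 2)  # k(k-1)/2 possible pairs per node

def exact_average_clustering(adj):
    """Average the per-node clustering over all nodes."""
    return sum(node_clustering(adj, u) for u in adj) / len(adj)

# Toy graph: a triangle (0, 1, 2) with one extra node hanging off node 2
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(exact_average_clustering(toy))
```

The inner loop over neighbor pairs is the `k^2` factor; the outer average over nodes is the factor of `n`.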


In [None]:
from networkx.algorithms.approximation import average_clustering
average_clustering(fb, trials=1000)


## average_clustering Explanation

- Estimates the **network average clustering coefficient**
- Uses random sampling for efficiency
- `trials=1000`:
  - Number of samples used in the estimate
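
The sampling idea can be sketched as follows: pick a random node, pick two of its neighbors at random, and record whether those two are connected. The fraction of "yes" answers estimates the average clustering coefficient. This is a plain-Python illustration of that general idea, not a copy of the NetworkX source; `approx_average_clustering` and the dict-of-sets graph format are assumptions.

```python
import random

def approx_average_clustering(adj, trials=1000, seed=None):
    """Estimate average clustering by sampling node/neighbor-pair triples."""
    rng = random.Random(seed)
    nodes = list(adj)
    hits = 0
    for _ in range(trials):
        node = rng.choice(nodes)
        nbrs = list(adj[node])
        if len(nbrs) < 2:
            continue  # this trial contributes 0, matching the exact convention
        u, v = rng.sample(nbrs, 2)
        if u in adj[v]:
            hits += 1
    return hits / trials

# Two extreme cases with known answers
complete = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(approx_average_clustering(complete, seed=1))
print(approx_average_clustering(star, seed=1))
```

Each trial costs about the same regardless of graph size, so the total work depends on `trials` rather than on `n` and `k`.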


In [None]:
def sample_path_lengths(G, nodes=None, trials=1000):
    if nodes is None:
        nodes = list(G)
    else:
        nodes = list(nodes)

    pairs = np.random.choice(nodes, (trials, 2))
    lengths = [nx.shortest_path_length(G, *pair)
               for pair in pairs]
    return lengths


## sample_path_lengths Explanation

- Inputs:
  - `G`: the graph
  - `nodes`: nodes to sample from
    - If `None`, sample from the entire graph
  - `trials`: number of random node pairs

- Process:
  - Randomly select pairs of nodes
  - Compute shortest path length for each pair

- Output:
  - List of sampled path lengths
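
For an unweighted graph, the shortest path length between two nodes can be found with breadth-first search. The sketch below shows that technique in plain Python on a dict-of-sets graph; the function name and the `None` return value for unreachable nodes are choices made for this example, and NetworkX's own function behaves differently in that case (it raises an exception).

```python
from collections import deque

def bfs_path_length(adj, source, target):
    """Number of edges on a shortest path, or None if target is unreachable."""
    if source == target:
        return 0
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                if v == target:
                    return dist[v]  # first visit in BFS order is a shortest path
                queue.append(v)
    return None

# A path 0-1-2-3 plus an isolated node 4
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}, 4: set()}
print(bfs_path_length(path, 0, 3))
print(bfs_path_length(path, 0, 4))
```

Because the pairs are drawn with replacement, a sampled pair can contain the same node twice; such a pair contributes a length of 0 to the sample.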


In [None]:
def estimate_path_length(G, nodes=None, trials=1000):
    return np.mean(sample_path_lengths(G, nodes, trials))


## estimate_path_length Explanation

- Purpose:
  - Estimate the **average path length (L)**

- Method:
  - Sample random shortest paths
  - Return the mean path length


In [None]:
C = average_clustering(fb)
L = estimate_path_length(fb)
C, L


## Results and Interpretation

- Clustering coefficient:
  - **C ≈ 0.61**
  - Indicates strong local clustering

- Average path length:
  - **L ≈ 3.7**
  - Very short for a network with over 4000 users

- Conclusion:
  - The Facebook network exhibits **small-world behavior**


## Next Step

- Goal:
  - Construct a **Watts–Strogatz graph**
  - Match the observed:
    - Clustering coefficient
    - Average path length

- This will allow comparison between:
  - Real social network data
  - WS model behavior


## 6.3 WS Models

- Goal:
  - Build **Watts–Strogatz (WS) graphs** that resemble the Facebook network
  - Match two small-world properties:
    - **High clustering (C)**
    - **Low average path length (L)**

- We start by matching the **average degree** of the Facebook dataset


In [None]:
k = int(round(2 * m / n))
k


## Average Degree Calculation

- Given:
  - `n`: number of nodes
  - `m`: number of edges

- Each edge connects **two nodes**, so:
  $$
  k \approx \frac{2m}{n}
  $$

- Result:
  - `k = 44`
  - Each node will have ~44 neighbors in the WS model
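
The arithmetic behind this value, using the node and edge counts reported for the dataset, can be checked directly:

```python
n = 4039      # nodes in the Facebook dataset
m = 88234     # edges in the Facebook dataset

degree_sum = 2 * m             # each edge is counted once at each endpoint
mean_degree = degree_sum / n   # average number of neighbors per node
k = int(round(mean_degree))    # nearest integer, used as k for the WS model
print(mean_degree, k)
```

Rounding to an integer is needed because the WS generator takes a whole number of neighbors per node.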


## WS Graph with p = 0 (Ring Lattice)

- When `p = 0`:
  - No edges are rewired
  - The graph is a **regular ring lattice**


In [None]:
lattice = nx.watts_strogatz_graph(n, k, 0)


## Ring Lattice Results

- Clustering coefficient:
  - **C ≈ 0.73** (for a ring lattice the exact value is 3(k − 2) / (4(k − 1)), about 0.733 when k = 44)
  - Higher than the Facebook dataset (0.61)

- Average path length:
  - **L ≈ 46**
  - Much larger than the dataset (3.7)

- Conclusion:
  - High clustering ✔
  - Path length too long ✘


## WS Graph with p = 1 (Random Graph)

- When `p = 1`:
  - All edges are randomly rewired
  - The graph behaves like a **random graph**


In [None]:
random_graph = nx.watts_strogatz_graph(n, k, 1)


## Random Graph Results

- Average path length:
  - **L ≈ 2.6**
  - Even shorter than the dataset

- Clustering coefficient:
  - **C ≈ 0.011**
  - Extremely low

- Conclusion:
  - Short paths ✔
  - Clustering far too low ✘


## Choosing an Intermediate p Value

- Idea:
  - Find a value of `p` that balances:
    - Short path length
    - High clustering

- By trial and error:
  - `p = 0.05` works well


In [None]:
ws = nx.watts_strogatz_graph(n, k, 0.05, seed=15)


## WS Graph with p = 0.05 Results

- Clustering coefficient:
  - **C ≈ 0.63**
  - Very close to Facebook (0.61)

- Average path length:
  - **L ≈ 3.2**
  - Close to Facebook (3.7)

- On these two measures, this WS graph closely matches the dataset


## Final Interpretation

- Ring lattice:
  - High clustering
  - Very long paths

- Random graph:
  - Very short paths
  - Almost no clustering

- WS graph (p = 0.05):
  - High clustering
  - Short paths

- Conclusion:
  - **Watts–Strogatz graphs successfully reproduce the clustering and path length of a small-world network**
  - A small amount of random rewiring is enough to shorten paths while keeping clustering high


## 6.4 Degree

- If the WS graph is a good model of the Facebook network, it should match:
  - The **average degree**
  - The **variability (spread) of degrees** across nodes

- Looking only at the mean is not enough
  - We also need to examine the **distribution of degrees**


In [None]:
def degrees(G):
    return [G.degree(u) for u in G]


## Degree Function Explanation

- Input:
  - `G`: a NetworkX graph

- Output:
  - A list of node degrees
  - One degree value per node

- This list allows us to:
  - Compute statistics (mean, standard deviation)
  - Build degree distributions


## Mean Degree Comparison

- WS model:
  - Mean degree ≈ **44**

- Facebook dataset:
  - Mean degree ≈ **43.7**

- Conclusion:
  - The **average degree matches well**


## Standard Deviation Comparison

- WS model:
  - Standard deviation ≈ **1.47**

- Facebook dataset:
  - Standard deviation ≈ **52.4**

- Conclusion:
  - The WS model fails to capture the **degree variability**


## Why Mean and Std Are Not Enough

- Two graphs can have:
  - The same mean degree
  - Very different structures

- To see the real difference:
  - We must look at the **entire degree distribution**


## Probability Mass Function (PMF)

- We represent degree distributions using a **PMF**
- PMF = Probability Mass Function

- A PMF maps:
  - Degree value → fraction of nodes with that degree


In [None]:
G = nx.Graph()
G.add_edge(1, 0)
G.add_edge(2, 0)
G.add_edge(3, 0)

nx.draw(G)


## Example Graph Explanation

- Node `0` is connected to:
  - Nodes `1`, `2`, and `3`

- Degrees:
  - Node 0 → degree 3
  - Nodes 1, 2, 3 → degree 1


In [None]:
degrees(G)


## Degree List Interpretation

- Degree list:
  - `[1, 3, 1, 1]` (one entry per node, in the order the nodes were added: `1, 0, 2, 3`)

- Meaning:
  - One node has degree 3
  - Three nodes have degree 1


In [None]:
from thinkstats2 import Pmf

Pmf(degrees(G))


## PMF Interpretation (Example)

- PMF result:
  - Degree `1` → probability `0.75`
  - Degree `3` → probability `0.25`

- Interpretation:
  - 75% of nodes have degree 1
  - 25% of nodes have degree 3


## Degree Distribution of Facebook Network


In [None]:
pmf_fb = Pmf(degrees(fb))
pmf_fb.Mean(), pmf_fb.Std()


## Facebook Degree Statistics

- Mean degree ≈ **43.69**
- Standard deviation ≈ **52.41**

- Interpretation:
  - Huge variability in number of friends


## Degree Distribution of WS Model


In [None]:
pmf_ws = Pmf(degrees(ws))
pmf_ws.Mean(), pmf_ws.Std()


## WS Degree Statistics

- Mean degree ≈ **44**
- Standard deviation ≈ **1.47**

- Interpretation:
  - Most nodes have almost the same number of neighbors


## Plotting the Degree Distributions


In [None]:
import thinkplot

thinkplot.Pdf(pmf_fb, label='Facebook')
thinkplot.Pdf(pmf_ws, label='WS graph')


## Interpreting Figure 6.1

- WS model:
  - Degrees tightly clustered around ~44
  - Very little variation

- Facebook dataset:
  - Many users with very few friends
  - A small number of users with **extremely many friends**

- This type of distribution is called:
  - **Heavy-tailed**


## Key Takeaways

- WS graphs:
  - Match average degree ✔
  - Fail to match degree variability ✘

- Real social networks:
  - Have **heavy-tailed degree distributions**
  - A few hubs, many low-degree nodes

- Motivation:
  - We need a new model that explains this variability
  - → Leads to the **Barabási–Albert (BA) model**


## 6.5 Heavy-tailed Distributions

- Heavy-tailed distributions are common in:
  - Social networks
  - Biology
  - Economics
  - Complexity science

- They will appear repeatedly throughout this book


## Visualizing Heavy Tails

- Heavy-tailed distributions are hard to interpret on a linear scale
- A clearer picture appears when we:
  - Plot the distribution on a **log–log scale**

- This transformation:
  - Emphasizes the **tail**
  - Highlights rare but extreme values


## Log–Log Scale Intuition

- On a log–log plot:
  - The x-axis shows `log(k)`
  - The y-axis shows `log(PMF(k))`

- Large values of `k` become easier to compare
- Small probabilities become visible


In [None]:
thinkplot.Pdf(pmf_fb, label='Facebook')
thinkplot.Pdf(pmf_ws, label='WS graph')

thinkplot.Config(xscale='log', yscale='log',
                 xlabel='Degree (k)',
                 ylabel='PMF(k)',
                 legend=True)


## Figure 6.2 Interpretation

- Facebook network:
  - The distribution extends far to the right
  - Indicates the presence of **hubs**
  - A small number of nodes have extremely high degree

- WS model:
  - Distribution drops off quickly
  - Very few high-degree nodes


## Power Law Distributions

- A distribution follows a **power law** if:

  $$
  \text{PMF}(k) \sim k^{-\alpha}
  $$

- Where:
  - `PMF(k)` = fraction of nodes with degree `k`
  - `α` = positive constant
  - `~` means “asymptotically proportional to”


## Log Transformation of a Power Law

- Taking the logarithm of both sides:

  $$
  \log(\text{PMF}(k)) \sim -\alpha \log(k)
  $$

- Result:
  - A **straight line** on a log–log plot
  - Slope = **−α**


## What Figure 6.2 Suggests

- For large values of `k`:
  - Facebook degree distribution is approximately linear on log–log axes

- This suggests:
  - A **power law relationship**
  - Or at least a heavy-tailed distribution


## Important Distinction

- All **power law** distributions are heavy-tailed
- But:
  - Not all heavy-tailed distributions are power laws

- Heavy-tailed means:
  - Extreme values occur far more often than they would under a light-tailed distribution such as the normal or exponential


## Model Mismatch Problem

- WS model:
  - ✔ High clustering
  - ✔ Short path lengths
  - ✘ Narrow degree distribution

- Facebook data:
  - ✔ High clustering
  - ✔ Short path lengths
  - ✔ Heavy-tailed degree distribution


## Motivation for a New Model

- WS graphs fail to explain:
  - Why some nodes become extremely well connected

- This mismatch motivates:
  - A model that explains **degree variability**
  - Especially the emergence of hubs


## Transition to the Next Section

- This leads to the **Barabási–Albert (BA) model**
- The BA model explains:
  - Heavy-tailed degree distributions
  - Through a process called **preferential attachment**


## 6.6 Barabási–Albert (BA) Model

- Introduced by **Barabási and Albert (1999)**
- Paper: *“Emergence of Scaling in Random Networks”*

- Studied real-world networks such as:
  - Movie actor collaboration graphs
  - The World Wide Web
  - Electrical power grids

- Found that many real networks have:
  - **Heavy-tailed degree distributions**


## Degree Distribution Analysis

- Measure:
  - Degree `k` of each node
  - `PMF(k)`: probability a node has degree `k`

- Plot:
  - `PMF(k)` vs `k` on a **log–log scale**

- Observation:
  - Straight line for large `k`
  - Indicates a **heavy-tailed** distribution


## Key Features of the BA Model

The BA model differs from the WS model in three key ways:


### 1. Growth

- The graph does **not** start with a fixed set of nodes
- It begins with a small initial graph
- Nodes are added **one at a time**


### 2. Preferential Attachment

- New nodes prefer to attach to:
  - Nodes that already have many connections

- This is known as:
  - **“Rich get richer”**
  - Or **preferential attachment**


### 3. Power-Law Degree Distribution

- The resulting degree distribution:
  - Obeys a **power law**
  - Is heavy-tailed

- Networks with this property are often called:
  - **Scale-free networks**


## Generating a BA Graph with NetworkX


In [None]:
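ba = nx.barabasi_albert_graph(n=4039, m=22)  # NetworkX names the edges-per-new-node parameter m


## BA Graph Parameters

- `n = 4039`
  - Number of nodes (same as Facebook dataset)

- `m = 22` (called `k` in the rest of this chapter)
  - Number of edges each new node attaches with
  - Half the target average degree of 44, because each new edge raises the degree of two nodes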


## Degree Statistics Comparison

- Average degree in BA model:
  - Approximately **44**
  - Very close to the dataset average

- Degree standard deviation:
  - Lower than Facebook
  - Much higher than WS model
  - A significant improvement over WS


## Degree Distribution (Log–Log Plot)

- BA model distribution:
  - Closely matches Facebook data in the tail
  - Appears linear on a log–log plot

- Deviations occur:
  - For small values of `k`
  - Common limitation of the BA model


## Small World Properties of BA Model

- Average path length (L):
  - Very small
  - Even smaller than Facebook network

- This confirms:
  - BA graphs have **short path lengths**


## Clustering Limitation

- Clustering coefficient (C):
  - Much lower than Facebook data
  - Significantly worse than WS model

- Conclusion:
  - BA model fails to reproduce **high clustering**


## Model Comparison Summary

| Property | Facebook | WS Model | BA Model |
|--------|----------|---------|---------|
| Path Length (L) | Short | Short | Very Short |
| Clustering (C) | High | High | Low |
| Degree Distribution | Heavy-tailed | Narrow | Heavy-tailed |


## Key Takeaway

- WS model:
  - Explains small-world structure
  - Fails to explain degree variability

- BA model:
  - Explains degree variability and hubs
  - Fails to explain clustering

- No single model captures everything


## Fill-in-the-Blank Answer (6.6.1)

The three features that distinguish a BA model from a WS model are:

**growth**, **preferential attachment**, and a degree distribution that obeys a **power law**.


## 6.7 Generating BA Graphs

- Previously, BA graphs were generated using NetworkX
- Now we examine **how the Barabási–Albert model works internally**
- This version of the algorithm is simplified for readability


In [None]:
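import networkx as nx

def barabasi_albert_graph(n, k):
    """Simplified BA generator: n nodes in total, k edges per new node."""
    G = nx.empty_graph(k)         # start with k isolated nodes
    targets = list(range(k))      # the first new node links to all of them
    repeated_nodes = []           # each node appears once per incident edge

    for source in range(k, n):
        # connect the new node to its k targets
        G.add_edges_from(zip([source]*k, targets))

        # record both endpoints of every new edge
        repeated_nodes.extend(targets)
        repeated_nodes.extend([source] * k)

        # choose k distinct targets for the next node (_random_subset is defined below)
        targets = _random_subset(repeated_nodes, k)

    return G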


## Function Parameters

- `n`
  - Total number of nodes in the final graph

- `k`
  - Number of edges each new node creates
  - Equals roughly **half the average degree** of the final graph, because each edge adds to the degree of two nodes


## Initial Graph Setup

- Start with:
  - `k` nodes
  - No edges

- This ensures:
  - New nodes always have existing nodes to connect to


## Key Data Structures

### `targets`
- List of nodes the next new node will connect to
- Initially:
  - The original `k` nodes
- Later:
  - A random subset chosen using preferential attachment

### `repeated_nodes`
- A list where:
  - Each node appears once for every edge attached to it
- Nodes with higher degree appear more often
- Enables **preferential attachment**


## Main Loop Explanation

- Loop over new nodes from `k` to `n-1`

For each new node (`source`):

1. Add edges to all nodes in `targets`
2. Update `repeated_nodes`:
   - Add each target once
   - Add the new node `k` times
3. Select new targets using `_random_subset`


## Why Preferential Attachment Works

- New targets are chosen from `repeated_nodes`
- Nodes with higher degree:
  - Appear more often
  - Have higher probability of being selected

- Result:
  - **“Rich get richer”** behavior


In [None]:
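import random

def _random_subset(repeated_nodes, k):
    """Draw k distinct nodes; nodes listed more often are more likely to be picked."""
    targets = set()
    while len(targets) < k:
        x = random.choice(repeated_nodes)
        targets.add(x)   # adding a node that is already in the set has no effect
    return targets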


## _random_subset Explanation

- Randomly selects `k` **unique nodes**
- Uses a `set` to:
  - Automatically discard duplicates
- Sampling from `repeated_nodes` ensures:
  - Selection probability is proportional to node degree


## Summary of BA Graph Generation

- Nodes are added one at a time
- Each new node connects to `k` existing nodes
- Connection probability depends on existing degree
- Resulting graph:
  - Has a **heavy-tailed (power-law) degree distribution**
  - Models hub formation in real networks


## 6.8 Cumulative Distributions

- Degree distributions are often shown using **PMFs on log-log scales**
- This is common in power-law analysis, but it has drawbacks:
  - PMFs can be noisy
  - Tails are hard to interpret visually

- A better alternative is the **Cumulative Distribution Function (CDF)**


## Cumulative Distribution Function (CDF)

- A CDF maps a value `x` to:
  - The fraction of values **less than or equal to x**

- For degree distributions:
  - CDF(k) = fraction of nodes with degree ≤ k


In [None]:
def cumulative_prob(pmf, x):
    ps = [pmf[value] for value in pmf if value <= x]
    return np.sum(ps)


## cumulative_prob Explanation

- Inputs:
  - `pmf`: probability mass function
  - `x`: value to evaluate

- Process:
  - Sum probabilities for all values ≤ x

- Output:
  - Cumulative probability up to x


In [None]:
cumulative_prob(pmf_fb, 25)


## Interpreting the Result

- Result ≈ 0.506
- Meaning:
  - About **50% of users have 25 or fewer friends**
- Therefore:
  - The **median degree ≈ 25**


## Why CDFs Are Better Than PMFs

- CDFs are:
  - Smoother
  - Less noisy
  - Easier to interpret

- Once familiar, CDFs give a clearer picture of:
  - Distribution shape
  - Tail behavior


In [None]:
from thinkstats2 import Cdf
cdf_fb = Cdf(degrees(fb), label='Facebook')


## Computing the CDF

- `Cdf` takes a list of values (degrees)
- Produces a cumulative distribution object
- Labels are useful for plotting comparisons


In [None]:
thinkplot.Cdf(cdf_fb)


## Interpreting Figure 6.4 (CDF Plot)

- Shows degree CDF for:
  - Facebook dataset
  - WS model
  - BA model

- X-axis:
  - Logarithmic scale

- Observations:
  - WS model differs significantly from data
  - BA model is closer, but still imperfect


## Complementary CDF (CCDF)

- Defined as:
  - CCDF(x) = P(X > x)

- Useful because:
  - If PMF follows a power law
  - CCDF also follows a power law


## CCDF and Power Laws

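- For a power-law distribution:
  - CCDF(x) ∝ x^(−α)
  - This exponent is about one less than the exponent of the corresponding PMF, because summing a tail proportional to k^(−γ) gives a result roughly proportional to k^(−(γ − 1))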

- On a log-log plot:
  - CCDF appears as a straight line
  - Slope = −α
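
A minimal sketch of how a CCDF can be computed from a list of values, in plain Python. `make_ccdf` and the sample values are assumptions made for this example rather than notebook code.

```python
from collections import Counter

def make_ccdf(values):
    """Map each distinct value x to the fraction of values strictly greater than x."""
    counts = Counter(values)
    total = len(values)
    ccdf = {}
    remaining = total
    for value in sorted(counts):
        remaining -= counts[value]     # values still above the current one
        ccdf[value] = remaining / total
    return ccdf

example = [1, 1, 2, 4]
ccdf = make_ccdf(example)
print(ccdf)
```

The largest value always has a CCDF of 0, which has no logarithm, so that last point has to be dropped before plotting on a log–log scale.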


## Interpreting Figure 6.5 (CCDF Plot)

- CCDF plotted on a log-log scale
- Comparison:
  - Facebook data
  - WS model
  - BA model

- Observations:
  - WS model does **not** match the tail
  - BA model matches the tail reasonably well
    - Especially for degree > 20


## Key Takeaways

- CDFs and CCDFs are better tools than PMFs for:
  - Heavy-tailed distributions
  - Power-law analysis

- Results:
  - WS model fails to match degree distribution
  - BA model captures the tail behavior well


## 6.9 Explanatory Models

- We began studying networks with **Milgram’s Small World Experiment**
- Key observation:
  - Social networks have **surprisingly short path lengths**
  - Often described as *“six degrees of separation”*

- When we observe something surprising, a natural question is:
  - **Why does this happen?**


## What Is an Explanatory Model?

- One way to answer “why” questions is with an **explanatory model**
- Figure 6.6 illustrates the logical structure of such a model

- An explanatory model:
  - Does not claim to be a perfect copy of reality
  - Attempts to capture **essential features** of a system


## Logical Structure of an Explanatory Model

1. In a real system **S**, we observe a phenomenon **P** that needs explanation

2. We construct a model **M** that is analogous to **S**
   - Elements of **M** correspond to elements of **S**

3. Using simulation or mathematical analysis:
   - The model **M** exhibits a behavior **Q**
   - Behavior **Q** is analogous to **P**

4. We conclude:
   - **S exhibits P because M is similar to S and M exhibits Q**


## Argument by Analogy

- At its core, an explanatory model is an **argument by analogy**

- Logic:
  - If two systems are similar in some ways
  - They may be similar in other ways as well

- This kind of reasoning can be:
  - Intuitive
  - Insightful
  - Persuasive


## Important Limitation

- Explanatory models are **not mathematical proofs**
- They do not guarantee that the explanation is *true*
- They only show that the explanation is *plausible*


## Abstraction in Models

- All models:
  - Leave out details
  - “Abstract away” features considered unimportant

- For any real system:
  - Many different models are possible
  - Each model emphasizes different features


## Competing Explanations

- Different models may:
  - Explain the same phenomenon
  - Emphasize different mechanisms

- When multiple models explain the same observation:
  - Which one is correct?
  - Or are multiple explanations valid?


## Small World Phenomenon as an Example

- The **small world effect** has multiple explanations
- Two major models:
  - Watts–Strogatz (WS) model
  - Barabási–Albert (BA) model


## WS Model Explanation

- The WS model suggests networks are small because:
  - Nodes form **tightly clustered groups**
  - A few **weak ties** connect different clusters

- Result:
  - High clustering
  - Short average path lengths


## BA Model Explanation

- The BA model suggests networks are small because:
  - Some nodes become **high-degree hubs**
  - Hubs grow through **preferential attachment**

- Result:
  - Heavy-tailed degree distributions
  - Efficient global connectivity


## Final Insight

- In young scientific fields:
  - The problem is often **not too few explanations**
  - But **too many plausible explanations**

- Explanatory models help us think clearly
- But they must be:
  - Compared
  - Tested
  - Interpreted with care
