# Module 4: Watts-Strogatz Small-World Networks

In this notebook, you'll learn:
- What makes a network "small-world"
- The Watts-Strogatz model and rewiring algorithm
- How small-world properties emerge
- Why this matters for neural networks

In [None]:
import sys
sys.path.insert(0, '../..')

import numpy as np
import matplotlib.pyplot as plt
import networkx as nx

from src.topology import (
    watts_strogatz_graph,
    ring_lattice,
    clustering_coefficient,
    average_path_length,
    small_world_coefficient,
)
from src.visualization import plot_graph, plot_ws_rewiring

np.random.seed(42)

## 4.1 The Six Degrees of Separation

In 1967, Stanley Milgram conducted the "small-world experiment":
- People in Nebraska tried to get a letter to a target person in Boston
- Each person could only pass it on to someone they knew personally
- Among the letters that arrived, the chains averaged only about **6 steps**!

This is the **small-world phenomenon**: most pairs of nodes in real networks are connected by surprisingly short paths.

## 4.2 The Problem: Regular vs Random Networks

Consider two extremes:

**Regular Lattice (like a ring):**
- High clustering (friends of friends are friends)
- Long path lengths (takes many hops to reach distant nodes)

**Random Graph:**
- Low clustering (connections are random)
- Short path lengths (random shortcuts everywhere)

Many real networks have BOTH high clustering AND short paths. How?

In [None]:
# Let's compare a regular ring lattice with a random graph
n = 30
k = 4

# Ring lattice
adj_ring = ring_lattice(n, k)
cc_ring = clustering_coefficient(adj_ring)
apl_ring = average_path_length(adj_ring)

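# Random graph with the same number of nodes and edges as the ring lattice
n_edges = int(adj_ring.sum() / 2)  # each undirected edge appears twice in the adjacency matrix
G_random = nx.gnm_random_graph(n, n_edges, seed=42)  # seeded explicitly for reproducibility
adj_random = nx.to_numpy_array(G_random)
cc_random = clustering_coefficient(adj_random)
# A sparse random graph may be disconnected; disconnected_value covers that case
apl_random = average_path_length(adj_random, disconnected_value=float('inf'))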

print("Ring Lattice:")
print(f"  Clustering: {cc_ring:.4f}")
print(f"  Avg Path Length: {apl_ring:.2f}")
print("\nRandom Graph:")
print(f"  Clustering: {cc_random:.4f}")
print(f"  Avg Path Length: {apl_random:.2f}")
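
As a sanity check on the ring-lattice numbers, the clustering of a ring lattice has a closed form, C = 3(k-2) / (4(k-1)), valid when n is large relative to k. The sketch below uses networkx directly rather than the course's `src.topology` helpers (whose exact conventions are assumed, not verified here), and it also compares the measured path length against the rough large-n estimate n / (2k).

```python
import networkx as nx

n, k = 30, 4

# With p=0 no edge is rewired, so this returns the plain ring lattice.
ring = nx.watts_strogatz_graph(n, k, p=0)

c_measured = nx.average_clustering(ring)
c_formula = 3 * (k - 2) / (4 * (k - 1))

l_measured = nx.average_shortest_path_length(ring)
l_estimate = n / (2 * k)  # rough large-n approximation, not an exact value

print(f"Clustering: measured={c_measured:.4f}, formula={c_formula:.4f}")
print(f"Avg path length: measured={l_measured:.2f}, n/(2k) estimate={l_estimate:.2f}")
```

The path-length estimate is only approximate at this size, but it shows the important trend: in a lattice, path length grows linearly with n.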

In [None]:
# Visualize both
fig, axes = plt.subplots(1, 2, figsize=(14, 6))

# Ring lattice
G_ring = nx.from_numpy_array(adj_ring)
pos_ring = nx.circular_layout(G_ring)
nx.draw(G_ring, pos_ring, ax=axes[0], node_size=50, node_color='#1f77b4',
        edge_color='#cccccc', width=0.5)
axes[0].set_title(f'Ring Lattice\nC={cc_ring:.3f}, L={apl_ring:.1f}')

# Random graph
pos_random = nx.circular_layout(G_random)
nx.draw(G_random, pos_random, ax=axes[1], node_size=50, node_color='#ff7f0e',
        edge_color='#cccccc', width=0.5)
axes[1].set_title(f'Random Graph\nC={cc_random:.3f}, L={apl_random:.1f}')

plt.tight_layout()
plt.show()

## 4.3 The Watts-Strogatz Model

In 1998, Watts and Strogatz proposed a simple algorithm:

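1. Start with a **ring lattice** (each node connected to its k nearest neighbors, k/2 on each side)
2. For each edge, with probability **beta**, rewire one of its endpoints to a node chosen uniformly at random, avoiding self-loops and duplicate edges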

The key insight: **A few random shortcuts dramatically reduce path length while preserving most of the clustering!**
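
The two steps can be sketched in plain Python. This is a minimal illustration, not the implementation behind `watts_strogatz_graph` in `src.topology`; the function name `ws_sketch` and its dict-of-sets return type are hypothetical choices made for this example.

```python
import random


def ws_sketch(n, k, beta, seed=None):
    """Illustrative Watts-Strogatz construction; returns a dict of neighbor sets."""
    if k % 2 or k >= n:
        raise ValueError("k must be even and smaller than n")
    rng = random.Random(seed)

    # Step 1: ring lattice, each node linked to k/2 neighbors on each side.
    nbrs = {i: set() for i in range(n)}
    for i in range(n):
        for offset in range(1, k // 2 + 1):
            j = (i + offset) % n
            nbrs[i].add(j)
            nbrs[j].add(i)

    # Step 2: visit each original lattice edge (i, i + offset) once and, with
    # probability beta, move its far endpoint to a uniformly chosen new node.
    for offset in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + offset) % n
            if rng.random() >= beta:
                continue  # keep this edge as it is
            # Valid targets exclude i itself and all of its current neighbors.
            candidates = [v for v in range(n) if v != i and v not in nbrs[i]]
            if not candidates:
                continue  # node i is already connected to every other node
            new_j = rng.choice(candidates)
            nbrs[i].discard(j)
            nbrs[j].discard(i)
            nbrs[i].add(new_j)
            nbrs[new_j].add(i)
    return nbrs


graph = ws_sketch(20, 4, beta=0.1, seed=42)
n_edges = sum(len(v) for v in graph.values()) // 2
print(f"{n_edges} edges after rewiring")
```

Rewiring moves edges rather than adding them, so the edge count stays at n * k / 2 for every beta; only the degree of individual nodes changes.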

In [None]:
# Visualize the rewiring process at different beta values
fig = plot_ws_rewiring(n=20, k=4, betas=[0, 0.01, 0.1, 0.3, 1.0])
plt.show()

## 4.4 The Small-World Sweet Spot

Let's see how clustering (C) and path length (L) change with beta:

In [None]:
from src.visualization import plot_small_world_metrics

# This may take a moment to compute
betas = [0, 0.001, 0.01, 0.05, 0.1, 0.2, 0.3, 0.5, 0.7, 1.0]
fig = plot_small_world_metrics(betas, n=100, k=4, n_trials=3)
plt.show()
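
To read exact numbers rather than a plot, the same sweep can be approximated with networkx alone. This is a sketch under assumptions: `ws_metrics` is a hypothetical helper, and `plot_small_world_metrics` may average or normalize differently. Each metric is normalized by its value at beta = 0.

```python
import networkx as nx


def ws_metrics(n, k, beta, n_trials=3, seed=0):
    """Average clustering and path length over a few rewired graphs."""
    c_vals, l_vals = [], []
    for trial in range(n_trials):
        # The "connected" variant retries until the graph is connected, so
        # the average shortest path length is always defined.
        G = nx.connected_watts_strogatz_graph(n, k, beta, seed=seed + trial)
        c_vals.append(nx.average_clustering(G))
        l_vals.append(nx.average_shortest_path_length(G))
    return sum(c_vals) / n_trials, sum(l_vals) / n_trials


n, k = 100, 4
c0, l0 = ws_metrics(n, k, beta=0.0, n_trials=1)  # the lattice baseline is deterministic

ratios = {}
for beta in [0.0, 0.01, 0.1, 1.0]:
    c, path_len = ws_metrics(n, k, beta)
    ratios[beta] = (c / c0, path_len / l0)
    print(f"beta={beta:<5} C/C(0)={c / c0:.2f}  L/L(0)={path_len / l0:.2f}")
```

The small-world regime is where the second ratio has already dropped well below 1 while the first is still close to 1.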

## 4.5 Exercise: Find the Small-World Regime

From the plot above, at what range of beta values do we get:
- High clustering (C close to C(0))
- Low path length (L much lower than L(0))

**Your answer:** (fill in below)

In [None]:
# TODO: Test different beta values and find the small-world regime
# A good small-world network has sigma > 1

test_betas = [0.05, 0.1, 0.2, 0.3]

for beta in test_betas:
    adj = watts_strogatz_graph(100, 4, beta, seed=42)
    sigma, details = small_world_coefficient(adj, n_random=3)
    print(f"beta={beta}: sigma={sigma:.2f} (C={details['C']:.3f}, L={details['L']:.2f})")
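
A common definition of the small-world coefficient is sigma = (C / C_rand) / (L / L_rand), where C_rand and L_rand come from random graphs with the same number of nodes and edges. Whether `small_world_coefficient` uses exactly this definition is an assumption; the sketch below, with the hypothetical helper `sigma_sketch`, shows the calculation using networkx only.

```python
import networkx as nx


def sigma_sketch(G, n_random=5, seed=0):
    """Small-world coefficient of a connected graph G against G(n, m) baselines."""
    n, m = G.number_of_nodes(), G.number_of_edges()
    C = nx.average_clustering(G)
    L = nx.average_shortest_path_length(G)

    c_rand, l_rand = [], []
    for trial in range(n_random):
        R = nx.gnm_random_graph(n, m, seed=seed + trial)
        # Random graphs this sparse are often disconnected, so path length is
        # measured on the largest connected component only.
        giant = R.subgraph(max(nx.connected_components(R), key=len))
        c_rand.append(nx.average_clustering(R))
        l_rand.append(nx.average_shortest_path_length(giant))

    C_rand = sum(c_rand) / n_random
    L_rand = sum(l_rand) / n_random
    return (C / C_rand) / (L / L_rand)


G = nx.connected_watts_strogatz_graph(100, 4, 0.1, seed=42)
sigma = sigma_sketch(G)
print(f"sigma = {sigma:.2f}")
```

Values above 1 mean the graph is far more clustered than a comparable random graph while its paths are not much longer.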

## 4.6 Why Small-World Networks Matter for Neural Networks

Small-world topology provides:

1. **Efficient information flow**: Short paths mean signals and gradients need only a few hops to cross the network
2. **Modular structure**: High clustering supports local processing of related features
3. **Sparse connectivity**: Far fewer connections than a fully-connected layout, so fewer parameters
4. **Biological plausibility**: Biological neural networks (brains) are reported to exhibit small-world properties

In the upcoming modules, we'll use WS topology to:
- Initialize sparse neural network connectivity
- Connect modules in multi-modal architectures
- Dynamically rewire networks during training
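
As a preview of the first item, one simple way to turn a WS graph into sparse connectivity is to use its adjacency matrix as a binary mask over a weight matrix. This is only an illustrative sketch; the variable names are hypothetical and the upcoming modules may do this differently.

```python
import networkx as nx
import numpy as np

n_units, k, beta = 16, 4, 0.1

G = nx.watts_strogatz_graph(n_units, k, beta, seed=42)
# 1.0 where two units share an edge, 0.0 elsewhere (including the diagonal).
mask = nx.to_numpy_array(G, nodelist=sorted(G.nodes()))

rng = np.random.default_rng(42)
dense_weights = rng.normal(scale=0.1, size=(n_units, n_units))
sparse_weights = dense_weights * mask  # zero out every weight without an edge

n_possible = n_units * (n_units - 1)
density = mask.sum() / n_possible
print(f"Nonzero weights: {int(mask.sum())} of {n_possible} (density {density:.2f})")
```

Because rewiring preserves the number of edges, the mask always keeps n_units * k weights, whatever the value of beta.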

## Key Takeaways

1. **Small-world networks** have high clustering AND short paths
2. The **Watts-Strogatz model** creates them by rewiring a fraction (beta) of edges
3. The **sweet spot** in this notebook's experiments is roughly beta in [0.05, 0.3]; the exact range depends on n and k
4. Small-world topology is a promising fit for **sparse neural networks**

**Next:** Continue to `02_topology_comparison.ipynb` to compare different network topologies!