# Week 6: Dynamics on Networks — Assignment

**Learning objectives** — In this assignment you will:

- Implement one step of the stochastic SIR model
- Run Monte Carlo SIR simulations and aggregate results
- Implement random and targeted immunization strategies
- Compare the effectiveness of immunization approaches

## Grading

| Section | Part | Function | Points |
|---------|------|----------|--------|
| 1 | SIR Step | `sir_step(G, S, I, R, beta, gamma, rng)` | 15 |
| 2 | Monte Carlo | `monte_carlo_sir(G, beta, gamma, n_seeds, n_runs, max_steps)` | 20 |
| 3 | Random Immunization | `random_immunize(G, fraction)` | 15 |
| 4 | Targeted Immunization | `targeted_immunize(G, fraction)` | 20 |
| 5 | Acquaintance Immunization | `acquaintance_immunize(G, fraction, rng)` | 20 |
| — | Written Questions | — | 10 |
| | **Total** | | **100** |

## Before You Start

This assignment builds on the Week 6 lab. Make sure you are comfortable with:

- **SIR model** — S → I → R with infection rate β and recovery rate γ (Lab Section 2)
- **Stochastic network SIR** — each infected node independently tries to infect susceptible neighbors; recovery is also probabilistic (Lab Section 4)
- **Monte Carlo approach** — run many simulations and aggregate (mean, std) to characterize stochastic outcomes (Lab Section 4)
- **Network robustness** — the Molloy-Reed criterion and the robustness paradox (Week 4 Lab, Section 11)
- **Immunization strategies** — random, targeted (remove hubs), and acquaintance (friendship paradox) (Lab Section 7)

Sections 1-2 ask you to implement SIR step-by-step. Sections 3-5 implement three immunization strategies. Review the lab’s `network_sir()` and `acquaintance_immunize()` functions for the algorithm logic.

In [None]:
import networkx as nx
import numpy as np
import matplotlib.pyplot as plt
from netsci.loaders import load_graph
from netsci.utils import SEED

In [None]:
G_air = load_graph("airports")
G_fb = load_graph("facebook")

---
## Section 1: SIR Step (15 pts)

Implement a single time step of the stochastic SIR model on a network.

At each step:
1. Each infected node tries to infect each susceptible neighbor with probability `beta`
2. Each infected node recovers with probability `gamma`

Return the new (S, I, R) sets.
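A consequence of rule 1 worth keeping in mind: a susceptible node with `m` infected neighbors faces `m` independent infection attempts, so it becomes infected with probability `1 - (1 - beta)**m`. The sketch below is not a solution to this section. It only checks that formula numerically with `np.random.Generator`, using hypothetical toy values (`beta`, `m`, `n_trials`, and the seed `0` are all assumptions made for illustration).

```python
import numpy as np

# Hypothetical toy values, not taken from the assignment datasets.
beta = 0.1          # per-edge infection probability
m = 3               # infected neighbors of one susceptible node
n_trials = 200_000  # number of repeated experiments

rng = np.random.default_rng(0)

# One Bernoulli(beta) draw per infected neighbor, per trial.
draws = rng.random((n_trials, m)) < beta

# The susceptible node is infected if at least one attempt succeeds.
infected = draws.any(axis=1)

empirical = infected.mean()
theoretical = 1 - (1 - beta) ** m
print(f"empirical={empirical:.4f}, theoretical={theoretical:.4f}")
```

Comparing `rng.random() < p` against a probability `p` is one common way to draw a single yes/no event from a `Generator`.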

In [None]:
def sir_step(G, S, I, R, beta, gamma, rng):
    """Perform one SIR time step.

    Parameters
    ----------
    G : nx.Graph
    S : set — susceptible nodes
    I : set — infected nodes
    R : set — recovered nodes
    beta : float — infection probability per edge
    gamma : float — recovery probability per node
    rng : np.random.Generator

    Returns
    -------
    (set, set, set) — new (S, I, R)
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
# --- Validation ---
# Conservation law: S + I + R = N always
_rng = np.random.default_rng(SEED)
_nodes = list(G_fb.nodes())
_N = len(_nodes)
_I0 = set(_rng.choice(_nodes, size=3, replace=False))
_S0 = set(_nodes) - _I0
_R0 = set()

_S, _I, _R = sir_step(G_fb, _S0, _I0, _R0, beta=0.1, gamma=0.1, rng=_rng)
assert isinstance(_S, set) and isinstance(_I, set) and isinstance(_R, set)
assert len(_S) + len(_I) + len(_R) == _N, "Conservation law violated: S+I+R != N"
# No overlap
assert len(_S & _I) == 0 and len(_S & _R) == 0 and len(_I & _R) == 0
print(
    f"After 1 step: S={len(_S)}, I={len(_I)}, R={len(_R)} (total={len(_S) + len(_I) + len(_R)})"
)

# With gamma=1.0, all infected should recover in one step
_rng2 = np.random.default_rng(SEED)
_S2, _I2, _R2 = sir_step(
    G_fb, _S0.copy(), _I0.copy(), _R0.copy(), beta=0.0, gamma=1.0, rng=_rng2
)
# All originally infected should have recovered (beta=0 means no new infections)
assert len(_I2) == 0, "With gamma=1.0 and beta=0.0, infected set should be empty"
print("Section 1 passed!")

---
## Section 2: Monte Carlo SIR (20 pts)

Run multiple SIR simulations and return aggregated results.

Return a dict with:
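- `"mean_peak_infected"`: mean over runs of the peak infected fraction (the maximum of |I| / N within a run)
- `"mean_total_infected"`: mean over runs of the total ever-infected fraction ((|I| + |R|) / N at the end of the run)
- `"curves"`: list of infected-count lists, one per run, each holding the number of infected nodes at every time step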

Create the RNG inside the function with `np.random.default_rng(SEED)`.
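To make the aggregation concrete, here is a minimal sketch that turns a few infected-count curves into a mean peak fraction. The curves and the network size `N` are made-up toy numbers, and the differing curve lengths assume that a run may stop once no infected nodes remain, which your implementation may or may not do.

```python
# Hypothetical infected-count curves from three runs on a 10-node network.
N = 10
curves = [
    [1, 3, 5, 4, 2, 0],
    [1, 1, 0],
    [1, 2, 4, 6, 3, 1, 0],
]

# Peak infected fraction of each run: largest infected count divided by N.
peak_fractions = [max(curve) / N for curve in curves]

# Average the per-run peaks to get one summary number.
mean_peak = sum(peak_fractions) / len(peak_fractions)
print(f"peaks={peak_fractions}, mean peak={mean_peak:.2f}")
```

The second toy run fizzles almost immediately, which pulls the mean well below the peaks of the other two runs.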

In [None]:
def monte_carlo_sir(G, beta, gamma, n_seeds=3, n_runs=20, max_steps=200):
    """Run Monte Carlo SIR simulations.

    Parameters
    ----------
    G : nx.Graph
    beta : float
    gamma : float
    n_seeds : int — number of initially infected nodes
    n_runs : int — number of simulations
    max_steps : int — maximum time steps per run

    Returns
    -------
    dict with 'mean_peak_infected', 'mean_total_infected', 'curves'
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
# --- Validation ---
_mc = monte_carlo_sir(G_fb, beta=0.1, gamma=0.1, n_seeds=3, n_runs=20, max_steps=100)
assert "mean_peak_infected" in _mc
assert "mean_total_infected" in _mc
assert "curves" in _mc
assert len(_mc["curves"]) == 20
assert 0 <= _mc["mean_peak_infected"] <= 1.0
assert 0 <= _mc["mean_total_infected"] <= 1.0
print(f"Mean peak infected: {_mc['mean_peak_infected']:.2%}")
print(f"Mean total infected: {_mc['mean_total_infected']:.2%}")

# With gamma=1.0, epidemic should die fast (very few total infected)
_mc_fast = monte_carlo_sir(
    G_fb, beta=0.05, gamma=1.0, n_seeds=3, n_runs=10, max_steps=50
)
assert _mc_fast["mean_total_infected"] < 0.15, (
    f"High gamma should suppress epidemic, got {_mc_fast['mean_total_infected']:.2%}"
)
print(f"High gamma: total infected = {_mc_fast['mean_total_infected']:.2%}")
print("Section 2 passed!")

---
## Section 3: Random Immunization (15 pts)

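Remove a random fraction of nodes from the graph: exactly `int(N * fraction)` nodes, where `N` is the number of nodes in `G` (the validation cell checks this count). Use `seed=SEED` when drawing the random sample so results are reproducible.
Return a **copy** of the graph with the nodes removed; the original graph must stay unchanged.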
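The copy requirement can be illustrated on a toy graph. In this sketch the graph, the fraction, and the hand-picked nodes in `to_remove` are all hypothetical; choosing which nodes to remove at random is left to you.

```python
import networkx as nx

G = nx.path_graph(5)   # toy graph: 0 - 1 - 2 - 3 - 4
fraction = 0.5

# Number of nodes to remove, truncated toward zero.
n_remove = int(G.number_of_nodes() * fraction)

# Hand-picked here for illustration only.
to_remove = [1, 3]

H = G.copy()                     # work on a copy
H.remove_nodes_from(to_remove)   # incident edges are removed as well

print(G.number_of_nodes(), H.number_of_nodes())
```

Because the removal happens on `H`, the node and edge counts of `G` are the same before and after.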

In [None]:
def random_immunize(G, fraction):
    """Remove a random fraction of nodes (immunization).

    Parameters
    ----------
    G : nx.Graph
    fraction : float (0 to 1)

    Returns
    -------
    nx.Graph — copy with nodes removed
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
# --- Validation ---
_g_imm = random_immunize(G_air, 0.1)
assert isinstance(_g_imm, nx.Graph)
# Should have removed exactly int(0.1 * N) nodes
_expected_n = G_air.number_of_nodes() - int(G_air.number_of_nodes() * 0.1)
assert _g_imm.number_of_nodes() == _expected_n, (
    f"Expected {_expected_n} nodes, got {_g_imm.number_of_nodes()}"
)
# Original graph should be unchanged
assert G_air.number_of_nodes() == 500
print(f"Random immunization (10%): {_g_imm.number_of_nodes()} nodes remain")
print("Section 3 passed!")

---
## Section 4: Targeted Immunization (20 pts)

Remove the `int(N * fraction)` highest-degree nodes, where `N` is the number of nodes in `G`. Return a **copy** of the graph with the nodes removed; the original graph must stay unchanged.

In [None]:
def targeted_immunize(G, fraction):
    """Remove the highest-degree nodes (targeted immunization).

    Parameters
    ----------
    G : nx.Graph
    fraction : float (0 to 1)

    Returns
    -------
    nx.Graph — copy with hubs removed
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
# --- Validation ---
# Note: this cell depends on Section 3 (random_immunize) being implemented
_g_targ = targeted_immunize(G_air, 0.1)
assert isinstance(_g_targ, nx.Graph)
_expected_n = G_air.number_of_nodes() - int(G_air.number_of_nodes() * 0.1)
assert _g_targ.number_of_nodes() == _expected_n

# The highest-degree node should have been removed
_top_node = max(G_air.nodes(), key=lambda n: G_air.degree(n))
assert _top_node not in _g_targ.nodes(), "Highest-degree node should be removed"

# Targeted should reduce max degree more than random (printed below, not asserted)
_max_deg_targ = (
    max(d for _, d in _g_targ.degree()) if _g_targ.number_of_nodes() > 0 else 0
)
_g_rand = random_immunize(G_air, 0.1)
_max_deg_rand = (
    max(d for _, d in _g_rand.degree()) if _g_rand.number_of_nodes() > 0 else 0
)
print(f"Max degree after random immunization:   {_max_deg_rand}")
print(f"Max degree after targeted immunization: {_max_deg_targ}")

# Original graph unchanged
assert G_air.number_of_nodes() == 500
print("Section 4 passed!")

In [None]:
# Compare immunization effectiveness (this may take ~30 seconds)
fractions = [0.0, 0.05, 0.10, 0.15, 0.20]
results_rand, results_targ = [], []

for f in fractions:
    if f == 0.0:
        mc = monte_carlo_sir(
            G_air, beta=0.05, gamma=0.1, n_seeds=3, n_runs=10, max_steps=100
        )
        results_rand.append(mc["mean_peak_infected"])
        results_targ.append(mc["mean_peak_infected"])
    else:
        g_r = random_immunize(G_air, f)
        g_t = targeted_immunize(G_air, f)
        mc_r = monte_carlo_sir(
            g_r, beta=0.05, gamma=0.1, n_seeds=3, n_runs=10, max_steps=100
        )
        mc_t = monte_carlo_sir(
            g_t, beta=0.05, gamma=0.1, n_seeds=3, n_runs=10, max_steps=100
        )
        results_rand.append(mc_r["mean_peak_infected"])
        results_targ.append(mc_t["mean_peak_infected"])

with plt.style.context("seaborn-v0_8-muted"):
    fig, ax = plt.subplots(figsize=(7, 5))
    ax.plot([f * 100 for f in fractions], results_rand, "o-", label="Random")
    ax.plot([f * 100 for f in fractions], results_targ, "s-", label="Targeted")
    ax.set_xlabel("% immunized")
    ax.set_ylabel("Mean peak infected")
    ax.set_title("Immunization Comparison")
    ax.legend()
    fig.tight_layout()
    plt.show()

---
## Section 5: Acquaintance Immunization (20 pts)

Implement **acquaintance immunization** using the friendship paradox:

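1. Repeat until `int(N * fraction)` distinct nodes are immunized (`N` is the number of nodes in `G`):
   - Pick a random node
   - Pick a random neighbor of that node (skip nodes that have no neighbors)
   - Mark the neighbor as immunized; drawing an already-immunized neighbor does not add to the count
2. Copy the graph and remove all immunized nodes from the copy
3. Return the copy; `G` itself must stay unchanged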

This strategy preferentially targets hubs without requiring global network knowledge.
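The degree bias behind this strategy can be seen on a small star graph. The sketch below is an illustration on a toy graph chosen for this example, not part of the required implementation. It compares the mean degree of a uniformly chosen node with the mean degree of a neighbor reached from a uniformly chosen node.

```python
import networkx as nx

G = nx.star_graph(5)  # node 0 is the hub, nodes 1..5 are leaves
degrees = dict(G.degree())

# Mean degree when sampling a node uniformly at random.
mean_degree = sum(degrees.values()) / G.number_of_nodes()

# For each node, average the degrees of its neighbors,
# then average those values over all nodes with at least one neighbor.
neighbor_means = [
    sum(degrees[v] for v in G.neighbors(u)) / degrees[u]
    for u in G.nodes()
    if degrees[u] > 0
]
mean_neighbor_degree = sum(neighbor_means) / len(neighbor_means)

print(f"random node: {mean_degree:.2f}, random neighbor: {mean_neighbor_degree:.2f}")
```

Five of the six nodes have the hub as their only neighbor, so stepping to a neighbor lands on the hub far more often than uniform sampling does.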

In [None]:
def acquaintance_immunize(G, fraction, rng=None):
    """Acquaintance immunization via the friendship paradox.

    Parameters
    ----------
    G : nx.Graph
    fraction : float (0 to 1) — fraction of nodes to immunize
    rng : np.random.Generator, optional

    Returns
    -------
    nx.Graph — copy with immunized nodes removed
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
# --- Validation ---
_rng = np.random.default_rng(SEED)
_g_acq = acquaintance_immunize(G_air, 0.1, rng=_rng)
assert isinstance(_g_acq, nx.Graph)
_expected_n = G_air.number_of_nodes() - int(G_air.number_of_nodes() * 0.1)
assert _g_acq.number_of_nodes() == _expected_n, (
    f"Expected {_expected_n} nodes, got {_g_acq.number_of_nodes()}"
)
# Original graph unchanged
assert G_air.number_of_nodes() == 500

# Acquaintance should target higher-degree nodes than random
_g_rand = random_immunize(G_air, 0.1)
# Average degree of removed nodes should be higher for acquaintance
_removed_acq = set(G_air.nodes()) - set(_g_acq.nodes())
_removed_rand = set(G_air.nodes()) - set(_g_rand.nodes())
_avg_deg_removed_acq = np.mean([G_air.degree(n) for n in _removed_acq])
_avg_deg_removed_rand = np.mean([G_air.degree(n) for n in _removed_rand])
assert _avg_deg_removed_acq > _avg_deg_removed_rand, (
    f"Acquaintance should remove higher-degree nodes: acq={_avg_deg_removed_acq:.1f} vs rand={_avg_deg_removed_rand:.1f}"
)
print(
    f"Avg degree of removed nodes — Random: {_avg_deg_removed_rand:.1f}, Acquaintance: {_avg_deg_removed_acq:.1f}"
)
print("Acquaintance immunization preferentially targets hubs!")
print("Section 5 passed!")

In [None]:
# Full comparison of all three strategies (this may take ~1 minute)
fractions_all = [0.0, 0.05, 0.10, 0.15, 0.20]
res_rand_all, res_targ_all, res_acq_all = [], [], []

for f in fractions_all:
    if f == 0.0:
        mc = monte_carlo_sir(
            G_air, beta=0.05, gamma=0.1, n_seeds=3, n_runs=10, max_steps=100
        )
        res_rand_all.append(mc["mean_peak_infected"])
        res_targ_all.append(mc["mean_peak_infected"])
        res_acq_all.append(mc["mean_peak_infected"])
    else:
        g_r = random_immunize(G_air, f)
        g_t = targeted_immunize(G_air, f)
        g_a = acquaintance_immunize(G_air, f, rng=np.random.default_rng(SEED))
        mc_r = monte_carlo_sir(
            g_r, beta=0.05, gamma=0.1, n_seeds=3, n_runs=10, max_steps=100
        )
        mc_t = monte_carlo_sir(
            g_t, beta=0.05, gamma=0.1, n_seeds=3, n_runs=10, max_steps=100
        )
        mc_a = monte_carlo_sir(
            g_a, beta=0.05, gamma=0.1, n_seeds=3, n_runs=10, max_steps=100
        )
        res_rand_all.append(mc_r["mean_peak_infected"])
        res_targ_all.append(mc_t["mean_peak_infected"])
        res_acq_all.append(mc_a["mean_peak_infected"])

with plt.style.context("seaborn-v0_8-muted"):
    fig, ax = plt.subplots(figsize=(7, 5))
    ax.plot([f * 100 for f in fractions_all], res_rand_all, "o-", label="Random")
    ax.plot([f * 100 for f in fractions_all], res_acq_all, "^-", label="Acquaintance")
    ax.plot([f * 100 for f in fractions_all], res_targ_all, "s-", label="Targeted")
    ax.set_xlabel("% immunized")
    ax.set_ylabel("Mean peak infected")
    ax.set_title("Three Immunization Strategies Compared")
    ax.legend()
    fig.tight_layout()
    plt.show()

print("Acquaintance should outperform random and approach targeted.")

---
## Written Questions (10 pts)

### Question 1 (5 pts)

During an epidemic simulation, at what phase is the **variance across runs** highest —
at the start, at the peak, or at the end? Why?
Think about what is random in each phase.

*Hints to guide your thinking:*
- *At the **start**, only a few nodes are infected — what is random? (Which neighbors get infected first.)*
- *At the **peak**, the outcome depends on early random events that have compounded — some runs explode, others fizzle. Is the peak height or timing more variable?*
- *At the **end**, most runs have converged (nearly everyone is recovered). Why does variance shrink here?*
- *Think of it like coin flips: one flip has maximum uncertainty, but after 1000 flips the average is very stable.*

**Your Answer:**



### Question 2 (5 pts)

**Acquaintance immunization** works like this: pick a random person, then vaccinate
one of their random friends (not the person themselves).

Why might this strategy outperform purely random immunization,
even though we don't know the degree sequence?

*Hints to guide your thinking:*
- *The **friendship paradox** states: "your friends have, on average, more friends than you do." Why is this true mathematically? (Hint: high-degree nodes appear as "friends" of many people.)*
- *If you pick a random person and then vaccinate one of their friends, you are sampling from a degree-biased distribution. What kind of nodes does this over-sample?*
- *Compare: random immunization samples nodes uniformly (proportional to 1), while acquaintance immunization samples proportional to degree. Which is more likely to hit a hub?*

**Your Answer:**

