# Week 1: Introduction to Networks — Assignment

**Learning objectives** — In this assignment you will:

- Build graphs from edge lists programmatically
- Compute basic graph statistics from scratch
- Implement a degree histogram without using `nx.degree_histogram`
- Construct an adjacency matrix from an edge list
- Check connectivity and compute diameter

## Grading

| Section | Part | Function | Points |
|---------|------|----------|--------|
| 1 | Graph Construction | `build_graph(edge_list, directed)` | 15 |
| 2 | Basic Statistics | `graph_stats(G)` | 20 |
| 3 | Degree Histogram | `degree_histogram(G)` | 20 |
| 4 | Adjacency Matrix | `adjacency_from_edges(n, edge_list)` | 20 |
| 5 | Connectivity | `is_connected_and_diameter(G)` | 15 |
| — | Written Questions | — | 10 |
| | **Total** | | **100** |

## Before You Start

This assignment builds directly on concepts from the Week 1 lab. Make sure you are comfortable with:

- **Building graphs** with `nx.Graph()`, `add_nodes_from()`, and `add_edge()` (Lab Section 1)
- **Degree**: the number of edges incident to a node (Lab Section 5)
- **Shortest paths**: the fewest hops between two nodes, found with `nx.shortest_path()` (Lab Section 5)
- **Adjacency matrix**: a grid where entry (i, j) is 1 if an edge joins nodes i and j, and 0 otherwise (Lab Section 4)

If any of these feel unfamiliar, revisit the corresponding lab section before proceeding.
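
As a quick self-check, the sketch below exercises each of the four ideas once on a toy graph. The graph (a 4-cycle with one diagonal) is made up for illustration and does not come from the lab.

```python
import networkx as nx

# Toy graph for illustration only: a 4-cycle 0-1-2-3 with the diagonal 0-2.
G = nx.Graph()
G.add_nodes_from([0, 1, 2, 3])
for u, v in [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]:
    G.add_edge(u, v)

# Degree: number of edges incident to each node.
degrees = dict(G.degree())

# Shortest path: nodes 1 and 3 are not adjacent, so any shortest path has two hops.
path = nx.shortest_path(G, source=1, target=3)
hops = len(path) - 1

# Adjacency matrix: entry (i, j) is 1 when an edge joins i and j, else 0.
A = nx.to_numpy_array(G, nodelist=[0, 1, 2, 3])

print(degrees)
print(path, hops)
print(A)
```

If every printed value matches what you would predict by hand, you are ready for the sections below.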

In [None]:
import networkx as nx
import numpy as np
import matplotlib.pyplot as plt
from netsci.loaders import load_graph
from netsci.utils import SEED, graph_summary

In [None]:
G_karate = load_graph("karate")
graph_summary(G_karate)
print()
G_les = load_graph("lesmis")
graph_summary(G_les)

---
## Section 1: Graph Construction (15 pts)

Write a function that takes a list of `(source, target)` tuples and returns a NetworkX graph.
If `directed=True`, return a `DiGraph`; otherwise return a `Graph`.
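
To see why the `directed` flag matters, the following sketch adds the same single edge to both graph types and checks which direction can be traversed. It illustrates the difference only and is not the solution; the variable names are arbitrary.

```python
import networkx as nx

edge = (0, 1)

undirected = nx.Graph()
undirected.add_edge(*edge)

directed = nx.DiGraph()
directed.add_edge(*edge)

# An undirected edge exists in both directions.
print(undirected.has_edge(0, 1), undirected.has_edge(1, 0))  # True True

# A directed edge exists only from source to target.
print(directed.has_edge(0, 1), directed.has_edge(1, 0))      # True False

# DiGraph is a subclass of Graph, so this is also True.
print(isinstance(directed, nx.Graph))                        # True
```

The last line explains why the validation cell checks `not isinstance(_g, nx.DiGraph)` for the undirected case: testing against `nx.Graph` alone cannot tell the two types apart.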

In [None]:
def build_graph(edge_list, directed=False):
    """Build a NetworkX graph from an edge list.

    Parameters
    ----------
    edge_list : list of (source, target) tuples
    directed : bool, default False
        If True, return a DiGraph.

    Returns
    -------
    nx.Graph or nx.DiGraph
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
# --- Validation ---
_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
_g = build_graph(_edges, directed=False)
assert isinstance(_g, nx.Graph) and not isinstance(_g, nx.DiGraph)
assert _g.number_of_nodes() == 4
assert _g.number_of_edges() == 4

_dg = build_graph(_edges, directed=True)
assert isinstance(_dg, nx.DiGraph)
assert _dg.number_of_edges() == 4
print("Section 1 passed!")

---
## Section 2: Basic Statistics (20 pts)

Write a function that returns a dictionary with four basic graph statistics:
- `"nodes"`: number of nodes
- `"edges"`: number of edges
- `"density"`: graph density (ratio of actual edges to possible edges)
- `"avg_degree"`: average degree across all nodes
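
For a simple undirected graph with n nodes and m edges, the number of possible edges is n(n-1)/2 and the degrees sum to 2m. The sketch below works through that arithmetic by hand on a toy edge list (invented for illustration), using plain Python rather than NetworkX.

```python
# Toy undirected edge list: a triangle 0-1-2 plus a pendant node 3.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]

nodes = {u for edge in edges for u in edge}
n = len(nodes)                 # 4 nodes
m = len(edges)                 # 4 edges

possible = n * (n - 1) // 2    # 6 possible undirected edges
density = m / possible         # 4 / 6
avg_degree = 2 * m / n         # each edge adds 1 to two degrees: 8 / 4

print(n, m, round(density, 4), avg_degree)  # 4 4 0.6667 2.0
```

This covers the undirected case only. For a `DiGraph`, `nx.density` divides by n(n-1) instead, since each ordered pair is a separate possible edge.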

In [None]:
def graph_stats(G):
    """Compute basic statistics for graph G.

    Parameters
    ----------
    G : nx.Graph

    Returns
    -------
    dict with keys: 'nodes', 'edges', 'density', 'avg_degree'
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
# --- Validation ---
_stats = graph_stats(G_karate)
assert _stats["nodes"] == 34
assert _stats["edges"] == 78
assert abs(_stats["density"] - nx.density(G_karate)) < 1e-6
assert abs(_stats["avg_degree"] - (2 * 78 / 34)) < 1e-6
print(f"Karate stats: {_stats}")
print("Section 2 passed!")

---
## Section 3: Degree Histogram (20 pts)

Implement `degree_histogram(G)` **without** using `nx.degree_histogram`.
Return a list where the *i*-th element is the number of nodes with degree *i*.
The list length should be `max_degree + 1`.

For example, if degrees are `[1, 2, 2, 3]`, return `[0, 1, 2, 1]`.
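
The counting step can be sketched on a plain list of degrees, independent of any graph. This is one possible approach using `collections.Counter`, not necessarily the intended one. Getting the degrees out of `G` is left to you.

```python
from collections import Counter

degrees = [1, 2, 2, 3]        # the example degree sequence from the text

counts = Counter(degrees)     # maps each degree to how often it occurs
max_degree = max(degrees)

# Index i holds the number of entries equal to i; a Counter returns 0
# for degrees that never occur, such as degree 0 here.
hist = [counts[i] for i in range(max_degree + 1)]

print(hist)  # [0, 1, 2, 1]
```

Note that the entries of the histogram always sum to the number of nodes, which is a handy sanity check.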

In [None]:
def degree_histogram(G):
    """Compute the degree histogram of G.

    Parameters
    ----------
    G : nx.Graph

    Returns
    -------
    list of int
        hist[i] = number of nodes with degree i.
        Length is max_degree + 1.
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
# --- Validation ---
_hist = degree_histogram(G_karate)
_expected = nx.degree_histogram(G_karate)
assert _hist == _expected, f"Got {_hist}, expected {_expected}"

_hist_les = degree_histogram(G_les)
assert _hist_les == nx.degree_histogram(G_les)
print("Section 3 passed!")

---
## Section 4: Adjacency Matrix (20 pts)

Build the adjacency matrix of an **undirected** graph from scratch.
Given `n` (number of nodes, labeled 0 to n-1) and an edge list,
return an `n x n` NumPy array where `A[i][j] = 1` if there is an edge between i and j, and 0 otherwise.
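
To make the target concrete, here is the matrix you should expect for a three-node path, written out by hand rather than computed. The sketch checks three properties that any correct undirected adjacency matrix (without self-loops) should have.

```python
import numpy as np

# Expected adjacency matrix for the path 0 - 1 - 2, i.e. edges (0, 1) and (1, 2).
A = np.array([
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
])

is_symmetric = np.array_equal(A, A.T)  # each edge fills both (i, j) and (j, i)
row_sums = A.sum(axis=1)               # row i sums to the degree of node i
edge_count = int(A.sum() / 2)          # every edge is counted twice

print(is_symmetric, row_sums, edge_count)  # True [1. 2. 1.] 2
```

You can run the same three checks on the output of your own function as a debugging aid.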

In [None]:
def adjacency_from_edges(n, edge_list):
    """Construct an adjacency matrix from an edge list.

    Parameters
    ----------
    n : int
        Number of nodes (labeled 0 to n-1).
    edge_list : list of (int, int) tuples
        Undirected edges.

    Returns
    -------
    np.ndarray of shape (n, n) with dtype float
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
# --- Validation ---
_edges_k = list(G_karate.edges())
_A = adjacency_from_edges(34, _edges_k)
_A_nx = nx.to_numpy_array(G_karate)
assert _A.shape == (34, 34)
assert np.allclose(_A, _A_nx), "Adjacency matrix does not match NetworkX output"
# Check symmetry
assert np.allclose(_A, _A.T), "Matrix should be symmetric for undirected graph"
print("Section 4 passed!")

---
## Section 5: Connectivity and Diameter (15 pts)

Write a function that checks whether a graph is connected and, if so, computes its **diameter**
(the length, in edges, of the longest shortest path between any pair of nodes).

If the graph is not connected, return `(False, None)`.
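
The definition can be traced on a small path graph. In the sketch below, the eccentricity of a node is its distance to the farthest node, and the diameter is the largest eccentricity. This is one way to unpack the definition on a toy example, not a prescribed implementation.

```python
import networkx as nx

G = nx.path_graph(4)   # 0 - 1 - 2 - 3

connected = nx.is_connected(G)
diameter = None

if connected:
    # Eccentricity of a node: hop distance to the node farthest from it.
    ecc = {
        node: max(nx.single_source_shortest_path_length(G, node).values())
        for node in G.nodes()
    }
    diameter = max(ecc.values())
    print(ecc)         # {0: 3, 1: 2, 2: 2, 3: 3}

print(connected, diameter)  # True 3
```

The connectivity check comes first because shortest-path distances between different components are undefined.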

In [None]:
def is_connected_and_diameter(G):
    """Check connectivity and compute diameter.

    Parameters
    ----------
    G : nx.Graph

    Returns
    -------
    (bool, int or None)
        (is_connected, diameter). Diameter is None if not connected.
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
# --- Validation ---
_conn, _diam = is_connected_and_diameter(G_karate)
assert _conn is True
assert _diam == nx.diameter(G_karate)
print(f"Karate: connected={_conn}, diameter={_diam}")

# Test with a disconnected graph
_disc = nx.Graph()
_disc.add_edges_from([(0, 1), (2, 3)])
_conn2, _diam2 = is_connected_and_diameter(_disc)
assert _conn2 is False
assert _diam2 is None
print("Section 5 passed!")

---
## Written Questions (10 pts)

Answer the following questions in the markdown cells below. These are designed to prepare you for the oral exam.

### Question 1 (5 pts)

Les Misérables has 254 edges, more than three times Karate's 78, yet its density is *lower* than Karate's.
Why is this? What role does the number of nodes play?

*Hints to guide your thinking:*
- *Density is defined as actual edges divided by **possible** edges. How does the number of possible edges grow with n?*
- *For an undirected graph with n nodes, the maximum number of edges is n(n-1)/2. Compute this for both networks.*
- *Les Mis has 77 nodes → 2,926 possible edges. Karate has 34 nodes → 561 possible edges. How does 254/2926 compare to 78/561?*
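
If you would rather check the arithmetic in code, the following sketch recomputes the figures quoted in the hints. The helper name `max_undirected_edges` is made up for this example.

```python
def max_undirected_edges(n):
    """Number of distinct node pairs in a simple undirected graph."""
    return n * (n - 1) // 2

# (nodes, edges) as quoted in the hints above.
networks = {"Les Mis": (77, 254), "Karate": (34, 78)}

densities = {}
for name, (n, m) in networks.items():
    possible = max_undirected_edges(n)
    densities[name] = m / possible
    print(f"{name}: {m} / {possible} = {densities[name]:.4f}")
```

Compare how the numerator and the denominator each change between the two networks when you write your answer.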

**Your Answer:**



### Question 2 (5 pts)

What does **diameter** mean in a social network context?
If you removed the highest-degree node from the Karate Club, would you expect the diameter to increase, decrease, or stay the same? Why?

*Hints to guide your thinking:*
- *Diameter is the longest shortest path — the "worst case" number of hops between any two people.*
- *In a social context, a diameter of 5 means that even the two most "distant" people in the club can reach each other through at most 5 handshakes.*
- *Node 33 has degree 17, so it is connected to about half the club and is the highest-degree node. If you remove it, think about which pairs of nodes would lose their shortest path and need to find a longer detour.*
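
Once you have written down a prediction, you can test it empirically. The sketch below assumes that NetworkX's built-in `nx.karate_club_graph()` matches the course's `load_graph("karate")` data. It removes the highest-degree node from a copy and guards the diameter call, since diameter is only defined for a connected graph.

```python
import networkx as nx

# Assumption: the built-in dataset stands in for the course's Karate graph.
G = nx.karate_club_graph()

# Node with the largest (unweighted) degree.
hub = max(G.nodes(), key=lambda node: G.degree(node))

H = G.copy()
H.remove_node(hub)   # also removes every edge incident to the hub

before = nx.diameter(G)
# Only compute the new diameter if the graph is still in one piece.
after = nx.diameter(H) if nx.is_connected(H) else None

print(f"hub={hub}, degree={G.degree(hub)}")
print(f"diameter before={before}, after={after}")
```

Use the output to confirm or revise your reasoning. The written answer should still explain *why* the diameter behaves as it does.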

**Your Answer:**

