# Chapter 15: Graph Fundamentals

> *"Graphs are the hidden structures behind networks, relationships, and connections. They model the world as nodes and edges, revealing patterns invisible to other abstractions."* — Anonymous

---

## 15.1 Introduction to Graphs

A **graph** is a mathematical structure consisting of a set of **vertices** (or nodes) and a set of **edges** connecting pairs of vertices. Graphs model relationships between entities, making them one of the most versatile data structures in computer science.

### 15.1.1 Why Graphs Matter

```
┌─────────────────────────────────────────────────────────────────────┐
│                    IMPORTANCE OF GRAPHS                              │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  1. SOCIAL NETWORKS: Users as vertices, friendships as edges        │
│                                                                      │
│  2. WEB GRAPH: Web pages as vertices, hyperlinks as edges           │
│                                                                      │
│  3. TRANSPORTATION NETWORKS: Cities as vertices, roads as edges     │
│                                                                      │
│  4. COMPUTER NETWORKS: Devices as vertices, connections as edges    │
│                                                                      │
│  5. DEPENDENCY GRAPHS: Tasks as vertices, dependencies as edges     │
│     (build systems, scheduling)                                      │
│                                                                      │
│  6. KNOWLEDGE GRAPHS: Entities as vertices, relationships as edges  │
│     (Google Knowledge Graph, Wikidata)                               │
│                                                                      │
│  7. BIOLOGICAL NETWORKS: Proteins as vertices, interactions as edges│
│                                                                      │
│  8. RECOMMENDATION SYSTEMS: Users and items as vertices,            │
│     interactions as edges (bipartite graphs)                         │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘
```

---

## 15.2 Graph Terminology and Properties

```python
def graph_terminology():
    """
    Essential terms used in graph theory.
    """
    print("Graph Terminology")
    print("=" * 70)
    print("""
    • Vertex (Node): A fundamental unit of the graph.
    • Edge (Arc): A connection between two vertices.
    • Directed Graph (Digraph): Edges have a direction (u→v).
    • Undirected Graph: Edges have no direction (u—v).
    • Weighted Graph: Edges have associated weights/costs.
    • Unweighted Graph: All edges have equal (implicit) weight.
    
    • Degree (of vertex): Number of edges incident to it.
        - In directed graphs: indegree (incoming) and outdegree (outgoing).
    • Path: A sequence of vertices where consecutive vertices are connected by edges.
    • Cycle: A path that starts and ends at the same vertex, with no other repeated vertices or edges.
    • Simple Graph: No multiple edges between same pair, no self-loops.
    • Multigraph: Allows multiple edges between same vertices.
    
    • Connected Graph: Every vertex reachable from every other (undirected).
    • Strongly Connected: In directed graph, every vertex reachable from every other via directed paths.
    • Weakly Connected: Underlying undirected graph is connected.
    
    • Component: Maximal connected subgraph.
    • Tree: Connected acyclic graph.
    • Forest: Collection of trees.
    • Complete Graph (K_n): Every pair of vertices connected by an edge.
    • Bipartite Graph: Vertices can be split into two sets such that every edge crosses between sets.
    • Subgraph: Graph formed from subset of vertices and edges.
    
    • Sparse Graph: Number of edges is much less than n² (typically |E| = O(n)).
    • Dense Graph: Number of edges is close to n² (|E| = Θ(n²)).
    """)

graph_terminology()
```

---

## 15.3 Graph Representations

Choosing the right representation is crucial for algorithm efficiency. The three most common representations are:

- **Adjacency Matrix** – 2D array of size V×V.
- **Adjacency List** – Array of lists (or sets) for each vertex.
- **Edge List** – List of (u, v) pairs or (u, v, weight) triples.

### 15.3.1 Adjacency Matrix

```python
class AdjacencyMatrix:
    def __init__(self, num_vertices, directed=False):
        self.n = num_vertices
        self.directed = directed
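        # Initialize V x V matrix with 0, where 0 means "no edge".
        # A zero-weight edge is therefore not representable; a weighted
        # variant could use infinity as the "no edge" sentinel instead.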
        self.matrix = [[0] * num_vertices for _ in range(num_vertices)]

    def add_edge(self, u, v, weight=1):
        """Add edge from u to v with given weight (default 1 for unweighted)."""
        self.matrix[u][v] = weight
        if not self.directed:
            self.matrix[v][u] = weight

    def remove_edge(self, u, v):
        self.matrix[u][v] = 0
        if not self.directed:
            self.matrix[v][u] = 0

    def has_edge(self, u, v):
        return self.matrix[u][v] != 0

    def get_weight(self, u, v):
        return self.matrix[u][v]

    def get_neighbors(self, u):
        """Return list of vertices v such that (u,v) exists."""
        neighbors = []
        for v in range(self.n):
            if self.matrix[u][v] != 0:
                neighbors.append(v)
        return neighbors

    def __repr__(self):
        return '\n'.join([' '.join(map(str, row)) for row in self.matrix])
```

**Time Complexity:**
- Add/Remove/Check Edge: O(1)
- Get Neighbors: O(V)
- Space: Θ(V²)

**Use when:** Graph is dense (V² ~ E) or we need constant-time edge checks.
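
The class above uses 0 to mean "no edge", so it cannot store an edge whose weight is 0. The following is a minimal sketch of the infinity-sentinel variant hinted at in the constructor comment. The class name `WeightedAdjacencyMatrix` is hypothetical and only a subset of the methods is shown.

```python
import math


class WeightedAdjacencyMatrix:
    """Hypothetical variant: math.inf marks 'no edge', so weight 0 is a valid edge."""

    def __init__(self, num_vertices, directed=False):
        self.n = num_vertices
        self.directed = directed
        # Every cell starts as "no edge"
        self.matrix = [[math.inf] * num_vertices for _ in range(num_vertices)]

    def add_edge(self, u, v, weight=1):
        self.matrix[u][v] = weight
        if not self.directed:
            self.matrix[v][u] = weight

    def has_edge(self, u, v):
        return self.matrix[u][v] != math.inf

    def get_neighbors(self, u):
        return [v for v in range(self.n) if self.matrix[u][v] != math.inf]


g = WeightedAdjacencyMatrix(3)
g.add_edge(0, 1, weight=0)   # zero-cost edge
g.add_edge(1, 2, weight=5)
print(g.has_edge(0, 1))      # True, even though the weight is 0
print(g.get_neighbors(1))    # [0, 2]
```

The trade-off is that printed matrices show `inf` in empty cells, and any code that sums weights has to skip those cells explicitly.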

### 15.3.2 Adjacency List

```python
class AdjacencyList:
    def __init__(self, num_vertices, directed=False):
        self.n = num_vertices
        self.directed = directed
        # List of lists; each inner list stores (neighbor, weight) pairs
        self.adj = [[] for _ in range(num_vertices)]

    def add_edge(self, u, v, weight=1):
        self.adj[u].append((v, weight))
        if not self.directed:
            self.adj[v].append((u, weight))

    def remove_edge(self, u, v):
        # O(degree) scan that rebuilds the list; storing a dict or set per
        # vertex instead would give O(1) average-time removal
        self.adj[u] = [(nei, w) for (nei, w) in self.adj[u] if nei != v]
        if not self.directed:
            self.adj[v] = [(nei, w) for (nei, w) in self.adj[v] if nei != u]

    def has_edge(self, u, v):
        for nei, _ in self.adj[u]:
            if nei == v:
                return True
        return False

    def get_weight(self, u, v):
        for nei, w in self.adj[u]:
            if nei == v:
                return w
        return None

    def get_neighbors(self, u):
        return [nei for (nei, _) in self.adj[u]]

    def __repr__(self):
        result = []
        for i in range(self.n):
            result.append(f"{i}: {self.adj[i]}")
        return '\n'.join(result)
```

**Time Complexity:**
- Add Edge: O(1) (amortized)
- Check Edge: O(degree(u)) worst-case
- Remove Edge: O(degree(u)), plus O(degree(v)) for undirected graphs
- Get Neighbors: O(degree(u))
- Space: Θ(V + E)

**Use when:** Graph is sparse, or we need to iterate over neighbors frequently.
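
If edge checks and removals matter as much as neighbor iteration, one common alternative is to keep a dictionary per vertex instead of a list. The sketch below assumes a simple graph (no parallel edges); the class name `AdjacencyDict` is hypothetical and mirrors the interface of `AdjacencyList`.

```python
class AdjacencyDict:
    """Hypothetical variant of AdjacencyList: one dict {neighbor: weight} per vertex."""

    def __init__(self, num_vertices, directed=False):
        self.n = num_vertices
        self.directed = directed
        self.adj = [{} for _ in range(num_vertices)]

    def add_edge(self, u, v, weight=1):
        # Re-adding an existing edge overwrites its weight (no parallel edges)
        self.adj[u][v] = weight
        if not self.directed:
            self.adj[v][u] = weight

    def remove_edge(self, u, v):
        # pop with a default, so removing a missing edge is a no-op
        self.adj[u].pop(v, None)
        if not self.directed:
            self.adj[v].pop(u, None)

    def has_edge(self, u, v):
        return v in self.adj[u]          # O(1) on average

    def get_weight(self, u, v):
        return self.adj[u].get(v)        # None if the edge does not exist

    def get_neighbors(self, u):
        return list(self.adj[u])


g = AdjacencyDict(4)
g.add_edge(0, 1, weight=7)
g.add_edge(0, 2)
g.remove_edge(0, 2)
print(g.has_edge(1, 0), g.get_weight(0, 1), g.get_neighbors(0))
```

Space stays Θ(V + E), while edge checks and removals drop to O(1) on average, at the cost of some hashing overhead per edge.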

### 15.3.3 Edge List

```python
class EdgeList:
    def __init__(self, num_vertices, directed=False):
        self.n = num_vertices
        self.directed = directed
        self.edges = []  # list of (u, v, weight)

    def add_edge(self, u, v, weight=1):
        self.edges.append((u, v, weight))
        if not self.directed and u != v:  # store a self-loop only once
            self.edges.append((v, u, weight))

    def remove_edge(self, u, v):
        # Remove all occurrences (inefficient)
        self.edges = [e for e in self.edges if not (e[0] == u and e[1] == v)]
        if not self.directed:
            self.edges = [e for e in self.edges if not (e[0] == v and e[1] == u)]

    def get_neighbors(self, u):
        neighbors = []
        for (src, dst, w) in self.edges:
            if src == u:
                neighbors.append((dst, w))
        return neighbors

    def __repr__(self):
        return '\n'.join(str(e) for e in self.edges)
```

**Time Complexity:**
- Add Edge: O(1)
- Remove/Find Edge: O(E)
- Get Neighbors: O(E)
- Space: Θ(E)

**Use when:** We need to process all edges (e.g., Kruskal's algorithm) and don't need fast neighbor iteration.

### 15.3.4 Comparison

| Representation | Space | Edge Existence | Neighbor Iteration | Add Edge | Remove Edge |
|----------------|-------|----------------|---------------------|----------|-------------|
| Adjacency Matrix | Θ(V²) | O(1) | O(V) | O(1) | O(1) |
| Adjacency List | Θ(V+E) | O(degree) | O(degree) | O(1) | O(degree) |
| Edge List | Θ(E) | O(E) | O(E) | O(1) | O(E) |

**Choice depends on graph density and required operations.**

---

## 15.4 Graph Types

### 15.4.1 Directed vs Undirected

```python
def directed_vs_undirected():
    print("\nDirected vs Undirected Graphs")
    print("=" * 70)
    print("""
    Directed Graph (Digraph):
        • Edges have direction: (u → v) is different from (v → u).
        • Used for: web links, Twitter follow relationships, dependencies.
    
    Undirected Graph:
        • Edges have no direction: (u — v) is same as (v — u).
        • Used for: Facebook friendships, road networks, collaboration graphs.
    """)
```

### 15.4.2 Weighted vs Unweighted

```python
def weighted_vs_unweighted():
    print("\nWeighted vs Unweighted Graphs")
    print("=" * 70)
    print("""
    Weighted Graph:
        • Each edge has a numerical value (weight, cost, distance).
        • Used for: road maps (distance), network flows (capacity).
    
    Unweighted Graph:
        • All edges are equal (implicit weight = 1).
        • Used for: social connections, unweighted relationships.
    """)
```

### 15.4.3 Cyclic vs Acyclic

- **Cyclic Graph:** Contains at least one cycle.
- **Acyclic Graph:** No cycles.
  - Undirected acyclic → **forest** (connected → tree).
  - Directed acyclic → **DAG** (Directed Acyclic Graph), crucial for scheduling, topological sorting.

### 15.4.4 Connected vs Disconnected

- **Connected Graph:** Every vertex reachable from every other (undirected).
- **Disconnected Graph:** Consists of multiple connected components.

For directed graphs:
- **Strongly Connected:** Every vertex reachable from every other via directed paths.
- **Weakly Connected:** Underlying undirected graph is connected.
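
The weak-connectivity definition translates almost directly into code: drop the edge directions, then run an ordinary reachability check. The sketch below assumes the `(neighbor, weight)` adjacency-list format used elsewhere in this chapter; the function name `is_weakly_connected` is hypothetical.

```python
from collections import deque


def is_weakly_connected(adj_list):
    """adj_list[u] holds (neighbor, weight) pairs of a directed graph."""
    n = len(adj_list)
    if n == 0:
        return True  # convention assumed here: the empty graph counts as connected
    # Build the underlying undirected graph by adding each edge in both directions
    undirected = [set() for _ in range(n)]
    for u in range(n):
        for v, _ in adj_list[u]:
            undirected[u].add(v)
            undirected[v].add(u)
    # BFS from vertex 0 and count how many vertices are reached
    seen = [False] * n
    seen[0] = True
    queue = deque([0])
    reached = 1
    while queue:
        u = queue.popleft()
        for v in undirected[u]:
            if not seen[v]:
                seen[v] = True
                reached += 1
                queue.append(v)
    return reached == n


# 0 -> 1 <- 2: weakly connected, but not strongly connected
# (vertex 1 has no outgoing edges, so it cannot reach 0 or 2)
chain = [[(1, 1)], [], [(1, 1)]]
print(is_weakly_connected(chain))  # True
```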

### 15.4.5 Bipartite Graph

A graph whose vertices can be divided into two disjoint sets U and V such that every edge connects a vertex in U to one in V. No edge connects two vertices in the same set.

**Characterization:** A graph is bipartite if and only if it contains no odd-length cycles.

**Check:** Use BFS to 2-color the graph.

```python
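from collections import deque


def is_bipartite(graph):
    """
    Check bipartiteness using BFS 2-coloring.
    graph: adjacency list where graph[u] is a list of neighbor indices
           (plain vertices, not (neighbor, weight) pairs; 0-indexed)
    """
    n = len(graph)
    color = [-1] * n  # -1 = uncolored; 0 and 1 are the two colors
    for start in range(n):  # restart BFS in every component
        if color[start] == -1:
            queue = deque([start])
            color[start] = 0
            while queue:
                u = queue.popleft()  # O(1), unlike list.pop(0)
                for v in graph[u]:
                    if color[v] == -1:
                        color[v] = 1 - color[u]
                        queue.append(v)
                    elif color[v] == color[u]:
                        return False  # edge inside one color class: odd cycle
    return True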
```

### 15.4.6 Complete Graph

Every pair of distinct vertices is connected by an edge. Denoted K_n for n vertices. Number of edges = n(n-1)/2 (undirected).
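
The edge count can be checked by construction: the edges of K_n are exactly the 2-element subsets of the vertex set. The helper name `complete_graph_edges` below is hypothetical.

```python
from itertools import combinations


def complete_graph_edges(n):
    """Return the edge list of the undirected complete graph K_n on vertices 0..n-1."""
    return list(combinations(range(n), 2))


# Verify the n(n-1)/2 formula for a few small n
for n in range(1, 7):
    assert len(complete_graph_edges(n)) == n * (n - 1) // 2

print(complete_graph_edges(4))
# [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```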

### 15.4.7 Sparse vs Dense Graphs

- **Sparse:** Number of edges |E| is much smaller than |V|², typically O(|V|) or O(|V| log |V|).
- **Dense:** |E| is close to |V|², i.e., Θ(|V|²).

The distinction guides algorithm selection: adjacency lists for sparse, matrices for dense.
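
One way to make "sparse" and "dense" concrete is to compute the fraction of possible edges that are actually present. The sketch below assumes a simple graph (no self-loops or parallel edges); the function name `density` is hypothetical, and where to draw the line between sparse and dense remains a judgment call.

```python
def density(num_vertices, num_edges, directed=False):
    """Fraction of possible edges present in a simple graph."""
    if num_vertices < 2:
        return 0.0
    # A directed simple graph has at most V(V-1) edges; an undirected one, half that
    max_edges = num_vertices * (num_vertices - 1)
    if not directed:
        max_edges //= 2
    return num_edges / max_edges


print(density(5, 10))       # 1.0, the complete graph K_5
print(density(1000, 999))   # a tree on 1000 vertices: about 0.002
```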

---

## 15.5 Basic Graph Algorithms

### 15.5.1 Degree Calculation

```python
def degrees(adj_list, directed=False):
    """
    Compute degrees of all vertices.
    adj_list[u] is a list of (neighbor, weight) pairs.
    Returns:
        if directed: a tuple (indeg, outdeg) of two lists
        else: a single list deg
    """
    n = len(adj_list)
    if directed:
        indeg = [0] * n
        outdeg = [0] * n
        for u in range(n):
            outdeg[u] = len(adj_list[u])
            for v, _ in adj_list[u]:
                indeg[v] += 1
        return indeg, outdeg
    else:
        deg = [0] * n
        for u in range(n):
            deg[u] = len(adj_list[u])
        return deg
```
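
A useful sanity check on degree computations is the handshake lemma: in an undirected graph the degrees sum to 2|E|, because every edge contributes to the degree of both endpoints. The sketch below uses that fact to count edges, assuming the same `(neighbor, weight)` adjacency-list format; the function name is hypothetical.

```python
def count_edges_undirected(adj_list):
    """Each undirected edge u-v appears in both adj_list[u] and adj_list[v]."""
    total_degree = sum(len(neighbors) for neighbors in adj_list)
    return total_degree // 2


# Path 0 - 1 - 2 plus the edge 1 - 3
adj = [
    [(1, 1)],
    [(0, 1), (2, 1), (3, 1)],
    [(1, 1)],
    [(1, 1)],
]
print(count_edges_undirected(adj))  # 3
```

For directed graphs the analogous identity is that the indegrees and the outdegrees each sum to |E|.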

### 15.5.2 Connected Components (Undirected)

Run DFS (or BFS) from each unvisited vertex and label every vertex it reaches with the same component ID.

```python
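def connected_components(adj_list):
    """
    Find connected components in an undirected graph.
    Returns a list giving the component ID of each vertex.
    Note: the recursive DFS can exceed Python's default recursion
    limit on graphs that contain very long paths.
    """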
    n = len(adj_list)
    visited = [False] * n
    component = [-1] * n
    comp_id = 0

    def dfs(u):
        visited[u] = True
        component[u] = comp_id
        for v, _ in adj_list[u]:
            if not visited[v]:
                dfs(v)

    for i in range(n):
        if not visited[i]:
            dfs(i)
            comp_id += 1
    return component
```
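
Because Python limits recursion depth (1000 frames by default), a recursive DFS can fail on graphs with long paths. The sketch below does the same labelling with an explicit stack instead of recursion. The function name `connected_components_iterative` is hypothetical, and the input format matches the one above.

```python
def connected_components_iterative(adj_list):
    """Label components using an explicit stack instead of recursion."""
    n = len(adj_list)
    component = [-1] * n   # -1 means "not visited yet"
    comp_id = 0
    for start in range(n):
        if component[start] != -1:
            continue
        component[start] = comp_id
        stack = [start]
        while stack:
            u = stack.pop()
            for v, _ in adj_list[u]:
                if component[v] == -1:
                    component[v] = comp_id
                    stack.append(v)
        comp_id += 1
    return component


# Two components: {0, 1, 2} and {3, 4}
adj = [[(1, 1)], [(0, 1), (2, 1)], [(1, 1)], [(4, 1)], [(3, 1)]]
print(connected_components_iterative(adj))  # [0, 0, 0, 1, 1]
```

Marking a vertex when it is pushed, rather than when it is popped, keeps each vertex on the stack at most once.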

### 15.5.3 Cycle Detection

#### Undirected Graph

```python
def has_cycle_undirected(adj_list):
    """
    Detect a cycle using DFS with parent tracking.
    Assumes a simple graph: a parallel edge back to the parent
    is not reported as a cycle.
    """
    n = len(adj_list)
    visited = [False] * n

    def dfs(u, parent):
        visited[u] = True
        for v, _ in adj_list[u]:
            if not visited[v]:
                if dfs(v, u):
                    return True
            elif v != parent:
                return True
        return False

    for i in range(n):
        if not visited[i]:
            if dfs(i, -1):
                return True
    return False
```

#### Directed Graph

Use three colors: 0 = unvisited, 1 = visiting (in recursion stack), 2 = fully processed.

```python
def has_cycle_directed(adj_list):
    n = len(adj_list)
    state = [0] * n  # 0 unvisited, 1 visiting, 2 processed

    def dfs(u):
        state[u] = 1  # visiting
        for v, _ in adj_list[u]:
            if state[v] == 1:  # back edge
                return True
            if state[v] == 0 and dfs(v):
                return True
        state[u] = 2  # processed
        return False

    for i in range(n):
        if state[i] == 0:
            if dfs(i):
                return True
    return False
```

### 15.5.4 Graph Coloring (Greedy)

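A simple greedy algorithm that gives adjacent vertices different colors while trying to use few colors. The result is not necessarily optimal and depends on the order in which vertices are processed. Typical uses include register allocation and scheduling.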

```python
def greedy_coloring(adj_list):
    """
    Assign colors (integers) to vertices such that adjacent vertices have different colors.
    Returns list of colors (0-indexed).
    """
    n = len(adj_list)
    result = [-1] * n
    # Process vertices in index order (any order is valid, but the number of colors may differ)
    for u in range(n):
        # Find colors of neighbors
        used = set()
        for v, _ in adj_list[u]:
            if result[v] != -1:
                used.add(result[v])
        # Assign smallest available color
        color = 0
        while color in used:
            color += 1
        result[u] = color
    return result
```
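
How much the vertex order matters can be seen on a very small graph. The sketch below is a variant of the greedy routine that takes the processing order as a parameter; the helper name `greedy_coloring_in_order` and its `order` argument are hypothetical. It is run twice on a 4-vertex path, which is bipartite and so has chromatic number 2.

```python
def greedy_coloring_in_order(adj_list, order):
    """Greedy coloring that visits vertices in the given order."""
    result = [-1] * len(adj_list)
    for u in order:
        # Colors already taken by colored neighbors
        used = {result[v] for v, _ in adj_list[u] if result[v] != -1}
        color = 0
        while color in used:
            color += 1
        result[u] = color
    return result


# Path 0 - 1 - 2 - 3
path = [[(1, 1)], [(0, 1), (2, 1)], [(1, 1), (3, 1)], [(2, 1)]]

print(greedy_coloring_in_order(path, [0, 1, 2, 3]))  # [0, 1, 0, 1] -> 2 colors
print(greedy_coloring_in_order(path, [0, 3, 1, 2]))  # [0, 1, 2, 0] -> 3 colors
```

In the second order both endpoints are colored 0 first, which forces the two middle vertices to take colors 1 and 2.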

---

## 15.6 Graph Density and Optimizations

```python
def density_considerations():
    print("\nGraph Density and Algorithm Choices")
    print("=" * 70)
    print("""
    Sparse Graphs (|E| ≈ |V|):
        • Use adjacency list.
        • Algorithms: DFS/BFS (O(V+E)), Dijkstra with binary heap (O((V+E)log V)),
          Kruskal (O(E log V)).
    
    Dense Graphs (|E| ≈ |V|²):
        • Consider adjacency matrix.
        • Algorithms: Prim with adjacency matrix (O(V²)), Floyd-Warshall (O(V³)).
    
    Very Large Graphs:
        • Use external memory or distributed representations (e.g., adjacency lists on disk).
        • Consider approximate algorithms (random walks, sampling).
    """)
```

---

## 15.7 Applications of Graph Fundamentals

```python
def graph_applications():
    print("\nReal-World Applications")
    print("=" * 70)
    print("""
    1. Social Network Analysis
       - Find connected components (communities)
       - Compute centrality (degree, betweenness)
       - Recommend friends (common neighbors)
    
    2. Web Search
       - PageRank uses directed graph of hyperlinks
       - Detect spam farms (bipartite cores)
    
    3. Transportation
       - Shortest paths (next chapters)
       - Traffic flow (network flow)
    
    4. Dependency Resolution
       - Package managers (DAG of dependencies)
       - Build systems (Make, Bazel)
    
    5. Biology
       - Protein interaction networks (connected components)
       - Metabolic pathways (directed graphs)
    
    6. Computer Networks
       - Routing protocols (OSPF uses graph algorithms)
       - Detect network partitions
    """)
```

---

## 15.8 Summary

```
┌─────────────────────────────────────────────────────────────────────┐
│                    GRAPH FUNDAMENTALS SUMMARY                        │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Representations:                                                    │
│    • Adjacency Matrix: O(V²) space, fast edge checks                │
│    • Adjacency List: O(V+E) space, fast neighbor iteration          │
│    • Edge List: O(E) space, good for edge processing                │
│                                                                      │
│  Graph Properties:                                                   │
│    • Directed/Undirected, Weighted/Unweighted, Cyclic/Acyclic       │
│    • Connected, Strongly Connected, Bipartite, Complete             │
│    • Sparse vs Dense                                                 │
│                                                                      │
│  Basic Operations:                                                   │
│    • Add/Remove Edge, Check Existence, Get Neighbors                │
│    • Degree Calculation, Connected Components                        │
│    • Cycle Detection, Graph Coloring                                 │
│                                                                      │
│  Choosing Representation:                                            │
│    • Use adjacency list for most applications (sparse graphs)       │
│    • Use adjacency matrix for dense graphs or when edge existence   │
│      checks are extremely frequent                                   │
│    • Use edge list for algorithms that process all edges (Kruskal)  │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘
```

---

## 15.9 Practice Problems

### Problem 1: Graph Representation Conversion
Implement functions to convert between adjacency matrix, adjacency list, and edge list.

### Problem 2: Count Connected Components
Given an undirected graph, count the number of connected components. Then, for each component, output its vertices.

### Problem 3: Bipartite Check
Determine if a graph is bipartite. If yes, output the two partitions.

### Problem 4: Cycle Detection
Given a directed graph, detect if it has a cycle. If yes, output one cycle.

### Problem 5: Graph Coloring
Implement the greedy coloring algorithm and test on interval graphs (or random graphs). How many colors does it use compared to the chromatic number?

### Problem 6: Degree Distribution
Compute the degree distribution of a graph (histogram of degrees). For directed graphs, compute indegree and outdegree distributions.

### Problem 7: Is Graph a Tree?
Check if an undirected graph is a tree (connected and acyclic).

### Problem 8: Transpose Graph
Given a directed graph, compute its transpose (reverse all edges).

---

## 15.10 Further Reading

1. **"Introduction to Algorithms" (CLRS)** – Chapter 22 (Elementary Graph Algorithms)
2. **"The Algorithm Design Manual"** by Steven Skiena – Chapter 5 (Graph Traversal)
3. **"Graph Theory"** by Reinhard Diestel – Standard reference for graph theory
4. **"Algorithms"** by Robert Sedgewick – Part 5 (Graph Algorithms)

---

> **Coming in Chapter 16**: **Graph Traversals** – We'll dive deep into BFS, DFS, topological sorting, and strongly connected components.

---

**End of Chapter 15**