# A LeetCoder's Guide to Graph Traversal

## Chapter 1: How to Think in Graphs

Welcome! Many LeetCode problems, even those that don't seem to mention graphs, are secretly graph problems in disguise. The key is to abstract the problem into a set of **nodes** (states or items) and **edges** (transitions or relationships). Once you see the graph, you can unlock a powerful toolbox of well-known algorithms to solve the problem systematically.

### How to Spot a Graph Problem 🕵️‍♀️

Look for these common patterns:

- **States and Transitions:** The problem involves moving from one state to another. A classic example is **Word Ladder (LC #127)**, where each word is a node, and an edge exists between two words if they are one letter apart. You are finding a path from a start state to an end state.

- **Explicit Relationships:** The problem directly describes relationships between objects. This is common in problems involving course prerequisites, social networks, or dependencies. Each object is a node, and the relationship forms an edge.

- **Grid/Matrix Traversal:** Many problems are set on a 2D grid. Think of each cell `(row, col)` as a node. The edges are implicit: they connect a cell to its adjacent neighbors (up, down, left, right). Problems like **Number of Islands (LC #200)** are classic examples.

## Chapter 2: Representing a Graph

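Before we can traverse a graph, we need to build it in memory. Three representations are common: the edge list (the raw input format in many problems), the adjacency list, and the adjacency matrix. This chapter covers the latter two; the adjacency list is dominant in coding interviews.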

### Adjacency List

This is the most common and versatile representation for interviews. It's a dictionary (or a hash map) where each key is a node, and its value is a list of its neighbors.

- **Pros:** Space-efficient for sparse graphs (where the number of edges is much less than the number of possible edges). Iterating through a node's neighbors takes time proportional to the number of neighbors, with no scan over unrelated nodes.
- **Cons:** Checking for the existence of a specific edge `(u, v)` can be slower, taking O(k) time where k is the number of neighbors of `u`.

```python
from collections import defaultdict

def build_adjacency_list(edges):
    """Builds an adjacency list from a list of edges."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        # For an undirected graph, add the reverse edge as well
        # adj[v].append(u)
    return adj

# Example Usage:
edges = [['A', 'B'], ['B', 'C'], ['A', 'C'], ['C', 'D']]
adjacency_list = build_adjacency_list(edges)
print(dict(adjacency_list))
```

### Adjacency Matrix

An $N \times N$ matrix where $N$ is the number of nodes. The entry `matrix[i][j]` is 1 (or the edge weight) if an edge exists from node `i` to node `j`, and 0 otherwise.

- **Pros:** Checking for the existence of an edge `(i, j)` is very fast, an O(1) operation.
- **Cons:** Requires $O(N^2)$ space, which is inefficient for sparse graphs. Adding or removing nodes is costly.
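
A minimal sketch of building an adjacency matrix from an edge list. It assumes nodes are labeled with integers `0..N-1` and edges are unweighted; `build_adjacency_matrix` and its `directed` flag are illustrative names, not part of any library.

```python
def build_adjacency_matrix(num_nodes, edges, directed=True):
    """Builds an N x N 0/1 matrix from (u, v) pairs with node labels 0..N-1."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        matrix[u][v] = 1
        if not directed:
            matrix[v][u] = 1  # mirror the edge for an undirected graph
    return matrix

matrix = build_adjacency_matrix(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
print(matrix[0][2] == 1)  # O(1) check: is there an edge 0 -> 2?
print(matrix[3][0] == 1)  # O(1) check: is there an edge 3 -> 0?
```

The nested list always holds $N^2$ entries regardless of how many edges exist, which is the space cost described above.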

## Chapter 3: The 5-Step Framework for Graph Problems

Use this systematic approach to deconstruct any graph problem you encounter.

1.  **Identify**: Recognize the problem as a graph problem by looking for the patterns mentioned in Chapter 1.
2.  **Model**: Clearly define what the nodes and edges represent. Are edges directed or undirected? Are they weighted?
3.  **Select**: Choose the right traversal algorithm based on the problem's requirements:
    - **DFS**: Path existence, cycle detection, topological sort, finding connected components.
    - **BFS**: Shortest path in an *unweighted* graph.
    - **Dijkstra's**: Shortest path in a *weighted* graph with non-negative weights.
    - **Topological Sort**: Problems with dependencies or prerequisites.
4.  **Implement**: Use a standard boilerplate template for your chosen algorithm. This minimizes bugs and saves time.
5.  **Verify**: Test your implementation with edge cases like empty graphs, disconnected components, and cycles.

## Chapter 4: Depth-First Search (DFS) 🌲

**Intuition**: DFS explores as far as possible down one path before backtracking. Imagine navigating a maze by always taking the first path you see, and when you hit a dead end, you backtrack to the last junction and try the next available path.

**Use Cases**: 
- Finding if a path exists between two nodes.
- Detecting cycles in a graph.
- Finding all connected components.
- Topological sorting (a variation of DFS).

### Boilerplate Code (Recursive & Iterative)

DFS can be implemented elegantly with recursion (using the call stack) or iteratively with an explicit stack.

```python
# Recursive DFS Boilerplate
def dfs_recursive(graph, start_node, visited=None):
    if visited is None:
        visited = set()

    visited.add(start_node)
    print(start_node, end=' ')

    for neighbor in graph.get(start_node, []):
        if neighbor not in visited:
            dfs_recursive(graph, neighbor, visited)

# Iterative DFS Boilerplate
def dfs_iterative(graph, start_node):
    visited = set()
    stack = [start_node]

    while stack:
        node = stack.pop()
        if node not in visited:
            visited.add(node)
            print(node, end=' ')
            # Add neighbors to the stack in reverse to process them in order
            for neighbor in reversed(graph.get(node, [])):
                if neighbor not in visited:
                    stack.append(neighbor)
```
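
Two of the use cases listed earlier, path existence and connected components, can be sketched on top of the iterative template. The helper names `has_path` and `count_components` are illustrative, and the sample graph is assumed to be undirected, with each edge stored in both directions.

```python
def has_path(graph, source, target):
    """Returns True if target is reachable from source (iterative DFS)."""
    visited = set()
    stack = [source]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in visited:
            continue
        visited.add(node)
        stack.extend(graph.get(node, []))
    return False

def count_components(nodes, graph):
    """Counts connected components of an undirected graph."""
    visited = set()
    components = 0
    for start in nodes:
        if start in visited:
            continue
        components += 1  # first unvisited node of a new component
        stack = [start]
        while stack:
            node = stack.pop()
            if node not in visited:
                visited.add(node)
                stack.extend(graph.get(node, []))
    return components

# Undirected graph with two components: {A, B, C} and {D, E}
graph = {'A': ['B', 'C'], 'B': ['A'], 'C': ['A'], 'D': ['E'], 'E': ['D']}
print(has_path(graph, 'A', 'C'))
print(has_path(graph, 'A', 'E'))
print(count_components(graph.keys(), graph))
```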

### LeetCode Case Study: Number of Islands (LC #200)

Given an `m x n` 2D binary grid which represents a map of '1's (land) and '0's (water), return the number of islands. An island is surrounded by water and is formed by connecting adjacent lands horizontally or vertically.

```python
from typing import List

class Solution_NumberOfIslands:
    def numIslands(self, grid: List[List[str]]) -> int:
        if not grid:
            return 0

        rows, cols = len(grid), len(grid[0])
        num_islands = 0

        def dfs(r, c):
            # Boundary and water check
            if r < 0 or c < 0 or r >= rows or c >= cols or grid[r][c] == '0':
                return

            # Mark the current cell as visited by changing it to '0'
            grid[r][c] = '0'

            # Explore neighbors
            dfs(r + 1, c)
            dfs(r - 1, c)
            dfs(r, c + 1)
            dfs(r, c - 1)

        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == '1':
                    # Found a new island, explore it completely
                    dfs(r, c)
                    num_islands += 1

        return num_islands

# Example Usage:
solver = Solution_NumberOfIslands()
grid = [
  ["1","1","0","0","0"],
  ["1","1","0","0","0"],
  ["0","0","1","0","0"],
  ["0","0","0","1","1"]
]
print(f"Number of Islands: {solver.numIslands(grid)}") # Expected: 3
```

## Chapter 5: Breadth-First Search (BFS) 🌊

**Intuition**: BFS explores nodes level by level. It starts at a source node and explores all its immediate neighbors first, then their neighbors, and so on. Imagine ripples spreading out from a stone dropped in a pond.

**Core Strength**: Because of its level-by-level nature, BFS is the go-to algorithm for finding the **shortest path in an unweighted graph**.

### Boilerplate Code

BFS is implemented iteratively using a queue (`deque` in Python) to keep track of the nodes to visit next.

```python
from collections import deque

def bfs(graph, start_node):
    visited = set()
    queue = deque([start_node])
    visited.add(start_node) # Mark as visited when enqueued

    while queue:
        node = queue.popleft()
        print(node, end=' ')

        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
```
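
The template only prints the visit order. To recover an actual shortest path, one common approach is to record each node's parent when it is first enqueued and walk those links back from the goal. The sketch below assumes an unweighted graph stored as a dict of neighbor lists; `shortest_path` is an illustrative name.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Returns a fewest-edges path from start to goal as a list, or [] if unreachable."""
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk parent links back to start
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return []

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': ['E']}
print(shortest_path(graph, 'A', 'E'))
```

Because nodes are discovered level by level, the first time the goal is dequeued its parent chain already has the minimum number of edges.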

### LeetCode Case Study: Word Ladder (LC #127)

Given two words, `beginWord` and `endWord`, and a dictionary `wordList`, return the length of the shortest transformation sequence from `beginWord` to `endWord` (counted in words, including both endpoints), such that only one letter can be changed at a time and each transformed word must exist in the word list. Return `0` if no such sequence exists.

```python
from collections import deque
from typing import List

class Solution_WordLadder:
    def ladderLength(self, beginWord: str, endWord: str, wordList: List[str]) -> int:
        wordSet = set(wordList)
        if endWord not in wordSet:
            return 0

        queue = deque([(beginWord, 1)]) # (word, length of path)
        visited = {beginWord}

        while queue:
            current_word, length = queue.popleft()

            if current_word == endWord:
                return length

            # Find all possible next words by changing one letter
            for i in range(len(current_word)):
                for char_code in range(ord('a'), ord('z') + 1):
                    char = chr(char_code)
                    next_word = current_word[:i] + char + current_word[i+1:]

                    if next_word in wordSet and next_word not in visited:
                        visited.add(next_word)
                        queue.append((next_word, length + 1))

        return 0 # End word not reachable

# Example Usage:
solver = Solution_WordLadder()
beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log","cog"]
print(f"Shortest ladder length: {solver.ladderLength(beginWord, endWord, wordList)}") # Expected: 5
```

## Chapter 6: Dijkstra's Algorithm ⚖️

**Why BFS Fails**: BFS finds the shortest path in terms of the *number of edges*. It treats every edge as having a weight of 1. When edges have different weights (costs), BFS is no longer guaranteed to find the cheapest path.
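
A small illustration of that failure, using a made-up three-node graph. The function `bfs_hops_and_cost` is a hypothetical helper that runs plain BFS over a weighted graph (stored as `node -> {neighbor: weight}`) and reports the total weight of the path it happens to find.

```python
from collections import deque

def bfs_hops_and_cost(graph, start, goal):
    """BFS that ignores weights when choosing a path.

    Returns (edges, total_weight) of the fewest-edges path found,
    or None if goal is unreachable.
    """
    queue = deque([(start, 0, 0)])  # (node, edges so far, weight so far)
    visited = {start}
    while queue:
        node, hops, cost = queue.popleft()
        if node == goal:
            return hops, cost
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, hops + 1, cost + weight))
    return None

graph = {'A': {'B': 10, 'C': 1}, 'C': {'B': 1}, 'B': {}}
print(bfs_hops_and_cost(graph, 'A', 'B'))  # direct edge A -> B: 1 edge, cost 10
print(graph['A']['C'] + graph['C']['B'])   # detour A -> C -> B: 2 edges, cost 2
```

BFS commits to the one-edge route with cost 10, even though the two-edge detour costs only 2.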

**Dijkstra's Algorithm**: Solves the single-source shortest path problem for a **weighted graph (directed or undirected) with non-negative edge weights**.

### Mechanism

Dijkstra's works by maintaining a set of visited nodes and the current shortest known distance to every other node. It uses a **greedy** approach:

1.  Start at the source node, with a distance of 0.
2.  At each step, select the unvisited node with the smallest known distance.
3.  Visit this node and "relax" its neighbors: for each neighbor, check if the path through the current node is shorter than its previously known shortest path. If so, update it.
4.  Repeat until the destination is visited or all reachable nodes have been visited.

A **Priority Queue (min-heap)** is used to efficiently select the unvisited node with the smallest distance in step 2.

### Boilerplate Code
We use Python's `heapq` module to implement the priority queue.

```python
import heapq

def dijkstra(graph, start):
    """Shortest distances from start; graph maps node -> {neighbor: weight}."""
    # distances stores the shortest known distance from start to every node.
    # A node missing from this dict is treated as infinitely far away.
    distances = {node: float('inf') for node in graph}
    distances[start] = 0

    # Priority queue stores tuples of (distance, node)
    priority_queue = [(0, start)]

    while priority_queue:
        current_distance, current_node = heapq.heappop(priority_queue)

        # Stale entry: a shorter path to this node was already found, skip
        if current_distance > distances[current_node]:
            continue

        # .get handles nodes that appear only as neighbors (no outgoing edges)
        for neighbor, weight in graph.get(current_node, {}).items():
            distance = current_distance + weight

            # If we found a shorter path to the neighbor, update it
            if distance < distances.get(neighbor, float('inf')):
                distances[neighbor] = distance
                heapq.heappush(priority_queue, (distance, neighbor))

    return distances
```
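
Note that this boilerplate iterates over `.items()` of each node's neighbors, so it expects a dict of dicts (`node -> {neighbor: weight}`) rather than the list-based adjacency list from Chapter 2. A minimal sketch for building that shape from `(u, v, w)` triples follows; `build_weighted_graph` is an illustrative helper name.

```python
def build_weighted_graph(edges, directed=True):
    """Builds {node: {neighbor: weight}} from (u, v, w) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})  # make sure every node appears as a key
        if not directed:
            graph[v][u] = w
    return graph

graph = build_weighted_graph([('A', 'B', 4), ('A', 'C', 1), ('C', 'B', 2), ('B', 'D', 5)])
print(graph)
```

Passing this `graph` to `dijkstra` with start `'A'` should give distances of 0 for `A`, 1 for `C`, 3 for `B` (via `C`), and 8 for `D`.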

### LeetCode Case Study: Network Delay Time (LC #743)

You are given a network of `n` nodes, labeled from `1` to `n`. You are also given `times`, a list of travel times as directed edges `times[i] = (u, v, w)`, where `u` is the source node, `v` is the target node, and `w` is the time it takes for a signal to travel from source to target. We will send a signal from a given node `k`. Return the time it takes for all `n` nodes to receive the signal. If it is impossible for all `n` nodes to receive the signal, return `-1`.

```python
import heapq
from collections import defaultdict
from typing import List

class Solution_NetworkDelay:
    def networkDelayTime(self, times: List[List[int]], n: int, k: int) -> int:
        # 1. Build the adjacency list
        adj = defaultdict(list)
        for u, v, w in times:
            adj[u].append((v, w))

        # 2. Initialize distances and priority queue
        distances = {i: float('inf') for i in range(1, n + 1)}
        distances[k] = 0
        pq = [(0, k)] # (distance, node)

        # 3. Dijkstra's Algorithm
        while pq:
            dist, node = heapq.heappop(pq)

            # Skip stale entries: a shorter path to this node was already found
            if dist > distances[node]:
                continue

            for neighbor, weight in adj[node]:
                new_dist = dist + weight
                if new_dist < distances[neighbor]:
                    distances[neighbor] = new_dist
                    heapq.heappush(pq, (new_dist, neighbor))

        # 4. The answer is the largest shortest-path distance;
        #    infinity means at least one node never received the signal
        max_delay = max(distances.values())

        return max_delay if max_delay != float('inf') else -1

# Example Usage:
solver = Solution_NetworkDelay()
times = [[2,1,1],[2,3,1],[3,4,1]]
n = 4
k = 2
print(f"Network Delay Time: {solver.networkDelayTime(times, n, k)}") # Expected: 2
```

## Chapter 7: Topological Sort 📚

**Intuition**: A topological sort is a linear ordering of nodes such that for every directed edge from node `u` to node `v`, node `u` comes before node `v` in the ordering. Think of it as ordering tasks with dependencies, like course prerequisites. You must take 'Calculus I' before 'Calculus II'.

**Key Constraint**: Topological sort is only possible on a **Directed Acyclic Graph (DAG)**. If a cycle exists (e.g., A depends on B, and B depends on A), no valid ordering is possible.

### Mechanism (Kahn's Algorithm)

Kahn's algorithm is an intuitive, BFS-based approach:

1.  **Compute In-degrees**: For each node, count how many incoming edges it has.
2.  **Initialize Queue**: Add all nodes with an in-degree of 0 to a queue. These are the starting points with no prerequisites.
3.  **Process Nodes**: 
    - Dequeue a node and add it to the sorted list.
    - For each of its neighbors, decrement their in-degree by 1 (since we've "completed" the prerequisite).
    - If a neighbor's in-degree becomes 0, enqueue it.
4.  **Check for Cycles**: If the final sorted list contains fewer nodes than the total number of nodes in the graph, a cycle must exist.

### Boilerplate Code

```python
from collections import defaultdict, deque

def topological_sort(nodes, edges):
    adj = defaultdict(list)
    in_degree = {node: 0 for node in nodes}

    for u, v in edges:
        adj[u].append(v)
        in_degree[v] += 1

    queue = deque([node for node in nodes if in_degree[node] == 0])
    sorted_order = []

    while queue:
        node = queue.popleft()
        sorted_order.append(node)

        for neighbor in adj[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    if len(sorted_order) == len(nodes):
        return sorted_order
    else:
        return [] # Cycle detected
```

### LeetCode Case Study: Course Schedule II (LC #210)

There are a total of `numCourses` courses you have to take, labeled from `0` to `numCourses - 1`. You are given an array `prerequisites` where `prerequisites[i] = [ai, bi]` indicates that you must take course `bi` first if you want to take course `ai`. Return the ordering of courses you should take to finish all courses. If there are many valid answers, return any of them. If it is impossible to finish all courses, return an empty array.

```python
from collections import defaultdict, deque
from typing import List

class Solution_CourseSchedule:
    def findOrder(self, numCourses: int, prerequisites: List[List[int]]) -> List[int]:
        # 1. Build adjacency list and in-degree map
        adj = defaultdict(list)
        in_degree = {i: 0 for i in range(numCourses)}

        for course, prereq in prerequisites:
            adj[prereq].append(course)
            in_degree[course] += 1

        # 2. Initialize queue with courses having no prerequisites
        queue = deque([c for c in range(numCourses) if in_degree[c] == 0])
        sorted_order = []

        # 3. Process courses
        while queue:
            prereq = queue.popleft()
            sorted_order.append(prereq)

            for course in adj[prereq]:
                in_degree[course] -= 1
                if in_degree[course] == 0:
                    queue.append(course)

        # 4. Check for cycle
        if len(sorted_order) == numCourses:
            return sorted_order
        else:
            return []

# Example Usage:
solver = Solution_CourseSchedule()
numCourses = 4
prerequisites = [[1,0],[2,0],[3,1],[3,2]]
print(f"Course order: {solver.findOrder(numCourses, prerequisites)}") # Expected: [0, 1, 2, 3] or [0, 2, 1, 3]
```

## Chapter 8: Deep Dive: Traversal on a 2D Matrix/Grid 🗺️

### The Grid is an Implicit Graph

In many problems, the graph isn't given to you as a list of edges. Instead, it's a grid. In this case:
- A **node** is a cell at coordinates `(row, col)`.
- An **edge** is an implicit connection between a cell and its adjacent neighbors (e.g., up, down, left, right).

You don't need to build an explicit adjacency list. You can compute a cell's neighbors on the fly.

### Core Pattern 1: The Directions Array

To avoid messy `if` statements for checking all four neighbors (`(r+1, c)`, `(r-1, c)`, etc.), use a `directions` array. This makes your code cleaner and less error-prone, especially if you need 8-way movement.

```python
# A clean way to iterate through neighbors
# (dr, dc) = (change in row, change in col)
directions = [(0, 1), (0, -1), (1, 0), (-1, 0)] # Right, Left, Down, Up
```
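
One way to put the array to work is a small generator that yields only in-bounds neighbors, so the traversal code never repeats the boundary check. The names `DIRECTIONS_4`, `DIRECTIONS_8`, and `neighbors` below are illustrative.

```python
DIRECTIONS_4 = [(0, 1), (0, -1), (1, 0), (-1, 0)]
# 8-way movement adds the four diagonals
DIRECTIONS_8 = DIRECTIONS_4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]

def neighbors(r, c, rows, cols, directions=DIRECTIONS_4):
    """Yields the in-bounds neighbors of cell (r, c)."""
    for dr, dc in directions:
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc

print(list(neighbors(0, 0, 3, 3)))                     # corner cell, 4-way
print(len(list(neighbors(1, 1, 3, 3, DIRECTIONS_8))))  # center cell, 8-way
```

Switching between 4-way and 8-way movement then only requires passing a different list.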

### Core Pattern 2: Tracking Visited Cells

To avoid infinite loops and redundant work, you must keep track of which cells you've already visited. There are two main strategies:

#### Method A: Auxiliary `visited` Matrix
Create a separate boolean matrix of the same dimensions as the grid.
```python
visited = [[False] * COLS for _ in range(ROWS)]
```
- **Pros**: Non-destructive. It doesn't modify the original input grid, which is good practice.
- **Cons**: Requires $O(R \cdot C)$ extra space.

#### Method B: In-Place Modification
Modify the input grid itself to mark cells as visited (e.g., changing a '1' to a '#' or a land cell to a water cell).
- **Pros**: $O(1)$ extra space for visited tracking, which can be a requirement in some problems. The recursion stack or BFS queue still takes its own space.
- **Cons**: Mutates the input. Always clarify with your interviewer if this is acceptable.

### Grid DFS: Boilerplate

A recursive approach is often the most concise for grid DFS. The function call stack naturally handles the "stack" data structure for you.

```python
class GridDFS_Boilerplate:
    def solve(self, grid):
        if not grid:
            return

        self.rows, self.cols = len(grid), len(grid[0])
        self.grid = grid
        self.visited = set() # Use a set of (r, c) tuples for sparse grids

        # Start traversal from a starting cell, e.g., (0, 0)
        self.dfs(0, 0)

    def dfs(self, r, c):
        # 1. Boundary Check
        if not (0 <= r < self.rows and 0 <= c < self.cols):
            return

        # 2. Visited Check (and any other invalid condition)
        if (r, c) in self.visited: # or self.grid[r][c] == 'obstacle':
            return

        # Mark as visited
        self.visited.add((r, c))

        # --- Process the current cell ---
        print(f"Visiting ({r}, {c})")

        # 3. Explore neighbors
        directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
        for dr, dc in directions:
            self.dfs(r + dr, c + dc)
```

### Grid BFS: Boilerplate

An iterative, queue-based approach is required for BFS. This is essential for finding the shortest path in a grid (e.g., shortest maze exit).

```python
from collections import deque

class GridBFS_Boilerplate:
    def solve(self, grid, start_r, start_c):
        if not grid:
            return

        rows, cols = len(grid), len(grid[0])

        # 1. Initialize queue and visited set
        queue = deque([(start_r, start_c, 0)]) # (row, col, distance)
        visited = {(start_r, start_c)}

        directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]

        while queue:
            r, c, dist = queue.popleft()

            # --- Process the current cell ---
            print(f"Visiting ({r}, {c}) at distance {dist}")

            # 2. Explore neighbors
            for dr, dc in directions:
                new_r, new_c = r + dr, c + dc

                # 3. Boundary and Visited Checks
                if (0 <= new_r < rows and 0 <= new_c < cols and
                        (new_r, new_c) not in visited):

                    # Mark visited at enqueue time, not when the cell is popped
                    visited.add((new_r, new_c))
                    queue.append((new_r, new_c, dist + 1))
```
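
As a concrete instance of the template, here is a sketch that returns the shortest path length through a maze. It assumes `1` marks a wall and `0` an open cell, that the start cell is open, and that `start` and `goal` are `(row, col)` tuples; `shortest_path_length` is an illustrative name.

```python
from collections import deque

def shortest_path_length(grid, start, goal):
    """Fewest 4-directional steps from start to goal, or -1 if unreachable."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start[0], start[1], 0)])  # (row, col, distance)
    visited = {start}
    directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]

    while queue:
        r, c, dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in directions:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in visited):
                visited.add((nr, nc))  # mark when enqueued
                queue.append((nr, nc, dist + 1))
    return -1

maze = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
print(shortest_path_length(maze, (0, 0), (2, 0)))
```

The only route here runs around the wall in the middle row, so the search has to travel along the top, down the right column, and back along the bottom.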

### Common Pitfalls ⚠️

- **Boundary Checks**: Always the first thing you should check in your traversal function. Forgetting `0 <= new_r < ROWS` and `0 <= new_c < COLS` is a common source of errors.

- **Visited Timing (BFS)**: This is a critical and subtle bug. **You must mark a cell as visited at the moment you add it to the queue**, not when you pop it from the queue. If you wait until you pop it, the same cell can be added to the queue multiple times by different neighbors. That wastes time and memory, often enough to cause TLE (Time Limit Exceeded), and gives incorrect results if each pop is counted or processed as a new cell.
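
The cost of late marking can be measured directly. The sketch below is a hypothetical experiment, not part of any template: `count_enqueues` runs BFS over a fully open grid and counts queue insertions under each marking policy.

```python
from collections import deque

DIRECTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def count_enqueues(rows, cols, mark_on_enqueue):
    """BFS over an open rows x cols grid from (0, 0); returns total enqueue operations."""
    queue = deque([(0, 0)])
    visited = {(0, 0)} if mark_on_enqueue else set()
    enqueues = 1
    while queue:
        r, c = queue.popleft()
        if not mark_on_enqueue:
            if (r, c) in visited:
                continue         # duplicate entry, already processed
            visited.add((r, c))  # late marking: only when popped
        for dr, dc in DIRECTIONS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in visited:
                if mark_on_enqueue:
                    visited.add((nr, nc))
                queue.append((nr, nc))
                enqueues += 1
    return enqueues

print(count_enqueues(3, 3, mark_on_enqueue=True))
print(count_enqueues(3, 3, mark_on_enqueue=False))
```

With marking at enqueue time, the count equals the number of cells (9 on a 3x3 grid). With marking at pop time, some cells are enqueued more than once, and the gap grows with the grid size.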

## Chapter 9: Other Key Concepts & Patterns

### Cycle Detection

- **Undirected Graph**: During a DFS, if you encounter a neighbor that is already visited but is *not* the immediate parent of the current node, you have found a cycle.
- **Directed Graph**: During a DFS, keep track of nodes currently in the recursion stack. If you encounter a node that is already in the current recursion stack, you have found a back edge, which means there is a cycle. This is often implemented with a three-color system (white: unvisited, gray: visiting, black: visited).
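
Both checks can be sketched as short recursive functions. The names below are illustrative. The undirected version assumes a simple graph (no parallel edges) stored with each edge in both directions, and both versions assume every node is listed in `nodes`.

```python
def has_cycle_undirected(nodes, graph):
    """Parent-tracking DFS for an undirected graph."""
    visited = set()

    def dfs(node, parent):
        visited.add(node)
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                if dfs(neighbor, node):
                    return True
            elif neighbor != parent:  # visited, and not the node we came from
                return True
        return False

    return any(node not in visited and dfs(node, None) for node in nodes)

def has_cycle_directed(nodes, graph):
    """Three-color DFS: reaching a gray node means a back edge, hence a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in nodes}

    def dfs(node):
        color[node] = GRAY            # on the current recursion stack
        for neighbor in graph.get(node, []):
            if color[neighbor] == GRAY:
                return True
            if color[neighbor] == WHITE and dfs(neighbor):
                return True
        color[node] = BLACK           # fully explored
        return False

    return any(color[node] == WHITE and dfs(node) for node in nodes)

triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}
path = {1: [2], 2: [1, 3], 3: [2]}
print(has_cycle_undirected([1, 2, 3], triangle), has_cycle_undirected([1, 2, 3], path))

dag = {'A': ['B', 'C'], 'B': ['C'], 'C': []}
loop = {'A': ['B'], 'B': ['C'], 'C': ['A']}
print(has_cycle_directed(['A', 'B', 'C'], dag), has_cycle_directed(['A', 'B', 'C'], loop))
```

In the `dag` example, `C` is reached twice (from `A` and from `B`), but it is black the second time, so it is correctly not reported as a cycle.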

### Bipartite Graphs

A graph is bipartite if its vertices can be divided into two disjoint and independent sets, U and V, such that every edge connects a vertex in U to one in V. In other words, you can "2-color" the graph such that no two adjacent nodes have the same color.

**Implementation**: Use BFS or DFS. Start by coloring the source node with one color (e.g., 1). Then, traverse the graph. For every neighbor of a node, color it with the opposite color. If you ever find a neighbor that is already colored with the *same* color as the current node, the graph is not bipartite. If the graph may be disconnected, repeat the traversal from every node that is still uncolored.
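
A BFS-based sketch of the 2-coloring check. It assumes the graph is given as a list of neighbor lists, where `graph[i]` holds the neighbors of node `i` and nodes are numbered `0..n-1`; the colors `1` and `-1` are an arbitrary choice. The outer loop restarts from every uncolored node so that disconnected graphs are covered.

```python
from collections import deque

def is_bipartite(graph):
    """Returns True if the undirected graph can be 2-colored."""
    color = {}
    for start in range(len(graph)):
        if start in color:
            continue
        color[start] = 1
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in color:
                    color[neighbor] = -color[node]  # opposite color
                    queue.append(neighbor)
                elif color[neighbor] == color[node]:
                    return False                    # same color on both ends
    return True

print(is_bipartite([[1, 3], [0, 2], [1, 3], [0, 2]]))        # a 4-cycle
print(is_bipartite([[1, 2, 3], [0, 2], [0, 1, 3], [0, 2]]))  # contains a triangle
```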

### LeetCode Examples

- **Cycle Detection**: [Course Schedule (LC #207)](https://leetcode.com/problems/course-schedule/) - This is the canonical problem for detecting cycles in a directed graph.
- **Bipartiteness**: [Is Graph Bipartite? (LC #785)](https://leetcode.com/problems/is-graph-bipartite/) - A direct application of the 2-coloring algorithm.