# 6 Graph


## 1. Build

We can follow the Leetcode convention and represent a graph by its nodes and an adjacency list. In Python this can be implemented as a list of lists or, as in the code that follows, as a dict that maps each node to the list of its neighbors.

In [3]:
def buildGraph(n_nodes, connections):
    graph = {}
    for node in range(n_nodes):
        graph[node] = []
    
    for node1, node2 in connections:
        graph[node1].append(node2)
        # to build an undirected graph, also add the reverse edge:
        # graph[node2].append(node1)
    return graph

# test
n_nodes = 5
connections = [[0, 1], [0, 2], [1, 2], [1, 3], [2, 3], [2, 4]]
graph = buildGraph(n_nodes, connections)
print(graph)

{0: [1, 2], 1: [2, 3], 2: [3, 4], 3: [], 4: []}


## 2. Traversal

Traversal solves problems such as finding paths between a given source node and target node, and producing a topological ordering of the nodes.
- find paths:
    - detect cycles in a graph `G`
        - if there is no cycle, a topological sort is a linear ordering of the nodes along a horizontal line such that all directed edges go from left to right.
        - if cycles are present, topological sorting is not possible.
    - find all possible paths from `s` to `t`


### Depth-first Search

- todo: implement using two approaches
    - traversal approach -> similar to backtracking
    - subproblem approach -> recursion with return value
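
As a rough illustration of the subproblem approach, the sketch below answers "is `t` reachable from `s`?" by combining the answers returned for the neighbors of `s`. The function name `has_path` and the adjacency-list dict are assumptions made for this example, not part of the original notes.

```python
def has_path(graph, s, t, visited=None):
    """Return True if t is reachable from s in an adjacency-list graph."""
    if visited is None:
        visited = set()
    if s == t:
        return True
    if s in visited:
        # already explored from here without success (or currently exploring)
        return False
    visited.add(s)
    # the answer for s is built from the answers returned for its neighbors
    return any(has_path(graph, v, t, visited) for v in graph[s])


graph = {0: [1, 2], 1: [2, 3], 2: [3, 4], 3: [], 4: []}
print(has_path(graph, 0, 4))  # True
print(has_path(graph, 3, 0))  # False
```

Here the recursion carries its result in the return value, whereas the traversal approach records results in external state while walking the graph.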

```
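visited = set()  # nodes that have been explored at least once
onPath = set()   # nodes on the current recursion path

def traverse(G, s):
    # G is an adjacency list, e.g. the dict returned by buildGraph
    # if s was already visited: DO SOMETHING (e.g. check onPath), then stop
    if s in visited:
        return

    # if not visited, mark it as visited
    visited.add(s)

    # make a choice: put s on the current path
    onPath.add(s)

    # recurse into every neighbor of s
    for v in G[s]:
        traverse(G, v)

    # undo the choice: remove s from the current path
    onPath.discard(s)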
```

Depth-first search vs backtracking

```
# DFS, focus on node
def dfs(root):
    if not root:
        return
    print(f"enter node {root}")
    for child in root.children:
        dfs(child)
    print(f"leave node {root}")

# Backtracking, focus on branch
def backtrack(root):
    if not root:
        return

    for child in root.children:
        # make a choice
        print(f"from node {root} to node {child}")
        backtrack(child)
        # undo the choice
        print(f"from node {child} to node {root}")
```
### Breadth-first Search



In [1]:
# traverse a graph using BFS, using the in-degree of each node to decide the visiting order (topological order)
def bfs(G):
    pass
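
One possible way to fill in the stub is Kahn's algorithm, which repeatedly visits nodes whose in-degree has dropped to zero. This is a minimal sketch, assuming the graph is a dict in which every node appears as a key; the name `topological_order` is hypothetical.

```python
from collections import deque


def topological_order(graph):
    """Return a topological order of the nodes, or None if the graph has a cycle."""
    # count incoming edges for every node
    indegree = {node: 0 for node in graph}
    for node in graph:
        for neighbor in graph[node]:
            indegree[neighbor] += 1

    # start from the nodes that have no incoming edges
    queue = deque(node for node in graph if indegree[node] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            indegree[neighbor] -= 1
            if indegree[neighbor] == 0:
                queue.append(neighbor)

    # nodes that lie on a cycle never reach in-degree 0
    return order if len(order) == len(graph) else None


graph = {0: [1, 2], 1: [2, 3], 2: [3, 4], 3: [], 4: []}
print(topological_order(graph))  # [0, 1, 2, 3, 4]
```

Returning `None` when some nodes are never visited doubles as a cycle check for directed graphs.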

## Shortest Path
The shortest path problem asks for the shortest path between two vertices. It comes in several variants:
- `single source shortest path`: find the shortest paths from a given source vertex `s` to each vertex `t`.
- `single destination shortest path`: find the shortest paths from each source vertex `s` to the destination vertex `t`. We can reverse the direction of each edge and reformulate it as a `single source shortest path` problem.
- `single pair shortest path`: find the shortest path for a given pair (`s`, `v`). This problem is usually solved by solving `single source shortest path` from `s` first.
- `all pairs shortest paths`: find a shortest path from `s` to `t` for every pair of vertices `s` and `t`. Although we can solve this problem by running a single source algorithm once from each vertex, we can usually solve it faster using XXX.


BFS is typically used for shortest path problems on unweighted graphs, because it reaches vertices in order of increasing number of edges from the source.

DFS is typically used for reachability problems. When used for shortest path problems, DFS needs to go over all possible paths between the given source and target, and then pick the shortest one.


In [2]:
def shortest_path_bfs(G, s):
    pass
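
A minimal sketch of how the stub could be completed for an unweighted graph, assuming the adjacency-list dict used earlier. The name `bfs_distances` is hypothetical; it returns the number of edges on a shortest path from `s` to every reachable node.

```python
from collections import deque


def bfs_distances(graph, s):
    """Return {node: fewest edges from s} for every node reachable from s."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            # the first time BFS reaches a node is via a shortest path
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


graph = {0: [1, 2], 1: [2, 3], 2: [3, 4], 3: [], 4: []}
print(bfs_distances(graph, 0))  # {0: 0, 1: 1, 2: 1, 3: 2, 4: 2}
```

Unreachable nodes are simply absent from the result, so `dist.get(t)` returning `None` means there is no path to `t`.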

## Shortest Weighted Path

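`Dijkstra's algorithm` is usually used to find shortest paths in a `weighted graph with nonnegative edge weights`; the graph may be directed or undirected and does not need to be acyclic. Dijkstra's algorithm extends BFS to weighted graphs by replacing the plain queue with a priority queue and keeping a `distance table` (similar to a DP table) that stores the best known distance from the source to each vertex.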

- initialization
    - a distance table that records the best distance found so far, with every entry set to `infinity`
    - base case: the distance from the source vertex `s` to itself is set to 0
    - a priority queue that holds the to-be-explored vertices, ordered by their current distance to the given source `s`
- main loop:
    - pop the vertex with the smallest distance from the queue, together with its current distance
    - explore its neighbors:
        - for each neighbor vertex, calculate its distance from the source `s` through the current vertex
        - if this distance < the best distance found so far in the distance table:
            - update the distance table
            - enqueue the neighbor with its new distance
- return the distance table
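
The steps listed here can be sketched with Python's `heapq` module as the priority queue. This is an illustrative sketch, assuming the graph is a dict that maps each node to a list of `(neighbor, weight)` pairs with nonnegative weights; the example graph `weighted` is made up.

```python
import heapq


def dijkstra(graph, s):
    """Return {node: shortest distance from s} for nonnegative edge weights."""
    # initialization: everything is infinitely far away except the source
    dist = {node: float("inf") for node in graph}
    dist[s] = 0
    heap = [(0, s)]  # (distance from s, node)

    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale entry: a shorter distance was already recorded
        for neighbor, weight in graph[node]:
            new_d = d + weight
            if new_d < dist[neighbor]:
                dist[neighbor] = new_d
                heapq.heappush(heap, (new_d, neighbor))
    return dist


weighted = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 5)],
    "d": [],
}
print(dijkstra(weighted, "a"))  # {'a': 0, 'b': 3, 'c': 1, 'd': 4}
```

Since `heapq` has no decrease-key operation, this sketch pushes a fresh entry on every improvement and skips outdated entries when they are popped.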

How can the shortest paths be reconstructed from the distance table?
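
One possible answer, offered as a sketch rather than the intended one: walk backwards from the target and, at each step, move to a predecessor `u` whose distance plus the edge weight equals the distance of the current node. The sketch assumes strictly positive integer weights (so the walk always makes progress and equality tests are exact); the function name, the example graph and the hard-coded distance table are all made up for illustration. A more common alternative is to record a predecessor for each node while the distances are being computed.

```python
def reconstruct_path(graph, dist, s, t):
    """Rebuild one shortest path from s to t using only the distance table."""
    if dist.get(t, float("inf")) == float("inf"):
        return None  # t is not reachable from s

    # build the reverse adjacency list: node -> [(predecessor, weight), ...]
    incoming = {node: [] for node in graph}
    for u in graph:
        for v, w in graph[u]:
            incoming[v].append((u, w))

    path = [t]
    node = t
    while node != s:
        for u, w in incoming[node]:
            # u lies on a shortest path to node if the distances line up
            if dist[u] + w == dist[node]:
                node = u
                break
        else:
            return None  # the distance table is inconsistent with the graph
        path.append(node)
    path.reverse()
    return path


weighted = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 5)],
    "d": [],
}
# distance table from source "a", assumed to be computed beforehand
dist = {"a": 0, "b": 3, "c": 1, "d": 4}
print(reconstruct_path(weighted, dist, "a", "d"))  # ['a', 'c', 'b', 'd']
```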

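## Bipartite Graph

Check whether a graph is bipartite, i.e. whether its nodes can be split into two groups such that every edge connects nodes from different groups.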

See Leetcode
- 0886
- 0785
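
A common way to run this check is two-coloring: color a start node, give every neighbor the opposite color, and report failure as soon as an edge joins two nodes of the same color. The sketch below uses BFS and assumes an undirected graph stored as an adjacency-list dict (each edge listed in both directions); the name `is_bipartite` and the two example graphs are illustrative.

```python
from collections import deque


def is_bipartite(graph):
    """Return True if the undirected graph can be two-colored."""
    color = {}
    for start in graph:  # loop over all nodes to cover every component
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in color:
                    # give the neighbor the opposite color
                    color[neighbor] = 1 - color[node]
                    queue.append(neighbor)
                elif color[neighbor] == color[node]:
                    return False  # an edge joins two nodes of the same color
    return True


square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(is_bipartite(square))    # True
print(is_bipartite(triangle))  # False
```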

## Union Find

see [here](https://labuladong.github.io/algo/di-yi-zhan-da78c/shou-ba-sh-03a72/bing-cha-j-323f3/).

Leetcode:
- 0323

In [None]:
class UnionFind:
    def __init__(self, n):
        # initialize the parent of each node to itself
        self.parent = list(range(n))
        # initialize the size of each set to 1
        self.size = [1] * n
        # initialize the number of disjoint sets to n
        self.num_disjoint_sets = n
    
    def find(self, node):
        # find the root of the node by following the parent pointers
        # path compression makes find() fast in the amortized sense; it is not strictly O(1)
        # after find(node) returns, every node on the walked path points directly to the root
        if self.parent[node] != node:
            # path compression: recursively set the parent of the node to the root
            self.parent[node] = self.find(self.parent[node])
        return self.parent[node]
    
    def union(self, node1 , node2):
        # find the root of each node
        root1 = self.find(node1)
        root2 = self.find(node2)
        
        # if the two nodes are already in the same set, do nothing
        if root1 == root2:
            return
        
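        # otherwise, merge the two sets
        # for simplicity we always attach root1 under root2 instead of merging the smaller set into the larger one
        # path compression only flattens the paths that find() actually walks, so a tree can temporarily be taller than 2;
        # union by size (using self.size) would additionally bound the tree height
        self.parent[root1] = root2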
        self.size[root2] += self.size[root1]

        # decrease the number of disjoint sets by 1
        self.num_disjoint_sets -= 1
    
    def connected(self, node1, node2):
        # check if two nodes are in the same set
        return self.find(node1) == self.find(node2)
    
    def count(self):
        # return the current number of disjoint sets
        return self.num_disjoint_sets
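
As a usage-style illustration, the sketch below counts connected components of an undirected graph in the spirit of Leetcode 0323. To stay self-contained it does not reuse the class; it inlines a compact variant with an iterative `find` that uses path halving, a swapped-in technique rather than the recursive compression used in the class. The name `count_components` is an assumption for this example.

```python
def count_components(n, edges):
    """Count connected components among nodes 0..n-1 of an undirected graph."""
    parent = list(range(n))

    def find(x):
        # iterative find with path halving: every visited node skips one level
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two sets
            components -= 1          # one fewer disjoint set
    return components


print(count_components(5, [[0, 1], [1, 2], [3, 4]]))  # 2
```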

## Applications

### All paths Problem

Leetcode:

- 0797
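
A minimal backtracking sketch for this kind of problem, assuming the graph is a DAG stored as an adjacency-list dict (as in Leetcode 0797, so no visited set is needed). The name `all_paths` is hypothetical.

```python
def all_paths(graph, s, t):
    """Return every path from s to t in a DAG given as an adjacency list."""
    paths = []
    path = [s]

    def backtrack(node):
        if node == t:
            paths.append(list(path))  # store a copy of the current path
            return
        for neighbor in graph[node]:
            path.append(neighbor)  # make a choice
            backtrack(neighbor)
            path.pop()             # undo the choice

    backtrack(s)
    return paths


graph = {0: [1, 2], 1: [2, 3], 2: [3, 4], 3: [], 4: []}
print(all_paths(graph, 0, 3))  # [[0, 1, 2, 3], [0, 1, 3], [0, 2, 3]]
```

On a graph with cycles this sketch would recurse forever; tracking the nodes on the current path would be needed in that case.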

### Cycle Detection

Leetcode

- 0207 
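
A sketch of DFS-based cycle detection for a directed graph, in the spirit of Leetcode 0207 (a course schedule is feasible exactly when the prerequisite graph has no cycle). It assumes an adjacency-list dict in which every node appears as a key; `has_cycle` is a hypothetical name.

```python
def has_cycle(graph):
    """Return True if the directed graph contains a cycle."""
    visited = set()  # nodes whose exploration has started
    on_path = set()  # nodes on the current recursion path

    def dfs(node):
        if node in on_path:
            return True   # reached a node that is still on the current path
        if node in visited:
            return False  # already fully explored, no cycle through here
        visited.add(node)
        on_path.add(node)
        found = any(dfs(neighbor) for neighbor in graph[node])
        on_path.discard(node)  # undo the choice when leaving the node
        return found

    return any(dfs(node) for node in graph)


dag = {0: [1, 2], 1: [2, 3], 2: [3, 4], 3: [], 4: []}
cyclic = {0: [1], 1: [2], 2: [0]}
print(has_cycle(dag))     # False
print(has_cycle(cyclic))  # True
```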