# Graph Algorithms

In [3]:
import collections
from typing import List


### DFS
* Retains the path in an instance variable (`self.explored`)


```
        1
      /   \
     2     3
    / \
   4   5
```

In [2]:

# Definition for a binary tree node
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

class DFS:
    def __init__(self):
        self.explored = []

    def dfs_traversal(self, root):
        # Pre-order traversal: record the node, then recurse left and right.
        if root is None:
            return
        self.explored.append(root.val)
        print(f"Found {root.val}")
        self.dfs_traversal(root.left)
        self.dfs_traversal(root.right)

d = TreeNode(val=5)
c = TreeNode(val=4)
b = TreeNode(val=3)
a = TreeNode(val=2, left=c, right=d)
root = TreeNode(val=1, left=a, right=b)
dfs = DFS()
dfs.dfs_traversal(root)

Found 1
Found 2
Found 4
Found 5
Found 3


#### DFS for a binary tree
* Retains the path in a list passed as a parameter

In [5]:
def dfs_traversal(root, explored):
    # Pre-order traversal; the caller owns `explored`, which is filled in place.
    if root is None:
        return
    explored.append(root.val)
    print(f"Found {root.val}")
    dfs_traversal(root.left, explored)
    dfs_traversal(root.right, explored)

d = TreeNode(val=5)
c = TreeNode(val=4)
b = TreeNode(val=3)
a = TreeNode(val=2, left=c, right=d)
root = TreeNode(val=1, left=a, right=b)
explored = []
dfs_traversal(root, explored)
print(explored)

Found 1
Found 2
Found 4
Found 5
Found 3
[1, 2, 4, 5, 3]


#### DFS for Adjacency List
* The path is only printed; `visited` is a set, so it does not preserve the visit order

```
            A
          /   \
         B     C
       /  \   /
      D   E  F
```

In [6]:
# https://www.educative.io/edpresso/how-to-implement-depth-first-search-in-python
# Using a Python dictionary to act as an adjacency list
# Time complexity: O(V + E)
graph = {
    'A' : ['B','C'],
    'B' : ['D', 'E'],
    'C' : ['F'],
    'D' : [],
    'E' : [],
    'F' : []
}


def dfs(visited, graph, node):
    if node not in visited:
        print(node)
        visited.add(node)
        for neighbor in graph[node]:
            dfs(visited, graph, neighbor)

visited = set() # Set to keep track of visited nodes.
# Driver Code
dfs(visited, graph, 'A')

A
B
D
E
C
F


#### DFS for Adjacency List
* Retains the path via the `visited` list, which preserves the visit order

In [7]:
def dfs(visited, graph, node):
    if node not in visited:
        visited.append(node)
        for neighbor in graph[node]:
            dfs(visited, graph, neighbor)

# Driver Code
visited = [] # List to keep track of visited nodes.
dfs(visited, graph, 'A')
print(visited)

['A', 'B', 'D', 'E', 'C', 'F']
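
The recursive versions above rely on the call stack, so a very deep graph can hit Python's recursion limit. Below is a minimal sketch of the same traversal with an explicit stack. The names `dfs_iterative` and `example_graph` are illustrative and not part of the original notebook; `example_graph` copies the adjacency list used above.

```python
from typing import Dict, List


def dfs_iterative(graph: Dict[str, List[str]], start: str) -> List[str]:
    """Depth-first traversal with an explicit stack instead of recursion."""
    visited: List[str] = []  # visit order
    seen = set()             # O(1) membership checks
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        visited.append(node)
        # Push neighbors in reverse so the first neighbor is popped first.
        stack.extend(reversed(graph[node]))
    return visited


# Same adjacency list as the recursive example (copied here so the
# block runs on its own).
example_graph = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F'],
    'D': [],
    'E': [],
    'F': [],
}

print(dfs_iterative(example_graph, 'A'))  # ['A', 'B', 'D', 'E', 'C', 'F']
```

For this graph the visit order matches the recursive version, because neighbors are pushed in reverse.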


#### Find and remove leaves in a binary tree (DFS application)

```
                      20                       20               20        20
                    /    \                   /    \           /
                  8       22               8       22       8
                /   \    /   \              \
              5      3  4    25               3
                    / \
                  10    14

```

Levels of leaf nodes:

Each level holds the nodes that become leaves once all lower-level leaves have been removed.
* level 0 nodes: 5, 10, 14, 4, 25
* level 1 nodes: 3, 22
* level 2 nodes: 8
* level 3 nodes: 20

In [15]:
class TreeNode:
    def __init__(self, key):
        self.val = key
        self.left = None
        self.right = None

root = TreeNode(20)
root.left = TreeNode(8)
root.right = TreeNode(22)
root.left.left = TreeNode(5)
root.left.right = TreeNode(3)
root.right.left = TreeNode(4)
root.right.right = TreeNode(25)
root.left.right.left = TreeNode(10)
root.left.right.right = TreeNode(14)

In [9]:
class Solution:
    def findLeaves(self, root: TreeNode) -> List[List[int]]:
        r"""
            Example:
                      20                       20               20        20
                    /    \                   /    \           /
                  8       22               8       22       8
                /   \    /   \              \
              5      3  4    25               3
                    / \
                  10    14

        - level 0 nodes: 5, 10, 14, 4, 25
        - level 1 nodes: 3, 22
        - level 2 nodes: 8
        - level 3 nodes: 20
        Output:
        {
            0: [5, 10, 14, 4, 25],
            1: [3, 22],
            2: [8],
            3: [20]
        }
        """
        lookup = collections.defaultdict(list)

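        def dfs(node: TreeNode, level: int):
            """
            Records `node` under its height (the longest path down to a
            leaf, so leaves have height 0) and returns that height + 1.
            `level` is the base value returned for an empty subtree.
            """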
            if not node:
                return level
            max_left_level = dfs(node.left, level)
            max_right_level = dfs(node.right, level)
            level = max(max_left_level, max_right_level)
            lookup[level].append(node.val)
            return level + 1
        dfs(root, 0)
        print(lookup)
        # lookup.values() returns a dict_values view over the per-level
        # lists; wrap it in list() if a real list of lists is needed
        return lookup.values()

Solution().findLeaves(root)

defaultdict(<class 'list'>, {0: [5, 10, 14, 4, 25], 1: [3, 22], 2: [8], 3: [20]})


dict_values([[5, 10, 14, 4, 25], [3, 22], [8], [20]])

In [10]:
root = TreeNode(20)
root.left = TreeNode(2)
root.right = TreeNode(22)
root.right.left = TreeNode(4)
root.right.left.left = TreeNode(8)
Solution().findLeaves(root)

defaultdict(<class 'list'>, {0: [2, 8], 1: [4], 2: [22], 3: [20]})


dict_values([[2, 8], [4], [22], [20]])
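
`findLeaves` groups nodes by height without modifying the tree. As a cross-check, the sketch below actually detaches the leaves round by round, which is the process the level list describes. `Node` and `peel_leaves` are hypothetical names introduced here, not part of the notebook, and this version costs one full pass per round rather than a single DFS.

```python
from typing import List, Optional


class Node:
    """Hypothetical stand-in for the notebook's TreeNode."""

    def __init__(self, val: int):
        self.val = val
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def peel_leaves(root: Optional[Node]) -> List[List[int]]:
    """Repeatedly detach the current leaves; returns one list per round.

    Note: this mutates (and eventually empties) the tree.
    """
    rounds: List[List[int]] = []

    def strip(node: Optional[Node], leaves: List[int]) -> Optional[Node]:
        # Returns the subtree with its current leaves removed.
        if node is None:
            return None
        if node.left is None and node.right is None:
            leaves.append(node.val)
            return None  # detach this leaf from its parent
        node.left = strip(node.left, leaves)
        node.right = strip(node.right, leaves)
        return node

    while root is not None:
        leaves: List[int] = []
        root = strip(root, leaves)
        rounds.append(leaves)
    return rounds


# Rebuild the first example tree.
tree = Node(20)
tree.left, tree.right = Node(8), Node(22)
tree.left.left, tree.left.right = Node(5), Node(3)
tree.right.left, tree.right.right = Node(4), Node(25)
tree.left.right.left, tree.left.right.right = Node(10), Node(14)

print(peel_leaves(tree))  # [[5, 10, 14, 4, 25], [3, 22], [8], [20]]
```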

### BFS Adjacency List

In [10]:
# Source: https://www.educative.io/edpresso/how-to-implement-a-breadth-first-search-in-python
# Time complexity: O(V + E)
#                     A - C
#                    / \ /
#                   B   F
#                 / \  /
#                D   E

# this is a directed graph
my_graph = {
  'A' : ['B','F', 'C'],
  'B' : ['D', 'E'],
  'C' : ['F'],
  'D' : [],
  'E' : ['F'],
  'F' : []
}
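from collections import deque
from typing import Dict, List

def bfs(visited: List[str], graph: Dict[str, List[str]], node: str) -> None:
    # The queue is local, so bfs no longer depends on a global `queue`.
    # deque.popleft() is O(1), whereas list.pop(0) is O(n).
    queue = deque([node])
    visited.append(node)

    print("Visiting vertices: ")
    while queue:
        s = queue.popleft()
        print(s, end=" ")
        for neighbour in graph[s]:
            # `visited` is a list, so this check is O(|V|); track a set
            # alongside it if the O(V + E) bound matters.
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)

# Driver Code
visited = []  # List to keep track of visited nodes, in visit order.
bfs(visited, my_graph, 'A')
print("\nVisited: ", visited)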


Visiting vertices: 
A B F C D E 
Visited:  ['A', 'B', 'F', 'C', 'D', 'E']


#### Pros and cons of matrix representation vs. adjacency list representation vs. objects and pointers to represent graphs
Sources:
* [https://www.section.io/engineering-education/graph-data-structure-python/](https://www.section.io/engineering-education/graph-data-structure-python/)
* [https://www.geeksforgeeks.org/comparison-between-adjacency-list-and-adjacency-matrix-representation-of-graph/](https://www.geeksforgeeks.org/comparison-between-adjacency-list-and-adjacency-matrix-representation-of-graph/)
* [https://www.bigocheatsheet.com](https://www.bigocheatsheet.com)

Matrix representation (a.k.a. adjacency matrix)

```
  A B C D E
A 0 4 1 0 0
B 0 0 2 1 0
C 1 0 0 0 0
D 3 0 0 0 0
E 0 0 0 0 0
```

Adjacency List representation
```
A -> [(B, 4), (C, 1)]
B -> [(C, 2), (D, 1)]
C -> [(A, 1)]
D -> [(A, 3)]
E -> []
```
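
As a rough illustration, both representations above can be built from the same weighted edge list. The helper names `to_matrix` and `to_adjacency_list` are made up for this sketch, and a weight of 0 is assumed to mean "no edge".

```python
from typing import Dict, List, Tuple

vertices = ["A", "B", "C", "D", "E"]
# (source, target, weight) triples read off the example above.
edges = [
    ("A", "B", 4), ("A", "C", 1),
    ("B", "C", 2), ("B", "D", 1),
    ("C", "A", 1),
    ("D", "A", 3),
]


def to_matrix(vertices: List[str],
              edges: List[Tuple[str, str, int]]) -> List[List[int]]:
    """Build a |V| x |V| matrix; 0 is assumed to mean 'no edge'."""
    index = {v: i for i, v in enumerate(vertices)}
    matrix = [[0] * len(vertices) for _ in vertices]
    for u, v, w in edges:
        matrix[index[u]][index[v]] = w
    return matrix


def to_adjacency_list(
    vertices: List[str], edges: List[Tuple[str, str, int]]
) -> Dict[str, List[Tuple[str, int]]]:
    """Map each vertex to a list of (neighbor, weight) pairs."""
    adj: Dict[str, List[Tuple[str, int]]] = {v: [] for v in vertices}
    for u, v, w in edges:
        adj[u].append((v, w))
    return adj


matrix = to_matrix(vertices, edges)
adj = to_adjacency_list(vertices, edges)
print(matrix[0])  # [0, 4, 1, 0, 0]
print(adj["A"])   # [('B', 4), ('C', 1)]
```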

Note: In a complete graph, where every vertex is connected to every other vertex, every off-diagonal entry in the matrix has a value. Then $|E| = O(|V|^2)$, so iterating over all edges takes $O(|V|^2)$ time in either representation.

##### Storage
* Matrix representation requires $O(|V|^2)$ space since a $|V| \times |V|$ matrix is used to map connections. Space is wasted on entries for edges that do not exist.
* Adjacency list requires $O(|V| + |E|)$ space: $O(|V|)$ for the vertices plus $O(|E|)$ for the neighbors stored under each vertex.
* Objects and pointers require $O(|V| + |E|)$ space
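
To make the difference concrete, the sketch below counts stored entries for each representation. It is simple arithmetic on the bounds above; the function names are illustrative, and real memory use also depends on per-object overhead.

```python
def matrix_cells(num_vertices: int) -> int:
    """Cells in a |V| x |V| adjacency matrix."""
    return num_vertices * num_vertices


def adjacency_list_entries(num_vertices: int, num_edges: int) -> int:
    """One key per vertex plus one stored neighbor per directed edge."""
    return num_vertices + num_edges


# The 5-vertex, 6-edge example graph from this section.
print(matrix_cells(5))               # 25
print(adjacency_list_entries(5, 6))  # 11

# A sparse graph: the gap grows quickly with |V|.
print(matrix_cells(1000))                  # 1000000
print(adjacency_list_entries(1000, 3000))  # 4000
```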

##### Adding a vertex
* Matrix representation requires the storage to grow to $O((|V|+1)^2)$. To do this we need to copy the whole matrix, so adding a vertex takes $O(|V|^2)$ time.
* Adjacency list requires $O(1)$ time on average. Hash table insertion can degrade to $O(n)$ time in the worst case, though, if there are too many collisions.
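
A minimal sketch of both operations, assuming the matrix is a list of lists and the adjacency list is a `dict`. The function names are hypothetical.

```python
from typing import Dict, List, Tuple


def add_vertex_matrix(matrix: List[List[int]]) -> List[List[int]]:
    """Return a new (n+1) x (n+1) matrix; copying the old cells is O(|V|^2)."""
    n = len(matrix)
    grown = [row + [0] for row in matrix]  # copy each row, add one column
    grown.append([0] * (n + 1))            # new row for the new vertex
    return grown


def add_vertex_adjacency_list(
    adj: Dict[str, List[Tuple[str, int]]], vertex: str
) -> None:
    """Insert an empty neighbor list; average O(1) dict insertion."""
    adj.setdefault(vertex, [])


small_matrix = [[0, 4], [1, 0]]
print(add_vertex_matrix(small_matrix))  # [[0, 4, 0], [1, 0, 0], [0, 0, 0]]

small_adj = {"A": [("B", 4)], "B": [("A", 1)]}
add_vertex_adjacency_list(small_adj, "C")
print(small_adj["C"])  # []
```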

##### Removing an edge
* Matrix representation takes $O(1)$ time since we set `matrix[i][j] = 0`
* Adjacency list representation requires scanning the neighbor list of the source vertex. A loose worst-case bound is $O(|E|)$ time; in a simple graph that list has at most $|V| - 1$ entries.
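
The two removals might look like this. It is a sketch with hypothetical function names, assuming 0 marks a missing edge in the matrix and that neighbors are stored as (vertex, weight) pairs.

```python
from typing import Dict, List, Tuple


def remove_edge_matrix(matrix: List[List[int]], i: int, j: int) -> None:
    """O(1): clear a single cell."""
    matrix[i][j] = 0


def remove_edge_adjacency_list(
    adj: Dict[str, List[Tuple[str, int]]], u: str, v: str
) -> None:
    """Scan u's neighbor list and keep everything except edges to v."""
    adj[u] = [(nbr, w) for nbr, w in adj[u] if nbr != v]


m = [[0, 4, 1], [0, 0, 2], [1, 0, 0]]
remove_edge_matrix(m, 0, 1)
print(m[0])  # [0, 0, 1]

a = {"A": [("B", 4), ("C", 1)], "B": [("C", 2)], "C": [("A", 1)]}
remove_edge_adjacency_list(a, "A", "B")
print(a["A"])  # [('C', 1)]
```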

##### Querying edges
* Matrix representation always requires $O(1)$ time.
* Adjacency list requires $O(|V|)$ time in the worst case: a vertex can have up to $|V| - 1$ neighbors, and we may have to check every adjacent vertex.
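
A short sketch of the two lookups, under the same assumptions as the example representations (0 means no edge; neighbors are (vertex, weight) pairs). The function names are illustrative.

```python
from typing import Dict, List, Tuple


def has_edge_matrix(matrix: List[List[int]], i: int, j: int) -> bool:
    """O(1): a single indexed read."""
    return matrix[i][j] != 0


def has_edge_adjacency_list(
    adj: Dict[str, List[Tuple[str, int]]], u: str, v: str
) -> bool:
    """Linear scan over u's neighbors."""
    return any(nbr == v for nbr, _ in adj[u])


query_matrix = [[0, 4, 1], [0, 0, 2], [1, 0, 0]]  # rows/columns: A, B, C
query_adj = {"A": [("B", 4), ("C", 1)], "B": [("C", 2)], "C": [("A", 1)]}

print(has_edge_matrix(query_matrix, 0, 1))          # True  (A -> B)
print(has_edge_matrix(query_matrix, 1, 0))          # False (B -> A)
print(has_edge_adjacency_list(query_adj, "A", "C")) # True
print(has_edge_adjacency_list(query_adj, "B", "A")) # False
```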