# Lab 9
## Data Structures & Algorithms
### Thursday, 18 April 2024

## Today
* [Refresher on graphs](#graphs)
* [Depth-first search](#dfs)
* [Breadth-first search](#bfs)
* [Exercises](#exercises)

## Refresher on graphs  <a class="anchor" id="graphs"></a>

Graphs are a fundamental data structure used to represent pairwise relationships between objects. They consist of a set of vertices $V$ (or nodes) and a set of edges $E$ connecting some pairs of vertices, so a graph object can be written as $G=(V,E)$. In a directed graph, edges have a direction, while in an undirected graph, edges have no direction.

Graphs are usually represented either by an adjacency matrix (an $n \times n$ matrix $\boldsymbol{A}$ with $A_{ij}=1$ if there is an edge between nodes $i$ and $j$, and $A_{ij}=0$ otherwise) or by an adjacency list (for each node $i$, a list of all nodes to which $i$ is connected through an edge), which works better for sparse graphs.
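As a minimal sketch of the two representations, the same small undirected graph can be stored both ways. The four-node graph and the names `edges`, `adj_matrix` and `adj_list` are assumptions made for illustration; they are not part of the lab.

```python
# Hypothetical undirected graph with nodes 0..3 (illustrative only)
n = 4
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]

# Adjacency matrix: n x n, with A[i][j] = 1 if there is an edge between i and j
adj_matrix = [[0] * n for _ in range(n)]
for i, j in edges:
    adj_matrix[i][j] = 1
    adj_matrix[j][i] = 1  # undirected graph, so the matrix is symmetric

# Adjacency list: for each node, the list of its neighbours
adj_list = {i: [] for i in range(n)}
for i, j in edges:
    adj_list[i].append(j)
    adj_list[j].append(i)

print(adj_matrix)  # [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
print(adj_list)    # {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
```

The matrix always stores $n^2$ entries, whereas the list stores one entry per node plus two per undirected edge, which is why the list tends to be preferred for sparse graphs.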

And some other concepts:

**connectivity:** a graph is **connected** if there is a **path** (sequence of nodes where each pair is joined by an edge) between every pair of vertices in the graph. There are different algorithms to determine the connectivity of a graph. 

**cycle:** a path which starts and ends at the same vertex.

**tree:** connected graph that contains no cycles. Trees are used in many different contexts. For example, we have used them to represent hierarchical relationships between steps in recursive algorithms (divide and conquer).
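The definitions above can be checked in code. The following is a sketch under stated assumptions: the graph is undirected and stored as an adjacency list (a dictionary), and the function names `is_connected` and `is_tree` are hypothetical. It uses a queue-based search of the kind covered in the breadth-first search section, and relies on the fact that a connected undirected graph with $|V|$ vertices is a tree exactly when it has $|V| - 1$ edges.

```python
from collections import deque

def is_connected(adj_list):
    """Return True if every node can be reached from an arbitrary start node."""
    if not adj_list:
        return True  # convention used here: the empty graph counts as connected
    start = next(iter(adj_list))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adj_list[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == len(adj_list)

def is_tree(adj_list):
    """An undirected graph is a tree if it is connected and has |V| - 1 edges."""
    # Each undirected edge appears twice in the adjacency list
    n_edges = sum(len(nbrs) for nbrs in adj_list.values()) // 2
    return is_connected(adj_list) and n_edges == len(adj_list) - 1

path_graph = {0: [1], 1: [0, 2], 2: [1]}      # connected, no cycle
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # connected, one cycle
two_parts = {0: [1], 1: [0], 2: []}           # node 2 is isolated

print(is_connected(path_graph), is_tree(path_graph))  # True True
print(is_connected(triangle), is_tree(triangle))      # True False
print(is_connected(two_parts), is_tree(two_parts))    # False False
```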

### Implement trees

Let's look at a very simple tree: a binary one (every parent node has no more than two children).

<div>
   <img src="images/tree_example.png" width="200px">
</div>

How can we create this structure in Python?

In [1]:
class TreeNode(object):
    """The tree node class (for a binary tree)"""
    
    def __init__(self, x):
        # The value attribute stores the 'data' (aka the number or ID of the node)
        self.val = x
        
        # The left child
        self.left = None
        
        # The right child
        self.right = None

The following code creates the tree we saw above in the example:

In [2]:
# The root
root = TreeNode(0)

# The left and right child at depth 1
root.left = TreeNode(1)
root.right = TreeNode(2)

# The children at depth 2
root.left.left = TreeNode(3)
root.left.right = TreeNode(4)
root.right.left = TreeNode(5)
root.right.right = TreeNode(6)

Now that we have created this object, we might want a way of displaying our tree (other than drawing a simple structure in PowerPoint, which is what I did above). For this, we can use **tree traversal**: visiting the nodes of the tree in a certain order and showing their values. Four types of tree traversal are pre-order, in-order, post-order, and level-order. We'll look at two of them together and leave the others for the exercises.

### Traverse trees

**Pre-order traversal**
The pre-order traversal has the following recursive process: if the root is None (empty), return None; otherwise, display the value of the root, traverse the left subtree by recursively calling the pre-order function and then traverse the right subtree by recursively calling the pre-order function.

In [3]:
def pre_order(root):
    """
    Pre-order traversal 
    
    Parameters
    ----------
    root : the root of a binary tree
    """

    # Base
    if root == None:
        return
    
    print(root.val, end=' ')
    
    # Recursion
    pre_order(root.left)
    pre_order(root.right)

In [4]:
pre_order(root)

0 1 3 4 2 5 6 

**Level-order traversal** Let's think of a different way to traverse the tree: instead of moving directly to the left child of a visited subtree (as is done in the pre-order traversal), we want to traverse the tree level by level. The idea is that we go through each level, starting from the root, and traverse the nodes on this level from left to right. So for our example tree

<div>
   <img src="images/tree_example.png" width="200px">
</div>

we want to print 0, 1, 2, 3, 4, 5, 6

Let us look at the pseudo-code for this (you will implement this in the exercises):

```
level_order(root)
    If the root is empty (None), return None (corner case)
    Initialise a queue with the root
    While the queue is not empty:
        For N times (where N is the length of the current queue, which is the number of nodes at the current level):
            Remove the first node in the queue (i.e. the one furthest to the left)
            If this first node is not None:
                Print its value
                Add its children - first the left then the right child - to the end of the queue (i.e. to the right)
            EndIf
        EndFor
    EndWhile
```

### Graph search techniques

The traversals above are instances of two search algorithms: pre-, in-, and post-order traversal are forms of depth-first search, while level-order traversal is a form of breadth-first search. These algorithms are used to traverse or search a graph to find a particular vertex or to determine the connectivity of the graph.

## Depth-first search (DFS) <a class="anchor" id="dfs"></a> 

**Idea**: starts at the root (selecting an arbitrary node as the root in the case of a graph) and explores as far as possible along each branch before backtracking.

**Complexity**: The time complexity of DFS is O(V + E), where V is the number of vertices and E is the number of edges in the graph, since in the worst case every vertex and every edge is visited once. The space complexity of DFS is O(V): the algorithm uses a stack to keep track of the vertices to visit next, and in the worst case the stack contains all vertices of the graph.
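To make the idea concrete, here is a minimal sketch of DFS on a general graph using an explicit stack. The adjacency-list dictionary `graph` and the function name `dfs` are assumptions made for illustration. The example graph has the same shape as the binary tree used earlier, so the visiting order matches the pre-order traversal.

```python
def dfs(adj_list, start):
    """Return the nodes reachable from start in the order DFS visits them."""
    visited = []     # visiting order
    seen = set()     # nodes that have already been visited
    stack = [start]
    while stack:
        node = stack.pop()  # take the most recently added node
        if node in seen:
            continue
        seen.add(node)
        visited.append(node)
        # Push neighbours in reverse so that the first neighbour is explored first
        for neighbour in reversed(adj_list[node]):
            if neighbour not in seen:
                stack.append(neighbour)
    return visited

# Hypothetical undirected graph shaped like the example tree (0 is the root)
graph = {0: [1, 2], 1: [0, 3, 4], 2: [0, 5, 6], 3: [1], 4: [1], 5: [2], 6: [2]}
print(dfs(graph, 0))  # [0, 1, 3, 4, 2, 5, 6]
```

Note that this simple version can push the same node more than once before it is visited, so its stack may hold up to O(E) entries; a recursive formulation keeps the depth of the call stack at O(V).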

## Breadth-first search (BFS) <a class="anchor" id="bfs"></a> 

**Idea**: starts at the root (again, selecting some arbitrary node as the root in the case of a graph) and explores all of the neighbour nodes at the present level prior to moving on to the nodes at the next depth level.

**Complexity**: The time complexity of BFS is O(V + E), where V is the number of vertices and E is the number of edges in the graph; this is because, in the worst case, BFS visits every vertex and every edge once. The space complexity of BFS is O(V), because BFS uses a queue to keep track of the vertices to visit next and, in the worst case, the queue contains all the vertices of the graph.
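As a minimal sketch, BFS on a general graph can be written with a queue and a set of already discovered nodes. The adjacency-list dictionary `graph` and the function name `bfs` are assumptions made for illustration. The example graph has the same shape as the binary tree used earlier, so the visiting order matches the level-order traversal.

```python
from collections import deque

def bfs(adj_list, start):
    """Return the nodes reachable from start in the order BFS visits them."""
    visited = []        # visiting order
    seen = {start}      # nodes that have already been added to the queue
    queue = deque([start])
    while queue:
        node = queue.popleft()  # take the node that has waited longest
        visited.append(node)
        for neighbour in adj_list[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return visited

# Hypothetical undirected graph shaped like the example tree (0 is the root)
graph = {0: [1, 2], 1: [0, 3, 4], 2: [0, 5, 6], 3: [1], 4: [1], 5: [2], 6: [2]}
print(bfs(graph, 0))  # [0, 1, 2, 3, 4, 5, 6]
```

Because each node is marked as seen when it is added to the queue, it enters the queue at most once, which is what keeps the queue size within O(V).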

## Exercises  <a class="anchor" id="exercises"></a> 

### Exercise 1 (DFS)

Update the code for the `pre_order` function we looked at above to return the nodes of a tree in a different order (this will be called `in_order`): instead of displaying the value of the root first, it should be displayed after traversing the left subtree but before traversing the right subtree.

In [5]:
def in_order(root):
    """
    In-order traversal 
    
    Parameters
    ----------
    root : the root of a binary tree
    """
    
    # Implement me
    # Base
    if root == None:
        return
    
    # Recursion
    in_order(root.left)
    
    print(root.val, end=' ')
    
    # Recursion
    in_order(root.right)

In [6]:
in_order(root)

3 1 4 0 5 2 6 

### Exercise 2 (DFS)

Now write a function called `post_order`, where the root node is displayed last.

In [7]:
def post_order(root):
    """
    Post-order traversal 
    
    Parameters
    ----------
    root : the root of a binary tree
    """

    # Implement me
    # Base
    if root == None:
        return
    
    # Recursion
    post_order(root.left)
    post_order(root.right)
    
    print(root.val, end=' ')

In [8]:
post_order(root)

3 4 1 5 6 2 0 

### Exercise 3 (BFS)

Now, implement the level-order traversal, following the pseudo-code above. The pre-, in-, and post-order traversals differ in when they display the root node, but they are similar in that they always move to the left child as soon as they visit the root of each subtree (they thus represent depth-first search). The level-order traversal, which we discussed earlier, represents breadth-first search instead.

Here, the process is as follows: for each level (starting from the root 0) we traverse the nodes that have the same 'depth' (i.e. that are on the same level) from left to right before moving to the next level. Hint: use a queue structure to implement this! If you aren't familiar with using queues in Python, a good option is the `collections` module, which has a double-ended queue class called `deque` that can be used as follows:

In [9]:
from collections import deque
 
# initialise the queue
qu = deque([1, 2, 3])
print("initial queue: ", qu)
 
# use append() to insert element at the right
qu.append(4)
print(qu)
 
# use appendleft() to insert element at the left
qu.appendleft(6)
print(qu)

initial queue:  deque([1, 2, 3])
deque([1, 2, 3, 4])
deque([6, 1, 2, 3, 4])
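The level-order pseudo-code also needs to remove the first node in the queue and to know the current queue length. A small sketch of how this could look with `deque`, using `popleft()` and `len()`:

```python
from collections import deque

qu = deque([1, 2, 3])

# popleft() removes and returns the element at the left (the front of the queue)
first = qu.popleft()
print(first, qu)  # 1 deque([2, 3])

# append() at the right combined with popleft() at the left gives
# first-in, first-out behaviour
qu.append(4)
print(qu.popleft(), qu)  # 2 deque([3, 4])

# len() gives the current number of elements (the N in the pseudo-code)
print(len(qu))  # 2
```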


In [10]:
def level_order(root):
    """
    Level-order traversal 
    
    Parameters
    ----------
    root : the root of a binary tree
    """

    # Corner case
    if root == None:
        return

    # Initialize the queue
    queue = deque([root])
    
    # Level order traversal
    while len(queue) > 0:
        # Get the length of the queue
        n = len(queue)    
        for _ in range(n):
            # Remove the first node in the queue
            node = queue.popleft()            
            if node != None:
                print(node.val, end=' ')
                # Add the left and right children to the end of the queue
                queue.append(node.left)
                queue.append(node.right)
        print()

In [11]:
level_order(root)

0 
1 2 
3 4 5 6 



### Exercise 4 (DFS)

Implement a depth-first search algorithm to find the height of a binary tree.

Hint: Use recursion to implement your solution! If the root is None (i.e. an empty tree), the height is -1. Otherwise, the height is the maximum of the height of the left and right subtree, plus 1 (which accounts for the root). When you have arrived at a leaf node (i.e. with no children) the function returns -1 for both children (since they do not exist) plus 1 (adding 1 to the overall height and thus accounting for itself).
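The hint can be written as a recurrence, with $h(v)$ denoting the height of the subtree rooted at node $v$ (the symbols $h$, $v_{\text{left}}$ and $v_{\text{right}}$ are notation introduced here for illustration):

$$
h(\text{None}) = -1, \qquad h(v) = \max\big(h(v_{\text{left}}),\, h(v_{\text{right}})\big) + 1
$$

As a worked example, take a tree with root 0, children 1 and 2, and a single node 3 as the left child of node 1:

$$
\begin{aligned}
h(3) &= \max(-1, -1) + 1 = 0 \\
h(2) &= \max(-1, -1) + 1 = 0 \\
h(1) &= \max(h(3), -1) + 1 = 1 \\
h(0) &= \max(h(1), h(2)) + 1 = \max(1, 0) + 1 = 2
\end{aligned}
$$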

In [12]:
def tree_height_dfs(root):
    """
    Find the height of a binary tree
    
    Parameters
    ----------
    root : the root of a binary tree
    
    Returns
    ----------
    The height of the tree : an integer
    """
    
    # Implement me
    # Base
    if root == None:
        return -1
    
    # Recursion
    return max(tree_height_dfs(root.left), tree_height_dfs(root.right)) + 1

In [13]:
root1 = None

root2 = TreeNode(0)

root3 = TreeNode(0)
root3.left = TreeNode(1)
root3.right = TreeNode(2)

root4 = TreeNode(0)
root4.left = TreeNode(1)
root4.right = TreeNode(2)
root4.left.left = TreeNode(3)

print(tree_height_dfs(root1))
print(tree_height_dfs(root2))
print(tree_height_dfs(root3))
print(tree_height_dfs(root4))

-1
0
1
2


### Exercise 5 (BFS)

Implement a breadth-first search algorithm to find the height of a binary tree.

Hint: Very similar to the level-order example, you can use a queue to implement your solution. Use a variable called `height` to keep track of the height of the tree and increase it at each round of the level-order traversal. 

In [14]:
def tree_height_bfs(root):
    """
    Find the height of a binary tree    
    
    Parameters
    ----------
    root : the root of a binary tree
    
    Returns
    ----------
    The height of the tree : an integer
    """

    # Implement me
    # Corner case
    if root == None:
        return -1

    # Initialize the queue
    queue = deque([root])
    
    # Initialize the height
    height = -1

    # BFS
    while len(queue) > 0:
        # Update height
        height += 1

        n = len(queue)    
        for _ in range(n):
            # Remove the first node in the queue
            node = queue.popleft()
            
            if node.left != None:
                # Add the left child to the end of the queue
                queue.append(node.left)
            if node.right != None:
                # Add the right child to the end of the queue
                queue.append(node.right)

    return height

In [15]:
root1 = None

root2 = TreeNode(0)

root3 = TreeNode(0)
root3.left = TreeNode(1)
root3.right = TreeNode(2)

root4 = TreeNode(0)
root4.left = TreeNode(1)
root4.right = TreeNode(2)
root4.left.left = TreeNode(3)

print(tree_height_bfs(root1))
print(tree_height_bfs(root2))
print(tree_height_bfs(root3))
print(tree_height_bfs(root4))

-1
0
1
2


### Exercise 6 (DFS)

Complete a function that finds the paths (from the root to a leaf) of a binary tree (with time and space complexity O(n)).

Here, the `find_paths` function has already been implemented. It defines a global variable called `paths` to keep track of the paths found so far and then uses a helper function, which calls itself **recursively** to append each root-to-leaf path to the `paths` list in DFS style. In the helper function, the base cases have already been implemented - implement the recursive step!

Hint: In the helper function, we use a string, `path`, to keep track of each path. When the root is None, return None (already implemented).  When the node is a leaf, add the value of the root to `path` and append `path` to `paths` and return None (already implemented). When the node is not a leaf, add the value of the root and a string with an arrow ('->') to `path` and then recursively apply the helper function to the left and right subtree. The outcome should be `['0']` when the tree is only a root node, `['0->1', '0->2']` when the tree consists of a root node and a left child node `1` and a right child node `2`, etc.

In [16]:
def find_paths(root):
    """
    Find the paths (from the root to a leaf) of a binary tree   
    
    Parameters
    ----------
    root : the root of a binary tree
    
    Returns
    ----------
    The paths
    """
    
    global paths      
    paths = []

    helper(root, '')
    return paths

In [17]:
def helper(root, path):
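    """
    Recursively collect the root-to-leaf paths of a binary tree in the global list `paths`
    
    Parameters
    ----------
    root : the root of a binary tree (or of one of its subtrees)
    path : a string with the path from the root of the whole tree down to (but excluding) this node
    
    Returns
    ----------
    None (the paths are appended to the global list `paths`)
    """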

    # Base (the root is None)
    if root == None:
        return

    # Base (we are at a leaf)
    if root.left == None and root.right == None:
        # Add the value of the root to path        
        path += str(root.val)
        # Append path to paths
        paths.append(path)
        return
    # Recursion
    else:
        # Add the value of the root and the arrow ('->') to path
        path += str(root.val) + '->'   
        helper(root.left, path)
        helper(root.right, path)

In [18]:
# Test
root1 = None

root2 = TreeNode(0)

root3 = TreeNode(0)
root3.left = TreeNode(1)

root4 = TreeNode(0)
root4.left = TreeNode(1)
root4.right = TreeNode(2)

root5 = TreeNode(0)
root5.left = TreeNode(1)
root5.left.left = TreeNode(3)
root5.right = TreeNode(2)
root5.right.left = TreeNode(4)

root6 = TreeNode(0)
root6.left = TreeNode(1)
root6.left.left = TreeNode(2)
root6.right = TreeNode(1)
root6.right.right = TreeNode(2)

print(find_paths(root1))
print(find_paths(root2))
print(find_paths(root3))
print(find_paths(root4))
print(find_paths(root5))
print(find_paths(root6))

[]
['0']
['0->1']
['0->1', '0->2']
['0->1->3', '0->2->4']
['0->1->2', '0->1->2']
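The same idea can be written without a global variable. The following is a sketch of one possible alternative, not the lab's reference solution: the name `find_paths_local` is hypothetical, and `TreeNode` is repeated only so that the block runs on its own. The helper is nested inside the outer function so that it can append to the local `paths` list.

```python
class TreeNode:
    """Same binary tree node as defined earlier in the lab."""

    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

def find_paths_local(root):
    """Hypothetical variant of find_paths that avoids the global variable."""
    paths = []

    def helper(node, path):
        # Base (the node is None)
        if node is None:
            return
        # Base (we are at a leaf): complete the path and store it
        if node.left is None and node.right is None:
            paths.append(path + str(node.val))
            return
        # Recursion: extend the path and continue in both subtrees
        helper(node.left, path + str(node.val) + '->')
        helper(node.right, path + str(node.val) + '->')

    helper(root, '')
    return paths

example = TreeNode(0)
example.left = TreeNode(1)
example.left.left = TreeNode(3)
example.right = TreeNode(2)
example.right.left = TreeNode(4)

print(find_paths_local(example))  # ['0->1->3', '0->2->4']
```

Keeping `paths` local means that repeated or nested calls cannot interfere with each other through shared state.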


### Exercise 7 (BFS)

Write a BFS algorithm to find the bottom-left node in a binary tree. 

Hint: Use a queue to design your solution (it can follow a very similar logic to level-order traversal). Use a variable (something like `bottom_left`) to keep track of the current value of the left-most node. Once you have traversed all nodes, this variable will be the bottom-left node. 

In [19]:
# Implementation
def find_bottom_left_node(root):
    """
    Find the bottom-left node in a binary tree
    
    Parameters
    ----------
    root : the root of a tree
    
    Returns
    ----------
    The value of the bottom-left node in a binary tree
    """

    # Implement me
    # Corner case
    if root == None:
        return root
    
    bottom_left = root.val
    
    # Initialize the queue
    queue = deque([root])
    
    while len(queue) > 0:
        # Get the length of the queue
        n = len(queue)
        for i in range(n):
            # Remove the first node in the queue
            node = queue.popleft()

            # Update bottom_left using the value of the left-most node on each level
            if i == 0:
                bottom_left = node.val

            if node.left != None:
                # Add the left child to the end of the queue
                queue.append(node.left)
            if node.right != None:
                # Add the right child to the end of the queue
                queue.append(node.right)

    return bottom_left

In [20]:
# Test
root1 = None

root2 = TreeNode(0)

root3 = TreeNode(0)
root3.left = TreeNode(1)

root4 = TreeNode(0)
root4.left = TreeNode(1)
root4.right = TreeNode(2)

root5 = TreeNode(0)
root5.left = TreeNode(1)
root5.left.left = TreeNode(3)
root5.right = TreeNode(2)
root5.right.left = TreeNode(4)

print(find_bottom_left_node(root1))
print(find_bottom_left_node(root2))
print(find_bottom_left_node(root3))
print(find_bottom_left_node(root4))
print(find_bottom_left_node(root5))

None
0
1
1
3
