<div style="text-align: right">
    <i>
        LIN 537: Computational Linguistics 1 <br>
        Fall 2019 <br>
        Alëna Aksënova ft. Joanne Chau
    </i>
</div>

# Notebook 15: Trees and tree traversals 

In this notebook, we will learn the concept of **tree traversal**: the process of checking and/or updating each node of a **tree** structure _exactly once_. We will cover level-order (breadth-first) traversal and three types of depth-first traversal: _pre-order, in-order,_ and _post-order_.

## Implementation of trees

A **tree** data structure can be implemented in Python as a _collection of nodes_ with relations such as "child" defined between them. The basic building block of the tree is a node, and we can define it using a class `Node`.

Trees are extremely important in linguistics, especially for those working on syntax. For the purposes of this course, we will work only with _binary trees,_ but the code can be generalized to other types of branching as well.

Consider the following implementation of `Node`. It has the following attributes:

  * `label` stores the label of the node;
  * `left_child` defines the left child of the node;
  * `right_child` defines its right child.

In [None]:
class Node():
    """ Defines a node in a tree. """
    
    def __init__(self, label, left_child = None, right_child = None):
        self.label = label
        self.left_child = left_child
        self.right_child = right_child

Now, let us encode a very simple binary tree given below.

<img src="images/15_1.png" width="200">

In [None]:
NP = Node("NP")
VP = Node("VP")
S = Node("S", left_child = NP, right_child = VP)

Or, alternatively, we can define it as below.

In [None]:
S = Node("S", left_child = Node("NP"), right_child = Node("VP"))

Now, it is easy to access labels of the nodes in that tree, starting from the root node.

In [None]:
print("Root node:", S.label)
print("Left child of the root:", S.left_child.label)
print("Right child of the root:", S.right_child.label)

**Practice 1.** Implement the following tree. Print the label of the node `V`.

<img src="images/15_2.png" width="250">

In [None]:
# your code

**Practice 2.** Define a method `is_leaf` within the class `Node` that returns `True` if the node is a leaf, and `False` otherwise. Remember that a node is a **leaf** if it has no children.

In [None]:
class Node():
    """ Defines a node in a tree. """
    
    def __init__(self, label, left_child = None, right_child = None):
        self.label = label
        self.left_child = left_child
        self.right_child = right_child
        
    def is_leaf(self):
        """ Tells if the node is a leaf. """
        pass

Test it using the following tree. You should get the following output:

    Expected output:    S: False
                       NP: True

In [None]:
NP = Node("NP")
VP = Node("VP")
S = Node("S", left_child = NP, right_child = VP)

# print(" S:", S.is_leaf())
# print("NP:", NP.is_leaf())

## Traversals

**Traversal** is a basic operation that applies to data structures. When we traverse a structure, we visit every element of it. For example, we traversed lists in the past when we looped through them using `for`-loops: every item of the list was accessed exactly once.
In a **tree traversal**, we visit every node of the tree exactly once. Unlike lists or strings, which have a natural start and end, trees can be traversed in several different orders.

For example, consider the image below.
Here, the labels show the order in which the nodes are visited.
So, we start from the root, then we proceed to its left child, then to the right one.
Then we access the left child of the left child of the root, etc.


<center>
<img src = "http://www.techgeekbuzz.com/wp-content/uploads/2019/09/Tree-Traversal.png" width = "400">
</center>
<center>
<i>Example of a tree traversal </i>
</center>

## Tree traversals

Unlike lists, trees can be read in different ways. For example, we can either go layer by layer (**breadth-first traversal**), or first descend as deep as possible (**depth-first traversal**). These are usually called **breadth first search (BFS)** and **depth first search (DFS)**, because search is a frequent reason to traverse a tree.

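In BFS, we start at the root and process the tree layer by layer: first the root, then all of its children, then all of their children, and so on. Which parent a node belongs to plays no role in the order, so the tree is read like lines of text, top to bottom and left to right.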

DFS is the opposite: here the parent-child relations determine the order. We follow one branch as deep as possible, processing the entire subtree of a node before moving on to its next sibling. There are three types of depth-first traversal: **pre-order**, **in-order**, and **post-order**.
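
The difference between the two strategies comes down to which pending node is processed next. Below is a minimal sketch, not part of the original notebook: it assumes a `Node` class shaped like the one defined above, and the function name `traverse` and its `breadth_first` flag are made up for illustration. Taking the oldest pending node (a queue) gives the breadth-first order, while taking the newest one (a stack) gives a depth-first order.

```python
from collections import deque


class Node():
    """ Defines a node in a binary tree (same shape as the class above). """

    def __init__(self, label, left_child = None, right_child = None):
        self.label = label
        self.left_child = left_child
        self.right_child = right_child


def traverse(root, breadth_first):
    """ Hypothetical helper: returns the labels in BFS or DFS order. """
    order = []
    todo = deque([root])

    while todo:
        # queue behaviour (oldest first) for BFS, stack behaviour (newest first) for DFS
        node = todo.popleft() if breadth_first else todo.pop()
        if node is None:
            continue

        order.append(node.label)
        children = [node.left_child, node.right_child]
        # for DFS, push the right child first so that the left child is taken first
        todo.extend(children if breadth_first else reversed(children))

    return order


tree = Node("0", Node("1", Node("3"), Node("4")), Node("2", Node("5"), Node("6")))
print(traverse(tree, breadth_first = True))    # ['0', '1', '2', '3', '4', '5', '6']
print(traverse(tree, breadth_first = False))   # ['0', '1', '3', '4', '2', '5', '6']
```

The sketch uses `collections.deque` because it can remove items from both ends in constant time, whereas `list.pop(0)` has to shift every remaining item.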

## Breadth First Search (BFS)
BFS, also known as **Level Order Traversal**, traverses nodes in layers, or levels. Look at the diagram below: 
<center>
<img src = "https://he-s3.s3.amazonaws.com/media/uploads/fdec3c2.jpg" width = "400">
</center>
<center>
<i>Demonstration on how to read a level order traversal. </i>
</center>

This traversal always starts from the top of the tree. Imagining that each level of the tree is a layer, we read from the top down, and from left to right within each layer. In this tree (ignoring the non-binary branching), the traversal reads `0 1 2 3 4 5 6 7`. Here, `0` is the **root node**, i.e. the starting node of the tree.

For example, consider the encoding of the following tree.

<img src="images/15_3.png" width="250">

In [None]:
three = Node("3")
four = Node("4")
five = Node("5")
six = Node("6")

one = Node("1", three, four)
two = Node("2", five, six)

zero = Node("0", one, two)

The definition above is readable but long, so we can also write the same tree more concisely.

In [None]:
tree = Node("0", Node("1", Node("3"), Node("4")), Node("2", Node("5"), Node("6")))

The following code implements a method `level_order_traversal` for the class `Node`.

In [None]:
class Node():
    """ Defines a node in a tree. """
    
    def __init__(self, label, left_child = None, right_child = None):
        self.label = label
        self.left_child = left_child
        self.right_child = right_child
    
    
    def level_order_traversal(self):
        """ Implements the level-order traversal of a tree. """
        
        # here we will save all the nodes that we explored already
        explored = []
        
        # here we have our to-do list, we'll keep updating it
        queue = [self]
        
        # while there is something in the queue, take its first element
        while queue:
            node = queue.pop(0)
            
            if node:
                # mark the current node as explored, and add its children to the queue
                explored.append(node)
                queue.extend([node.left_child, node.right_child])
                
        # collect labels of the explored nodes
        order = [i.label for i in explored]
        return order

In [None]:
tree = Node("0", Node("1", Node("3"), Node("4")), Node("2", Node("5"), Node("6")))
print(tree.level_order_traversal())
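
The flat list above hides where one layer ends and the next one begins. The following sketch, which is an illustration rather than part of the notebook, groups the labels by layer. The function name `levels` is hypothetical, and the `Node` class is repeated only so that the cell runs on its own.

```python
class Node():
    """ Defines a node in a binary tree (same shape as the class above). """

    def __init__(self, label, left_child = None, right_child = None):
        self.label = label
        self.left_child = left_child
        self.right_child = right_child


def levels(root):
    """ Hypothetical helper: returns one list of labels per layer of the tree. """
    result = []
    current = [root]

    while current:
        # record the labels of the layer we are looking at
        result.append([node.label for node in current])

        # collect the existing children of that layer: they form the next layer
        following = []
        for node in current:
            for child in (node.left_child, node.right_child):
                if child is not None:
                    following.append(child)
        current = following

    return result


tree = Node("0", Node("1", Node("3"), Node("4")), Node("2", Node("5"), Node("6")))
print(levels(tree))   # [['0'], ['1', '2'], ['3', '4', '5', '6']]
```

Concatenating the inner lists gives back the level-order traversal.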

**Question.** Why do we need the code `if node:` in the `level_order_traversal` above?

**Practice.** Using the code of `level_order_traversal`, implement a method `BFS` that takes a node label as its argument and returns all nodes of the tree (the root included) that have that label.

In [None]:
class Node():
    """ Defines a node in a tree. """
    
    def __init__(self, label, left_child = None, right_child = None):
        self.label = label
        self.left_child = left_child
        self.right_child = right_child
    
    
    def level_order_traversal(self):
        """ Implements the level-order traversal of a tree. """
        
        explored = []
        queue = [self]
        
        while queue:
            node = queue.pop(0)
            
            if node:
                explored.append(node)
                queue.extend([node.left_child, node.right_child])
                
        order = [i.label for i in explored]
        return order
    

    def BFS(self, label):
        """ Implements breadth first search. """
        pass

Test it in the following cell.

    Expected output (example):
        [<__main__.Node object at 0x7f3edc3f8240>, 
         <__main__.Node object at 0x7f3edc3f8a20>, 
         <__main__.Node object at 0x7f3edc3f8208>]

In [None]:
tree = Node("0", Node("1", Node("0"), Node("4")), Node("2", Node("0"), Node("6")))

# uncomment to test
# print(tree.BFS("0"))

## Depth First Search (DFS)

As mentioned earlier in the notebook, there are three types of Depth First Search for trees. 

*  **Pre-Order Traversal**: In this traversal, we start at the root of the tree, then read everything in its left subtree, and then everything in its right subtree. The same rule applies inside every subtree: each node is read before its children, and a subtree is read completely before we move over to the next one. Notice in the diagram below that we finish the lowest subtrees first before moving along the tree. 
<center>
<img src = "https://www.gatevidyalay.com/wp-content/uploads/2018/07/Preorder-Traversal-Example.png" width = "400">
</center>
<center>
<i> Example of a pre-order traversal </i>
</center> 
*  **In-Order Traversal**: For in-order traversals, we first read the left subtree, then the root, and then the right subtree. We start with the lowest left node and then work our way up and over. 
<center>
<img src = "https://www.gatevidyalay.com/wp-content/uploads/2018/07/Inorder-Traversal-Example-1.png" width = "400">
</center>
<center>
<i> Example of an in-order traversal </i>
</center> 
*  **Post-Order Traversal**: For our final traversal, we read the left subtree, then the right subtree, and only then the root. We start with the lowest left node, then its sister to the right, then their mother, and work our way up the tree. 
<center>
<img src = "https://www.gatevidyalay.com/wp-content/uploads/2018/07/Postorder-Traversal-Example.png" width = "400">
</center>
<center>
<i> Example of a post-order traversal </i>
</center> 

Below are diagrams that will help guide you in reading traversals while your eyes get accustomed to it. 
<center>
<img src = "https://www.gatevidyalay.com/wp-content/uploads/2018/07/Preorder-Traversal-Shortcut-1.png" width = "400">
</center>
<center>
<i> Demonstration of reading a pre-order traversal </i>
</center> 
<center>
<img src = "https://www.gatevidyalay.com/wp-content/uploads/2018/07/Inorder-Traversal-Shortcut-1.png" width = "400">
</center>
<center>
<i> Demonstration of reading an in-order traversal </i>
</center> 
<center>
<img src = "https://www.gatevidyalay.com/wp-content/uploads/2018/07/Postorder-Traversal-Shortcut-1.png" width = "400">
</center>
<center>
<i> Demonstration of reading a post-order traversal. Imagine plucking off all the left leaves of a tree first. </i>
</center> 

## Tree Traversal and Recursion

Recursion in computer science is a method of problem solving in which the solution to a problem depends on solutions to smaller instances of the same problem. Think of the triangle puzzle we did in class: divide and conquer! 

Examine the tree below. Pretend you're a computer and your task is to read this tree in pre-order. How would you do this? 

<center>
<img src = "https://www.cs.cmu.edu/~adamchik/15-121/lectures/Trees/pix/binaryTree.bmp" width = "400">
</center>
<center>
<i> Simple binary tree. </i>
</center> 

Start at the root node `A` and read its left child, `B`. Now `B` acts as the root of its own subtree, and we examine its left child, `D`. Now that the whole left subtree of `A` has been read, we go to the right child of `A`, `C`. We then examine the left child of `C`, which is `E`, and lastly the right child of `C`, `F`. Can you see how this is recursive? 

Tree traversals are an example of multiple direct recursion. Recursion is _multiple_ when a function contains more than one self-reference (here, one for each child), and _direct_ when a function calls itself rather than going through another function.
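
To see what multiple direct recursion looks like in code, here is a small sketch that counts the nodes of a tree. It is an illustration only: the function name `count_nodes` is made up, the `Node` class is repeated so that the cell runs on its own, and the shape of the tree is inferred from the walk-through above (`B` is assumed to have `D` as its only child).

```python
class Node():
    """ Defines a node in a binary tree (same shape as the class used earlier). """

    def __init__(self, label, left_child = None, right_child = None):
        self.label = label
        self.left_child = left_child
        self.right_child = right_child


def count_nodes(node):
    """ Hypothetical helper: counts the nodes of the tree rooted at `node`. """

    # base case: an empty subtree contains no nodes
    if node is None:
        return 0

    # direct recursion: the function calls itself;
    # multiple recursion: it does so twice, once per subtree
    return 1 + count_nodes(node.left_child) + count_nodes(node.right_child)


# assumed shape: A has children B and C; B has D; C has E and F
tree = Node("A", Node("B", Node("D")), Node("C", Node("E"), Node("F")))
print(count_nodes(tree))   # 6
```

Each call solves a smaller instance of the same problem (one subtree), and the results are combined on the way back up.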

Keeping this in mind, we can design and implement functions that can check readings of tree traversals for us. 


## Python

So the question now becomes: how do we code this in Python? We can define our nodes as a class that keeps track of the children, the parent, and the sister of each node. 

In [None]:
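class Node():
    """ Defines a node that knows its parent, sister, and children. """

    def __init__(self, key):
        self.parent = None   # mother node (None for the root)
        self.sister = None   # the other child of the same parent
        self.left = None     # left child
        self.right = None    # right child
        self.val = key       # label of the node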
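
Since every attribute except `val` starts out as `None`, the links between nodes have to be set after the nodes are created. The sketch below shows one possible way to do that. The helper `attach` is hypothetical (it is not defined anywhere in this notebook), and the class is repeated only so that the cell runs on its own.

```python
class Node():
    """ Same attributes as the class above. """

    def __init__(self, key):
        self.parent = None
        self.sister = None
        self.left = None
        self.right = None
        self.val = key


def attach(parent, left, right):
    """ Hypothetical helper: makes `left` and `right` the children of `parent`. """
    parent.left, parent.right = left, right

    # both children point back to their mother
    left.parent = right.parent = parent

    # and they are each other's sister
    left.sister, right.sister = right, left


TP, DP, VP = Node("TP"), Node("DP"), Node("VP")
attach(TP, DP, VP)

print(DP.parent.val)   # TP
print(DP.sister.val)   # VP
print(TP.right.val)    # VP
```

Setting all the links in one place keeps `parent`, `sister`, `left`, and `right` consistent with each other.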

With this class in place, we can design functions that check the readings of tree traversals for us. The first one is a `Levelorder()` function that works with `Node` objects. 

In [None]:
def Levelorder(root):
    # implement a function that would print each of the levels of the tree
    pass

Let's say we are only curious about the daughters of the root node. (For instance, in syntax, the root node may be TP, and its daughters DP and VP.) Our function should take the level we want to read as input and give us all the nodes on that level. Keep in mind that as we dig deeper into the tree, we will be looking into more than one subtree. For example, in the tree below, level 2 consists of the nodes `4 5 6 7`. 

<center>
<img src = "https://he-s3.s3.amazonaws.com/media/uploads/fdec3c2.jpg" width = "400">
</center>
<center>
<i>Demonstration on how to read a level order traversal. </i>
</center>

We will call this function `Levelnodes()`. Instead of asking for all the nodes on a given layer of the whole tree, we will ask for the daughters of one specific node. The input should be the node that we treat as the root, together with the level below it that we want to read. For the picture above, if we want all the daughters of `1`, we tell the code this and the output should be `4 5` only.

In [None]:
def Levelnodes(root, level):
    # input the root and level
    # looking into the level, tell us the daughters
    pass
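
The tree in the picture is not binary, so it cannot be encoded with the `left` and `right` attributes alone. As an illustration of the idea, here is a sketch for trees with any number of children. Everything in it is an assumption made for the example: the class `TreeNode`, the function `level_nodes`, and the exact shape of the tree (only the facts that `1` dominates `4 5` and that level 2 is `4 5 6 7` come from the text).

```python
class TreeNode():
    """ Hypothetical node with any number of children. """

    def __init__(self, label, children = None):
        self.label = label
        self.children = children if children is not None else []


def level_nodes(root, level):
    """ Returns the labels of the nodes that are `level` steps below `root`. """

    # level 0 is the root itself
    if level == 0:
        return [root.label]

    # otherwise, ask every child for the nodes one level closer
    found = []
    for child in root.children:
        found.extend(level_nodes(child, level - 1))
    return found


# assumed shape: 0 has children 1, 2, 3; 1 has 4, 5; 2 has 6; 3 has 7
one = TreeNode("1", [TreeNode("4"), TreeNode("5")])
two = TreeNode("2", [TreeNode("6")])
three = TreeNode("3", [TreeNode("7")])
root = TreeNode("0", [one, two, three])

print(level_nodes(root, 2))   # ['4', '5', '6', '7']
print(level_nodes(one, 1))    # ['4', '5']
```

A binary version would follow the same pattern, with the loop over `children` replaced by the left and the right child.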

Next comes a `Preorder()` function: given the class `Node`, it should read the tree in pre-order.

In [None]:
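def Preorder(node):
    if node is None:   # empty subtree: nothing to read
        return []
    # allow the tree to visit the root first
    # traverse the left subtree, Preorder(left-subtree)
    # traverse the right subtree, Preorder(right-subtree)
    return [node.val] + Preorder(node.left) + Preorder(node.right)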

An `Inorder()` function would look very similar.

In [None]:
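def Inorder(node):
    if node is None:   # empty subtree: nothing to read
        return []
    # traverse the left tree, Inorder(left-subtree)
    # visit the root
    # traverse the right tree, Inorder(right-subtree)
    return Inorder(node.left) + [node.val] + Inorder(node.right)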

**Practice**: And now implement a function `Postorder()`.

In [None]:
def Postorder(node):
    # traverse the left subtree, Postorder(left-subtree)
    # traverse the right subtree, Postorder(right-subtree)
    # visit the root
    pass