![pythonLogo.png](attachment:pythonLogo.png)
# Trees #

### Examples by Codecademy + commentary and exercise by N. Day - September 2022 ###

## This section introduces Trees. It contains interactive code (there may be some errors for you to repair) and a multiple choice quiz. At the end there is also <font color="red">a logbook exercise for you to complete</font>  ##





# Introduction to Trees and terminology #
 
Think of the natural tree:

![natTree](https://d32qe1r3a676y7.cloudfront.net/eyJidWNrZXQiOiJibG9nLWVjb3RyZWUiLCJrZXkiOiAiYmxvZy8wMDAxLzAxL2FkNDZkYmI0NDdjZDBlOWE2YWVlY2Q2NGNjMmJkMzMyYjBjYmNiNzkuanBlZyIsImVkaXRzIjp7InJlc2l6ZSI6eyJ3aWR0aCI6IDkwMCwiaGVpZ2h0IjowLCJmaXQiOiJjb3ZlciJ9fX0=)

The following terms are also adopted to describe the arrangement of data within a tree structure:

- Root - The starting node of the tree
- Leaf - A node with no children (no descendants). Leaves sit at the ends of the branches, not necessarily all on the lowest level of the tree
- Child - A node directly below a parent node. In a Binary Search Tree (BST) each parent can have at most two children: a left child and a right child. For binary search to work, every value in a node's left subtree must be smaller than the node's value, and every value in its right subtree must be greater. 


![Treesgif](http://www.thecrazyprogrammer.com/wp-content/uploads/2017/08/Tree-Data-Structure.gif)

In [2]:
class TreeNode:
  def __init__(self, value):
    self.value = value
    self.children = []

  def __repr__(self, level=0):
    # HELPER METHOD TO PRINT TREE!
    ret = "--->" * level + repr(self.value) + "\n"
    for child in self.children:
      ret += child.__repr__(level+1)
    return ret

  def add_child(self, child_node):
    self.children.append(child_node) 

### TEST CODE TO PRINT TREE
company = [
  "Monkey Business CEO", 
  "VP of Bananas", 
  "VP of Lazing Around", 
  "Associate Chimp", 
  "Chief Bonobo", "Produce Manager", "Tire Swing R & D"]
root = TreeNode(company.pop(0))
for count in range(2):
  child = TreeNode(company.pop(0))
  root.add_child(child)

root.children[0].add_child(TreeNode(company.pop(0)))
root.children[0].add_child(TreeNode(company.pop(0)))
root.children[1].add_child(TreeNode(company.pop(0)))
root.children[1].add_child(TreeNode(company.pop(0)))
print("MONKEY BUSINESS, LLC.")
print("=====================")
print(root)


MONKEY BUSINESS, LLC.
=====================
'Monkey Business CEO'
--->'VP of Bananas'
--->--->'Associate Chimp'
--->--->'Chief Bonobo'
--->'VP of Lazing Around'
--->--->'Produce Manager'
--->--->'Tire Swing R & D'



# TreeNode class

From the top, let's create the constructor for TreeNode.

In [3]:
# Define your "TreeNode" Python class below
class TreeNode:
  def __init__(self, value):
    print("Initializing node...")
    self.value = value
    
seed = TreeNode("Koko")

Initializing node...


# Creating root and child nodes

In [4]:
# Define your "TreeNode" Python class below
class TreeNode:
  def __init__(self, value):
    self.value = value
    self.children = []

  def add_child(self, child_node):
    print("Adding " + child_node.value)
    self.children.append(child_node)

root = TreeNode("I am Root")
child = TreeNode("A wee sapling")

root.add_child(child)

Adding A wee sapling


# Removal of node from the tree

In [5]:
# Define your "TreeNode" Python class below
class TreeNode:
  def __init__(self, value):
    self.value = value
    self.children = []

  def add_child(self, child_node):
    print("Adding " + child_node.value)
    self.children.append(child_node)
    
  def remove_child(self, child_node):
    print("Removing " + child_node.value + " from " + self.value)
    new_children = []
    for child in self.children:
      if child != child_node:
        new_children.append(child)
    self.children = new_children

root = TreeNode("I am Root")
child = TreeNode("A wee sapling")
bad_seed = TreeNode("Root Rot!")

root.add_child(child)
root.add_child(bad_seed)

root.remove_child(bad_seed)


Adding A wee sapling
Adding Root Rot!
Removing Root Rot! from I am Root


# Refactor with lists - removal

In [6]:
# Define your "TreeNode" Python class below
class TreeNode:
  def __init__(self, value):
    self.value = value
    self.children = []

  def add_child(self, child_node):
    print("Adding " + child_node.value)
    self.children.append(child_node)
    
  def remove_child(self, child_node):
    print("Removing " + child_node.value + " from " + self.value)
    self.children = [child for child in self.children 
                     if child is not child_node]

root = TreeNode("I am Root")
child = TreeNode("A wee sapling")
bad_seed = TreeNode("Root Rot!")

root.add_child(child)
root.add_child(bad_seed)

root.remove_child(bad_seed)

Adding A wee sapling
Adding Root Rot!
Removing Root Rot! from I am Root


# Tree traversal

There are three common ways to traverse a tree depth-first, named after when a parent node is visited relative to its children: 
- Pre-order
- Post-order
- In-order

![traversal](https://media.geeksforgeeks.org/wp-content/cdn-uploads/Preorder-from-Inorder-and-Postorder-traversals.jpg)
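
The three orders differ only in when a node is visited relative to its subtrees. The following is a minimal sketch, assuming a binary tree whose nodes have `left` and `right` attributes. The `Node` class and the function names are illustrative only and are not part of the `TreeNode` class used in this notebook.

```python
class Node:
    """Hypothetical binary node used only for this illustration."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def pre_order(node):
    # Visit the node first, then its left subtree, then its right subtree
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)

def in_order(node):
    # Visit the left subtree, then the node, then the right subtree
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)

def post_order(node):
    # Visit both subtrees first and the node itself last
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]

#        4
#      /   \
#     2     6
#    / \   / \
#   1   3 5   7
example = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))

print(pre_order(example))   # [4, 2, 1, 3, 6, 5, 7]
print(in_order(example))    # [1, 2, 3, 4, 5, 6, 7]
print(post_order(example))  # [1, 3, 2, 5, 7, 6, 4]
```

Because the example tree is ordered like a BST, the in-order result comes out sorted.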

In [7]:
# Define your "TreeNode" Python class below
class TreeNode:
  def __init__(self, value):
    self.value = value
    self.children = []

  def add_child(self, child_node):
    print("Adding " + child_node.value)
    self.children.append(child_node)
    
  def remove_child(self, child_node):
    print("Removing " + child_node.value + " from " + self.value)
    self.children = [child for child in self.children 
                     if child is not child_node]
    
  def traverse(self):
    print(self.value)
    for node in self.children:
      print(node.value)

root = TreeNode("CEO")
first_child = TreeNode("Vice-President")
second_child = TreeNode("Head of Marketing")

root.add_child(first_child)
root.add_child(second_child)

root.traverse()


Adding Vice-President
Adding Head of Marketing
CEO
Vice-President
Head of Marketing


# Traversal - Root to Leaf

In [8]:
# Define your "TreeNode" Python class below
class TreeNode:
  def __init__(self, value):
    self.value = value
    self.children = []

  def add_child(self, child_node):
    print("Adding " + child_node.value)
    self.children.append(child_node)
    
  def remove_child(self, child_node):
    print("Removing " + child_node.value + " from " + self.value)
    self.children = [child for child in self.children 
                     if child is not child_node]

  def traverse(self):
    nodes_to_visit = [self]
    while len(nodes_to_visit) > 0:
      current_node = nodes_to_visit.pop()
      print(current_node.value)
      nodes_to_visit += current_node.children
    

root = TreeNode("CEO")
first_child = TreeNode("Vice-President")
second_child = TreeNode("Head of Marketing")
third_child = TreeNode("Marketing Assistant")

root.add_child(first_child)
root.add_child(second_child)
second_child.add_child(third_child)

root.traverse()


Adding Vice-President
Adding Head of Marketing
Adding Marketing Assistant
CEO
Head of Marketing
Marketing Assistant
Vice-President
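
The order of the output above follows from `pop()` taking the most recently added node, so the list behaves as a stack and the traversal goes deep before it goes wide. As a sketch of the alternative, taking nodes from the front of a queue visits the tree level by level instead. The stand-alone functions `depth_first` and `breadth_first` are illustrative names, and the use of `collections.deque` is an assumption of this sketch rather than part of the original example.

```python
from collections import deque

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.children = []

    def add_child(self, child_node):
        self.children.append(child_node)

def depth_first(root):
    # Stack: the last node added is the next one visited
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node.value)
        stack += node.children
    return order

def breadth_first(root):
    # Queue: the first node added is the next one visited
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node.value)
        queue.extend(node.children)
    return order

root = TreeNode("CEO")
first_child = TreeNode("Vice-President")
second_child = TreeNode("Head of Marketing")
third_child = TreeNode("Marketing Assistant")
root.add_child(first_child)
root.add_child(second_child)
second_child.add_child(third_child)

print(depth_first(root))
# ['CEO', 'Head of Marketing', 'Marketing Assistant', 'Vice-President']
print(breadth_first(root))
# ['CEO', 'Vice-President', 'Head of Marketing', 'Marketing Assistant']
```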


# Binary Search Tree (BST)

Previously, we looked at the Binary Search algorithm. Remember that this operates on a divide-and-conquer principle. It looks at the middle of a sorted list. If the target is greater than the middle value, the first half of the list is disregarded; if it is smaller, the second half is disregarded. The algorithm repeats this halving until it either finds the item or has nothing left to check. 

A Binary Search Tree arranges nodes according to their value. The root of the BST plays the role of the mid-point: values lower than the root are added to its left-hand side, and values greater than the root are added to its right-hand side. The same rule is applied at every node further down, so the items end up arranged into levels (as can be seen below).

![BST_arrange](https://blog.penjee.com/wp-content/uploads/2015/12/optimal-binary-search-tree-from-sorted-array.gif)
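
The ordering rule applies to whole subtrees, not just to a node and its direct children. As a minimal sketch, the checker below passes an allowed range of values down the tree. The `Node` class and `is_bst` function are illustrative names, and the sketch assumes that equal values go to the right, which matches the `insert` method used later in this notebook.

```python
class Node:
    """Hypothetical binary node used only for this illustration."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def is_bst(node, low=None, high=None):
    # low is an inclusive lower bound, high is an exclusive upper bound
    if node is None:
        return True
    if low is not None and node.value < low:
        return False
    if high is not None and node.value >= high:
        return False
    # Everything on the left must stay below this node's value,
    # everything on the right must be at least this node's value
    return (is_bst(node.left, low, node.value)
            and is_bst(node.right, node.value, high))

valid = Node(48, Node(24, None, Node(26)), Node(55))
# 50 is greater than its parent 24, but it sits in the left subtree of 48
invalid = Node(48, Node(24, None, Node(50)), Node(55))

print(is_bst(valid))    # True
print(is_bst(invalid))  # False
```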

Below is an example of a Binary Search algorithm running on a Tree of nodes alongside a Linear Search algorithm running on an array: 

![BST](https://blog.penjee.com/wp-content/uploads/2015/11/binary-search-tree-sorted-array-animation.gif)
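
The gap between the two approaches can be measured by counting how many items each one examines on the same sorted list. The sketch below does this for a list of 15 values. The function names are illustrative, and the counts apply to this particular list and target only.

```python
def linear_search_steps(items, target):
    """Count the items examined before the target is found (or the list ends)."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps

def binary_search_steps(sorted_items, target):
    """Count the mid-points examined by a binary search."""
    steps = 0
    left, right = 0, len(sorted_items)
    while left < right:
        steps += 1
        mid = (left + right) // 2
        if sorted_items[mid] == target:
            break
        if target < sorted_items[mid]:
            right = mid        # discard the upper half
        else:
            left = mid + 1     # discard the lower half
    return steps

data = list(range(1, 16))  # 15 sorted values: 1 to 15

print(linear_search_steps(data, 15))  # 15
print(binary_search_steps(data, 15))  # 4
```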

In [9]:
def binary_search(sorted_list, target):
  left_pointer = 0
  right_pointer = len(sorted_list)
  
  # fill in the condition for the while loop
  while left_pointer < right_pointer:
    # calculate the middle index using the two pointers
    mid_idx = (left_pointer + right_pointer) // 2
    mid_val = sorted_list[mid_idx]
    if mid_val == target:
      return mid_idx
    if target < mid_val:
      # set the right_pointer to the appropriate value
      right_pointer = mid_idx
    if target > mid_val:
      # set the left_pointer to the appropriate value
      left_pointer = mid_idx + 1
  
  return "Value not in list"

# test cases
print(binary_search([5,6,7,8,9], 9))
print(binary_search([5,6,7,8,9], 10))
print(binary_search([5,6,7,8,9], 8))
print(binary_search([5,6,7,8,9], 4))
print(binary_search([5,6,7,8,9], 6))

4
Value not in list
3
Value not in list
1


# BST Traversal

In [10]:
class BinarySearchTree:
  def __init__(self, value, depth=1):
    self.value = value
    self.depth = depth
    self.left = None
    self.right = None

  def insert(self, value):
    if (value < self.value):
      if (self.left is None):
        self.left = BinarySearchTree(value, self.depth + 1)
        print(f'Tree node {value} added to the left of {self.value} at depth {self.depth + 1}')
      else:
        self.left.insert(value)
    else:
      if (self.right is None):
        self.right = BinarySearchTree(value, self.depth + 1)
        print(f'Tree node {value} added to the right of {self.value} at depth {self.depth + 1}')
      else:
        self.right.insert(value)
        
  def get_node_by_value(self, value):
    if (self.value == value):
      return self
    elif ((self.left is not None) and (value < self.value)):
      return self.left.get_node_by_value(value)
    elif ((self.right is not None) and (value >= self.value)):
      return self.right.get_node_by_value(value)
    else:
      return None
    
  # Define .depth_first_traversal() below:
  def depth_first_traversal(self):
    if (self.left is not None):
      self.left.depth_first_traversal()
    print(f'Depth={self.depth}, Value={self.value}')
    if (self.right is not None):
      self.right.depth_first_traversal()

tree = BinarySearchTree(48)
tree.insert(24)
tree.insert(55)
tree.insert(26)
tree.insert(38)
tree.insert(56)
tree.insert(74)

# Print depth-first traversal:
tree.depth_first_traversal()

Tree node 24 added to the left of 48 at depth 2
Tree node 55 added to the right of 48 at depth 2
Tree node 26 added to the right of 24 at depth 3
Tree node 38 added to the right of 26 at depth 4
Tree node 56 added to the right of 55 at depth 3
Tree node 74 added to the right of 56 at depth 4
Depth=2, Value=24
Depth=3, Value=26
Depth=4, Value=38
Depth=1, Value=48
Depth=2, Value=55
Depth=3, Value=56
Depth=4, Value=74


# Another example of BST

In [11]:
import random

class BinarySearchTree:
  def __init__(self, value, depth=1):
    self.value = value
    self.depth = depth
    self.left = None
    self.right = None

  def insert(self, value):
    if (value < self.value):
      if (self.left is None):
        self.left = BinarySearchTree(value, self.depth + 1)
        print(f'Tree node {value} added to the left of {self.value} at depth {self.depth + 1}')
      else:
        self.left.insert(value)
    else:
      if (self.right is None):
        self.right = BinarySearchTree(value, self.depth + 1)
        print(f'Tree node {value} added to the right of {self.value} at depth {self.depth + 1}')
      else:
        self.right.insert(value)
        
  def get_node_by_value(self, value):
    if (self.value == value):
      return self
    elif ((self.left is not None) and (value < self.value)):
      return self.left.get_node_by_value(value)
    elif ((self.right is not None) and (value >= self.value)):
      return self.right.get_node_by_value(value)
    else:
      return None
  
  def depth_first_traversal(self):
    if (self.left is not None):
      self.left.depth_first_traversal()
    print(f'Depth={self.depth}, Value={self.value}')
    if (self.right is not None):
      self.right.depth_first_traversal()


print("Creating Binary Search Tree rooted at value 15:")
tree = BinarySearchTree(15)

for x in range(10):
  tree.insert(random.randint(0, 100))
  
print("Printing the inorder depth-first traversal:")
tree.depth_first_traversal()

Creating Binary Search Tree rooted at value 15:
Tree node 8 added to the left of 15 at depth 2
Tree node 28 added to the right of 15 at depth 2
Tree node 46 added to the right of 28 at depth 3
Tree node 52 added to the right of 46 at depth 4
Tree node 68 added to the right of 52 at depth 5
Tree node 18 added to the left of 28 at depth 3
Tree node 71 added to the right of 68 at depth 6
Tree node 57 added to the left of 68 at depth 6
Tree node 29 added to the left of 46 at depth 4
Tree node 81 added to the right of 71 at depth 7
Printing the inorder depth-first traversal:
Depth=2, Value=8
Depth=1, Value=15
Depth=3, Value=18
Depth=2, Value=28
Depth=4, Value=29
Depth=3, Value=46
Depth=4, Value=52
Depth=6, Value=57
Depth=5, Value=68
Depth=6, Value=71
Depth=7, Value=81


# Further Reading

BSTs are only one of many arrangements of Trees. If you are interested in going further, other popular arrangements are: 

## AVL Trees 
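Named after the Soviet mathematicians Georgy <b>Adelson-Velsky</b> and Evgenii <b>Landis</b>, who published it in 1962, the AVL tree is a self-balancing binary search tree: after each insertion or removal it rebalances itself so that the heights of any node's left and right subtrees differ by at most one.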

![AVLtree](https://upload.wikimedia.org/wikipedia/commons/f/fd/AVL_Tree_Example.gif)
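
An AVL tree keeps the heights of every node's left and right subtrees within one of each other. The sketch below only checks that condition; it does not perform the rotations a real AVL tree uses to restore balance, and it recomputes heights each time rather than storing them in the nodes. The `Node` class and function names are illustrative.

```python
class Node:
    """Hypothetical binary node used only for this illustration."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def height(node):
    # An empty subtree has height 0; a single node has height 1
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

def balance_factor(node):
    # Positive means left-heavy, negative means right-heavy
    return height(node.left) - height(node.right)

def is_avl_balanced(node):
    if node is None:
        return True
    return (abs(balance_factor(node)) <= 1
            and is_avl_balanced(node.left)
            and is_avl_balanced(node.right))

balanced = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))  # values inserted in sorted order

print(is_avl_balanced(balanced))  # True
print(balance_factor(chain))      # -2
print(is_avl_balanced(chain))     # False
```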

## Red-Black Trees 

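Red-Black Trees are another form of self-balancing binary search tree. They trace back to the symmetric binary B-tree invented by Rudolf Bayer in 1972. Each node stores one extra item of colour information (which can be compressed to a single 'bit', 0 or 1, to reduce memory overhead) that guides the re-balancing of the tree. Empty children (null pointers, None in Python) are treated as black leaf nodes, shown as NIL in the diagram below.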

Red-Black Trees maintain a balanced structure with a guaranteed maximum height of O(log n), where 'n' is the number of nodes in the tree. This balanced structure ensures efficient search and insertion operations in the tree.

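Rules: 
* The root must be black.
* New nodes start as red. 
* A red node cannot have a red child (its children must be black). 
* Every path from a node down to its NIL leaves must pass through the same number of black nodes.
* The tree re-balances itself (like AVL) whenever an insertion or removal breaks one of these rules.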

![RBTrees](https://upload.wikimedia.org/wikipedia/commons/4/41/Red-black_tree_example_with_NIL.svg)


![RBTreesgif](https://camo.githubusercontent.com/a71d526a5cab7a42a8292610a52d28e893ef550a078fad1e945a43d1b44c8461/68747470733a2f2f63646e2e7261776769742e636f6d2f6d61656c76616c6169732f636f6d706f7274656d656e742d61726272652d726f7567652d6e6f69722d617665632d646f742f61356166666234322f6578656d706c655f616e696d6174696f6e2e676966)
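
The colouring rules can be checked with a short recursive function. The sketch below tests three properties: the root is black, no red node has a red child, and every path down to a NIL leaf passes through the same number of black nodes. It does not check the BST ordering and does not re-balance anything. `RBNode`, `black_height` and `is_red_black` are illustrative names.

```python
RED, BLACK = "red", "black"

class RBNode:
    """Hypothetical red-black node used only for this illustration."""
    def __init__(self, value, colour, left=None, right=None):
        self.value = value
        self.colour = colour
        self.left = left
        self.right = right

def black_height(node):
    """Return the number of black nodes on any path to a NIL leaf, or -1 if a rule is broken."""
    if node is None:
        return 1  # NIL leaves count as black
    if node.colour == RED:
        for child in (node.left, node.right):
            if child is not None and child.colour == RED:
                return -1  # a red node has a red child
    left = black_height(node.left)
    right = black_height(node.right)
    if left == -1 or right == -1 or left != right:
        return -1  # a subtree is invalid or the black counts differ
    return left + (1 if node.colour == BLACK else 0)

def is_red_black(root):
    if root is None:
        return True
    return root.colour == BLACK and black_height(root) != -1

valid = RBNode(10, BLACK, RBNode(5, RED), RBNode(15, RED))
red_red = RBNode(10, BLACK, RBNode(5, RED, RBNode(2, RED)), RBNode(15, BLACK))
uneven = RBNode(10, BLACK, RBNode(5, BLACK), None)

print(is_red_black(valid))    # True
print(is_red_black(red_red))  # False
print(is_red_black(uneven))   # False
```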


## Decision Trees in data analysis and AI applications

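Tree structures can also model decision making in data analysis and AI applications: each internal node tests a condition, each branch is an outcome of that test, and each leaf holds a final decision.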

![decisiontree](https://venngage-wordpress.s3.amazonaws.com/uploads/2019/08/what-is-a-decision-tree-5.png)
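
As a minimal sketch of the idea, the hand-built tree below asks a yes/no question at each internal node and stores a decision at each leaf. The `DecisionNode` class, the questions and the decisions are all invented for illustration. In machine learning the tree would be learned from data rather than written by hand.

```python
class DecisionNode:
    """Hypothetical node: internal nodes hold a question, leaves hold a decision."""
    def __init__(self, question=None, yes=None, no=None, decision=None):
        self.question = question  # key to look up in the sample
        self.yes = yes            # branch followed when the answer is True
        self.no = no              # branch followed when the answer is False
        self.decision = decision  # set only on leaf nodes

def decide(node, sample):
    # Walk from the root to a leaf, following the answers in the sample
    while node.decision is None:
        node = node.yes if sample[node.question] else node.no
    return node.decision

tree = DecisionNode(
    question="is_raining",
    yes=DecisionNode(decision="take the bus"),
    no=DecisionNode(
        question="is_far",
        yes=DecisionNode(decision="cycle"),
        no=DecisionNode(decision="walk"),
    ),
)

print(decide(tree, {"is_raining": False, "is_far": True}))   # cycle
print(decide(tree, {"is_raining": True, "is_far": False}))   # take the bus
```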


## Parse trees for Natural Language Processing 


![Con_NLP](https://upload.wikimedia.org/wikipedia/commons/5/54/Parse_tree_1.jpg)

![Dep_NLP](https://upload.wikimedia.org/wikipedia/commons/8/8c/Parse2.jpg)


## <font color="red">Logbook Exercise 9</font> ##

Insert a 'code' cell below. In this do the following:

- 1 - Implement the Binary Search Tree class as described above.
- 2 - Create a small dataset of integers and insert these as nodes into an object of the BST class. For simplicity, generate a dataset whose size is odd so there is a natural mid-point. Also avoid duplicate values. (Is there a Python structure which ignores duplicate values? Think back to an earlier lecture...)
- 3 - Check that you can traverse the tree in order. 
- 4 - Check that you can successfully retrieve (search for) a node within the Tree. Check positive and negative cases (what happens when the item does not exist?).
- 5 - Now generate a small dataset of single letters. Generate an odd size, and avoid duplicates again.
- 6 - Think about how you will insert these nodes into the tree... can they be added in the order they were generated?
- 7 - Once you have decided how to insert these into a BST, check that these were added in alphabetical order, by printing out the tree in order.   
- 8 - Study the remove_child method given above for a non-binary search tree. Are you able to use some of that code for a method that will remove a node from a BST? Your task is to code a remove_child method for the BST so it can remove any node from any position in the tree (leaf, parent, or root). You'll need to thoroughly test this to check that node references are maintained, and the correct node is made the parent.
- 9 - Challenge: Rather than printing out a vertical list of the tree nodes, can you print the BST nodes in the arrangement shown in the diagrams above? Are you able to format the printout of tree elements to look something like the example below, which shows the nodes and their edges:


         D
       /   \ 
      C     F
           / \
          E   G




# References & Learning Resources

 - W3Schools - there are many online resources for Python but the Python tutorial at https://www.w3schools.com/python/ is thorough, progressive, interactive and free. If you complete the main tutorial (skip the bits on installing Python as we will be using Anaconda/Jupyter) the later sections on **"File Handling"**, **"NumPy"** and **"Machine Learning"** are also relevant. The **"Exercises"** and **"Quiz"** sections are also worthwhile activities for consolidating knowledge.
 - **Phillips, D. (2015). Python 3 object-oriented programming. Packt Publishing Ltd.** Although a 3rd edition has been released, the 2nd edition is still pretty much up-to-date and seems to be widely available in PDF format. As an added bonus this covers Design Patterns in some detail.
 - **https://www.learnpython.org/** is another comprehensive and interactive resource
 - **https://docs.python.org/3.7/tutorial/** is Python's own text-based tutorial. Despite the seemingly daunting number of sub-sections, it can be consumed in a fairly short time and manages to be both concise and comprehensive.
 - **Think Python 2e** is an excellent in-depth and free version of the O'Reilly hardcopy by Allen B. Downey and is available here ... https://greenteapress.com/wp/think-python-2e/
 - https://www.sololearn.com/ - great for mobile learning on the go ... free! Recommended by JJ
 - I have also adapted examples from *Learn Python In A Day: The Ultimate Crash Course To Learning The Basics Of Python In No Time* by *Acodemy* but this is out of print and is only mentioned for completeness.