# Trees

Trees are one of the most important data structures in all of computer science.

They allow us to store and search for data easily and efficiently.

They also beautifully illustrate the behavior of algorithms, and they allow us to analyze the runtime of those algorithms elegantly and rigorously.

A **Tree** is a linked data structure consisting of nodes which contain elements and references to 0 or more children.

```
          8
        /   \
       3     17
      / \   /  \
     0   5 9    21
```

The **root** is at the top of the tree. Every node has 0 or more children. Nodes with 0 children are **leaves**.

The **height** of a tree is the number of levels in a tree.

The height of a tree is usually much less than the number of nodes in the tree.
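A minimal sketch of these definitions, assuming a hypothetical `TreeNode` class (an illustrative name, not part of the implementation later in these notes) that keeps its children in a list:

```python
class TreeNode:
    """A general tree node: one element plus references to 0 or more children."""

    def __init__(self, element, children=None):
        self.element = element
        self.children = children if children is not None else []

    def is_leaf(self):
        # A leaf is a node with no children.
        return len(self.children) == 0

    def height(self):
        # Height is counted as the number of levels, so a single node has height 1.
        if self.is_leaf():
            return 1
        return 1 + max(child.height() for child in self.children)


# The example tree from the diagram above.
root = TreeNode(8, [
    TreeNode(3, [TreeNode(0), TreeNode(5)]),
    TreeNode(17, [TreeNode(9), TreeNode(21)]),
])

print(root.is_leaf())                          # False
print(root.children[0].children[0].is_leaf())  # True (the node holding 0)
print(root.height())                           # 3
```

Here the tree holds 7 elements but has a height of only 3.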

The above example tree is a **binary tree** because every node can have at most two children.

In a binary tree, with 3 levels, we can store 7 elements.

With 4 levels, we can store 15 elements.

With 5 levels, we can store 31 elements.

On each level $L$ (counting the root as level 0), we can store $2^L$ elements:

```
Level L | # elements
------------------
    0   |    1
    1   |    2
    2   |    4
    3   |    8
    4   |    16
```

Over the whole tree with $L$ levels, we can store $2^L - 1$ elements.

E.g., with 5 levels, we can store $2^5 - 1 = 32 - 1 = 31$ elements.

```
# Levels | Max Size of Tree
---------------------------
    0    |       0
    1    |       1
    2    |       3
    3    |       7
    4    |       15
    5    |       31
```

Question: If we have $N$ elements that we want to store, how many levels do we need?

Since the number of elements we can store in a tree with $L$ levels grows exponentially in $L$ (as powers of 2), the inverse of this, the number of levels needed to store $N$ elements, grows logarithmically in $N$.

To store $N$ elements in a binary tree, we need $O(\log_2 N)$ levels in that tree, provided each level is filled before the next one is started.

**The heights of our trees are logarithmic in the number of elements they store.**
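To make the question concrete, here is a small sketch that finds the smallest number of levels whose capacity $2^L - 1$ is at least $N$. The helper name `levels_needed` is an assumption made for this illustration.

```python
import math


def levels_needed(n):
    """Smallest number of levels L such that 2**L - 1 >= n."""
    levels = 0
    while 2 ** levels - 1 < n:
        levels += 1
    return levels


for n in (1, 7, 8, 31, 1000, 1_000_000):
    # The loop result matches the closed form ceil(log2(n + 1)).
    assert levels_needed(n) == math.ceil(math.log2(n + 1))
    print(n, levels_needed(n))
```

Seven elements fit in 3 levels, an eighth element forces a 4th level, and a million elements need only 20 levels.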

# Binary Search Trees

A binary tree is one where every node can have at most two children.

A Binary Search Tree is a binary tree with a special search property.

**The Binary Search Tree Property**

For every node in the tree, all elements in its left subtree are less than it, and all elements in its right subtree are greater than or equal to it.

What about duplicates? Normally, duplicates are placed in the right subtree.
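One way to check the property is to pass down the range of values each subtree is allowed to hold. The sketch below assumes a hypothetical minimal `Node` class and an illustrative `is_bst` function; it follows the convention above that duplicates go to the right.

```python
class Node:
    """Hypothetical minimal binary tree node, used only for this illustration."""

    def __init__(self, element, left=None, right=None):
        self.element = element
        self.left = left
        self.right = right


def is_bst(node, low=None, high=None):
    """Check that every element e in the subtree satisfies low <= e < high."""
    if node is None:
        return True  # an empty subtree trivially satisfies the property
    if low is not None and node.element < low:
        return False
    if high is not None and node.element >= high:
        return False
    # Left subtree: strictly less than this element.
    # Right subtree: greater than or equal to it (duplicates go right).
    return (is_bst(node.left, low, node.element)
            and is_bst(node.right, node.element, high))


valid = Node(8, Node(3, Node(0), Node(5)), Node(17, Node(9), Node(21)))
invalid = Node(8, Node(3, Node(0), Node(9)), Node(17))  # 9 sits in 8's left subtree

print(is_bst(valid))    # True
print(is_bst(invalid))  # False
```

Note that comparing each node only with its direct children would wrongly accept the second tree, since 9 is greater than its parent 3 but still lies in the left subtree of 8.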



# Implementation

BSTs are naturally recursive.

A BST contains its root element and two subtrees: its left and right subtree. A subtree is just a tree.

Attributes:
- root element
- left subtree
- right subtree

## Functions

### insert(element)

To insert an element, we need to navigate to where it should belong, create a new BST containing it, and set this new BST to be a subtree in the right place.

```
      8
```

To insert 21 into this tree, create a new right subtree of 8 containing 21 since 21>8.

```
      8
       \
       21
```

To insert 3 into this tree, create a new left subtree of 8 containing 3 since 3<8.

```
      8
     / \
    3   21
```


## Tree Traversals

A traversal visits every element in a tree, allowing us to process them. 

There are three common depth-first tree traversals:
- preorder traversal
- inorder traversal
- postorder traversal

We recursively traverse the subtrees of a tree.

The difference between the three traversals is when the element of this tree is processed relative to the traversal of its subtrees.

**preorder traversal**
```python
print(self.element)
preorder_traverse(self.left)
preorder_traverse(self.right)
```

**inorder traversal**
```python
inorder_traverse(self.left)
print(self.element)
inorder_traverse(self.right)
```

**postorder traversal**
```python
postorder_traverse(self.left)
postorder_traverse(self.right)
print(self.element)
```
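The fragments above leave the empty-subtree base case implicit. A runnable sketch, assuming a hypothetical `Node` class and free functions that return lists instead of printing, could look like this for the example tree used throughout these notes:

```python
class Node:
    """Hypothetical minimal binary tree node, used only for this illustration."""

    def __init__(self, element, left=None, right=None):
        self.element = element
        self.left = left
        self.right = right


def preorder(node):
    if node is None:  # base case: an empty subtree contributes nothing
        return []
    return [node.element] + preorder(node.left) + preorder(node.right)


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.element] + inorder(node.right)


def postorder(node):
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.element]


tree = Node(8, Node(3, Node(0), Node(5)), Node(17, Node(9), Node(21)))

print(preorder(tree))   # [8, 3, 0, 5, 17, 9, 21]
print(inorder(tree))    # [0, 3, 5, 8, 9, 17, 21]
print(postorder(tree))  # [0, 5, 3, 9, 21, 17, 8]
```

On a binary search tree, the inorder traversal visits the elements in sorted order.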

## Inorder Traversal

```
          8
        /   \
       3     17
      / \   /  \
     0   5 9    21
```

```python
inorder_traverse(self.left)
print(self.element)
inorder_traverse(self.right)
```

Traversal: `0 3 5 8 9 17 21` 

## Preorder Traversal

```
          8
        /   \
       3     17
      / \   /  \
     0   5 9    21
```

```python
print(self.element)
preorder_traverse(self.left)
preorder_traverse(self.right)
```

Traversal: `8 3 0 5 17 9 21`

## Postorder Traversal

```
          8
        /   \
       3     17
      / \   /  \
     0   5 9    21
```

```python
postorder_traverse(self.left)
postorder_traverse(self.right)
print(self.element)
```

Traversal: `0 5 3 9 21 17 8`

## Why have preorder or postorder traversal?

`x = 4 + 3 * 9`

We know to multiply first, then add.

We have PEMDAS.

To write a program to parse arithmetic syntax according to the order specified by PEMDAS, we'd need a convoluted set of conditionals to evaluate it. 

The problem is that, without such precedence rules, an expression written this way is ambiguous with respect to the order of operations.

Our usual mathematical notation is **infix** notation: the operator sits between its two operands.

`4 + 3`

There are two other arithmetic notations:
- **prefix** notation: `+ 4 3`
- **postfix** notation: `4 3 +`

In postfix notation, our original example `4 + 3 * 9` can be written as:

`3 9 * 4 +`

To evaluate, we work from left to right, applying each operator to the two values that precede it (either operands or the results of earlier operations).
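This left-to-right evaluation is commonly implemented with a stack. The following is a minimal sketch, assuming space-separated tokens and only the four basic operators; `evaluate_postfix` and `OPERATORS` are illustrative names.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate_postfix(expression):
    """Evaluate a space-separated postfix expression."""
    stack = []
    for token in expression.split():
        if token in OPERATORS:
            # The operator applies to the two most recent values on the stack.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(float(token))
    return stack.pop()


print(evaluate_postfix("3 9 * 4 +"))  # 31.0
print(evaluate_postfix("4 3 9 * +"))  # 31.0
```

No precedence rules or parentheses are needed: the order of the tokens alone determines the order of operations.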

We can represent arithmetic expressions as a tree:

```
   +
  / \
 4   *
    / \
   3   9
```

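Inorder traversal: `4 + 3 * 9`

Preorder traversal: `+ 4 * 3 9`

Postorder traversal: `4 3 9 * +`

The inorder traversal recovers the infix form, while the preorder and postorder traversals yield prefix and postfix forms that can be evaluated without precedence rules.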

Trees and traversals are used all the time to correctly evaluate arithmetic expressions.
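As a sketch of how this works, the expression tree above can be evaluated with a postorder-style recursion: evaluate both subtrees first, then apply the operator at the node. The `ExprNode` class and the function names below are assumptions made for this illustration.

```python
import operator

OPERATORS = {"+": operator.add, "-": operator.sub,
             "*": operator.mul, "/": operator.truediv}


class ExprNode:
    """Hypothetical expression tree node: an operator with two subtrees, or a number."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def postorder(node):
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [str(node.value)]


def evaluate(node):
    # Leaves hold numbers; internal nodes hold operators.
    if node.left is None and node.right is None:
        return node.value
    # Postorder: evaluate both subtrees first, then apply this node's operator.
    return OPERATORS[node.value](evaluate(node.left), evaluate(node.right))


# The tree for 4 + 3 * 9 shown above.
tree = ExprNode("+", ExprNode(4), ExprNode("*", ExprNode(3), ExprNode(9)))

print(" ".join(postorder(tree)))  # 4 3 9 * +
print(evaluate(tree))             # 31
```

The multiplication sits deeper in the tree than the addition, so it is evaluated first without any explicit precedence check.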

```python
class BST:
    def __init__(self, element):
        self.element = element
        self.left = None
        self.right = None

    def insert(self, val):
        if val < self.element:
            # either the left subtree is empty and we can
            # insert val here
            if self.left is None:
                self.left = BST(val)
            # or there is already a left subtree and we have
            # to continue the insertion process
            else:
                self.left.insert(val)
        else:  # duplicates will go to the right
            if self.right is None:
                self.right = BST(val)
            else:
                self.right.insert(val)

    def contains(self, val):
        if self.element == val:
            return True
        if val < self.element:
            if self.left is None:
                return False
            else:
                return self.left.contains(val)
        else:
            if self.right is None:
                return False
            else:
                return self.right.contains(val)

    def inorder_traverse(self):
        left_child = self.left.element if self.left is not None else None
        right_child = self.right.element if self.right is not None else None
        # go left if there is a left
        if self.left is not None:
            self.left.inorder_traverse()
        # print this element followed by the elements of its two children
        print("{}: {} {}".format(self.element, left_child, right_child))
        # go right if there is a right
        if self.right is not None:
            self.right.inorder_traverse()


tree = BST(8)
tree.insert(3)
tree.insert(17)
tree.insert(0)
tree.insert(5)
tree.insert(9)
tree.insert(21)

print(tree.contains(1))
tree.inorder_traverse()
```

```
False
0: None None
3: 0 5
5: None None
8: 3 17
9: None None
17: 9 21
21: None None
```


# Runtime Analysis

```
          8
        /   \
       3     17
      / \   /  \
     0   5 9    21
```

# Contains

The runtime of `contains()` is bounded by the number of levels in the tree. How many comparisons do we have to perform until we get to a leaf? At most one per level.

If a tree is balanced, meaning every node has roughly the same number of nodes in its left and right subtrees, then the height of the tree is logarithmic in the number of elements.

In a balanced BST, `contains()` is $O(\log n)$.

Not all BSTs are balanced:

```
0
 \
  3
   \
    5
     \
      8
       \
        9
         \
          17
           \
            21
```

In an unbalanced BST like this one, `contains()` may have to visit every node, so its runtime is $O(n)$.
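The effect of insertion order can be checked directly. The sketch below uses a hypothetical stand-alone `Node` class with free functions `insert`, `height`, and `build` (illustrative names, separate from the `BST` class in these notes) to build both example trees from the same seven elements:

```python
class Node:
    """Hypothetical minimal BST node, used only for this illustration."""

    def __init__(self, element):
        self.element = element
        self.left = None
        self.right = None


def insert(root, val):
    """Insert val into the BST rooted at root and return the root."""
    if root is None:
        return Node(val)
    if val < root.element:
        root.left = insert(root.left, val)
    else:  # duplicates go to the right
        root.right = insert(root.right, val)
    return root


def height(root):
    """Number of levels in the tree."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


def build(values):
    root = None
    for val in values:
        root = insert(root, val)
    return root


balanced = build([8, 3, 17, 0, 5, 9, 21])
degenerate = build([0, 3, 5, 8, 9, 17, 21])  # sorted input

print(height(balanced))    # 3
print(height(degenerate))  # 7
```

Inserting the elements in sorted order produces one level per element, so a search for 21 has to pass through all seven nodes.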

# Insertion

The runtime of insertion is also $O(\log n)$ in a balanced BST or $O(n)$ in an unbalanced BST.

# Traversals

Traversals have $O(n)$ runtime because they visit every element of the tree.

# Balanced Trees

Our BST implementation does not keep itself balanced.

There are BST implementations that maintain balance by rebalancing the tree, when needed, after each insertion and removal.

Examples of balanced BSTs:
- AVL Tree
- Red-Black Tree

