# Trees  

In computer science, a tree is a widely used abstract data type that simulates a hierarchical tree structure, with a root value and subtrees of children with a parent node, represented as a set of linked nodes.  
A tree is nonlinear and acyclic: there is exactly one path from the root to any node.  

Terminology:
* *root*: the topmost node, and the only node without a parent.  
* *parent*: the node one level above (closer to the root than) the present node.  
* *child*: a node one level below (further from the root than) the present node.  
* *leaf*: a node with no children, so its only edge connects it to its parent.  
* *subtree*: a node of a tree together with all of its descendants, treated as a separate tree with that node as the root.  
* *ordered tree*: a tree in which the children of each node are ordered in some way. Implementations make use of pointers; many equip every parent node with (ordered) pointers to its children.  
* *N-ary tree*: each node has at most N children. 
* *balanced tree*: a tree in which the left and right subtrees of every node hold roughly the same number of nodes, so that no branch is much deeper than the others. 






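The cost of a query that walks from the root down to a node is proportional to the height of the tree, $O(h)$.  
Even for the same number of nodes $N$, $h$ can vary widely: from about $\log_2 N$ for a well balanced binary tree up to $N$ for a chain in which every node has a single child.  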
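
To make the spread in $h$ concrete, here is a minimal sketch that builds two trees with the same number of nodes and compares their heights. The `Node` class and the `height`, `count`, `chain`, and `perfect` helpers are hypothetical names made up for this illustration, and height is counted in levels (a single node has height 1).

```python
class Node:
    """Hypothetical minimal tree node: it only stores a list of children."""

    def __init__(self):
        self.children = []


def height(node):
    """Number of levels in the tree rooted at node (a single node has height 1)."""
    return 1 + max((height(child) for child in node.children), default=0)


def count(node):
    """Total number of nodes in the tree rooted at node."""
    return 1 + sum(count(child) for child in node.children)


def chain(n):
    """Degenerate tree with n nodes in which every node has at most one child."""
    root = Node()
    current = root
    for _ in range(n - 1):
        child = Node()
        current.children.append(child)
        current = child
    return root


def perfect(levels):
    """Perfect binary tree with the given number of levels."""
    root = Node()
    if levels > 1:
        root.children = [perfect(levels - 1), perfect(levels - 1)]
    return root


# Both trees hold 7 nodes, but their heights differ.
print(count(chain(7)), height(chain(7)))      # 7 nodes, 7 levels
print(count(perfect(3)), height(perfect(3)))  # 7 nodes, 3 levels
```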

## Balanced Trees  

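Well balanced trees tend to have lower values of $h$ for the number of nodes they have.  
Binary trees where the depth of the leaves varies by at most 1 are called strongly balanced binary trees.  
Binary trees where every level except possibly the last is completely filled, and the last level is filled from left to right, are called complete binary trees. If, in addition, the last level is also full (all leaves have the same depth and every other node has two children), the tree is called a perfect binary tree.  

If a perfect binary tree has $h$ levels, then $N = 2^h - 1$, which means $h = O(\log N)$. A complete binary tree with $h$ levels has between $2^{h-1}$ and $2^h - 1$ nodes, so its height is $O(\log N)$ as well.  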
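
One way to see where this count comes from, assuming $h$ counts levels and the root sits at depth 0: level $d$ of a perfect binary tree holds $2^d$ nodes, so summing the geometric series gives

$$N = \sum_{d=0}^{h-1} 2^d = 2^h - 1 \quad\Longrightarrow\quad h = \log_2 (N + 1) = O(\log N)$$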


# Binary Heaps  

A binary heap is a data structure built on a binary tree.  
It makes it easy to extract the maximum (or minimum) value stored.  
A binary heap is defined as a binary tree with two properties:  
* *Shape property*: a binary heap is a complete binary tree; that is, all levels of the tree, except possibly the last one (deepest) are fully filled, and, if the last level of the tree is not complete, the nodes of that level are filled from left to right.  

* *Heap property*: the key stored in each node is either greater than or equal to ($\ge$) or less than or equal to ($\le$) the keys in the node's children, according to some total order.






Heaps where the parent key is greater than or equal to (≥) the child keys are called max-heaps; those where it is less than or equal to (≤) are called min-heaps. Efficient (logarithmic time) algorithms are known for the two operations needed to implement a priority queue on a binary heap: inserting an element, and removing the smallest or largest element from a min-heap or max-heap, respectively.

Insertion and Deletion operations: 

![](heap.jpg)

The root always holds the maximum (minimum) value, so the max (min) can be queried in $O(1)$ time.  
Insertion and deletion take $O(h) = O(\log N)$.
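
The two operations can be sketched as follows for a max-heap. This is a minimal illustration, not a reference implementation: it assumes the heap is stored in a Python list in level order, so that the children of index `i` sit at `2 * i + 1` and `2 * i + 2` (a common layout that the notes above do not spell out), and the function names `sift_up`, `sift_down`, `insert`, and `extract_max` are hypothetical.

```python
def sift_up(heap, i):
    """Move heap[i] up until its parent is at least as large."""
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] >= heap[i]:
            break
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent


def sift_down(heap, i):
    """Move heap[i] down until neither child is larger."""
    size = len(heap)
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < size and heap[left] > heap[largest]:
            largest = left
        if right < size and heap[right] > heap[largest]:
            largest = right
        if largest == i:
            return
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest


def insert(heap, key):
    """Append at the next free leaf, then restore the heap property upward."""
    heap.append(key)
    sift_up(heap, len(heap) - 1)


def extract_max(heap):
    """Remove and return the root; the last leaf takes its place and sinks."""
    top = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last
        sift_down(heap, 0)
    return top


heap = []
for key in [3, 1, 4, 1, 5, 9, 2, 6]:
    insert(heap, key)
print(heap[0])            # the maximum sits at the root
print(extract_max(heap))  # removes and returns that maximum
```

Each call walks a single root-to-leaf path, which is where the $O(h)$ bound comes from.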


Creating a heap by inserting $N$ values one at a time reorders them on every insertion (each insertion takes at most $h$ swaps, like the maximum deletion shown on the right of the figure), making the overall time complexity $O(N \log N)$.  
However, heapifying an existing binary tree takes only $O(N)$.  
Heapifying a node means comparing its value with the values of its children and, if the larger child exceeds it, swapping the two and repeating from the child's position. Doing this for every node, working from the deepest level up to the root, is guaranteed to yield a heap in $O(N)$ total time. 

Time complexity of heapify:  
* Let the tree height be $h$. Heapifying a node at depth $d$ takes at most $h - d$ swaps.   
* The number of nodes at depth $d$ is at most $2^d$ (depth 0, the root, has 1 node; depth 1 has 2; depth 2 has 4; etc.).  
* Thus, the total number of swaps is at most  
$$\sum_{d=0}^{h-1} 2^d (h-d) = 2^{h-1} \cdot 1 + 2^{h-2} \cdot 2 + 2^{h-3} \cdot 3 + \dots + 2^1 \cdot (h - 1) + 2^0 \cdot h = 2^{h+1} - h - 2$$
so the complexity is $O(2^h) = O(N)$.
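
The bottom-up procedure can be sketched as below. This is an illustration under assumptions, not the only way to do it: the tree is assumed to be stored in a Python list in level order (children of index `i` at `2 * i + 1` and `2 * i + 2`), and `build_max_heap` and `is_max_heap` are hypothetical names. The function counts swaps so the linear bound can be checked on an example.

```python
def build_max_heap(values):
    """Heapify values in place, bottom-up, and return the number of swaps made."""
    n = len(values)
    swaps = 0
    # Leaves are already heaps, so start at the last node that has a child
    # and work back toward the root.
    for start in range(n // 2 - 1, -1, -1):
        i = start
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            largest = i
            if left < n and values[left] > values[largest]:
                largest = left
            if right < n and values[right] > values[largest]:
                largest = right
            if largest == i:
                break
            values[i], values[largest] = values[largest], values[i]
            swaps += 1
            i = largest
    return swaps


def is_max_heap(values):
    """Check the heap property for every parent/child pair."""
    return all(values[(i - 1) // 2] >= values[i] for i in range(1, len(values)))


data = list(range(1, 16))  # 15 values in ascending order
swaps = build_max_heap(data)
print(is_max_heap(data), swaps, len(data))
```

On this input the swap count stays below the number of nodes, in line with the $O(N)$ bound.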


# Heap Sort  

1. Heapify the given array ($O(N)$)  
2. Extract (and delete) the maximum $N$ times ($O(N \log N)$)  

Total complexity is $O(N\log N)$  
When extracting only the top $K$ values, the complexity is $O(N + K \log N)$: $O(N)$ to heapify, plus $O(\log N)$ for each of the $K$ extractions.  
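
Both steps can be sketched with Python's standard `heapq` module. Note the assumptions: `heapq` implements a min-heap, so this sketch extracts the minimum $N$ times (yielding ascending order) rather than the maximum, and the top-$K$ helper negates the values to simulate a max-heap, which only works for numeric data. The names `heap_sort` and `top_k` are hypothetical.

```python
import heapq


def heap_sort(values):
    """Return the values in ascending order using a binary heap."""
    heap = list(values)
    heapq.heapify(heap)  # step 1: heapify
    # step 2: extract (and delete) the minimum N times
    return [heapq.heappop(heap) for _ in range(len(heap))]


def top_k(values, k):
    """Return the k largest values, largest first."""
    heap = [-v for v in values]  # negate so the min-heap behaves like a max-heap
    heapq.heapify(heap)
    return [-heapq.heappop(heap) for _ in range(min(k, len(heap)))]


print(heap_sort([5, 2, 9, 1, 7]))  # [1, 2, 5, 7, 9]
print(top_k([5, 2, 9, 1, 7], 2))   # [9, 7]
```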



# Binary Search Trees  

A data structure that supports insertion, deletion, and search of elements, much like a hash table or linked list.  
A *binary search tree* is a binary tree with values `key[v]` for every node `v`, such that:  
* every node stores a key greater than all the keys in its left subtree and less than all the keys in its right subtree.  



Example: 
![](binsearch.png)

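Searching for an element: start at the root and compare the target with the node's key. If they are equal, the target is found; if the target is smaller, go to the left child; otherwise go to the right child. Stop when the target is found or there is no child to move to. Time complexity is $O(h)$, which is $O(\log N)$ when the tree is balanced but $O(N)$ in the worst case.

Insertion is similar: follow the same path and attach the new key where the search runs out of nodes. 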

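```python
class BinSearchNode:
    """Node of a binary search tree; the root node represents the whole tree."""

    def __init__(self, key, parent=None):
        self.left_child = None
        self.right_child = None
        self.key = key
        self.parent = parent

    def num_children(self):
        return len(self.return_children())

    def return_children(self):
        return [child for child in (self.left_child, self.right_child) if child is not None]

    def empty(self):
        return self.left_child is None and self.right_child is None and self.key is None

    def insert(self, key):
        if self.empty():
            self.key = key
        elif key < self.key:
            if self.left_child is None:
                self.left_child = BinSearchNode(key, self)
            else:
                self.left_child.insert(key)
        elif key > self.key:
            if self.right_child is None:
                self.right_child = BinSearchNode(key, self)
            else:
                self.right_child.insert(key)
        # A key equal to self.key is already stored, so nothing is inserted.

    def find_node(self, key):
        """Return the node that stores key, or None if the key is absent."""
        if self.key is None:  # empty tree
            return None
        if key == self.key:
            return self
        # Smaller keys live in the left subtree, larger keys in the right one.
        child = self.left_child if key < self.key else self.right_child
        return child.find_node(key) if child is not None else None

    def lookup(self, key):
        """Return True if key is stored in this subtree."""
        return self.find_node(key) is not None

    def remove(self, key):
        target = self.find_node(key)
        if target is None:
            return  # key is not in the tree
        if target.num_children() == 2:
            # Copy the largest key of the left subtree into the target, then
            # remove that predecessor node instead (it has at most one child).
            predecessor = target.left_child
            while predecessor.right_child is not None:
                predecessor = predecessor.right_child
            target.key = predecessor.key
            target = predecessor
        children = target.return_children()
        replacement = children[0] if children else None
        if target.parent is None:
            # Removing the root: alter it in place so outside references stay valid.
            if replacement is None:
                target.key = None
            else:
                target.key = replacement.key
                target.left_child = replacement.left_child
                target.right_child = replacement.right_child
                for child in target.return_children():
                    child.parent = target
        else:
            if target.parent.left_child is target:
                target.parent.left_child = replacement
            else:
                target.parent.right_child = replacement
            if replacement is not None:
                replacement.parent = target.parent

    def find_min(self):
        """Return the smallest key: keep going left until there is no left child."""
        if self.left_child is None:
            return self.key
        return self.left_child.find_min()

    def find_max(self):
        """Return the largest key: keep going right until there is no right child."""
        if self.right_child is None:
            return self.key
        return self.right_child.find_max()
```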





# Union Find Data Structure  

Data structure that stores a collection of disjoint (non-overlapping) sets. Equivalently, it stores a partition of a set into disjoint subsets. It provides operations for adding new sets, merging sets (replacing them by their union), and finding a representative member of a set.

In union find, one group is represented by one tree. A collection of trees is called a forest.  
  
The exact shape of each tree is irrelevant: all that matters is which root a node leads to, since two nodes belong to the same group exactly when they have the same root.  

## Union By Size  

Define the size of a tree to be the number of nodes it contains.  

Join two trees so that the root of the smaller tree becomes a child node of the root of the bigger tree. This keeps the height of every tree at $O(\log N)$.  

![](union.jpg)

## Path Compression  

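When searching for a node's root, re-point every node traversed in the process so that it becomes a direct child of the root. Later searches from those nodes then reach the root in a single step. 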

![](pathcompression.jpg)

```
function MakeSet(x) is
    if x is not already in the forest then
        x.parent := x
        x.size := 1     // if nodes store size
        x.rank := 0     // if nodes store rank
    end if
end function

function Find(x) is
    if x.parent ≠ x then
        x.parent := Find(x.parent)
        return x.parent
    else
        return x
    end if
end function

function Union(x, y) is
    // Replace nodes by roots
    x := Find(x)
    y := Find(y)

    if x = y then
        return  // x and y are already in the same set
    end if

    // If necessary, rename variables to ensure that
    // x has at least as many descendants as y
    if x.size < y.size then
        (x, y) := (y, x)
    end if

    // Make x the new root
    y.parent := x
    // Update the size of x
    x.size := x.size + y.size
end function
```
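
The pseudocode above translates fairly directly into Python. The following is a minimal sketch under some assumptions of its own: the forest is kept in two dictionaries rather than in node objects, elements must be hashable, and `find` is written iteratively (two passes) instead of recursively to avoid deep recursion on long paths. The class name `DisjointSet` is hypothetical.

```python
class DisjointSet:
    """Hypothetical dictionary-based union-find with union by size."""

    def __init__(self):
        self.parent = {}  # element -> its parent (a root is its own parent)
        self.size = {}    # root -> number of nodes in its tree

    def make_set(self, x):
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1

    def find(self, x):
        # First pass: walk up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every visited node at the root.
        while self.parent[x] != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def union(self, x, y):
        x, y = self.find(x), self.find(y)
        if x == y:
            return  # already in the same set
        # Union by size: make sure x is the root of the larger tree.
        if self.size[x] < self.size[y]:
            x, y = y, x
        self.parent[y] = x
        self.size[x] += self.size[y]


ds = DisjointSet()
for i in range(6):
    ds.make_set(i)
ds.union(0, 1)
ds.union(2, 3)
ds.union(1, 3)
print(ds.find(0) == ds.find(2))  # True: 0, 1, 2, 3 now share one root
print(ds.find(4) == ds.find(5))  # False: 4 and 5 are still separate
```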




