# **Trees and graphs**
______________________________

# Binary Search Trees and Algorithms on Trees

## Contents:
- [Binary Search Trees](#Binary-Search-Trees)
    - [BST traversals](#BST-traversals)
    - [BST implementation](#Implement-a-BST)
    - [Problems](#Problems)
- [Red-Black Binary Search Trees](#Red-Black-Binary-Search-Trees)
- [Skip Lists](#Skip-Lists)

___________
##  Binary Search Trees

***A binary search tree (BST) is a tree data structure that stores a collection and maintains the following properties:***
 - keys cannot repeat in the collection (the tree represents a set)
 - each element (node) has a key, which in these notes is always a number
 - every node of the tree has exactly two children, each of which is either another node or a NIL
 - NILs have no children and no keys
 - `the key of a left child is less than its parent's key and the key of a right child is greater than its parent's key`; the same holds for every key in the left and right subtrees
 - the performance of the tree operations depends on its height
 - height of a BST - the number of edges on the longest path from the root to a NIL (not incl. root, but incl. NIL)

$$ height(NIL) = 0 $$
$$ height(n) = max[height(n.left), height(n.right)] + 1 $$

<img src = 'pics/Binary-Search-Tree.png' width = 500>

### BST traversals
Visitor patterns $ \Theta (n) $ :
- Inorder traversal - first visit ***left*** child, then ***node*** itself and then its ***right*** child 

In [1]:
def inorder(Node):
    if Node != None:
        inorder(Node.left)
        print(Node.key)
        inorder(Node.right)

`In this case Nodes are printed in sorted order`

- Preorder traversal - first visit ***Node***, then ***left*** child and then ***right*** child

In [2]:
def preorder(Node):
    if Node != None:
        print(Node.key)
        preorder(Node.left)
        preorder(Node.right)    

- Postorder traversal - first visit ***left*** child, then ***right*** child and then ***Node*** itself

In [3]:
def postorder(Node):
    if Node != None:
        postorder(Node.left)
        postorder(Node.right)
        print(Node.key)
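
As a quick check of the three orders, here is a minimal sketch on a small tree. The `SimpleNode` class and the list-returning helpers are illustrative assumptions and are separate from the `Node` class implemented in the next section. The list concatenation keeps the code short but makes it slower than $\Theta(n)$ on deep trees.

```python
class SimpleNode:
    # Hypothetical minimal node, used only for this illustration.
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def inorder_keys(node):
    # left subtree, then the node, then right subtree
    if node is None:
        return []
    return inorder_keys(node.left) + [node.key] + inorder_keys(node.right)

def preorder_keys(node):
    # the node, then left subtree, then right subtree
    if node is None:
        return []
    return [node.key] + preorder_keys(node.left) + preorder_keys(node.right)

def postorder_keys(node):
    # left subtree, then right subtree, then the node
    if node is None:
        return []
    return postorder_keys(node.left) + postorder_keys(node.right) + [node.key]

#        25
#       /  \
#     12    40
#       \
#        18
root = SimpleNode(25, SimpleNode(12, right=SimpleNode(18)), SimpleNode(40))

print(inorder_keys(root))    # [12, 18, 25, 40]
print(preorder_keys(root))   # [25, 12, 18, 40]
print(postorder_keys(root))  # [18, 12, 40, 25]
```

Only the inorder result is sorted, which is the property the BST ordering rule guarantees.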

### Implement a BST

<u>Operations:</u>

1. create a node 
2. find a node
3. insert a node
4. delete a node
5. compute a height


In [4]:
class Node: 
    # Implement a node of the binary search tree.
    # Constructor for a node with key and a given parent
    # Parent can be None for a root node.
    def __init__(self, key, parent = None): 
        self.key = key
        self.parent = parent 
        self.left = None # We will set left and right child to None
        self.right = None
        # Make sure that the parent's left/right pointer
        # will point to the newly created node.
        if parent != None:
            if key < parent.key:
                assert(parent.left == None), 'parent already has a left child -- unable to create node'
                parent.left = self
            else: 
                assert key > parent.key, 'key is same as parent.key. We do not allow duplicate keys in a BST since it breaks some of the algorithms.'
                assert(parent.right == None ), 'parent already has a right child -- unable to create node'
                parent.right = self
        
    # Utility function that keeps traversing left until it finds 
    # the leftmost descendant
    def get_leftmost_descendant(self):
        if self.left != None:
            return self.left.get_leftmost_descendant()
        else:
            return self
    
    # The search algorithm
    # Calls search recursively on the left or right child.
    # If the search succeeds:
    # returns the tuple (True, node), where node holds the key we are searching for.
    # If the search fails to find the key:
    # returns the tuple (False, node), where node would be the parent if we were to insert the key subsequently.
    def search(self, key):
        if self.key == key:
            return (True, self)
        elif self.key > key:
            # check left child
            if self.left is None:
                return (False, self)
            else:
                return self.left.search(key)
        else:
            # check right child
            if self.right is None:
                return (False, self)
            else:
                return self.right.search(key)
    
    # The insert algorithm
    # To insert first searching for it and find out
    # the parent whose child the currently inserted key will be.
    # Then creating a new node with that key and insert.
    # Returning None if key already exists in the tree.
    # Returning the new node corresponding to the inserted key otherwise.
    def insert(self, key):
        (b, p) = self.search(key)
        if b:
            return None
        else:
            return Node(key, parent=p)
        
    # An algorithm to compute height of the tree.
    # height of a node whose children are both None is defined to be 1.
    # height of any other node is 1 + maximum of the height of its children.
    # Returning a number that is the height.
    def height(self):
        if self.left is None and self.right is None:
            return 1
        else:
            l, r = 1, 1
            if self.left is not None:
                l += self.left.height()
            if self.right is not None:
                r += self.right.height()
        return max(l, r)        
    
    # An algorithm to delete a key in the tree.
    # First, finding the node in the tree with the key.
    # Case 1: both children of the node are None
    #   -- deletion is easy: find out whether the node is its parent's
    #      left or right child and set that child to None in the parent node.
    # Case 2: one child is None and the other is not.
    #   -- replace the node with its only child. In other words,
    #      set the child's parent to the deleted node's parent and
    #      update the parent's left/right pointer to the child.
    # Case 3: both children of the node are not None.
    #    -- first find its successor (go one step right and then all the way to the left).
    #    -- function get_leftmost_descendant may be helpful here.
    #    -- replace the key of the node by its successor's key.
    #    -- delete the successor node (it has no left child, so case 1 or 2 applies).
    # return: no return value specified
    
    def delete(self, key):
        (found, node_to_delete) = self.search(key)
        assert found, f"key to be deleted: {key} does not exist in the tree"
        # 3: two children. Copy the successor's key into the node, then
        # remove the successor instead; it never has a left child.
        if node_to_delete.left is not None and node_to_delete.right is not None:
            suc = node_to_delete.right.get_leftmost_descendant()
            node_to_delete.key = suc.key
            node_to_delete = suc
        # 1 and 2: the node now has at most one child (None in case 1).
        child = node_to_delete.left if node_to_delete.left is not None else node_to_delete.right
        parent = node_to_delete.parent
        # This in-place scheme cannot remove a root that has fewer than two children.
        assert parent is not None, 'cannot delete a root node with fewer than two children'
        if child is not None:
            child.parent = parent
        # Re-point whichever parent link referred to the removed node.
        if parent.left is node_to_delete:
            parent.left = child
        else:
            parent.right = child

In [5]:
# Testing
t1 = Node(25, None)
t2 = Node(12, t1)
t3 = Node(18, t2)
t4 = Node(40, t1)

print('-- Testing basic node construction (originally provided code) -- ')
assert(t1.left == t2), 'test 1 failed'
assert(t2.parent == t1),  'test 2 failed'
assert(t2.right == t3), 'test 3 failed'
assert (t3.parent == t2), 'test 4 failed'
assert(t1.right == t4), 'test 5 failed'
assert(t4.left == None), 'test 6 failed'
assert(t4.right == None), 'test 7 failed'
# The tree should be : 
#             25
#             /\
#         12     40
#         /\
#     None  18
#

print('-- Testing search -- ')
(b, found_node) = t1.search(18)
assert b and found_node.key == 18, 'test 8 failed'
(b, found_node) = t1.search(25)
assert b and found_node.key == 25, 'test 9 failed -- you should find the node with key 25 which is the root'
(b, found_node) = t1.search(26)
assert(not b), 'test 10 failed'
assert(found_node.key == 40), 'test 11 failed -- you should be returning the leaf node which would be the parent to the node you failed to find if it were to be inserted in the tree.'

print('-- Testing insert -- ')
ins_node = t1.insert(26)
assert ins_node.key == 26, ' test 12 failed '
assert ins_node.parent == t4,  ' test 13 failed '
assert t4.left == ins_node,  ' test 14 failed '

ins_node2 = t1.insert(33)
assert ins_node2.key == 33, 'test 15 failed'
assert ins_node2.parent == ins_node, 'test 16 failed'
assert ins_node.right == ins_node2, 'test 17 failed'

print('-- Testing height -- ')

assert t1.height() == 4, 'test 18 failed'
assert t4.height() == 3, 'test 19 failed'
assert t2.height() == 2, 'test 20 failed'

print('Success!')

-- Testing basic node construction (originally provided code) -- 
-- Testing search -- 
-- Testing insert -- 
-- Testing height -- 
Success!


In [6]:
# Testing deletion
t1 = Node(16, None)
# insert the nodes in the list
lst = [18,25,10, 14, 8, 22, 17, 12]
for elt in lst:
    t1.insert(elt)

# The tree should look like this
#               16
#            /     \
#          10      18
#        /  \     /  \
#       8   14   17  25
#          /         /  
#         12        22


# Let us test the three deletion cases.
# case 1 let's delete node 8
# node 8 does not have left or right children.
t1.delete(8) # should have both children nil.
(b8,n8) = t1.search(8)
assert not b8, 'Test A: deletion fails to delete node.'
(b,n) = t1.search(10)
assert( b) , 'Test B failed: search does not work'
assert n.left == None, 'Test C failed: Node 8 was not properly deleted.'

# Let us test deleting the node 14 whose right child is none.
# n is still pointing to the node 10 after deleting 8.
# let us ensure that it's right child is 14
assert n.right != None, 'Test D failed: node 10 should have right child 14'
assert n.right.key == 14, 'Test E failed: node 10 should have right child 14'

# Let's delete node 14
t1.delete(14)
(b14, n14) = t1.search(14)
assert not b14, 'Test F: Deletion of node 14 failed -- it still exists in the tree.'
(b,n) = t1.search(10)
assert n.right != None , 'Test G failed: deletion of node 14 not handled correctly'
assert n.right.key == 12, f'Test H failed: deletion of node 14 not handled correctly: {n.right.key}'

# Let's delete node 18 in the tree. 
# It should be replaced by 22.

t1.delete(18)
(b18, n18) = t1.search(18)
assert not b18, 'Test I: Deletion of node 18 failed'
assert t1.right.key == 22 , ' Test J: Replacement of node with successor failed.'
assert t1.right.right.left == None, ' Test K: replacement of node with successor failed -- you did not delete the successor leaf properly?'

print('-- All tests passed!--')

-- All tests passed!--


### Problems

#### Problem 1

In this problem, you are given a binary search tree whose keys are numbers. We would like to convert it to a list of all nodes with keys at depth 1 (root), depth 2 (children of root), and so on. At each depth, the keys must appear from left to right.

The example below will clarify the problem.

**Example:**

Consider the BST below as input:
<img src="pics/tree1.png" width="40%"/>

You will need to output the list
~~~
[11, 4, 18, 15, 21, 13, 17]
~~~

The algorithm should work in time that is linear in the number of nodes of the tree.

For convenience, there's a binary search tree data structure `class TreeNode` with functions for insertion. 

In [7]:
class TreeNode:
    # Constructor for tree node
    def __init__(self, key, parent_node=None):
        self.key = key # set the key
        self.parent = parent_node # set the parent_node
        self.left = None # set the left child to None -- no left child to begin with
        self.right = None # set the right child to None - no right child to begin with.
    
    def is_root(self):
        return self.parent is None
    
    # Function: insert
    # insert a node with key `new_key` into the current tree.
    def insert(self, new_key):
        key = self.key 
        if new_key == key:
            print(f'Already inserted key {key}. Ignoring')
        elif new_key < key: # new_key must go into the left subtree
            if self.left == None: # no left child?
                new_node = TreeNode(new_key, self) # create one with self as parent
                self.left = new_node # set the left pointer
            else:
                self.left.insert(new_key) # recursively call insert on left subtree
        else:  # new_key must go into the right subtree.
            assert new_key > key
            if self.right == None: # no right child?
                new_node = TreeNode(new_key, self) # create one
                self.right = new_node
            else: 
                self.right.insert(new_key) # recursively call insert on right subtree.


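from collections import deque

def depthWiseTraverse(root_node):
    # Breadth-first traversal: every node is enqueued and dequeued exactly once,
    # so the running time is linear in the number of nodes.
    Q = deque([root_node])
    ans = []
    while Q:
        u = Q.popleft()  # O(1), unlike list.pop(0)
        ans.append(u.key)
        # Enqueue the left child first so keys at each depth appear left to right.
        for v in (u.left, u.right):
            if v is not None:
                Q.append(v)
    return ans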

In [8]:
def make_tree(insertion_list):
    assert len(insertion_list) > 0
    root_node = TreeNode(insertion_list[0])
    for elt in insertion_list[1:]:
        root_node.insert(elt)
    return root_node

print('-- Test 1 --')
# Same as the example above
tree1 = make_tree([11, 18, 15,  13, 21, 17, 4])
lst1 = depthWiseTraverse(tree1)
print(lst1)
assert lst1 == [11, 4, 18, 15, 21, 13, 17]

print('-- Test 2 --')

tree2 = make_tree([3, 1, 2, 4, 6, 7])
lst2 = depthWiseTraverse(tree2)
print(lst2)
assert lst2 == [3, 1, 4, 2, 6, 7]

print('-- Test 3 --')
tree3 = make_tree([7, 3, 1, 2, 4, 6, 15, 8, 11, 10, 9])
lst3 = depthWiseTraverse(tree3)
print(lst3)
assert lst3 == [7, 3, 15, 1, 4, 8, 2, 6, 11, 10, 9]

print('All tests passed!')

-- Test 1 --
[11, 4, 18, 15, 21, 13, 17]
-- Test 2 --
[3, 1, 4, 2, 6, 7]
-- Test 3 --
[7, 3, 15, 1, 4, 8, 2, 6, 11, 10, 9]
All tests passed!


#### Problem 2

Once again consider a binary search tree (BST) whose keys are numbers. Given such a BST, we are asked to compute the sum along each branch from root to a leaf node starting with the leftmost branch and moving on to the rightmost.
  - For the purposes of this problem a leaf node is defined as having neither a left child nor a right child.

The example below will clarify the problem.

__Example__

Consider the BST below as input:
<img src="pics/tree2.png" width="45%">

We will need to output the list
~~~
[16, 22, 57, 61, 50]
~~~

Note: 
  - 16 = 11 + 4 -1 + 2
  - 22 = 11 + 4 + 7
  - 57 = 11 + 18 + 15 + 13
  - 61 = 11 + 18 + 15 + 17
  - 50 = 11 + 18 + 21

The algorithm should work in time that is linear in the number of nodes of the tree.

For convenience, we will reuse the binary search tree data structure `class TreeNode` with functions for insertion from the previous problem. 

In [9]:
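def sumOfBranches(root_node):
    sums = []

    # Carry the running sum down the tree instead of copying the whole path,
    # so each node is processed once and the total work is linear.
    def dfs(node, running_sum):
        running_sum += node.key
        if node.left is None and node.right is None:
            sums.append(running_sum)  # reached a leaf: one branch is complete
            return
        # Visit the left child first so branches are reported left to right.
        for v in (node.left, node.right):
            if v is not None:
                dfs(v, running_sum)

    dfs(root_node, 0)

    return sums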

In [10]:
def make_tree(insertion_list):
    assert len(insertion_list) > 0
    root_node = TreeNode(insertion_list[0])
    for elt in insertion_list[1:]:
        root_node.insert(elt)
    return root_node

print('-- Test 1 --')
# Same as the example from problem 1
tree1 = make_tree([11, 18, 15,  13, 21, 17, 4])
lst1 = sumOfBranches(tree1)
print(lst1)
assert lst1 == [15, 57, 61, 50]

print('-- Test 2 --')
# Same as example from problem 2

tree2 = make_tree([11,4, 18, -1, 7, 15, 21, 2, 13, 17])
lst2 = sumOfBranches(tree2)
print(lst2)
assert lst2 == [16, 22, 57, 61, 50]

print('-- Test 3 --')
tree3 = make_tree([15])
lst3 = sumOfBranches(tree3)
print(lst3)
assert lst3 == [15]

print('-- Test 4 --')
tree4 = make_tree([4, 1, 2, 3, 8, 5, 6, 7,  10, 9])
lst4 = sumOfBranches(tree4)
print(lst4)
assert lst4 == [10, 30, 31]

print('All tests passed!')

-- Test 1 --
[15, 57, 61, 50]
-- Test 2 --
[16, 22, 57, 61, 50]
-- Test 3 --
[15]
-- Test 4 --
[10, 30, 31]
All tests passed!


#### Problem 3

We have a rectangular grid of points where one corner is $(0,0)$ and the other corner is $(W,H)$, where $W,H$ represent the width and height of the grid, respectively. From each point $(x,y)$, we can move along one of the cardinal directions to $$(x', y') \in \left\{ (x+1, y), (x-1, y), (x, y+1), (x, y-1) \right\},$$ as long as $0 \leq x' \leq W$ and $0 \leq y' \leq H$ (i.e., we are not allowed to move out of the grid).

Furthermore, we specify a set of $k$ circles $$C = \left\{(x_1, y_1, r_1), \ldots, (x_k, y_k, r_k ) \right\}$$
where each circle has center $(x_i, y_i)$ and radius $r_i$.

The goal is to find a path from $(0,0)$ to $(W,H)$ while avoiding any point on the surface of or inside the circles in $C$. If such a path is found, your algorithm should return the path as a list of grid points. If not, your algorithm should return the empty list.

__Example 1__

Consider $W = H = 3$ and two circles $C= \{ (1,2,1), (2,2,0.5) \}$.

<img src="pics/grid1.jpg" width="40%">

The red lines show a path from $(0,0)$ to $(3,3)$. Your algorithm may return a list
`[(0,0), (1,0), (2,0), (3,0), (3,1), (3,2), (3,3)]` (other paths exist in this case, and any of them may be returned).


__Example 2__

Consider $W = H = 3$ and two circles $C= \{ (1,2,1), (2,2,1) \}$.

<img src='pics/grid2.jpg' width='40%'>

There are no paths in this case (in particular (3,2) lies on the orange circle though this is not 100% clear from the picture). Your algorithm should return the empty list.

In [11]:
from math import sqrt

# We may use this function to test if a point lies inside given circle.
def ptInCircle(x,y, circles_list):
    for (xc,yc,rc) in circles_list:
        d = sqrt ( (x-xc)**2 + (y-yc)**2)
        if d <= rc:
            return True
    return False

def findPath(width, height, forbidden_circles_list):
    # width is a positive number
    # height is a positive number
    # forbidden_circles_list is a list of triples [(x1, y1, r1),..., (xk, yk, rk)]
    assert width >= 1
    assert height >= 1
    assert all(x <= width and x >=0 and y <= height and y >= 0 and r > 0 for (x,y,r) in forbidden_circles_list)

    # Store the allowed grid points in a set so membership tests take O(1).
    vertices = set()
    for i in range(width + 1):
        for j in range(height + 1):
            if not ptInCircle(i, j, forbidden_circles_list):
                vertices.add((i, j))

    root = (0,0)
    all_paths = []
    # Depth-first search that only moves up or right, so it cannot find
    # paths that need a step left or down.
    def dfs(node, visited):
        visited.append(node)
        # Dead end (or the top-right corner): record the path walked so far.
        if (node[0], node[1]+1) not in vertices and (node[0]+1, node[1]) not in vertices:
            all_paths.append(visited)
        for v in [(node[0], node[1] + 1), (node[0] + 1, node[1])]:
            if v in vertices and v not in visited:
                dfs(v, visited.copy())

    dfs(root, [])
    # Keep only the paths that actually reach (width, height).
    path = [p for p in all_paths if p[-1] == (width, height)]
    return min(path) if path else []

In [12]:
# testing
def checkPath(width, height, circles, path):
    assert path[0] == (0,0), 'Path must begin at (0,0)'
    assert path[-1] == (width, height), f'Path must end at {(width, height)}'
    (cur_x, cur_y) = path[0]
    for (new_x, new_y) in path[1:]:
        dx = new_x - cur_x
        dy = new_y - cur_y
        assert (dx,dy) in [(1,0),(-1,0), (0,1),(0,-1)]
        assert 0 <= new_x and new_x <= width
        assert 0 <= new_y and new_y <= height
        assert not ptInCircle(new_x, new_y, circles)
        cur_x, cur_y = new_x, new_y
    return
print('-- Test 1 -- ')

circles = [(2,2,0.5), (1,2,1)]
p = findPath(3, 3, circles)
print(p)
checkPath(3, 3, circles, p)
print('-- Test 2 -- ')

circles1 = [(2,2,1), (1,2,1)]
p1 = findPath(3, 3, circles1)
print(p1)
assert p1 == [], 'Answer does not match with ours'

print('-- Test 3 -- ')
p2 = findPath(5,5, circles1)
print(p2)
checkPath(5, 5, circles1, p2)

print('-- Test 4 --')

circles3 = [(1,2,0.5), (2,2,1), (3,3,1),(4,3,1)]
p3 = findPath(5, 5, circles3)
print(p3)
checkPath(5, 5, circles3, p3)

print('-- Test 5 --')
circles5 = [ (4,1, 1), (4,4,1),(2,6,1)]
p5 = findPath(6,6,circles5)
print(p5)
assert p5 == []
print('All tests passed!')

-- Test 1 -- 
[(0, 0), (1, 0), (2, 0), (2, 1), (3, 1), (3, 2), (3, 3)]
-- Test 2 -- 
[]
-- Test 3 -- 
[(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (5, 5)]
-- Test 4 --
[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 5), (2, 5), (3, 5), (4, 5), (5, 5)]
-- Test 5 --
[]
All tests passed!
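
The `findPath` cell above only explores moves up and to the right, and it enumerates every such path, so it can miss valid paths and can be slow on larger grids. A breadth-first search over all four cardinal moves avoids both issues. The following is a minimal sketch under that assumption; the names `find_path_bfs` and `blocked` are hypothetical and not part of the notebook.

```python
from collections import deque

def find_path_bfs(width, height, circles):
    # Hypothetical helper: a point is blocked if it lies on or inside any circle.
    def blocked(x, y):
        return any((x - xc) ** 2 + (y - yc) ** 2 <= r ** 2 for (xc, yc, r) in circles)

    start, goal = (0, 0), (width, height)
    if blocked(*start) or blocked(*goal):
        return []
    parent = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        if (x, y) == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            node = goal
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nx <= width and 0 <= ny <= height
            if inside and (nx, ny) not in parent and not blocked(nx, ny):
                parent[(nx, ny)] = (x, y)
                queue.append((nx, ny))
    return []

# Example 1 and Example 2 from the problem statement.
print(find_path_bfs(3, 3, [(1, 2, 1), (2, 2, 0.5)]))
print(find_path_bfs(3, 3, [(1, 2, 1), (2, 2, 1)]))

# A grid where the only route has to move left: row y=1 is open only at x=2
# and row y=3 is open only at x=0.
walls = [(0, 1, 0.5), (1, 1, 0.5), (1, 3, 0.5), (2, 3, 0.5)]
zigzag = find_path_bfs(2, 4, walls)
print(zigzag)
```

Each grid point enters the queue at most once, so the running time is proportional to the number of grid points times the number of circles. Because BFS explores points in order of distance from the start, the returned path also uses the fewest possible moves.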


__________
##  Red-Black Binary Search Trees

***It is a self-balancing binary search tree with one extra attribute per node, a color:***
- each node is colored either red or black according to the following rules:
    1. The root and all NILs are black
    2. Every red node must have two black children
    3. Every path from a given node down to any NIL below it must contain the same number of black nodes
- together these rules keep the tree approximately balanced: a tree with $n$ keys has height at most $2\log_2(n+1)$

<img src = 'pics/red-black-tree.png' width = 350>
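
The three coloring rules can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `RBNode` class in which `None` stands for a black NIL; it checks only the colors and does not verify the BST key ordering.

```python
class RBNode:
    # Hypothetical node type for this sketch; None plays the role of a black NIL.
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color  # 'R' or 'B'
        self.left = left
        self.right = right

def black_height(node):
    # Number of black nodes on every path from `node` down to a NIL
    # (counting the NIL), or -1 if rule 2 or rule 3 is violated below `node`.
    if node is None:
        return 1
    if node.color == 'R':
        for child in (node.left, node.right):
            if child is not None and child.color == 'R':
                return -1  # rule 2: a red node has a red child
    lh = black_height(node.left)
    rh = black_height(node.right)
    if lh == -1 or rh == -1 or lh != rh:
        return -1  # rule 3: the two sides disagree on the black count
    return lh + (1 if node.color == 'B' else 0)

def is_red_black(root):
    if root is not None and root.color != 'B':
        return False  # rule 1: the root must be black
    return black_height(root) != -1

good = RBNode(10, 'B', RBNode(5, 'R'), RBNode(15, 'R'))
red_red = RBNode(10, 'B', RBNode(5, 'R', RBNode(2, 'R')), RBNode(15, 'B'))
uneven = RBNode(10, 'B', RBNode(5, 'B'), None)

print(is_red_black(good))     # True
print(is_red_black(red_red))  # False
print(is_red_black(uneven))   # False
```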

#### Find

The find operation is the same as in an ordinary BST, because the colors do not affect the ordering of the keys.

#### Insert

Insertion starts like a BST insertion and then does extra work:
- to repair any red-red violation
- to maintain the red-black properties

`The rules for inserting a node:`

A newly inserted node is always colored red, then there are several cases:
1. the parent of the newly inserted node is black -> everything is OK, nothing to do
2. the parent is red -> red-red violation -> the grandparent is black and the uncle is red -> recolor the grandparent and its children, so that the grandparent becomes red and the parent and uncle become black. The grandparent may now conflict with its own red parent, so repeat the check from the grandparent; if the grandparent is the root, color it black again

<img src='pics/redBlackCase2.png' width = 400>

3. the parent is red -> red-red violation -> the grandparent is black and the uncle is black (possibly a NIL) -> perform a tree rotation (left or right) and recolor

<img src='pics/red-black-rotations.png' width = 400>
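
A rotation rearranges three links and keeps the inorder key order unchanged. The sketch below shows both rotations and applies one to the simplest instance of case 3: the parent is a left child and the new node is its left child. The `RBNode` class and the function names are assumptions for illustration. The "zig-zag" variant, where the new node is an inner grandchild, needs an extra rotation at the parent first and is not covered here.

```python
class RBNode:
    # Hypothetical node type for this sketch; None plays the role of a black NIL.
    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color  # 'R' or 'B'
        self.left = left
        self.right = right

def rotate_left(x):
    # x's right child y becomes the root of this subtree;
    # y's old left subtree becomes x's right subtree.
    y = x.right
    x.right = y.left
    y.left = x
    return y

def rotate_right(y):
    # Mirror image: y's left child x becomes the root of this subtree;
    # x's old right subtree becomes y's left subtree.
    x = y.left
    y.left = x.right
    x.right = y
    return x

# Red-red violation: new node 10 (red) under parent 20 (red),
# grandparent 30 is black and the uncle is a NIL (black).
new_node = RBNode(10, 'R')
parent = RBNode(20, 'R', left=new_node)
grandparent = RBNode(30, 'B', left=parent)

# Rotate right at the grandparent, then swap the colors of the
# old parent and the old grandparent.
root = rotate_right(grandparent)
root.color = 'B'
root.right.color = 'R'

print(root.key, root.color)              # 20 B
print(root.left.key, root.left.color)    # 10 R
print(root.right.key, root.right.color)  # 30 R
```

After the rotation the subtree root is black with two red children, so the black count along every path is the same as before the insertion.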

#### Delete

Deletion has many more cases than insertion; see the book for pseudocode.

_____________
## Skip Lists

***It is an alternative to balanced binary search trees that supports the following operations:***
- Insert(key, value)
- Search for key
- Delete key
- Iterate through all items in sorted order

A skip list is a ***randomized*** data structure with many layers:
- Provides probabilistic guarantees: search, insert and delete take $O(\log n)$ expected time
- Elegant approach to data structure design

#### Main idea:

<img src = 'pics/skip_lists.jpg' width = 500>

- Layers of lists
    - each layer is sorted in ascending order
    - bottom layer contains all elements in the skip list
    - each layer has 'sentinels' -INF and +INF
    - each layer contains a subset of the elements of the layer below it


- Two types of pointers
    - down pointer -> pointer to same value in the layer below
    - right pointer -> pointer to next higher element in the current layer
    

#### Finding an element in a skip list

1. Input: *key* for the target element
2. Start at the -INF sentinel of the topmost layer
3. While the next element to the right in the current layer is <= key:
    - move right to that element
4. The current element is now the last element <= key in this layer:
    - if the current layer is NOT the last (bottom) layer:
        - follow the down pointer to the layer below and repeat step 3
    - else:
        - if the current element == key, then return it, else NOT FOUND
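
The search steps can be sketched in code as follows, moving right along a layer and then down. The `SkipNode` class and the `build_skip_list` helper are assumptions made for this illustration; the helper builds the layers deterministically from given key lists so that the search can be tried without any randomness.

```python
import math

class SkipNode:
    # Hypothetical node with the two pointer types described above.
    def __init__(self, key):
        self.key = key
        self.right = None  # next higher element in the same layer
        self.down = None   # same key in the layer below

def build_skip_list(layers):
    # `layers` lists the keys of each layer from top to bottom, without
    # sentinels. Each layer must be a subset of the layer below it.
    below = {}   # key -> node in the layer below
    head = None
    for keys in reversed(layers):  # build the bottom layer first
        row = [-math.inf] + sorted(keys) + [math.inf]
        nodes = [SkipNode(k) for k in row]
        for a, b in zip(nodes, nodes[1:]):
            a.right = b
        for n in nodes:
            n.down = below.get(n.key)
        below = {n.key: n for n in nodes}
        head = nodes[0]
    return head  # -INF sentinel of the top layer

def search(head, key):
    cur = head
    while True:
        # Move right while the next key does not overshoot the target.
        while cur.right.key <= key:
            cur = cur.right
        if cur.down is None:  # bottom layer reached
            return cur if cur.key == key else None
        cur = cur.down

head = build_skip_list([[20], [10, 20, 40], [5, 10, 20, 30, 40, 50]])
print(search(head, 30) is not None)  # True
print(search(head, 35) is not None)  # False
```

The +INF sentinel guarantees that the inner loop stops for any finite key, so no explicit end-of-layer check is needed.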

#### Inserting an element into a skip list

- First find where the element will be inserted in the last layer
- Insert element in the last layer while maintaining sorted order
- ***Toss a coin*** with probability 1/2 of turning *heads*
- While *latest coin toss turned up heads*:
    - if the current layer is the top layer:
        - create a new layer above it and make that the current layer
    - else:
        - move the current layer one layer up
    - create a new node in the current layer and insert it so that the layer stays in sorted order
    - add a down pointer from the new node to the corresponding node in the layer below
    - *toss coin again*
    
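On average each element appears in 2 layers: it always appears in the bottom layer and is promoted one layer up with probability 1/2 each time, and $1 + 1/2 + 1/4 + \dots = 2$.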
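
The insertion procedure can be sketched as below. The class and method names are assumptions for illustration, and the coin is passed in as a function so that the tosses can be fixed in a test; by default it is a fair coin from `random`.

```python
import math
import random

class SkipNode:
    def __init__(self, key, right=None, down=None):
        self.key = key
        self.right = right
        self.down = down

class SkipList:
    def __init__(self, coin=None):
        # `coin` returns True for heads; injectable so tosses can be scripted.
        self.coin = coin if coin is not None else (lambda: random.random() < 0.5)
        self.tail = SkipNode(math.inf)                    # +INF sentinel, top layer
        self.head = SkipNode(-math.inf, right=self.tail)  # -INF sentinel, top layer

    def _predecessors(self, key):
        # For each layer, top to bottom, the last node whose key is < key.
        preds = []
        cur = self.head
        while cur is not None:
            while cur.right.key < key:
                cur = cur.right
            preds.append(cur)
            cur = cur.down
        return preds

    def insert(self, key):
        preds = self._predecessors(key)
        if preds[-1].right.key == key:
            return False  # key already present
        # Always insert into the bottom layer.
        below = SkipNode(key, right=preds[-1].right)
        preds[-1].right = below
        level = len(preds) - 2  # index of the layer just above the bottom
        while self.coin():
            if level < 0:
                # The current layer was the top: create a new layer above it.
                self.tail = SkipNode(math.inf, down=self.tail)
                self.head = SkipNode(-math.inf, right=self.tail, down=self.head)
                pred = self.head
            else:
                pred = preds[level]
                level -= 1
            node = SkipNode(key, right=pred.right, down=below)
            pred.right = node
            below = node
        return True

    def layers(self):
        # Keys of each layer from top to bottom, without sentinels.
        out = []
        row = self.head
        while row is not None:
            keys = []
            node = row.right
            while node.right is not None:
                keys.append(node.key)
                node = node.right
            out.append(keys)
            row = row.down
        return out

# Scripted tosses: 10 gets tails; 20 gets heads, tails; 5 gets heads, heads, tails.
tosses = iter([False, True, False, True, True, False])
sl = SkipList(coin=lambda: next(tosses))
for k in (10, 20, 5):
    sl.insert(k)
print(sl.layers())  # [[5], [5, 20], [5, 10, 20]]
```

Recording the predecessors during the initial search avoids a second pass when the element is promoted to higher layers.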