# Chapter 14: Binary Search Trees

**Binary search trees (BSTs) are a workhorse data structure and can be used to solve almost every data structures problem reasonably efficiently. They let you:**

* search for a key
* find min and max elements
* look for the successor or predecessor of a search key
* enumerate keys in a range in sorted order

* A BST is a binary tree in which the nodes store keys that are comparable (e.g. integers or strings)
* The keys stored at nodes have to respect the BST property: the key stored at a node is greater than or equal to the keys stored in the nodes of its left subtree and less than or equal to the keys stored in the nodes of its right subtree
* Key lookup, insertion, and deletion take time proportional to the height of the tree, which can be O(n) in the worst case; non-naive (height-balanced) implementations guarantee O(log n)
* red-black trees are an example of height-balanced BSTs and are widely used
* avoid putting mutable objects in a BST; if an object is mutable, remove it from the tree before updating it and reinsert it afterwards

In [1]:
# BST prototype
class BSTNode:
    def __init__(self, data = None, left = None, right = None):
        self.data, self.left, self.right = data, left, right
    
    def set_right(self, r_node):
        self.right = r_node
    
    def set_left(self, l_node):
        self.left = l_node
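
To make the height dependence concrete, here is a minimal sketch (not part of the original notes) that inserts the same seven keys in two different orders using a naive, non-balancing insertion. The helpers `insert`, `height`, and `build` are hypothetical names introduced only for this illustration, and the node class is a compact copy of the prototype above so the snippet runs on its own.

```python
class BSTNode:
    # Compact copy of the prototype so this snippet is self-contained.
    def __init__(self, data=None, left=None, right=None):
        self.data, self.left, self.right = data, left, right


def insert(tree, key):
    """Naive insertion without rebalancing; returns the root. Duplicates go right."""
    if tree is None:
        return BSTNode(key)
    node = tree
    while True:
        if key < node.data:
            if node.left is None:
                node.left = BSTNode(key)
                return tree
            node = node.left
        else:
            if node.right is None:
                node.right = BSTNode(key)
                return tree
            node = node.right


def height(tree):
    """Edges on the longest root-to-leaf path (-1 for an empty tree)."""
    depth, level = -1, [tree] if tree else []
    while level:
        depth += 1
        # Collect the non-empty children of every node on this level.
        level = [c for n in level for c in (n.left, n.right) if c]
    return depth


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


sorted_height = height(build(range(1, 8)))            # keys arrive in sorted order
mixed_height = height(build([4, 2, 6, 1, 3, 5, 7]))   # same keys, friendlier order
print("%d %d" % (sorted_height, mixed_height))         # prints "6 2"
```

With sorted input the naive tree degenerates into a chain of height n - 1, which is where the O(n) worst case comes from; a height-balanced tree would keep the height logarithmic regardless of insertion order.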

## Binary search trees boot camp

* Searching is the single most fundamental application of BSTs
* BSTs also offer the ability to find the next largest/next smallest element
* these operations, along with lookup, insert, and delete, take O(log n) time for library implementations
* a BST uses O(n) space, with a little per-node overhead for the child pointers

In [2]:
# illustrates search on BSTs using recursion
def search_bst(tree, key):
    if not tree or tree.data == key:
        return tree
    elif key < tree.data:
        return search_bst(tree.left, key)
    else:
        return search_bst(tree.right, key)

h = BSTNode(13)
g = BSTNode(17, h, None)
f = BSTNode(11, None, g)
e = BSTNode(5)
d = BSTNode(2)
c = BSTNode(3, d, e)
b = BSTNode(7, c, f)
m = BSTNode(31)
l = BSTNode(29, None, m)
n = BSTNode(41)
k = BSTNode(37, l, n)
j = BSTNode(23, None, k)
p = BSTNode(53)
o = BSTNode(47, None, p)
i = BSTNode(43, j, o)
a = BSTNode(19, b, i)

print(search_bst(a, 31).data)

31
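
The same pointer-chasing idea covers the min and max lookups mentioned at the start of the chapter. The following is a small sketch under the assumption that nodes look like the `BSTNode` prototype; `find_min` and `find_max` are hypothetical helper names, not functions from the notes or from any library.

```python
class BSTNode:
    # Compact copy of the prototype so this snippet is self-contained.
    def __init__(self, data=None, left=None, right=None):
        self.data, self.left, self.right = data, left, right


def find_min(tree):
    """Return the node holding the smallest key, or None for an empty tree."""
    # The smallest key is reached by following left children until none remain.
    while tree and tree.left:
        tree = tree.left
    return tree


def find_max(tree):
    """Return the node holding the largest key, or None for an empty tree."""
    # Symmetric to find_min: keep following right children.
    while tree and tree.right:
        tree = tree.right
    return tree


root = BSTNode(19,
               BSTNode(7, BSTNode(3), BSTNode(11)),
               BSTNode(43, BSTNode(23), BSTNode(47)))

print(find_min(root).data)  # prints 3
print(find_max(root).data)  # prints 47
```

Both loops visit at most one node per level, so they run in time proportional to the height of the tree, just like `search_bst`.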


**top tips**
* you can iterate through the elements in sorted order in O(n) time
* some problems need a combination of a BST and a hash table. For example, if student objects are kept in a BST sorted by GPA and you need to update a student's GPA given only the student's name and the new GPA, finding the student takes O(n) time without a hash table
* augmented BSTs (for example, each node also stores the size of its subtree) are needed to efficiently answer queries such as the number of elements in a range
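
As a sketch of the first tip, an inorder traversal yields the keys in sorted order in O(n) total time, and a pruned variant enumerates only the keys in a given range. This is an illustration rather than code from the notes: `inorder` and `keys_in_range` are hypothetical names, and the node class is a compact copy of the prototype.

```python
class BSTNode:
    # Compact copy of the prototype so this snippet is self-contained.
    def __init__(self, data=None, left=None, right=None):
        self.data, self.left, self.right = data, left, right


def inorder(tree):
    """Yield keys in sorted order using an explicit stack."""
    stack, node = [], tree
    while stack or node:
        if node:
            # Go as far left as possible, remembering the path.
            stack.append(node)
            node = node.left
        else:
            node = stack.pop()
            yield node.data
            node = node.right


def keys_in_range(tree, low, high):
    """Return keys k with low <= k <= high in sorted order."""
    result = []

    def visit(node):
        if node is None:
            return
        if low <= node.data:       # left subtree may still hold keys >= low
            visit(node.left)
        if low <= node.data <= high:
            result.append(node.data)
        if node.data <= high:      # right subtree may still hold keys <= high
            visit(node.right)

    visit(tree)
    return result


root = BSTNode(19,
               BSTNode(7, BSTNode(3), BSTNode(11)),
               BSTNode(43, BSTNode(23), BSTNode(47)))

print(list(inorder(root)))          # prints [3, 7, 11, 19, 23, 43, 47]
print(keys_in_range(root, 10, 30))  # prints [11, 19, 23]
```

The range version skips any subtree whose keys must all fall outside the interval, so it does not have to touch every node when the range is narrow.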

**libraries**
Python does not come with a built-in BST library.
* **sortedcontainers** module is the best option
    * the data structure is a sorted list of sorted lists
    * asymptotic time complexity for inserts and deletes is O(sqrt(n))
* **bintrees** is used here; it implements sorted sets and sorted dictionaries using balanced BSTs
    * insert(e) inserts new element e in the BST
    * discard(e) removes e from the BST if present
    * min_item()/max_item() return the smallest and largest key-value pair in the BST
    * min_key()/max_key() return the smallest and largest key in the BST
    * pop_min()/pop_max() remove and return the smallest and largest key-value pair in the BST
    * these operations take O(log n) time

# Problems

## 14.1 Test if a Binary Tree Satisfies the BST Property 

Write a program that takes as input a binary tree and checks if the tree satisfies the BST property. 

In [3]:
def BST_check(tree, low_range = float('-inf'), high_range = float('inf')):
    if not tree:
        return True
    elif not low_range <= tree.data <= high_range:
        return False
    
    return (BST_check(tree.left, low_range, tree.data) and 
            BST_check(tree.right, tree.data, high_range))

h = BSTNode(13)
g = BSTNode(17, h, None)
f = BSTNode(11, None, g)
e = BSTNode(5)
d = BSTNode(2)
c = BSTNode(3, d, e)
b = BSTNode(7, c, f)
m = BSTNode(31)
l = BSTNode(29, None, m)
n = BSTNode(41)
k = BSTNode(37, l, n)
j = BSTNode(23, None, k)
p = BSTNode(53)
o = BSTNode(47, None, p)
i = BSTNode(43, j, o)
a = BSTNode(19, b, i)
z = BSTNode(100)

assert BST_check(a)

# Hanging 100 off the left of 23 violates the BST property.
j.set_left(z)
assert not BST_check(a)
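
An alternative way to think about the same check, sketched here under the assumption that duplicates are allowed on either side (as in the BST property stated earlier): a binary tree satisfies the BST property exactly when its inorder traversal is non-decreasing. The function name `is_bst_inorder` is hypothetical, and the node class is a compact copy of the prototype.

```python
class BSTNode:
    # Compact copy of the prototype so this snippet is self-contained.
    def __init__(self, data=None, left=None, right=None):
        self.data, self.left, self.right = data, left, right


def is_bst_inorder(tree):
    """Return True if an inorder walk of the tree never decreases."""
    prev = None                 # last key emitted by the traversal
    stack, node = [], tree
    while stack or node:
        if node:
            stack.append(node)
            node = node.left
        else:
            node = stack.pop()
            if prev is not None and node.data < prev:
                return False    # keys went down, so the property is violated
            prev = node.data
            node = node.right
    return True


good = BSTNode(19, BSTNode(7, BSTNode(3), BSTNode(11)), BSTNode(43))
# 25 is a valid right child of 7 locally, but it sits in the left subtree of 19.
bad = BSTNode(19, BSTNode(7, None, BSTNode(25)), BSTNode(43))

print(is_bst_inorder(good))  # prints True
print(is_bst_inorder(bad))   # prints False
```

The `bad` tree shows why comparing each node only with its immediate children is not enough: every parent-child pair is ordered correctly, yet the tree as a whole is not a BST.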

## 14.2 Find the First Key Greater Than a Given Value in a BST

Write a program that takes as input a BST and a value and returns the first key that would appear in an inorder traversal which is greater than the input value. For example, when applied to the BST in Figure 14.1, you should return 29 for input 23.

In [16]:
def first_greater_BST(tree, value):
    # Walk down from the root, remembering the smallest key seen so far
    # that is strictly greater than value.
    first_greater = None
    while tree:
        if tree.data > value:
            # Candidate found; a smaller candidate can only be to the left.
            first_greater, tree = tree, tree.left
        else:
            # tree.data <= value, so the answer can only be to the right.
            tree = tree.right
    # Return the key, or None if no key in the tree exceeds value.
    return first_greater.data if first_greater else None


h = BSTNode(13)
g = BSTNode(17, h, None)
f = BSTNode(11, None, g)
e = BSTNode(5)
d = BSTNode(2)
c = BSTNode(3, d, e)
b = BSTNode(7, c, f)
m = BSTNode(31)
l = BSTNode(29, None, m)
n = BSTNode(41)
k = BSTNode(37, l, n)
j = BSTNode(23, None, k)
p = BSTNode(53)
o = BSTNode(47, None, p)
i = BSTNode(43, j, o)
a = BSTNode(19, b, i)

print(first_greater_BST(a, 23))

29
