# Search trees

For this notebook we use the explicit representation of binary trees. This representation makes a tree much easier to modify, since each node holds only local `left` and `right` references rather than the tree being laid out in global lists.

In [10]:
class TreeNode:
    def __init__(self, value, left = None, right = None):
        self.value = value
        self.left = left
        self.right = right

In a search tree, an inner node is not guaranteed to have both a left and a right subtree: one of them, but not both, can be `None`. Keep this in mind if you want to write a new display function.

In [11]:
def display_tree(tree):
    if tree is None:
        return ""
    if tree.left is None and tree.right is None:
        return str(tree.value)
    
    if tree.left is None:
        subtree = "({right})".format(right = display_tree(tree.right))
    elif tree.right is None:
        subtree = "({left})".format(left = display_tree(tree.left))
    else:
        subtree = "({left},{right})".format(left = display_tree(tree.left),
                                            right = display_tree(tree.right))
    return "{subtree}{value}".format(subtree = subtree, value = tree.value)

In [12]:
tree = TreeNode(3, TreeNode(1), TreeNode(6, TreeNode(4), TreeNode(7)))
display_tree(tree)

'(1,(4,7)6)3'
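
One consequence of the display function above is that a node with only a left child and a node with only a right child are both written as `(x)v`, so the two shapes cannot be told apart in the output. As a minimal sketch of an alternative, the function below always writes the comma and leaves the missing side empty. The name `display_tree_explicit` is made up for this illustration and is not part of the notebook.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def display_tree_explicit(tree):
    # Hypothetical variant of display_tree: a missing child is shown
    # as an empty slot next to the comma.
    if tree is None:
        return ""
    if tree.left is None and tree.right is None:
        return str(tree.value)
    return "({left},{right}){value}".format(
        left=display_tree_explicit(tree.left),
        right=display_tree_explicit(tree.right),
        value=tree.value)

print(display_tree_explicit(TreeNode(2, None, TreeNode(3))))  # (,3)2
print(display_tree_explicit(TreeNode(2, TreeNode(1))))        # (1,)2
```

For trees where every inner node has two children, the output is the same as before.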

## Operations on a search tree

Operations we typically want implemented on a search tree are `insert` (add an element to a tree), `member` (check if a value is in the tree), and `delete` (remove a value from the tree).

### Member

Of these operations, `member` is the simplest. You need to use the search tree property for searching, but otherwise you just need a simple recursive function.

In [8]:
def member(tree, value):
    if tree is None:
        return False
    if tree.value == value:
        return True
    if tree.value < value:
        return member(tree.right, value)
    else:
        return member(tree.left, value)

In [9]:
for v in range(10):
    print("Is", v, "in the tree?", member(tree, v))

Is 0 in the tree? False
Is 1 in the tree? True
Is 2 in the tree? False
Is 3 in the tree? True
Is 4 in the tree? True
Is 5 in the tree? False
Is 6 in the tree? True
Is 7 in the tree? True
Is 8 in the tree? False
Is 9 in the tree? False
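
The `member` function relies on the search tree property without checking it. Assuming the property meant here is the usual one (every value in a node's left subtree is smaller than the node's value, every value in its right subtree is larger, and there are no duplicates), a checker could be sketched as follows. The name `is_search_tree` and its `low`/`high` parameters are assumptions made for this example.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def is_search_tree(tree, low=None, high=None):
    # Hypothetical helper: every value must lie strictly between low
    # and high, where None means "no bound on this side".
    if tree is None:
        return True
    if low is not None and tree.value <= low:
        return False
    if high is not None and tree.value >= high:
        return False
    # Values to the left are bounded above by this node's value,
    # values to the right are bounded below by it.
    return (is_search_tree(tree.left, low, tree.value) and
            is_search_tree(tree.right, tree.value, high))

good = TreeNode(3, TreeNode(1), TreeNode(6, TreeNode(4), TreeNode(7)))
bad = TreeNode(3, TreeNode(1, None, TreeNode(5)), TreeNode(6))
print(is_search_tree(good))  # True
print(is_search_tree(bad))   # False: 5 sits in the left subtree of 3
```

Passing bounds down the recursion matters: comparing each node only with its direct children would accept the second tree, even though `member` could never find the 5 in it.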


For modifying a search tree, it turns out to be simpler to write functions that create new trees rather than change the existing one. Implementing the operations so that they modify a tree in place is still a good exercise, but you have to be careful about the special case of an empty tree.

If we think of each operation as a function from an old tree to a new tree, we can describe the operations as simple transition rules, and writing recursive functions that implement these rules is relatively straightforward.


### Insert

If we aim to implement the operations as recursive functions, there are two cases we must deal with: empty trees and non-empty trees. When inserting a value into a tree, we must find out where to insert it and then build the updated tree. In the base case, where the tree is empty, we create a singleton tree with the new value. In the recursive case, either the value in the current node equals the new value, so it is already present and we return the tree unchanged, or we insert the new value in the left or right subtree, depending on how it compares with the value in the current node.

In [13]:
def insert(tree, value):
    if tree is None:
        return TreeNode(value)
    else:
        if tree.value == value:
            return tree
        elif tree.value < value:
            return TreeNode(tree.value, tree.left, insert(tree.right, value))
        else:
            return TreeNode(tree.value, insert(tree.left, value), tree.right)

Creating new `TreeNode` objects in the recursive cases, rather than simply assigning to `tree.left` or `tree.right`, might seem odd, but returning a new tree makes the empty tree easy to handle: there is no node to assign to when `tree` is `None`, so the caller has to use the returned tree in any case. It also makes the data structure *persistent*, meaning that other references to the old tree are not affected by the insertion.

In [17]:
tree = None
for i in range(2,6):
    tree = insert(tree, i)
print(display_tree(tree))

(((5)4)3)2


In [16]:
for v in range(10):
    print("Is", v, "in the tree?", member(tree, v))

Is 0 in the tree? False
Is 1 in the tree? False
Is 2 in the tree? True
Is 3 in the tree? True
Is 4 in the tree? True
Is 5 in the tree? True
Is 6 in the tree? False
Is 7 in the tree? False
Is 8 in the tree? False
Is 9 in the tree? False
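
For contrast, the in-place variant suggested earlier as an exercise might look something like the sketch below. This is one possible approach, not the notebook's solution, and the name `insert_in_place` is an assumption. Note how the empty tree still forces the function to return a node, since there is nothing to mutate when `tree` is `None`.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def insert_in_place(tree, value):
    # Hypothetical mutating insert. It returns the root so that the
    # empty-tree case can be handled by the caller.
    if tree is None:
        return TreeNode(value)
    node = tree
    while node.value != value:
        if node.value < value:
            if node.right is None:
                node.right = TreeNode(value)  # attach the new leaf
            node = node.right
        else:
            if node.left is None:
                node.left = TreeNode(value)   # attach the new leaf
            node = node.left
    return tree

root = None
for v in [3, 1, 6, 4]:
    root = insert_in_place(root, v)  # the assignment only matters for the first value
print(root.value, root.left.value, root.right.value, root.right.left.value)  # 3 1 6 4
```

Unlike the persistent version, every other reference to `root` sees the new values after each call.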


### Delete

Deleting elements is a slightly more difficult operation. Once again we have a recursive function with a case analysis for the base case and the recursive case. What makes deletion harder is that deleting a leaf is easy, since you just return an empty tree, but deleting an inner node is not so simple.

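The trick is to reduce deleting inner nodes to deleting leaves (or nodes with a single child). A node with at most one child can simply be replaced by its other subtree. If we find the value to delete in a node with two children, we replace that value with the largest value in the node's left subtree, which is the largest value in the tree that is smaller than the one we are deleting, and then delete that replacement value from the left subtree.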

![Deleting in search tree](search-tree-delete.png)

In [19]:
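def rightmost_value(tree):
    # Assumed helper (not defined in this section): the largest value in a
    # non-empty search tree is found by following right children to the end.
    while tree.right is not None:
        tree = tree.right
    return tree.value

def delete(tree, value):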
    if tree is None:
        return None
    if tree.value == value:
        if tree.left is None:
            return tree.right
        if tree.right is None:
            return tree.left
        replacement_value = rightmost_value(tree.left)
        return TreeNode(replacement_value, delete(tree.left, replacement_value), tree.right)
    elif tree.value < value:
        return TreeNode(tree.value, tree.left, delete(tree.right, value))
    else:
        return TreeNode(tree.value, delete(tree.left, value), tree.right)

In [20]:
tree = delete(tree, 3)
tree = delete(tree, 4)
print(display_tree(tree))

(5)2
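
The example above only removes nodes that have a single child, so the two-children case is never exercised. The following self-contained sketch deletes a root with two children. It repeats `TreeNode` and `delete` so that it runs on its own, and it assumes that `rightmost_value` returns the largest value of a non-empty tree by following right children. The helper `values_in_order` is introduced only for this illustration.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def rightmost_value(tree):
    # Assumed behaviour: largest value in a non-empty search tree.
    while tree.right is not None:
        tree = tree.right
    return tree.value

def delete(tree, value):
    # Same logic as the notebook's delete.
    if tree is None:
        return None
    if tree.value == value:
        if tree.left is None:
            return tree.right
        if tree.right is None:
            return tree.left
        replacement_value = rightmost_value(tree.left)
        return TreeNode(replacement_value,
                        delete(tree.left, replacement_value),
                        tree.right)
    elif tree.value < value:
        return TreeNode(tree.value, tree.left, delete(tree.right, value))
    else:
        return TreeNode(tree.value, delete(tree.left, value), tree.right)

def values_in_order(tree):
    # Hypothetical helper: list the values from smallest to largest.
    if tree is None:
        return []
    return values_in_order(tree.left) + [tree.value] + values_in_order(tree.right)

before = TreeNode(3,
                  TreeNode(1, None, TreeNode(2)),
                  TreeNode(6, TreeNode(4), TreeNode(7)))
after = delete(before, 3)

print(after.value)              # 2
print(values_in_order(after))   # [1, 2, 4, 6, 7]
print(values_in_order(before))  # [1, 2, 3, 4, 6, 7]
```

The value 2 is the largest value in the left subtree of 3, so it moves up to the root and is then deleted from the left subtree, where it was a leaf. The original tree is left untouched, as with `insert`.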
