In [4]:
class Tree:
    """A tree with label as its label value."""
    def __init__(self, label, branches=[]):
        self.label = label
        for branch in branches:
            assert isinstance(branch, Tree)
        self.branches = list(branches)

    def __repr__(self):
        if self.branches:
            branch_str = ', ' + repr(self.branches)
        else:
            branch_str = ''
        return 'Tree({0}{1})'.format(self.label, branch_str)

    def __str__(self):
        return '\n'.join(self.indented())

    def indented(self, k=0):
        indented = []
        for b in self.branches:
            for line in b.indented(k + 1):
                indented.append('  ' + line)
        return [str(self.label)] + indented

    def is_leaf(self):
        return not self.branches

In [5]:
class BTree(Tree): # A binary tree, implemented as a subclass of Tree
    empty = Tree(None) # Represent the empty tree as a Tree whose label is None
    
    def __init__(self, label, left = empty, right = empty):
        # Construct a binary tree
        Tree.__init__(self, label, [left, right])
        
    # To access left branch, we use @property method decorator
    @property
    def left(self):
        return self.branches[0]
    
    # Same for right branch
    @property
    def right(self):
        return self.branches[1]
    
    # A new definition of is_leaf
    def is_leaf(self):
        # A leaf is when both branches are empty
        return [self.left, self.right] == [BTree.empty] * 2
    
    # The __repr__ method shows what to display
    def __repr__(self):
        if self.is_leaf():
            return 'BTree({0})'.format(self.label)
        elif self.right is BTree.empty:
            left = repr(self.left)
            return 'BTree({0}, {1})'.format(self.label, left)
        else:
            left, right = repr(self.left), repr(self.right)
            if self.left is BTree.empty:
                left = 'BTree.empty' 
            template = 'BTree({0}, {1}, {2})'
            return template.format(self.label, left, right)

# Binary Search Trees

A common application of binary trees is to support binary search.

## Binary Search

Binary search is a strategy for finding a value in a sorted sequence (e.g. a list).
* The strategy is to check the middle element.
* If it's not the value that we're looking for, we can still eliminate half of the sequence.

Let's say we're looking for 20 in

In [None]:
[1, 2, 4, 8, 16, 32, 64] # Sorted in increasing order

First we check the middle element, `8`

<img src = '8.jpg' width = 500/>

8 is not 20, but it's smaller than 20. This means we can eliminate 8 and the elements smaller than 8. 

<img src = 'eliminate.jpg' width = 500/>

Now we check the middle of the remaining elements, which is 32.

<img src = '32.jpg' width = 500/>

32 is greater than 20. Thus we can eliminate 32 and the elements greater than 32.

<img src = '16.jpg' width = 400/>

Now only 16 is left. 16 is not 20, so we can tell that 20 is not in the list, and the search returns `False`.

What if we search for an element that is in the list? Let's say we want to check whether `4` is in the following list:

In [None]:
[1, 2, 4, 8, 16, 32]

First, we check the middle element, 8. 8 is greater than 4, so we can get rid of 8 and all the elements greater than 8.

<img src = '4.jpg' width = 500/>

Now we check the middle of the remaining elements, 2. 2 is smaller than 4, so we get rid of 2 and all the elements smaller than 2.

<img src = 'left.jpg' width = 300/>

Now we only have `4` left. 4 is exactly the number we are looking for! Thus, the search returns `True`.

For a sorted list of length `n`, what $\Theta$ expression describes the time required to check whether an element is in the list?

$$\Theta(\log n)$$

We eliminate half the work with each step. 
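The two walk-throughs above can be sketched as a short function. This is a minimal illustration, not code from the notebook: the name `contains` and the `lo`/`hi` bookkeeping are assumptions made for the example.

```python
def contains(s, v):
    """Return whether v is in the sorted list s, using binary search."""
    lo, hi = 0, len(s)           # v can only be somewhere in s[lo:hi]
    while lo < hi:
        mid = (lo + hi) // 2     # middle of the remaining range
        if s[mid] == v:
            return True
        elif s[mid] < v:
            lo = mid + 1         # drop s[mid] and everything smaller
        else:
            hi = mid             # drop s[mid] and everything larger
    return False

print(contains([1, 2, 4, 8, 16, 32, 64], 20))  # False (checks 8, 32, 16)
print(contains([1, 2, 4, 8, 16, 32], 4))       # True  (checks 8, 2, 4)
```

Each pass through the loop halves the range `s[lo:hi]`, which is where the logarithmic growth comes from.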

Binary search trees are like sorted lists, except that they store their values as labels in a `Tree`. The advantage of using a `Tree` is that adding a new element can be quicker than inserting it into a sorted list, where every later element has to shift over to make room.

## Binary Search Trees

A binary search tree is a binary tree where each node's label is:
1. Larger than all node labels in its left branch
2. Smaller than all node labels in its right branch
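These two rules also determine where a new label belongs, which is why adding to a binary search tree can be quick. The sketch below is an illustration under assumptions: to stay runnable on its own it uses plain tuples `(label, left, right)` with `None` as the empty tree instead of the `BTree` class, and `insert` is a hypothetical helper.

```python
def insert(t, v):
    """Return a tree like t that also contains v.

    Trees are (label, left, right) tuples; None is the empty tree.
    """
    if t is None:
        return (v, None, None)                   # found the empty spot for v
    label, left, right = t
    if v < label:
        return (label, insert(left, v), right)   # v belongs in the left branch
    elif v > label:
        return (label, left, insert(right, v))   # v belongs in the right branch
    return t                                     # v is already in the tree

t = None
for v in [7, 3, 9, 1, 5]:
    t = insert(t, v)
print(t)
# (7, (3, (1, None, None), (5, None, None)), (9, None, None))
```

Each call follows a single branch, so the work is proportional to the depth of the tree rather than to the number of labels. That only pays off when the tree is shallow.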

A sorted list of numbers usually corresponds to more than one binary search tree. For example,

In [None]:
[1, 3, 5, 7, 9]

The numbers above can be represented in a binary search tree like the following,

<img src = 'bin_tree.jpg' width = 200/>

Everything on the left side of `7` is smaller than `7`. Everything on the right side of `7` is greater than `7`. This property holds recursively (i.e. it also applies to every subtree).
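To make the recursive property concrete, here is a hedged sketch of a validity check. It assumes trees written as `(label, left, right)` tuples with `None` for the empty tree (a stand-in for `BTree` so the example runs on its own); `is_bst` and the sample trees are made up for illustration and are not meant to match the figures.

```python
def is_bst(t, lo=None, hi=None):
    """Return whether t is a binary search tree with all labels strictly between lo and hi.

    A bound of None means there is no limit on that side.
    """
    if t is None:
        return True
    label, left, right = t
    if lo is not None and label <= lo:
        return False
    if hi is not None and label >= hi:
        return False
    # Labels in the left branch must stay below label, those in the right above it
    return is_bst(left, lo, label) and is_bst(right, label, hi)

a = (7, (3, (1, None, None), (5, None, None)), (9, None, None))
b = (5, (3, (1, None, None), None), (9, (7, None, None), None))
bad = (7, (3, None, (8, None, None)), (9, None, None))

print(is_bst(a), is_bst(b))  # True True
print(is_bst(bad))           # False: 8 is right of 3, but it sits in the left branch of 7
```

The `bad` tree shows why comparing each node only with its direct children is not enough: the bounds have to be carried down through every level.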

The tree above is not the only binary search tree for these numbers. Here are some others.

<img src = 'bin_tree2.jpg' width = 200/> <img src = 'bin_tree3.jpg' width = 200/>

## Implementation

Let's say we have a sorted list of numbers `s`. How do we construct a binary search tree out of `s`?

We saw that many different binary search trees can be formed from the same list of numbers. The most useful ones are the balanced ones: for every subtree within the whole tree, the left branch should have about the same number of nodes as the right branch.

Searching a balanced tree gives us the $\Theta(\log n)$ order of growth: each time we rule out the left or right branch, we get rid of about half of the remaining work.
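The search itself can be sketched as follows. This is an illustration under assumptions rather than the notebook's code: trees are written as `(label, left, right)` tuples with `None` as the empty tree so the block runs on its own, and `bst_contains` is a hypothetical name.

```python
def bst_contains(t, v):
    """Return whether v is a label in the binary search tree t."""
    while t is not None:
        label, left, right = t
        if v == label:
            return True
        # v can only be in one branch, so the other branch is never visited
        t = left if v < label else right
    return False

tree = (7, (3, (1, None, None), (5, None, None)), (9, None, None))
print(bst_contains(tree, 5))  # True
print(bst_contains(tree, 4))  # False
```

The loop moves down one level per step, so the number of steps is bounded by the depth of the tree, which is about $\log_2 n$ when the tree is balanced.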

In [1]:
def balanced_bst(s):
    # Base case: if s is empty, return an empty binary search tree
    if not s:
        return BTree.empty
    else:
        # First, find the midpoint
        mid = len(s) // 2
        # Left branch: a balanced BST of the elements before the midpoint
        # (up to but not including s[mid])
        left = balanced_bst(s[:mid])
        # Right branch: a balanced BST of all the elements after the midpoint
        right = balanced_bst(s[mid+1:])
        # Return a BTree with s[mid] as the label
        return BTree(s[mid], left, right)

Now let's try to create a binary search tree out of the list `[3, 4, 5]`. The root of the tree should be `4`!

In [6]:
balanced_bst([3, 4, 5])

BTree(4, BTree(3), BTree(5))

As we can see, we obtain a binary tree with label 4, a left branch labeled 3, and a right branch labeled 5!

How about the numbers from 0 to 9?

In [7]:
balanced_bst(range(10))

BTree(5, BTree(2, BTree(1, BTree(0)), BTree(4, BTree(3))), BTree(8, BTree(7, BTree(6)), BTree(9)))
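To see how shallow midpoint splitting keeps the tree, the sketch below computes only the number of levels such a tree would have, without building it. `balanced_height` is a hypothetical helper that mirrors the recursion in `balanced_bst`; it is not part of the notebook's classes.

```python
def balanced_height(n):
    """Number of levels in the tree that midpoint splitting builds from n sorted labels."""
    if n == 0:
        return 0
    mid = n // 2                       # same midpoint as balanced_bst
    left_size = mid                    # len(s[:mid])
    right_size = n - mid - 1           # len(s[mid+1:])
    return 1 + max(balanced_height(left_size), balanced_height(right_size))

for n in [3, 10, 1000]:
    print(n, balanced_height(n))
# 3 2
# 10 4
# 1000 10
```

Ten labels fit in four levels, which matches the nesting depth of the `BTree` printed above, and a thousand labels need only ten levels.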

## Discussion Questions 

### What's the largest element in a binary search tree?

For example, if we have the following tree,

<img src = 'bin_tree.jpg' width = 200/>

Then it should return `11`. 

In [26]:
def largest(t):
    # If the right branch of the currently selected tree is empty
    if t.right is BTree.empty:
        # Then return the label of the currently selected tree
        return t.label
    else:
        # Since the greater element is always on the right branch, recursive call on the right branch
        return largest(t.right)

In [27]:
largest(balanced_bst(range(12)))

11

### What's the 2nd largest element in a binary search tree?

In [29]:
def second(t):
    # Base case: a leaf has only one label, so there is no second largest
    if t.is_leaf():
        return None
    
    # This second condition applies to a binary search tree like the one below.
    # If the right branch is empty, t.label is the largest label,
    # so the second largest is the largest label in the left branch.
    # The left branch cannot be empty here, because t is not a leaf.
    elif t.right is BTree.empty:
        return largest(t.left)
    # If the right branch is a leaf, that leaf holds the largest label
    elif t.right.is_leaf():
        # Then the label of the currently selected tree is the answer!
        return t.label
    else:
        # Otherwise the two largest labels are both in the right branch
        return second(t.right)

<img src = '2nd.jpg' width =200/>
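One way to sanity-check a function like `second` is to compare it against a slower but more obviously correct answer. The sketch below is an assumption-laden illustration: it uses `(label, left, right)` tuples with `None` as the empty tree instead of `BTree` so that it runs on its own, and `labels` and `second_by_sorting` are hypothetical helpers.

```python
def labels(t):
    """Return the labels of t in increasing order (an in-order traversal)."""
    if t is None:
        return []
    label, left, right = t
    return labels(left) + [label] + labels(right)

def second_by_sorting(t):
    """Return the second largest label of t, or None if t has fewer than two labels."""
    ordered = labels(t)
    return ordered[-2] if len(ordered) >= 2 else None

# The largest label (7) has an empty right branch, so the answer is in its left branch
tree = (7, (3, (1, None, None), (5, None, None)), None)
print(labels(tree))             # [1, 3, 5, 7]
print(second_by_sorting(tree))  # 5
```

This version visits every node, so it takes time proportional to the size of the tree, whereas a version that follows one branch per step only does work proportional to its depth.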