# Key Terms

Binary Search Trees have these properties:

- The left subtree of a node contains only keys less than the node's key
- The right subtree of a node contains only keys greater than the node's key
- The left and right subtrees must themselves be BSTs

KEY THING VS Binary Trees: ORDER MATTERS!

BSTs can elude a lot of people, including myself, so here's a section on where they can be used in real life.

# Why It's Useful

Let's say we have a flight time schedule tracker, and we want to add a new flight to the dataset.

Here are the times: 1, 3, 5, 9, 12
And we want to add 6 to this set.

## Using a sorted array
[1, 3, 5, 9, 12]

Finding insertion point: O(logn) with binary search
Insertion is O(n) (shifting)
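
As a rough sketch of the two costs above, Python's standard `bisect` module can do the binary search, and `list.insert` does the shifting. The variable names here are just for illustration.

```python
import bisect

flights = [1, 3, 5, 9, 12]

# Binary search for the slot where 6 belongs: O(logn).
index = bisect.bisect_left(flights, 6)

# Inserting shifts every later element one place to the right: O(n).
flights.insert(index, 6)

print(index)    # 3
print(flights)  # [1, 3, 5, 6, 9, 12]
```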

## Using a linked list
1 -> 3 -> 5 -> 9 -> 12

Insertion: O(1) once the insertion point is known
Finding insertion point: Can't use binary search on a list, so O(n) starting from the head
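
Here's a minimal sketch of the same insert on a singly linked list. `ListNode`, `insert_sorted` and `to_list` are hypothetical helpers written for this example, not part of any library. The walk is the O(n) part, the splice is the O(1) part.

```python
class ListNode:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def insert_sorted(head, value):
    """Insert value into an ascending linked list and return the head."""
    if head is None or value < head.data:
        return ListNode(value, head)
    node = head
    # O(n): walk until the next node is no longer smaller than value.
    while node.next is not None and node.next.data < value:
        node = node.next
    # O(1): splice the new node in after `node`.
    node.next = ListNode(value, node.next)
    return head


def to_list(head):
    """Collect the linked list into a Python list for printing."""
    out = []
    while head is not None:
        out.append(head.data)
        head = head.next
    return out


# Build 1 -> 3 -> 5 -> 9 -> 12 by prepending in reverse order.
head = None
for t in reversed([1, 3, 5, 9, 12]):
    head = ListNode(t, head)

head = insert_sorted(head, 6)
print(to_list(head))  # [1, 3, 5, 6, 9, 12]
```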

## Using a heap

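A heap only guarantees that the min (or max) element is at the root. The rest of the elements are not kept in sorted order.

Finding where a new time fits relative to its neighbors: O(n), because a heap can't be searched by key. (Pushing onto the heap is O(logn), but that doesn't tell us which flights sit right before and after the new one.)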
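
A small sketch with the standard `heapq` module shows the problem. This assumes that "finding the insertion point" means finding the flights right before and after the new time, which is my reading of the problem.

```python
import heapq

heap = [1, 3, 5, 9, 12]
heapq.heapify(heap)

# Adding to the heap is cheap: O(logn).
heapq.heappush(heap, 6)

# The earliest flight is always at the root: O(1).
earliest = heap[0]

# But the neighbors of 6 can only be found by scanning everything: O(n).
prev_flight = max(t for t in heap if t < 6)
next_flight = min(t for t in heap if t > 6)

print(earliest, prev_flight, next_flight)  # 1 5 9
```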

## Using a Binary Search Tree

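Finding insertion point: O(h), where h is the height of the tree (same idea as binary search). That's O(logn) if the tree is guaranteed balanced
Insertion: attaching the new leaf is O(1), so the whole operation is O(h)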
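
Here's a minimal sketch of that insert for the flight times, with no balancing. `Node`, `insert` and `in_order` are names made up for this example, and sending duplicate keys to the right is an arbitrary choice of mine.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the BST rooted at root and return the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.data:
            if node.left is None:
                node.left = Node(key)   # found the empty slot: O(1) attach
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right


def in_order(node):
    """Left, node, right: visits the keys in sorted order."""
    if node is None:
        return []
    return in_order(node.left) + [node.data] + in_order(node.right)


root = None
for t in [5, 3, 9, 1, 12]:
    root = insert(root, t)

root = insert(root, 6)   # walks 5 -> 9, then becomes 9's left child
print(in_order(root))    # [1, 3, 5, 6, 9, 12]
```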

Binary Search Trees are a bit more complicated than heaps. Each node needs a few more bytes (the left and right child pointers), whereas a heap is usually stored compactly in an array.

For this problem, it's perfect!

## Rank

Let's add a twist to this problem. Say that after solving this, we're also asked to compute rank(t): how many flights are scheduled at a time <= t?

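We need to add more info to the node structure. One data point we can add is the "subtree size": the number of nodes in the subtree rooted at that node, including the node itself. The size of a node's left subtree then tells us how many keys under that node are smaller than it.

If we do this, we'll need to update the subtree sizes along the path from the root when doing *inserts* and *deletes*.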

Once each node stores its subtree size, this is the algorithm to get the rank.

- Start at the root with rank = 0 and walk down as if searching for t
- At each node whose key is <= t, add 1 for the node itself plus the size of its left subtree, then go right
- At each node whose key is > t, add nothing and go left
- Stop when you fall off the bottom of the tree

The resulting value is the rank we are looking for in O(h) time
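
The rank walk can be sketched like this. `SizedNode`, `size`, `insert` and `rank` are hypothetical names for this example, and the tree is not balanced, so O(h) here is only O(logn) if the insert order happens to be kind.

```python
class SizedNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
        self.size = 1   # nodes in the subtree rooted here, itself included


def size(node):
    return node.size if node is not None else 0


def insert(node, key):
    """Insert key and refresh subtree sizes on the way back up."""
    if node is None:
        return SizedNode(key)
    if key < node.data:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    node.size = 1 + size(node.left) + size(node.right)
    return node


def rank(node, t):
    """Count the keys <= t in O(h)."""
    count = 0
    while node is not None:
        if node.data <= t:
            # This node and its whole left subtree are <= t.
            count += 1 + size(node.left)
            node = node.right
        else:
            node = node.left
    return count


root = None
for time in [5, 3, 9, 1, 12, 6]:
    root = insert(root, time)

print(rank(root, 6))    # 4  (flights 1, 3, 5, 6)
print(rank(root, 0))    # 0
print(rank(root, 100))  # 6
```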

# Data Structure

### BST Node

In [21]:
class BSTNode:
    def __init__(self, data=None, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right
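
As a quick sketch of how this node gets used, here's the flight tree built by hand, plus a checker for the BST properties from Key Terms. `is_bst` is a hypothetical helper of mine. The class is repeated so the cell runs on its own.

```python
class BSTNode:  # same shape as the class above
    def __init__(self, data=None, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


#        5
#      /   \
#     3     9
#    /     /  \
#   1     6    12
root = BSTNode(5,
               left=BSTNode(3, left=BSTNode(1)),
               right=BSTNode(9, left=BSTNode(6), right=BSTNode(12)))


def is_bst(node, low=float("-inf"), high=float("inf")):
    """Check that every key lies strictly between the bounds set by its ancestors."""
    if node is None:
        return True
    if not (low < node.data < high):
        return False
    return (is_bst(node.left, low, node.data)
            and is_bst(node.right, node.data, high))


# 7 is a valid right child of 3, but it sits in the left subtree of 5.
broken = BSTNode(5, left=BSTNode(3, right=BSTNode(7)))

print(is_bst(root))    # True
print(is_bst(broken))  # False
```

The `broken` tree is why checking only a node against its direct children isn't enough: the whole subtree has to respect the ordering.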



# Tips

- Searching is the most common use of BSTs. Unlike a hash table, a BST keeps its keys in order, so a good BST library can do all of the following in O(h), or O(logn) if the tree is balanced:
  - get the min element (keep following left children)
  - get the max element (keep following right children)
  - find the next largest/smallest element
  - lookup
  - delete
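
A minimal sketch of the ordered queries from the list above. `find_min`, `find_max` and `next_larger` are names I made up, and `find_min`/`find_max` assume a non-empty tree.

```python
class BSTNode:  # same shape as the BSTNode in the Data Structure section
    def __init__(self, data=None, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def find_min(tree):
    """Keep following left children."""
    while tree.left is not None:
        tree = tree.left
    return tree


def find_max(tree):
    """Keep following right children."""
    while tree.right is not None:
        tree = tree.right
    return tree


def next_larger(tree, key):
    """Return the node with the smallest key strictly greater than key, or None."""
    candidate = None
    while tree is not None:
        if tree.data > key:
            candidate = tree   # best answer so far, look for a smaller one
            tree = tree.left
        else:
            tree = tree.right
    return candidate


root = BSTNode(5,
               left=BSTNode(3, left=BSTNode(1)),
               right=BSTNode(9, left=BSTNode(6), right=BSTNode(12)))

print(find_min(root).data, find_max(root).data)  # 1 12
print(next_larger(root, 6).data)                 # 9
print(next_larger(root, 12))                     # None
```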

- A common mistake when updating a BST node: if you change a node's key in place, the node can end up in the wrong position for its new key, so a later query may not find it even though it's still in the tree.

  - To avoid this, remove the node from the tree, update the value, and then add it back.

- Combining a hash table with a BST can be powerful.
  - Let's say student objects in a BST are ordered by GPA, and you want to update a student's GPA.
  - Finding that particular student in the BST would take a full traversal (the tree is ordered by GPA, not by student), but with a hash table we can get to it directly.
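
Here's a rough sketch of the last two tips together: remove, update, re-insert, with a hash table to find the old key. To keep it short, a sorted list managed with `bisect` stands in for the BST, so removal and insertion are O(n) here rather than O(logn). The student names and GPAs are made up.

```python
import bisect

# Hash table: student name -> current GPA (the key used for ordering).
gpa_by_name = {"Ana": 3.2, "Ben": 3.9, "Cy": 2.8}

# Ordered structure of (GPA, name) pairs. A sorted list stands in for the BST.
by_gpa = sorted((gpa, name) for name, gpa in gpa_by_name.items())


def update_gpa(name, new_gpa):
    old_gpa = gpa_by_name[name]                      # direct lookup, no traversal
    i = bisect.bisect_left(by_gpa, (old_gpa, name))  # locate the entry by its old key
    by_gpa.pop(i)                                    # 1. remove
    gpa_by_name[name] = new_gpa                      # 2. update
    bisect.insort(by_gpa, (new_gpa, name))           # 3. re-insert under the new key


update_gpa("Cy", 3.5)
print(by_gpa)  # [(3.2, 'Ana'), (3.5, 'Cy'), (3.9, 'Ben')]
```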

# Analysis

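Key lookup, insertion and deletion are all O(h) in the worst case, where h is the height of the tree. With no balancing, h can be as large as n (e.g. when keys are inserted in sorted order), so the worst case is O(n).

Some trees, like red-black trees, are guaranteed to stay balanced with a height of O(logn), so these operations are O(logn) in the worst case.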
The tradeoff is additional data on the tree nodes.
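
To see how much the height depends on insertion order, here's a small sketch using a plain unbalanced insert. `insert`, `height` and `build` are hypothetical helpers, and height is counted in nodes.

```python
class BSTNode:  # same shape as the BSTNode in the Data Structure section
    def __init__(self, data=None, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def insert(root, key):
    """Plain BST insert with no rebalancing."""
    if root is None:
        return BSTNode(key)
    node = root
    while True:
        if key < node.data:
            if node.left is None:
                node.left = BSTNode(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = BSTNode(key)
                return root
            node = node.right


def height(node):
    """Number of nodes on the longest root-to-leaf path."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


# Sorted input degenerates into a linked list: h == n.
print(height(build([1, 2, 3, 4, 5, 6, 7])))  # 7

# The same keys in a friendlier order give a perfectly balanced tree.
print(height(build([4, 2, 6, 1, 3, 5, 7])))  # 3
```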

# Implementation

### Searching in a BST

In [22]:
def search_bst(tree: BSTNode, key: int):
    # Base case
    if tree is None or tree.data == key:
        return tree
    
    # If key is greater than current node, go right
    if key > tree.data:
        return search_bst(tree.right, key)

    # Else check left
    return search_bst(tree.left, key)
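
The same search can also be written as a loop. This is just a sketch of an alternative, not a replacement for the version above. It avoids deep recursion on a tall, unbalanced tree. `search_bst_iterative` is a name I made up.

```python
class BSTNode:  # same shape as the BSTNode in the Data Structure section
    def __init__(self, data=None, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def search_bst_iterative(tree, key):
    """Return the node holding key, or None if the key is absent."""
    while tree is not None and tree.data != key:
        # Go right for larger keys, left for smaller ones.
        tree = tree.right if key > tree.data else tree.left
    return tree


root = BSTNode(5,
               left=BSTNode(3, left=BSTNode(1)),
               right=BSTNode(9, left=BSTNode(6), right=BSTNode(12)))

print(search_bst_iterative(root, 6).data)  # 6
print(search_bst_iterative(root, 7))       # None
```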

# Python Library

`bintrees` is a good Python library that provides sorted sets and sorted dicts backed by balanced BSTs.

`sortedcontainers` is the current go-to for sorted sets, dicts, and lists, but for the sake of education and following EPI, let's continue to use `bintrees` for interview prep.

In [12]:
import bintrees

t = bintrees.RBTree([(5, 'Alpha'), (2, 'Bravo'), (7, 'Charlie'), (3, 'Delta'), (6, 'Echo')])

print(t)
print(t[2])
print(t.min_item(), t.max_item())



RBTree({2: 'Bravo', 3: 'Delta', 5: 'Alpha', 6: 'Echo', 7: 'Charlie'})
Bravo
(2, 'Bravo') (7, 'Charlie')


In [13]:
t.insert(9, 'Golf')
print(t)

RBTree({2: 'Bravo', 3: 'Delta', 5: 'Alpha', 6: 'Echo', 7: 'Charlie', 9: 'Golf'})


In [14]:
print(t.min_key(), t.max_key())

2 9


In [18]:
t.discard(3)
print(t)

RBTree({2: 'Bravo', 5: 'Alpha', 6: 'Echo', 7: 'Charlie', 9: 'Golf'})


In [19]:
a = t.pop_min()
print(t)

RBTree({5: 'Alpha', 6: 'Echo', 7: 'Charlie', 9: 'Golf'})


In [20]:
b = t.pop_max()
print(t)

RBTree({5: 'Alpha', 6: 'Echo', 7: 'Charlie'})
