# Binary Search Trees
A BST is a binary tree that satisfies the BST property: the key stored at a node is greater than or equal to every key in its left subtree and less than or equal to every key in its right subtree.

## Tips:
- With a BST you can **iterate** through the elements in **sorted order** in $O(n)$ time (regardless of whether it is balanced).
- Some problems need a **combination of a BST and a hash table**. For example, suppose student objects are inserted in a BST ordered by GPA, and a student's GPA then needs to be updated when all we have is the student's name and new GPA. We cannot find the student by name without a full traversal. With an additional hash table keyed by name, we can go directly to the corresponding entry in the tree.
- Sometimes it is necessary to **augment** a BST to make it possible to manipulate more complicated data, e.g., intervals, and efficiently support more complex queries, e.g., the number of elements in a range.
- The BST property is a **global property** - a binary tree may have the property that each node's key is greater than the key at its left child and smaller than the key at its right child, yet still not be a BST.
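
The first and last tips can be illustrated together. The sketch below is a minimal example, assuming a hypothetical `Node` class (a stand-in for `BinaryTreeNode`, not the class used in the rest of this notebook) and two illustrative helpers, `inorder` and `is_locally_ordered`. It builds a tree in which every parent/child pair looks ordered, yet the in-order traversal is not sorted, so the tree is not a BST.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Node:
    """Hypothetical minimal node, standing in for BinaryTreeNode."""
    data: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def inorder(tree: Optional[Node]) -> Iterator[int]:
    """Yield keys in in-order using an explicit stack; each node is pushed and popped once."""
    stack: List[Node] = []
    node = tree
    while stack or node:
        while node:              # walk down the left spine first
            stack.append(node)
            node = node.left
        node = stack.pop()
        yield node.data
        node = node.right


def is_locally_ordered(tree: Optional[Node]) -> bool:
    """Compare each node only with its two children (NOT a full BST check)."""
    if tree is None:
        return True
    if tree.left and tree.left.data > tree.data:
        return False
    if tree.right and tree.right.data < tree.data:
        return False
    return is_locally_ordered(tree.left) and is_locally_ordered(tree.right)


# Every parent/child pair is ordered, but 12 sits in the left subtree of 10.
not_a_bst = Node(10, left=Node(5, right=Node(12)), right=Node(15))

keys = list(inorder(not_a_bst))
print(is_locally_ordered(not_a_bst))  # True
print(keys)                           # [5, 12, 10, 15]
print(keys == sorted(keys))           # False
```

Checking that the in-order sequence is sorted is one way to test the global property; the range-based check in 14.1 is another.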

In [15]:
from data_structures import trees
from data_structures.trees.binary import BinaryTreeNode
from typing import List

bst = trees.bst.make_tree_example()

### 14.1: Is a Binary Tree a BST?

In [10]:
def is_bst(tree: BinaryTreeNode) -> bool:
    ''' 
    check constraints at each node
    parent node creates a global constraint on all its child nodes
    '''
    def are_keys_in_range(tree: BinaryTreeNode, low_range: float=float('-inf'), high_range: float=float('inf')) -> bool:
        if tree is None:
            return True 
        if not low_range <= tree.data <= high_range:
            return False
        return (are_keys_in_range(tree.left, low_range, tree.data)
                and are_keys_in_range(tree.right, tree.data, high_range))
    return are_keys_in_range(tree)

print(is_bst(trees.binary.make_tree_example()))
print(is_bst(bst))

False
True


Time complexity is $O(n)$ and space complexity is $O(h)$, where $h$ is the height of the tree.

#### Variant
Use a queue to check the same range constraints in breadth-first order instead of recursively.
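
One possible reading of this variant is sketched below, assuming a hypothetical `Node` class in place of `BinaryTreeNode` and an illustrative function name `is_bst_bfs`. Each queue entry carries a node together with the key range it must fall in. Nodes are visited level by level, so a violation near the root is found before deeper levels are explored.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical minimal node, standing in for BinaryTreeNode."""
    data: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_bst_bfs(tree: Optional[Node]) -> bool:
    # Each entry is (node, lowest allowed key, highest allowed key).
    queue = deque([(tree, float('-inf'), float('inf'))])
    while queue:
        node, low, high = queue.popleft()
        if node is None:
            continue
        if not low <= node.data <= high:
            return False
        # Children inherit the parent's range, tightened by the parent's key.
        queue.append((node.left, low, node.data))
        queue.append((node.right, node.data, high))
    return True


print(is_bst_bfs(Node(10, Node(5, right=Node(7)), Node(15))))   # True
print(is_bst_bfs(Node(10, Node(5, right=Node(12)), Node(15))))  # False
```

Time complexity stays $O(n)$, but the queue can hold a whole level of the tree, so the extra space is $O(n)$ in the worst case rather than $O(h)$.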

### 14.2: Find the First Key Greater than a Given Value

In [14]:
def find_first_greater_than_k(tree: BinaryTreeNode, key: int) -> BinaryTreeNode:
    subtree, candidate = tree, None

    while subtree:
        if subtree.data <= key:
            subtree = subtree.right
        # node greater than key
        else:
            candidate = subtree
            subtree = subtree.left
    return candidate

print(find_first_greater_than_k(bst, 40).data)
print(find_first_greater_than_k(bst, 47).data)
print(find_first_greater_than_k(bst, 12).data)


41
53
13


Time complexity is $O(h)$, where $h$ is the height of the tree, and space complexity is $O(1)$.

#### Variant 14.2.A:
Find the node in a BST whose key equals the input value and that appears first in an in-order traversal. The BST can contain duplicate keys.

In [26]:
class BinaryTreeNodeName(BinaryTreeNode):
    def __init__(self, data=None, name=None, left=None, right=None) -> None:
        self.data = data
        self.name = name 
        self.left = left 
        self.right = right 

def find_first_key(tree: BinaryTreeNode, key: int) -> BinaryTreeNode:
    subtree, candidate = tree, None 

    while subtree:
        if subtree.data == key:
            candidate = subtree
            subtree = subtree.left 
        elif subtree.data < key:
            subtree = subtree.right 
        # node greater than key
        else:
            subtree = subtree.left

    return candidate


bst_duplicates = BinaryTreeNodeName(data=108, name='A',
                    left=BinaryTreeNodeName(data=108, name='B',
                            left=BinaryTreeNodeName(data=-10, name='C',
                                    left=BinaryTreeNodeName(data=-14, name='D'),
                                    right=BinaryTreeNodeName(data=2, name='E')
                                    ),
                            right=BinaryTreeNodeName(data=108, name='F')
                            ),
                    right=BinaryTreeNodeName(data=285, name='G',
                            left=BinaryTreeNodeName(data=243, name='H'),
                            right=BinaryTreeNodeName(data=285, name='I',
                                    right=BinaryTreeNodeName(data=401, name='J')
                            )
                    )
                )
trees.binary.traversal_levelorder(bst_duplicates)
print()

assert find_first_key(bst_duplicates, 108).name == 'B'
assert find_first_key(bst_duplicates, 285).name == 'G'
assert find_first_key(bst_duplicates, 143) is None
assert find_first_key(bst_duplicates, 243).name == 'H'

108 108 285 -10 108 243 285 -14 2 401 


Time complexity is $O(h)$, where $h$ is the height of the tree, and space complexity is $O(1)$.

### 14.3: Find the K Largest Elements in a BST

In [20]:
def find_k_largest(tree: BinaryTreeNode, k: int) -> List[int]:
    ''' 
    do a reverse in-order traversal to visit nodes in descending sorted order
    return once we have k elements
    '''
    def helper(tree: BinaryTreeNode):

        if tree and len(k_largest_nodes) < k:
            if tree.right:
                helper(tree.right)

            if len(k_largest_nodes) < k:
                k_largest_nodes.append(tree.data)

                if tree.left:
                    helper(tree.left)

    k_largest_nodes: List[int] = []  # stores keys (tree.data), not nodes
    helper(tree)
    return k_largest_nodes

find_k_largest(bst, 4)

[53, 47, 43, 41]

Time complexity is $O(h+k)$: the traversal first descends the height of the tree to reach the largest key and then visits $k$ nodes on the way back up.
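
The same idea can also be written without recursion. The sketch below is an alternative formulation, not the solution above: it assumes a hypothetical `Node` class in place of `BinaryTreeNode`, and the names `reverse_inorder` and `k_largest` are illustrative. A generator yields keys in descending order and `itertools.islice` stops after $k$ of them, so the rest of the tree is never visited.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Iterator, List, Optional


@dataclass
class Node:
    """Hypothetical minimal node, standing in for BinaryTreeNode."""
    data: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def reverse_inorder(tree: Optional[Node]) -> Iterator[int]:
    """Lazily yield keys from largest to smallest using an explicit stack."""
    stack: List[Node] = []
    node = tree
    while stack or node:
        while node:              # walk down the right spine first
            stack.append(node)
            node = node.right
        node = stack.pop()
        yield node.data
        node = node.left


def k_largest(tree: Optional[Node], k: int) -> List[int]:
    # islice stops pulling from the generator after k keys.
    return list(islice(reverse_inorder(tree), k))


root = Node(19, left=Node(7, Node(3), Node(11)), right=Node(43, Node(23), Node(47)))
print(k_largest(root, 3))  # [47, 43, 23]
```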

### 14.4: Compute the LCA in a BST

In [32]:
# assumes a.data <= b.data
def find_lca(tree: BinaryTreeNode, a: BinaryTreeNode, b: BinaryTreeNode) -> BinaryTreeNode:
    while tree.data < a.data or tree.data > b.data:

        while tree.data < a.data:
            tree = tree.right     # LCA must be in right subtree 

        while tree.data > b.data:
            tree = tree.left      # LCA must be in left subtree 

    # now a.data <= tree.data <= b.data
    return tree 

assert find_lca(bst, a=bst.left.left.left, b=bst.left.left.right).data == 3
assert find_lca(bst, a=bst.left.left.left, b=bst.right.left.right).data == 19
assert find_lca(bst, a=bst.left.left.left, b=bst.left.right.right.left).data == 7
assert find_lca(bst, a=bst.right.left, b=bst.right.right.right).data == 43
assert find_lca(bst, a=bst.right.left, b=bst.right.left.right).data == 23

### 14.8: Build a Minimum Height BST from a Sorted Array

In [21]:
def build_min_height_bst(A: List[int]) -> BinaryTreeNode:
    def helper(start: int, end: int) -> BinaryTreeNode:
        if start >= end:
            return None

        mid = (start + end) // 2

        return BinaryTreeNode(data=A[mid], 
                    left=helper(start, mid),
                    right=helper(mid+1, end)
                    )
    return helper(0, len(A))

A = [4, 10, 9, 11, 2, 9, 3, 1, 5, 2, 8]
A.sort()
root = build_min_height_bst(A)
trees.binary.traversal_levelorder(root)

5 2 9 2 4 9 11 1 3 8 10 

Time complexity is $O(n)$: each call does constant work and creates exactly one node.
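
As a sanity check on the "minimum height" claim, the sketch below restates the midpoint split with a hypothetical `Node` class and illustrative helpers `build` and `height` (none of these are part of the notebook's `data_structures` package). It compares the resulting height with $\lfloor \log_2 n \rfloor$, the smallest height possible for a binary tree with $n$ nodes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical minimal node, standing in for BinaryTreeNode."""
    data: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(A: List[int], start: int, end: int) -> Optional[Node]:
    """Midpoint split on the half-open range [start, end)."""
    if start >= end:
        return None
    mid = (start + end) // 2
    return Node(A[mid], build(A, start, mid), build(A, mid + 1, end))


def height(tree: Optional[Node]) -> int:
    """Height counted in edges; an empty tree has height -1."""
    if tree is None:
        return -1
    return 1 + max(height(tree.left), height(tree.right))


for n in range(1, 65):
    root = build(list(range(n)), 0, n)
    # n.bit_length() - 1 equals floor(log2(n)) for n >= 1.
    assert height(root) == n.bit_length() - 1

print(height(build(list(range(11)), 0, 11)))  # 3
```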