In [2]:
# setup
from IPython.display import display, HTML
display(HTML('<style>.prompt{width: 0px; min-width: 0px; visibility: collapse}</style>'))
display(HTML(open('rise.css').read()))

# imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set(style="whitegrid", font_scale=1.5, rc={'figure.figsize':(12, 6)})


# CMPS 2200
# Introduction to Algorithms

## Optimal Binary Search Trees


### Binary Search Tree

For a set of keys $S$ ($|S|=n$), a *binary search tree* organizes $S$ as follows. Each element $x$ of $S$ is placed in the tree so that all elements $y < x$ are in the left subtree of $x$ and all elements $z > x$ are in the right subtree. For any binary search tree $T$, let $d(x, T)$ be the depth of an element $x\in S$ in $T$, where the root has depth $1$.

<img src="BST.png" width="70%">

A **balanced** binary search tree has the property that the maximum depth is $O(\log n)$. Several self-balancing schemes (for example, AVL trees and red-black trees) maintain this property so that search, insertion, and deletion each run in $O(\log n)$ time.

Example: Let $S = \{1, 2, 3, 4, 5, 6, 7\}$. Then a balanced binary search tree $T$ would have the following structure:

<img src="balanced_bst.jpg" width="30%">

Suppose we care only about the cost of retrieving items from $T$, measured as the depth of the retrieved item. In the worst case, retrieval has cost $3$.
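As a minimal sketch of the balanced example (a one-off construction from sorted keys, not one of the dynamic balancing approaches mentioned above), choosing the middle key as the root at every level reproduces a tree of depth $3$ on seven keys. The nested-tuple representation `(key, left, right)` and the helper names `build_balanced` and `max_depth` are illustrative assumptions, not part of the course code.

```python
def build_balanced(keys):
    """Build a BST from sorted keys by recursively choosing the middle key as root."""
    if not keys:
        return None
    mid = len(keys) // 2
    return (keys[mid], build_balanced(keys[:mid]), build_balanced(keys[mid + 1:]))


def max_depth(tree):
    """Depth of the deepest node, counting the root as depth 1."""
    if tree is None:
        return 0
    _, left, right = tree
    return 1 + max(max_depth(left), max_depth(right))


balanced_example = build_balanced([1, 2, 3, 4, 5, 6, 7])
print(balanced_example[0])          # 4 (root key)
print(max_depth(balanced_example))  # 3 (worst-case retrieval cost)
```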



### Optimal Binary Search Tree


But what if we knew the frequency of retrieval $f(x)$ for each $x\in S$? We previously used this kind of additional knowledge to greatly improve compression. Given frequencies, a binary search tree $T$ has total search cost $C(T) = \sum_{x\in S} f(x)\cdot d(x, T).$

Do frequencies affect the best retrieval cost we can achieve? 

Example: Let $f(1) = 1000, f(2) = 1, f(3) = 1, f(4) = 1, f(5) = 1, f(6) = 1, f(7) = 1.$ Then the balanced tree has cost $3\cdot 1000 + 1 + 2\cdot 2 + 3\cdot 3 = 3014$, because the most frequent key, $1$, sits at depth $3$.

<img src="balanced_bst.jpg" width="30%">
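The cost $C(T)$ can be evaluated directly from its definition. The sketch below assumes a tree encoded as nested tuples `(key, left, right)` with `None` for an empty subtree and the root at depth $1$; the names `search_cost`, `leaf`, and `example_freq` are hypothetical helpers introduced only for this illustration.

```python
def search_cost(tree, freq, depth=1):
    """Sum of f(x) * d(x, T) over all keys, with the root at depth 1."""
    if tree is None:
        return 0
    key, left, right = tree
    return (freq[key] * depth
            + search_cost(left, freq, depth + 1)
            + search_cost(right, freq, depth + 1))


def leaf(key):
    """A node with no children."""
    return (key, None, None)


# Balanced tree on {1, ..., 7} with 4 at the root
balanced_tree = (4, (2, leaf(1), leaf(3)), (6, leaf(5), leaf(7)))
example_freq = {1: 1000, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1}

print(search_cost(balanced_tree, example_freq))  # 3014
```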

<br>
<br>
<br>
Consider the following tree:

<img src="frequency_bst.jpg" width="30%">

The cost is $2000 + 1 + 2 + 6 + 8 = 2017$, which is much better!

With the given frequencies, does this binary search tree minimize retrieval cost?



No! Here is a better tree with cost $1000 + 2 + 3 + 8 + 10 = 1023$:

<img src="better_frequency_bst.jpg" width="30%">


<span style="color:red">**Question**:</span> Is this optimal? Can we do better?

Given a set $S$ of comparable elements with given frequencies, the **Optimal Binary Search Tree** problem asks us to find the binary search tree $T$ for $S$ that minimizes 

$$C(T) = \sum_{x\in S} f(x)\cdot d(x, T).$$

Would a greedy algorithm work? What if we simply made the most frequent item the root?

This approach can fail, for example when the most frequent element is also the largest and the others are nearly as frequent. Consider $S=\{1, 2, 3\}$ where $f(1) = 9, f(2)=9, f(3) = 10$.

<img src = "greedy_counterexample.jpg" width="40%">


A greedily constructed tree has $3$ as the root and cost $10+18+27 = 55.$ A balanced tree has $2$ as the root and cost $9 + 18 + 20 = 47.$
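The two costs can be reproduced with a short sketch. It assumes the greedy rule is applied recursively (each subrange is rooted at its most frequent key) and that ties go to the smaller key; neither detail is specified above, and `greedy_cost` is a hypothetical helper name.

```python
def greedy_cost(freq, lo=0, hi=None, depth=1):
    """Cost of the tree built by always rooting a range at its most frequent key."""
    if hi is None:
        hi = len(freq) - 1
    if lo > hi:
        return 0
    # max() returns the first maximum, so ties go to the smaller key
    r = max(range(lo, hi + 1), key=lambda i: freq[i])
    return (freq[r] * depth
            + greedy_cost(freq, lo, r - 1, depth + 1)
            + greedy_cost(freq, r + 1, hi, depth + 1))


counter_freq = [9, 9, 10]         # f(1), f(2), f(3)
print(greedy_cost(counter_freq))  # 55

# Balanced alternative: key 2 at depth 1, keys 1 and 3 at depth 2
balanced_cost = 9 * 2 + 9 * 1 + 10 * 2
print(balanced_cost)              # 47
```

In this instance the tie between keys $1$ and $2$ does not matter: either order under root $3$ gives cost $55$.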


### Intuition -> Example

> Define $B(n)$ as the number of distinct BSTs on $n$ nodes

<img src = "figures/bst_exm.png" width="75%">




<span style="color:red">**Question**:</span> How can we extend to $n$ nodes?


To count $B(n)$, note that any of the $n$ nodes can be the root; call the chosen root $r$:

<img src = "figures/bst_intuition.png" width="80%">




$$B(n) = \sum\limits_{i=0}^{n-1} B(i)\cdot B(n-i-1),$$ where $i$ counts the nodes in the left subtree and $B(0) = 1$, so that $B(1) = 1$ and $B(2) = 2$.



<br><br><br>
<br><br><br>
<br><br><br>
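$$B(n) = \dfrac{(2n)!}{(n + 1)! \cdot n!}$$

This is the $n$-th Catalan number, which grows exponentially in $n$ (roughly as $4^n / n^{3/2}$), so enumerating every BST to find the cheapest one is infeasible.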
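The recurrence and the closed form for $B(n)$ can be checked against each other numerically. This is a small sketch; `count_bsts` and `closed_form` are hypothetical helper names, and the closed form is rewritten as $\binom{2n}{n}/(n+1)$, which equals $(2n)!/((n+1)!\, n!)$.

```python
from math import comb


def count_bsts(n):
    """Number of BST shapes on n keys via B(n) = sum_i B(i) * B(n - i - 1)."""
    B = [1] + [0] * n  # B[0] = 1: the empty tree
    for m in range(1, n + 1):
        # i keys go to the left subtree, m - i - 1 to the right
        B[m] = sum(B[i] * B[m - i - 1] for i in range(m))
    return B[n]


def closed_form(n):
    """(2n)! / ((n + 1)! * n!), written with a binomial coefficient."""
    return comb(2 * n, n) // (n + 1)


print([count_bsts(n) for n in range(8)])  # [1, 1, 2, 5, 14, 42, 132, 429]
```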

### A Dynamic Programming Approach

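Suppose we have the elements of $S$ in sorted order. Let $S_{i,j}$ denote the elements from rank $i$ to rank $j$ inclusive, and let $S_r$ denote the element of rank $r$. For any binary search tree $T$ with root $S_r$, we can express its cost in terms of the left and right subtrees $T_L$ and $T_R$. The root has depth $1$, and every other element is one level deeper in $T$ than in its own subtree: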

$$\begin{eqnarray*}
C(T) &=& \sum_{x\in T} f(x)\cdot d(x, T) \\
&=& f(S_r) + \sum_{x\in T_L} f(x) \cdot (d(x, T_L) + 1) + \sum_{x\in T_R} f(x) \cdot (d(x, T_R) + 1) \\
&=& f(S_r) + \sum_{x\in T_L} f(x) + \sum_{x\in T_L} f(x) \cdot d(x, T_L) + \sum_{x\in T_R} f(x)+ \sum_{x\in T_R} f(x) \cdot d(x, T_R) \\
&=& \sum_{x\in T} f(x) + \sum_{x\in T_L} f(x) \cdot d(x, T_L) + \sum_{x\in T_R} f(x) \cdot d(x, T_R) \\
&=& \sum_{x\in T} f(x) + C(T_L) + C(T_R) \\
\end{eqnarray*}$$
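The identity $C(T) = \sum_{x\in T} f(x) + C(T_L) + C(T_R)$ can be sanity-checked on a small tree. The sketch below assumes trees encoded as nested tuples `(key, left, right)`; `depth_cost`, `total_freq`, and `decomposed_cost` are hypothetical helper names.

```python
def depth_cost(tree, f, depth=1):
    """C(T) from the definition: sum of f(x) * d(x, T), root at depth 1."""
    if tree is None:
        return 0
    key, left, right = tree
    return (f[key] * depth
            + depth_cost(left, f, depth + 1)
            + depth_cost(right, f, depth + 1))


def total_freq(tree, f):
    """Sum of f(x) over every key in the tree."""
    if tree is None:
        return 0
    key, left, right = tree
    return f[key] + total_freq(left, f) + total_freq(right, f)


def decomposed_cost(tree, f):
    """C(T) from the last line of the derivation: sum f(x) + C(T_L) + C(T_R)."""
    if tree is None:
        return 0
    _, left, right = tree
    return total_freq(tree, f) + decomposed_cost(left, f) + decomposed_cost(right, f)


# Tree rooted at 2 with children 1 and 3, using f(1) = 9, f(2) = 9, f(3) = 10
demo_tree = (2, (1, None, None), (3, None, None))
demo_f = {1: 9, 2: 9, 3: 10}
print(depth_cost(demo_tree, demo_f), decomposed_cost(demo_tree, demo_f))  # 47 47
```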

Let $\mathit{OBST}(S)$ be the cost of an optimal binary search tree for $S$ with given frequencies. Some element $S_r$ must be the root of an optimal binary search tree. Moreover, its left and right subtrees must themselves be optimal binary search trees for $S_{0,r-1}$ and $S_{r+1, n-1}$, respectively: by the decomposition above, replacing either subtree with a cheaper one would lower $C(T)$.

But which element $r$ should be the root?



It should be the element that yields left and right subtrees so that the overall tree $T$ minimizes the total cost $C(T)$. 

**Optimal Substructure for Optimal Binary Search Trees**: For a set $S$ of $n$ keys with ranks $0, \ldots, n-1$, 


$$
\mathit{OBST}(S) = 
\begin{cases}
0, & \text{if}~~ |S| = 0\\
\sum_{x \in S} f(x) + \min_{0 \leq i \leq n-1} \left(\mathit{OBST}(S_{0,i-1})+\mathit{OBST}(S_{i+1,n-1})\right), & \text{otherwise}\\
\end{cases}
$$

where $S_{i,j}$ is empty whenever $j < i$.

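Yet again we see that, evaluated naively, the recursion tree for this recurrence grows exponentially. Fortunately, every subproblem is a contiguous range $S_{i,j}$, so there are only $O(n^2)$ distinct subproblems, and each can be evaluated in $O(n)$ time once the smaller ranges are solved, for $O(n^3)$ work in total.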
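One way to exploit the small number of distinct subproblems is to cache the recurrence top-down. The following is a minimal sketch rather than the course implementation: `obst_memo` is a hypothetical name, it takes only the frequency list (in sorted key order), and it uses prefix sums so that each range total is available in constant time.

```python
from functools import lru_cache
from itertools import accumulate


def obst_memo(freq):
    """Optimal BST cost via the recurrence, computing each range (i, j) once."""
    n = len(freq)
    prefix = [0] + list(accumulate(freq))  # prefix[k] = freq[0] + ... + freq[k - 1]

    @lru_cache(maxsize=None)
    def best(i, j):
        if j < i:
            return 0  # empty range
        range_sum = prefix[j + 1] - prefix[i]
        # Try every rank r in i..j as the root of this range
        return range_sum + min(best(i, r - 1) + best(r + 1, j) for r in range(i, j + 1))

    return best(0, n - 1)


print(obst_memo([9, 9, 10]))  # 47
```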


In [1]:
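def obst_recursive(keys, freq):
    """Return the optimal BST cost for sorted keys by plain recursion (exponential time)."""
    def obst_recursive_helper(freq, i, j):
        # Cost of an optimal BST over the keys of rank i..j inclusive.

        # Base cases: empty range, or a single key at depth 1
        if j < i:
            return 0
        if j == i:
            return freq[i]

        fmin = float('inf')

        # Try every key in the range as the root
        for r in range(i, j + 1):
            cost = (obst_recursive_helper(freq, i, r - 1) + obst_recursive_helper(freq, r + 1, j))
            if cost < fmin:
                fmin = cost

        # Each key in the range contributes its frequency once at this level
        current_cost = sum(freq[i:j+1])

        return fmin + current_cost

    return obst_recursive_helper(freq, 0, len(keys) - 1)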
 

In [3]:
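def obst_dp(keys, freq):
    """Return the optimal BST cost for sorted keys by bottom-up dynamic programming."""
    n = len(keys)
    if n == 0:
        return 0  # empty key set, matching obst_recursive

    # cost[i][j] = optimal cost for the keys of rank i..j inclusive
    cost = [[0] * n for _ in range(n)]

    for i in range(n):
        cost[i][i] = freq[i]

    # Fill in ranges by increasing length l
    for l in range(2, n + 1):
        for i in range(n - l + 1):
            j = i + l - 1
            # Total frequency of the range; it does not depend on the root k
            range_sum = sum(freq[i:j + 1])
            cost[i][j] = float('inf')
            for k in range(i, j + 1):
                current_cost = range_sum
                if k > i:
                    current_cost += cost[i][k - 1]
                if k < j:
                    current_cost += cost[k + 1][j]
                cost[i][j] = min(cost[i][j], current_cost)

    return cost[0][n - 1]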



In [4]:
# Example usage:
keys = [1, 2, 3]
freq = [9, 9, 10]


print(f"The cost of the OBST (Recursive) is: {obst_recursive(keys, freq)}")

print(f"The cost of the OBST (DP) is: {obst_dp(keys, freq)}")


The cost of the OBST (Recursive) is: 47
The cost of the OBST (DP) is: 47
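The functions above return only the optimal cost. As a hedged sketch of one way to also recover a tree, the variant below records the best root for every range and rebuilds the tree afterwards. The name `obst_with_tree`, the half-open range indexing, the nested-tuple output `(key, left, right)`, and the tie-breaking toward the smaller root are all assumptions made for this illustration.

```python
def obst_with_tree(keys, freq):
    """Return (optimal cost, tree), where tree is (key, left, right) or None."""
    n = len(keys)
    # cost[i][j] and root[i][j] cover ranks i..j-1 (half-open), so empty ranges cost 0
    cost = [[0] * (n + 1) for _ in range(n + 1)]
    root = [[None] * (n + 1) for _ in range(n + 1)]

    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length
            range_sum = sum(freq[i:j])
            # Pick the root minimizing left + right cost; ties go to the smaller rank
            sub_cost, sub_root = min(
                (cost[i][r] + cost[r + 1][j], r) for r in range(i, j)
            )
            cost[i][j] = range_sum + sub_cost
            root[i][j] = sub_root

    def build(i, j):
        if i >= j:
            return None
        r = root[i][j]
        return (keys[r], build(i, r), build(r + 1, j))

    return cost[0][n], build(0, n)


best_cost, best_tree = obst_with_tree([1, 2, 3, 4, 5, 6, 7], [1000, 1, 1, 1, 1, 1, 1])
print(best_cost)     # 1020
print(best_tree[0])  # 1
```

For the seven-key example from earlier, this reports a cost of $1020$ with key $1$ at the root, slightly below the $1023$ of the last tree shown.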
