In [7]:
# setup
from IPython.core.display import display,HTML
display(HTML('<style>.prompt{width: 0px; min-width: 0px; visibility: collapse}</style>'))
display(HTML(open('rise.css').read()))

# imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set(style="whitegrid", font_scale=1.5, rc={'figure.figsize':(12, 6)})


  from IPython.core.display import display,HTML


# CMPS 2200
# Introduction to Algorithms

## Dynamic Programming [0-1 Knapsack]

### 0-1 Knapsack

Suppose there are $n$ objects, each with a *value* $v_i$ and *weight* $w_i$. You have a "knapsack" of capacity $W$ and want to fill it with a set of objects $X \subseteq [n]$ so that $w(X) \leq W$ and $v(X)$ is maximized. 

We saw previously that a greedy approach didn't really take leftover capacity into account. 

We can give a simple counterexample with 2 objects that have weights/values $(10, 5), (9.999, 3)$ with $W=5.$
    
The problem is that the greedy choice to maximize value/weight is incorrect because we can have leftover capacity. We really need to look at *all possible* choices of objects and their associated optimal solutions. 

<span style="color:red">**Question**:</span> How many solutions do you have if we use `Brute Force` algorithm for $n$ objects?

<br>



Let $OPT(S, W)$ be an optimal solution to the Knapsack problem for a set of objects $S$ and capacity $W$. We started with $n$ objects and capacity $W$ so we are interested in finding $OPT([n], W)$. 

<br>
<br>
<br>
<br>
Now, we can make the following simple observation: if object $n$ is in the optimal solution, then <br>
<br>
$~~~~~~~~OPT([n], W)= \color{blue}{\{n\} \cup OPT\Big([n-1], W-w(n)\Big)}$ 
<br>
<br>
If it isn't, then <br>
<br>
$~~~~~~~~OPT([n], W) = \color{red}{OPT([n-1], W)}$.



**Optimal Substructure for Knapsack**: For any set of objects $[n]$ and $W>0$, we have

$$
v(\text{OPT}([n], W)) = 
\max \Big\{
\color{blue}{v(n) + v(OPT([n-1], W - w(n)))} ,\;\;
\color{red}{v(OPT([n-1], W))}
\Big\}.
$$


In a way, this really isn't saying much. Put plainly we're just saying that the optimal solution either contains object $n$ or it doesn't.

For the example above, we have that:

|index |value|weight|
|------|------|-----|
|0     | 10    |5    |
|1     | 9.999     |3    |

$\begin{eqnarray*}
v(OPT([1], 5)) &=& \max\{v(1) + v(OPT(0, 2)), v(OPT(0, 5))\} \\
&=& \max\{9.999, 10\} \\
&=& 10. \\
\end{eqnarray*}$

Does this give us an algorithm? It is easy to write this in SPARC:

$$
\begin{aligned}
\mathit{knapsack}(n, W) = & \\
& \texttt{if }~ n = 0: \\
& \quad 0 \\[6pt]
& \texttt{if }~ n = 1: \\
& \quad \texttt{if }~ w(1) \le W: \\
& \qquad v(1) \\
& \quad \texttt{else:}~ 0 \\[6pt]
& \texttt{else:} \\
& \quad \texttt{if }~ w(n) \ge W: \\
& \qquad \mathit{knapsack}(n-1, W) \\
& \quad \texttt{else:} \\
& \qquad \max\{v(n) + \mathit{knapsack}(n-1, W - w(n)),~ \mathit{knapsack}(n-1, W)\}
\end{aligned}
$$




We can see that our optimal substructure recurrence depends both on the number of objects as well as their weights. The number of recursive calls doubles in every recursion.

<span style="color:red">**Question**:</span> **What is the work and span??**

Suppose all items have weight 1 - is there a glaring inefficiency we can fix? Let's consider the recursion tree:

<img src="figures\knapsack_recursion_tree.jpg" width="70%">

If we blindly recompute the redundant calls when they are encountered, then we will do $\Omega(2^n)$ work and $O(n)$ span even if we can take all the items.  

However, suppose that whenever we need to compute $v(OPT(i, w))$, we compute it once and save the result for later use (e.g., in a suitable data structure) -- this is called <span style="color:red">memoization</span>. 


Then, we no longer have a binary tree but rather a **directed acyclic graph** or **DAG**.


<img src="figures\knapsack_recursion_dag.jpg" width="70%">



<br>

> The number of nodes in this DAG will allow us to determine the work of this algorithm, and the longest path in the DAG will allow us to determine the span. 


When performing memoization, we can either proceed **top-down** or **bottom-up**:

- **top-down**: use recursion solution as usual, but maintain a hashmap or related data structure to quickly lookup solutions previously computed

- **bottom-up**: start with solutions to smallest problem instances, then proceed to larger instances. This is typically implemented by filling a table.

|index |value|weight|
|------|------|-----|
|0     | 10   |5    |
|1     | 6    | 3 |
|2     | 6    | 2 |


Optimal solution is 12 (second and third items) for capacity $W =5$.
<br>

#### Optimal Substructure

$$v\big(OPT([n], W)\big) = \max \{v(n) + v\big(OPT([n-1], W - w(n))\big), v\big(OPT([n-1], W)\big)\}$$







In [1]:

### objects = [(10,5), (6,3), (6,2)]

## Implementation 1
def recursive_knapsack(objects, i, W):
    v, w = objects[i]
    if (i == 0):
        if w <= W:
            return v
        else:
            return 0
    else:
        if w <= W:
            take = v + recursive_knapsack(objects, i-1, W-w)
            dont_take = recursive_knapsack(objects, i-1, W)
            return max(take, dont_take)
        elif W == 0:
            return 0
        else:
            # w>W
            return recursive_knapsack(objects, i-1, W)


In [25]:

## Implementation 2

def topdown_knapsack(objects, i, W, memo = {}):

    # Check if result already computed
    if (i, W) in memo:
        return memo[(i, W)]   
    
    # otherwise update memo
    else:

        v, w = objects[i]

        # Base case
        if i == 0:
            if w <= W:
                result = v
            else:
                result = 0
        else:
            if w <= W:
                take = v + topdown_knapsack(objects, i - 1, W - w, memo)
                dont_take = topdown_knapsack(objects, i - 1, W, memo)
                result = max(take, dont_take)
            else:
                result = topdown_knapsack(objects, i - 1, W, memo)

        # Store result in memo before returning
        memo[(i, W)] = result
    return result


### Bottom-Up

<br><br>**Consider this table**: number of items to include (rows) by weight (cols)



<center><img src="figures\bottom_up.jpeg" width="40%"></center>

| |0 |1 |2 |3 |4 |5 |
|-|-|-|-|-|-|-|
|0| | | | | | |
|1| | | | | | |
|2| | | | | | |
|3| | | | | | |


| |0 |1 |2 |3 |4 |5 |
|-|-|-|-|-|-|-|
|0|0 |0 |0 |0 |0 |0 |
|1|0 |0 |0 |0 |0 |10|
|2|0 |0 |0 |6 |6 |10|
|3|0 |0 |6 |6 |6 |**12** |

#### One More Example [Please try to solve it]


The capacity is 11, and there are 5 items with different values and weights.<br>
<img src="figures\0-1Quiz.png" width="24%">

In [4]:
## Implementation 3
def tabular_knapsack(objects, W):
    n = len(objects)
    # we'll rely on indices to also represent weights, so we'll index from 1...W 
    # in the weight dimension of the table
    OPT = [[0]*(W+1)]
    
    # initialize the first row of the table
    for w in range(W+1):
        if objects[0][1] <= w:
            OPT[0][w] = objects[0][0]
        else:
            OPT[0][w] = 0
    
    # use the optimal substructure property to compute increasingly larger solutions
    for i in range(1,n):
        OPT.append([0]*(W+1))
        v_i, w_i = objects[i]
        for w in range(W+1):
            if w_i <= w:
                OPT[i][w] = max(v_i + OPT[i-1][w-w_i], OPT[i-1][w])
            else:
                OPT[i][w] = OPT[i-1][w] 
#     print(OPT)
    return OPT[n-1][W]

In [18]:
## Evaluation Stage
## Case 1
W = 5
objects = [(12,5), (6,3), (6,2)]
n = len(objects)-1

print('Implementation 1:', recursive_knapsack(objects.copy(), n, W))


print('Implementation 2:', topdown_knapsack(objects, n, W))


print('Implementation 3:', tabular_knapsack(objects, W))



Implementation 1: 12
Implementation 2: 12
Implementation 3: 12


In [22]:

## Case 2
W = 5
objects = [(10, 5), (9.999, 3)]
n = len(objects)-1
print('Implementation 1:', recursive_knapsack(objects.copy(), n, W))


print('Implementation 2:', topdown_knapsack(objects, n, W))


print('Implementation 3:', tabular_knapsack(objects, W))


Implementation 1: 10
Implementation 2: 10
Implementation 3: 10


In [31]:
import time

W = 100
n = 500
objects = [(i, i) for i in range(1, n)]
n = len(objects) - 1

t0 = time.time()
print('Implementation 1:', recursive_knapsack(objects, n, W))
t1 = time.time()
print(f"Time of Recursive: {t1 - t0:.5f}s")

print('Implementation 2:', topdown_knapsack(objects, n, W))
t2 = time.time()
print(f"Time of Top Down: {t2 - t1:.5f}s")

print('Implementation 3:', tabular_knapsack(objects, W))
t3 = time.time()
print(f"Time of Botton Up: {t3 - t2:.5f}s")


Implementation 1: 100
Time of Recursive: 0.91412s
Implementation 2: 100
Time of Top Down: 0.00007s
Implementation 3: 100
Time of Botton Up: 0.00418s


### Elements of Dynamic Programming

This is what we call **dynamic programming**. The elements of a dynamic programming algorithm are:

- Optimal Substructure
- Recursion DAG

The correctness of the dynamic programming approach follows from the optimal substructure property (i.e., induction). If we can prove that the optimal substructure property holds, and that we compute a solution by correctly implementing this property then our solution is optimal.

As with divide and conquer algorithms, achieving a good work/span can be tricky. We can minimize redundant computation by memoizing solutions to all subproblems. This can be done *top-down* by saving the result of a recursive call the first time we encounter it. Or, we can compute the optimal substructure property *bottom-up* by starting with the base case(s) and working our way up.

Can we derive the number of nodes in the DAG using the optimal substructure property?


### Work and Span in Dynamic Programming

Since we memoize the solution to every distinct subproblem, the number of nodes in the DAG is equal to the number of distinct subproblems considered. 

The longest path in the DAG represents the span of our dynamic programming algorithm.


For example, 0-1 Knapsack Problem (n, W)
- There are at most $O(nW)$ nodes in this DAG, and the longest path is $O(n)$. Each node requires $O(1)$ work/span, so the work is $O(nW)$ and the span is $O(n)$. 


### Why "Dynamic Programming"?

The mathematician Richard Bellman coined the term ["dynamic programming"](https://en.wikipedia.org/wiki/Dynamic_programming) to describe the recursive approach we just showed. The optimal substructure property is sometimes referred to as a "Bellman equation." But why did he call it dynamic programming?

There is some [folklore](https://en.wikipedia.org/wiki/Dynamic_programming#History) around the exact reason. But it could possibly be because "dynamic" is a really dramatic way to describe the search weaving through the DAG. The term "programming" was used in the field of optimization in the 1950s to describe an optimization approach (e.g., linear programming, quadratic programming, mathematical programming). 
