In [277]:
# setup
from IPython.display import display, HTML
display(HTML('<style>.prompt{width: 0px; min-width: 0px; visibility: collapse}</style>'))
display(HTML(open('rise.css').read()))

# imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set(style="whitegrid", font_scale=1.5, rc={'figure.figsize':(12, 6)})


# CMPS 2200
# Introduction to Algorithms

## Dynamic Programming [Cont'd]

### Knapsack [Review & Cont'd]

Suppose there are $n$ objects, each with a *value* $v_i$ and *weight* $w_i$. You have a "knapsack" of capacity $W$ and want to fill it with a set of objects $X \subseteq [n]$ so that $w(X) \leq W$ and $v(X)$ is maximized. 

Let $OPT(S, W)$ be an optimal solution to the Knapsack problem for a set of objects $S$ and capacity $W$. We started with $n$ objects and capacity $W$ so we are interested in finding $OPT([n], W)$. 

Now, we can make the following simple observation: if object $n$ is in the optimal solution, then <br>
<br>
$~~~~~~~~OPT([n], W)=\{n\} \cup OPT([n-1], W-w(n)).$ 
<br>
<br>
If it isn't, then <br>
<br>
$~~~~~~~~OPT([n], W) = OPT([n-1], W)$.

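**Optimal Substructure for Knapsack**: For any set of objects $[n]$ and capacity $W>0$ such that object $n$ fits ($w(n) \leq W$), we have

$$v(OPT([n], W)) = \max \{v(n) + v(OPT([n-1], W - w(n))), \\~~~~~~v(OPT([n-1], W))\}.$$

If $w(n) > W$, object $n$ cannot be taken, so $v(OPT([n], W)) = v(OPT([n-1], W))$.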

We can see that our optimal substructure recurrence depends on both the number of objects and their weights. Each call makes up to two recursive calls, so the number of calls can double at every level of the recursion.

Suppose all items have weight 1. Is there a glaring inefficiency we can fix? Let's consider the recursion tree:

<center>
<img src="knapsack_recursion_tree.jpg" width="70%">
</center>

If we blindly recompute the redundant calls whenever they are encountered, the work $W(n)$ (not to be confused with the capacity $W$) and the span $S(n)$ satisfy:  
$$W(n) = 2W(n-1)+O(1)$$<br>
$$S(n) = S(n-1)+O(1)$$

To sum up, the naive recursion does $\Omega(2^n)$ work with $O(n)$ span, even in the simple case where every item fits and we can take them all.
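
To make the exponential growth concrete, here is a minimal sketch that counts the calls made by the take/don't-take recursion when every item has weight 1. The helper `count_naive_calls` is hypothetical (it is not part of the lecture code), and it models $n$ as the number of remaining items rather than the index of the last item.

```python
def count_naive_calls(n, W):
    """Count calls made by the naive recursion when all n items have weight 1.

    Hypothetical helper for illustration: n is the number of remaining
    items and W is the remaining capacity.
    """
    if n == 0 or W == 0:
        return 1  # base case: this call makes no further calls
    # this call, plus the "take" branch and the "don't take" branch
    return 1 + count_naive_calls(n - 1, W - 1) + count_naive_calls(n - 1, W)

# With W = n every item fits, so no branch is ever cut short.
for n in (5, 10, 15):
    print(n, count_naive_calls(n, W=n))  # 63, 2047, 65535, i.e. 2^(n+1) - 1
```

The count roughly doubles each time $n$ grows by one, which matches the recurrence $W(n) = 2W(n-1) + O(1)$.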


### Directed Acyclic Graph (DAG)

When we need $v(OPT(i, w))$ (shorthand for $v(OPT([i], w))$, the best value using the first $i$ objects with capacity $w$), we compute it once and save the result for later use (e.g., in a suitable data structure). This is called *memoization*.

<center>
<img src="knapsack_recursion_dag.jpg" width="70%">
</center>

`Work`: the number of nodes in the DAG equals the number of distinct subproblems considered, since we memoize the solution to every distinct subproblem. The total work is the sum of the work done at each node.

`Span`: the longest path in the DAG represents the span of our dynamic programming algorithm.

For example, consider the 0-1 Knapsack problem with $n$ objects and capacity $W$:
- How much work does each node need?
- What are the work and span?


There are at most $O(nW)$ nodes in this DAG, and the longest path is $O(n)$. Each node requires $O(1)$ work/span, so the work is $O(nW)$ and the span is $O(n)$. 
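
As a rough check of the node count, the sketch below walks the recursion once per distinct subproblem and counts how many $(i, \text{capacity})$ pairs are reached. The function `distinct_subproblems` is a hypothetical helper, and the unit-weight instance is the same simplifying assumption used for the recursion tree.

```python
def distinct_subproblems(weights, W):
    """Count the distinct (i, capacity) subproblems reachable from (n, W).

    Hypothetical helper: i is the number of objects still under
    consideration, and weights[i - 1] is the weight of the i-th object.
    """
    seen = set()

    def visit(i, cap):
        if (i, cap) in seen:
            return  # already a node in the DAG, do not expand it again
        seen.add((i, cap))
        if i == 0:
            return
        visit(i - 1, cap)  # "don't take" edge
        if weights[i - 1] <= cap:
            visit(i - 1, cap - weights[i - 1])  # "take" edge

    visit(len(weights), W)
    return len(seen)

n = 10
print(distinct_subproblems([1] * n, W=n))  # 66 nodes
print((n + 1) * (n + 1))                   # 121, the (n+1)(W+1) upper bound
```

For this instance the DAG has 66 nodes, compared with the 2047 calls of the naive recursion on the same input.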

When performing `memoization`, we can either proceed **top-down** or **bottom-up**:

- **top-down**: use the recursive solution as usual, but maintain a hashmap or related data structure to quickly look up previously computed solutions

- **bottom-up**: start with solutions to smallest problem instances, then proceed to larger instances. This is typically implemented by filling a table.
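
The top-down variant can be sketched as follows. This is one possible illustration, not the lecture's implementation: it uses `functools.lru_cache` from the standard library as the lookup structure, and the names `memoized_knapsack` and `opt` are assumptions introduced here.

```python
from functools import lru_cache

def memoized_knapsack(objects, W):
    """Top-down 0-1 knapsack; objects is a list of (value, weight) pairs."""

    @lru_cache(maxsize=None)
    def opt(i, cap):
        # best value using the first i objects with remaining capacity cap
        if i == 0:
            return 0
        v, w = objects[i - 1]
        best = opt(i - 1, cap)  # don't take object i-1
        if w <= cap:
            best = max(best, v + opt(i - 1, cap - w))  # take object i-1
        return best

    return opt(len(objects), W)

print(memoized_knapsack([(10, 5), (6, 3), (6, 2)], 5))  # 12
```

Each distinct `(i, cap)` pair is computed once and then served from the cache. Because the recursion is still $n$ levels deep, a large $n$ can hit Python's recursion limit, which is one practical reason to prefer the bottom-up table.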

|index |value|weight|
|------|------|-----|
|0     | 10   |5    |
|1     | 6    | 3 |
|2     | 6    | 2 |


The optimal value is 12 (the second and third items, indices 1 and 2) when the capacity is $W = 5$.
<br>

Consider this table: each row gives the number of items available (the first $i$ items), each column gives the capacity, and each entry is the best value achievable.


|items / capacity|0 |1 |2 |3 |4 |5 |
|-|-|-|-|-|-|-|
|0|0 |0 |0 |0 |0 |0 |
|1|0 |0 |0 |0 |0 |10|
|2|0 |0 |0 |6 |6 |10|
|3|0 |0 |6 |6 |6 |**12** |
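
The table above can be reproduced with a short bottom-up sketch. The function name `knapsack_table` is an assumption for illustration. Unlike a table whose first row already accounts for the first object, this version keeps an explicit row 0 for "no items" so that the rows line up with the table shown.

```python
def knapsack_table(objects, W):
    """Return the full table: table[i][w] is the best value using the
    first i objects with capacity w. objects holds (value, weight) pairs."""
    n = len(objects)
    table = [[0] * (W + 1) for _ in range(n + 1)]  # row 0: no items, value 0
    for i in range(1, n + 1):
        v_i, w_i = objects[i - 1]
        for w in range(W + 1):
            table[i][w] = table[i - 1][w]  # skip object i-1
            if w_i <= w:
                # or take it, if it fits in capacity w
                table[i][w] = max(table[i][w], v_i + table[i - 1][w - w_i])
    return table

for row in knapsack_table([(10, 5), (6, 3), (6, 2)], 5):
    print(row)
# [0, 0, 0, 0, 0, 0]
# [0, 0, 0, 0, 0, 10]
# [0, 0, 0, 6, 6, 10]
# [0, 0, 6, 6, 6, 12]
```

Every entry depends only on the row directly above it, which is why filling the table row by row is a valid order.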


<br>

#### One More Example [Please try to solve it]


The capacity is 11, and there are 5 items with different values and weights.<br>
<img src="0-1Quiz.png" width="24%">



In [5]:
import random
import time

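## Implementation 1
def recursive_knapsack(objects, n, W):
    """Best value using objects[0..n], each a (value, weight) pair, with capacity W."""
    v, w = objects[n]
    if n == 0:
        # base case: the single remaining object is taken only if it fits
        return v if w <= W else 0
    if w <= W:
        # object n fits: compare taking it against leaving it out
        take = v + recursive_knapsack(objects, n-1, W-w)
        dont_take = recursive_knapsack(objects, n-1, W)
        return max(take, dont_take)
    if W == 0:
        # no capacity left and object n does not fit: stop early
        return 0
    # w > W: object n cannot be taken
    return recursive_knapsack(objects, n-1, W)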

## Implementation 2
def tabular_knapsack(objects, W):
    n = len(objects)
    # column w of the table holds the best value for capacity w,
    # so the columns run over 0...W
    OPT = [[0]*(W+1)]

    # initialize the first row of the table (only object 0 is available)
    for w in range(W+1):
        if (objects[0][1] <= w):
            OPT[0][w] = objects[0][0]
        else:
            OPT[0][w] = 0

    # use the optimal substructure property to compute increasingly larger solutions
    for i in range(1,n):
        OPT.append([0]*(W+1))
        v_i, w_i = objects[i]
        for w in range(W+1):
            if (w_i <= w):
                OPT[i][w] = max(v_i + OPT[i-1][w-w_i], OPT[i-1][w])
            else:
                OPT[i][w] = OPT[i-1][w]
    return OPT[n-1][W]

In [2]:
## Evaluation Stage
## Case 1
W = 5
objects = [(10,5), (6,3), (6,2)]
n = len(objects)-1  # index of the last object

print('Implementation 1:', recursive_knapsack(objects.copy(), n, W))

print('Implementation 2:', tabular_knapsack(objects, W))


Implementation 1: 12
Implementation 2: 12


In [3]:

## Case 2
W = 5
objects = [(10, 5), (9.999, 3)]
n = len(objects)-1
print('Implementation 1:', recursive_knapsack(objects.copy(), n, W))
print('Implementation 2:', tabular_knapsack(objects.copy(), W))

Implementation 1: 10
Implementation 2: 10


In [9]:
## Case 3
W = 100
n = 1000
objects = [(i, i) for i in range(1, n)]
n = len(objects)-1
t0 = time.time()
print('Implementation 1:', recursive_knapsack(objects, n, W))

t1 = time.time()
print(t1-t0)

print('Implementation 2:', tabular_knapsack(objects, W))

t2 = time.time()
print(t2-t1)

Implementation 1: 100
1.2302701473236084
Implementation 2: 100
0.012791633605957031


### Elements of Dynamic Programming

The elements of a dynamic programming algorithm are:

- Optimal Substructure
- Recursion DAG

The `correctness` of the dynamic programming approach follows from the optimal substructure property, by induction on the size of the subproblem. If we can prove that the optimal substructure property holds, and that our algorithm correctly implements it, then the solution we compute is optimal.
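
A proof is what establishes correctness, but an empirical sanity check can catch implementation mistakes. The sketch below compares a bottom-up table against exhaustive enumeration on small random instances. Both function names are hypothetical, and the instance sizes and value ranges are arbitrary choices made for illustration.

```python
from itertools import combinations
import random

def brute_force_knapsack(objects, W):
    """Try every subset of (value, weight) pairs; feasible only for small n."""
    best = 0
    for k in range(len(objects) + 1):
        for subset in combinations(objects, k):
            if sum(w for _, w in subset) <= W:
                best = max(best, sum(v for v, _ in subset))
    return best

def dp_knapsack(objects, W):
    """Bottom-up table built from the optimal substructure recurrence."""
    opt = [[0] * (W + 1) for _ in range(len(objects) + 1)]
    for i, (v_i, w_i) in enumerate(objects, start=1):
        for w in range(W + 1):
            opt[i][w] = opt[i - 1][w]  # skip object i-1
            if w_i <= w:
                opt[i][w] = max(opt[i][w], v_i + opt[i - 1][w - w_i])
    return opt[len(objects)][W]

rng = random.Random(0)  # fixed seed so the check is repeatable
for _ in range(200):
    objects = [(rng.randint(1, 20), rng.randint(1, 10))
               for _ in range(rng.randint(1, 8))]
    W = rng.randint(0, 25)
    assert dp_knapsack(objects, W) == brute_force_knapsack(objects, W)
print("all random instances agree")
```

Agreement on random instances does not replace the inductive argument, but a mismatch would point to a bug in how the recurrence was implemented.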

