<a href="https://colab.research.google.com/github/davidludington/comp363assignments/blob/main/Dynamic_Programming_Assignment_David_Ludington.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Dynamic Programming

Dynamic Programming is often seen as challenging because of its mathematical abstraction. And yet it's an awesome and very useful algorithmic strategy, worth exploring and appreciating.

Consider a collection (a set, if you like) of $n$ items. Each item has a value $v_i$ and a weight $w_i$. Here, $1 \leq i \leq n$. The contrived example used in most textbooks is the collection of artifacts in a museum. Consider all the combinations of museum artifacts whose total weight does not exceed a limit, $C$. Which of these combinations has the most value?

Of course, no one writes algorithms because they want to plan a museum heist. The problem exemplified above, in terms of questionable morality, belongs to a broader group of *optimization* problems under constraints. In the case of the museum heist, we try to maximize the value of what we can take within the weight limit of what we can carry.


Without a weight limit, we can take everything. But when we impose a weight limit, $C$, we must consider only the combinations of items that comply with the limit, and among those find the one that maximizes value.

Let's define two sets here: $X = \{x_1, x_2, \ldots, x_n\}$
are the items in the museum. Each item $x_i$ in this set has a value $v_i$ and a weight $w_i$.

And let $T$ be a subset of $X$ whose elements $x_k\in T$ fit within our weight restriction and whose total value is the maximum among all combinations of items that fit the weight restriction. For example, consider a small museum with four artifacts and therefore $X=\{x_1, x_2, x_3, x_4\}$. There are $2^4$ combinations of items we can take from this collection, including two trivial choices: nothing and everything.

\begin{align}
\hline
& \text{combination} && \text{value} && \text{weight} \\ \hline
& -                  && 0                && 0   \\ \hline
& x_1                && v_1              && w_1 \\ \hline
& x_2                && v_2              && w_2 \\ \hline
& x_3                && v_3              && w_3 \\ \hline
& x_4                && v_4              && w_4 \\ \hline
& x_1, x_2           && v_1+v_2          && w_1 + w_2 \\ \hline
& x_1, x_3           && v_1+v_3          && w_1 + w_3 \\ \hline
& x_1, x_4           && v_1+v_4          && w_1 + w_4 \\ \hline
& x_2, x_3           && v_2+v_3          && w_2 + w_3 \\ \hline
& x_2, x_4           && v_2+v_4          && w_2 + w_4 \\ \hline
& x_3, x_4           && v_3+v_4          && w_3 + w_4 \\ \hline
& x_1, x_2, x_3      && v_1+v_2+v_3      && w_1 + w_2 + w_3 \\ \hline
& x_1, x_3, x_4      && v_1+v_3+v_4      && w_1 + w_3 + w_4 \\ \hline
& x_1, x_2, x_4      && v_1+v_2+v_4      && w_1 + w_2 + w_4 \\ \hline
& x_2, x_3, x_4      && v_2+v_3+v_4      && w_2 + w_3 + w_4 \\ \hline
& x_1, x_2, x_3, x_4 && v_1+v_2+v_3+v_4  && w_1 + w_2 + w_3 + w_4 \\ \hline
\end{align}

Given the combinations above, we want to find the one whose value is maximum while its weight does not exceed some limit $C$. Easy, right? Just put together a brute-force program like:

```python
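from itertools import combinations

def best_combination(v, w, C):
    """Brute force over 0-indexed lists v (values) and w (weights)."""
    v_max = 0       # initialize max value
    best = ()       # initialize winning combo (tuple of item indices)
    n = len(v)
    for size in range(n + 1):                       # combinations of every size
        for combo in combinations(range(n), size):  # each combination of that size
            if sum(w[i] for i in combo) <= C:       # within the weight limit?
                value = sum(v[i] for i in combo)
                if value > v_max:
                    v_max = value
                    best = combo
    return best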
```

This approach will work for up to a few items. There are $2^n$ combinations among $n$ items. As the number of items grows, the program takes longer to consider all $2^n$ combinations. For $n>30$ we'll have to wait minutes for an answer, for $n>40$ the program will take days, and for $n>100$ the program will take centuries. There are over 20,000 artifacts at the Art Institute of Chicago. Enumerating all the combinations among them will take $2^{20,000}$ steps, or about $10^{6,000}$. Even if our program computes 1 trillion combinations per second, it will take $10^{5,988}$ seconds to complete. That's about $10^{5,980}$ years. For comparison, the age of the universe is about $10^{10}$ years. We cannot simply compute all the combinations and find those that fit our criteria for maximum value within a weight limit.
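The orders of magnitude are easy to reproduce. The following sketch assumes the same rate of one trillion combinations per second and a 365.25-day year, and it works with base-10 logarithms so that no huge integers are needed. The exponents it prints come out slightly larger than the rounded figures used in the text.

```python
import math

n = 20_000                             # number of artifacts quoted in the text
rate = 1e12                            # assumed: combinations examined per second
seconds_per_year = 365.25 * 24 * 3600  # assumed length of a year

log10_combinations = n * math.log10(2)                      # log10 of 2**n
log10_seconds = log10_combinations - math.log10(rate)
log10_years = log10_seconds - math.log10(seconds_per_year)

print(f"combinations ~ 10^{log10_combinations:.0f}")
print(f"seconds      ~ 10^{log10_seconds:.0f}")
print(f"years        ~ 10^{log10_years:.0f}")
```

Either way, the conclusion is unchanged: the running time dwarfs the age of the universe by thousands of orders of magnitude.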

A better alternative is to see if the problem possesses *optimal substructure,* i.e., if its solution can be built from the solution of an immediately smaller problem. (That's not a formal definition for optimal substructure, but sufficient for now).

We begin by assuming that somehow we have the optimal solution, i.e., a subset $T\subseteq X$ with the museum items we can carry, within our weight limit $C$, and whose total value is the highest among all possible combinations of items at or below the weight limit.

We can make the following obvious statement about $T$ and the last item $x_n$ of $X$:

**Either $x_n \in T$ or $x_n \notin T$**

(We call the statement above a *tautology,* because "duh" does not sound sufficiently official).

Let's look at the two possibilities.



## $x_n \notin T$
In this case, $T$ is a subset of $X-\{x_n\}$ and an optimal solution for that subset. Meaning that of all combinations of items in $\{x_1, x_2, \ldots, x_{\color{brown}{ n-1}}\}$, the items in $T$ have the highest value at or below the weight limit $C$.



## $x_n \in T$
This is possible only when the weight of item $x_n$ is within our restrictions, i.e., $w_n \leq C$. Because the last item $x_n$ is part of the optimal solution, we **cannot** claim that $T$ is also an optimal solution for the set $X-\{x_n\}$.

**Perhaps** $T-\{x_n\}$ is an optimal solution for the smaller problem $X-\{x_n\}$. The optimal solution for the full problem (the entire set $X$) is still $T$ and it includes the last item $x_n$, which means that the capacity left for the remaining items is reduced by the weight of that item, to $C-w_n$.

This scenario is only possible when there is room for the $x_n$ item, i.e., when $w_n \leq C$. If we decide to take $x_n$ and add it to $T$, the subset of items we can remove from the museum, the available capacity (for additional items from the smaller set $\{x_1, x_2, \ldots, x_{\color{brown}{n-1}}\}$) is $C-w_n$.






## Decision time

How do we decide if we want to take the last item $x_n$ or not? By looking at the value of the items in set $T$ with, or without item $x_n$.

If item $x_n$ is **not** in $T$, then the total value we can remove from the collection $\{x_1, x_2, \ldots, x_n\}$ is the same as the total value we can remove from the slightly smaller collection  $\{x_1, x_2, \ldots, x_{\color{brown}{n-1}}\}$. When must the last item $x_n$ be excluded from $T$? When there is no room for it, i.e., when $w_n > C$.

If there is room for that item, $w_n \leq C$, then do we take it or not? If we do not take it, basically we have the previous case: $T$ is the optimal solution for the smaller collection  $\{x_1, x_2, \ldots, x_{\color{brown}{n-1}}\}$. But if we take it, we reduce our available capacity to $C-w_n$. This reframes the next problem: find the optimal solution for the smaller collection  $\{x_1, x_2, \ldots, x_{\color{brown}{n-1}}\}$ but with a lower weight limit $C-w_n$.

The reframing above suggests recursion, but in a smart way. Now we have to find the optimal solution for a collection  $\{x_1, x_2, \ldots, x_{\color{brown}{n-1}}\}$ and weight limit $C-w_n$.

Suppose, someone hands us the solution *again:* a set $T'\subseteq  \{x_1, x_2, \ldots, x_{\color{brown}{n-1}}\}$, whose elements add up at or below the new weight limit $C-w_n$ and whose total value is the highest among all possible such subsets of the smaller set $\{x_1, x_2, \ldots, x_{\color{brown}{n-1}}\}$.

This solution may or may not contain the last item $x_{\color{brown}{n-1}}$ of the collection -- *duh!*, um, tautology.

If there is room for the last item, we have to consider what's the total value of $T$, with or without $x_n$ and select the best deal (the $\max$ value).

Working our way backwards, we can write that the most value we can remove from a collection with $n$ items and a weight limit $C$ is:

$$
S(n, C) = \begin{cases}
S(n-1,C),\ \text{if}\ w_n > C \\ \\
\max{\left[S(n-1,C),\; v_n+S(n-1, C-w_n)\right]},\ \text{if}\ w_n\leq C
\end{cases}
$$

The two branches above correspond to the original tautology.
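As a minimal sketch, the recurrence can be transcribed almost word for word into a recursive function. The name `S_rec` and the small data set are illustrative assumptions. The lists leave position 0 unused so that item $i$ sits at index `i`, and the recursion needs a stopping rule, so the sketch assumes that an empty collection is worth 0.

```python
def S_rec(n, C, v, w):
    """Best value obtainable from items 1..n with weight limit C.

    v and w are lists with an unused element at index 0, so that
    item i has value v[i] and weight w[i].
    """
    if n == 0:
        # Assumed stopping rule: no items to choose from, no value.
        return 0
    if w[n] > C:
        # First branch: item n does not fit.
        return S_rec(n - 1, C, v, w)
    # Second branch: the better of leaving item n out or taking it.
    return max(S_rec(n - 1, C, v, w),
               v[n] + S_rec(n - 1, C - w[n], v, w))


# Illustrative data: four items, weight limit 6.
w = [-1, 4, 3, 2, 1]
v = [-1, 5, 4, 3, 2]
print(S_rec(4, 6, v, w))  # prints 9
```

This version is faithful to the math but may evaluate the same subproblem many times, so it is a way to read the recurrence rather than an efficient program.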

When $x_n$ is not in $T$, then
$$S(\color{magenta}n,C) = S(\color{brown}{n-1},C)$$
The best value removed from the collection $\{x_1, x_2, \ldots, x_{\color{magenta}n}\}$ while staying at or below the weight capacity $C$ is the **same** best value removed from the smaller collection $\{x_1, x_2, \ldots, x_{\color{brown}{n-1}}\}$. The two collections differ only by one element: $x_n$.

When $x_n$ **could** be added to $T$, then
$$S(\color{magenta}n,C)=\max{\left[S(\color{brown}{n-1},C),\; v_{\color{magenta}n}+S(\color{brown}{n-1}, C-w_{\color{magenta}n})\right]}$$

The $\max$ function above forces us to choose one of two options. **Either** leave $x_n$ out of $T$ and accept $S(\color{magenta}n,C) = S(\color{brown}{n-1},C)$ **or** include $x_n$ in $T$ and accept $S(\color{magenta}n,C)=v_{\color{magenta}n}+S(\color{brown}{n-1}, C-w_{\color{magenta}n})$.

Using the same analysis on the smaller problems, we can write branched expressions for the terms $S(n-1, C)$ and $S(n-1, C-w_n)$:

$$
S(n-1, C) = \begin{cases}
S(n-2,C),\ \text{if}\ w_{n-1} > C \\ \\
\max{\left[S(n-2,C),\; v_{n-1}+S(n-2, C-w_{n-1})\right]},\ \text{if}\ w_{n-1}\leq C
\end{cases}$$

and

$$
S(n-1, C-w_n) = \begin{cases}
S(n-2,C-w_n),\ \text{if}\ w_{n-1} > C-w_n \\ \\
\max{\left[S(n-2,C-w_n),\; v_{n-1}+S(n-2, C-w_n-w_{n-1})\right]},\ \text{if}\ w_{n-1}\leq C-w_n
\end{cases}
$$

Of course now we have to write branched expressions for the quantities $S(n-2,C)$, $S(n-2, C-w_{n-1})$, $S(n-2,C-w_n)$, and $S(n-2, C-w_n-w_{n-1})$.

It may seem weird: we started with one branched expression (for $S(n, C)$), then we got two branched expressions (for $S(n-1, C)$ and $S(n-1, C-w_n)$), and then four. Next we'll need 8, then 16, and so on. This seems to be spawning fast. **However,** we don't have to compute all of them: whenever an item's weight exceeds the available capacity, only the first branch applies. The comparison is between $w_n$ and $C$ for the first pass, then between $w_{n-1}$ and each of $C$ and $C-w_n$ for the second pass, etc.

Notice also that the first branch of each branched expression is included as an argument of the $\max$ function in the second branch, so we only have to compute that term once.
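One way to see that the growth is less alarming than it looks is to store each $S(i, j)$ the first time it is computed. The sketch below is an illustration under assumptions, not part of the derivation: it relies on Python's `functools.lru_cache`, and the names `best_value` and `solve` as well as the data are hypothetical. However many branches the recursion spawns, it can never evaluate more than $(n+1)(C+1)$ distinct pairs $(i, j)$.

```python
from functools import lru_cache


def best_value(v, w, C):
    """Return (optimal value, number of distinct subproblems evaluated).

    v and w leave index 0 unused so that item i is at position i.
    """
    n = len(v) - 1

    @lru_cache(maxsize=None)
    def solve(i, j):
        if i == 0:
            return 0  # empty collection: nothing to take
        if w[i] > j:
            return solve(i - 1, j)
        return max(solve(i - 1, j), v[i] + solve(i - 1, j - w[i]))

    value = solve(n, C)
    # currsize counts the cached results, i.e. the distinct (i, j) pairs.
    return value, solve.cache_info().currsize


w = [-1, 1, 4, 1, 7, 1, 3, 6]
v = [-1, 6, 12, 5, 35, 6, 8, 3]
value, distinct = best_value(v, w, 10)
print(value, distinct, "of at most", len(v) * 11)
```

The bound on distinct pairs is what makes a table indexed by $i$ and $j$ a natural way to organize the computation.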

Using a more compact notation, we can write

\begin{align}
S_{i,j} & = \begin{cases} S_{i-1,j}\ \ \text{if}\ w_i > j \\ \\
\max\left\{  S_{i-1,j},  v_i+S_{i-1,j-w_i} \right\}\ \ \text{if}\ w_i \leq j
\end{cases}
\end{align}

The term $S_{i,j}$ is the highest value we can get from a combination among the first $i$ items, when we can take a total weight of $j$.

Obviously we are interested in the highest value when $i=n$ (i.e., considering every possible combination among all $n$ items in the collection) and $j=C$ (i.e., when our bag is empty and the available capacity is $C$). Remember, as we add items to the bag, its remaining capacity is reduced. We stop when the capacity is 0.

Let's focus on variable $i$ for a moment. The equation above gives $S_{i,\color{gray}{\text{whatever}}}$ as an expression of $S_{i-1,\color{gray}{\text{whatever}}}$. And that as an expression of $S_{i-2,\color{gray}{\text{whatever}}}$, and so on, until we write $S_{\mathbf{1},\color{gray}{\text{whatever}}}$ as an expression of $S_{\mathbf{0},\color{gray}{\text{whatever}}}$.

The expression $S_{0,\color{gray}{\text{whatever}}}$ is the best value we can fit in *whatever* capacity we have available by considering any possible combination from a collection with **nothing** in it. Well, that's easy: if we break into an empty museum, we walk out empty handed, no matter how much loot we could have taken away. There is nothing to take away! And so we can write in general  $S_{0,j}= 0$ for any $0\leq j \leq C$.

Similarly, we can describe $S_{i,0}$ as the best value we can take when our capacity is zero. Again, we can take nothing. And so $S_{i,0}=0$ for any $0\leq i\leq n$.

### Data structure representation

It may be a bit easier to discuss the algorithm, and further demystify the math, if we start thinking about how to implement things. The value $S_{i,j}$ is a function of two integer variables ($i$ and $j$), so maybe we can implement it as a two-dimensional array.

Programming the solution is quite easy -- even though we are still not sure how it works. We start with initializing the array for $S$.

```python
S = [[0 for _ in range(C+1)] for _ in range(n+1)] # _ is legit variable name
```

The initialization sets all the elements of array `S` to 0. As a side effect, it takes care of the initial conditions $S_{i,0}=0$ and $S_{0,j}=0$: the first row and the first column must be all zeros.

It is important to clarify why the array has this many rows and columns. We want to examine scenarios involving combinations among $n$ items. Each row corresponds to a scenario for the first $i$ items, with $0 \leq i \leq n$. The first row is the trivial scenario for 0 items; the second row for 1 item; the third row for combinations among the first 2 items, etc. Therefore we need $n+1$ rows.

Each scenario also considers how much we can carry away, up to the limit $C$. This includes the possibility that we cannot carry anything at all, all the way up to being able to carry the full limit $C$. And so we need $C+1$ columns.
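As a quick sanity check of these dimensions, the sketch below builds the array for an assumed example with 4 items and a weight limit of 6, then inspects its shape.

```python
n = 4  # assumed number of items
C = 6  # assumed weight limit

S = [[0 for _ in range(C + 1)] for _ in range(n + 1)]

print(len(S))     # number of rows: one per item count 0..n
print(len(S[0]))  # number of columns: one per capacity 0..C

# The first row and the first column already hold the initial conditions.
print(all(x == 0 for x in S[0]), all(row[0] == 0 for row in S))
```

The nested comprehension creates an independent list for every row. The shorter-looking `[[0] * (C + 1)] * (n + 1)` would instead repeat the same row object `n + 1` times, so writing to one row would change all of them.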

We assume that lists `v` and `w`, holding the value and the weight of each item respectively, have been provided. To keep item numbers and list indices in step, position `[0]` is left unused, so item $i$ is found at `v[i]` and `w[i]`.

Using array ``S``, the equation for $S_{i,j}$, can be written as:

```python
if j < w[i]:
  S[i][j] = S[i-1][j]
else:  
  S[i][j] = max(S[i-1][j], v[i]+S[i-1][j-w[i]])
```


First, the code tackles the simple case when the weight `w[i]` of item `i` exceeds the available capacity `j`. In this case the optimal combination among the first $i$ items does not include the $i$-th item: there is no room for it. So the optimal combination is the same as that for the first $i-1$ items when capacity is $j$. The value `S[i-1][j]` has been **computed already in an earlier step.**

Things get slightly more complicated if there is room to consider item $i$. In this case, the capacity (as implied by the `else` branch of the conditional statement above) is `j >= w[i]`.

Just because there is room for item $i$ doesn't mean that we'll pick it up. If we take it, we are reducing our capacity by `w[i]` (the $-w_i$ part of the index $j-w_i$ in the equation earlier). The new capacity will be `j-w[i]`. Taking item `i` adds its value to our gains, so we earn `+v[i]` on top of what is already in our bag.

**What could be in our bag** if we add item `i`? Any combination among the first `i-1` items whose total weight is at or below `j-w[i]` and has the highest value among all such combinations. In other words, `S[i-1][j-w[i]]` plus the value `v[i]`.

The alternative is to leave item $i$, in which case our capacity remains `j`, and we are free to fill it with any combination among the first `i-1` items that maximizes value. That value is `S[i-1][j]`.

In other words: when there is room for the last item, i.e., when `j >= w[i]`, we have two options: either keep the value `S[i-1][j]` or add item `i` to the combination for a value of `v[i]+S[i-1][j-w[i]]`. Obviously we want the option with the highest value, which is:
```python
max( S[i-1][j],    v[i] + S[i-1][j-w[i]] )
#    ---------     ---------------------
#    what's the    what's the best value
#    best          if we took item i
#    value if      thus effectively
#    we left       reducing the available
#    item i out?   capacity to j - w[i]?
#    ------------------------------------
#    Find the largest of these two values
```
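To make the comparison concrete, here is a small worked instance with assumed numbers: suppose the previous row of the array, `S[i-1]`, is already known for capacities 0 through 6, and item `i` has weight 1 and value 2.

```python
# Assumed values, for illustration only.
prev_row = [0, 0, 3, 4, 5, 7, 8]  # S[i-1][0..6], taken as already computed
w_i, v_i = 1, 2                   # weight and value of item i
j = 6                             # capacity under consideration

leave_out = prev_row[j]           # best value without item i
take = v_i + prev_row[j - w_i]    # value of item i plus best value at reduced capacity
best = max(leave_out, take)

print(leave_out, take, best)      # prints 8 9 9
```

Here taking the item wins: giving up one unit of capacity costs only one unit of value in the previous row (8 versus 7), while the item itself is worth 2.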

Let's try to put everything together in a method.

In [None]:
def dyn_prog(v, w, C):
  """Find the optimal value among n items under a constraint C using
  dynamic programming.

  Successive optimal solutions for problems of size 0 ≤ i ≤ n
  and for constraints 0 ≤ j ≤ C are computed, leading to the
  final optimal solution S[n][C].

  Inputs
  ------
  v : list
    Values of items we use to build optimal solution. For n items, this list
    is expected to have n+1 items. list[0] is not used, so that item value is
    synchronized with position index. First item is at [1] (instead of 0),
    second item at [2] (instead of 1), etc.
  w : list
    Weights of items we use to build optimal solution. For n items, this list
    is expected to have n+1 items. list[0] is not used, so that item weight is
    synchronized with position index. First item is at [1] (instead of 0),
    second item at [2] (instead of 1), etc.
  C : int
    Constraint for optimal solution; total weight of items in optimal
    solution cannot exceed C.

  Returns
  -------
  S : list
    All optimal solutions for subproblems of size 0 ≤ i ≤ n and for
    constraints 0 ≤ j ≤ C. Ultimately we are only interested in the final
    optimal value S[n][C], but we need array S to backtrack and identify
    the items comprising the optimal solution
  """
  # List v (and w) has one extra element, since we are not using position [0]
  # to store any meaningful data. We skip position zero so that the data for
  # the first item will be at position [1], for the second at position [2], etc.
  # The actual number of items to process is the length-1.
  n = len(v) - 1
  # Initialize the S array. We need one extra row for the combinations among
  # zero items and an extra column for optimal solutions at zero capacity. These
  # values are trivial (S[i][0] = S[0][j] = 0) but important because they provide
  # the initial conditions for the algorithm.
  S = [ [0 for _ in range(C+1)] for _ in range(n+1)]
  # explore every combination of items and capacities
  for item in range(1, n+1): # Loop runs up to and including n
    for capacity in range(1, C+1): # Loop runs up to and including C
      # The weight of item.
      weight = w[item]
      # The value of item.
      value = v[item]
      # Optimal solution of previously smaller problem (with one item less)
      # at the same capacity.
      one_less_item = item - 1
      previous = S[one_less_item][capacity]
      if weight > capacity:
        # Current item weighs more than present capacity. It cannot be added
        #  to solution, even if we removed everything to make room for it.
        # Simply there is no room. The optimal value at this capacity is
        # the optimal value for the smaller problem, with one less item
        # at same capacity.
        S[item][capacity] = previous
      else:
        # We are here because capacity ≥ current item weight. This means that
        # we can remove some items from the previous optimal solution to make
        # room for the current item. If we did so, the value of the optimal
        # solution can be found at the reduced capacity for the previously
        # smaller problem added to the value of the current item. The reduced
        # capacity is what remains after we make room for the current item.
        reduced = capacity - weight
        previous_at_reduced = S[one_less_item][reduced]
        S[item][capacity] = max(previous, value + previous_at_reduced)
        #                                 ------------------------
        #                                 Value of current item plus the value
        #                                 of previously smaller problem (with
        #                                 one less item) at the reduced capacity
        #                                 that is necessary in order to fit the
        #                                 current item.
  # Done, return the full array S.
  # The value for the optimal solution
  # is at S[n][C]. The full array is
  # needed in order to find the items
  # that comprise the optimal solution
  return S

If we strip the comments from method `dyn_prog` above, eliminate the variables `weight`, `value`, `one_less_item`, `previous`, `reduced` and `previous_at_reduced` that were introduced for illustrative purposes, and simplify the loop indices, we get just 10 lines of compact code:

```python
def dyn_prog(v, w, C):
  n = len(v) - 1
  S = [ [0 for _ in range(C+1)] for _ in range(n+1)]
  for i in range(1, n+1):
    for j in range(1, C+1):
      if w[i] > j:
        S[i][j] = S[i - 1][j]
      else:
        S[i][j] = max(S[i - 1][j], v[i] + S[i - 1][j - w[i]])
  return S
```

The code can be simplified even more, by factoring out `S[i-1][j]` and keeping only one branch of the `if` statement:

```python
def dyn_prog(v, w, C):
  n = len(v) - 1
  S = [ [0 for _ in range(C+1)] for _ in range(n+1)]
  for i in range(1, n+1):
    for j in range(1, C+1):
      S[i][j] = S[i - 1][j]
      if w[i] <= j:
        S[i][j] = max(S[i][j], v[i] + S[i - 1][j - w[i]])
  return S
```
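One way to gain confidence in the table is to compare its final entry against an exhaustive search on inputs small enough to enumerate. The sketch below repeats the compact `dyn_prog` so that it runs on its own; the helper `exhaustive_best` and the data are assumptions made for illustration.

```python
from itertools import combinations


def dyn_prog(v, w, C):
    n = len(v) - 1
    S = [[0 for _ in range(C + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, C + 1):
            S[i][j] = S[i - 1][j]
            if w[i] <= j:
                S[i][j] = max(S[i][j], v[i] + S[i - 1][j - w[i]])
    return S


def exhaustive_best(v, w, C):
    """Best value over all 2**n subsets of items 1..n (index 0 unused)."""
    n = len(v) - 1
    best = 0
    for size in range(n + 1):
        for combo in combinations(range(1, n + 1), size):
            if sum(w[i] for i in combo) <= C:
                best = max(best, sum(v[i] for i in combo))
    return best


w = [-1, 4, 3, 2, 1]
v = [-1, 5, 4, 3, 2]
C = 6
S = dyn_prog(v, w, C)
print(S[len(v) - 1][C], exhaustive_best(v, w, C))  # prints 9 9
```

The two agree, but at very different cost: the table is filled with two nested loops over $n$ items and $C$ capacities, whereas the exhaustive search inspects all $2^n$ subsets.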

# Reconstruction

Use the array `S` returned by the method above to identify the items in the optimal solution whose value is in `S[n][C]`. For example, using the first test case below, method `reconstruct` below should return a list with item labels `[2,3,4]`.

To construct the list of items in the optimal solution, we start with `S[n][C]` and ask the question, *how did we get here?* The answer lies in the `if` statement of the dynamic programming method. The challenge is to reverse it in such a way that it tells us which items brought us there.

Any value in the `S` array is the outcome of the `if` statement in method `dyn_prog`. Specifically `S[i][j]` is either equal to `S[i-1][j]` or `v[i]+S[i-1][j-w[i]]`. We can tell for certain that if

```python
S[i][j] == v[i]+S[i-1][j-w[i]]
```

then item `i` is in the optimal solution. But what if we find that `S[i][j] == S[i-1][j]`? Do we include item `i` in the optimal solution, or not?

*Hint:* there is a second, hidden `if` statement in `dyn_prog`, in the `max` operation.

In [None]:
def reconstruct(S,v,w):
  """Find the items in the optimal solution whose value is at the bottom right
  position of a dynamic programming array.

  Inputs
  ------
  S : list
    The dynamic programming array with optimal solution.
  v : list
    Values of items we use to build optimal solution. For n items, this list
    is expected to have n+1 items. list[0] is not used, so that item value is
    synchronized with position index. First item is at [1] (instead of 0),
    second item at [2] (instead of 1), etc.
  w : list
    Weights of items we use to build optimal solution. For n items, this list
    is expected to have n+1 items. list[0] is not used, so that item weight is
    synchronized with position index. First item is at [1] (instead of 0),
    second item at [2] (instead of 1), etc.

  Returns
  -------
  items : list
    List of items in the optimal solution.
  """
  # YOUR CODE HERE

  # number of items
  N = len(S) - 1
  # capacity limit
  capacity = len(S[0]) - 1
  # list to return
  items = []

  # indices into S, starting at the bottom right position
  i = N
  j = capacity

  # loop while there are still items to consider and still room in the bag
  while i > 0 and j > 0:
    # If the value differs from the one directly above it, item i must have
    # been added: leaving it out would have kept the value S[i-1][j].
    if S[i][j] != S[i-1][j]:
      items.append(i)  # record item i as part of the optimal solution
      j -= w[i]  # reduce the capacity by the item's weight: a smaller subproblem
    i -= 1  # move on to the subproblem without the current item

  items.reverse()  # list the items in the order we would pick them up

  return items


In [None]:
# Simple test case, matches slide deck example at:
# https://docs.google.com/presentation/d/1fhhKnA9CH3AY_ltPt4qgtjsXocscWCf5C2cgxi4RCKw/edit?usp=sharing

import numpy as np # for nice array printing with np.matrix()
C = 6
n = 4

w = [-1, 4, 3, 2, 1] # weights
v = [-1, 5, 4, 3, 2] # values
#     |  |  |  |  |
#    [0][1][2][3][4]
#     |  |  |  |  |
#     |  |  |  |  +--> 4th item weight w[4] and value v[4]
#     |  |  |  +-----> 3rd item weight w[3] and value v[3]
#     |  |  +--------> 2nd item weight w[2] and value v[2]
#     |  +-----------> 1st item weight w[1] and value v[1]
#     +--------------> not used -- dummy value to allow us to use array
#                      elements 1 through n-inclusive, for data.

w1 = [-1, 1, 4, 1, 7, 1, 3, 6]
v1 = [-1, 6, 12 ,5, 35, 6, 8, 3]
C1 = 10

S=dyn_prog(v,w,C)
print(np.matrix(S))
print(reconstruct(S,v,w))

print("\n")

S1 = dyn_prog(v1, w1, C1)
print(np.matrix(S1))
reconstruct(S1,v1,w1)

[[0 0 0 0 0 0 0]
 [0 0 0 0 5 5 5]
 [0 0 0 4 5 5 5]
 [0 0 3 4 5 7 8]
 [0 2 3 5 6 7 9]]
[2, 3, 4]


[[ 0  0  0  0  0  0  0  0  0  0  0]
 [ 0  6  6  6  6  6  6  6  6  6  6]
 [ 0  6  6  6 12 18 18 18 18 18 18]
 [ 0  6 11 11 12 18 23 23 23 23 23]
 [ 0  6 11 11 12 18 23 35 41 46 46]
 [ 0  6 12 17 17 18 24 35 41 47 52]
 [ 0  6 12 17 17 20 25 35 41 47 52]
 [ 0  6 12 17 17 20 25 35 41 47 52]]


[1, 3, 4, 5]
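
A reconstructed list can be checked independently of the backtracking logic: its items must fit within the limit and their values must add up to the optimal value. The helper `check_items` below is a hypothetical addition, sketched under the same index convention; the item lists and optimal values are copied from the output above.

```python
def check_items(items, v, w, C, optimal_value):
    """Return True if items fit within C and reach optimal_value."""
    total_weight = sum(w[i] for i in items)
    total_value = sum(v[i] for i in items)
    return total_weight <= C and total_value == optimal_value


# First test case: S[4][6] == 9 and reconstructed items [2, 3, 4].
w = [-1, 4, 3, 2, 1]
v = [-1, 5, 4, 3, 2]
print(check_items([2, 3, 4], v, w, 6, 9))          # prints True

# Second test case: S1[7][10] == 52 and reconstructed items [1, 3, 4, 5].
w1 = [-1, 1, 4, 1, 7, 1, 3, 6]
v1 = [-1, 6, 12, 5, 35, 6, 8, 3]
print(check_items([1, 3, 4, 5], v1, w1, 10, 52))   # prints True
```

A check like this confirms that a list is *an* optimal solution; when several combinations tie for the best value, a different but equally valid list would pass it too.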