# Dynamic Programming

In [3]:
# Import third-party modules
import numpy as np

**STRATEGY**

1. Break down the problem into overlapping subproblems of the same type
2. Create a recursive solution: solve each subproblem and combine the results
3. Determine $T(n)$: worst-case runtime
4. Optionally, create iterative solution

Each subproblem is the same type as the original, naturally leading to a recursive solution.  
(You can often rewrite the solution to be iterative.)


**Worst case $T(n)$**  
The worst-case running time of the algorithm on a problem of size $n$.

## The Change Problem

### Problem 1: Change problem

**Input:** An integer $money$ and positive integers $coin_1, ..., coin_d$.

**Output:** The minimum number of coins with denominations $coin_1, ..., coin_d$ that changes $money$.

`GreedyChange`: Not optimal  

`RecursiveChange`: $T(n) = \sum_{d \in D} T(n-d) + O(1)$, where $D$ is the set of denominations (exponential in $n$, so a long, long time)

`DPChange`: ✅

In [3]:
def GreedyChange(money):
    return money

GreedyChange(5)

5
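The `GreedyChange` cell above is a placeholder that returns `money` unchanged. As a minimal sketch of what a greedy strategy could look like, assuming a hypothetical `greedy_change(money, coins)` signature (the `coins` parameter is not in the original cell), the following shows why greedy is not optimal in general:

```python
def greedy_change(money, coins):
    """Repeatedly take the largest coin that still fits (hypothetical helper)."""
    count = 0
    for coin in sorted(coins, reverse=True):
        count += money // coin   # use as many of this coin as possible
        money %= coin            # amount still to be changed
    # If money is non-zero here, the greedy strategy could not make exact change.
    return count if money == 0 else None

print(greedy_change(24, [1, 8, 20]))  # 5 coins (20 + 1 + 1 + 1 + 1), but 8 + 8 + 8 needs only 3
```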

In [4]:
def RecursiveChange(money):
    return money

RecursiveChange(5)

5
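The `RecursiveChange` cell above is also a placeholder. A possible sketch of the plain recursion behind the recurrence $T(n) = \sum_{d \in D} T(n-d) + O(1)$, assuming a hypothetical `recursive_change(money, coins)` signature, is:

```python
import math

def recursive_change(money, coins):
    """Minimum number of coins for `money`, by plain recursion (exponential time)."""
    if money == 0:
        return 0
    best = math.inf  # stays infinite if no combination of coins makes `money`
    for coin in coins:
        if coin <= money:
            # One coin of this denomination plus the best change for the rest.
            best = min(best, recursive_change(money - coin, coins) + 1)
    return best

print(recursive_change(24, [1, 8, 20]))  # 3 (8 + 8 + 8)
```

The same amounts are recomputed many times along different branches, which is where the running time comes from.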

In [5]:
def DPChange(money):
    # Go forwards in time, from 0 to money.
    # At each time step, we calculate the minimum number of coins needed to make change for the current amount of money.
    return money

DPChange(5)

5
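The comments in the `DPChange` cell describe going from `0` up to `money` and storing the minimum number of coins for each amount. A minimal bottom-up sketch of that idea, assuming a hypothetical `dp_change(money, coins)` signature (the placeholder above takes no denominations), might look like this:

```python
import math

def dp_change(money, coins):
    """Bottom-up change: min_coins[m] is the fewest coins that make amount m."""
    min_coins = [0] + [math.inf] * money
    for m in range(1, money + 1):
        for coin in coins:
            if coin <= m:
                # Reuse the stored answer for the smaller amount m - coin.
                min_coins[m] = min(min_coins[m], min_coins[m - coin] + 1)
    return min_coins[money]

print(dp_change(24, [1, 8, 20]))  # 3 (8 + 8 + 8)
```

Each amount is solved once, so this sketch runs in $O(money \cdot d)$ time for $d$ denominations.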

## The Alignment Game

Alignment of two strings is a two-row matrix such that:  
* *1st row:* symbols of the 1st string (in order) interspersed by "-"  
* *2nd row*: symbols of the 2nd string (in order) interspersed by "-"

|Sequence 1:| A | T | - | G | T | T | A | T | A
|-| - | - | - | - | - | - | - | - | -
|**Sequence 2:**| **A** | **T** | **C** | **G** | **T** | **-** | **C** | **-** | **C**
|Score| $+1$ | $+1$ | $-\sigma$ | $+1$ | $+1$ | $-\sigma$ | $-\mu$ | $-\sigma$ | $-\mu$

**Common subsequence:** **ATGT**

$\begin{align}
\text{\#symbols in two strings} & =  2 \cdot (\text{\#matches} + \text{\#mismatches}) + 1 \cdot (\text{\#insertions} + \text{\#deletions}) \\
& = [2 \cdot \text{\#matches} -1 \cdot (\text{\#insertions} + \text{\#deletions})] + [2 \cdot (\text{\#insertions} + \text{\#deletions} + \text{\#mismatches})] \\
& = 2 \cdot \text{Alignment Score} (\mu=0, \sigma=1/2) + 2 \cdot \text{Edit Distance}\\
\end{align}$


$$\therefore \text{minimising Edit Distance} \iff \text{maximising Alignment Score}$$
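As a quick sanity check, the identity can be verified on the example alignment from the table. This is only an illustrative sketch: the variable names are made up here, and `edits` counts the insertions, deletions and mismatches of this particular alignment.

```python
row1 = "AT-GTTATA"
row2 = "ATCGT-C-C"

columns = list(zip(row1, row2))
matches = sum(a == b for a, b in columns)
indels = sum(a == "-" or b == "-" for a, b in columns)
mismatches = len(columns) - matches - indels
symbols = sum(c != "-" for c in row1 + row2)

score = matches - 0 * mismatches - 0.5 * indels  # alignment score with mu = 0, sigma = 1/2
edits = mismatches + indels                      # edit operations used by this alignment

assert symbols == 2 * score + 2 * edits
print(symbols, score, edits)  # 15 2.5 5
```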

### Problem 2: Optimal alignment

**Input:** Two strings, mismatch penalty $\mu$, and indel penalty $\sigma$.

**Output:** An alignment of the strings maximising the score.

### Problem 3: Longest common subsequence

**Input:** Two strings.

**Output:** A longest common subsequence of these strings.

Equivalent to: *Optimal alignment* with $\mu = \sigma = 0$
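The equivalence can be illustrated with a small alignment-score routine. This is a sketch under stated assumptions: `alignment_score` is a hypothetical helper, scoring $+1$ per match, $-\mu$ per mismatch and $-\sigma$ per indel as in the table above, and it returns only the score, not the alignment.

```python
def alignment_score(a, b, mu, sigma):
    """Maximum alignment score: +1 per match, -mu per mismatch, -sigma per indel."""
    n, m = len(a), len(b)
    # S[i][j] is the best score for aligning the prefixes a[:i] and b[:j].
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = -sigma * i  # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        S[0][j] = -sigma * j  # b[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diagonal = 1 if a[i - 1] == b[j - 1] else -mu
            S[i][j] = max(S[i - 1][j] - sigma,
                          S[i][j - 1] - sigma,
                          S[i - 1][j - 1] + diagonal)
    return S[n][m]

# With mu = sigma = 0 only matches count, so the score is the LCS length.
print(alignment_score("ATGTTATA", "ATCGTCC", mu=0, sigma=0))  # 4, the length of ATGT
```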

### Problem 4: Edit distance

**Input:** Two strings.

**Output:** The minimum number of operations (insertions, deletions, and substitutions of symbols) to transform one string into another.

Equivalent to: minimum number of insertions, deletions and mismatches in an alignment of two strings (among all possible alignments).

Let $D(i,j)$ be the edit distance of an $i$-prefix $A[1...i]$ and a $j$-prefix $B[1...j]$. Then

$$
D(i,j) = \min \left\{\begin{array}{ll}
D(i,j-1) + 1 \\\\
D(i-1,j) + 1 \\\\
D(i-1,j-1) + 1 & \text{if} \; A[i] \neq B[j]\\\\
D(i-1,j-1) & \text{if} \; A[i] = B[j]\\
\end{array}\right.
$$

**Distance Matrix**  

Trace back from bottom right to top left to get Optimal Alignment.

<p align="center">
    <img src="images/edit_distance_matrix.png" width="450" style="display: inline-block; margin-right: 0px;">
</p>
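The recurrence and the traceback can be sketched together as follows. The function name `edit_distance_alignment` and the tie-breaking order in the traceback are assumptions made for illustration; the base cases $D(i,0) = i$ and $D(0,j) = j$ correspond to deleting or inserting a whole prefix.

```python
def edit_distance_alignment(a, b):
    """Return (edit distance, aligned row for a, aligned row for b)."""
    n, m = len(a), len(b)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i
    for j in range(m + 1):
        D[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diagonal = D[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            D[i][j] = min(D[i][j - 1] + 1, D[i - 1][j] + 1, diagonal)

    # Trace back from the bottom-right cell to the top-left cell.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and D[i][j] == D[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and D[i][j] == D[i - 1][j] + 1:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return D[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))

distance, top, bottom = edit_distance_alignment("kitten", "sitting")
print(distance)  # 3
print(top)
print(bottom)
```

Several optimal alignments may exist; this sketch returns whichever one its tie-breaking order reaches first.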


## Knapsack

**GOAL**:  Maximise value while limiting total weight.



|Fractional Knapsack| Discrete Knapsack |
| - | - |
|Can take fractions of items.  | Each item is either taken or not.|
| - | **Variant 1**: with repetitions (unlimited quantities) <br> **Variant 2**: without repetitions (one of each item)|
|**Use greedy algorithm.**| **Use dynamic programming.** |




#### Example Problems


**1. TV commercial placement**  
Select a set of TV commercials (each commercial has a duration and a cost) so that the total revenue is maximal while the total length does not exceed the length of the available time slot.

**2. Optimising data centre performance**  
Purchase computers for a data centre to achieve the maximal performance under a limited budget.

<p align="center">
    <img src="images/knapsack_examples.png" width="450" style="display: inline-block; margin-right: 0px;">
</p>




### Problem 5: Knapsack with repetitions

**Input:** Weights $w_1, ..., w_n \geq 0$ and values $v_1, ..., v_n \geq 0$ of $n$ items, and total weight $W \geq 0$.

**Output:** The maximum value of items whose total weight does not exceed $W$. Each item can be used any number of times.

Let $\text{value}(w)$ be the maximum value of a knapsack of weight $w$, with $\text{value}(0) = 0$. Then,
$$
\text{value}(w) =  \underset{i : w_i \leq w}{\max} \{ \text{value}(w-w_i) + v_i \}.
$$

**Running time:** $O(n W)$

In [70]:
def Knapsack(weights, values, W, repetitions=True):
    # NOTE: `repetitions` is not used; this version always allows repetitions.
    n = len(weights)
    # Value[w] is the maximum value achievable with total weight at most w.
    Value = [0]*(W+1)

    # Calculate the value of the optimal solution for weight 1, 2, ..., W
    for w in range(1, W + 1):
        # Consider each item
        for i in range(1, n + 1):
            # If the i-th item fits in the knapsack with remaining weight w
            if weights[i-1] <= w:
                # Value of adding the i-th item to an optimal knapsack of weight w - w_i.
                val = Value[w - weights[i-1]] + values[i-1]
                # Keep the best value found so far for weight w.
                if val > Value[w]:
                    Value[w] = val
    print(Value)
    return Value[W]

Knapsack([6, 3, 4, 2], [30, 14, 16, 9], 10)

[0, 0, 9, 14, 18, 23, 30, 32, 39, 44, 48]


48


### Problem 6: Knapsack without repetitions

**Input:** Weights $w_1, ..., w_n \geq 0$ and values $v_1, ..., v_n \geq 0$ of $n$ items, and total weight $W \geq 0$.

**Output:** The maximum value of items whose total weight does not exceed $W$. Each item can be used at most once.

Let $\text{value}(w, i)$ be the maximum value of a knapsack of weight $w$ using only items $1,...,i$. Then, depending on whether the $i$-th item is used or not,
$$
\text{value}(w, i) =  \max \{ \text{value}(w-w_i, i-1) + v_i,  \text{value}(w, i-1)\},
$$
where the first option is available only if $w_i \leq w$.

**Running time:** $O(n W)$

In [69]:
def Knapsack(weights, values, W):
    weights = np.array(weights)
    values = np.array(values)
    n = len(weights)
    # Value[w, i] is the maximum value for weight w using only the first i items.
    # Initialise a (W+1) x (n+1) numpy array with zeros.
    Value = np.zeros((W+1, n+1))

    # Calculate the value of the optimal solution for weight 1, 2, ..., W
    for w in range(1, W+1):
        for i in range(1, n+1):
            # Option 1: do not use the i-th item.
            Value[w, i] = Value[w, i-1]
            if weights[i-1] <= w:
                # Option 2: use the i-th item once and fill the remaining weight with items 1, ..., i-1.
                val = Value[w - weights[i-1], i-1] + values[i-1]
                # Keep the better of the two options.
                if val > Value[w, i]:
                    Value[w, i] = val # ie Value[w, i] = max(Value[w - weights[i-1], i-1] + values[i-1], Value[w, i-1])
    print(Value)
    return Value[W, n]

Knapsack([6, 3, 4, 2], [30, 14, 16, 9], 10)

[[ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  9.]
 [ 0.  0. 14. 14. 14.]
 [ 0.  0. 14. 16. 16.]
 [ 0.  0. 14. 16. 23.]
 [ 0. 30. 30. 30. 30.]
 [ 0. 30. 30. 30. 30.]
 [ 0. 30. 30. 30. 39.]
 [ 0. 30. 44. 44. 44.]
 [ 0. 30. 44. 46. 46.]]


46.0

**Reconstructing a solution**

<p align="left">
    <img src="images/knapsack_reconstruct.png" width="450" style="display: inline-block; margin-right: 0px;">
</p>
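One way to reconstruct a solution is to walk back through the filled table: if $\text{value}(w, i) \neq \text{value}(w, i-1)$, the $i$-th item must have been used. The sketch below assumes this approach (the figure may show a different presentation), and `knapsack_items` is a hypothetical name; it rebuilds the same table as the cell above and returns 0-based item indices.

```python
import numpy as np

def knapsack_items(weights, values, W):
    """Return (best value, sorted 0-based indices of the items used), without repetitions."""
    n = len(weights)
    Value = np.zeros((W + 1, n + 1))
    for w in range(1, W + 1):
        for i in range(1, n + 1):
            Value[w, i] = Value[w, i - 1]
            if weights[i - 1] <= w:
                Value[w, i] = max(Value[w, i],
                                  Value[w - weights[i - 1], i - 1] + values[i - 1])

    # Walk back from Value[W, n]: a change between columns i-1 and i means item i was used.
    used = []
    w = W
    for i in range(n, 0, -1):
        if Value[w, i] != Value[w, i - 1]:
            used.append(i - 1)
            w -= weights[i - 1]
    return float(Value[W, n]), sorted(used)

print(knapsack_items([6, 3, 4, 2], [30, 14, 16, 9], 10))  # (46.0, [0, 2]): weights 6 and 4
```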


### Final remarks

* Use a hash table to memoise the recursive solution, so that each subproblem is solved only once.
* If all subproblems must be solved, then an iterative algorithm is usually faster since it has no recursion overhead.
* However, there are cases when one does not need to solve all subproblems,  
e.g. assume $W, w_1,...,w_n$ are all multiples of $100$; then $\text{value}(w)$ is not needed when $100 \nmid w$.
* $O(nW)$ is not polynomial in the input size; it is pseudo-polynomial, i.e. exponential in the number of bits of $W$: $O(n2^{\log W})$.  
To represent $W$, we need only $\log(W)$ bits, so the input size is $\log(W)$ and not $W$.  
Helpful explanation: https://stackoverflow.com/questions/4538581/why-is-the-knapsack-problem-pseudo-polynomial
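The first three remarks can be illustrated with a memoised, top-down version of the knapsack with repetitions. This is a sketch with a hypothetical function name, `knapsack_memo`, that assumes all weights are strictly positive (otherwise the recursion would not terminate). It also returns the number of subproblems actually solved.

```python
def knapsack_memo(weights, values, W):
    """Top-down knapsack with repetitions; returns (best value, number of subproblems solved)."""
    memo = {}  # hash table: weight w -> value(w)

    def value(w):
        if w in memo:
            return memo[w]
        best = 0
        for wi, vi in zip(weights, values):
            if wi <= w:
                best = max(best, value(w - wi) + vi)
        memo[w] = best
        return best

    return value(W), len(memo)

# All weights are multiples of 100, so only multiples of 100 are ever visited.
print(knapsack_memo([600, 300, 400, 200], [30, 14, 16, 9], 1000))  # (48, 10)
```

A bottom-up table for the same input would fill all $W + 1 = 1001$ entries, whereas the recursion touches only the weights it can actually reach.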

## Placing Parentheses

**Input:** A sequence of digits $d_1, ..., d_n$ and a sequence of operations $op_1, ..., op_{n-1} \in \{+, -, \times\}$.

**Output:** An order of applying these operations that maximises the value of the expression $d_1 \; op_1 \; d_2 \; op_2 \; ... \; op_{n-1} \; d_n$.

**Running time:** $O(n^3)$

* Let $E_{i,j}$ be the subexpression  
$d_i \; op_i \; ... \; op_{j-1} \; d_j$.

* Subproblems:  
$$
M(i,j) = \max E_{i,j} = \underset{i \leq k \leq j-1}{\max} \left\{\begin{array}{c}
M(i,k) \quad op_k \quad M(k+1,j) \\\\
M(i,k) \quad op_k \quad m(k+1,j) \\\\
m(i,k) \quad op_k \quad M(k+1,j) \\\\
m(i,k) \quad op_k \quad m(k+1,j) \\
\end{array}\right.
$$

$$
m(i,j) = \min E_{i,j} = \underset{i \leq k \leq j-1}{\min} \left\{\begin{array}{c}
M(i,k) \quad op_k \quad M(k+1,j) \\\\
M(i,k) \quad op_k \quad m(k+1,j) \\\\
m(i,k) \quad op_k \quad M(k+1,j) \\\\
m(i,k) \quad op_k \quad m(k+1,j) \\
\end{array}\right.
$$

<br>
<p align="center">
    <img src="images/placing_parenthesis_order.png" width="450" style="display: inline-block; margin-right: 0px;">
</p>




In [21]:
import operator
ops = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
}   

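def MinAndMax(i, j, m, M, op):
    # Smallest and largest values of the subexpression d_i op_i ... op_{j-1} d_j.
    # (Named min_val / max_val to avoid shadowing the built-in min and max.)
    min_val = np.inf
    max_val = -np.inf

    # Try every operation op[k] as the last one to be applied.
    for k in range(i, j):
        a  = ops[op[k]](M[i, k], M[k+1, j])
        b  = ops[op[k]](M[i, k], m[k+1, j])
        c  = ops[op[k]](m[i, k], M[k+1, j])
        d  = ops[op[k]](m[i, k], m[k+1, j])

        min_val = np.min([min_val, a, b, c, d])
        max_val = np.max([max_val, a, b, c, d])

    return min_val, max_val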

def Parentheses(d, op):
    assert(len(d) == len(op) + 1)
    n = len(d)
    m = np.zeros((n, n))
    M = np.zeros((n, n))

    for i in range(n):
        m[i, i] = d[i]
        M[i, i] = d[i]
    

    # Fill the tables one diagonal at a time (0-indexed): (0,1) (1,2) ... (n-2,n-1),  then (0,2) (1,3) ...,  ending with (0,n-1). The main diagonal was set above.
    for s in range(1, n):
        for i in range(n-s):
            j = i + s
            
            m[i, j], M[i, j] = MinAndMax(i, j, m, M, op)

    print(M)
    print(m)
    return int(M[0, n-1])


Parentheses([5, 8, 7, 4, 8, 9], ['-', '+', '*', '-', '+'])

[[  5.  -3.   4.  25.  65. 200.]
 [  0.   8.  15.  60.  52.  75.]
 [  0.   0.   7.  28.  20.  35.]
 [  0.   0.   0.   4.  -4.   5.]
 [  0.   0.   0.   0.   8.  17.]
 [  0.   0.   0.   0.   0.   9.]]
[[   5.   -3.  -10.  -55.  -63.  -94.]
 [   0.    8.   15.   36.  -60. -195.]
 [   0.    0.    7.   28.  -28.  -91.]
 [   0.    0.    0.    4.   -4.  -13.]
 [   0.    0.    0.    0.    8.   17.]
 [   0.    0.    0.    0.    0.    9.]]


200

<p align="left">
    <img src="images/placing_parenthesis_reconstruct.png" width="450" style="display: inline-block; margin-right: 0px;">
</p>
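One possible way to recover the parenthesisation itself is to carry the expression string along with each minimum and maximum. The sketch below is a memoised, top-down variant of the recurrences for $M(i,j)$ and $m(i,j)$, not necessarily the method shown in the figure; `best_parenthesisation` and `OPS` are hypothetical names.

```python
import operator
from functools import lru_cache

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def best_parenthesisation(d, op):
    """Return (maximum value, fully parenthesised expression achieving it)."""

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Returns ((min value, expr), (max value, expr)) for d[i] op[i] ... op[j-1] d[j].
        if i == j:
            leaf = (d[i], str(d[i]))
            return leaf, leaf
        candidates = []
        for k in range(i, j):  # op[k] is the last operation applied
            for left_val, left_expr in solve(i, k):
                for right_val, right_expr in solve(k + 1, j):
                    value = OPS[op[k]](left_val, right_val)
                    candidates.append((value, f"({left_expr} {op[k]} {right_expr})"))
        return min(candidates), max(candidates)

    return solve(0, len(d) - 1)[1]

value, expr = best_parenthesisation([5, 8, 7, 4, 8, 9], ['-', '+', '*', '-', '+'])
print(value)  # 200
print(expr)
```

When several parenthesisations reach the same value, this sketch breaks ties by comparing the expression strings, which is an arbitrary choice.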