# Dynamic Programming  

**Def**: Simplifying a complicated problem by recursively breaking it down into simpler subproblems, memoizing the solutions to those subproblems, and combining them into a solution to the whole problem.

## Frog Problem  
>There are $N$ stones, numbered $1,2,\dots,N$. For each $i \; (1\le i\le N)$, the height of Stone $i$ is $h_i$.

>There is a frog who is initially on Stone 1. He will repeat the following action some number of times to reach Stone $N$:

>If the frog is currently on Stone $i$, jump to Stone $i+1$ or Stone $i+2$. Here, a cost of $|h_i - h_j|$ is incurred, where $j$ is the stone to land on.

>Find the minimum possible total cost incurred before the frog reaches Stone $N$.

We solve this by finding the optimal (minimum-cost) path to each node (stone) in turn. Reaching stone $i$ is a binary decision: leap from the previous stone or from the one before it. For each option, add the cost of the optimal path to that stone to the cost of the jump from it to stone $i$, and keep the smaller total.
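Written as a recurrence, the decision above might look as follows. This is a sketch using the 1-indexed stones of the problem statement; $\mathrm{opt}_i$ is a name assumed here for the minimum total cost to reach Stone $i$.

$$
\mathrm{opt}_1 = 0,\qquad \mathrm{opt}_2 = |h_2 - h_1|,\qquad
\mathrm{opt}_i = \min\bigl(\mathrm{opt}_{i-1} + |h_i - h_{i-1}|,\ \mathrm{opt}_{i-2} + |h_i - h_{i-2}|\bigr)\quad (3 \le i \le N)
$$

The answer to the problem is then $\mathrm{opt}_N$.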

In [4]:
def frog(H):
    N = len(H)
    opt = [0] * N
    # cost to get to node 0 is 0, so we start from node 1.
    for i in range(1, N):
        if i == 1:
            opt[i] = abs(H[i] - H[i-1]) # |H[1] - H[0]|; only one way to reach node 1
        else:
            temp1 = abs(H[i] - H[i - 1]) + opt[i-1]
            temp2 = abs(H[i] - H[i - 2]) + opt[i-2]
            opt[i] = min(temp1, temp2) # relaxation

    return opt[N-1]

print(frog([1, 2, 4, 4, 5, 3, 4, 7]))
        


8


The computational complexity is $O(N)$.  
The above implementation is pull-based (compute $i$ once $i-1$ and $i-2$ are known), but it could also be push-based (update $i + 1$ and $i + 2$ once $i$ is known).  
When an optimal solution can be built from optimal solutions to smaller subproblems, as in the question above, the problem is said to have *optimal substructure*.  
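As an illustration of the push-based idea, here is a minimal sketch. The name `frog_push` is made up for this example, and it assumes `H` is a non-empty list of heights.

```python
import math

def frog_push(H):
    """Push-based variant: propagate opt[i] forward to stones i + 1 and i + 2."""
    N = len(H)  # assumes N >= 1
    opt = [math.inf] * N
    opt[0] = 0  # the frog starts on the first stone at no cost
    for i in range(N):
        for step in (1, 2):
            j = i + step
            if j < N:
                # relaxation: try to improve the cost of reaching stone j via stone i
                opt[j] = min(opt[j], opt[i] + abs(H[i] - H[j]))
    return opt[N - 1]

print(frog_push([1, 2, 4, 4, 5, 3, 4, 7]))  # prints 8
```

By the time the outer loop reaches stone `i`, every stone that can push into it has already been processed, so `opt[i]` is final when it is used.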



The problem can also be solved with a recursive exhaustive search plus a memo.  
This does essentially the same work as the dynamic programming solution: each subproblem is solved once and its result is reused.
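A minimal sketch of that memoized recursion, using `functools.lru_cache` as the memo. The names `frog_memo` and `cost` are chosen for this example only, and `H` is assumed to be non-empty.

```python
from functools import lru_cache

def frog_memo(H):
    """Top-down variant: exhaustive recursion plus a memo of solved subproblems."""

    @lru_cache(maxsize=None)
    def cost(i):
        # cost(i) = minimum total cost to reach stone i (0-indexed) from stone 0
        if i == 0:
            return 0
        best = cost(i - 1) + abs(H[i] - H[i - 1])
        if i >= 2:
            best = min(best, cost(i - 2) + abs(H[i] - H[i - 2]))
        return best

    return cost(len(H) - 1)

print(frog_memo([1, 2, 4, 4, 5, 3, 4, 7]))  # prints 8
```

Note that for very long inputs the recursion depth grows with $N$ and may hit Python's recursion limit, which is one practical reason to prefer the loop-based version.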

## Knapsack Problem  

> Given a set of items, each with a weight and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible.

The standard method of dynamic programming: 
1. Recursive solution 
2. Memo intermediate results

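In this case, we look at each item and consider two cases: one in which we include the item and one in which we exclude it, keeping whichever yields the larger value. The code below treats the 0/1 variant, in which each item is included at most once. We apply this choice recursively for every item and every weight limit, so that the end result is the global maximum.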
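The two steps listed above (recursive solution, then memo) can be sketched directly as a top-down function. This is an illustrative sketch of the 0/1 variant; the names `knapsack_top_down` and `best` are assumptions of this example, not part of the original code.

```python
from functools import lru_cache

def knapsack_top_down(W, wt, val):
    """0/1 knapsack: recursion over (items considered, remaining capacity) with a memo."""

    @lru_cache(maxsize=None)
    def best(i, w):
        # best(i, w) = maximum value using only the first i items with weight limit w
        if i == 0 or w == 0:
            return 0
        skip = best(i - 1, w)  # case 1: exclude item i-1
        if wt[i - 1] > w:
            return skip  # item i-1 does not fit, so excluding is the only option
        take = val[i - 1] + best(i - 1, w - wt[i - 1])  # case 2: include item i-1
        return max(skip, take)

    return best(len(wt), W)

print(knapsack_top_down(14, [2, 1, 3, 2, 1, 5], [3, 2, 6, 1, 3, 85]))  # prints 100
```

Unlike a full table, the memo only stores the `(i, w)` pairs that the recursion actually visits.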

In [17]:
import pandas as pd
def knapSack(W, wt, val, n):
    """
    W (int): maximum weight
    wt (list): list of weights of each item
    val (list): list of value(price) of each item
    n: number of items
    """
    K = [[0 for x in range(W + 1)] for x in range(n + 1)]
 
    # Build table K[][] in a bottom-up manner:
    # K[i][w] = best value using only the first i items with weight limit w
    for i in range(n + 1): # i (items available) forms the y axis
        for w in range(W + 1): # w (weight limit) forms the x axis
            if i == 0 or w == 0:
                K[i][w] = 0 # with no items or no capacity, the best value is 0
            elif wt[i-1] <= w: # row i corresponds to item i-1, since row 0 is the empty base case
                K[i][w] = max(val[i-1]
                          + K[i-1][w-wt[i-1]], # include item i-1
                              K[i-1][w]) # exclude item i-1
            else:
                K[i][w] = K[i-1][w] # item i-1 does not fit
 
    return K
 
 
# Driver code
val = [3, 2, 6, 1, 3, 85]
wt = [2, 1, 3, 2, 1, 5]
W = 14
n = len(val)
df = knapSack(W, wt, val, n)
for i in range(n):
    print(f"item{i+1} - wt: {wt[i]}; val: {val[i]}")
print("items available (i)/weight limit (w)")
print(pd.DataFrame(df))



item1 - wt: 2; val: 3
item2 - wt: 1; val: 2
item3 - wt: 3; val: 6
item4 - wt: 2; val: 1
item5 - wt: 1; val: 3
item6 - wt: 5; val: 85
items available (i)/weight limit (w)
   0   1   2   3   4   5   6   7   8   9   10  11  12  13   14
0   0   0   0   0   0   0   0   0   0   0   0   0   0   0    0
1   0   0   3   3   3   3   3   3   3   3   3   3   3   3    3
2   0   2   3   5   5   5   5   5   5   5   5   5   5   5    5
3   0   2   3   6   8   9  11  11  11  11  11  11  11  11   11
4   0   2   3   6   8   9  11  11  12  12  12  12  12  12   12
5   0   3   5   6   9  11  12  14  14  15  15  15  15  15   15
6   0   3   5   6   9  85  88  90  91  94  96  97  99  99  100


## Edit Distance (Sequence Alignment)

**Def**: Given two strings $S, T$, we aim to convert $S$ into $T$ by repeatedly applying the three operations below. The minimum number of operations needed to complete the conversion is the edit distance between the two strings. 
* Update: Change one character in $S$ to any given character
* Delete: Choose one character in $S$ and delete it
* Insert: Insert one character in any given position in $S$. 

Say $S=$ `logistic` and $T=$ `algorithm`.  
The solution is similar to that of the knapsack problem: we fill in a two-dimensional table, which corresponds to the following graph structure:
![](img.jpg)

The edit distance is then the length of the shortest path through this graph.  
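One way to write this shortest-path computation as a recurrence is sketched below. The symbol $d_{i,j}$ is introduced here for the edit distance between the first $i$ characters of $S$ and the first $j$ characters of $T$.

$$
d_{i,0} = i,\qquad d_{0,j} = j,\qquad
d_{i,j} =
\begin{cases}
d_{i-1,j-1} & \text{if } S_i = T_j \\
1 + \min\bigl(d_{i-1,j},\ d_{i,j-1},\ d_{i-1,j-1}\bigr) & \text{otherwise}
\end{cases}
$$

In the second case, the three terms correspond to deleting $S_i$, inserting $T_j$, and updating $S_i$ to $T_j$, respectively. The edit distance of the full strings is $d_{|S|,|T|}$.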

In [2]:
import pandas as pd
def editDistance(s,t):
    s = ' ' + s
    t = ' ' + t
    S = len(s)
    T = len(t)
    opt = [[0 for i in range(T)] for j in range(S)]
    for i in range(S):
        opt[i][0] = i
    for j in range (T):
        opt[0][j] = j
    for j in range(1,T):
        for i in range(1,S):
            if s[i] == t[j]:
                opt[i][j] = opt[i-1][j-1] # a diagonal move costs 0
            else: # take the cheapest of up, left, and upper-left
                opt[i][j] = min(opt[i-1][j], opt[i][j-1], opt[i-1][j-1]) + 1
    return opt

# test
S = "logistic"
T = "algorithm"
df = pd.DataFrame(editDistance(S, T))
df.columns = [""] + list(T)
df.index = [""] + list(S)
print(df)





      a  l  g  o  r  i  t  h  m
   0  1  2  3  4  5  6  7  8  9
l  1  1  1  2  3  4  5  6  7  8
o  2  2  2  2  2  3  4  5  6  7
g  3  3  3  2  3  3  4  5  6  7
i  4  4  4  3  3  4  3  4  5  6
s  5  5  5  4  4  4  4  4  5  6
t  6  6  6  5  5  5  5  4  5  6
i  7  7  7  6  6  6  5  5  5  6
c  8  8  8  7  7  7  6  6  6  6


## Partitioning  

Dynamic programming is useful for optimizing partitions.  
For example, segmenting a Japanese sentence into words:
> 僕 / は / 君 / を / 愛し / て / いる  


The problem can be generalized as follows:
> There is a list of $N$ elements. Each slice (interval) $[l, r)$ of this list has a score $c_{l, r}$.  
> Let $K$ be an integer such that $1 \le K\le N$. Then, take $K + 1$ integers $t_0, t_1, \dots, t_K$ such that $0 = t_0 < t_1 < \cdots < t_K = N$.   
> The total score of partitions $[t_0, t_1), [t_1, t_2), \dots, [t_{K-1}, t_K)$ is 
> $$c_{t_0,t_1} + c_{t_1, t_2} + \cdots + c_{t_{K-1}, t_K}$$
> Minimize this total score.

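Code example in C++, assuming the costs `c` are given.  
```cpp
#include <algorithm>
#include <limits>
#include <vector>
using namespace std;

// c[l][r] is the given cost of the interval [l, r); c has (N+1) x (N+1) entries.
// The function name is illustrative only.
long long minPartitionCost(int N, const vector<vector<long long>>& c) {
    const long long INF = numeric_limits<long long>::max() / 2;

    // defining DP table: dp[i] = minimum total cost of partitioning [0, i)
    vector<long long> dp(N + 1, INF);

    // initial condition: the empty prefix costs nothing
    dp[0] = 0;

    // populate DP with loop
    for (int i = 1; i <= N; ++i) {
        for (int j = 0; j < i; ++j) {
            // choose between the best cost found so far for [0, i) and the
            // optimal partition of [0, j) plus the cost of the interval [j, i)
            dp[i] = min(dp[i], dp[j] + c[j][i]);
        }
    }
    return dp[N];
}
```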

The computational complexity is $O(N^2)$.
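For consistency with the other Python cells, the same double loop might be sketched as follows. The cost function `toy_cost` is a made-up example (squared interval length plus a fixed overhead of 3 per piece) and `min_partition_cost` is a name chosen here; neither comes from the original problem.

```python
import math

def min_partition_cost(n, cost):
    """dp[i] = minimum total cost of partitioning the prefix [0, i) into intervals."""
    dp = [math.inf] * (n + 1)
    dp[0] = 0  # the empty prefix costs nothing
    for i in range(1, n + 1):
        for j in range(i):
            # the last interval is [j, i); the rest is an optimal partition of [0, j)
            dp[i] = min(dp[i], dp[j] + cost(j, i))
    return dp[n]

def toy_cost(l, r):
    # hypothetical cost: long pieces are penalized quadratically, each piece costs 3 extra
    return (r - l) ** 2 + 3

print(min_partition_cost(4, toy_cost))  # prints 14 (two pieces of length 2)
```

With this toy cost, splitting 4 elements into two pieces of length 2 costs $2 \times (4 + 3) = 14$, which beats both a single piece ($19$) and four singletons ($16$).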