In [2]:
# setup
from IPython.display import display, HTML
display(HTML('<style>.prompt{width: 0px; min-width: 0px; visibility: collapse}</style>'))
display(HTML(open('rise.css').read()))

# imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set(style="whitegrid", font_scale=1.5, rc={'figure.figsize':(12, 6)})




# CMPS 2200
# Introduction to Algorithms

## Dynamic Programming - Edit Distance


### Target: Finish your meal.

- **Solution 1**: Eat the starter, main course and dessert all at once without following any order. Getting food into your stomach is the only thought you have.

- **Solution 2**: Eat the starter first to stimulate your appetite and now you can enjoy main course better. After finishing the starters, go on to the main courses. And now further for dessert!

Consider the following Python code to compute the minimum element in a list:

```python 
def my_min(a):
    S = a[0]
    for i in range(len(a)):
        S = min(S, a[i])
    return(S)
```
What algorithmic paradigm are we using here?

| Feature             | **Greedy Algorithm**                                                         | **Dynamic Programming**                                                                                            |
| ------------------- | ---------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| **Approach**        | Make the best local (immediate) choice, hoping it leads to a global optimum. | Explore all subproblems, store their results, and combine them to find a global optimum.                           |
| **Key Idea**        | *Locally optimal → globally optimal*                                         | *Optimal substructure + overlapping subproblems*                                                                   |
| **Decision Making** | One-shot choice at each step                                                 | Exhaustive, but efficient reuse via memoization or tabulation                                                      |
| **Speed**           | Usually faster (O(n log n), O(n))                                            | Often slower (O(n²), O(n·W), etc.)                                                                                 |
| **Memory Usage**    | Low                                                                          | Higher (tables or recursion caches)                                                                                |
| **When It Works**   | Problem has **greedy-choice property** and **optimal substructure**          | Problem has **optimal substructure** and **overlapping subproblems**, but *not necessarily greedy-choice property* |


### Edit Distance

Given two strings $S, T \in \Sigma^*$, how similar are they?

We can measure this using **edit distance**, which is the number of insertions and deletions (and, in some variants, substitutions) needed to turn $S$ into $T$. Note that we can also go from $T$ to $S$ by reversing the edits: each insertion becomes a deletion and vice versa.

Example: $S$ = `abcdefghijkl`, $T$ = `abcdghikjl`. How many edits are needed?

Consider the following edit sequence:

$S$: `abcdefghijkl---`<br>
$T$: `abcd--ghi---kjl`

This has 5 deletions and 3 insertions, for a total of 8 edits. What about this one:

$S$: `abcdefghijk-l`<br>
$T$: `abcd--ghi-kjl`

We have 3 deletions and 1 insertion for a total of 4 edits.
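
The two alignments above can be checked mechanically. The sketch below assumes an alignment is written as two equal-length strings with `-` marking a gap; the helper name `count_edits` is ours for illustration and is not part of the course code.

```python
def count_edits(aligned_s, aligned_t, gap="-"):
    """Count (deletions, insertions) in a gapped alignment of S and T."""
    assert len(aligned_s) == len(aligned_t), "aligned strings must have equal length"
    deletions = insertions = 0
    for a, b in zip(aligned_s, aligned_t):
        if a != gap and b == gap:
            deletions += 1    # a character of S with no partner in T
        elif a == gap and b != gap:
            insertions += 1   # a character of T with no partner in S
        else:
            assert a == b     # paired characters must match (no substitutions)
    return deletions, insertions

print(count_edits("abcdefghijkl---", "abcd--ghi---kjl"))  # (5, 3)
print(count_edits("abcdefghijk-l", "abcd--ghi-kjl"))      # (3, 1)
```

Any valid alignment gives an upper bound on the edit distance; the minimum over all alignments is the quantity we want to compute.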

Our goal is to compute the **minimum edit distance** between two strings $S$ and $T$ of lengths $m$ and $n$, respectively.

It might seem like a toy problem, but it is a critical problem in comparing gene/protein sequences, and also in online tools (e.g., Git/Overleaf/Google Docs). By attaching weights to insertions and deletions, we can assess the evolutionary distance between two sequences.



Notice that once again, if we greedily apply edits to the beginning or end of the string we might miss a set of edits interspersed throughout the string. 


**Can we identify an optimal substructure property for this problem?**

<br>

Let's use case-based reasoning about the optimal solution as we did for Knapsack. Let $\color{red}{\mathit{MED}(S, T)}$ be the optimal number of edits between $S$ and $T$. 

<br><br><br>

In an optimal sequence of edits, how would we deal with the first character of each string, $S[0]$ and $T[0]$?

<br><br>

For the base cases, if $S$ is empty and $T$ is not, what is the edit cost?  
S=` ` <br>
T=`abcde`

<br><br><br>


If either string is empty, then the edit cost is simply the length of the other string.

<br><br>

What if $S[0] = T[0]$?  
S=`abc`<br>
T=`ade`

<br><br>

then there is no benefit to editing and $\mathit{MED}(S, T) = \mathit{MED}(S[1:], T[1:])$. 

<br><br>
What if $S[0] \neq T[0]$?  
S=`abc` <br>
T=`bde`

<br><br><br>
then we must incur 1 edit, either an `insertion` or a `deletion`. The two options are `Delete S[0] from S` and `Insert T[0] into S`; we take whichever leads to fewer total edits.

$\rightarrow 1+\mathit{MED}(S[1:], T)~~~~$    e.g., $1+\mathit{MED}($ `bc` , `bde` $)$  
or   
$\rightarrow 1+\mathit{MED}(S, T[1:])~~~~$  e.g., $1+\mathit{MED}($ `abc` , `de` $)$  

<br>

**Optimal Substructure for Edit Distance**: Let $S$ and $T$ be strings of length $m$ and $n$. When both are nonempty, the optimal substructure is:

$$\mathit{MED}(S, T) = 
\begin{cases}
\mathit{MED}(S[1:], T[1:]), \mbox{if}~~~S[0]=T[0] \\
1+\min\{\mathit{MED}(S[1:], T),\mathit{MED}(S, T[1:])\}, \mbox{otherwise} \\
\end{cases}
$$

Just as with Knapsack, the recursion tree for this recurrence has an exponential number of nodes. How many nodes are there, and what is the depth? 

<!-- <br><br>
<br><br>
<br><br>
Note that if we allow `substitutions`, then we can easily replace `S[0]` with `T[0]`, in this case, we have another solution.
$\rightarrow 1+\mathit{MED}(S[1:], T[1:])~~~~$e.g, S = `abc`, T = `dbc`.  <span style="color:blue">**You will see this in Assignment 5 - Question 3.**</span>

 -->




The recursion tree has $O(2^{m+n})$ nodes and depth $O(m+n)$. Are there shared subproblems?
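
One way to get a feel for this growth is to count the calls made by the plain recursion. The sketch below is an illustration under our own naming (`count_calls` is a hypothetical helper); it uses inputs with no characters in common, which force two recursive calls at every step.

```python
def count_calls(S, T):
    """Return (edit distance, number of recursive calls) for the plain recursion."""
    if not S:
        return len(T), 1
    if not T:
        return len(S), 1
    if S[0] == T[0]:
        d, c = count_calls(S[1:], T[1:])
        return d, c + 1
    d1, c1 = count_calls(S[1:], T)   # delete S[0]
    d2, c2 = count_calls(S, T[1:])   # insert T[0]
    return 1 + min(d1, d2), c1 + c2 + 1

# Strings with no shared characters: every call branches twice.
for k in range(1, 9):
    distance, calls = count_calls("a" * k, "b" * k)
    print(k, distance, calls)
```

The call count grows much faster than the string length, even though the strings involved at each node are always suffixes of the two inputs.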

For $S$=`ABC` and $T$=`DBC` we have the following DAG:

<img src="figures\edit_distance_dag.jpg" width="60%">

How much sharing is possible? In other words, how many distinct subproblems are there?

Every recursive call removes the first character from $S$, from $T$, or from both, so every subproblem is a pair of suffixes $(S[i:], T[j:])$ with $0 \le i \le m$ and $0 \le j \le n$. There are therefore at most $(m+1)(n+1) = O(mn)$ distinct subproblems, each of which can be computed in $O(1)$ time (if we have precomputed the necessary dependencies). The longest path in the recursion DAG is $O(m+n)$.
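
To make the count of distinct subproblems concrete, here is a minimal sketch that memoizes on suffix start indices `(i, j)` instead of on the strings themselves, using `functools.lru_cache` from the standard library. The names `med_indexed` and `solve` are assumptions made for this illustration.

```python
from functools import lru_cache

def med_indexed(S, T):
    """Edit distance (insertions/deletions only), memoized on suffix indices."""
    m, n = len(S), len(T)

    @lru_cache(maxsize=None)
    def solve(i, j):
        # solve(i, j) corresponds to MED(S[i:], T[j:])
        if i == m:
            return n - j
        if j == n:
            return m - i
        if S[i] == T[j]:
            return solve(i + 1, j + 1)
        return 1 + min(solve(i + 1, j), solve(i, j + 1))

    distance = solve(0, 0)
    distinct = solve.cache_info().currsize  # distinct (i, j) pairs actually evaluated
    return distance, distinct

S, T = "abcdefghijkl", "abcdghikjl"
distance, distinct = med_indexed(S, T)
print(distance, distinct, (len(S) + 1) * (len(T) + 1))
```

The number of evaluated pairs can never exceed $(m+1)(n+1)$, and it is often smaller because matching characters skip past some pairs entirely.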



### What are the pros and cons of top-down and bottom-up?

<img src="figures\Dynamic-Programming.png" width="70%">






### 🧠 Top-Down DP (Memoization)


✅ Pros

- Easier to write and understand – Follows the natural recursive structure of the problem.
$\rightarrow$ Great for problems that are naturally recursive (e.g., Fibonacci).

- Solves only necessary subproblems – Doesn’t compute states you don’t need.
$\rightarrow$ Useful when many possible subproblems exist but only a few are relevant.

- Quick to implement – Especially with recursion + caching (e.g., using a dictionary or array).

❌ Cons

- Function call overhead – Recursive calls add stack frames, which increases overhead.
$\rightarrow$ Can be an issue for large input sizes.

- Risk of stack overflow – Deep recursion (like in long chains) can exceed call stack limits.

- More difficult to control order of computation – May be harder to optimize memory usage.

###  ⚙️ Bottom-Up DP (Tabulation)

✅ Pros

- No recursion overhead – Uses loops, so no stack issues.

- Can be more space-optimized – You can often keep only a few rows or variables instead of the entire table.

- Deterministic execution order – Easier to reason about time and memory usage.

❌ Cons

- Computes all subproblems – Even those not needed for the final answer. $\rightarrow$ Less efficient if only a subset of subproblems is relevant.

- Harder to implement – Requires careful thought about the correct iteration order.

- Less intuitive – Especially for problems that are naturally recursive (like trees or graphs).
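
As an illustration of the space-optimization point for bottom-up DP, the sketch below computes the insertion/deletion edit distance while keeping only two rows of the table. It assumes the suffix formulation used in these notes, where entry `j` of a row stands for $\mathit{MED}(S[i:], T[j:])$; the function name `med_two_rows` is hypothetical.

```python
def med_two_rows(S, T):
    """Edit distance (insertions/deletions only) using O(n) extra space."""
    m, n = len(S), len(T)
    # below[j] = MED(S[i+1:], T[j:]); initially i + 1 == m, so S[m:] is empty
    below = [n - j for j in range(n + 1)]
    for i in range(m - 1, -1, -1):
        cur = [0] * (n + 1)
        cur[n] = m - i  # T[n:] is empty: delete the remaining characters of S
        for j in range(n - 1, -1, -1):
            if S[i] == T[j]:
                cur[j] = below[j + 1]
            else:
                cur[j] = 1 + min(below[j], cur[j + 1])
        below = cur  # the row just computed becomes the row "below" the next one
    return below[0]

print(med_two_rows("abcdefghijkl", "abcdghikjl"))
print(med_two_rows("kitten", "sitting"))
```

The trade-off is that the discarded rows can no longer be used to trace back which edits were made, only how many.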

In [1]:
def MED(S, T):
    #print("S:%s, T:%s" % (S, T))
    if (S == ""):
        return(len(T))
    elif (T == ""):
        return(len(S))
    else:
        if (S[0] == T[0]):
            return(MED(S[1:], T[1:]))
        else:
            return(1 + min(MED(S, T[1:]), MED(S[1:], T)))

## Case 1
S= "abcdefghijkl"
T= "abcdghikjl"
print(MED(S, T))

## Case 2
S0 = 'kitten'
T0 = 'sitting'
print(MED(S0, T0))

4
5


In [3]:
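def med_top_down(S, T, MED={}):
    # MED caches results keyed by (S, T). The default dict is created once and
    # shared across calls, so cached results persist between top-level calls.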
    if (S, T) in MED:
        return MED[(S, T)]
    
    if not S:
        return len(T)
    if not T:
        return len(S)

    if S[0] == T[0]:  # If first characters are the same, move to the next
        MED[(S, T)] = med_top_down(S[1:], T[1:], MED)
    else:
        insert = med_top_down(S, T[1:], MED) + 1  # Insert a character
        delete = med_top_down(S[1:], T, MED) + 1  # Delete a character
        MED[(S, T)] = min(insert, delete)
    
    return MED[(S, T)]

## Case 1
S= "abcdefghijkl"
T= "abcdghikjl"
print("Case 1:", med_top_down(S, T))

## Case 2
S = 'kitten'
T = 'sitting'
print("Case 2:", med_top_down(S, T))


Case 1: 4
Case 2: 5


### Top-down Design

```python

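def med_top_down(S, T, MED={}):
    # MED caches results keyed by (S, T). The default dict is created once and
    # shared across calls, so cached results persist between top-level calls.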
    if (S, T) in MED:
        return MED[(S, T)]
    
    if not S:
        return len(T)
    if not T:
        return len(S)

    if S[0] == T[0]:  # If first characters are the same, move to the next
        MED[(S, T)] = med_top_down(S[1:], T[1:], MED)
    else:
        insert = med_top_down(S, T[1:], MED) + 1  # Insert a character
        delete = med_top_down(S[1:], T, MED) + 1  # Delete a character
        MED[(S, T)] = min(insert, delete)
    
    return MED[(S, T)]

## Case 1
S= "abcdefghijkl"
T= "abcdghikjl"
print("Case 1:", med_top_down(S, T))

## Case 2
S = 'kitten'
T = 'sitting'
print("Case 2:", med_top_down(S, T))

```

### Bottom-up Design: 2D Array <span style="color:red">of size $(n+1)\times(m+1)$</span>

$$\mathit{MED}(S, T) = 
\begin{cases}
\mathit{MED}(S[1:], T[1:]), \mbox{if}~~~S[0]=T[0] \\
1+\min\{\mathit{MED}(S[1:], T),\mathit{MED}(S, T[1:])\}, \mbox{otherwise} \\
\end{cases}
$$

<img src="figures\med_table.png" width="60%">


<img src="figures\med_table_rs.png" width="60%">

<img src="figures\med_table_rs_path.png" width="60%">
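
A minimal sketch of the full-table version follows. It assumes rows index suffixes of $S$ and columns index suffixes of $T$, so `table[i][j]` holds $\mathit{MED}(S[i:], T[j:])$; the figures may use the transposed orientation. The name `med_bottom_up` is ours.

```python
def med_bottom_up(S, T):
    """Fill the whole table bottom-up; return (edit distance, table)."""
    m, n = len(S), len(T)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    # Base cases: one of the two suffixes is empty.
    for i in range(m + 1):
        table[i][n] = m - i
    for j in range(n + 1):
        table[m][j] = n - j
    # Fill from the bottom-right corner so every dependency is already computed.
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if S[i] == T[j]:
                table[i][j] = table[i + 1][j + 1]
            else:
                table[i][j] = 1 + min(table[i + 1][j], table[i][j + 1])
    return table[0][0], table

distance, table = med_bottom_up("ABC", "DBC")
print(distance)
for row in table:
    print(row)
```

The iteration order matters: each entry reads the entry below, the entry to the right, and the diagonal entry below-right, so rows and columns are both traversed from the end toward the start.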

$$
v(\text{OPT}([n], W)) = 
\max \Big\{
\color{blue}{v(n) + v(\text{OPT}([n-1], W - w(n)))} ,\;\;
\color{red}{v(\text{OPT}([n-1], W))}
\Big\}.
$$ 



<img src="figures\0-1Quiz.png" width="24%">

### Bottom-up

$$
\begin{array}{c|cccccccccccc}
i \backslash W & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 \\ \hline
1 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
2 & 0 & 1 & 6 & 7 & 7 & 7 & 7 & 7 & 7 & 7 & 7 & 7 \\
3 & 0 & 1 & 6 & 7 & 7 & 18 & 19 & 24 & 25 & 25 & 25 & 25 \\
4 & 0 & 1 & 6 & 7 & 7 & 18 & 22 & 24 & 28 & 29 & 29 & 40 \\
5 & 0 & 1 & 6 & 7 & 7 & 18 & 22 & 28 & 29 & 34 & 35 & 40 \\
\end{array}
$$
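
The table above can be reproduced with a short bottom-up sketch. The item values and weights used here are an assumption: they are not stated in the text, and were chosen because they generate exactly the rows shown. The function name `knapsack_table` is also ours.

```python
def knapsack_table(values, weights, W):
    """Return M where M[i][w] is the best value using items 1..i with capacity w."""
    n = len(values)
    M = [[0] * (W + 1) for _ in range(n + 1)]  # row 0: no items, value 0
    for i in range(1, n + 1):
        v, wt = values[i - 1], weights[i - 1]
        for w in range(W + 1):
            M[i][w] = M[i - 1][w]                             # skip item i
            if wt <= w:
                M[i][w] = max(M[i][w], v + M[i - 1][w - wt])  # take item i
    return M

# Assumed instance (inferred from the table, not given in the notes).
values = [1, 6, 18, 22, 28]
weights = [1, 2, 5, 6, 7]
M = knapsack_table(values, weights, 11)
for i in range(1, len(values) + 1):
    print(i, M[i])
```

Every one of the $5 \times 12$ entries is filled in, whether or not it contributes to the final answer `M[5][11]`.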



### Top-down

- `(1, 0)`: 0
- `(1, 2)`: 1
- `(1, 3)`: 1
- `(1, 4)`: 1
- `(1, 5)`: 1
- `(1, 6)`: 1
- `(1, 9)`: 1
- `(1, 11)`: 1
- `(2, 0)`: 0
- `(2, 4)`: 7
- `(2, 5)`: 7
- `(2, 6)`: 7
- `(2, 11)`: 7
- `(3, 4)`: 7
- `(3, 5)`: 18
- `(3, 11)`: 25
- `(4, 4)`: 7
- `(4, 11)`: 40
- `(5, 11)`: 40


$$
\begin{array}{c|cccccccccccc}
i \backslash W & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 \\ \hline
1 & 0 &   & 1 & 1 & 1 & 1 & 1 &   &   & 1 &   & 1 \\
2 & 0 &   &   &   & 7 & 7 & 7 &   &   &   &   & 7 \\
3 &   &   &   &   & 7 & 18 &   &   &   &   &   & 25 \\
4 &   &   &   &   & 7 &   &   &   &   &   &   & 40 \\
5 &   &   &   &   &   &   &   &   &   &   &   & 40 \\
\end{array}
$$
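
The sparse table can be reproduced by memoizing the recursion and inspecting which states were stored. As before, the item values and weights are an assumption inferred from the table entries, and `knapsack_top_down` is a hypothetical name; the sketch stores a state for every call with $i \ge 1$ and treats $i = 0$ as the base case.

```python
def knapsack_top_down(values, weights, W):
    """Return (best value, memo) where memo maps (i, w) to the best value."""
    memo = {}

    def opt(i, w):
        if i == 0:
            return 0  # no items left
        if (i, w) not in memo:
            best = opt(i - 1, w)  # skip item i
            if weights[i - 1] <= w:
                # take item i
                best = max(best, values[i - 1] + opt(i - 1, w - weights[i - 1]))
            memo[(i, w)] = best
        return memo[(i, w)]

    return opt(len(values), W), memo

# Assumed instance (inferred from the table, not given in the notes).
values = [1, 6, 18, 22, 28]
weights = [1, 2, 5, 6, 7]
best, memo = knapsack_top_down(values, weights, 11)
print(best, len(memo))
for state in sorted(memo):
    print(state, memo[state])
```

Only the states reachable from `(5, 11)` are ever computed, which is why most cells of the table stay empty.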
