In [22]:
import numpy as np

#### The `Minimum Edit Distance` is defined as the minimum total cost of operations required to transform a source string into a given target string. We consider three operations: `insertion` of a character into the source string, `deletion` of a character from the source string, and `substitution` of a character in the source string with any character from the vocabulary. We assign a cost of 1 to each insertion or deletion and a cost of 2 to each substitution (a substitution can be thought of as a deletion followed by an insertion); substituting a character with itself has zero cost.
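
#### The cost scheme above can be written as three small functions. This is only a minimal sketch: the names `insert_cost`, `del_cost` and `substitute_cost` are chosen here for illustration and are not called by the implementation later in this notebook.

```python
def insert_cost(ch):
    # inserting any character costs 1
    return 1

def del_cost(ch):
    # deleting any character costs 1
    return 1

def substitute_cost(src, tgt):
    # replacing a character with itself is free; otherwise it costs 2
    return 0 if src == tgt else 2

print(insert_cost("a"), del_cost("a"), substitute_cost("a", "b"), substitute_cost("a", "a"))
# prints: 1 1 2 0
```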

#### The minimum edit distance between two strings can be computed using `dynamic programming`. Given a string `X` of length `n` and a string `Y` of length `m`, we define `D[i,j]` as the minimum edit distance between `X[0:i]` and `Y[0:j]`, i.e. the prefixes containing the first i characters of X and the first j characters of Y (note that X[0:0] and Y[0:0] are the empty string). Dynamic programming composes the solutions to these subproblems to compute the edit distance between X and Y, which is given by `D[n,m]`. We first note the base cases: D[i,0] = i (transforming the first i characters of X into the empty target requires i delete operations, hence cost i) and D[0,j] = j (transforming the empty source into the first j characters of Y requires j insert operations, hence cost j). The following recurrence relation then allows us to compute all other values of D[i,j], starting from the base cases:

#### $D[i,j] = \min(D[i-1,j] + \mathrm{del\_cost}(X[i]),\ D[i,j-1] + \mathrm{insert\_cost}(Y[j]),\ D[i-1,j-1] + \mathrm{substitute\_cost}(X[i],Y[j]))$
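
#### As a worked instance, assume insert/delete cost 1 and substitute cost 2, and take the example pair X = "intention", Y = "execution" used later in this notebook. The first interior cell compares "i" with "e", which differ, so all three paths tie:

#### $D[1,1] = \min(D[0,1] + 1,\ D[1,0] + 1,\ D[0,0] + 2) = \min(2, 2, 2) = 2$

#### For a matching pair, such as the first character of X ("i") and the seventh character of Y ("i"), the substitution term adds nothing and the diagonal path wins:

#### $D[1,7] = \min(D[0,7] + 1,\ D[1,6] + 1,\ D[0,6] + 0) = \min(8, 8, 6) = 6$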

#### Note that this equation takes the minimum over three possible paths through the D matrix. Intuitively, suppose we already know the costs D[i-1,j], D[i,j-1] and D[i-1,j-1]. First, we can delete the ith character of X (X[i], counting from 1) from X[0:i] to get X[0:i-1] and then transform that into Y[0:j], which costs D[i-1,j]; the total cost is D[i-1,j] plus the cost of that one deletion. Second, we can transform X[0:i] into Y[0:j-1] and then insert the jth character of Y (Y[j]), which costs D[i,j-1] plus the cost of that one insertion. Third, we can transform the first (i-1) characters of X, i.e. X[0:i-1], into Y[0:j-1] and then substitute the ith character of X (X[i]) with the jth character of Y (Y[j]), which costs D[i-1,j-1] plus the cost of the substitution (zero if the two characters are equal). The optimal cost D[i,j] is the minimum of these three possibilities.
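
#### The three-way minimum can also be written directly as a recursive function with memoization, which is a convenient cross-check for a table-based implementation. This is a sketch under the cost scheme assumed in this notebook (insert/delete 1, substitute 2); the name `edit_distance_recursive` is hypothetical.

```python
from functools import lru_cache

def edit_distance_recursive(x, y):
    """Top-down evaluation of the recurrence (insert/delete cost 1, substitute cost 2)."""
    @lru_cache(maxsize=None)
    def d(i, j):
        # base cases: one of the two prefixes is empty
        if i == 0:
            return j          # j insertions
        if j == 0:
            return i          # i deletions
        # x[i-1] is the ith character of x in 1-based terms
        sub = 0 if x[i-1] == y[j-1] else 2
        return min(d(i-1, j) + 1,        # delete x[i-1]
                   d(i, j-1) + 1,        # insert y[j-1]
                   d(i-1, j-1) + sub)    # substitute or keep
    return d(len(x), len(y))

print(edit_distance_recursive("intention", "execution"))  # prints 8
```

#### Memoization keeps the number of evaluated subproblems at (n+1)(m+1), the same as the size of the D matrix.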


In [39]:
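def minimum_edit_dist(s1, s2):
    """Print the DP table and parent pointers; return the minimum edit distance from s1 to s2."""
    n = len(s1)
    m = len(s2)
    # D[i,j] = minimum edit distance between s1[0:i] and s2[0:j]
    D = np.zeros(shape=(n+1,m+1))
    # base case initialization
    D[0,:] = np.arange(0,m+1)
    D[:,0] = np.arange(0,n+1)
    # parent_pointers[(i,j)] lists every predecessor cell that attains the minimum
    parent_pointers = {}
    for i in range(1,n+1):
        for j in range(1,m+1):
            c1 = D[i-1,j] + 1   # delete s1[i-1]
            c2 = D[i,j-1] + 1   # insert s2[j-1]
            # substitute (cost 2), or no cost if the characters already match
            c3 = D[i-1,j-1] + 2 if (s1[i-1] != s2[j-1]) else D[i-1,j-1]
            costs = [c1,c2,c3]
            parents = [(i-1,j), (i,j-1), (i-1,j-1)]
            min_cost = min(costs)
            D[i,j] = min_cost
            parent_pointers[(i,j)] = [p for p, c in zip(parents, costs) if c == min_cost]

    # get the minimum edit distance
    min_dist = D[n,m]
    print("D = ")
    print(D)
    print(f"Parent pointers: {parent_pointers}")
    return min_dist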
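

#### Parent pointers are useful because they let us recover one optimal sequence of edits by walking back from cell (n, m) to (0, 0). The sketch below illustrates that idea in a self-contained way. It is not the function above: the helper name `edit_script` is hypothetical, it uses plain lists instead of NumPy, and it re-derives a single minimizing predecessor from the table at each step rather than storing every tie in a dictionary.

```python
def edit_script(s1, s2):
    """Return (distance, operations) for one optimal alignment of s1 to s2."""
    n, m = len(s1), len(s2)
    # fill the table with the same costs: insert/delete 1, substitute 2
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i
    for j in range(m + 1):
        D[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if s1[i-1] == s2[j-1] else 2
            D[i][j] = min(D[i-1][j] + 1, D[i][j-1] + 1, D[i-1][j-1] + sub)

    # walk back from (n, m) to (0, 0), following one minimizing predecessor per cell
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 0 if s1[i-1] == s2[j-1] else 2
            if D[i][j] == D[i-1][j-1] + sub:
                ops.append(("keep" if sub == 0 else "substitute", s1[i-1], s2[j-1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and D[i][j] == D[i-1][j] + 1:
            ops.append(("delete", s1[i-1], ""))
            i -= 1
        else:
            ops.append(("insert", "", s2[j-1]))
            j -= 1
    ops.reverse()  # operations were collected from the end of the strings
    return D[n][m], ops

dist, ops = edit_script("intention", "execution")
print(dist)  # prints 8
for op in ops:
    print(op)
```

#### When several predecessors tie, this walk prefers the diagonal move, so it returns only one of possibly many optimal edit sequences.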


In [40]:
minimum_edit_dist("intention", "execution")

D = 
[[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9.]
 [ 1.  2.  3.  4.  5.  6.  7.  6.  7.  8.]
 [ 2.  3.  4.  5.  6.  7.  8.  7.  8.  7.]
 [ 3.  4.  5.  6.  7.  8.  7.  8.  9.  8.]
 [ 4.  3.  4.  5.  6.  7.  8.  9. 10.  9.]
 [ 5.  4.  5.  6.  7.  8.  9. 10. 11. 10.]
 [ 6.  5.  6.  7.  8.  9.  8.  9. 10. 11.]
 [ 7.  6.  7.  8.  9. 10.  9.  8.  9. 10.]
 [ 8.  7.  8.  9. 10. 11. 10.  9.  8.  9.]
 [ 9.  8.  9. 10. 11. 12. 11. 10.  9.  8.]]
Parent pointers: {(1, 1): [(0, 1), (1, 0), (0, 0)], (1, 2): [(0, 2), (1, 1), (0, 1)], (1, 3): [(0, 3), (1, 2), (0, 2)], (1, 4): [(0, 4), (1, 3), (0, 3)], (1, 5): [(0, 5), (1, 4), (0, 4)], (1, 6): [(0, 6), (1, 5), (0, 5)], (1, 7): [(0, 6)], (1, 8): [(1, 7)], (1, 9): [(1, 8)], (2, 1): [(1, 1), (2, 0), (1, 0)], (2, 2): [(1, 2), (2, 1), (1, 1)], (2, 3): [(1, 3), (2, 2), (1, 2)], (2, 4): [(1, 4), (2, 3), (1, 3)], (2, 5): [(1, 5), (2, 4), (1, 4)], (2, 6): [(1, 6), (2, 5), (1, 5)], (2, 7): [(1, 7)], (2, 8): [(1, 8), (2, 7), (1, 7)], (2, 9): [(1, 8)], (3, 1): [(2, 1