# DP: Edit Distance Algorithm

The edit distance algorithm solves the string-to-string correction problem: it computes the minimum number of edit operations required to transform one word into another. Each operation is an insertion, a deletion, or a substitution of a single character.
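As a rough illustration of the three operations, the sketch below applies each one to the example word "LIKE" using plain string slicing. The variable names are illustrative only and are not used elsewhere in this notebook.

```python
# Illustrative only: one example of each edit operation on "LIKE".
word = "LIKE"

inserted = word[:1] + "O" + word[1:]     # insert "O" at index 1  -> "LOIKE"
deleted = word[:1] + word[2:]            # delete the char at 1   -> "LKE"
substituted = word[:1] + "O" + word[2:]  # replace "I" with "O"   -> "LOKE"

# A second substitution ("K" -> "V") completes the transformation to "LOVE".
target = substituted[:2] + "V" + substituted[3:]

print(inserted, deleted, substituted, target)
```

Two substitutions are enough to turn "LIKE" into "LOVE", so the edit distance between the two words is at most 2.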

In [1]:
import numpy as np

### Initialize

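Each entry E[i][j] will hold the edit distance between the first i characters of word1 and the first j characters of word2. Because the algorithm takes a minimum over candidate costs, every entry starts at infinity. The first row and first column are then set to the base cases: converting a prefix of length i to or from the empty string takes exactly i edits.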

In [2]:
word1 = list("LIKE")
word2 = list("LOVE")

In [3]:
E = np.full((len(word1)+1, len(word2)+1), float("inf"))

In [4]:
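# Base cases: converting a prefix to or from the empty string.
for i in range(len(word1)+1):
    E[i][0] = i  # delete the first i characters of word1
for j in range(len(word2)+1):
    E[0][j] = j  # insert the first j characters of word2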

In [5]:
print(E)

[[  0.   1.   2.   3.   4.]
 [  1.  inf  inf  inf  inf]
 [  2.  inf  inf  inf  inf]
 [  3.  inf  inf  inf  inf]
 [  4.  inf  inf  inf  inf]]


### Execute

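The table is filled row by row, with each entry computed from its three already-filled neighbors. Once the table is complete, the bottom-right entry, E[len(word1)][len(word2)], is the edit distance between the two words.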
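The loop in this section can be read as the following recurrence, written here as a sketch under the assumption of unit cost for every operation. The symbols $a$ and $b$ stand for word1 and word2, with 1-based character positions $a_i$ and $b_j$; this notation is introduced for illustration and does not appear in the code.

$$
E[i][0] = i, \qquad E[0][j] = j,
$$

$$
E[i][j] = \min
\begin{cases}
E[i-1][j-1] + c(i, j) & \text{(substitute or keep)} \\
E[i-1][j] + 1 & \text{(delete } a_i \text{)} \\
E[i][j-1] + 1 & \text{(insert } b_j \text{)}
\end{cases}
\qquad
c(i, j) =
\begin{cases}
0 & \text{if } a_i = b_j \\
1 & \text{otherwise}
\end{cases}
$$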

In [6]:
for i in range(1, len(word1)+1):
    for j in range(1, len(word2)+1):
        # Substitution costs nothing when the current characters already match.
        cost = 0 if word1[i-1] == word2[j-1] else 1
        E[i][j] = min(E[i-1][j-1] + cost,  # substitute (or keep)
                      E[i-1][j] + 1,       # delete from word1
                      E[i][j-1] + 1)       # insert into word1

In [7]:
print(E)

[[ 0.  1.  2.  3.  4.]
 [ 1.  0.  1.  2.  3.]
 [ 2.  1.  1.  2.  3.]
 [ 3.  2.  2.  2.  3.]
 [ 4.  3.  3.  3.  2.]]


In [8]:
print("Edit Distance:", E[len(word1)][len(word2)])

Edit Distance: 2.0
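The cells above can be wrapped into a reusable function. The sketch below is a variation, not the notebook's own code: it keeps only two rows of the table instead of the full NumPy array, and it works with integers, so the result prints as `2` rather than `2.0`. The function name `edit_distance` and its parameter names are hypothetical.

```python
def edit_distance(source, target):
    """Return the edit distance between two sequences, assuming unit costs."""
    # Row 0 of the table: building each prefix of target from the empty string.
    previous = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        # Column 0: deleting the first i characters of source.
        current = [i]
        for j, t in enumerate(target, start=1):
            cost = 0 if s == t else 1
            current.append(min(previous[j - 1] + cost,  # substitute (or keep)
                               previous[j] + 1,         # delete from source
                               current[j - 1] + 1))     # insert into source
        previous = current
    # The last entry of the final row is the bottom-right cell of the table.
    return previous[-1]


print("Edit Distance:", edit_distance("LIKE", "LOVE"))
```

Keeping two rows reduces memory from O(len(word1) * len(word2)) to O(len(word2)), at the cost of no longer having the full table available for inspection.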
