# Levenshtein Distance (Edit Distance Problem)

Levenshtein distance is the minimum number of single-character edits (insertions, deletions and substitutions) required to change one word into another. Each operation has unit cost.

Input: two strings, X and Y

For example: X = kitten; Y = sitten

To convert X to Y, we replace k with s at a cost of 1, and no cheaper sequence of edits exists because the strings differ. So the Levenshtein distance here is 1.
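The definition above can be written almost directly as a recursion over prefixes. The following is a minimal sketch, not the notebook's own method: the name `levenshtein_recursive` and the inner helper `dist` are hypothetical, and `functools.lru_cache` is used only to avoid recomputing subproblems.

```python
from functools import lru_cache


def levenshtein_recursive(x: str, y: str) -> int:
    """Edit distance computed straight from the definition (illustrative helper)."""

    @lru_cache(maxsize=None)
    def dist(i: int, j: int) -> int:
        # dist(i, j) is the distance between the prefixes x[:i] and y[:j]
        if i == 0:
            return j  # insert the j characters of y[:j]
        if j == 0:
            return i  # delete the i characters of x[:i]
        if x[i - 1] == y[j - 1]:
            return dist(i - 1, j - 1)  # last characters match, no cost
        return 1 + min(
            dist(i - 1, j),      # delete x[i-1]
            dist(i, j - 1),      # insert y[j-1]
            dist(i - 1, j - 1),  # substitute x[i-1] with y[j-1]
        )

    return dist(len(x), len(y))


print(levenshtein_recursive("kitten", "sitten"))  # prints 1
```

This form is convenient for short words, but it recurses once per character, so long inputs can hit Python's recursion limit. A table filled iteratively avoids that.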

In [21]:
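def get_levenshtein_distance(X, Y):
    """Return the full DP table; the distance is result[len(Y)][len(X)]."""
    # result[i][j] = edit distance between the prefixes Y[:i] and X[:j]
    result = [[0 for _ in range(len(X) + 1)] for _ in range(len(Y) + 1)]

    # Base cases: converting to or from an empty prefix costs its length
    for i in range(len(Y) + 1):
        result[i][0] = i
    for j in range(len(X) + 1):
        result[0][j] = j

    # Fill the table row by row (over Y), column by column (over X)
    for i in range(1, len(Y) + 1):
        for j in range(1, len(X) + 1):
            if Y[i - 1] == X[j - 1]:
                # Characters match: no extra cost
                result[i][j] = result[i - 1][j - 1]
            else:
                # One edit plus the cheapest of the three neighbouring subproblems
                result[i][j] = 1 + min(result[i - 1][j], result[i][j - 1], result[i - 1][j - 1])

    return result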

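Each row of the table depends only on the row directly above it, so when only the distance is needed (and not the whole table) two rows are enough. The sketch below assumes exactly that use case; the name `levenshtein_two_rows` is hypothetical and not part of the notebook.

```python
def levenshtein_two_rows(x: str, y: str) -> int:
    """Edit distance using O(len(x)) extra memory (illustrative helper)."""
    # previous[j] is the distance between the empty prefix of y and x[:j]
    previous = list(range(len(x) + 1))

    for i in range(1, len(y) + 1):
        # current[0] is the cost of building y[:i] from an empty string
        current = [i] + [0] * len(x)
        for j in range(1, len(x) + 1):
            if y[i - 1] == x[j - 1]:
                current[j] = previous[j - 1]
            else:
                current[j] = 1 + min(previous[j], current[j - 1], previous[j - 1])
        previous = current

    return previous[len(x)]


print(levenshtein_two_rows("kitten", "sitting"))  # prints 3
```

The trade-off is that the intermediate rows are discarded, so the individual edits can no longer be read back from the table.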
In [25]:
X = 'kitten'
Y = 'sitting'
table = get_levenshtein_distance(X,Y)
for row in table:
    print(row)
print('Levenshtein distance :{}'.format(table[len(Y)][len(X)]))

[0, 1, 2, 3, 4, 5, 6]
[1, 1, 2, 3, 4, 5, 6]
[2, 2, 1, 2, 3, 4, 5]
[3, 3, 2, 1, 2, 3, 4]
[4, 4, 3, 2, 1, 2, 3]
[5, 5, 4, 3, 2, 2, 3]
[6, 6, 5, 4, 3, 3, 2]
[7, 7, 6, 5, 4, 4, 3]
Levenshtein distance :3
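The bottom-right cell gives the distance, and the rest of the table records how it was reached. Walking back from that cell to the top-left corner recovers one optimal sequence of edits. The following is a sketch under the same table layout as above (rows for Y, columns for X); the name `edit_script` and the tuple format of the operations are assumptions made for illustration.

```python
def edit_script(x: str, y: str) -> list:
    """Return one minimal list of edits that turns x into y (illustrative helper)."""
    rows, cols = len(y) + 1, len(x) + 1

    # Build the table: table[i][j] = distance between y[:i] and x[:j]
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i
    for j in range(cols):
        table[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            if y[i - 1] == x[j - 1]:
                table[i][j] = table[i - 1][j - 1]
            else:
                table[i][j] = 1 + min(table[i - 1][j], table[i][j - 1], table[i - 1][j - 1])

    # Walk back from the bottom-right cell to the top-left cell
    ops = []
    i, j = len(y), len(x)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and y[i - 1] == x[j - 1] and table[i][j] == table[i - 1][j - 1]:
            i, j = i - 1, j - 1  # characters match, nothing to record
        elif i > 0 and j > 0 and table[i][j] == table[i - 1][j - 1] + 1:
            ops.append(("substitute", x[j - 1], y[i - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and table[i][j] == table[i - 1][j] + 1:
            ops.append(("insert", y[i - 1]))
            i -= 1
        else:
            ops.append(("delete", x[j - 1]))
            j -= 1

    ops.reverse()  # the walk collects edits from the end of the words backwards
    return ops


for op in edit_script("kitten", "sitting"):
    print(op)
# ('substitute', 'k', 's')
# ('substitute', 'e', 'i')
# ('insert', 'g')
```

Several optimal edit sequences can exist for the same pair of words. This walk prefers a substitution over an insertion, and an insertion over a deletion, whenever more than one choice is consistent with the table.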
