# Introduction

Edit distance, most commonly computed as the Levenshtein distance, measures the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one string into another.

It is often used in applications such as spell checking, DNA sequence comparison, and text mining.

The dynamic programming approach builds a matrix that stores the edit distance between every prefix of the first word and every prefix of the second word. The key idea is to build up the solution for longer prefixes from the solutions for shorter prefixes.
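
The prefix idea can also be read top-down. The sketch below is not the matrix method described in this section. It is a memoized recursion, included only to show how the answer for longer prefixes is assembled from answers for shorter ones. The names `edit_distance_topdown` and `dist` are illustrative assumptions, and `functools.lru_cache` is used for the memoization.

```python
from functools import lru_cache


def edit_distance_topdown(s1, s2):
    """Top-down sketch: dist(i, j) is the distance between s1[:i] and s2[:j]."""

    @lru_cache(maxsize=None)
    def dist(i, j):
        # An empty prefix is reached by inserting or deleting everything.
        if i == 0:
            return j
        if j == 0:
            return i
        # Matching last characters add no cost.
        if s1[i - 1] == s2[j - 1]:
            return dist(i - 1, j - 1)
        # Otherwise pay one edit on top of the cheapest shorter-prefix answer.
        return 1 + min(dist(i - 1, j),      # deletion
                       dist(i, j - 1),      # insertion
                       dist(i - 1, j - 1))  # substitution

    return dist(len(s1), len(s2))


print(edit_distance_topdown("cat", "cut"))  # a single substitution
```

For long inputs this recursive form can hit Python's recursion limit, which is one practical reason to fill the matrix iteratively instead.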

Steps:

1. **Initialize a matrix**: Create a matrix `dp` of size `(m+1) x (n+1)`, where `m` is the length of the first word and `n` is the length of the second word. `dp[i][j]` represents the edit distance between the first `i` characters of the first word and the first `j` characters of the second word.

2. **Base cases**:

   - `dp[i][0] = i` for all `i` from `0` to `m`, because transforming the first `i` characters of the first word into an empty second word requires `i` deletions.
   - `dp[0][j] = j` for all `j` from `0` to `n`, because transforming an empty first word into the first `j` characters of the second word requires `j` insertions.

3. **Fill the matrix**: For each `i` from `1` to `m` and each `j` from `1` to `n`, apply the following recurrence relation:

   - If the characters match (`word1[i-1] == word2[j-1]`), then `dp[i][j] = dp[i-1][j-1]`.
   - If the characters do not match, then `dp[i][j]` is the minimum of:
     - `dp[i-1][j] + 1` (deletion)
     - `dp[i][j-1] + 1` (insertion)
     - `dp[i-1][j-1] + 1` (substitution)

4. **Result**: The value `dp[m][n]` is the edit distance between the two words.
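
The base cases and the recurrence from the steps above can be collected into one piecewise definition. This is only a restatement in LaTeX, using the same `dp`, `word1`, and `word2` names, with `i` and `j` at least 1 in the last two cases:

```latex
dp[i][j] =
\begin{cases}
i & \text{if } j = 0 \\
j & \text{if } i = 0 \\
dp[i-1][j-1] & \text{if } word1[i-1] = word2[j-1] \\
1 + \min\bigl(dp[i-1][j],\; dp[i][j-1],\; dp[i-1][j-1]\bigr) & \text{otherwise}
\end{cases}
```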

In [1]:
def edit_distance(s1, s2):
    """Return the edit (Levenshtein) distance between strings s1 and s2."""
    m, n = len(s1), len(s2)
    # Initialize a matrix to store the edit distances
    dp = [[0] * (n + 1) for _ in range(m + 1)]

    # Initialize the first row and column
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j

    # Fill in the rest of the matrix
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(dp[i - 1][j],  # deletion
                                   dp[i][j - 1],  # insertion
                                   dp[i - 1][j - 1])  # substitution

    return dp[m][n]
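
Each row of the matrix depends only on the row directly above it, so the full `(m+1) x (n+1)` table is not strictly needed when only the final distance matters. The following is a minimal sketch of that idea under the same recurrence. The name `edit_distance_two_rows` is a hypothetical helper, not part of the code above.

```python
def edit_distance_two_rows(s1, s2):
    """Edit distance keeping only the previous and current DP rows."""
    # The distance is symmetric, so put the shorter string on the column axis.
    if len(s2) > len(s1):
        s1, s2 = s2, s1

    prev = list(range(len(s2) + 1))  # row for the empty prefix of s1
    for i, c1 in enumerate(s1, start=1):
        curr = [i]  # base case dp[i][0] = i
        for j, c2 in enumerate(s2, start=1):
            if c1 == c2:
                curr.append(prev[j - 1])
            else:
                curr.append(1 + min(prev[j],       # deletion
                                    curr[j - 1],   # insertion
                                    prev[j - 1]))  # substitution
        prev = curr
    return prev[-1]


print(edit_distance_two_rows("kitten", "sitting"))
```

This reduces memory from `O(m * n)` to `O(min(m, n))`, at the cost of no longer being able to trace back which edits were made.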


In [2]:
# Example usage:
s1 = "kitten"
s2 = "sitting"
print("Edit distance between '{}' and '{}': {}".format(s1, s2, edit_distance(s1, s2)))

Edit distance between 'kitten' and 'sitting': 3
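
To see where a distance of 3 comes from, the filled matrix can be walked backwards from `dp[m][n]` to `dp[0][0]`, recording which neighbouring cell produced each value. The sketch below rebuilds the matrix so that it is self-contained. `edit_script` is a hypothetical helper introduced here for illustration, and it returns one minimal sequence of edits (others may exist).

```python
def edit_script(s1, s2):
    """Return one minimal list of edits that turns s1 into s2."""
    m, n = len(s1), len(s2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i - 1] == s2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])

    # Walk back from the bottom-right cell, one step per match or edit.
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and s1[i - 1] == s2[j - 1]:
            i, j = i - 1, j - 1  # characters match, no edit recorded
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            ops.append(f"substitute '{s1[i - 1]}' -> '{s2[j - 1]}'")
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append(f"delete '{s1[i - 1]}'")
            i -= 1
        else:
            ops.append(f"insert '{s2[j - 1]}'")
            j -= 1
    return ops[::-1]  # edits were collected from the end of the strings


for op in edit_script("kitten", "sitting"):
    print(op)
```

For `"kitten"` and `"sitting"` the list has three entries, two substitutions and one insertion, which matches the distance reported above.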


In [3]:
# Example usage:
s1 = "perturb"
s2 = "superb"
print("Edit distance between '{}' and '{}': {}".format(s1, s2, edit_distance(s1, s2)))

Edit distance between 'perturb' and 'superb': 5
