Write a function that takes in two strings and returns the minimum number of edit operations that need to be performed on the first string to obtain the second string.

There are three edit operations: 
- insertion of a character, 
- deletion of a character, and 
- substitution of a character for another.

Example:

input:
```
s1: "abc"
s2: "yabd" 
```
output: 
```2```

Reason: ```insert "y"; substitute "c" with "d"```
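
The three operations can also be read as a recurrence over the remaining suffixes of both strings. The block below is a minimal top-down sketch of that reading, using memoized recursion rather than the table-based approach in the cells that follow. The name `edit_distance_recursive` and the helper `go` are hypothetical and chosen only for illustration.

```python
from functools import lru_cache

def edit_distance_recursive(s1, s2):
    """Top-down sketch: go(i, j) is the distance between s1[i:] and s2[j:]."""
    @lru_cache(maxsize=None)
    def go(i, j):
        if i == len(s1):          # s1 exhausted: insert the rest of s2
            return len(s2) - j
        if j == len(s2):          # s2 exhausted: delete the rest of s1
            return len(s1) - i
        if s1[i] == s2[j]:        # characters match: no edit needed
            return go(i + 1, j + 1)
        return 1 + min(
            go(i, j + 1),         # insert s2[j] into s1
            go(i + 1, j),         # delete s1[i]
            go(i + 1, j + 1),     # substitute s1[i] with s2[j]
        )
    return go(0, 0)

print(edit_distance_recursive("abc", "yabd"))  # 2
```

Because it recurses once per character, this sketch is only suitable for short inputs; the iterative table avoids Python's recursion limit.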


In [1]:
"""
    Approach: Dynamic Programming
        2D array to keep track of the number of edits

         - y a b d
       - 0 1 2 3 4 ==> Explanation: - => -    no change:    0
       a 1                          - => y    1 insertion:  1
       b 2                          - => ya   2 insertions: 2
       c 3                          - => yab  3 insertions: 3
         ^                          - => yabd 4 insertions: 4
         Explanation:
         a   => -   1 deletion:  1
         ab  => -   2 deletions: 2
         abc => -   3 deletions: 3
    
          - y a b d
        - 0 1 2 3 4
        a 1 X Y Z Q <== X position - a => y    1 substitution (a -> y): 1
        b 2             Y position - a => ya   a == a, so 1 insertion (y): 1
        c 3             Z position - a => yab  2 insertions (y, b): 2
                        Q position - a => yabd 3 insertions (y, b, d): 3

        Rules:
            if c1 == c2 ==> use diag value (no extra edit)
            else:
                get min of 3 neighbours value (up, left, diag) + 1
                    up   = delete c1 from s1
                    left = insert c2 into s1
                    diag = substitute c1 with c2

Time Complexity = O(n * m): n - number of characters in s1; m - number of characters in s2
Space Complexity = O(n * m): the full (n + 1) x (m + 1) table is stored
"""
def levenshtein_distance(s1, s2):
    # s1 - row
    # s2 - col
    change = [[0 for c in range(0, len(s2)+1)] for r in range(len(s1)+1)]
    # init
    for c in range(len(s2)+1):
        change[0][c] = c
    for r in range(len(s1)+1):
        change[r][0] = r
    # process
    for r in range(1, len(s1)+1):
        for c in range(1, len(s2)+1):
            c1 = s1[r-1] # important: table row r maps to s1[r-1]; easy to make an off-by-one mistake here
            c2 = s2[c-1]
            if (c1 == c2): 
                change[r][c] = change[r-1][c-1] #diag value
            else:
                # get min value between 3 neighbours - up, left, diag
                up = change[r-1][c] 
                left = change[r][c-1]
                diag = change[r-1][c-1]
                change[r][c] = min(up, left, diag) + 1
    return change[-1][-1]

s1 = "abc"
s2 = "yabd"

print(levenshtein_distance(s1, s2))

2
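
To make the X, Y, Z, Q walkthrough in the docstring concrete, the sketch below fills the same table and returns it in full instead of only the last cell. It follows the same rules as the cell above; the name `build_table` is hypothetical and not part of the original notebook.

```python
def build_table(s1, s2):
    """Return the full (len(s1) + 1) x (len(s2) + 1) edit-distance table."""
    rows, cols = len(s1) + 1, len(s2) + 1
    table = [[0] * cols for _ in range(rows)]
    for c in range(cols):
        table[0][c] = c            # empty prefix of s1 -> c insertions
    for r in range(rows):
        table[r][0] = r            # empty prefix of s2 -> r deletions
    for r in range(1, rows):
        for c in range(1, cols):
            if s1[r - 1] == s2[c - 1]:
                table[r][c] = table[r - 1][c - 1]
            else:
                table[r][c] = 1 + min(
                    table[r - 1][c],      # up
                    table[r][c - 1],      # left
                    table[r - 1][c - 1],  # diag
                )
    return table

for row in build_table("abc", "yabd"):
    print(row)
# [0, 1, 2, 3, 4]
# [1, 1, 1, 2, 3]   <- X, Y, Z, Q = 1, 1, 2, 3
# [2, 2, 2, 1, 2]
# [3, 3, 3, 2, 2]
```

The bottom-right cell is the answer, 2, matching the output above.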


In [2]:
"""
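    [Better Approach - Reduce the Space Complexity]
    Idea:
        Each cell only depends on the current row and the previous row,
        so 2 rows of the change array are enough
            => current row AND previous row

Time Complexity: O(n * m)
Space Complexity: O(min(n, m)): the shorter string is placed on the columns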
"""
def levenshtein_distance(s1, s2):
    # make s2 the shorter string, so each row has min(n, m) + 1 cells
    if len(s1) < len(s2):
        s2, s1 = s1, s2
    
    change = [[0 for c in range(0, len(s2)+1)] for r in range(2)]
    # init
    for c in range(len(s2)+1):
        change[0][c] = c
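
    # process
    row = 1 # index of the row being filled; the other row holds the previous results
    for r in range(1, len(s1)+1):
        # first column must be r (r deletions turn s1[:r] into ""), not a stale value
        change[row][0] = r
        for c in range(1, len(s2)+1):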

            c1 = s1[r-1] # important: table row r maps to s1[r-1]; easy to make an off-by-one mistake here
            c2 = s2[c-1]

            if (c1 == c2): 
                if row == 1:
                    change[1][c] = change[0][c-1] #diag value
                else:
                    change[0][c] = change[1][c-1]
            else:
                # get min value between 3 neighbours - up, left, diag
                up = change[0][c] if row == 1 else change[1][c]
                left = change[1][c-1] if row == 1 else change[0][c-1]
                diag = change[0][c-1] if row == 1 else change[1][c-1]
                if row == 1:
                    change[1][c] = min(up, left, diag) + 1
                else:
                    change[0][c] = min(up, left, diag) + 1
        # switch to the other row for the next character of s1
        row = 0 if row == 1 else 1

    # row was flipped after the last pass, so the result sits in the other row
    return change[1][-1] if row == 0 else change[0][-1]

s1 = "abc"
s2 = "yabd"

print(levenshtein_distance(s1, s2))

2
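
The two-row idea can also be written with two separate lists instead of a row index that flips between 0 and 1. The block below is a compact sketch of that variant, not the notebook's own code; the names `levenshtein_two_rows`, `prev` and `curr` are hypothetical. Each new row starts with `r` in its first column, because turning the first `r` characters of `s1` into an empty string takes `r` deletions.

```python
def levenshtein_two_rows(s1, s2):
    """Edit distance keeping only the previous and the current row."""
    if len(s1) < len(s2):
        s1, s2 = s2, s1                    # keep the shorter string on the columns
    prev = list(range(len(s2) + 1))        # row for the empty prefix of s1
    for r in range(1, len(s1) + 1):
        curr = [r] + [0] * len(s2)         # first column: r deletions
        for c in range(1, len(s2) + 1):
            if s1[r - 1] == s2[c - 1]:
                curr[c] = prev[c - 1]      # diag value, no extra edit
            else:
                curr[c] = 1 + min(prev[c], curr[c - 1], prev[c - 1])
        prev = curr                        # current row becomes the previous row
    return prev[-1]

print(levenshtein_two_rows("abc", "yabd"))  # 2
print(levenshtein_two_rows("aaa", "b"))     # 3
```

An input such as `("aaa", "b")` is a useful check for any two-row version, since the result is wrong if the first column is not refreshed on every row.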
