# 72. Edit Distance

## Levenshtein Distance
- Run-time: O(M*N)
- Space: O(M*N)
- M = Length of word1
- N = Length of word2

The minimum number of single-letter edits needed to turn one word into another is known as the "Levenshtein Distance".

To build some intuition, notice that the problem can be solved recursively.
At each step there are up to three choices (insert, delete, replace), and the recursion can go as deep as M + N levels, so the brute-force run-time is exponential, bounded by O(3^(M+N)).
Because the recursion recomputes the same subproblems over and over, let's come up with a dynamic programming solution instead.

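The three-way recursion described above can be sketched roughly as follows. This is only an illustration of the brute-force idea, not the solution used in this write-up, and the function name `edit_distance_recursive` is a hypothetical one introduced here.

```python
def edit_distance_recursive(word1: str, word2: str) -> int:
    """Brute-force edit distance; exponential time, for illustration only."""
    # If one word is exhausted, every remaining letter of the other
    # must be inserted or deleted.
    if not word1:
        return len(word2)
    if not word2:
        return len(word1)
    # Matching last letters need no operation.
    if word1[-1] == word2[-1]:
        return edit_distance_recursive(word1[:-1], word2[:-1])
    return 1 + min(
        edit_distance_recursive(word1, word2[:-1]),       # insert
        edit_distance_recursive(word1[:-1], word2),       # delete
        edit_distance_recursive(word1[:-1], word2[:-1]),  # replace
    )
```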
A 2D array should come to mind, with word1 along the columns and word2 along the rows, one letter per column or row, plus a leading row and column for the empty string.
Each dp element holds the minimum number of operations needed to turn that prefix of word1 into that prefix of word2.
Besides the three operations, there is one more case to consider: whether the current letters of the two words match.
Whether the letters match or not, what we care about is the previously computed minimum operations, which the 2D array gives us.
For any given dp element, the left (delete), top (insert) and top-left (replace) values are what we care about.

```
Columns = word1
Rows = word2

Insert:
    ''   a   b
''   0   1   2
a    1   0   1
b    2   1   0
c    3   2   1

Delete:
    ''   a   b   c
''   0   1   2   3
a    1   0   1   2
b    2   1   0   1

Replace:
    ''   a   b   c
''   0   1   2   3
a    1   0   1   2
b    2   1   0   1
d    3   2   1   1
```

So when the letters differ, dp[i][j] = 1 + min(dp[i-1][j-1], dp[i-1][j], dp[i][j-1]).
The only special case is when the letters match.
In that scenario the + 1 does not apply to dp[i-1][j-1], because no operation is needed for that dp[i][j], so dp[i][j] = dp[i-1][j-1].

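Written out as a single recurrence, the rule above might look like this. Here i indexes rows (letters of word2) and j indexes columns (letters of word1), both 1-indexed, with index 0 standing for the empty string. The symbols w_1 and w_2 for the two words are notation assumed here, not taken from the original.

```latex
dp[i][j] =
\begin{cases}
j & \text{if } i = 0 \\
i & \text{if } j = 0 \\
dp[i-1][j-1] & \text{if } w_2[i] = w_1[j] \\
1 + \min\bigl(dp[i-1][j-1],\ dp[i-1][j],\ dp[i][j-1]\bigr) & \text{otherwise}
\end{cases}
```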
```
class Solution:
    def minDistance(self, word1: str, word2: str) -> int:

        def create_dp():
            # Rows follow word2, columns follow word1; index 0 is the empty string.
            dp = [[0] * (len(word1) + 1) for _ in range(len(word2) + 1)]
            for idx in range(len(word1) + 1):
                dp[0][idx] = idx  # delete idx letters of word1 to reach ''
            for idx in range(len(word2) + 1):
                dp[idx][0] = idx  # insert idx letters of word2 starting from ''
            return dp

        dp = create_dp()
        for col_idx, ch1 in enumerate(word1, 1):
            for row_idx, ch2 in enumerate(word2, 1):
                top_left = dp[row_idx-1][col_idx-1]
                if ch1 == ch2:
                    # Matching letters cost nothing; this cancels the + 1 below.
                    top_left -= 1
                dp[row_idx][col_idx] = 1 + min(top_left,  # top left corner
                                               dp[row_idx][col_idx-1],  # left
                                               dp[row_idx-1][col_idx])  # above
        return dp[-1][-1]
```
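
To check the Insert, Delete and Replace example tables against the recurrence, a small sketch like the one below can build the full table for any pair of words. The helper name `build_table` is hypothetical and not part of the solution above; it uses the same layout, with rows for word2 and columns for word1.

```python
from typing import List


def build_table(word1: str, word2: str) -> List[List[int]]:
    """Return the full dp table: rows follow word2, columns follow word1."""
    rows, cols = len(word2) + 1, len(word1) + 1
    dp = [[0] * cols for _ in range(rows)]
    for j in range(cols):
        dp[0][j] = j  # first row: prefix of word1 versus the empty string
    for i in range(rows):
        dp[i][0] = i  # first column: empty string versus prefix of word2
    for i in range(1, rows):
        for j in range(1, cols):
            if word2[i - 1] == word1[j - 1]:
                # Matching letters: carry the top-left value over unchanged.
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(dp[i - 1][j - 1],  # top left: replace
                                   dp[i - 1][j],      # above: insert
                                   dp[i][j - 1])      # left: delete
    return dp


if __name__ == "__main__":
    # Print the table for the "Insert" example (word1 = "ab", word2 = "abc").
    for row in build_table("ab", "abc"):
        print(row)
```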