### [72\. Edit Distance](https://leetcode.com/problems/edit-distance/)

Difficulty: **Hard**


Given two words _word1_ and _word2_, find the minimum number of operations required to convert _word1_ to _word2_.

You have the following 3 operations permitted on a word:

1.  Insert a character
2.  Delete a character
3.  Replace a character

**Example 1:**

```
Input: word1 = "horse", word2 = "ros"
Output: 3
Explanation: 
horse -> rorse (replace 'h' with 'r')
rorse -> rose (remove 'r')
rose -> ros (remove 'e')
```

**Example 2:**

```
Input: word1 = "intention", word2 = "execution"
Output: 5
Explanation: 
intention -> inention (remove 't')
inention -> enention (replace 'i' with 'e')
enention -> exention (replace 'n' with 'x')
exention -> exection (replace 'n' with 'c')
exection -> execution (insert 'u')
```

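### Explanation

Compare the two strings and find the minimum number of edits needed to turn string 1 into string 2.

One edit is one of:

1. Insert a character

2. Delete a character

3. Replace a character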
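
The three edits can be sketched as plain string helpers and used to replay Example 1. The helper names `insert`, `delete` and `replace` and their index arguments are assumptions made for illustration; the problem itself only asks for the count of edits.

```python
def insert(word: str, i: int, ch: str) -> str:
    """Return word with ch inserted so that it sits at index i."""
    return word[:i] + ch + word[i:]


def delete(word: str, i: int) -> str:
    """Return word with the character at index i removed."""
    return word[:i] + word[i + 1:]


def replace(word: str, i: int, ch: str) -> str:
    """Return word with the character at index i replaced by ch."""
    return word[:i] + ch + word[i + 1:]


# Replay Example 1: three edits turn "horse" into "ros".
steps = ["horse"]
steps.append(replace(steps[-1], 0, "r"))  # horse -> rorse
steps.append(delete(steps[-1], 2))        # rorse -> rose
steps.append(delete(steps[-1], 3))        # rose  -> ros
print(steps)
```

Each helper changes exactly one position, so the number of calls is the number of edits.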

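### LCS Solution: Failed

First find the length of the longest common subsequence (LCS).

The characters left over on each side are r1 and r2.

**Wrong answer**

minimum edits = replacements + deletions/insertions

res = min(len(r1), len(r2)) + abs(len(r1) - len(r2))

**Correction**

The gaps must be taken into account.

After finding the LCS, mark the gaps between its matched characters.

Apply the formula above separately to the leftover characters in each gap,

then sum the results.

**Failed**: an LCS is not guaranteed to give the shortest edit distance.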

In [11]:
class Solution_Fail:
    def minDistance(self, word1: str, word2: str) -> int:
        # Kept as a record of the failed approach: the result can be too small.
        m, n = len(word1), len(word2)

        # LCS[i][j] = length of the longest common subsequence of word1[:i] and word2[:j]
        LCS = [[0] * (n+1) for _ in range(m+1)]

        for i in range(1, m+1):
            for j in range(1, n+1):
                if word1[i-1] == word2[j-1]:
                    LCS[i][j] = LCS[i-1][j-1] + 1
                else:
                    LCS[i][j] = max(LCS[i-1][j], LCS[i][j-1])

        # Characters on each side that are not part of the LCS
        len1 = m - LCS[m][n]
        len2 = n - LCS[m][n]

        # Replace the paired leftovers, then insert/delete the rest
        return min(len1, len2) + abs(len1 - len2)
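
A small check makes the failure concrete. The sketch below compares the LCS-based formula with a straightforward memoized edit distance. The function names `lcs_length`, `lcs_estimate` and `edit_distance` are made up for this illustration, and the recursive reference is just one convenient way to get the true answer.

```python
from functools import lru_cache


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    @lru_cache(maxsize=None)
    def go(i: int, j: int) -> int:
        if i == len(a) or j == len(b):
            return 0
        if a[i] == b[j]:
            return 1 + go(i + 1, j + 1)
        return max(go(i + 1, j), go(i, j + 1))
    return go(0, 0)


def lcs_estimate(a: str, b: str) -> int:
    """The formula from the failed attempt: pair up leftovers, then add/remove the rest."""
    common = lcs_length(a, b)
    r1, r2 = len(a) - common, len(b) - common
    return min(r1, r2) + abs(r1 - r2)


def edit_distance(a: str, b: str) -> int:
    """Reference answer via memoized recursion over suffixes."""
    @lru_cache(maxsize=None)
    def go(i: int, j: int) -> int:
        if i == len(a):
            return len(b) - j          # insert the rest of b
        if j == len(b):
            return len(a) - i          # delete the rest of a
        if a[i] == b[j]:
            return go(i + 1, j + 1)
        return 1 + min(go(i + 1, j + 1), go(i + 1, j), go(i, j + 1))
    return go(0, 0)


for a, b in [("horse", "ros"), ("ab", "ba"), ("intention", "execution")]:
    print(a, b, lcs_estimate(a, b), edit_distance(a, b))
```

For "ab" and "ba" the LCS has length 1, so the formula reports 1 edit while 2 are needed. Example 2 shows the same gap: the formula reports 4, the true answer is 5.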

[Reference video](https://www.youtube.com/watch?v=MiqoA-yF-0M)

### DP Solution: Sub-string

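Compare the last characters of the two strings at each step. Here i indexes the output string (word2) and j indexes the input string (word1).

Same character: no edit needed, \[i-1]\[j-1]

Different characters:

1. Replace: both strings shrink by one character, \[i-1]\[j-1]+1

2. Insert: the output string drops one character, \[i-1]\[j]+1

3. Delete: the input string drops one character, \[i]\[j-1]+1

Take the smallest of the three options.

DP table (neighbors of the current cell)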
```
replace   insert
delete    *curr*
```

Time: O(MN)

Space: O(MN)

In [34]:
class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        # ED[i][j] = edit distance between word1[:j] and word2[:i]
        ED = [[0] * (len(word1) + 1) for _ in range(len(word2) + 1)]

        # Empty word2: delete all j characters of word1[:j]
        for j in range(len(word1)+1):
            ED[0][j] = j

        # Empty word1: insert all i characters of word2[:i]
        for i in range(len(word2)+1):
            ED[i][0] = i

        for i, c in enumerate(word2):
            for j, d in enumerate(word1):
                if c == d:
                    ED[i+1][j+1] = ED[i][j]
                else:
                    # replace, delete, insert
                    ED[i+1][j+1] = min(ED[i][j], ED[i+1][j], ED[i][j+1]) + 1
        return ED[-1][-1]
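
The same table can also be walked backwards from the bottom-right cell to recover one possible sequence of edits, which shows how the replace/insert/delete neighbors are used. This is a sketch under assumptions: the function name `edit_script` and the wording of the operation strings are invented here, and when several neighbors tie it simply prefers replace, then insert, then delete.

```python
from typing import List


def edit_script(word1: str, word2: str) -> List[str]:
    """Return one minimal list of edits that turns word1 into word2."""
    rows, cols = len(word2) + 1, len(word1) + 1
    # ED[i][j] = edit distance between word1[:j] and word2[:i]
    ED = [[0] * cols for _ in range(rows)]
    for j in range(cols):
        ED[0][j] = j
    for i in range(rows):
        ED[i][0] = i
    for i in range(1, rows):
        for j in range(1, cols):
            if word2[i - 1] == word1[j - 1]:
                ED[i][j] = ED[i - 1][j - 1]
            else:
                ED[i][j] = 1 + min(ED[i - 1][j - 1], ED[i - 1][j], ED[i][j - 1])

    # Walk back from the bottom-right cell, recording each paid step.
    ops: List[str] = []
    i, j = len(word2), len(word1)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and word2[i - 1] == word1[j - 1]:
            i, j = i - 1, j - 1                      # characters match, no edit
        elif i > 0 and j > 0 and ED[i][j] == ED[i - 1][j - 1] + 1:
            ops.append(f"replace '{word1[j - 1]}' with '{word2[i - 1]}'")
            i, j = i - 1, j - 1
        elif i > 0 and ED[i][j] == ED[i - 1][j] + 1:
            ops.append(f"insert '{word2[i - 1]}'")
            i -= 1
        else:
            ops.append(f"delete '{word1[j - 1]}'")
            j -= 1
    ops.reverse()
    return ops


print(edit_script("horse", "ros"))
```

The number of recorded operations always equals the bottom-right value of the table, because every non-matching step moves to a neighbor whose value is exactly one smaller.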

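In [None]:
### LRU_CACHE
from functools import lru_cache

class Solution:
    def minDistance(self, s1: str, s2: str) -> int:
        @lru_cache(None)
        def dp(i, j):
            # dp(i, j): distance between s1[:i+1] and s2[:j+1]; an index below 0 means an empty prefix
            if i < 0 or j < 0:
                return max(i, j) + 1
            if s1[i] == s2[j]:
                return dp(i-1, j-1)
            return min(dp(i-1, j), dp(i-1, j-1), dp(i, j-1)) + 1
        return dp(len(s1)-1, len(s2)-1)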

In [None]:
### deque
from collections import deque

class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        # BFS over (remaining word1, remaining word2, edits so far)
        visit, q = set(), deque([(word1, word2, 0)])

        while q:
            w1, w2, d = q.popleft()
            if w1 == w2:
                return d

            if (w1, w2) not in visit:
                visit.add((w1, w2))

                # Matching leading characters cost nothing
                while w1 and w2 and w1[0] == w2[0]:
                    w1 = w1[1:]
                    w2 = w2[1:]
                d += 1
                # replace, delete from word1, insert into word1
                q.extend([(w1[1:], w2[1:], d), (w1[1:], w2, d), (w1, w2[1:], d)])

### Transform 2D to 1D

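[Reference](https://leetcode.com/problems/edit-distance/discuss/25846/C%2B%2B-O(n)-space-DP)

From the recurrence, only the current row and the previous row are needed.

With a small trick, a single array sized to the target string (plus one entry) is enough.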
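
The intermediate step, keeping two rows, can be sketched as follows before collapsing to a single array. The function name `min_distance_two_rows` and the variable names `prev` and `curr` are assumptions for illustration; rows run over word1 and columns over word2.

```python
def min_distance_two_rows(word1: str, word2: str) -> int:
    """Edit distance using only the previous and the current DP row."""
    n = len(word2)
    # prev[j] = distance between the empty prefix of word1 and word2[:j]
    prev = list(range(n + 1))
    for i in range(1, len(word1) + 1):
        # curr[0] = distance between word1[:i] and the empty string
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            if word1[i - 1] == word2[j - 1]:
                curr[j] = prev[j - 1]
            else:
                # diagonal (replace), up (drop from word1), left (drop from word2)
                curr[j] = 1 + min(prev[j - 1], prev[j], curr[j - 1])
        prev = curr
    return prev[n]


print(min_distance_two_rows("horse", "ros"))
```

The single-array version gets the same result by saving the one diagonal value that would otherwise be overwritten.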

In [51]:
class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        m, n = len(word1), len(word2)
        # ED[j] = distance between the current prefix of word1 and word2[:j]
        ED = list(range(n+1))
        for i in range(1, m+1):
            pre = ED[0]   # diagonal value from the previous row
            ED[0] = i
            for j in range(1, n+1):
                temp = ED[j]   # previous row's value, the next diagonal
                if word1[i-1] == word2[j-1]:
                    ED[j] = pre
                else:
                    ED[j] = min(pre, ED[j-1], ED[j]) + 1
                pre = temp
            print(ED)   # row after processing word1[:i]
        return ED[n]

In [52]:
Solution().minDistance('horse', 'ros')

[1, 1, 2, 3]
[2, 2, 1, 2]
[3, 2, 2, 2]
[4, 3, 3, 2]
[5, 4, 4, 3]


3