add the "Minimum Edit Distance" algorithm
xiaominghe2014 committed May 11, 2024
1 parent 2e86345 commit 8e30f50
Showing 3 changed files with 44 additions and 0 deletions.
3 changes: 3 additions & 0 deletions include/base/XString.h
@@ -58,6 +58,9 @@ namespace XString
std::vector<std::string> lettersShape(const std::string &letters);

std::size_t findFirstCharNotEscapedBefore(const std::string& str, char ch);

// Minimum edit distance algorithm, e.g. the Levenshtein distance
int levenshteinDistance(std::string word1, std::string word2);

template<class T_out,class T_in>
T_out convert(const T_in& in)
8 changes: 8 additions & 0 deletions main.cpp
@@ -707,10 +707,18 @@ void testDecodeAndEncode()
}
}

void testString(){
    std::string word1 = "word hehe";
    std::string word2 = "word hhh";
    int distance = XString::levenshteinDistance(word1, word2);
    LOG_I("%s => %s = %d", word1.c_str(), word2.c_str(), distance);
}

int main(int argc, char *argv[])
{
std::cout <<XString::toStringAddEnter(XString::lettersShape("xlib-test"))<< std::endl;
setLog();
testString();
// testPatch();
testAStar();
testMath();
33 changes: 33 additions & 0 deletions src/base/XString.cpp
@@ -466,4 +466,37 @@ std::vector<std::string> XString::lettersShape(const std::string &letters)
return std::string::npos;
}

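/**
 Levenshtein distance algorithm:
 1. Create a 2D array dp, where dp[i][j] is the minimum number of operations needed to convert the first i characters of the first string into the first j characters of the second string.
 2. Initialize dp so that dp[i][0] = i and dp[0][j] = j: converting an i-character prefix into the empty string takes i deletions, and converting the empty string into a j-character prefix takes j insertions.
 3. Walk over every pair of characters from the two strings and compute dp[i][j]:
    - If the two characters are equal, dp[i][j] = dp[i-1][j-1]; no extra operation is needed.
    - Otherwise, dp[i][j] = min(dp[i-1][j]+1, dp[i][j-1]+1, dp[i-1][j-1]+1), corresponding to deletion, insertion and substitution respectively.
 4. Return dp[m][n], where m and n are the lengths of the two strings; this is the edit distance between them.
*/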
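int XString::levenshteinDistance(std::string word1, std::string word2){
    const int m = static_cast<int>(word1.length());
    const int n = static_cast<int>(word2.length());
    // dp[i][j]: edit distance between the first i chars of word1 and the first j chars of word2
    std::vector<std::vector<int>> dp(m + 1, std::vector<int>(n + 1, 0));

    // Base cases: turning a prefix into the empty string (i deletions) ...
    for (int i = 0; i <= m; i++) {
        dp[i][0] = i;
    }
    // ... and turning the empty string into a prefix (j insertions).
    for (int j = 0; j <= n; j++) {
        dp[0][j] = j;
    }

    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (word1[i - 1] == word2[j - 1]) {
                // Matching characters cost nothing.
                dp[i][j] = dp[i - 1][j - 1];
            } else {
                // One deletion, insertion or substitution, whichever is cheapest.
                dp[i][j] = 1 + std::min({dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1]});
            }
        }
    }
    return dp[m][n];
}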

XLIB_END
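
Note on the recurrence: each row of dp depends only on the row directly above it, so the same recurrence can be evaluated with two rows instead of the full (m+1) x (n+1) table. The following is a minimal standalone sketch of that variant, not the code added in this commit. The name `levenshteinTwoRows` and the const-reference parameters are assumptions made for illustration; nothing here is part of xlib.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of xlib: evaluates the same recurrence as
// XString::levenshteinDistance but keeps only two rows of the dp table.
int levenshteinTwoRows(const std::string &a, const std::string &b)
{
    const std::size_t m = a.size();
    const std::size_t n = b.size();

    // prev plays the role of dp[i-1][*]; it starts as row 0, where dp[0][j] = j.
    std::vector<int> prev(n + 1);
    std::vector<int> curr(n + 1);
    for (std::size_t j = 0; j <= n; ++j) {
        prev[j] = static_cast<int>(j);
    }

    for (std::size_t i = 1; i <= m; ++i) {
        curr[0] = static_cast<int>(i); // dp[i][0] = i
        for (std::size_t j = 1; j <= n; ++j) {
            if (a[i - 1] == b[j - 1]) {
                // Matching characters: carry the diagonal value over.
                curr[j] = prev[j - 1];
            } else {
                // Deletion (prev[j]), insertion (curr[j-1]), substitution (prev[j-1]).
                curr[j] = 1 + std::min({prev[j], curr[j - 1], prev[j - 1]});
            }
        }
        std::swap(prev, curr); // the finished row becomes the "previous" row
    }
    return prev[n]; // corresponds to dp[m][n]
}
```

For the strings used in testString, `levenshteinTwoRows("word hehe", "word hhh")` evaluates to 2 (delete one 'e', substitute the other with 'h'), and the classic pair "kitten" / "sitting" gives 3. The trade-off is that the full table is no longer available afterwards, so this form only suits callers that need the distance and not the edit script.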
