# Edit distance


An **edit distance** is a way of quantifying how dissimilar two strings (e.g., words) are by counting the minimum number of operations required to transform one string into the other. Edit distances find applications in natural language processing, where automatic spelling correction can determine candidate corrections for a misspelled word by selecting dictionary words that have a low distance to it. In bioinformatics, edit distance can be used to quantify the similarity of DNA sequences, which can be viewed as strings of the letters A, C, G and T.

Different definitions of an edit distance use different sets of string operations. The Levenshtein distance allows the removal, insertion, or substitution of a single character in the string. Because it is the most common metric, the term Levenshtein distance is often used interchangeably with edit distance.[1]

![](https://github.com/TarekSherif/NoteBookImages/raw/master/edit-distance/1.svg)

![](https://github.com/TarekSherif/NoteBookImages/raw/master/edit-distance/2.ppm)
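To make the three Levenshtein operations concrete, here is a minimal pure-Python sketch of the standard dynamic-programming approach. The function name `levenshtein` is an illustrative choice, not part of the `Levenshtein` package used in the rest of this notebook, and the sketch favors readability over speed.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the Levenshtein distance between a and b (illustrative sketch)."""
    # prev[j] is the distance between the prefix of `a` processed so far
    # and the first j characters of `b`. Initially that prefix is empty,
    # so reaching b[:j] takes j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Turning a[:i] into the empty string takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca from a
                curr[j - 1] + 1,     # insert cb into a
                prev[j - 1] + cost,  # substitute ca with cb (free on a match)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("part", "spartan"))    # 3
print(levenshtein("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at a time, so memory use grows with the length of `b` rather than with the product of both lengths.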

# Importing the necessary libraries

In [1]:
from Levenshtein import distance
import numpy as np


# Function definition

In [2]:

def get_distance_matrix(str_list):
    """Construct a symmetric Levenshtein distance matrix for a list of strings."""
    n = len(str_list)
    # The diagonal stays 0 because every string has distance 0 to itself.
    dist_matrix = np.zeros(shape=(n, n))

    print("Starting to build distance matrix. This will iterate from 0 till ", n)
    for i in range(n):
        print(i)  # progress indicator: index of the row being filled
        # Compute only the upper triangle; the distance is symmetric.
        for j in range(i + 1, n):
            dist_matrix[i][j] = distance(str_list[i], str_list[j])
            dist_matrix[j][i] = dist_matrix[i][j]  # mirror into the lower triangle

    return dist_matrix



# Function call

In [3]:
str_list = ["part", "spartan"]
get_distance_matrix(str_list)

Starting to build distance matrix. This will iterate from 0 till  2
0
1


array([[0., 3.],
       [3., 0.]])

# **References**


* https://en.wikipedia.org/wiki/Edit_distance

* https://en.wikipedia.org/wiki/Levenshtein_distance
 
* https://en.wikipedia.org/wiki/Word_error_rate
