Skip to content

Implementation of Wagner–Fischer algorithm for Levenshtein distance between two strings

License

Notifications You must be signed in to change notification settings

mayank-02/minimum-edit-distance

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Minimum Edit Distance

The similarity between two strings may be measured in many ways. One of such a string metric is known as the Levenshtein distance, which is a type of edit distance.

The edit distance between two strings is the minimum number of single-character insertions, deletions, or substitutions required to change one string into the other.

The class Levenshtein.py implements the Wagner–Fischer algorithm allowing to pass varying costs for insertion, deletion and substitution.

For more info on Levenshtein distance, refer wiki.

Requirements

  • Python 3.6+
  • No dependencies on any other library

Usage

from Levenshtein import Levenshtein

source = "hello"
target = "world"

l = Levenshtein(source, target, costs=(1, 1, 2))

min_distance = l.distance()
# min_distance = 4

operations = l.edit_ops()
# operations =
# [{'type': 'Substitution', 'i': 0, 'j': 0},
#  {'type': 'Substitution', 'i': 1, 'j': 1},
#  {'type': 'Substitution', 'i': 2, 'j': 2},
#  {'type': 'Match',        'i': 3, 'j': 3},
#  {'type': 'Substitution', 'i': 4, 'j': 4}]

l.print_distance_matrix()
# Distance Matrix:
# -  -  w  o  r  l  d
# -  0  2  4  6  8 10
# h  2  1  3  5  7  9
# e  4  3  2  4  6  8
# l  6  5  4  3  4  6
# l  8  7  6  5  3  5
# o 10  9  7  7  5  4

l.print_edit_ops()
# Edit Operations:
# Type           i  j
# --------------------
# Substitution   0  0
# Substitution   1  1
# Substitution   2  2
# Match          3  3
# Substitution   4  4

Check out Levenshtein.py for more details.

Authors

Mayank Jain

License

MIT

About

Implementation of Wagner–Fischer algorithm for Levenshtein distance between two strings

Topics

Resources

License

Stars

Watchers

Forks

Languages