Fuzzy-Match

Fuzzy string matching in Python. By default, it splits strings into n-grams of length 3 (trigrams) and uses them to calculate a similarity score and find matches. The n-gram length can be changed if desired. Cosine, Levenshtein distance, and Jaro-Winkler distance algorithms are also available as alternatives.

Usage

>>> from fuzzy_match import match
>>> from fuzzy_match import algorithims

Trigram

>>> algorithims.trigram("this is a test string", "this is another test string")
    0.703704
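
To illustrate the idea behind n-gram matching, here is a minimal sketch that is independent of the library. It assumes Jaccard overlap of space-padded character n-grams, which is one common choice; the library's own formula is not shown in this README, so these scores need not match the value above. The names `ngrams` and `ngram_similarity` are hypothetical.

```python
def ngrams(text, n=3):
    """Return the set of character n-grams in text, padded with spaces."""
    # Padding (an assumption of this sketch) lets the first and last
    # characters take part in as many n-grams as the inner ones.
    padded = " " * (n - 1) + text.lower() + " " * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard overlap of the two n-gram sets, between 0.0 and 1.0."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    return len(grams_a & grams_b) / len(grams_a | grams_b)


score = ngram_similarity("this is a test string", "this is another test string")
```

Passing a different `n` changes the n-gram length, for example `ngram_similarity(a, b, n=2)` for bigrams.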

Cosine

>>> algorithims.cosine("this is a test string", "this is another test string")
    0.7999999999999998
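
As a rough illustration of cosine similarity, the sketch below compares word-count vectors built by splitting on whitespace. The tokenization is an assumption, not taken from the library's source; it happens to give about 0.8 for the pair above. The name `cosine_similarity` is hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between the word-count vectors of a and b."""
    counts_a = Counter(a.lower().split())
    counts_b = Counter(b.lower().split())
    # Only words present in both strings contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[word] * counts_b[word] for word in shared)
    norm_a = math.sqrt(sum(count * count for count in counts_a.values()))
    norm_b = math.sqrt(sum(count * count for count in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


score = cosine_similarity("this is a test string", "this is another test string")
```

Here four of the five words in each string are shared, so the score is 4 divided by the product of the two vector lengths (each the square root of 5).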

Levenshtein

>>> algorithims.levenshtein("this is a test string", "this is another test string")
    0.7777777777777778
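
The sketch below shows one way to turn Levenshtein edit distance into a similarity score: one minus the distance divided by the length of the longer string. This normalization is an assumption; it agrees with the scores shown in this README, but that does not confirm how the library computes them. Both function names are hypothetical.

```python
def levenshtein_distance(a, b):
    """Minimum number of single-character inserts, deletes, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a, b):
    """Scale the distance into a score between 0.0 and 1.0."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1 - levenshtein_distance(a, b) / longest


distance = levenshtein_distance("kitten", "sitting")  # 3
```

For the pair above, the only edits needed are the six characters inserted to turn "a" into "another", and the longer string has 27 characters, giving 1 - 6/27.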

Jaro-Winkler

>>> algorithims.jaro_winkler("this is a test string", "this is another test string")
    0.798941798941799
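
For reference, here is a minimal sketch of the textbook Jaro-Winkler similarity, using the conventional prefix scale of 0.1 and a prefix cap of 4 characters. It is not taken from the library and is not claimed to reproduce the score above; the function names and the `prefix_scale` parameter are hypothetical.

```python
def jaro(a, b):
    """Textbook Jaro similarity between 0.0 and 1.0."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Characters match only if they are equal and close enough in position.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    matched_a = [False] * len(a)
    matched_b = [False] * len(b)
    matches = 0
    for i, char in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not matched_b[j] and b[j] == char:
                matched_a[i] = matched_b[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order count as
    # half a transposition each.
    seq_a = [c for c, hit in zip(a, matched_a) if hit]
    seq_b = [c for c, hit in zip(b, matched_b) if hit]
    transpositions = sum(x != y for x, y in zip(seq_a, seq_b)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(a, b, prefix_scale=0.1):
    """Boost the Jaro score for strings that share a prefix (up to 4 chars)."""
    score = jaro(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)
```

The prefix boost means that two strings that start the same way score higher than plain Jaro would give them.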

Match

>>> choices = ["simple strings", "strings are simple", "sim string", "string to match", "matching simple strings", "matching strings again"]
>>> match.extract("simple string", choices, limit=2)
    [('simple strings', 0.8), ('sim string', 0.642857)]
>>> match.extractOne("simple string", choices)
    ('simple strings', 0.8)

You can also pass additional arguments to extract and extractOne to set a score cutoff value or to use one of the other algorithms mentioned above. Here is an example:

>>> match.extract("simple string", choices, match_type='levenshtein', score_cutoff=0.7)
    [('simple strings', 0.9285714285714286), ('sim string', 0.7692307692307693)]

The match_type options are trigram, cosine, levenshtein, and jaro_winkler.
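
The behaviour of a limit and a score cutoff can be sketched without the library as a score, filter, sort, and truncate pipeline. Everything in this sketch is an assumption: `rank_choices` is a hypothetical name, the standard library's `difflib.SequenceMatcher` stands in for the scoring algorithm, and the cutoff is treated as inclusive, which may differ from what extract does.

```python
from difflib import SequenceMatcher


def ratio(a, b):
    """Stand-in scorer from the standard library, between 0.0 and 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def rank_choices(query, choices, scorer=ratio, limit=None, score_cutoff=0.0):
    """Return (choice, score) pairs, best first, at or above score_cutoff."""
    scored = [(choice, scorer(query, choice)) for choice in choices]
    kept = [pair for pair in scored if pair[1] >= score_cutoff]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept if limit is None else kept[:limit]


choices = ["simple strings", "strings are simple", "sim string",
           "string to match", "matching simple strings", "matching strings again"]
top_two = rank_choices("simple string", choices, limit=2)
```

Any function that takes two strings and returns a score can be passed as `scorer`, which mirrors the idea of switching algorithms with match_type.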
