The smart-match module contains functions for calculating strings/sets similarity.

jiayingwang/smart-match

Introduction

The smart-match module contains functions for calculating strings/sets similarity.

Concept

  1. similarity: A value in the range [0, 1] that represents how similar two strings are. The larger the value, the more similar the strings.

  2. dissimilarity: A value in the range [0, 1] that represents how dissimilar two strings are. The larger the value, the more dissimilar the strings. For any pair of strings, similarity = 1 - dissimilarity.

  3. distance: How far apart two strings are. Note that not every method provides a distance.

  4. score: The larger the score, the more similar the two strings are. Note that not every method provides a score.
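To make the relationship between distance, dissimilarity, and similarity concrete, here is a minimal pure-Python sketch built on Levenshtein distance. It does not use or reproduce smart-match's own code. The function names are illustrative, and dividing by the length of the longer string is an assumption, chosen because it agrees with the sample output shown in the Usage section.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def dissimilarity(a: str, b: str) -> float:
    # Assumption: normalize by the longer string so the result lies in [0, 1].
    longest = max(len(a), len(b))
    return levenshtein_distance(a, b) / longest if longest else 0.0


def similarity(a: str, b: str) -> float:
    # By definition, similarity = 1 - dissimilarity.
    return 1.0 - dissimilarity(a, b)


if __name__ == "__main__":
    print(levenshtein_distance("hello", "hero"))
    print(dissimilarity("hello", "hero"))
    print(similarity("hello", "hero"))
```

The distance is an unbounded count of edits, while the other two values are normalized, which is why a distance can exist only for methods that define one.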

We support three levels of string matching.

  1. char: Similarity computation based on characters in the strings.

  2. term: Similarity computation based on terms in the strings.

  3. gram: Similarity computation based on q-grams in the strings.
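The three levels differ only in how a string is broken into units before comparison. The sketch below illustrates that idea in plain Python. The helper names, the whitespace-based term splitting, and the default q = 2 are assumptions made for illustration, not smart-match's actual tokenizers.

```python
def char_tokens(text: str) -> list:
    """char level: every character is one unit."""
    return list(text)


def term_tokens(text: str) -> list:
    """term level: whitespace-separated words are the units (assumed splitting rule)."""
    return text.split()


def gram_tokens(text: str, q: int = 2) -> list:
    """gram level: every run of q consecutive characters is one unit."""
    if q < 1:
        raise ValueError("q must be at least 1")
    if len(text) < q:
        # Strings shorter than q yield themselves as a single gram, or nothing if empty.
        return [text] if text else []
    return [text[i:i + q] for i in range(len(text) - q + 1)]


if __name__ == "__main__":
    sample = "hello world"
    print(char_tokens(sample))
    print(term_tokens(sample))
    print(gram_tokens(sample, q=2))
```

Character-level units suit short strings with typos, term-level units suit phrases where word order or word choice varies, and q-grams sit between the two.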

Methods

We support the following methods.

Method similarity dissimilarity distance score
Levenshtein (default)
Euclidean
Damerau Levenshtein
Block Distance
Cosine
Tanimoto Coefficient
Dice
Simon White
Longest Common Substring
Longest Common Subsequence
Overlap Coefficient
Generalized Overlap Coefficient
Jaccard
Generalized Jaccard
Hamming
Jaro
Jaro Winkler
Needleman Wunsch
Smith Waterman
Smith Waterman Gotoh
Monge Elkan
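Several of the listed methods (Jaccard, Dice, Overlap Coefficient) have standard textbook definitions on sets. The sketch below shows those definitions in plain Python. It is not smart-match's implementation, which may differ in details such as tokenization or handling of repeated elements. The function names and the convention of returning 1.0 for two empty sets are assumptions.

```python
def jaccard(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|"""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def dice(a: set, b: set) -> float:
    """2 * |A ∩ B| / (|A| + |B|)"""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0


def overlap_coefficient(a: set, b: set) -> float:
    """|A ∩ B| / min(|A|, |B|)"""
    if not a and not b:
        return 1.0  # assumed convention: two empty sets are identical
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


if __name__ == "__main__":
    # 2-grams of "hello" and "hero", written out by hand.
    grams_hello = {"he", "el", "ll", "lo"}
    grams_hero = {"he", "er", "ro"}
    print(jaccard(grams_hello, grams_hero))
    print(dice(grams_hello, grams_hero))
    print(overlap_coefficient(grams_hello, grams_hero))
```

All three return values in [0, 1], so each can serve as a similarity in the sense defined under Concept.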

Installation

pip install smart-match

Usage

import smart_match
print(smart_match.similarity('hello', 'hero'))
print(smart_match.dissimilarity('hello', 'hero'))
print(smart_match.distance('hello', 'hero'))

Output:

0.6
0.4
2

See the Wiki for more details.

License

smart-match is free software. See the LICENSE file for the full text.
