An implementation of the Jaro distance algorithm devised by Matthew A. Jaro
De-duplicate short strings such as names by computing the similarity and distance between a pair of strings using wink-jaro-distance. It is an implementation of the Jaro distance algorithm, which determines the similarity/distance by taking into account insertions, deletions and transpositions.
Use npm to install:
npm install wink-jaro-distance --save
// Load Jaro Distance Function
var jaro = require( 'wink-jaro-distance' );
console.log( jaro( 'father', 'farther' ) );
// -> { distance: 0.04761904761904756, similarity: 0.9523809523809524 }
console.log( jaro( 'Angelina', 'Angelica' ) );
// -> { distance: 0.08333333333333337, similarity: 0.9166666666666666 }
console.log( jaro( 'Flikr', 'Flicker' ) );
// -> { distance: 0.09523809523809523, similarity: 0.9047619047619048 }
console.log( jaro( 'abcdef', 'fedcba' ) );
// -> { distance: 0.6111111111111112, similarity: 0.38888888888888884 }
Try experimenting with this example on Runkit in the browser.
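Since the stated use case is de-duplicating short strings, here is a minimal sketch of how the returned similarity could drive that. The `dedupe` helper, its `score` parameter and the threshold are hypothetical and not part of wink-jaro-distance; the sketch assumes only the `{ distance, similarity }` return shape shown above.

```javascript
// Hypothetical helper, NOT part of wink-jaro-distance.
// `score` is any function returning { distance, similarity },
// for example the function exported by this package.
function dedupe( names, score, threshold ) {
  var kept = [];
  names.forEach( function ( name ) {
    // Treat `name` as a duplicate if it is similar enough to any kept entry.
    var isDuplicate = kept.some( function ( k ) {
      return score( k, name ).similarity >= threshold;
    } );
    if ( !isDuplicate ) kept.push( name );
  } );
  return kept;
}

// Usage, assuming the package is installed:
// var jaro = require( 'wink-jaro-distance' );
// dedupe( [ 'daniel', 'danielle', 'father' ], jaro, 0.9 );
```

The first occurrence of each group of near-duplicates is kept, so the result depends on input order. A suitable threshold is data dependent; 0.9 is only an illustrative value.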
Computes the Jaro distance and similarity between strings s1 and s2.
Original Reference: UNIMATCH: A Record Linkage System: Users Manual pp 104.
jaro( 'daniel', 'danielle' );
// -> { distance: 0.08333333333333337, similarity: 0.9166666666666666 }
jaro( 'god', 'father' );
// -> { distance: 1, similarity: 0 }
Returns an object containing distance and similarity values between 0 and 1.
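To make the computation concrete, the following is a from-scratch sketch of the textbook Jaro measure. It is not the package's source code; the name `jaroSketch` is hypothetical, and the handling of empty strings is an assumption that may differ from what wink-jaro-distance does. Two characters count as a match when they are equal and no more than floor(max(|s1|, |s2|) / 2) - 1 positions apart, and half the number of matched characters that appear out of order is counted as transpositions.

```javascript
// Illustrative sketch of the textbook Jaro measure (not the package's code).
function jaroSketch( s1, s2 ) {
  // Assumption: two empty strings are identical; one empty string matches nothing.
  if ( s1.length === 0 && s2.length === 0 ) return { distance: 0, similarity: 1 };
  if ( s1.length === 0 || s2.length === 0 ) return { distance: 1, similarity: 0 };

  // Equal characters match only if they are at most `range` positions apart.
  var range = Math.max( 0, Math.floor( Math.max( s1.length, s2.length ) / 2 ) - 1 );
  var matched1 = new Array( s1.length ).fill( false );
  var matched2 = new Array( s2.length ).fill( false );
  var matches = 0;
  var i, j;

  for ( i = 0; i < s1.length; i += 1 ) {
    var lo = Math.max( 0, i - range );
    var hi = Math.min( s2.length - 1, i + range );
    for ( j = lo; j <= hi; j += 1 ) {
      if ( !matched2[ j ] && s1[ i ] === s2[ j ] ) {
        matched1[ i ] = true;
        matched2[ j ] = true;
        matches += 1;
        break;
      }
    }
  }
  if ( matches === 0 ) return { distance: 1, similarity: 0 };

  // Walk the matched characters of both strings in order and count
  // the positions where they disagree; half of that is the transposition count.
  var outOfOrder = 0;
  var k = 0;
  for ( i = 0; i < s1.length; i += 1 ) {
    if ( !matched1[ i ] ) continue;
    while ( !matched2[ k ] ) k += 1;
    if ( s1[ i ] !== s2[ k ] ) outOfOrder += 1;
    k += 1;
  }
  var transpositions = outOfOrder / 2;

  var similarity = ( ( matches / s1.length ) +
                     ( matches / s2.length ) +
                     ( ( matches - transpositions ) / matches ) ) / 3;
  return { distance: 1 - similarity, similarity: similarity };
}
```

For 'daniel' and 'danielle', all 6 characters of the shorter string match in order, so the similarity is (6/6 + 6/8 + 6/6) / 3, roughly 0.9167, which agrees with the example above. For 'god' and 'father' no characters match within the allowed range, so the similarity is 0.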
If you spot a bug that has not yet been reported, raise a new issue or consider fixing it and sending a pull request.
Wink is a family of open source packages for Statistical Analysis, Natural Language Processing and Machine Learning in Node.js. The code is thoroughly documented for easy human comprehension and has a test coverage of ~100%, making it reliable enough for building production-grade solutions.
wink-jaro-distance is copyright 2017-18 GRAYPE Systems Private Limited.
It is licensed under the terms of the MIT License.