Spark functions to run popular phonetic and string matching algorithms
Updated Feb 22, 2022 - Scala
A fuzzy matching string distance library for Scala and Java that includes Levenshtein distance, Jaro distance, Jaro-Winkler distance, Dice coefficient, N-Gram similarity, Cosine similarity, Jaccard similarity, longest common subsequence, Hamming distance, and more.
🎯 String metrics and phonetic algorithms for Scala (e.g. Dice/Sorensen, Hamming, Jaccard, Jaro, Jaro-Winkler, Levenshtein, Metaphone, N-Gram, NYSIIS, Overlap, Ratcliff/Obershelp, Refined NYSIIS, Refined Soundex, Soundex, Weighted Levenshtein).