Add the dependency to your `shard.yml`:

```yaml
dependencies:
  cadmium_distance:
    github: cadmiumcr/distance
```
The Jaro-Winkler algorithm returns a number between 0 and 1 that indicates how closely two strings match (1 is a perfect match, 0 is no match at all).
```crystal
jwd = Cadmium::Distance::JaroWinkler.new
jwd.distance("dixon", "dicksonx") # => 0.8133333333333332
jwd.distance("same", "same")      # => 1
jwd.distance("not", "same")       # => 0.0
```
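To show where a score like the one for `"dixon"` and `"dicksonx"` comes from, here is a minimal, self-contained sketch of the textbook Jaro-Winkler calculation. It is not the shard's implementation. The method names `jaro_similarity` and `jaro_winkler_similarity`, the 0.1 prefix scaling factor, the 4-character prefix cap, and the handling of empty strings are assumptions taken from the usual description of the algorithm.

```crystal
# Illustrative sketch only; not the cadmium_distance implementation.
# Jaro similarity: counts characters that match within a sliding window
# and penalizes matched characters that appear in a different order.
def jaro_similarity(a, b)
  s = a.chars
  t = b.chars
  return 1.0 if s == t
  return 0.0 if s.empty? || t.empty?

  # Match window: half the longer length (rounded down) minus one, never negative.
  window = [([s.size, t.size].max >> 1) - 1, 0].max
  s_matched = Array.new(s.size, false)
  t_matched = Array.new(t.size, false)
  matches = 0

  s.each_with_index do |ch, i|
    low = [i - window, 0].max
    high = [i + window, t.size - 1].min
    (low..high).each do |j|
      if !t_matched[j] && t[j] == ch
        s_matched[i] = true
        t_matched[j] = true
        matches += 1
        break
      end
    end
  end
  return 0.0 if matches == 0

  # Walk both strings' matched characters in order; each out-of-order pair
  # shows up twice, so halve the count to get the number of transpositions.
  k = 0
  out_of_order = 0
  s.each_with_index do |ch, i|
    next unless s_matched[i]
    while !t_matched[k]
      k += 1
    end
    out_of_order += 1 if ch != t[k]
    k += 1
  end
  transpositions = out_of_order >> 1

  m = matches.to_f
  (m / s.size + m / t.size + (m - transpositions) / m) / 3.0
end

# Winkler adjustment: boosts the Jaro score for a shared prefix
# (assumed here to be capped at 4 characters, with a scaling factor of 0.1).
def jaro_winkler_similarity(a, b, scaling = 0.1)
  jaro = jaro_similarity(a, b)
  limit = [a.size, b.size, 4].min
  prefix = 0
  while prefix < limit && a[prefix] == b[prefix]
    prefix += 1
  end
  jaro + prefix * scaling * (1.0 - jaro)
end

puts jaro_winkler_similarity("dixon", "dicksonx") # approximately 0.8133
```

For `"dixon"` and `"dicksonx"`, four characters match with no transpositions, giving a Jaro score of (4/5 + 4/8 + 4/4) / 3 ≈ 0.7667. The shared prefix `"di"` then raises it to 0.7667 + 2 · 0.1 · (1 − 0.7667) ≈ 0.8133.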
The Levenshtein distance algorithm returns the number of edits (insertions, modifications, or deletions) required to transform one string into another.
```crystal
Cadmium::Distance::Levenshtein.distance("doctor", "doktor") # => 1
Cadmium::Distance::Levenshtein.distance("doctor", "doctor") # => 0
Cadmium::Distance::Levenshtein.distance("flad", "flaten")   # => 3
```
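The edit count can be reproduced with the classic dynamic-programming formulation. The sketch below is only an illustration of that technique, not the shard's code; `edit_distance` is a hypothetical helper name, and each insertion, deletion, or substitution is assumed to cost 1.

```crystal
# Illustrative sketch only; not the cadmium_distance implementation.
# Row-by-row dynamic programming: prev[j] holds the distance between the
# characters of `a` processed so far and the first j characters of `b`.
def edit_distance(a, b)
  s = a.chars
  t = b.chars
  prev = (0..t.size).to_a # distance from the empty string to each prefix of b

  s.each_with_index do |ca, i|
    curr = [i + 1] # distance from the first i + 1 characters of a to ""
    t.each_with_index do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [
        prev[j + 1] + 1, # deletion
        curr[j] + 1,     # insertion
        prev[j] + cost,  # substitution (free when the characters are equal)
      ].min
    end
    prev = curr
  end

  prev.last
end

puts edit_distance("doctor", "doktor") # 1
puts edit_distance("flad", "flaten")   # 3
```

Keeping only the previous row uses memory proportional to the length of the second string instead of a full matrix.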
Pair Distance uses arbitrary n-grams to calculate how similar one string is to another. The algorithm first computes the bi-grams of each string and counts how many of them occur in both strings, then calculates the similarity with the formula
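`similarity = (2 · intersections) / (s1size + s2size)`, where `intersections` is the number of bi-grams shared by both strings and `s1size` and `s2size` are the number of bi-grams in each string.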
```crystal
Cadmium::Distance::Pair.distance("night", "nacht") # => 0.25
```
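As a worked example, `"night"` yields the bi-grams `ni`, `ig`, `gh`, `ht` and `"nacht"` yields `na`, `ac`, `ch`, `ht`. Only `ht` is shared, so the similarity is (2 · 1) / (4 + 4) = 0.25. The sketch below follows the same steps. It is not the shard's implementation: `bigrams` and `pair_similarity` are hypothetical names, and returning 0.0 when neither string has any bi-grams is an assumption.

```crystal
# Illustrative sketch only; not the cadmium_distance implementation.
# Splits a string into overlapping two-character pairs.
def bigrams(text)
  (0...text.size - 1).map { |i| text[i, 2] }
end

# similarity = (2 * shared bi-grams) / (bi-grams in a + bi-grams in b)
def pair_similarity(a, b)
  first = bigrams(a)
  remaining = bigrams(b)
  total = first.size + remaining.size
  return 0.0 if total == 0 # assumption: no bi-grams at all means no similarity

  shared = 0
  first.each do |pair|
    index = remaining.index(pair)
    if index
      # Remove the matched bi-gram so repeated pairs are only counted once each.
      remaining.delete_at(index)
      shared += 1
    end
  end

  2.0 * shared / total
end

puts pair_similarity("night", "nacht") # 0.25
```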
- Fork it (https://github.com/cadmiumcr/distance/fork)
- Create your feature branch (`git checkout -b my-new-feature`)
- Commit your changes (`git commit -am 'Add some feature'`)
- Push to the branch (`git push origin my-new-feature`)
- Create a new Pull Request
- Chris Watson - creator and maintainer