Edit distance algorithms inc. Jaro, Damerau-Levenshtein, and Optimal Alignment
Crystal
Switch branches/tags
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
spec
src
.gitignore
.travis.yml
LICENSE
README.md
shard.yml

README.md

edits

Build Status docrystal.org

A collection of edit distance algorithms in Crystal.

Includes Levenshtein, Restricted Edit (Optimal Alignment) and Damerau-Levenshtein distances, and Jaro and Jaro-Winkler similarity.

Installation

Add this to your application's shard.yml:

dependencies:
  edits:
    github: tcrouch/edits.cr

Usage

require "edits"

Levenshtein

Edit distance, taking into account deletion, addition and substitution.

Edits::Levenshtein.distance "raked", "bakers"
# => 3
Edits::Levenshtein.distance "iota", "atom"
# => 4
Edits::Levenshtein.distance "acer", "earn"
# => 4

# Max distance
Edits::Levenshtein.distance "iota", "atom", 2
# => 2
Edits::Levenshtein.most_similar "atom", ["atlas", "tram", "rota", "racer"]
# => "atlas"

Restricted Edit (Optimal Alignment)

Edit distance, accounting for deletion, addition, substitution and transposition (two adjacent characters are swapped). This variant is restricted by the condition that no sub-string is edited more than once.

Edits::RestrictedEdit.distance "raked", "bakers"
# => 3
Edits::RestrictedEdit.distance "iota", "atom"
# => 3
Edits::RestrictedEdit.distance "acer", "earn"
# => 4

# Max distance
Edits::RestrictedEdit.distance "iota", "atom", 2
# => 2
Edits::RestrictedEdit.most_similar "atom", ["iota", "tome", "mown", "tame"]
# => "tome"

Damerau-Levenshtein

Edit distance, accounting for deletions, additions, substitution and transposition (two adjacent characters are swapped).

Edits::DamerauLevenshtein.distance "raked", "bakers"
# => 3
Edits::DamerauLevenshtein.distance "iota", "atom"
# => 3
Edits::DamerauLevenshtein.distance "acer", "earn"
# => 3

Jaro & Jaro-Winkler

Edits::Jaro.similarity "information", "informant"
# => 0.90235690235690236
Edits::Jaro.distance "information", "informant"
# => 0.097643097643097643

Edits::JaroWinkler.similarity "information", "informant"
# => 0.94141414141414137
Edits::JaroWinkler.distance "information", "informant"
# => 0.05858585858585863

Contributing

  1. Fork it ( https://github.com/tcrouch/edits.cr/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Contributors