Levenshtein distance with the ability to inject hints into the ruleset, so that selected edits can be made cheaper.

Hintable Levenshtein

Levenshtein distances, but with extra hints. Perhaps adding or deleting a space is a smaller change than other edits, or substituting a 'c' for a 'k' is cheaper than an arbitrary substitution.
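
To make the idea concrete, here is a minimal sketch of a weighted edit distance in plain Ruby. It is not the gem's implementation and does not use its API: the method name `weighted_distance` and the `sub_costs` / `indel_costs` parameters are hypothetical, chosen only for illustration. It assumes that any edit without a hint costs 1.0, as in ordinary Levenshtein.

```ruby
# A simplified weighted edit distance. NOT the gem's implementation.
# `sub_costs` maps [from_char, to_char] pairs to a substitution cost,
# `indel_costs` maps a character to its insert/delete cost.
# Both names are hypothetical; any edit not listed costs 1.0.
def weighted_distance(a, b, sub_costs: {}, indel_costs: {})
  # prev[j] is the cheapest cost of turning the consumed prefix of a into b[0, j].
  # Start with the empty prefix of a: build b[0, j] purely by insertions.
  prev = [0.0]
  b.each_char { |ch| prev << prev.last + indel_costs.fetch(ch, 1.0) }

  a.each_char do |ca|
    # First column: delete every character of a consumed so far.
    curr = [prev[0] + indel_costs.fetch(ca, 1.0)]
    b.each_char.with_index(1) do |cb, j|
      sub = ca == cb ? 0.0 : sub_costs.fetch([ca, cb], 1.0)
      curr << [
        prev[j] + indel_costs.fetch(ca, 1.0),      # delete ca
        curr[j - 1] + indel_costs.fetch(cb, 1.0),  # insert cb
        prev[j - 1] + sub                          # substitute or match
      ].min
    end
    prev = curr
  end
  prev.last
end

puts weighted_distance("kitten", "citten")                                # 1.0
puts weighted_distance("kitten", "citten", sub_costs: { %w[k c] => 0.7 }) # 0.7
```

This sketch only handles single-character hints. The rulesets in the example below can also combine several rules under one weight, which this sketch does not attempt.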

Just an example

english_rules = [
  HintableLevenshtein::RuleSet.new(0.3, HintableLevenshtein::Rule.insert(/[\.,!]/)),
  HintableLevenshtein::RuleSet.new(0.3, HintableLevenshtein::Rule.delete(/[\.,!]/)),
  HintableLevenshtein::RuleSet.new(0.4, HintableLevenshtein::Rule.substitute('!' => '.')),
  HintableLevenshtein::RuleSet.new(0.4, HintableLevenshtein::Rule.substitute('!' => ',')),
  HintableLevenshtein::RuleSet.new(0.75, HintableLevenshtein::Rule.insert(' '), HintableLevenshtein::Rule.insert(' ')),
  HintableLevenshtein::RuleSet.new(0.5, HintableLevenshtein::Rule.insert(' ')),
  HintableLevenshtein::RuleSet.new(0.5, HintableLevenshtein::Rule.delete(' ')),
  HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('z' => 's')),
  HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('k' => 'c')),
  HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('u' => 'o')),
  HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('e' => 'a')),
  HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('i' => 'y')),
  HintableLevenshtein::RuleSet.new(1, HintableLevenshtein::Rule.delete),
  HintableLevenshtein::RuleSet.new(1, HintableLevenshtein::Rule.insert),
  HintableLevenshtein::RuleSet.new(1, HintableLevenshtein::Rule.substitute)
]

a = "hello kitten pizza!!"
b = "hello    cittin pissssa.."

puts "normal levenshtein: #{HintableLevenshtein.new.distance(a, b)}"
puts "hinted levenshtein: #{HintableLevenshtein.new(english_rules).distance(a, b)}"

Would output:

normal levenshtein: 11.0
hinted levenshtein: 7.15
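
The two totals can be checked by hand. The sketch below lists one alignment of `a` to `b` whose costs add up to both figures. The step list is inferred from the rules and strings above and is not produced by the gem, so treat it as an assumption about which edits are chosen. It also assumes that the 0.75 ruleset prices two consecutive space insertions together.

```ruby
# One alignment of a -> b that reproduces both totals shown above.
# The steps are inferred from the rules and strings, not printed by the gem.
steps = [
  # [description, plain cost, hinted cost]
  ["insert 3 spaces (one 0.75 pair + one 0.5 single)", 3.0, 0.75 + 0.5],
  ["substitute k -> c",                                1.0, 0.7],
  ["substitute e -> i (no hint applies)",              1.0, 1.0],
  ["substitute z -> s, twice",                         2.0, 2 * 0.7],
  ["insert s, twice (no hint applies)",                2.0, 2.0],
  ["substitute ! -> ., twice",                         2.0, 2 * 0.4]
]

plain  = steps.sum { |_, cost, _| cost }
hinted = steps.sum { |_, _, cost| cost }

puts "normal levenshtein: #{plain.round(2)}"
puts "hinted levenshtein: #{hinted.round(2)}"
```

Note that 'e' => 'i' gets no discount: the rules only cover 'e' => 'a' and 'i' => 'y', so that substitution falls through to the catch-all rule with cost 1.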