Levenshtein distance with the ability to inject hints into the rule set, so that selected edits can be made cheaper.

Hintable Levenshtein

Levenshtein distances, but with extra hints. Perhaps adding or deleting a space is a smaller change than other edits, or substituting a 'c' for a 'k' should be cheaper than an arbitrary substitution. Each hint is a rule set that assigns a lower cost to such edits.
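
To make the idea of cheaper edits concrete, here is a minimal standalone sketch of a weighted edit distance. It does not use this gem and is not its implementation: the method weighted_distance and the three cost lambdas are hypothetical names chosen for illustration, and unlike the rule sets shown further down it only prices single-character edits.

```ruby
# Standalone sketch of a weighted edit distance; it does not use the gem.
# The cost lambdas are hypothetical stand-ins for a rule set.
def weighted_distance(a, b, insert_cost:, delete_cost:, substitute_cost:)
  # dist[i][j] holds the cheapest cost of turning a[0, i] into b[0, j].
  dist = Array.new(a.length + 1) { Array.new(b.length + 1, 0.0) }
  (1..a.length).each { |i| dist[i][0] = dist[i - 1][0] + delete_cost.call(a[i - 1]) }
  (1..b.length).each { |j| dist[0][j] = dist[0][j - 1] + insert_cost.call(b[j - 1]) }

  (1..a.length).each do |i|
    (1..b.length).each do |j|
      from = a[i - 1]
      to = b[j - 1]
      # Matching characters cost nothing; otherwise ask the substitution hint.
      keep_or_swap = dist[i - 1][j - 1] + (from == to ? 0.0 : substitute_cost.call(from, to))
      dist[i][j] = [
        keep_or_swap,
        dist[i - 1][j] + delete_cost.call(from),
        dist[i][j - 1] + insert_cost.call(to)
      ].min
    end
  end
  dist[a.length][b.length]
end

# Hints: spaces are cheap to add or drop, and two substitutions are cheap.
cheap_swaps = { %w[k c] => 0.7, %w[z s] => 0.7 }
space_cost = ->(ch) { ch == ' ' ? 0.5 : 1.0 }
swap_cost = ->(from, to) { cheap_swaps.fetch([from, to], 1.0) }
unit = ->(*) { 1.0 } # plain Levenshtein: every edit costs 1

plain = weighted_distance('kitten', 'citten',
                          insert_cost: unit, delete_cost: unit, substitute_cost: unit)
hinted = weighted_distance('kitten', 'citten',
                           insert_cost: space_cost, delete_cost: space_cost, substitute_cost: swap_cost)
puts "plain: #{plain}, hinted: #{hinted}"
```

With unit costs this reduces to the ordinary Levenshtein distance; the hints only change how much each individual edit contributes to the total.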

Just an example

# Each RuleSet pairs a cost with one or more rules; a lower cost means a cheaper edit.
english_rules = [
  HintableLevenshtein::RuleSet.new(0.3, HintableLevenshtein::Rule.insert(/[\.,!]/)),
  HintableLevenshtein::RuleSet.new(0.3, HintableLevenshtein::Rule.delete(/[\.,!]/)),
  HintableLevenshtein::RuleSet.new(0.4, HintableLevenshtein::Rule.substitute('!' => '.')),
  HintableLevenshtein::RuleSet.new(0.4, HintableLevenshtein::Rule.substitute('!' => ',')),
  HintableLevenshtein::RuleSet.new(0.75, HintableLevenshtein::Rule.insert(' '), HintableLevenshtein::Rule.insert(' ')),
  HintableLevenshtein::RuleSet.new(0.5, HintableLevenshtein::Rule.insert(' ')),
  HintableLevenshtein::RuleSet.new(0.5, HintableLevenshtein::Rule.delete(' ')),
  HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('z' => 's')),
  HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('k' => 'c')),
  HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('u' => 'o')),
  HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('e' => 'a')),
  HintableLevenshtein::RuleSet.new(0.7, HintableLevenshtein::Rule.substitute('i' => 'y')),
  # Fallbacks: any other delete, insert, or substitute costs 1.
  HintableLevenshtein::RuleSet.new(1, HintableLevenshtein::Rule.delete),
  HintableLevenshtein::RuleSet.new(1, HintableLevenshtein::Rule.insert),
  HintableLevenshtein::RuleSet.new(1, HintableLevenshtein::Rule.substitute)
]

a = "hello kitten pizza!!"
b = "hello    cittin pissssa.."

puts "normal levenshtein: #{HintableLevenshtein.new.distance(a, b)}"
puts "hinted levenshtein: #{HintableLevenshtein.new(english_rules).distance(a, b)}"

Would output:

normal levenshtein: 11.0
hinted levenshtein: 7.15
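
Both totals can be checked by hand. The sketch below lists one alignment of a to b that adds up to the numbers above. It assumes this is the alignment the library settles on, and that the 0.75 rule set with two Rule.insert(' ') entries prices a pair of inserted spaces together; neither assumption is confirmed by this README. The edits table is purely illustrative and not part of the gem.

```ruby
# One alignment of a => b that accounts for both totals (assumed, not taken from the gem).
# Each entry: [description, number of single-character edits, hinted cost].
edits = [
  ['insert two spaces (paired 0.75 rule)', 2, 0.75],
  ['insert a third space',                 1, 0.5],
  ["substitute 'k' => 'c'",                1, 0.7],
  ["substitute 'e' => 'i' (no hint)",      1, 1.0],
  ["substitute 'z' => 's', twice",         2, 1.4],
  ["insert 's', twice (no hint)",          2, 2.0],
  ["substitute '!' => '.', twice",         2, 0.8]
]

# Plain Levenshtein charges 1 per edit; the hinted total sums the rule costs.
plain_total = edits.sum { |_, count, _| count }
hinted_total = edits.sum { |_, _, cost| cost }.round(2)

puts "plain: #{plain_total}, hinted: #{hinted_total}"
```

Note that 'e' => 'i' falls through to the generic substitute rule because the rule set only hints 'e' => 'a' and 'i' => 'y'.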