speed up levenshtein_distance about 18% #812

Merged
drbrain merged 1 commit into rubygems:master from tenderlove/faster_levenshtein on Feb 6, 2014

tenderlove commented Feb 6, 2014

This commit speeds up the levenshtein_distance method by about 18%.

Here is my benchmark:

require 'benchmark/ips'
require 'rubygems/text'

include Gem::Text

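def min3 a, b, c
  if a < b && a < c
    a
  elsif b < c
    # a is not the strict minimum here, so comparing b with c is enough;
    # this also returns the right value when a == b (e.g. min3(1, 1, 2) is 1)
    b
  else
    c
  end
end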

def levenshtein_distance2 str1, str2
  s = str1
  t = str2
  n = s.length
  m = t.length
  max = n/2

  return m if (0 == n)
  return n if (0 == m)
  return n if (n - m).abs > max

  d = (0..m).to_a
  x = nil

  str1.each_char.each_with_index do |char1,i|
    e = i+1

    str2.each_char.each_with_index do |char2,j|
      cost = (char1 == char2) ? 0 : 1
      x = min3(
        d[j+1] + 1,  # insertion
        e + 1,       # deletion
        d[j] + cost) # substitution

      d[j] = e
      e = x
    end

    d[m] = x
  end

  return x
end

str1 = "hello world"
str2 = " hello world"

Benchmark.ips do |x|
  x.report("original")  { levenshtein_distance str1, str2 }
  x.report("new")       { levenshtein_distance2 str1, str2 }
end

My results:

[aaron@higgins rubygems (master)]$ ruby -I lib test.rb
Calculating -------------------------------------
            original       888 i/100ms
                 new      1007 i/100ms
-------------------------------------------------
            original     8746.8 (±8.7%) i/s -      43512 in   5.026144s
                 new    10351.0 (±5.2%) i/s -      52364 in   5.073970s
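
For reference, the "about 18%" figure can be reproduced from the two iterations-per-second numbers above. This is only a small arithmetic sketch; the variable names are made up for illustration and the values are copied from this one run, so the error margins reported by benchmark/ips (±8.7% and ±5.2%) still apply.

```ruby
# Iterations per second copied from the benchmark output above.
original_ips = 8746.8   # "original" report
new_ips      = 10351.0  # "new" report

# Relative throughput gain of the new implementation, in percent.
speedup_percent = (new_ips / original_ips - 1) * 100

puts format("%.1f%% more iterations per second", speedup_percent)
```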

In practice it only shaves about 200ms off the error path (a failed install that suggests alternatives) on my machine:

[aaron@higgins rubygems (faster_levenshtein)]$ time ruby -I lib -S gem install minitist 
ERROR:  Could not find a valid gem 'minitist' (>= 0) in any repository
ERROR:  Possible alternatives: minitest, minigit, minilisp, minit, linguist

real    0m7.957s
user    0m4.138s
sys     0m0.099s
[aaron@higgins rubygems (faster_levenshtein)]$ git checkout master
Switched to branch 'master'
[aaron@higgins rubygems (master)]$ time ruby -I lib -S gem install minitist 
ERROR:  Could not find a valid gem 'minitist' (>= 0) in any repository
ERROR:  Possible alternatives: minitest, minigit, minilisp, minit, linguist

real    0m8.182s
user    0m4.417s
sys     0m0.102s

I think the majority of time is now network here (on this connection anyway).
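
A quick way to sanity-check the single-row approach used in levenshtein_distance2 above is to compare it against a plain full-matrix implementation. The sketch below is self-contained and does not load rubygems; the names min_of_three, single_row_distance and full_matrix_distance are hypothetical helpers written for this illustration, and the early return on a large length difference from the benchmark script is deliberately left out so that both functions compute the exact edit distance.

```ruby
# Minimum of three integers; the second branch handles ties between a and b.
def min_of_three(a, b, c)
  if a < b && a < c
    a
  elsif b < c
    b
  else
    c
  end
end

# Single-row Levenshtein distance: only the previous row `d` is kept,
# rather than the full (n+1) x (m+1) matrix.
def single_row_distance(str1, str2)
  n = str1.length
  m = str2.length
  return m if n.zero?
  return n if m.zero?

  d = (0..m).to_a # row for the empty prefix of str1
  x = nil

  str1.each_char.with_index do |char1, i|
    e = i + 1 # first column of the current row

    str2.each_char.with_index do |char2, j|
      cost = char1 == char2 ? 0 : 1
      x = min_of_three(
        d[j + 1] + 1, # insertion
        e + 1,        # deletion
        d[j] + cost   # substitution
      )
      d[j] = e # d[j] now holds the current row's value
      e = x
    end

    d[m] = x
  end

  x
end

# Straightforward full-matrix reference, used only for comparison.
def full_matrix_distance(a, b)
  rows = Array.new(a.length + 1) { |i| [i] + [0] * b.length }
  rows[0] = (0..b.length).to_a

  (1..a.length).each do |i|
    (1..b.length).each do |j|
      cost = a[i - 1] == b[j - 1] ? 0 : 1
      rows[i][j] = [rows[i - 1][j] + 1,
                    rows[i][j - 1] + 1,
                    rows[i - 1][j - 1] + cost].min
    end
  end

  rows[a.length][b.length]
end

pairs = [
  ["kitten", "sitting"],
  ["hello world", " hello world"],
  ["minitist", "minitest"],
  ["abc", "abc"]
]

pairs.each do |a, b|
  fast = single_row_distance(a, b)
  slow = full_matrix_distance(a, b)
  puts "#{a.inspect} -> #{b.inspect}: #{fast} (reference: #{slow})"
end
```

If the two columns ever disagree for a pair of strings, the single-row bookkeeping (or the three-way minimum) is the first place to look.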

drbrain added a commit that referenced this pull request Feb 6, 2014

Merge pull request #812 from tenderlove/faster_levenshtein
Speed up levenshtein_distance about 18%

@drbrain drbrain merged commit d7ac5b4 into rubygems:master Feb 6, 2014
