speed up levenshtein distance #809

Merged
drbrain merged 1 commit into rubygems:master on Feb 5, 2014

tenderlove commented Feb 5, 2014

This speeds up the Levenshtein distance calculation by about 1.6x. Levenshtein distance is used to suggest alternatives when you mistype a gem name on gem install.

Here is my benchmark:

require 'benchmark/ips'
require 'rubygems/text'

include Gem::Text

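def min3 a, b, c
  # Smallest of three values. The second branch only needs b < c:
  # requiring b < a as well would return c when a == b < c.
  if a < b && a < c
    a
  elsif b < c
    b
  else
    c
  end
end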

def levenshtein_distance2 str1, str2
  s = str1
  t = str2
  n = s.length
  m = t.length
  max = n/2

  return m if (0 == n)
  return n if (0 == m)
  return n if (n - m).abs > max

  d = (0..m).to_a
  x = nil

  n.times do |i|
    e = i+1

    m.times do |j|
      cost = (s[i] == t[j]) ? 0 : 1
      x = min3(
        d[j+1] + 1,  # insertion
        e + 1,       # deletion
        d[j] + cost) # substitution

      d[j] = e
      e = x
    end

    d[m] = x
  end

  return x
end

str1 = "hello world"
str2 = " hello world"

Benchmark.ips do |x|
  x.report("original")  { levenshtein_distance str1, str2 }
  x.report("new")       { levenshtein_distance2 str1, str2 }
end

Results:

[aaron@higgins rubygems (master)]$ ruby -I lib test.rb
Calculating -------------------------------------
            original       800 i/100ms
                 new      1398 i/100ms
-------------------------------------------------
            original     7969.3 (±5.5%) i/s -      40000 in   5.036031s
                 new    14185.3 (±3.3%) i/s -      71298 in   5.031767s
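
The benchmark above measures speed only. One way to gain some confidence in a single-row implementation is to compare it against the textbook full-matrix version on a few inputs. The following is a minimal sketch, not code from this pull request: `matrix_distance` and `single_row_distance` are hypothetical names, and the single-row variant deliberately omits the early return on length difference that `levenshtein_distance2` has, so it is a plain edit distance.

```ruby
# Textbook full-matrix Levenshtein distance, kept simple to serve as a reference.
def matrix_distance(s, t)
  n = s.length
  m = t.length
  # d[i][j] holds the distance between the first i chars of s and the first j chars of t.
  d = Array.new(n + 1) { Array.new(m + 1, 0) }
  (0..n).each { |i| d[i][0] = i }
  (0..m).each { |j| d[0][j] = j }

  (1..n).each do |i|
    (1..m).each do |j|
      cost = s[i - 1] == t[j - 1] ? 0 : 1
      d[i][j] = [
        d[i - 1][j] + 1,       # delete a char from s
        d[i][j - 1] + 1,       # insert a char into s
        d[i - 1][j - 1] + cost # substitute (or keep) a char
      ].min
    end
  end
  d[n][m]
end

# Single-row variant: only the previous row is kept and it is overwritten in place.
def single_row_distance(s, t)
  n = s.length
  m = t.length
  return m if n.zero?
  return n if m.zero?

  d = (0..m).to_a # row 0: distance from "" to each prefix of t
  x = nil
  n.times do |i|
    e = i + 1 # first cell of the row being computed
    m.times do |j|
      cost = s[i] == t[j] ? 0 : 1
      x = [d[j + 1] + 1, e + 1, d[j] + cost].min
      d[j] = e # d[j] from the old row is no longer needed
      e = x
    end
    d[m] = x
  end
  x
end

pairs = [
  ["hello world", " hello world"],
  ["minitist", "minitest"],
  ["kitten", "sitting"],
  ["abc", ""]
]
mismatches = pairs.reject { |a, b| matrix_distance(a, b) == single_row_distance(a, b) }
puts mismatches.empty? ? "all pairs agree" : "mismatches: #{mismatches.inspect}"
```

`Array#min` is used here for clarity; a hand-written three-way minimum like `min3` avoids building a temporary array on every inner-loop iteration.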

tenderlove commented Feb 5, 2014

I forgot to mention, this saves ~1.5 seconds for me on a failed install:

[aaron@higgins rubygems (faster_levenshtein)]$ time ruby -I lib -S bin/gem install minitist
ERROR:  Could not find a valid gem 'minitist' (>= 0) in any repository
ERROR:  Possible alternatives: minitest, minigit, minilisp, minit, linguist

real    0m6.296s
user    0m2.496s
sys     0m0.087s
[aaron@higgins rubygems (faster_levenshtein)]$ git checkout master
Switched to branch 'master'
[aaron@higgins rubygems (master)]$ time ruby -I lib -S bin/gem install minitist
ERROR:  Could not find a valid gem 'minitist' (>= 0) in any repository
ERROR:  Possible alternatives: minitest, minigit, minilisp, minit, linguist

real    0m7.837s
user    0m3.970s
sys     0m0.093s
[aaron@higgins rubygems (master)]$
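
For context on why the distance function sits on the hot path: a "Possible alternatives" list like the one above can be produced by computing the distance from the typo to every known gem name and keeping the closest ones, so the function runs once per candidate. The sketch below illustrates that idea only. `Suggester` and its name list are hypothetical, and this is not the actual RubyGems suggestion code; the one real API it relies on is `Gem::Text#levenshtein_distance`, the same method the benchmark calls.

```ruby
require 'rubygems/text'

# Hypothetical helper: ranks a fixed list of names by edit distance to a typo.
class Suggester
  include Gem::Text # provides levenshtein_distance

  def initialize(names)
    @names = names
  end

  # Returns up to `limit` names, closest first; ties are broken alphabetically.
  def suggest(typo, limit = 5)
    @names.sort_by { |name| [levenshtein_distance(typo, name), name] }.first(limit)
  end
end

# Illustrative name list; a real index would hold far more entries.
names = %w[rails minit linguist minilisp minigit minitest]
suggester = Suggester.new(names)
puts suggester.suggest("minitist").first # the closest candidate, "minitest"
```

With a large index the distance is computed once per name, which is why a faster inner loop shows up directly in the wall-clock time of a failed install.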

drbrain added this to the 2.3 milestone Feb 5, 2014

drbrain added a commit that referenced this pull request Feb 5, 2014

drbrain merged commit 796ce1b into rubygems:master Feb 5, 2014

1 check passed

default The Travis CI build passed

