Levenshtein, programming style

The version before this commit was, I believe, correct.
I would therefore totally understand if this pull request were rejected.

However, I think levenshtein provides a very nice example to demonstrate
some of the functionality and pitfalls of CoffeeScript.

First, I am not too fond of the "return unless" statement. It
feels a bit uncanny to find a non-indented return in the middle
of a function.

More importantly, I am not fond of what this statement is trying to hide. In
most languages, checking for the empty string is not useful.

See the Discussion section.
1 parent 7437de8, commit c94b78a51873d6dcf66370182c69c434fe51486f, fulmicoton committed Jan 31, 2013
Showing with 25 additions and 18 deletions.
  1. +25 −18 chapters/strings/matching-strings.md
@@ -13,30 +13,37 @@ Calculate the edit distance, or number of operations required to transform one s
{% highlight coffeescript %}
-Levenshtein =
- (str1, str2) ->
-
+levenshtein = (str1, str2) ->
+
     l1 = str1.length
     l2 = str2.length
+    prevDist = [0..l2]
+    nextDist = [0..l2]
+
+    for i in [1..l1] by 1
+        nextDist[0] = i
+        for j in [1..l2] by 1
+            if (str1.charAt i-1) == (str2.charAt j-1)
+                nextDist[j] = prevDist[j-1]
+            else
+                nextDist[j] = 1 + Math.min prevDist[j], nextDist[j-1], prevDist[j-1]
+        [prevDist,nextDist]=[nextDist, prevDist]
+
+    prevDist[l2]
-    return Math.max l1, l2 unless l1 and l2
+{% endhighlight %}
-    i = 0; j = 0; distance = []
+## Discussion
-    distance[i] = [i] for i in [0..l1]
-    distance[0][j] = j for j in [0..l2]
+You can use either Hirschberg or Wagner–Fischer's algorithm to calculate a Levenshtein distance. This example uses Wagner–Fischer's algorithm.
-    for i in [1..l1]
-        for j in [1..l2]
-            distance[i][j] = Math.min distance[i - 1][j] + 1,
-                                      distance[i][j - 1] + 1,
-                                      distance[i - 1][j - 1] +
-                                        if (str1.charAt i - 1) is (str2.charAt j - 1) then 0 else 1
+This version of the Levenshtein algorithm is linear in memory and quadratic in time.
-    distance[l1][l2]
-
-{% endhighlight %}
+str.charAt i is preferred here to str[i] because the latter syntax is not supported by some browsers (e.g. IE7).
-## Discussion
+At first glance, the use of "by 1" in the two loops might look useless. It is actually there to avoid a common danger
+of the CoffeeScript [i..j] syntax. If str1 or str2 is an empty string, then [1..l1] or [1..l2] evaluates to [1,0], so the loop body would run twice instead of not at all.
+The loops with the "by 1" statement also compile to cleaner / slightly more performant JavaScript.
-You can use either Hirschberg or Wagner–Fischer's algorithm to calculate a Levenshtein distance. This example uses Wagner–Fischer's algorithm.
+Finally, recycling the two arrays at the end of each outer-loop iteration is mainly there to
+demonstrate the CoffeeScript syntax for swapping two variables.
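
The points made in the new Discussion text (two rows of memory, the empty-string range pitfall, the array swap) can be sketched in plain JavaScript, the language CoffeeScript compiles to. This is a hand-written illustration under stated assumptions, not the compiler's actual output and not part of the commit; `inclusiveRange` is a hypothetical helper that only mimics how a `[from..to]` range without `by` picks its direction.

```javascript
// Hand-written JavaScript port of the commit's two-row levenshtein,
// for illustration only (not generated by the CoffeeScript compiler).
function levenshtein(str1, str2) {
  const l1 = str1.length;
  const l2 = str2.length;
  // Only two rows of the distance table are kept, so memory is O(l2).
  let prevDist = Array.from({ length: l2 + 1 }, (_, j) => j);
  let nextDist = Array.from({ length: l2 + 1 }, (_, j) => j);

  // Plain ascending loops, the kind of loop "by 1" is meant to produce:
  // when l1 or l2 is 0 the body simply never runs.
  for (let i = 1; i <= l1; i += 1) {
    nextDist[0] = i;
    for (let j = 1; j <= l2; j += 1) {
      if (str1.charAt(i - 1) === str2.charAt(j - 1)) {
        nextDist[j] = prevDist[j - 1];
      } else {
        nextDist[j] = 1 + Math.min(prevDist[j], nextDist[j - 1], prevDist[j - 1]);
      }
    }
    // Destructuring swap, the counterpart of [prevDist,nextDist]=[nextDist, prevDist].
    [prevDist, nextDist] = [nextDist, prevDist];
  }
  return prevDist[l2];
}

// Hypothetical helper (an assumption for this sketch): mimics an inclusive
// range [from..to] written without "by", which counts down when from > to.
function inclusiveRange(from, to) {
  const values = [];
  if (from <= to) {
    for (let v = from; v <= to; v += 1) values.push(v);
  } else {
    for (let v = from; v >= to; v -= 1) values.push(v);
  }
  return values;
}

console.log(inclusiveRange(1, 0));             // [ 1, 0 ]
console.log(levenshtein("", "abc"));           // 3
console.log(levenshtein("kitten", "sitting")); // 3
```

Because the loops only ever count upward, no separate guard for empty strings is needed: with an empty first string the outer loop is skipped and the initial row already holds the answer.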
