
Improved performance of minkowski-distance function in stats

1 parent 1c1cb32 commit 53daef3672af16e606dbb2068bdacdafa766c942 David Williams committed Mar 17, 2012
Showing with 26 additions and 12 deletions.
  1. +15 −12 modules/incanter-core/src/incanter/stats.clj
  2. +11 −0 modules/incanter-core/test/incanter/stats_tests.clj
27 modules/incanter-core/src/incanter/stats.clj
@@ -15,7 +15,6 @@
;; March 11, 2009: First version
-
(ns ^{:doc "This is the core statistical library for Incanter.
It provides probability functions (cdf, pdf, quantile),
random number generation, statistical tests, basic
@@ -3077,6 +3076,13 @@ Legendre[2] discusses a variant of the W statistic which accommodates ties in th
;;TODO: add graphical approaches to similarity: http://en.wikipedia.org/wiki/SimRank
;;TODO: string similarity measures: http://en.wikipedia.org/wiki/String_metric
+(defn fast-abs
+ "Fast absolute value function"
+ [x]
+ (if (< x 0)
+ (*' -1 x)
+ x))
+
(defn minkowski-distance
"http://en.wikipedia.org/wiki/Minkowski_distance
http://en.wikipedia.org/wiki/Lp_space
@@ -3088,17 +3094,14 @@ Minkowski distance is typically used with p being 1 or 2. The latter is the Eucl
In the limiting case of p reaching infinity we obtain the Chebyshev distance."
[a b p]
{:pre [(= (count a) (count b))]}
- (pow
- (apply
- tree-comp-each
- +
- (fn [[x y]]
- (pow
- (abs
- (- x y))
- p))
- (map vector a b))
- (/ 1 p)))
+ (pow
+ (reduce +
+ (map
+ #(pow
+ (fast-abs (- %1 %2))
+ p)
+ a b))
+ (/ 1 p)))
(defn euclidean-distance
"http://en.wikipedia.org/wiki/Euclidean_distance
11 modules/incanter-core/test/incanter/stats_tests.clj
@@ -235,6 +235,11 @@
(is (= 1 (damerau-levenshtein-distance b c)))
(is (= 3 (damerau-levenshtein-distance a c)))))
+(deftest fast-abs-test
+ (is
+ (= 9223372036854775808
+ (fast-abs -9223372036854775808))))
+
(deftest euclid
(is
(= 2.8284271247461903
@@ -247,6 +252,12 @@
(manhattan-distance [2 4 3 1 6]
[3 5 1 2 5]))))
+(deftest minkowski-3
+ (is
+ (= 2.2894284851066637
+ (minkowski-distance
+ [2 4 3 1 6] [3 5 1 2 5] 3))))
+
(deftest chebyshev
(is
(== 2
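
For readers who want to try the change outside Incanter, here is a minimal stand-alone sketch of what the new code computes: the p-th root of the sum of |a_i - b_i|^p. It is not the library's implementation. The name `minkowski-sketch` is hypothetical, and `Math/pow` stands in for Incanter's `pow` on the assumption that the two agree for plain numbers.

```clojure
;; Sketch only: Math/pow replaces incanter.core/pow so the block
;; runs without Incanter on the classpath (an assumption, see above).

(defn fast-abs
  "Absolute value. *' promotes to BigInt instead of overflowing,
  so Long/MIN_VALUE is handled."
  [x]
  (if (< x 0)
    (*' -1 x)
    x))

(defn minkowski-sketch
  "Hypothetical stand-alone version of minkowski-distance:
  (sum of |a_i - b_i|^p) raised to 1/p."
  [a b p]
  {:pre [(= (count a) (count b))]}
  (Math/pow
    (reduce +
            ;; map over both sequences at once; no intermediate pairs
            (map #(Math/pow (double (fast-abs (- %1 %2))) (double p))
                 a b))
    (/ 1.0 p)))

;; Same inputs as the minkowski-3 test in this commit.
(minkowski-sketch [2 4 3 1 6] [3 5 1 2 5] 3)
```

With these inputs the absolute differences are 1, 1, 2, 1, 1, so the sum of cubes is 12 and the result is the cube root of 12, roughly 2.289. Mapping over `a` and `b` directly avoids building a vector for every pair, which the removed `(map vector a b)` version did.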
