Various distance and similarity measures for machine learning.
A gem to test which metric works best for particular kinds of datasets in machine learning. Besides the Array class, I also want to support NMatrix.

This is a fork of the gem Distance Measure, which has a similar objective, but isn't actively maintained and doesn't support NMatrix. Thank you, @reddavis. :)


gem install measurable

This gem is currently being tested on MRI Ruby 1.9.3, 2.0, 2.1.0, 2.1 (HEAD) and on Rubinius 2.x (HEAD). I hope to add JRuby support in the future.

Available distance measures

I'm using the term "distance measure" without much concern for the strict mathematical definition of a metric. If the documentation for one of the methods isn't clear about whether it is a true metric, please open an issue.
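To make the distinction concrete, here is a small sketch in plain Ruby. It doesn't use the gem: `squared_euclidean`, `euclidean` and `triangle_inequality?` are hypothetical helpers written only for this illustration. A true metric has to satisfy the triangle inequality, d(a, c) <= d(a, b) + d(b, c). The Euclidean distance does, but the squared Euclidean distance doesn't, so it's a distance measure without being a metric.

```ruby
# Hypothetical helpers for illustration only; this is not the gem's code.

# Sum of squared differences between two equally sized arrays.
def squared_euclidean(u, v)
  u.zip(v).map { |x, y| (x - y)**2 }.reduce(0, :+)
end

# Square root of the sum of squared differences.
def euclidean(u, v)
  Math.sqrt(squared_euclidean(u, v))
end

# Checks d(a, c) <= d(a, b) + d(b, c) for the distance given as a block.
def triangle_inequality?(a, b, c)
  yield(a, c) <= yield(a, b) + yield(b, c)
end

a, b, c = [0], [1], [2]

# Holds for the Euclidean distance on these three points.
euclid_ok = triangle_inequality?(a, b, c) { |u, v| euclidean(u, v) }

# Fails for the squared version: 4 is not <= 1 + 1.
squared_ok = triangle_inequality?(a, b, c) { |u, v| squared_euclidean(u, v) }

puts "euclidean: #{euclid_ok}, squared euclidean: #{squared_ok}"
```

A check like this on a few points can only disprove that a measure is a metric, never prove it.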

The following are the similarity measures supported at the moment:

How to use

The API I intend to support is something like this:

require 'measurable'

# Calculate the distance between two points in space.
Measurable.euclidean([1, 1], [0, 0]) # => 1.4142135623730951

# Calculate the norm of a vector, i.e. its distance from the origin.
Measurable.euclidean([1, 1]) # => 1.4142135623730951

# Get the cosine distance between two vectors.
Measurable.cosine_distance([1, 2], [2, 3]) # => 0.007722123286332261

# Calculate sum of squares directly.
Measurable.euclidean_squared([3, 4]) # => 25

Most of the methods accept arbitrary enumerable objects instead of Arrays. For example, it's possible to use NMatrix.
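As a rough idea of what accepting arbitrary enumerables can look like, here is a sketch in plain Ruby. It isn't the gem's implementation: the `DistanceSketch` module and its methods are hypothetical, and cosine distance is assumed here to mean 1 minus the cosine similarity. Each input only needs to respond to `to_a`, so a Range works as well as an Array.

```ruby
# Hypothetical module for illustration only; this is not the gem's code.
module DistanceSketch
  module_function

  # Euclidean distance between two equally sized enumerables.
  def euclidean(u, v)
    a, b = u.to_a, v.to_a
    raise ArgumentError, "size mismatch" unless a.size == b.size

    Math.sqrt(a.zip(b).map { |x, y| (x - y)**2 }.reduce(0.0, :+))
  end

  # Cosine distance, assumed here to be 1 - cosine similarity.
  def cosine_distance(u, v)
    a, b = u.to_a, v.to_a
    raise ArgumentError, "size mismatch" unless a.size == b.size

    dot    = a.zip(b).map { |x, y| x * y }.reduce(0.0, :+)
    norm_a = Math.sqrt(a.map { |x| x * x }.reduce(0.0, :+))
    norm_b = Math.sqrt(b.map { |x| x * x }.reduce(0.0, :+))
    1.0 - dot / (norm_a * norm_b)
  end
end

# A Range is an Enumerable too, so it can stand in for an Array.
puts DistanceSketch.euclidean(1..2, [0, 0])
puts DistanceSketch.cosine_distance([1, 2], [2, 3])
```

Converting with `to_a` keeps the sketch short, but it copies the data. For large inputs it may be better to iterate without copying.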


The documentation is hosted on rubydoc.


