Generates histograms similar to R’s hist and numpy’s histogram functions. The interface is relatively stable and a decent test/spec suite is in place, but please consider this alpha software. Inspired by Richard Cottons’s matlab implementation and the wikipedia histogram article.
require 'histogram/array' # enables Array#histogram data = [0,1,2,2,2,2,2,3,3,3,3,3,3,3,3,3,5,5,9,9,10] # by default, uses Freedman-Diaconis method to calculate optimal number of bins # and the bin values are midpoints between the bin edges (bins, freqs) = data.histogram # equivalent to: data.histogram(:fd, :tp => :avg)
# :fd, :sturges, :scott, or :middle (median value between the three methods) data.histogram(:middle) (bins, freqs) = data.histogram(20) # use 20 bins (bins, freqs) = data.histogram([-3,-1,4,5,6]) # custom bins # bins are midpoints, but can be set as minima (bins, freqs) = data.histogram([-3,-1,4,5,6], :tp => :min) # custom bins with :min # can also set the bin_width (which interpolates between the min and max of the set) (bins, freqs) = data.histogram(:bin_width => 0.5)
Sometimes, we want to create histograms where the bins are calculated based on all the data sets. That way, the resulting frequencies will all line up:
# returns [bins, freq1, freq2 ...] (bins, *freqs) = data.histogram(30, :other_sets => [[3,3,4,4,5], [-1,0,0,3,3,6]])
# histogramming with weights (uses the second array for weights) w_heights = [data, [3,3,8,8,9,9,3,3,3,3]] w_heights.histogram(20)
require 'histogram/narray' # enables NArray#histogram # if the calling object is an NArray, the output is two NArrays: NArray.float(20).random!(3).histogram(20) # => [bins, freqs] # are both NArray.float objects
gem install histogram