Statistical Distributions multi library wrapper. Uses Ruby by default and C (statistics2/GSL) or Java extensions where available.
Latest commit 5ced0e2 Jan 23, 2016 @clbustos Merged from vaibhav

Distribution

Distribution is a gem with several probabilistic distributions. Pure Ruby is used by default; C (GSL) or Java extensions are used if available. Some facts:

  • Very fast Ruby 1.9.3+ implementation, with an improved method for calculating factorials and other common functions.
  • All methods tested on several ranges. See spec/.
  • Code for normal, Student's t and chi square is lifted from the statistics2 gem. Originally at this site.
  • The code for some functions and RNGs was lifted from Julia's Rmath-julia, a patched version of R's standalone math library.

The following table lists the available distributions and the methods available for each one. An x in a cell means that the distribution does not implement that method.

Distribution PDF CDF Quantile RNG Mean Mode Variance Skewness Kurtosis Entropy
Uniform x x x x x x x x x x
Normal x x x x x x x x x x
Lognormal x x x x x x x x
Bivariate Normal x x x x x x x x
Exponential x x x x x x x x
Logistic x x x x x x x x
t-Student x x x x x x x x
Chi Square x x x x x x x x
Fisher-Snedecor x x x x x x x x
Beta x x x x x x x x
Gamma x x x x x x x x
Weibull x x x x x x x x
Binomial x x x x x x x x
Poisson x x x x x x x x
Hypergeometric x x x x x x x x

Installation

$ gem install distribution

You can install GSL for better performance:

  • For Mac OS X: brew install gsl
  • For Ubuntu / Debian: sudo apt-get install libgsl-dev (libgsl0-dev on older releases)

After successfully installing the library:

$ gem install rb-gsl
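
The "use the extension if available, otherwise fall back to Ruby" idea can be sketched as below. This is only an illustration of the pattern, not the gem's own detection code; the helper name library_available? and the :gsl / :ruby symbols are assumptions made for this example.

```ruby
# Hypothetical helper: returns true when the named library can be loaded.
def library_available?(name)
  require name
  true
rescue LoadError
  false
end

# Pick an engine label depending on whether the GSL bindings load.
# The symbols are made up for this sketch.
engine = library_available?("gsl") ? :gsl : :ruby
puts "Using the #{engine} engine"
```

On a machine without the GSL bindings this prints the Ruby fallback; nothing else in the script needs to change.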

Examples

You can find automatically generated documentation on RubyDoc.

# Returns Gaussian PDF for x.
pdf = Distribution::Normal.pdf(x)

# Returns Gaussian CDF for x.
cdf = Distribution::Normal.cdf(x)

# Returns the inverse CDF (called p_value in this gem) for a cumulative probability x.
pv = Distribution::Normal.p_value(x)

# API.

# You would normally use the following (k is the degrees of freedom)
p = Distribution::T.cdf(x, k)

# to get the cumulative probability of `x`. However, you can also:

include Distribution::Shorthand
tdist_cdf(x, k)
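
To make the relationship between pdf, cdf and p_value concrete, here is a standalone sketch for the standard normal. It does not use the gem and is not how the gem computes these values: the module name NormalSketch is made up, the cdf relies on Ruby's built-in Math.erf, and the inverse is found by plain bisection.

```ruby
# Illustrative only: a standalone sketch, not the gem's implementation.
module NormalSketch
  SQRT2   = Math.sqrt(2.0)
  SQRT2PI = Math.sqrt(2.0 * Math::PI)

  module_function

  # Density of the standard normal at x.
  def pdf(x)
    Math.exp(-0.5 * x * x) / SQRT2PI
  end

  # P(X <= x) for a standard normal, via the error function.
  def cdf(x)
    0.5 * (1.0 + Math.erf(x / SQRT2))
  end

  # Inverse of cdf: the x with cdf(x) == prob, located by bisection.
  def p_value(prob)
    raise ArgumentError, "prob must be in (0, 1)" unless prob > 0.0 && prob < 1.0
    lo = -10.0
    hi = 10.0
    100.times do
      mid = (lo + hi) / 2.0
      if cdf(mid) < prob
        lo = mid
      else
        hi = mid
      end
    end
    (lo + hi) / 2.0
  end
end

prob = NormalSketch.cdf(1.96)        # cumulative probability at 1.96
back = NormalSketch.p_value(prob)    # maps the probability back to x
puts format("cdf(1.96) = %.4f, p_value(cdf(1.96)) = %.4f", prob, back)
```

The point of the round trip is that p_value takes a probability and returns a point on the x axis, while cdf goes the other way.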

API Structure

Distribution::<name>.(cdf|pdf|p_value|rng)

For discrete distributions, exact Ruby implementations of pdf, cdf and p_value may also be provided, using

  Distribution::<name>.exact_(cdf|pdf|p_value)
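
As a rough idea of what "exact" can mean for a discrete distribution, the sketch below computes binomial probabilities with integer and Rational arithmetic, so no floating-point rounding occurs. The module ExactBinomialSketch and its method signatures are assumptions for illustration; the gem's own exact_* methods may differ in arguments and return types.

```ruby
# Hypothetical module; names and signatures are not taken from the gem.
module ExactBinomialSketch
  module_function

  # Exact binomial coefficient C(n, k) using integer arithmetic only.
  def choose(n, k)
    return 0 if k < 0 || k > n
    k = [k, n - k].min
    # After step i the accumulator equals C(n - k + i, i), always an integer.
    (1..k).reduce(1) { |acc, i| acc * (n - k + i) / i }
  end

  # Exact P(X = k) for X ~ Binomial(n, pr); a Rational when pr is a Rational.
  def exact_pdf(k, n, pr)
    choose(n, k) * pr**k * (1 - pr)**(n - k)
  end

  # Exact P(X <= k): the sum of the point probabilities 0..k.
  def exact_cdf(k, n, pr)
    (0..k).sum { |i| exact_pdf(i, n, pr) }
  end
end

half = Rational(1, 2)
p ExactBinomialSketch.exact_pdf(2, 4, half)   # prints (3/8)
p ExactBinomialSketch.exact_cdf(2, 4, half)   # prints (11/16)
```

Passing a Float for pr still works, but the result is then an ordinary floating-point approximation.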

The Distribution::Shorthand module provides (you guessed it) shorthand methods for calling all of these:

  <Distribution shortname>_(cdf|pdf|p|r)

For discrete distributions, the shorthands for the exact cdf, pdf and p_value are

  <Distribution shortname>_(ecdf|epdf|ep)

Shortnames for distributions:

  • Normal: norm
  • Bivariate Normal: bnor
  • T: tdist
  • F: fdist
  • Chi Square: chisq
  • Binomial: bino
  • Hypergeometric: hypg
  • Exponential: expo
  • Poisson: pois
  • Beta: beta
  • Gamma: gamma
  • LogNormal: lognormal
  • Uniform: unif
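
The naming scheme can be pictured as a table from short names to modules, with forwarding methods generated from it. The sketch below shows one way such a module could be built; it is not the gem's source. SketchDist and its two stand-in distributions are hypothetical, and only the _cdf suffix is generated.

```ruby
# Hypothetical stand-ins for two distribution modules; only cdf is sketched.
module SketchDist
  module Normal
    def self.cdf(x)
      0.5 * (1.0 + Math.erf(x / Math.sqrt(2.0)))
    end
  end

  module Exponential
    # l is the rate parameter.
    def self.cdf(x, l)
      x < 0 ? 0.0 : 1.0 - Math.exp(-l * x)
    end
  end

  # Short name => module, mirroring the list of short names above.
  SHORTNAMES = { "norm" => Normal, "expo" => Exponential }.freeze

  # Defines <shortname>_cdf methods that forward to <Module>.cdf.
  module Shorthand
    SHORTNAMES.each do |short, mod|
      define_method("#{short}_cdf") { |*args| mod.cdf(*args) }
    end
  end
end

# Extend a single object here rather than including at the top level,
# to keep the sketch from adding methods to every object.
calc = Object.new.extend(SketchDist::Shorthand)
puts calc.norm_cdf(0.0)        # forwards to SketchDist::Normal.cdf(0.0)
puts calc.expo_cdf(1.0, 2.0)   # forwards to SketchDist::Exponential.cdf(1.0, 2.0)
```

Generating the methods from one table keeps the short names and the full module names from drifting apart.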

Roadmap

This gem wasn't updated for a long time before I started working on it, so there is a lot of work to do. The first priority is cleaning up the interface and removing cruft wherever possible. After that, I want to implement more distributions and make sure that each one has an RNG.

Short-term

  • Define a minimal interface for continuous and discrete distributions (e.g. mean, variance, mode, skewness, kurtosis, pdf, cdf, quantile, cquantile).
  • Implement Distribution::Uniform with the default Ruby Random.
  • Clean up the implementation of normal distribution. Implement the necessary functions.
  • The same for Student's t, chi square, Fisher-Snedecor, beta, gamma, lognormal, logistic.
  • The same for discrete distributions: binomial, hypergeometric, bernoulli (still missing), etc.
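
To give a feel for what a minimal interface might look like, here is a sketch of a continuous uniform distribution backed by Ruby's default Random. It is one possible shape for the roadmap items above, not code from the gem: the class name UniformSketch, the constructor arguments and the method names are all assumptions.

```ruby
# Hypothetical class; none of these names are taken from the gem.
class UniformSketch
  def initialize(a = 0.0, b = 1.0, random = Random.new)
    raise ArgumentError, "need a < b" unless a < b
    @a = a.to_f
    @b = b.to_f
    @random = random
  end

  def mean
    (@a + @b) / 2.0
  end

  def variance
    (@b - @a)**2 / 12.0
  end

  # Constant density on [a, b], zero elsewhere.
  def pdf(x)
    x >= @a && x <= @b ? 1.0 / (@b - @a) : 0.0
  end

  def cdf(x)
    return 0.0 if x < @a
    return 1.0 if x > @b
    (x - @a) / (@b - @a)
  end

  # Inverse of cdf for a probability in [0, 1].
  def quantile(prob)
    raise ArgumentError, "prob must be in [0, 1]" unless prob >= 0.0 && prob <= 1.0
    @a + prob * (@b - @a)
  end

  # Draws one value in [a, b) using Ruby's Random.
  def rng
    @a + @random.rand * (@b - @a)
  end
end

u = UniformSketch.new(2.0, 6.0, Random.new(42))
puts u.mean            # 4.0
puts u.cdf(3.0)        # 0.25
puts u.quantile(0.25)  # 3.0
```

Passing the Random instance in makes the generator seedable, which helps when writing specs for the RNG.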

Medium-term

  • Implement DSFMT for the uniform random generator.
  • Cauchy distribution.

Long-term

  • Implement everything in the distributions × functions table above.

Issues

  • On JRuby and Rubinius, BivariateNormal returns an incorrect pdf.

For current issues see the issue tracker pages.

OMG! I want to help!

Everyone is welcome to help! Please test these distributions with your own use cases and give a shout on the issue tracker if you find a problem or if something is strange or hard to use. Documentation pull requests are totally welcome. More generally, any ideas or suggestions are welcome, even by private e-mail.

If you want to provide a new distribution, run lib/distribution:

$ distribution --new your_distribution

This should create the main distribution file, the directory with the Ruby and GSL engines, and the specs in the spec/ directory.