# [StatsLib](https://github.com/kthohr/stats)

StatsLib is a templated C++ library of statistical distribution functions. Its functions can be evaluated at compile time as well as at run time, and it integrates with several popular linear algebra libraries.

Features:
* A header-only library of probability density functions, cumulative distribution functions, quantile functions, and random sampling methods.
* Functions are written in a specialized C++11 `constexpr` format, enabling the library to operate as both a compile-time and run-time computation engine.
* Designed with a simple **R**-like syntax.
* Optional vector-matrix functionality with wrappers to support:
    * [Armadillo](http://arma.sourceforge.net/)
    * [Blaze](https://bitbucket.org/blaze-lib/blaze)
    * [Eigen](http://eigen.tuxfamily.org/index.php)
* Matrix-based operations are parallelizable with OpenMP.
* Released under a permissive, non-GPL license.

Author: [Keith O'Hara](https://www.kthohr.com)

License: Apache Version 2

## Distributions

Functions to compute the pdf, cdf, and quantile, along with random sampling methods, are available for the following distributions:

* Bernoulli
* Beta
* Binomial
* Cauchy
* Chi-squared
* Exponential
* F
* Gamma
* Inverse-Gamma
* Laplace
* Logistic
* Log-Normal
* Normal (Gaussian)
* Poisson
* Student's t
* Uniform
* Weibull

In addition, pdf and random sampling functions are available for several multivariate distributions:

* Inverse-Wishart
* Multivariate Normal
* Wishart

## This Notebook

To run a code cell, press `Shift + Enter`.


In [1]:
// include libraries
#include <iostream>              // for printing
#include "../include/stats.hpp"

## Syntax and Examples

Functions are called using an **R**-like syntax. Some general rules:

* density functions: `stats::d*`. For example, the Normal (Gaussian) density is called using
``` cpp
stats::dnorm(<value>,<mean parameter>,<standard deviation>);
```
* cumulative distribution functions: `stats::p*`. For example, the Gamma CDF is called using
``` cpp
stats::pgamma(<value>,<shape parameter>,<scale parameter>);
```
* quantile functions: `stats::q*`. For example, the Beta quantile is called using
``` cpp
stats::qbeta(<probability>,<a parameter>,<b parameter>);
```
* random sampling: `stats::r*`. For example, to generate a single draw from the Logistic distribution:
``` cpp
stats::rlogis(<location parameter>,<scale parameter>,<seed value or random number engine>);
```

Some examples:

In [2]:
// evaluate the normal PDF at x = 1, mu = 0, sigma = 1
double dval_1 = stats::dnorm(1.0,0.0,1.0);
 
// evaluate the normal PDF at x = 1, mu = 0, sigma = 1, and return the log value
double dval_2 = stats::dnorm(1.0,0.0,1.0,true);
 
// evaluate the normal CDF at x = 1, mu = 0, sigma = 1
double pval = stats::pnorm(1.0,0.0,1.0);
 
// evaluate the Laplace quantile at p = 0.1, mu = 0, sigma = 1
double qval = stats::qlaplace(0.1,0.0,1.0);

// draw from a t-distribution with dof = 30
double rval = stats::rt(30);

In [3]:
std::cout << dval_1 << std::endl;
std::cout << dval_2 << std::endl;
std::cout << pval << std::endl;
std::cout << qval << std::endl;
std::cout << rval << std::endl;

0.241971
-1.41894
0.841345
-1.60944
0.20657


### Seeding

Random number seeding is available in two formats: seed values and random number engines.

* Seed values are passed as unsigned integers. For example, to generate a draw from a normal distribution with mean 1 and standard deviation 2, using seed value 1776:
``` cpp
stats::rnorm(1,2,1776);
```
* Random engines in StatsLib use the 64-bit Mersenne-Twister generator (`std::mt19937_64`) and are passed by reference. Example:
``` cpp
std::mt19937_64 engine(1776);
stats::rnorm(1,2,engine);
```

In [4]:
stats::ullint_t seed_val = 1776UL;

double ran_val_1 = stats::rnorm(1,2,seed_val);

std::mt19937_64 engine(seed_val);
double ran_val_2 = stats::rnorm(1,2,engine);

std::cout << "random draws: " << ran_val_1 << ", " << ran_val_2 << std::endl;

random draws: 3.17135, 3.17135


## Compile-time Computation Capabilities

StatsLib is designed to work equally well at compile time and at run time. Compile-time computation allows the compiler to replace function calls (e.g., `dnorm(0,0,1)`) with static values in the generated code. That is, functions are evaluated during compilation rather than at run time. This capability is made possible by the templated `constexpr` design of the library and can be verified by inspecting the assembly code generated by the compiler.

Compile-time evaluation is requested by declaring the result with the `constexpr` specifier. The example below evaluates the pdf, cdf, and quantile function of the Laplace distribution:

In [5]:
constexpr double cdens  = stats::dlaplace(1.0,1.0,2.0); // answer = 0.25
constexpr double cprob  = stats::plaplace(1.0,1.0,2.0); // answer = 0.5
constexpr double cquant = stats::qlaplace(0.1,1.0,2.0); // answer = -2.218875...

In [6]:
std::cout << "result: "<< cdens << ", " << cprob << ", " << cquant << std::endl;

result: 0.25, 0.5, -2.21888
