# Probability Distributions.

Dissatisfied with the limitations of the sympy stats module, I needed to create my own for another project. 
The `distributions.py` module defines a large number of statistical distributions as `symbol`s, and lets 
you work out the expected value of integer powers of them.

To get started, `import` the `distributions.py` file. You'll need to have `sympy` installed.

In [1]:
import sympy as sp
import distributions as d

# rename the expected value and probability functions:
E = d.E
pr = d.pr

# NB: d.log applies sympy.log with forced expansion

In [2]:
# code to display a tuple of sympy expr as math
import IPython.display as display

def showmath(*args):
    return display.Math(',\\;'.join([sp.latex(a) for a in args]))

## Gaussian distribution.

To create a Gaussian distribution object, call `d.Gaussian`. This takes three parameters:

* the name of the symbol. Typically you will want to use 'x', as this is common when referring to a variable taken from a Gaussian/normal distribution.
* the mean. This defaults to a symbol `mu`.
* the standard deviation. This defaults to a symbol `sigma`

The mean and standard deviation are stored as properties `mu` and `sigma` of the Gaussian symbol.

[wikipedia](https://en.wikipedia.org/wiki/Normal_distribution) 

In [3]:
# create a gaussian r.v. with mean mu, standard deviation sigma:
z = d.Gaussian('z')

In [4]:
# the first & second moments and the variance
showmath(E(z), E(z**2), E(z**2)-E(z)**2)

<IPython.core.display.Math object>

In [5]:
# the Fisher information with respect to the mean
E(sp.diff(d.log(pr(z)),z.mu)**2)

sigma**(-2)
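
As a cross-check that does not depend on `distributions.py`, the moments and the Fisher information above can be approximated numerically with the standard library alone. This is only a minimal sketch: the helper names `gaussian_pdf` and `integrate`, the trapezoid rule, and the numeric values chosen for `mu` and `sigma` are illustrative assumptions, not part of the module.

```python
import math

def gaussian_pdf(z, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-((z - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, lo, hi, steps=20_000):
    """Composite trapezoid rule on [lo, hi] (hypothetical helper, for illustration)."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

mu, sigma = 1.5, 0.7                        # arbitrary illustrative values
lo, hi = mu - 12 * sigma, mu + 12 * sigma   # the tails beyond this range are negligible

first_moment = integrate(lambda z: z * gaussian_pdf(z, mu, sigma), lo, hi)
second_moment = integrate(lambda z: z ** 2 * gaussian_pdf(z, mu, sigma), lo, hi)

# Score with respect to the mean: d/dmu log pr(z) = (z - mu) / sigma**2
fisher_mu = integrate(
    lambda z: ((z - mu) / sigma ** 2) ** 2 * gaussian_pdf(z, mu, sigma), lo, hi
)
```

Under these assumptions `first_moment` should agree with `mu`, `second_moment` with `mu**2 + sigma**2`, and `fisher_mu` with `sigma**(-2)`, up to quadrature error.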

## Binomial distribution.

To create a binomial distribution, call `d.Binomial`. This takes three parameters:

* the name of the symbol. The examples below use 'y'.
* the probability of success. This defaults to a symbol `p`.
* the number of attempts. This defaults to a symbol `n`.

The probability and count are stored as properties `p` and `n` of the Binomial symbol.

[wikipedia](https://en.wikipedia.org/wiki/Binomial_distribution)

In [6]:
y = d.Binomial('y')

In [7]:
# to make variance easy, define a Var function:
def Var(x):
    return sp.factor(E(x**2)-E(x)**2)

In [8]:
showmath(E(y),E(y**2),Var(y))

<IPython.core.display.Math object>

In [12]:
# the Fisher information with respect to the probability p
E(sp.factor(sp.diff(d.log(pr(y)),y.p))**2)

-n/(p*(p - 1))
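
The result above can be checked exactly for one concrete case by summing over the binomial probability mass function with rational arithmetic. This is a sketch using only the standard library; `binomial_pmf`, `expect`, and the values of `n` and `p` are illustrative assumptions and do not come from `distributions.py`.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """pr(y = k) for n attempts with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def expect(f, n, p):
    """Exact expectation of f(y) as a finite sum over k = 0..n."""
    return sum(f(k) * binomial_pmf(k, n, p) for k in range(n + 1))

n, p = 10, Fraction(3, 10)   # arbitrary illustrative values

mean = expect(lambda k: k, n, p)
variance = expect(lambda k: k ** 2, n, p) - mean ** 2

# Score with respect to p: d/dp log pr(k) = k/p - (n - k)/(1 - p)
fisher_p = expect(lambda k: (k / p - (n - k) / (1 - p)) ** 2, n, p)
```

Because `p` is a `Fraction`, every sum is exact, so `fisher_p` can be compared for equality with `-n/(p*(p - 1))` rather than within a tolerance.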

## Poisson distribution.

To create a Poisson distribution, call `d.Poisson`. This takes two parameters:

* the name of the symbol. 
* the rate. This defaults to a symbol `lambda`.

The rate symbol is stored as a property `_lambda` of the Poisson distribution symbol.

[wikipedia](https://en.wikipedia.org/wiki/Poisson_distribution)

In [13]:
y = d.Poisson('y')

In [14]:
showmath(E(y),E(y**2),Var(y))

<IPython.core.display.Math object>

In [16]:
# the Fisher information with respect to the rate
E(sp.factor(sp.diff(d.log(pr(y)),y._lambda))**2)

1/lambda
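
A quick numerical sanity check of the Poisson results, again without `distributions.py`: the sketch below truncates the infinite sum at a cutoff `K`. The helper names, the rate `lam`, and the cutoff are assumptions made for illustration.

```python
import math

def poisson_pmf(k, lam):
    """pr(y = k) for a Poisson distribution with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def expect(f, lam, K=100):
    """Expectation of f(y), truncated to k < K (the tail is negligible for small lam)."""
    return sum(f(k) * poisson_pmf(k, lam) for k in range(K))

lam = 2.5   # arbitrary illustrative rate

mean = expect(lambda k: k, lam)
variance = expect(lambda k: k ** 2, lam) - mean ** 2

# Score with respect to the rate: d/dlam log pr(k) = k/lam - 1
fisher_lam = expect(lambda k: (k / lam - 1) ** 2, lam)
```

With these values both `mean` and `variance` should be close to `lam`, and `fisher_lam` close to `1/lam`.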

## Negative Binomial distribution.

To create a negative binomial distribution, call `d.NegativeBinomial`. This takes three parameters:

* the name of the symbol. 
* the probability. This defaults to a symbol `p`.
* the count. This defaults to the symbol 'k'

The probability and count symbols are stored as properties `p` and `k` of the distribution symbol.

[wikipedia](https://en.wikipedia.org/wiki/Negative_binomial_distribution)

In [17]:
x = d.NegativeBinomial('x')

In [18]:
showmath(E(x), E(x**2), Var(x))

<IPython.core.display.Math object>

In [19]:
# the Fisher information with respect to the probability
E(sp.factor(sp.diff(d.log(pr(x)),x.p))**2)

r/(p*(p - 1)**2)
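
The negative binomial has several common parameterisations, so it helps to pin down which one reproduces the output above. The sketch below assumes pr(x = k) = C(k + r - 1, k) (1 - p)^r p^k, which is an inference from the printed result and not something confirmed by the module's source. The count is written `r` to match that output, and the helper name, parameter values, and cutoff `K` are illustrative.

```python
import math

def negbin_pmf(k, r, p):
    """Assumed parameterisation: pr(x = k) = C(k + r - 1, k) * (1 - p)**r * p**k."""
    return math.comb(k + r - 1, k) * (1 - p) ** r * p ** k

r, p = 4, 0.3   # arbitrary illustrative values
K = 400         # truncation point; terms decay like p**k, so the tail is negligible

total = sum(negbin_pmf(k, r, p) for k in range(K))

# Score with respect to p under the assumed pmf: d/dp log pr(k) = k/p - r/(1 - p)
fisher_p = sum((k / p - r / (1 - p)) ** 2 * negbin_pmf(k, r, p) for k in range(K))
```

Under this assumption `total` should be close to 1 and `fisher_p` close to `r/(p*(p - 1)**2)`. The alternative convention with `p**r * (1 - p)**k` would give `r/(p**2*(1 - p))` instead.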

## Geometric distribution.

