### Probability Distributions

Utility functions for probability distributions are available in the `stats` submodule of SciPy. About 100 continuous and 13 discrete probability distributions are implemented; the exact counts depend on the SciPy version.

#### Importing the stats submodule

In [1]:
from scipy import stats
import scipy as sp

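To see the list of implemented probability distributions, issue the command  
`sp.info(stats)`  
Newer SciPy releases may no longer expose `info` at the top level; in that case the built-in `help(stats)` prints similar documentation.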

In [2]:
sp.info(stats)

Statistical functions (:mod:`scipy.stats`)

.. module:: scipy.stats

This module contains a large number of probability distributions as
well as a growing library of statistical functions.

Each univariate distribution is an instance of a subclass of `rv_continuous`
(`rv_discrete` for discrete distributions):

.. autosummary::
   :toctree: generated/

   rv_continuous
   rv_discrete
   rv_histogram

Continuous distributions

.. autosummary::
   :toctree: generated/

   alpha             -- Alpha
   anglit            -- Anglit
   arcsine           -- Arcsine
   argus             -- Argus
   beta              -- Beta
   betaprime         -- Beta Prime
   bradford          -- Bradford
   burr              -- Burr (Type III)
   burr12            -- Burr (Type XII)
   cauchy            -- Cauchy
   chi               -- Chi
   chi2              -- Chi-squared
   cosine            -- Cosine
   crystalball       -- Crystalball
   dgamma            -- Double Gamma
   dweibull          -- Double

All continuous distributions are instances of subclasses of the `stats.rv_continuous` class, and all discrete distributions are instances of subclasses of the `stats.rv_discrete` class.
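The class relationship can also be used to list the distributions programmatically. The following is a minimal sketch, assuming that every distribution object is bound to a name in the `scipy.stats` namespace; the variable names `continuous` and `discrete` are illustrative, and aliases (if any exist in the installed version) are counted as separate names.

```python
from scipy import stats

# Collect the names bound to distribution objects in scipy.stats.
# Aliases, if present in a given SciPy version, are counted too.
continuous = sorted(name for name, obj in vars(stats).items()
                    if isinstance(obj, stats.rv_continuous))
discrete = sorted(name for name, obj in vars(stats).items()
                  if isinstance(obj, stats.rv_discrete))

print(len(continuous), "continuous distributions, e.g.", continuous[:5])
print(len(discrete), "discrete distributions, e.g.", discrete[:5])

# Individual objects can be checked the same way.
print(isinstance(stats.norm, stats.rv_continuous))   # True
print(isinstance(stats.poisson, stats.rv_discrete))  # True
```

The printed counts will vary between SciPy versions, so they are not shown here.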

The main public methods for continuous RVs are:

- `rvs`: Random Variates  
- `pdf`: Probability Density Function  
- `cdf`: Cumulative Distribution Function  
- `sf`: Survival Function (1-CDF)  
- `ppf`: Percent Point Function (Inverse of CDF)  
- `isf`: Inverse Survival Function (Inverse of SF)  
- `stats`: Return mean, variance, (Fisher’s) skew, or (Fisher’s) kurtosis  
- `moment`: Non-central moments of the distribution  
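As a quick orientation, the sketch below calls the methods from this list that the later examples do not cover (`pdf`, `sf`, `isf`, `stats`, `moment`) on the standard normal distribution. The evaluation point `x = 1.0` is an arbitrary choice for illustration.

```python
from scipy import stats

x = 1.0  # arbitrary evaluation point

print(stats.norm.pdf(x))                 # density of N(0, 1) at x
print(stats.norm.sf(x))                  # upper-tail probability P(Z > x)
print(stats.norm.isf(stats.norm.sf(x)))  # isf undoes sf, recovering x

# stats() returns the moments requested through `moments`:
# m = mean, v = variance, s = skewness, k = excess kurtosis
mean, var, skew, kurt = stats.norm.stats(moments="mvsk")
print(mean, var, skew, kurt)

print(stats.norm.moment(2))              # second non-central moment, E[Z^2]
```

For the standard normal distribution the mean is 0 and the variance is 1, so the second non-central moment is also 1.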

For example, consider the normal distribution, which is represented by the object `norm`.

#### Generating random variables

In [2]:
stats.norm.rvs(size=10) # Generate 10 random variates from N(0, 1)

array([ 0.43085926,  0.67254952, -0.95839281,  0.35245828,  2.26446228,
       -0.41961991, -0.51400714, -0.71632327, -0.7152411 , -0.99744972])

A seed for the random number generator can be specified through the `random_state` parameter, which makes the output reproducible.

In [3]:
stats.norm.rvs(size=5, random_state=5437)

array([ 0.73734273,  0.47851971,  0.4444418 , -1.78402926, -0.67984083])
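To illustrate what the seed buys, the short sketch below draws two samples with the same integer `random_state` and checks that they are identical. The names `a` and `b` are illustrative only.

```python
import numpy as np
from scipy import stats

# Two calls with the same integer seed produce the same variates.
a = stats.norm.rvs(size=5, random_state=5437)
b = stats.norm.rvs(size=5, random_state=5437)

print(np.array_equal(a, b))  # True
```

Without `random_state`, repeated calls draw from the global NumPy random state and will generally differ from run to run.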

#### Computing normal probabilities

In [4]:
stats.norm.cdf(0) # Prob(Z < 0)

0.5

In [5]:
stats.norm.cdf(1) - stats.norm.cdf(-1) # Prob(-1 < Z < 1)

0.6826894921370859
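The same difference-of-CDFs pattern extends to any symmetric interval, and upper-tail probabilities are available directly through `sf`. The sketch below is one way to tabulate a few of them; the dictionary `probs` and the cut-off 1.96 are illustrative choices.

```python
from scipy import stats

# P(-k < Z < k) for k = 1, 2, 3, computed as a difference of CDF values.
probs = {}
for k in (1, 2, 3):
    probs[k] = stats.norm.cdf(k) - stats.norm.cdf(-k)
    print(f"P(-{k} < Z < {k}) = {probs[k]:.4f}")

# Upper-tail probability P(Z > 1.96) via the survival function.
upper = stats.norm.sf(1.96)
print(f"P(Z > 1.96) = {upper:.4f}")
```

Using `sf(x)` rather than `1 - cdf(x)` is generally preferable far out in the tail, where the subtraction loses precision.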

#### Computing normal percentage points

In [6]:
stats.norm.ppf(0.975) # 97.5% point for N(0,1)

1.959963984540054

In [7]:
stats.norm.ppf([0.9, 0.95, 0.975]) # The basic methods are vectorized

array([1.28155157, 1.64485363, 1.95996398])
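Because `ppf` is the inverse of `cdf`, applying `cdf` to the percentage points should return the original probabilities. The sketch below checks this and also computes a two-sided critical value; the significance level `alpha = 0.05` is an assumed example value.

```python
import numpy as np
from scipy import stats

q = np.array([0.9, 0.95, 0.975])
z = stats.norm.ppf(q)

# cdf(ppf(q)) recovers q up to floating-point error.
print(np.allclose(stats.norm.cdf(z), q))  # True

# Two-sided critical value for significance level alpha.
alpha = 0.05
z_crit = stats.norm.ppf(1 - alpha / 2)
print(z_crit)
```

With `alpha = 0.05` the critical value is the 97.5% point computed above.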

#### Specifying location and scale parameters

All continuous distributions accept a location parameter (`loc`) and a scale parameter (`scale`).  
For the $N(\mu, \sigma^2)$ distribution, $\mu$ is the location parameter and $\sigma$ (the standard deviation, not the variance) is the scale parameter.

In [8]:
stats.norm.rvs(loc=5, scale=0.5, size=10) # Generate 10 random variates from N(5, 0.5^2)

array([4.83578617, 5.19305524, 4.71003807, 5.72902836, 4.08977656,
       5.30340984, 4.44239776, 4.89827784, 4.99213072, 4.16235073])

In [9]:
stats.norm.ppf(0.95, loc=10, scale=1) # 95% point for N(10, 1^2)

11.644853626951472
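The `loc` and `scale` arguments amount to shifting and rescaling the standard distribution: if $Z \sim N(0, 1)$, then $X = \mu + \sigma Z \sim N(\mu, \sigma^2)$. The sketch below verifies this for `ppf` and `cdf`; the values of `mu`, `sigma`, `q`, and `x` are arbitrary illustrations.

```python
import numpy as np
from scipy import stats

mu, sigma = 10.0, 2.0  # illustrative location and scale
q, x = 0.95, 12.0      # illustrative probability and evaluation point

# Percentage point: pass loc/scale, or transform the standard value by hand.
ppf_direct = stats.norm.ppf(q, loc=mu, scale=sigma)
ppf_manual = mu + sigma * stats.norm.ppf(q)

# Probability: pass loc/scale, or standardize x first.
cdf_direct = stats.norm.cdf(x, loc=mu, scale=sigma)
cdf_manual = stats.norm.cdf((x - mu) / sigma)

print(np.isclose(ppf_direct, ppf_manual))  # True
print(np.isclose(cdf_direct, cdf_manual))  # True
```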

#### Frozen distribution

When a probability distribution with specified parameters is used frequently, passing the same `loc` and `scale` parameters every time becomes tedious. _Freezing_ a distribution solves this: calling the distribution object with the parameters returns a new object in which they are fixed.

For example, the normal distribution is frozen with `loc = 15` and `scale = 2` as shown below.

In [10]:
norm15_2 = stats.norm(loc=15, scale=2)

Now, `norm15_2` is the frozen normal distribution. Hereafter, we can use `norm15_2` in the same manner as `norm`.

In [11]:
print('Mean = ', norm15_2.mean())
print('SD =', norm15_2.std())

Mean =  15.0
SD = 2.0


In [12]:
norm15_2.rvs(size=10) # Generate random variates from N(15, 2^2)

array([15.87450787, 12.40909838, 15.43189631, 17.35883211, 13.17013766,
       13.54418382, 16.13681298, 18.49590787, 14.26931385, 16.28271067])
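The other methods work on a frozen distribution in the same way, without repeating `loc` and `scale`. The sketch below recreates `norm15_2` so that it runs on its own and calls `cdf`, `ppf`, and `interval`; the probabilities 0.975 and 0.95 are illustrative choices.

```python
import numpy as np
from scipy import stats

norm15_2 = stats.norm(loc=15, scale=2)  # frozen N(15, 2^2)

print(norm15_2.cdf(15))      # P(X < 15); 15 is the median, so this is 0.5
print(norm15_2.ppf(0.975))   # 97.5% point of N(15, 2^2)

# Central interval containing 95% of the probability mass.
lower, upper = norm15_2.interval(0.95)
print(lower, upper)
```

The interval is symmetric about the mean, and its upper end coincides with the 97.5% point.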