# Synopsis

This notebook reviews some of the most frequently used discrete random variables.

* uniform
* binomial
* Poisson
* geometric
* negative binomial


# Read libraries

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

from colorama import Back, Fore, Style
from copy import copy, deepcopy
from pathlib import Path
from sys import path

path.append( str(Path.cwd().parent) )


In [None]:
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats

from Amaral_libraries.my_stats import half_frame


In [None]:
my_fontsize = 15

# Discrete random variables

In this notebook, I give an overview of some commonly encountered distributions of discrete random variables.  This overview has two goals.  The first and most important goal is to provide a vocabulary with which to describe or model random processes.  Just as you need to memorize words in order to communicate effectively in a new language, you will find it useful to refer to these distributions when describing the properties of a random process.

The second goal is to show, in some cases, what type of random process may give rise to a specified distribution.


## Uniform distribution 

The default assumption in the absence of additional information is that all possible outcomes of a random variable $X$ are equiprobable. In the case of discrete outcomes $-$ rolling a die, tossing a coin, drawing a raffle ticket $-$ this means that the probability of every outcome is equal to the inverse of the number $n$ of possible outcomes.

The probability mass function is:

> $Pr(X=k) = \frac{1}{n}$
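
As a quick check of this formula, the sketch below compares $1/n$ with the values returned by `stats.randint` for a six-sided die. The variable names (`n_sides`, `die`, `pmf_values`) are chosen here only for illustration. Note that the upper bound passed to `stats.randint` is exclusive.

```python
import numpy as np
import scipy.stats as stats

n_sides = 6                            # number of equally likely outcomes
k = np.arange(1, n_sides + 1)          # outcomes 1, ..., n_sides
die = stats.randint(1, n_sides + 1)    # upper bound is exclusive

pmf_values = die.pmf(k)

# Every outcome should have probability 1/n, and the pmf should sum to 1
print("all equal to 1/n:", np.allclose(pmf_values, 1 / n_sides))
print("sum of pmf      :", pmf_values.sum())
```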

In [None]:
n = 6
x = np.arange(1, n + 1)
rv = stats.randint(1, n + 1)

fig = plt.figure( figsize = (6, 4.5) )
fig.text(0, 1, 'Uniform', fontsize = 1.5*my_fontsize)
ax = fig.add_subplot(1,1,1)

half_frame(ax, "k", "", font_size = my_fontsize)

# Calculate and plot histogram
ax.vlines(x-0.2, 0, rv.sf(x), color = "r", linewidth = 5, alpha = 0.5, 
          label = "Survival - Pr(X > k)")
ax.vlines(x, 0, rv.pmf(x), color = "g", linewidth = 5, alpha = 0.5, 
          label = "Probability mass  - Pr(X = k)")
ax.vlines(x+0.2, 0, rv.cdf(x), color = "b", linewidth = 5, alpha = 0.5, 
          label = "Cumulative - Pr(X <= k)")

# Format legend
ax.legend(loc = (1.0, 0.6), frameon = False, markerscale = 1.8, 
          fontsize = my_fontsize)

ax.set_xlim(0.5, n+0.5)
ax.set_ylim(0, 1)

plt.show()

## Binomial distribution 

Consider a Bernoulli process of length $n$.  Translation:  You repeat some random process $n$ times and the probabilities of the different outcomes do not change during the entire process.  Examples include tossing the same coin $n$ times; rolling the same die $n$ times; drawing a single card from each of $n$ well-shuffled decks. 

A fair coin has 50% chance of landing on heads and a 50% chance of landing on tails. The binomial distribution with p = 0.5 and n = 20 specifies the probability of tossing $k$ heads when flipping a coin $20$ times. 

The probability mass function is:

> $Pr(X=k) = C^n_k ~ p^{k} ~ (1-p)^{n-k}$
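
To make the formula concrete, the following sketch evaluates it term by term with `math.comb` and compares the result with `stats.binom.pmf` for the fair-coin example ($n = 20$, $p = 0.5$). The names `n_flips`, `p_heads`, and `pmf_formula` are illustrative choices, not part of the notebook.

```python
from math import comb

import numpy as np
import scipy.stats as stats

n_flips, p_heads = 20, 0.5
k = np.arange(0, n_flips + 1)

# Direct evaluation of C(n, k) * p^k * (1 - p)^(n - k)
pmf_formula = np.array([
    comb(n_flips, int(i)) * p_heads**i * (1 - p_heads)**(n_flips - i)
    for i in k
])
pmf_scipy = stats.binom.pmf(k, n_flips, p_heads)

print("formula matches scipy:", np.allclose(pmf_formula, pmf_scipy))
print(f"Pr(X = 10) = {pmf_formula[10]:.4f}")
```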


In [None]:
p = .5
n = 20
x = np.arange(0, n+1, 1)
rv = stats.binom(n, p)

fig = plt.figure( figsize = (6, 4.5) )
fig.text(0, 1, 'Binomial', fontsize = 1.5*my_fontsize)
ax = fig.add_subplot(1,1,1)

half_frame(ax, "k", "", font_size = my_fontsize)

# Calculate and plot histogram
ax.vlines(x-0.2, 0, rv.sf(x), color = "r", linewidth = 3, alpha = 0.5, 
          label = "Survival")
ax.vlines(x, 0, rv.pmf(x), color = "g", linewidth = 4, alpha = 0.5, 
          label = "Probability mass")
# ax.vlines(x+0.2, 0, rv.cdf(x), color = "b", linewidth = 3, alpha = 0.5, 
#           label = "Cumulative")

# Format legend
ax.legend(loc = (1.0, 0.6), frameon = False, markerscale = 1.8, 
          fontsize = my_fontsize)
ax.set_ylim(0, 1)
ax.set_xlim(0, n + .1)

plt.show()

## Poisson distribution 

A Poisson process is a random process in which the density of events is constant over some space, such as length or time, and events are independent.  The process was discovered independently and repeatedly in several settings, including experiments on radioactive decay, failure of components such as light bulbs, telephone call arrivals, customer arrivals at a store, natural catastrophes, accidents, and so on.

The Poisson distribution describes the number of events that one can expect to observe within some bounded region.  The distribution is characterized by a single parameter, $\lambda$, which can be interpreted as the average number of points per some unit of extent such as length, area, volume, or time, depending on the underlying mathematical space, and it is also called the mean density or mean rate.

The probability mass function is:

> $Pr(X=k) =  \frac{\lambda^k e^{-\lambda}}{k!}$
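
One way to see where this distribution comes from is to scatter events at constant density over a long interval and count how many land in each unit-length window. The sketch below does this with NumPy's `default_rng`. The seed, the names `rate` and `extent`, and the interval length are arbitrary choices made here. Because a fixed total number of events is dropped, the counts are only approximately Poisson, but for a long interval the approximation is very good.

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(seed=0)    # arbitrary seed, for reproducibility

rate = 15          # mean number of events per unit window (lambda)
extent = 10_000    # total length of the interval, in unit windows

# Scatter rate * extent events uniformly, then count events per unit window
events = rng.uniform(0, extent, size=rate * extent)
counts, _ = np.histogram(events, bins=np.arange(extent + 1))

print(f"mean count per window : {counts.mean():.2f}")
print(f"variance of the counts: {counts.var():.2f}")

# Compare the observed frequency of one particular count with the Poisson pmf
k = 15
observed = np.mean(counts == k)
print(f"observed Pr(X = {k}) = {observed:.4f}")
print(f"Poisson  Pr(X = {k}) = {stats.poisson.pmf(k, rate):.4f}")
```

Both the mean and the variance of the counts should come out close to `rate`, which is a signature of the Poisson distribution.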

In [None]:
mu = 15
x = np.arange(stats.poisson.ppf(0.0001, mu), stats.poisson.ppf(0.9999, mu))
rv = stats.poisson(mu)

fig = plt.figure( figsize = (6, 4.5) )
fig.text(0, 1, 'Poisson', fontsize = 1.5*my_fontsize)
ax = fig.add_subplot(1,1,1)

half_frame(ax, "k", "Probability mass", font_size = my_fontsize)

# Calculate and plot histogram
ax.vlines(x, 0, rv.pmf(x), color = "g", linewidth = 5, alpha = 0.5, 
          label = "Poisson")

# Format legend
ax.legend(loc = "best", frameon = False, markerscale = 1.8, 
          fontsize = my_fontsize)
ax.set_ylim(0, .15)

plt.show()

**An interesting property of the binomial distribution is that for $n \gg 1$ and $p \ll 1$, with $\lambda = np$ held fixed, it converges to a Poisson distribution with mean $\lambda$**.
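
The convergence can be checked numerically. The sketch below fixes $\lambda = np$ and lets $n$ grow, so that $p = \lambda / n$ shrinks, and reports the largest pointwise difference between the two probability mass functions. The value of `lam`, the range of `k`, and the list of `n_trials` values are arbitrary choices made for illustration.

```python
import numpy as np
import scipy.stats as stats

lam = 4.0                      # fixed mean, lambda = n * p
k = np.arange(0, 31)           # range of k over which to compare the pmfs
poisson_pmf = stats.poisson.pmf(k, lam)

max_diffs = []
for n_trials in (10, 100, 1000, 10000):
    # Keep n * p = lam, so p gets smaller as n grows
    binom_pmf = stats.binom.pmf(k, n_trials, lam / n_trials)
    max_diff = np.max(np.abs(binom_pmf - poisson_pmf))
    max_diffs.append(max_diff)
    print(f"n = {n_trials:>5d}   max |binomial - Poisson| = {max_diff:.2e}")
```

The difference should shrink steadily as $n$ increases.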



In [None]:
p = 0.1
n = 40
x = np.arange(stats.binom.ppf(0.0001, n, p), stats.binom.ppf(0.9999, n, p))
rv1 = stats.binom(n, p)

mu = n * p
rv2 = stats.poisson(mu)


fig = plt.figure( figsize = (10, 4.5) )
fig.text(0, 1, 'Binomial vs Poisson', fontsize = 1.5*my_fontsize)
ax =  fig.add_subplot(1,1,1) 

half_frame(ax, "k", "Probability mass", font_size = my_fontsize)

# Calculate and plot histogram
ax.vlines(x-0.2, 0, rv1.pmf(x), color = "b", linewidth = 4, alpha = 0.7, 
          label = "Binomial")
ax.vlines(x+0.2, 0, rv2.pmf(x), color = "r", linewidth = 4, alpha = 0.7, 
          label = "Poisson")

ax.set_ylim(0, 0.3)
ax.set_xlim(-0.50, 40)


# Format legend
ax.legend(loc = "best", frameon = False, markerscale = 1.8, 
          fontsize = my_fontsize)


plt.show()

## Geometric distribution 

How long will you have to wait until you roll a 6? Or until you flip heads? 

If you assume that every time you turn on a light bulb, there is a constant probability that it will fail, how many times can you expect to use the bulb before you have to replace it? 

If you have a Bernoulli process and you ask how many trials you need to observe until you get a desired outcome for the first time, then you are looking at a geometric distribution.  This can be seen as a special case of the thinking that yields the binomial distribution.

The probability mass function is:

> $Pr(X=k) = (1-p)^{k-1}p$
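
The die-rolling question can be simulated directly. The sketch below rolls a die repeatedly in many independent experiments, records the roll on which the first 6 appears, and compares the observed frequencies with the formula and with `stats.geom.pmf`. The seed, the number of experiments, and the cap `max_rolls` are assumptions made here to keep the simulation simple. An experiment that never shows a 6 within `max_rolls` rolls is discarded, which is astronomically unlikely for the values used.

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(seed=1)    # arbitrary seed, for reproducibility

p_six = 1 / 6
n_experiments, max_rolls = 10_000, 150

# Each row is one experiment: a sequence of die rolls
rolls = rng.integers(1, 7, size=(n_experiments, max_rolls))
is_six = rolls == 6
has_six = is_six.any(axis=1)

# argmax gives the index of the first True in each row; add 1 to count rolls
waits = is_six[has_six].argmax(axis=1) + 1

for k in (1, 2, 3):
    simulated = np.mean(waits == k)
    formula = (1 - p_six) ** (k - 1) * p_six
    from_scipy = stats.geom.pmf(k, p_six)
    print(f"k = {k}: simulated {simulated:.4f}, "
          f"formula {formula:.4f}, scipy {from_scipy:.4f}")

print(f"mean wait: {waits.mean():.2f} rolls (1/p = {1 / p_six:.2f})")
```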

In [None]:
p = 0.5
x = np.arange(1, stats.geom.ppf(0.999, p))
rv1 = stats.geom(p)


fig = plt.figure( figsize = (6, 4.5) )
fig.text(0, 1, 'Geometric', fontsize = 1.5*my_fontsize)
ax = fig.add_subplot(1,1,1)

half_frame(ax, "k", "", font_size = my_fontsize)

# Calculate and plot histogram
# ax.vlines(x-0.2, 0, rv1.sf(x), color = "r", linewidth = 5, alpha = 0.5, 
#           label = "Survival")
ax.vlines(x, 0, rv1.pmf(x), color = "g", linewidth = 5, alpha = 0.5, 
          label = "Probability mass")
ax.vlines(x+0.2, 0, rv1.cdf(x), color = "b", linewidth = 5, alpha = 0.5, 
          label = "Cumulative")

ax.set_ylim(0, 1)
ax.set_xlim(0, 10)

# Format legend
ax.legend(loc = (1.0, 0.6), frameon = False, markerscale = 1.8, 
          fontsize = my_fontsize)

plt.show()

## Negative binomial distribution 

Imagine that instead of the switch turning on a single bulb, it turns on a very large number of bulbs.  How many times can you expect to be able to turn the lights on until a given number of them fail? 

Returning to the flipping of coins, how many times must you flip a coin until you get a given number of tails?

This process is described by the negative binomial distribution. It is the result of counting the number of events until a specified number of desired ones occur. 

The probability mass function, in the parameterization used by `scipy.stats.nbinom`, gives the probability of observing $k$ other outcomes before the $n$-th desired outcome, where each event yields the desired outcome with probability $p$. The total number of events is then $k + n$:

> $Pr(X=k) = C^{k+n-1}_{n-1} ~ p^n ~ (1-p)^k$
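
In `scipy.stats.nbinom`, the random variable counts the outcomes that are *not* the desired one, observed before the $n$-th desired outcome, with $p$ the probability of the desired outcome. The sketch below evaluates the formula above in that parameterization and compares it with `stats.nbinom.pmf`. The names `n_successes` and `p_success` are illustrative.

```python
from math import comb

import numpy as np
import scipy.stats as stats

n_successes, p_success = 10, 0.5
k = np.arange(0, 40)           # number of failures before the n-th success

# Direct evaluation of C(k + n - 1, n - 1) * p^n * (1 - p)^k
pmf_formula = np.array([
    comb(int(i) + n_successes - 1, n_successes - 1)
    * p_success**n_successes * (1 - p_success)**i
    for i in k
])
pmf_scipy = stats.nbinom.pmf(k, n_successes, p_success)
print("formula matches scipy:", np.allclose(pmf_formula, pmf_scipy))

# The expected total number of trials is expected failures plus successes
mean_failures = stats.nbinom.mean(n_successes, p_success)
print("expected number of trials:", mean_failures + n_successes)
```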

In [None]:
p = 0.5
n = 10
x = np.arange(stats.nbinom.ppf(0.0001, n, p), stats.nbinom.ppf(0.9999, n, p))
rv1 = stats.nbinom(n, p)

# print(f"The type of the variable rv1 is {type(rv1)}\n")
# print(f"The type of the variable rv1.pmf is {type(rv1.pmf(x))}\n")

fig = plt.figure( figsize = (8, 4.5) )
fig.text(0, 1, 'Negative binomial', fontsize = 1.5*my_fontsize)
ax = fig.add_subplot(1,1,1)

half_frame(ax, "k", "", 
           font_size = my_fontsize)

# Calculate and plot histogram
ax.vlines(x-0.2, 0, rv1.sf(x), color = "r", linewidth = 2, alpha = 0.5, 
          label = "Survival")
ax.vlines(x, 0, rv1.pmf(x), color = "g", linewidth = 4, alpha = 0.5, 
          label = "Probability mass")
# ax.vlines(x+0.2, 0, rv1.cdf(x), color = "b", linewidth = 2, alpha = 0.5, 
#           label = "Cumulative")

ax.set_xlim(0, 30)

# Format legend
ax.legend(loc = (1.0, 0.6), frameon = False, markerscale = 1.8, 
          fontsize = my_fontsize)
plt.show()

# Calculating (some of the) moments of these distributions


The $n^{th}$ moment of a discrete variable drawn from a distribution $p(k)$ is given by

> $m_n = \sum_{k \in S} k^n~p(k)$

where $S$ is the sample space of the random variable.
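
The definition translates almost literally into code. The sketch below defines a hypothetical helper, `discrete_moment`, that sums $k^n~p(k)$ over a finite set of support values, and applies it to a binomial distribution. For distributions with infinite support the sum would have to be truncated, which is not handled here.

```python
import numpy as np
import scipy.stats as stats


def discrete_moment(rv, order, support):
    """Return the sum over the support of k**order * p(k) for a frozen rv."""
    support = np.asarray(support, dtype=float)
    return float(np.sum(support**order * rv.pmf(support)))


binom_rv = stats.binom(20, 0.5)
support = np.arange(0, 21)     # full sample space of binom(20, 0.5)

m0 = discrete_moment(binom_rv, 0, support)
m1 = discrete_moment(binom_rv, 1, support)
m2 = discrete_moment(binom_rv, 2, support)

print(f"m_0 = {m0:.4f}")
print(f"m_1 = {m1:.4f}")
print(f"m_2 = {m2:.4f}")
```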


## Zero-th order moment

For n = 0, we have simply:

> $m_0 = \sum_{k \in S} k^0 ~p(k) = \sum_{k \in S} ~p(k) = 1$


## First moments

The first moment is also called the **mean** and the **expected value**:

> $m_1 = \sum_{k \in S} k^1 ~p(k) = \mu$

It is relatively straightforward to calculate for several of the distributions discussed above.

**Uniform:**

> $m_1 = \sum_{k = k_i}^{k_f} \frac{k}{k_f - k_i + 1} = (k_f - k_i + 1)~\frac{k_f + k_i}{2~(k_f - k_i + 1)} = \frac{k_f + k_i}{2}$

**Poisson:**

> $m_1 = \sum_{k =  0}^{\infty} k~\frac{\lambda^k e^{-\lambda}}{k!}$ 
>
> $~~~~ = e^{-\lambda} ~\sum_{k =  1}^{\infty} \lambda ~\frac{\lambda^{k-1}}{(k-1)!}$
>
> $~~~~ = \lambda~ e^{-\lambda} ~\sum_{j =  0}^{\infty} \frac{\lambda^{j}}{j!}$ 
>
> $~~~~ = \lambda ~e^{-\lambda}~ e^{ \lambda}$
>
> $~~~~ = \lambda$

**Binomial:**

> $m_1 = \sum_{k =  0}^{n} k~C_k^n ~p^k~(1-p)^{n-k}$
>
> $~~~~ = \sum_{k =  1}^{n} \frac{k ~ n!}{k!~ (n-k)!} ~ p^k ~ (1-p)^{n-k}$
>
> $~~~~ = \sum_{j =  0}^{n-1} \frac{n!}{j!~(n-1 -j)!} ~p^{j+1}~ (1-p)^{n-1 -j}$
> 
> $~~~~ = \sum_{j =  0}^{m} \frac{n ~m!}{j!~(m-j)!} ~p ~p^j ~(1-p)^{m-j}, \quad m = n - 1$
> 
> $~~~~ = np ~\sum_{j =  0}^{m} \frac{m!}{j!~(m-j)!}~p^j ~(1-p)^{m-j}$
>
> $~~~~ = np$



## Second moments

The second moment is defined as:

> $m_2 = \sum_{k \in S} k^2~p(k)$

More useful in practical terms than the second moment is actually the **centered second moment**, which is also called the **variance**:

> $V = \sum_{k \in S} (k - \mu)^2~p(k) = m_2 - \mu^2$
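
The second equality follows by expanding the square and using $m_0 = 1$ and $m_1 = \mu$:

> $V = \sum_{k \in S} (k^2 - 2\mu~k + \mu^2)~p(k)$
>
> $~~~~ = m_2 - 2\mu~m_1 + \mu^2~m_0$
>
> $~~~~ = m_2 - 2\mu^2 + \mu^2$
>
> $~~~~ = m_2 - \mu^2$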

The variance is also relatively straightforward to calculate for several of the distributions discussed above.

**Uniform:**

> $V = \frac{(k_f - k_i + 1)^2 - 1}{12}$ 


**Binomial:**

> $V = np(1-p)$ 

**Poisson:**

> $V = \lambda$
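
These closed forms can be compared with the variances that `scipy.stats` reports for the corresponding frozen distributions. The parameter values below are arbitrary examples, and the dictionary `checks` is just a convenient way to line up the two numbers.

```python
import scipy.stats as stats

k_i, k_f = 1, 6            # uniform on the integers k_i, ..., k_f
n_trials, p = 20, 0.5      # binomial parameters
lam = 15                   # Poisson mean

# Each entry holds (variance reported by scipy, variance from the formula)
checks = {
    "uniform": (stats.randint(k_i, k_f + 1).var(),
                ((k_f - k_i + 1) ** 2 - 1) / 12),
    "binomial": (stats.binom(n_trials, p).var(),
                 n_trials * p * (1 - p)),
    "Poisson": (stats.poisson(lam).var(),
                lam),
}

for name, (from_scipy, from_formula) in checks.items():
    print(f"{name:>8s}: scipy = {from_scipy:.4f}, formula = {from_formula:.4f}")
```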





# Generating pseudo-random variables according to a specified distribution

In [None]:
n = 10
p = 0.8
L = 100

# Generate a NumPy array of L random variates drawn from the distribution
random_distribution = stats.nbinom.rvs(n, p, size = L)

# print(random_distribution)
# print(len(random_distribution))

# Create histogram
plt.hist(random_distribution, bins = 10, density = True, cumulative= True)

# Save the figure to the current working directory
plt.savefig( Path.cwd() / f'figure_dist_nbinom_n{n}_p{p}_size{L}.png' )



## Dependence of sample mean on sample size

I want you to study how the mean of a sample drawn from a **distribution of your choice** changes with sample size `L`. You can use and modify the code two cells above to create a function that returns the sample mean of `L` random variables drawn from that distribution and then plots the sample mean versus `L`.

In [None]:
# Choose a distribution from those given above and select the parameter values

# Select a sample size

# Generate sample with specific size and calculate its mean

# Repeat many times, and plot all the means



## Dependence of sample standard deviation on sample size

Repeat the analysis above but for the standard deviation of a sample drawn from a **binomial distribution**.
