In [None]:
import numpy as np
name_list = ['Andrea', 'Sreelatha', 'Sara', 'Eva', 'Maaike', 'Victor', 'Zuzanna']
np.random.choice(name_list)

# Discrete Probability Distributions

## Introduction

In this lesson we will focus on discrete probability distributions. First let's have a look at the basics.

### Random Variables
Consider an experiment where we roll a die twice. The sample space $S$ contains all 36 ordered pairs of outcomes:



> $S = \{ (1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), (2,3), \ldots, (6,6) \}$



We can define a random variable X on this sample space as:



> X = {Sum of numbers on the die when rolled twice}





*   P{X = 2} = P{(1, 1)} = 1/36
*   P{X = 3} = P{(1, 2), (2, 1)} =2/36
*   P{X = 4} = P{(1, 3), (2, 2), (3, 1)} = 3/36
*   P{X = 5} = P{(1, 4), (2, 3), (3, 2), (4, 1)} = 4/36
*   P{X = 6} = P{(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)} = 5/36
*   P{X = 7} = P{(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)} = 6/36
*   P{X = 8} = P{(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)} = 5/36
*   P{X = 9} = P{(3, 6), (4, 5), (5, 4), (6, 3)} = 4/36
*   P{X = 10} = P{(4, 6), (5, 5), (6, 4)} = 3/36
*   P{X = 11} = P{(5, 6), (6, 5)} = 2/36
*   P{X = 12} = P{(6, 6)} = 1/36



The probabilities of all possible values of X sum to 1:



> $\sum_{x_{i}} P(X=x_{i}) = 1$
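The table of probabilities above can be reproduced by enumerating the sample space. The following is a minimal sketch that assumes a fair six-sided die; the names `sample_space` and `pmf_two_dice` are chosen here for illustration. Note that `Fraction` prints reduced fractions, so 2/36 appears as 1/18.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space S: all 36 ordered pairs from two rolls of a fair six-sided die.
sample_space = list(product(range(1, 7), repeat=2))

# Count how many outcomes map to each value of X (the sum of the two rolls).
counts = Counter(a + b for a, b in sample_space)

# Every outcome has probability 1/36, so P(X = x) = count / 36.
pmf_two_dice = {
    x: Fraction(n, len(sample_space)) for x, n in sorted(counts.items())
}

for x, prob in pmf_two_dice.items():
    print(f"P(X = {x}) = {prob}")

# The probabilities of all possible values sum to 1.
print("Total probability:", sum(pmf_two_dice.values()))
```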



### Parameters

Parameters are an important concept in statistics. Before we move on to some examples, consider these two 'text-book' definitions:

    "[A] set of facts which describes and puts limits on how something should happen or be done."
*Cambridge Dictionary*

    "[A]n arbitrary constant whose value characterizes a member of a system (such as a family of curves)"
*Merriam Webster Dictionary*


To visualise the concept of a parameter, let us look at a familiar example: the Gaussian. A Gaussian or normal distribution has two parameters, $\mu$ and $\sigma$, where $\mu$ is the mean and $\sigma$ the standard deviation. (Note that the variance is usually denoted as $\sigma^{2}$.) Let us plot a couple of Gaussians, each with a different mean and standard deviation.

In [None]:
import numpy as np
import seaborn as sns

gaussian = np.random.normal(loc=1, scale=0.5, size=10000)
# histplot with a KDE overlay replaces the deprecated distplot
sns.histplot(gaussian, kde=True, stat="density")

In [None]:
gaussian = np.random.normal(loc=152, scale=23, size=10000)
# histplot with a KDE overlay replaces the deprecated distplot
sns.histplot(gaussian, kde=True, stat="density")

In [None]:
gaussian = np.random.normal(loc=1, scale=300, size=10000)
# histplot with a KDE overlay replaces the deprecated distplot
sns.histplot(gaussian, kde=True, stat="density")

As can be seen, even though the scales on the axes differ, each plot shows the same bell-shaped curve: $\mu$ only shifts the centre of the curve and $\sigma$ only stretches its width.
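One way to check that the three Gaussians share a shape is to standardise each sample by subtracting $\mu$ and dividing by $\sigma$. The sketch below reuses the three parameter pairs from the plots; the seed and the name `standardised` are arbitrary choices made here. After standardising, every sample should have a mean close to 0 and a standard deviation close to 1.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, for reproducibility only

# The same (mu, sigma) pairs as in the three plots.
params = [(1, 0.5), (152, 23), (1, 300)]

standardised = {}
for mu, sigma in params:
    samples = rng.normal(loc=mu, scale=sigma, size=10_000)
    # Remove the effect of the parameters: shift by mu, rescale by sigma.
    z = (samples - mu) / sigma
    standardised[(mu, sigma)] = z
    print(f"mu={mu}, sigma={sigma}: mean(z)={z.mean():.3f}, std(z)={z.std():.3f}")
```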

### Probability Mass Function 

The probability mass function (PMF) of a discrete random variable assigns a probability to each value the variable can take. As such, it can be thought of as the long-run **relative frequency distribution** of the random variable. We say that,

$$p_{X}(x_{i}) = P(X = x_{i})$$

Whereas until now we've mostly looked at the probability of a single outcome, the PMF describes the probabilities of all possible values at once.


In [None]:
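# Ten observations of a discrete random variable (a sample, not the PMF itself)
observations = np.array([8,8,8,8,8,6,6,6,1,1])

In [None]:
import matplotlib.pyplot as plt

In [None]:
# Weight each observation by 1/n so that bar heights are relative frequencies
plt.hist(observations, weights=np.ones(len(observations)) / len(observations))
plt.show()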
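The histogram shows raw data, so it may help to see how an empirical PMF is obtained from it. The following sketch treats the ten numbers in the array as observations of a random variable and normalises the counts; the names `observations` and `empirical_pmf` are chosen here for illustration.

```python
import numpy as np

# The same ten numbers as in the array above, treated as observations.
observations = np.array([8, 8, 8, 8, 8, 6, 6, 6, 1, 1])

# Count how often each distinct value occurs, then divide by the sample size.
values, counts = np.unique(observations, return_counts=True)
empirical_pmf = counts / counts.sum()

for value, prob in zip(values, empirical_pmf):
    print(f"P(X = {value}) = {prob:.1f}")
```

Dividing by the total count is what turns a frequency table into a set of probabilities that sum to 1.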

## Discrete Random Variables


### Bernoulli Random Variables




A Bernoulli random variable describes an experiment whose outcome can be classified as either a “success” or a “failure”. The random variable X equals 1 if the outcome is a success and 0 if it is a failure.

$Parameter: p$

$P(X = 1) = p$  where $p$ is the probability of success

$P(X = 0) = 1 - p$

The following example uses SciPy to sample from a Bernoulli random variable and plot the result. Because the sequence is generated randomly with P(X = 1) = 0.8 and P(X = 0) = 0.2, the observed proportions are only approximately 0.8 and 0.2, not exactly.

Also note that if we sample from a distribution, we denote this as 

$$x_{i} \sim Ber(p).$$

In the example provided below, we sample from a random Bernoulli variable with a success probability of 0.8. Hence,

$$x_{i} \sim Ber(0.8)$$

In [None]:
import matplotlib.pyplot as plt
from scipy.stats import bernoulli

In [None]:
# p = 0.8, so 1 - p = 0.2
p = 0.8
X = bernoulli.rvs(p, size=100)
plt.hist(X)
plt.show()
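To illustrate why the observed proportions are only approximate, the sketch below draws Bernoulli samples of increasing size with $p = 0.8$, the success probability stated in the text. It uses NumPy's `Generator.binomial` with `n=1` as a stand-in for `bernoulli.rvs`; the seed and sample sizes are arbitrary choices made here.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed, for reproducibility only
p = 0.8  # success probability stated in the text

# A Bernoulli(p) draw is a binomial draw with a single trial (n=1).
for size in (100, 10_000, 1_000_000):
    draws = rng.binomial(n=1, p=p, size=size)
    # The mean of 0/1 draws is the observed fraction of successes.
    print(f"size={size:>9}: fraction of successes = {draws.mean():.4f}")
```

The fraction of successes should move closer to $p$ as the sample size grows.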

### Binomial Random Variable

In the previous example we had only one experiment, where we checked whether the outcome was a success or a failure. Now consider a similar setup in which you conduct $n$ such independent experiments and count the number of successes.

$Parameters: (n, p)$

$n:$ number of trials/experiments

$p:$ probability of success

The probability mass function of a binomial random variable having parameters (n, p) is given by

$$P(X = k) = \binom{n}{k} p^{k} (1-p)^{n-k},$$


where $$\binom{n}{k} = \frac{n!}{(n-k)! k!}.$$

Here, $k$ denotes the number of successes and $n$ the number of trials carried out.
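The formula can be translated into code almost term by term. The following is a minimal sketch using only the standard library; the function name `binomial_pmf` and the values $n = 10$, $p = 0.5$ are illustrative choices, not part of the lesson.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p), computed directly from the formula."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.5  # illustrative values only

# Evaluate the PMF at every possible number of successes, k = 0..n.
probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(probabilities):
    print(f"P(X = {k}) = {prob:.4f}")

# As for any PMF, the probabilities should sum to 1.
print(f"Sum: {sum(probabilities):.6f}")
```

`scipy.stats.binom.pmf(k, n, p)` should return the same values, so this function is only useful for seeing how the formula works.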

#### Example 1



> "An online retailer offers next-day shipping for an extra fee. The retailer says that 95 percent of customers who pay for next-day shipping actually receive the item the next day, and those who don't are issued a refund. Suppose we take a sample of 20 next-day orders, and let X represent the number of these orders that arrive the next day. Assume that the arrival statuses of orders are independent." ([Retrieved from Khan Academy](https://www.khanacademy.org/math/ap-statistics/random-variables-ap/binomial-random-variable/e/binomial-probability))


Suppose we want the probability that exactly 19 of the 20 orders arrive the next day. Thus, we are interested in:

$$P(X=19).$$ 

Here $n = 20$, since there are 20 trials in total, and $k = 19$, the number of successes.

We can plug these values into our formula and obtain

$$P(X = 19) = \binom{20}{19} p^{19} (1-p)^{20-19}.$$

Furthermore, since the assumed probability of success is $0.95$, we know that $p = 0.95$ and $1 - p = 1 - 0.95 = 0.05$. Substituting that in our formula, we obtain:

$$P(X = 19) = \binom{20}{19} 0.95^{19} 0.05^{20-19}.$$ 

Simplifying this equation, we obtain:

$$P(X = 19) = \binom{20}{19} 0.95^{19} 0.05.$$ 

We can then look at the first term, $\binom{20}{19}$, which can be rewritten as:

$$\binom{20}{19} = \frac{20!}{(20-19)! * 19!} = \frac{20 * 19 * 18... * 1}{(1) * (19 * 18... * 1)} = \frac{20}{1}.$$

Then, we have that:

$$P(X = 19) = \frac{20}{1} * 0.95^{19} * 0.05 \approx 20 * 0.377 * 0.05 \approx 0.377.$$


In [None]:
%matplotlib inline
from scipy import stats
from scipy.stats import binom

In [None]:
n=20
p=0.95
binomial = binom(n,p)

x = np.arange(0,30)
fig, ax = plt.subplots(1, 1)
ax.plot(x, binom.pmf(x, n, p), 'bo')
ax.vlines(x, 0, binom.pmf(x, n, p), colors='b', lw=5, alpha=0.5)

In [None]:
print(binomial.mean())
print(binomial.var())
print(binomial.std())

In [None]:
# Print the probability of X=19
print(binomial.pmf(19))

In [None]:
# Sample data points from the distribution
sample = binomial.rvs(10000)
np.array(np.unique(sample, return_counts=True)).T

#### Example 2

We have seen that a single coin flip can be modeled as a Bernoulli random variable. It can also be modeled as a binomial random variable with $n = 1$.

For instance, we can describe the probability of a fair coin landing on heads as follows.

$$P(\text{Heads}) = \frac{1}{2}.$$

Now, we can recover this probability from the binomial formula (assuming we have a fair coin). Suppose that we conduct 1 trial, so $n = 1$, and let $X$ count the number of heads. The probability of heads on a single flip is 0.5, hence $p = 0.5$, and landing on heads corresponds to $k = 1$.

Thus,

$$P(X = 1) = \binom{1}{1} 0.5^{1} (1-0.5)^{1-1}.$$

This gives us,

$$P(X = 1) = 1 * 0.5^{1} (1-0.5)^{1-1}.$$

Since $(1-0.5)^{1-1} = (1-0.5)^{0} = 1$, we have

$$P(X = 1) = 0.5.$$

In [None]:
n=1
p=0.5
binomial = binom(n,p)

x = np.arange(0,30)
fig, ax = plt.subplots(1, 1)
ax.plot(x, binom.pmf(x, n, p), 'bo')
ax.vlines(x, 0, binom.pmf(x, n, p), colors='b', lw=5, alpha=0.5)

In [None]:
print(binomial.pmf(1))

#### Example 3

We can compute the probability of a coin landing on heads 3 times in a row as

$$P(\text{HHH}) = \frac{1}{2} * \frac{1}{2} * \frac{1}{2} = \left ( \frac{1}{2} \right )^{3} = \frac{1}{8} = 0.125$$

Again, we assume that we have a fair coin and that therefore $p=0.5$. This time we conduct three trials ($n = 3$) and ask for three successes ($k = 3$), where $X$ counts the number of heads. Then, the probability of the coin landing on heads 3 times in a row will be:

$$P(X = 3) = \binom{3}{3} 0.5^{3} (1-0.5)^{3-3}.$$

This gives us,

$$P(X = 3) = 1 * 0.5^{3} (1-0.5)^{3-3}.$$

Since $(1-0.5)^{3-3} = (1-0.5)^{0} = 1$, we have

$$P(X = 3) = 0.5^{3} = 0.125.$$



In [None]:
n=3
p=0.5
binomial = binom(n,p)

x = np.arange(0,30)
fig, ax = plt.subplots(1, 1)
ax.plot(x, binom.pmf(x, n, p), 'bo')
ax.vlines(x, 0, binom.pmf(x, n, p), colors='b', lw=5, alpha=0.5)

In [None]:
print(binomial.pmf(3))

#### Example 4 (Extension of Example 3)

In Example 3 we conducted 3 trials and asked for 3 heads. Here, let us assume we conduct 20 trials and ask for the probability that exactly 3 of the 20 flips land on heads. Note that the binomial distribution counts the total number of heads, so these 3 heads may occur anywhere in the sequence and need not be in a row:

$$P(X = 3) = \binom{20}{3} 0.5^{3} (1-0.5)^{20-3}.$$

This gives us,

$$P(X = 3) = 1140 * 0.5^{3} * (1-0.5)^{17}.$$

Thus, we have

$$P(X = 3) = 1140 * 0.125 * 0.00000762939.$$

This gives us,

$$P(X = 3) = 1140 * 0.125 * 0.00000762939 \approx 0.00108718872.$$

As such, the probability that the coin lands on heads exactly 3 times in 20 flips is about 0.0011.


In [None]:
n=20
p=0.5
binomial = binom(n,p)

x = np.arange(0,100)
fig, ax = plt.subplots(1, 1)
ax.plot(x, binom.pmf(x, n, p), 'bo')
ax.vlines(x, 0, binom.pmf(x, n, p), colors='b', lw=5, alpha=0.5)

In [None]:
print(binomial.pmf(3))

### Poisson Random Variable

This is usually used as a counting variable (count number of occurrences of an event within a time frame). For this reason Poisson processes are also known as counting processes.

It is defined as:

$$P(X = k) = e^{-\lambda} \frac{\lambda^{k}}{k !}.$$

$Parameter: λ$  (rate of the process)





![](https://miro.medium.com/max/6384/1*4EbJuTFOvvh6mXVDcE8D3Q.png)


Let’s take a look at an example. Imagine that the number of accidents occurring on a highway each day is a Poisson random variable with parameter λ = 3. What is the probability that exactly one accident occurs today? Here λ can be understood as the expected number of events in the interval.

This is the same as asking for P(X=1).

Then, we can write the probability of exactly one accident occurring as follows:

$$P(X=1) = e^{-3} \frac{3^{1}}{1!}.$$

This gives us,

$$P(X=1) = 2.71828^{-3} * \frac{3}{1}.$$

When we write this out, we get

$$0.04978706836 * 3 \approx 0.149.$$
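The hand calculation above can be checked by coding the Poisson formula directly. This is a minimal sketch using only the standard library; the function name `poisson_pmf` and the cut-off of 50 terms are choices made here for illustration.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), computed directly from the formula."""
    return exp(-lam) * lam**k / factorial(k)


lam = 3  # expected number of accidents per day, as in the example

print(f"P(X = 1) = {poisson_pmf(1, lam):.3f}")

# A Poisson variable has no upper bound, so we truncate the sum at k = 49;
# for lam = 3 the remaining tail is negligible.
total = sum(poisson_pmf(k, lam) for k in range(50))
print(f"Sum of P(X = k) for k = 0..49: {total:.6f}")
```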


In [None]:
%matplotlib inline 
from scipy import stats
from scipy.stats import poisson

In [None]:
param = 3
po = stats.poisson(param)
# Draw random samples
print(po.rvs(10))

In [None]:
x = np.arange(0,20)
fig, ax = plt.subplots(1, 1)
ax.plot(x, poisson.pmf(x, param), 'bo', ms=1, label='poisson pmf')
#Plot axis vertical lines
ax.vlines(x, 0, poisson.pmf(x, param), colors='b', lw=5, alpha=0.5)

In [None]:
print(po.pmf(1))

## Summary 

In this lesson we learnt about discrete random variables and discrete probability distributions, and how they are characterized by their parameters and PMF (probability mass function). We also covered some important distributions, namely the Bernoulli, binomial, and Poisson distributions, along with their application in Python using the SciPy library.