# Poisson Distribution

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as st

from ipywidgets import interact, IntSlider, FloatSlider

%matplotlib inline

### Introduction

The **Poisson distribution** is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events **occur with a known constant mean rate** and **independently of the time** since the last event. The interval need not be a time span; it can also be a distance, an area or a volume. Typical examples:

* The number of phone calls in a certain period
* The number of meteorites greater than 1 meter diameter that strike Earth in a year
* The number of technological innovations made in a year
* The number of defective computer chips produced at a plant

### Details

A Poisson distribution depends on the rate parameter $\lambda$, which describes the average number of occurrences of an event in the interval. The rate parameter $\lambda$ must be a positive real number: an average count cannot be negative, and $\lambda = 0$ would mean the event never occurs. A random variable that follows a Poisson distribution takes only nonnegative integer values, since the number of occurrences can only be 0, 1, 2, …  

Formally, a discrete random variable $X$ is said to have a Poisson distribution with parameter $\lambda > 0$ if, for $k = 0, 1, 2, \ldots$, the probability mass function (PMF) of $X$ is given by:

**PMF:**

$$f(k; \lambda) = Prob \{ X = k\} = \frac {\lambda^k e^{-\lambda}}{k!} $$
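
To make the formula concrete, the sketch below evaluates it directly with the standard library and checks that the probabilities sum to (almost) 1. The helper name `poisson_pmf`, the choice $\lambda = 4$, and the truncation at $k = 59$ are illustrative assumptions, not part of the original notebook.

```python
import math

def poisson_pmf(k, lam):
    """Evaluate lam**k * exp(-lam) / k! for an integer k >= 0 and lam > 0."""
    return lam**k * math.exp(-lam) / math.factorial(k)

lam = 4.0  # illustrative rate

# Probabilities for k = 0, 1, ..., 59; the mass beyond that is negligible for lam = 4
pmf_values = [poisson_pmf(k, lam) for k in range(60)]
total = sum(pmf_values)

print(pmf_values[:3])  # P(X=0), P(X=1), P(X=2)
print(total)           # should be very close to 1
```

The same values should be returned by `st.poisson(lam).pmf(k)`, which is what the plotting cell in this notebook relies on.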


In [None]:
def Poisson_plot_pmf_cmf(mu):
    """Plot the PMF and CDF of a Poisson distribution with rate `mu`."""
    prv = st.poisson(mu)
    fig, ax = plt.subplots(1, 2, figsize=(9, 6))

    # Evaluate both functions on k = 0, 1, ..., 20
    xvals = np.arange(21)

    ax[0].bar(xvals, prv.pmf(xvals), width=0.5)
    ax[0].set_title("Probability Mass Function")

    # The CDF of a discrete variable is a step function
    ax[1].step(xvals, prv.cdf(xvals), where="post")
    ax[1].set_title("Cumulative Distribution Function")

In [None]:
fs_lambda = FloatSlider(
    value=5, min=0.1, max=15,
    step=0.1, description=r"Rate $\lambda$",
    style={"description_width": "10%"},
    layout={"width": "80%"}
)

output = interact(Poisson_plot_pmf_cmf, mu=fs_lambda)

### Mean and Variance
**Mean**
$$
\mu = E[X] = \sum_{k=0}^{\infty} k \ Prob \{ X = k \} = \sum_{k=1}^{\infty} \frac{\lambda^{k}}{ (k-1)! } e^{-\lambda} = \lambda \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{ (k-1)! } e^{-\lambda} = \lambda \sum_{j=0}^{\infty} \frac{\lambda^{j}}{ j! } e^{-\lambda} = \lambda
$$
  
**Variance**
$$
\sigma^2 = \lambda
$$
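
The variance is stated above without a derivation. One standard route, sketched here, first computes the factorial moment $E[X(X-1)]$ from the PMF and then combines it with the mean $E[X] = \lambda$:

$$
E[X(X-1)] = \sum_{k=2}^{\infty} k(k-1) \frac{\lambda^{k} e^{-\lambda}}{k!} = \lambda^2 \sum_{k=2}^{\infty} \frac{\lambda^{k-2}}{(k-2)!} e^{-\lambda} = \lambda^2
$$

$$
\sigma^2 = E[X^2] - (E[X])^2 = E[X(X-1)] + E[X] - (E[X])^2 = \lambda^2 + \lambda - \lambda^2 = \lambda
$$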

### Property: Poisson Limit Theorem

The Poisson distribution arises as the limit of a binomial distribution when the number of trials grows and the success probability shrinks so that the expected count stays fixed. This is the **Poisson limit theorem**:

Consider $n$ independent Bernoulli trials in which event $A$ occurs with probability $p_n$ in each trial. If $n p_n \to \lambda$ as $n \to \infty$, where $\lambda > 0$ is a constant, then for any $k = 0, 1, 2, \ldots$

$$
\lim_{n\to \infty}{n \choose k} (1 - p_n)^{n-k} p_n^k = \frac {\lambda^k e^{-\lambda}}{k!}
$$
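
The limit can be checked numerically for a single value of $k$. The sketch below takes $p_n = \lambda / n$ (one simple choice satisfying $n p_n \to \lambda$) and measures how far the binomial probability is from the Poisson probability as $n$ grows. The helper names and the values $\lambda = 4$, $k = 3$ are illustrative assumptions.

```python
import math

def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson probability of exactly k events at rate lam."""
    return lam**k * math.exp(-lam) / math.factorial(k)

lam, k = 4.0, 3  # illustrative values

gaps = []
for n in (10, 100, 1000, 10000):
    # p_n = lam / n keeps the expected count n * p_n fixed at lam
    gap = abs(binom_pmf(k, n, lam / n) - poisson_pmf(k, lam))
    gaps.append(gap)
    print(n, gap)
```

For these values the gap shrinks by roughly a factor of ten each time $n$ grows tenfold.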

### Example:  Phone Calls in a Day

Suppose phone calls arrive independently at a known constant average rate of 4 per day. We can model the number of calls received in a day using a Poisson distribution with parameter $\lambda = 4$.

In [None]:
# define the Poisson distribution and check the mean
prv = st.poisson(mu = 4)

prv.mean()

In [None]:
# check the variance
prv.var()

In [None]:
# draw 20 random samples from the distribution
prv.rvs(20)
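
Twenty draws are too few to see the mean and variance clearly. As a rough empirical check, the sketch below draws a much larger sample with NumPy's random generator and compares the sample mean and sample variance with $\lambda = 4$. The seed and the sample size are arbitrary choices made for reproducibility.

```python
import numpy as np

lam = 4.0
rng = np.random.default_rng(seed=0)  # fixed seed, chosen arbitrarily
samples = rng.poisson(lam=lam, size=100_000)

sample_mean = samples.mean()
sample_var = samples.var(ddof=1)  # unbiased sample variance

# Both values should be close to lam = 4
print(sample_mean, sample_var)
```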

Question: What is the probability of receiving fewer than 6 calls that day?
  
There are two ways to approach this:  
  
**Poisson distribution**: We can use the Poisson distribution with $\lambda = 4$ to model the number of phone calls in one day. Receiving fewer than 6 calls means receiving at most 5, so the probability is $Prob \{ X \le 5 \}$, the CDF evaluated at 5:

In [None]:
prv.cdf(5)
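
As a cross-check, the probability of fewer than 6 calls can also be obtained by summing the PMF over $k = 0, 1, \ldots, 5$; the value $k = 6$ is excluded because the inequality is strict. This is a minimal sketch using only the standard library, and the helper name `poisson_pmf` is illustrative.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of exactly k events at rate lam."""
    return lam**k * math.exp(-lam) / math.factorial(k)

lam = 4.0

# "Fewer than 6" covers k = 0, 1, 2, 3, 4, 5
p_fewer_than_6 = sum(poisson_pmf(k, lam) for k in range(6))
print(p_fewer_than_6)  # roughly 0.785
```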

**Binomial distribution**:

**Binomial distribution 1**: Split the day into $24$ one-hour slots and suppose each slot independently contains a call with probability $\frac{4}{24}$. The number of calls in one day then follows a binomial distribution with $24$ trials. The probability of receiving fewer than 6 calls, that is, at most 5, is given below:

In [None]:
st.binom.cdf(k=5, n=24, p=4/24)

**Binomial distribution 2**: Split the day into $24 \times 60$ one-minute slots and suppose each slot independently contains a call with probability $\frac{4}{24 \times 60}$. The number of calls in one day then follows a binomial distribution with $24 \times 60$ trials. The probability of receiving fewer than 6 calls, that is, at most 5, is given below:

In [None]:
st.binom.cdf(k=5, n=24*60, p=4/24/60)

**Binomial distribution 3**: Split the day into $24 \times 60 \times 60$ one-second slots and suppose each slot independently contains a call with probability $\frac{4}{24 \times 60 \times 60}$. The number of calls in one day then follows a binomial distribution with $24 \times 60 \times 60$ trials. The probability of receiving fewer than 6 calls, that is, at most 5, is given below:

In [None]:
st.binom.cdf(k=5, n=24*60*60, p=4/24/60/60)

We can keep making the time slots smaller and smaller. By the Poisson limit theorem, this sequence of binomial distributions converges to the Poisson distribution; the computations above illustrate that convergence numerically.

**Note**: In creating this notebook, I referenced the "Distribution Explorer" material at https://distribution-explorer.github.io/index.html and the Wikipedia article at https://en.wikipedia.org/wiki/Poisson_distribution . I encourage others to find additional information there.