# The Poisson Process: Section 0 - Preliminaries

## Exponential Distribution

The exponential distribution is a **continuous**, non-negative probability distribution on $[0, \infty)$.

Let $X \sim \text{Exponential}(\lambda)$:
* $\displaystyle F(x) \overset{\mathsf{def}}{=} \mathsf{P}(X \leq x) = 1 - e^{-\lambda x}$
* $\displaystyle f(x) \overset{\mathsf{def}}{=} \frac{d F(x)}{dx} = \lambda e^{-\lambda x}$
* $\displaystyle M_X(\theta) \overset{\mathsf{def}}{=} \mathsf{E}\left[e^{\theta X}\right] = \frac{\lambda}{\lambda - \theta}$ (for $\theta < \lambda$).

It is the *unique* continuous distribution that satisfies the “memoryless” property: for all $s, t \geq 0$,
$$\mathsf{P}(X > t+s\,|\,X > t) = \mathsf{P}(X > s)$$
Suppose $X \sim \text{Exponential}(\lambda)$ represents the lifetime of a lightbulb – after $X$ hours, the bulb burns out.  If the bulb has lasted $t$ hours, what is the probability it lasts at least $s$ hours more?  By the memoryless property, it is the same as the probability a new lightbulb lasts $s$ hours.  In this sense, the lightbulb has no “memory” of being on for $t$ hours; it might as well be new.
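
As a quick numerical sanity check of the formulas and the memoryless property above, here is a minimal Python sketch. The function names and the rate `lam = 0.001` per hour are illustrative assumptions, not values from the text.

```python
import math

def exp_cdf(x, lam):
    """F(x) = P(X <= x) = 1 - exp(-lam * x), for x >= 0."""
    return 1.0 - math.exp(-lam * x)

def exp_pdf(x, lam):
    """f(x) = lam * exp(-lam * x), for x >= 0."""
    return lam * math.exp(-lam * x)

def exp_survival(x, lam):
    """P(X > x) = 1 - F(x) = exp(-lam * x), for x >= 0."""
    return math.exp(-lam * x)

def conditional_survival(s, t, lam):
    """P(X > t + s | X > t) = P(X > t + s) / P(X > t)."""
    return exp_survival(t + s, lam) / exp_survival(t, lam)

# Hypothetical lightbulb: rate lam = 0.001 per hour (an assumed value).
lam, t, s = 0.001, 500.0, 200.0

# The density should match a central-difference derivative of the CDF.
h = 1e-3
numeric_pdf = (exp_cdf(t + h, lam) - exp_cdf(t - h, lam)) / (2 * h)
print(numeric_pdf, exp_pdf(t, lam))

# A bulb that has already lasted t hours versus a brand-new bulb.
used_bulb = conditional_survival(s, t, lam)
new_bulb = exp_survival(s, lam)
print(used_bulb, new_bulb)
```

The two numbers on each printed line should agree up to floating-point error, whatever non-negative values of `t` and `s` are chosen.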

## Poisson Distribution

The Poisson distribution is a **discrete**, non-negative probability distribution on $\{0, 1, 2, \dots\}$.

Let $X \sim \text{Poisson}(\lambda)$:
* $\displaystyle \mathsf{P}(X = k) = \frac{\lambda^k}{k!} e^{-\lambda}$
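
The PMF above can be evaluated directly. The following sketch (function name and the value `lam = 3.5` are assumptions chosen for illustration) checks that the probabilities sum to one, truncating the infinite sum at a point where the remaining terms are negligible.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) = lam**k / k! * exp(-lam), for k = 0, 1, 2, ..."""
    return lam**k / math.factorial(k) * math.exp(-lam)

lam = 3.5  # assumed rate, for illustration only

# Truncate the infinite sum at k = 99; the tail beyond that is negligible
# for a rate this small.
total = sum(poisson_pmf(k, lam) for k in range(100))
print(total)

# A few individual probabilities.
for k in range(4):
    print(k, poisson_pmf(k, lam))
```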

## Erlang Distribution

The Erlang distribution is a **continuous**, non-negative probability distribution on $[0, \infty)$.

Let $X \sim \text{Erlang}(n, \lambda)$:
* $\displaystyle F(x) = 1 - e^{-\lambda x} \sum_{k=0}^{n-1} \frac{(\lambda x)^k}{k!}$
* $\displaystyle f(x) = \frac{\lambda^n x^{n-1}}{(n-1)!} e^{-\lambda x}$
* $\displaystyle M_X(\theta) = \left(\frac{\lambda}{\lambda - \theta}\right)^n$ (for $\theta < \lambda$).

(Note, this is a special case of the [Gamma distribution](https://en.m.wikipedia.org/wiki/Gamma_distribution) where the shape parameter is an integer.)
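
One way to check that the CDF and density above are consistent is to integrate the density numerically and compare with the closed-form CDF. This is a rough sketch using a midpoint rule; the function names and the parameter values `n = 3`, `lam = 2.0`, `x = 1.5` are assumptions for illustration.

```python
import math

def erlang_pdf(x, n, lam):
    """f(x) = lam**n * x**(n-1) / (n-1)! * exp(-lam * x), for x >= 0."""
    return lam**n * x**(n - 1) / math.factorial(n - 1) * math.exp(-lam * x)

def erlang_cdf(x, n, lam):
    """F(x) = 1 - exp(-lam * x) * sum_{k=0}^{n-1} (lam * x)**k / k!."""
    partial_sum = sum((lam * x)**k / math.factorial(k) for k in range(n))
    return 1.0 - math.exp(-lam * x) * partial_sum

def integrate_pdf(x, n, lam, steps=100_000):
    """Midpoint-rule approximation of the integral of the density over [0, x]."""
    h = x / steps
    return h * sum(erlang_pdf((i + 0.5) * h, n, lam) for i in range(steps))

n, lam, x = 3, 2.0, 1.5  # assumed values
closed_form = erlang_cdf(x, n, lam)
numeric = integrate_pdf(x, n, lam)
print(closed_form, numeric)

# With n = 1 the formulas reduce to the Exponential(lam) case.
print(erlang_cdf(x, 1, lam), 1.0 - math.exp(-lam * x))
```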

## Moment Generating Functions

Suppose $X_1, \dots, X_n$ are independent and identically distributed $\text{Bernoulli}(p)$ random variables:
$$\mathsf{P}(X_i = 1) = p = 1 - \mathsf{P}(X_i = 0)$$
Define $S_n$ to be the sum
$$S_n \overset{\mathsf{def}}{=} \sum_{i=1}^n X_i$$
What is the distribution of $S_n$?

You may recall that $S_n \sim \text{Binomial}(n, p)$:
$$\mathsf{P}(S_n = k) = \binom{n}{k} p^k (1 - p)^{n-k} \qquad k = 0, 1, 2, \dots, n$$
How can we derive this using moment generating functions?

The MGF of a random variable $X \sim \text{Bernoulli}(p)$ is:
$$\begin{aligned}
    \mathsf{E}\left[e^{\theta X}\right] &= \sum_{k=0}^1 e^{\theta k} \mathsf{P}(X = k) \\
    %&= \mathsf{P}(X = 0) + e^{\theta} \mathsf{P}(X = 1) \\
    %&= 1 - p + e^{\theta} p \\
    &= p e^{\theta} + 1 - p
  \end{aligned}$$

The MGF of a random variable $Y \sim \text{Binomial}(n, p)$ is:
$$\begin{aligned}
    \mathsf{E}\left[e^{\theta Y}\right] &= \sum_{k=0}^n e^{\theta k} \mathsf{P}(Y = k) \\
    &= \sum_{k=0}^n e^{\theta k} \binom{n}{k} p^k (1 - p)^{n-k} \\
    &= \sum_{k=0}^n \binom{n}{k} \left(p e^{\theta}\right)^k (1 - p)^{n-k} \\
    &= (p e^{\theta} + 1 - p)^n
  \end{aligned}$$
(The last step uses the [binomial theorem](https://en.m.wikipedia.org/wiki/Binomial_theorem#Theorem_statement).)

We want to show that the MGF of $S_n$ is equal to the MGF of a binomial random variable:
$$\begin{aligned}
    \mathsf{E}\left[e^{\theta S_n}\right] &= \mathsf{E}\left[\exp\left(\theta \sum_{i=1}^n X_i\right)\right] \\
    &= \mathsf{E}\left[\prod_{i=1}^n e^{\theta X_i}\right] \\
    &= \prod_{i=1}^n \mathsf{E}\left[e^{\theta X_i}\right] &\text{(independent)} \\
    &= \left(\mathsf{E}\left[e^{\theta X_1}\right]\right)^n &\text{(identically distributed)} \\
    &= \left(p e^{\theta} + 1 - p\right)^n &\text{(MGF of $\text{Bernoulli}(p)$)}
  \end{aligned}$$
This matches the MGF of a $\text{Binomial}(n, p)$ random variable.  Because an MGF that is finite in a neighborhood of $\theta = 0$ uniquely determines the distribution, it follows that $S_n \sim \text{Binomial}(n, p)$.
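
The argument above can be spot-checked numerically: raise the Bernoulli MGF to the $n$th power and compare it with the Binomial MGF computed directly from the PMF. The sketch below is illustrative only; the function names and the values `n = 10`, `p = 0.3` are assumptions. The identity holds for every real $\theta$, positive or negative, so the check does not depend on the sign convention used in the exponent.

```python
import math

def bernoulli_mgf(theta, p):
    """E[exp(theta * X)] for X ~ Bernoulli(p), summed over the outcomes 0 and 1."""
    return (1 - p) * math.exp(theta * 0) + p * math.exp(theta * 1)

def binomial_mgf(theta, n, p):
    """E[exp(theta * Y)] for Y ~ Binomial(n, p), summed directly over the PMF."""
    return sum(
        math.exp(theta * k) * math.comb(n, k) * p**k * (1 - p)**(n - k)
        for k in range(n + 1)
    )

n, p = 10, 0.3  # assumed values
for theta in (-1.0, -0.25, 0.0, 0.5, 1.0):
    product_form = bernoulli_mgf(theta, p) ** n   # MGF of S_n via independence
    direct_form = binomial_mgf(theta, n, p)       # MGF of Binomial(n, p)
    print(theta, product_form, direct_form)
```

Each printed row should show two values that agree up to floating-point rounding.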

### Exercise 0.1

Derive the MGF of a random variable $X \sim \text{Exponential}(\lambda)$.

### Exercise 0.2

Derive the MGF of a random variable $X \sim \text{Erlang}(n, \lambda)$.

*Hint:* If $n$ is a positive integer, then
$$\int_0^\infty x^{n-1} e^{-x} \, dx = (n-1)!$$
This is a special case of the [Gamma function](https://en.m.wikipedia.org/wiki/Gamma_function).
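
After working the exercises by hand, a derived formula can be compared against a numerical approximation of $\mathsf{E}\left[e^{\theta X}\right] = \int_0^\infty e^{\theta x} f(x) \, dx$. The sketch below is one possible check, not a derivation: it uses a plain midpoint rule on a truncated interval, and the helper names, the truncation point `upper = 80.0`, and the parameter values are all assumptions. It also checks the integral in the hint for $n = 5$.

```python
import math

def midpoint_integral(g, upper, steps=200_000):
    """Midpoint-rule approximation of the integral of g over [0, upper]."""
    h = upper / steps
    return h * sum(g((i + 0.5) * h) for i in range(steps))

def erlang_pdf(x, n, lam):
    """Erlang(n, lam) density; n = 1 gives the Exponential(lam) density."""
    return lam**n * x**(n - 1) / math.factorial(n - 1) * math.exp(-lam * x)

def numeric_mgf(theta, n, lam, upper=80.0):
    """Approximate E[exp(theta * X)] for X ~ Erlang(n, lam), with theta < lam."""
    return midpoint_integral(
        lambda x: math.exp(theta * x) * erlang_pdf(x, n, lam), upper
    )

lam, theta = 2.0, 0.5  # assumed values with theta < lam

# Compare with the closed forms listed earlier for n = 1 and n = 3.
for n in (1, 3):
    print(n, numeric_mgf(theta, n, lam), (lam / (lam - theta)) ** n)

# The integral from the hint with n = 5, which should be close to 4! = 24.
hint_value = midpoint_integral(lambda x: x**4 * math.exp(-x), 80.0)
print(hint_value, math.factorial(4))
```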

## Memoryless Property of the Geometric Distribution

Among continuous distributions, the memoryless property is unique to the Exponential distribution.  However, the Geometric distribution is a discrete distribution that also has the memoryless property.

For $X \sim \text{Geometric}(p)$
$$\mathsf{P}(X = k) = (1 - p)^{k-1} p \qquad\qquad \mathsf{P}(X > k) = (1 - p)^k$$
for $k = 1, 2, \dots$.

Often, this is interpreted in the context of a sequence of independent and identically distributed $\text{Bernoulli}(p)$ random variables where $X$ represents the number of attempts needed to reach the first “success” (a Bernoulli with value $1$).  For example, if $X = 10$ then the first $9$ attempts were failures and the $10$th attempt was the first success.  In this context, if you have observed $n$ failures without a success, what is the probability of observing an additional $m$ failures without a success?  The memoryless property says that it's the same as the probability of starting anew and observing $m$ failures without a success.  This should be almost obvious, since every attempt is independent, so there is no need to “remember” the first $n$ failures — they have no effect on any future attempt.
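
The Bernoulli-trial interpretation above lends itself to a small simulation. The following sketch repeats Bernoulli($p$) attempts until the first success and compares the empirical tail probabilities with $(1 - p)^k$. The function name, the seed, and the values `p = 0.3` and `trials = 100_000` are assumptions for illustration; the empirical values only approximate the exact ones.

```python
import random

def attempts_until_success(p, rng):
    """Run Bernoulli(p) trials and return the index of the first success."""
    attempts = 1
    while rng.random() >= p:  # this attempt fails with probability 1 - p
        attempts += 1
    return attempts

rng = random.Random(0)  # fixed seed so the run is reproducible
p, trials = 0.3, 100_000  # assumed values
samples = [attempts_until_success(p, rng) for _ in range(trials)]

# Empirical P(X > k) versus the exact value (1 - p)**k.
for k in (1, 3, 5):
    empirical = sum(x > k for x in samples) / trials
    print(k, empirical, (1 - p) ** k)
```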

Mathematically, we show that $X \sim \text{Geometric}(p)$ satisfies the memoryless property as follows:
$$\begin{aligned}
    \mathsf{P}(X > n+m \,|\, X > n) &= \frac{\mathsf{P}(X > n + m \text{ and } X > n)}{\mathsf{P}(X > n)} \\
    &= \frac{\mathsf{P}(X > n + m)}{\mathsf{P}(X > n)} &\text{since }\{X > n + m\} \subseteq \{X > n\}\\
    &= \frac{(1 - p)^{n+m}}{(1 - p)^{n}} \\
    &= (1 - p)^m \\
    &= \mathsf{P}(X > m)
  \end{aligned}$$
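
The same calculation can be reproduced with exact rational arithmetic. In this sketch the tail probability is built from the PMF rather than taken from the closed form, so it also confirms that $\mathsf{P}(X > k) = (1 - p)^k$. The function names and the values `p = 1/4`, `n = 6`, `m = 3` are assumptions chosen for illustration.

```python
from fractions import Fraction

def geometric_pmf(k, p):
    """P(X = k) = (1 - p)**(k - 1) * p, for k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p

def geometric_survival(k, p):
    """P(X > k) = 1 - sum_{j=1}^{k} P(X = j), computed from the PMF."""
    return 1 - sum(geometric_pmf(j, p) for j in range(1, k + 1))

p = Fraction(1, 4)  # assumed success probability
n, m = 6, 3         # assumed numbers of failures

# The tail computed from the PMF matches the closed form exactly.
assert geometric_survival(n, p) == (1 - p) ** n

# Memoryless property: P(X > n + m | X > n) equals P(X > m).
conditional = geometric_survival(n + m, p) / geometric_survival(n, p)
assert conditional == geometric_survival(m, p)
print(conditional)  # 27/64
```

Using `Fraction` avoids floating-point rounding, so the equalities can be asserted exactly rather than within a tolerance.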