# Exponential Distribution and Poisson Process
- Ross: Chapters 5.1, 5.2, 5.3.1-5.3.4

# **Chapter 5 - The Exponential Distribution and the Poisson Process - Ross**

## **5.1 Introduction**

**Motivation:**
- We must make enough simplifying assumptions to enable us to handle the mathematics but not so many that the mathematical model no longer resembles the real-world phenomenon
- Common assumption that certain random variables are exponentially distributed
  - exponential distribution is both relatively easy to work with and is often a good approximation to the actual distribution
  - it does not deteriorate with time (no theta decay)
    - an item that has been in use for ten (or any number of) hours is as good as a new item in regards to the amount of time remaining until the item fails

## **5.2 The Exponential Distribution**
### ***5.2.1 Definition***

A continuous random variable $X$ is said to have an exponential distribution with parameter $\lambda$, $\lambda > 0$, if its probability density function (PDF) is given by
$$
f(x) = 
\begin{cases} 
\lambda e^{-\lambda x}, & \text{if } x \geq 0 \\
0, & \text{if } x < 0
\end{cases}
$$
or, equivalently, if its cumulative distribution function (CDF) is given by
$$
F(x) = \int_{-\infty}^x f(y) \, dy =
\begin{cases}
1 - e^{-\lambda x}, & \text{if } x \geq 0 \\
0, & \text{if } x < 0
\end{cases}
$$

The mean of the exponential distribution, $E[X]$, is given by
$$
E[X] = \int_{-\infty}^\infty x f(x) \, dx = \int_0^\infty \lambda x e^{-\lambda x} \, dx
$$

Integrating by parts (with $u = x$ and $dv = \lambda e^{-\lambda x} \, dx$) yields
$$
E[X] = -xe^{-\lambda x} \bigg|_0^\infty + \int_0^\infty e^{-\lambda x} \, dx = \frac{1}{\lambda}
$$

The moment generating function (MGF) $\phi(t)$ of the exponential distribution is given by
$$
\phi(t) = E[e^{tX}] = \int_0^\infty e^{tx} \lambda e^{-\lambda x} \, dx = \frac{\lambda}{\lambda - t}
$$
for $t < \lambda$ (Equation 5.1).

All the moments of $X$ can now be obtained by differentiating Eq. (5.1). For example,
$$
E[X^2] = \left. \frac{d^2}{dt^2} \phi(t) \right|_{t=0} = \left. \frac{2\lambda}{(\lambda - t)^3} \right|_{t=0} = \frac{2}{\lambda^2}
$$

Consequently, the variance $\operatorname{Var}(X)$ is given by
$$
\operatorname{Var}(X) = E[X^2] - (E[X])^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}
$$
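As a rough numerical sanity check of the mean and variance derived above, the sketch below integrates the density with a simple midpoint rule. The helper name `exp_moment`, the truncation point, and the choice $\lambda = 2$ are illustrative assumptions, not part of Ross's text.

```python
import math

def exp_moment(k, lam, upper_mult=60.0, steps=200_000):
    """Approximate E[X^k] for X ~ Exp(lam) with the midpoint rule on [0, upper_mult/lam]."""
    upper = upper_mult / lam          # the tail beyond this point is negligible
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h             # midpoint of the i-th subinterval
        total += x**k * lam * math.exp(-lam * x)
    return total * h

lam = 2.0                             # assumed rate, for illustration only
mean = exp_moment(1, lam)             # should be close to 1/lam
second = exp_moment(2, lam)           # should be close to 2/lam^2
variance = second - mean**2           # should be close to 1/lam^2
print(mean, variance)
```

The same helper can be called with larger `k` to compare against the higher moments obtained by differentiating the MGF.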

**Example 5.1 (Exponential Random Variables and Expected Discounted Returns)**:

Suppose that you are receiving rewards at randomly changing rates continuously throughout time. Let $R(x)$ denote the random rate at which you are receiving rewards at time $x$. For a value $\alpha \geq 0$, called the discount rate, the quantity
$$
R = \int_0^\infty e^{-\alpha x} R(x) \, dx
$$
represents the total discounted reward. (In certain applications, $\alpha$ is a continuously compounded interest rate, and $R$ is the present value of the infinite flow of rewards.)

The expected value of $R$, $E[R]$, is given by
$$
E[R] = E\left[ \int_0^\infty e^{-\alpha x} R(x) \, dx \right] = \int_0^\infty e^{-\alpha x} E[R(x)] \, dx
$$

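- To interpret this, let $T$ be an exponential random variable with rate $\alpha$ (taking $\alpha > 0$) that is independent of all the $R(x)$. The expected total (undiscounted) reward earned by time $T$ is
$$
E\left[ \int_0^T R(x) \, dx \right] = \int_0^\infty E[R(x)] \, P\{T > x\} \, dx = \int_0^\infty e^{-\alpha x} E[R(x)] \, dx = E[R]
$$
- Therefore, the expected total discounted reward is equal to the expected total (undiscounted) reward earned by a random time that is exponentially distributed with a rate equal to the discount rate $\alpha$.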

### ***5.2.2 Properties of the Exponential Distribution***

A random variable X is said to be without memory or ***memoryless*** if

$$
P\{X > s + t \mid X > t\} = P\{X > s\} \quad \text{for all } s, t \geq 0
$$

$$
\implies
$$

$$
P\{X > s + t\} = P\{ X > s\} P\{X > t\} 
$$

- Think of X as being the lifetime of some instrument; ***memoryless*** tells us that the probability that the instrument lives for at least $s + t$ hours given that it has survived $t$ hours is the same as the initial probability that it lives for at least $s$ hours
- if the instrument is alive at time $t$, then the distribution of the remaining amount of time that it survives is the same as the original lifetime distribution
  - does not remember that it has already been in use for a time $t$

**Example 5.2:** Suppose that the amount of time one spends in a bank is exponentially distributed with mean ten minutes, that is, $\lambda = \frac{1}{10}$. What is the probability that a customer will spend more than fifteen minutes in the bank? What is the probability that a customer will spend more than fifteen minutes in the bank given that she is still in the bank after ten minutes?

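$$
P\{X > 15\} = e^{-15 \lambda} = e^{-3/2} \approx 0.223
$$

By the memoryless property, the second probability equals the probability of spending at least five more minutes in the bank:
$$
P\{X > 15 \mid X > 10\} = P\{X > 5\} = e^{-5 \lambda} = e^{-1/2} \approx 0.607
$$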
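The two answers of Example 5.2 can be reproduced in a few lines. This is only a sketch; the helper `survival` is a hypothetical name for the tail probability $P\{X > x\} = e^{-\lambda x}$.

```python
import math

def survival(x, lam):
    """Tail probability P{X > x} of an exponential random variable with rate lam."""
    return math.exp(-lam * x) if x >= 0 else 1.0

lam = 1 / 10                                       # mean of ten minutes
p_more_than_15 = survival(15, lam)                 # P{X > 15}
p_cond = survival(15, lam) / survival(10, lam)     # P{X > 15 | X > 10}
p_more_than_5 = survival(5, lam)                   # P{X > 5}
print(p_more_than_15, p_cond, p_more_than_5)
```

The conditional probability is computed directly from the definition, so its agreement with `p_more_than_5` illustrates the memoryless property rather than assuming it.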

**Example 5.4**: The dollar amount of damage involved in an automobile accident is an exponential random variable with mean 1000. Of this, the insurance company only pays that amount exceeding (the deductible amount of) 400. Find the expected value and the standard deviation of the amount the insurance company pays per accident.

$$
I = 
\begin{cases} 
1, & \text{if } X > 400 \\
0, & \text{if } X \leq 400
\end{cases}
$$

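With $X$ denoting the dollar amount of damage and $I$ the indicator above, the amount paid by the insurance company is
$$
Y = (X - 400)^{+}
$$
where $a^+ = \max(a, 0)$. By the lack of memory property, given that the damage exceeds 400, the excess over 400 is again exponential with mean 1000, which gives: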

$$
E[Y \mid I = 1] = 1000,
$$
$$
E[Y \mid I = 0] = 0,
$$
$$
\text{Var}(Y \mid I = 1) = (1000)^2,
$$
$$
\text{Var}(Y \mid I = 0) = 0.
$$

$$
E[Y] = E[E[Y \mid I]] = 10^3 E[I] = 10^3 e^{-0.4} \approx 670.32,
$$
$$
\text{Var}(Y) = E[\text{Var}(Y \mid I)] + \text{Var}(E[Y \mid I]) = 10^6 e^{-0.4} + 10^6 e^{-0.4}(1 - e^{-0.4}) \approx 944.09^2
$$
so the standard deviation of the amount paid is approximately $944.09$.
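The conditional variance computation of Example 5.4 can be checked numerically. The variable names below are illustrative; the formulas are the ones stated above.

```python
import math

mean_damage = 1000.0
deductible = 400.0
lam = 1 / mean_damage

p_exceed = math.exp(-lam * deductible)        # E[I] = P{X > 400}
e_y = mean_damage * p_exceed                  # E[Y] = 10^3 * E[I]

# Var(Y) = E[Var(Y | I)] + Var(E[Y | I])
var_y = mean_damage**2 * p_exceed + mean_damage**2 * p_exceed * (1 - p_exceed)
sd_y = math.sqrt(var_y)
print(e_y, sd_y)
```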

**Example 5.5:** A store must decide how much of a certain commodity to order so as to meet next month’s demand, where that demand is assumed to have an exponential distribution with rate $\lambda$. If the commodity costs the store $c$ per pound, and can be sold at a price of $s > c$ per pound, how much should be ordered so as to maximize the store’s expected profit? Assume that any inventory left over at the end of the month is worthless and that there is no penalty if the store cannot meet all the demand.

Let $X$ be the demand and $t$ the amount ordered. The profit is
$$
P = s \min(X, t) - ct
$$

where
$$
\min(X, t) = X - (X - t)^+
$$

Conditioning on whether $X > t$ and then using the lack of memory property of the exponential:
$$
E[(X - t)^+] = E[(X - t)^+ \mid X > t]P(X > t) + E[(X - t)^+ \mid X \leq t]P(X \leq t) = E[(X - t)^+ \mid X > t]e^{-\lambda t} = \frac{1}{\lambda}e^{-\lambda t}
$$

$$
E[\min(X,t)] = \frac{1}{\lambda} - \frac{1}{\lambda}e^{-\lambda t}
$$

$$
\implies
$$

$$
E[P] = \frac{s}{\lambda} - \frac{s}{\lambda}e^{-\lambda t} - ct
$$
Setting the derivative $s e^{-\lambda t} - c$ equal to zero shows that the expected profit is maximized at $t = \frac{1}{\lambda}\log(s/c)$.

- Suppose now that:
  - all unsold inventory can be returned for the amount $r < \min(s,c)$ per pound
  - there is a penalty cost $p$ per pound of unmet demand

$$
E[P] = \frac{s}{\lambda} - \frac{s}{\lambda}e^{-\lambda t} - ct + rE[(t - X)^+] - pE[(X - t)^+]
$$

$$
E[(t - X)^+] = t - E[\min(X,t)] = t - \frac{1}{\lambda} + \frac{1}{\lambda}e^{-\lambda t}
$$

$$
E[P] = \frac{s - r}{\lambda} + \frac{r - s - p}{\lambda}e^{-\lambda t} - (c - r)t
$$
which is maximized at
$$
t = \frac{1}{\lambda}\log\left(\frac{s + p - r}{c - r}\right)
$$
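The ordering rule of Example 5.5 can be sketched as follows. The function names and the numerical values of $\lambda, s, c, r, p$ are assumptions chosen for illustration; setting `r = p = 0` recovers the basic version of the problem.

```python
import math

def expected_profit(t, lam, s, c, r=0.0, p=0.0):
    """E[P] when ordering t units; r = return value, p = shortage penalty (per unit)."""
    e_excess = math.exp(-lam * t) / lam      # E[(X - t)^+]
    e_min = 1 / lam - e_excess               # E[min(X, t)]
    e_left = t - e_min                       # E[(t - X)^+]
    return s * e_min - c * t + r * e_left - p * e_excess

def optimal_order(lam, s, c, r=0.0, p=0.0):
    """Order quantity that sets the derivative of E[P] to zero."""
    return math.log((s + p - r) / (c - r)) / lam

lam, s, c, r, p = 0.01, 5.0, 2.0, 0.5, 1.0   # hypothetical parameters
t_star = optimal_order(lam, s, c, r, p)
print(t_star, expected_profit(t_star, lam, s, c, r, p))
```

Because $E[P]$ is concave in $t$, the stationary point is the maximizer, which can be confirmed by evaluating `expected_profit` slightly to either side of `t_star`.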

The memoryless property is further illustrated by the failure rate (or hazard rate) function of the exponential distribution.
Consider a continuous positive random variable $X$ having distribution function $F$ and density $f$. The ***failure (or hazard) rate*** function $\lambda(t)$ is defined by:

$$
\lambda(t) = \frac{f(t)}{1 - F(t)}
$$

- Suppose that an item, having lifetime $X$, has survived for $t$ hours, and we desire the probability that it does not survive for an additional time $dt$:
$$
P\{X \in (t,t + dt) \mid X > t\} = \frac{P\{X \in (t,t + dt)\}}{P\{X > t\}} \approx \frac{f(t)\,dt}{1 - F(t)} = \lambda(t)\,dt
$$

- $λ(t)$ represents the conditional probability density that a t-year-old item will fail

<br>

- Suppose now that the lifetime distribution is exponential. Then, by the memoryless property, it follows that the distribution of remaining life for a $t$-year-old item is the same as for a new item. Hence, $λ(t)$ should be constant:
$$
λ(t) = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda
$$
- failure rate function for the exponential distribution is constant
- parameter $\lambda$ is often referred to as the rate of the distribution
- failure rate function $\lambda(t)$ uniquely determines the distribution $F$, since $F(t) = 1 - \exp\left\{-\int_0^t \lambda(s)\,ds\right\}$

**Example 5.6**: Let $X_1, \ldots, X_n$ be independent exponential random variables with respective rates $\lambda_1, \ldots, \lambda_n$, where $\lambda_i \neq \lambda_j$ when $i \neq j$. Let $T$ be independent of these random variables and suppose that
$$
\sum_{j=1}^n P_j = 1 \text{ where } P_j = P\{T = j\}
$$

- random variable $X_T$ is said to be a ***hyperexponential random variable***

- The distribution function $F$ of $X = X_T$ is obtained by conditioning on $T$:
$$
1 - F(t) = P\{X > t\} = \sum_{i=1}^n P_i e^{-\lambda_i t}
$$
$$
f(t) = \sum_{i=1}^n \lambda_i P_i e^{-\lambda_i t}
$$

- Failure rate function of a hyperexponential random variable:
$$
\lambda(t) = \frac{\sum_{j=1}^n P_j \lambda_j e^{-\lambda_j t}}{\sum_{i=1}^n P_i e^{-\lambda_i t}} = \sum_{j=1}^n \lambda_j P\{T = j \mid X > t\}
$$

- If $λ_1 < λ_i$, for all $i > 1$
$$
\lim_{t \to \infty} P\{T = 1 \mid X > t\} = 1
$$

- when $i \ne 1$, 
$$
\lim_{t \to \infty} P\{T = i \mid X > t\} = 0
$$

- above shows:
$$
\lim_{t \to \infty}\lambda(t) = \min_i(\lambda_i)
$$
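To see the limit above numerically, the following sketch evaluates the hyperexponential failure rate at a few time points. The mixing probabilities and rates are made-up illustrative values, and `hyperexp_hazard` is a hypothetical helper name.

```python
import math

def hyperexp_hazard(t, probs, rates):
    """Failure rate f(t) / (1 - F(t)) of a hyperexponential random variable."""
    num = sum(p * lam * math.exp(-lam * t) for p, lam in zip(probs, rates))
    den = sum(p * math.exp(-lam * t) for p, lam in zip(probs, rates))
    return num / den

probs = [0.2, 0.5, 0.3]      # P_j = P{T = j}, assumed values
rates = [1.0, 2.0, 5.0]      # distinct rates, smallest is 1.0
hazards = [hyperexp_hazard(t, probs, rates) for t in (0.0, 1.0, 5.0, 20.0)]
print(hazards)
```

At $t = 0$ the failure rate is the mixture average $\sum_j P_j \lambda_j$; as $t$ grows it decreases toward the smallest rate, because a long-surviving item is increasingly likely to be of the longest-lived type.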

### ***5.2.3 Further Properties of the Exponential Distribution***

**Proposition 5.1:** If $X_1, \dots, X_n$ are independent exponential random variables with common rate $\lambda$, then $\sum_{i=1}^n X_i$ is a gamma $(n, λ)$ random variable. That is, its density function is:
$$
f(t) = \lambda e^{-\lambda t} \frac{(\lambda t)^{n-1}}{(n-1)!}, \quad t > 0
$$


We next determine the probability that one exponential random variable is smaller than another. That is, suppose that $X_1$ and $X_2$ are independent exponential random variables with respective means $1/\lambda_1$ and $1/\lambda_2$; what is $P\{X_1 < X_2\}$? Conditioning on $X_1$ and using $P\{X_2 > x\} = e^{-\lambda_2 x}$:
$$
P\{X_1 < X_2\} = \int_0^\infty P\{X_1 < X_2 \mid X_1 = x\} \lambda_1 e^{-\lambda_1 x} \, dx = \int_0^\infty \lambda_1 e^{-(\lambda_1 + \lambda_2)x} \, dx
$$
$$
= \frac{\lambda_1}{\lambda_1 + \lambda_2}
$$

Suppose that $X_1,X_2, \dots, X_n$ are independent exponential random variables, with $X_i$ having rate $μ_i$, $i = 1, \dots, n$. It turns out that the smallest of the $X_i$ is exponential with a rate equal to the sum of the $μ_i$.

$$
P\{\min(X_1, \ldots, X_n) > x\} = P\{X_i > x \text{ for each } i = 1, \ldots, n\}
$$
$$
= \prod_{i=1}^n e^{-\mu_i x} 
$$
$$
= \exp \Biggl\{- \left(\sum_{i=1}^n \mu_i \right)x \Biggr\}
$$

**Proposition 5.2:** If $X_1, \dots, X_n$ are independent exponential random variables with respective rates $\lambda_1, \dots, \lambda_n$, then $\min_i(X_i)$ is exponential with rate $\sum_{i = 1}^n \lambda_i$. Further, $\min_i(X_i)$ and the rank order of the variables $X_1,X_2, \dots, X_n$ are independent.
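A small Monte Carlo experiment can illustrate both the formula for $P\{X_1 < X_2\}$ and the rate of the minimum. This is a sketch under assumed rates $\lambda_1 = 1$, $\lambda_2 = 3$; the seed and sample size are arbitrary choices.

```python
import random

def simulate_min(rates, trials=100_000, seed=0):
    """Return (average of the minimum, fraction of trials in which X_1 is the minimum)."""
    rng = random.Random(seed)
    total_min = 0.0
    first_is_min = 0
    for _ in range(trials):
        xs = [rng.expovariate(lam) for lam in rates]
        m = min(xs)
        total_min += m
        if xs[0] == m:
            first_is_min += 1
    return total_min / trials, first_is_min / trials

rates = [1.0, 3.0]                      # assumed rates for illustration
mean_min, p_first = simulate_min(rates)
print(mean_min, 1 / sum(rates))         # sample mean vs. 1 / (lambda_1 + lambda_2)
print(p_first, rates[0] / sum(rates))   # sample fraction vs. lambda_1 / (lambda_1 + lambda_2)
```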

**Example 5.10:** Suppose that customers are in line to receive service that is provided sequentially by a server; whenever a service is completed, the next person in line enters the service facility. However, each waiting customer will only wait an exponentially distributed time with rate $\theta$; if its service has not yet begun by this time then it will immediately depart the system. These exponential times, one for each waiting customer, are independent. In addition, the service times are independent exponential random variables with rate $\mu$. Suppose that someone is presently being served and consider the person who is $n$th in line.
>> (a) Find $P_n$, the probability that this customer is eventually served. \
>> (b) Find $W_n$, the conditional expected amount of time this person spends waiting in line given that she is eventually served

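- consider the $n + 1$ random variables consisting of the remaining service time of the person in service (rate $\mu$) and the $n$ additional exponential departure times (rate $\theta$) of the first $n$ in line
- if the smallest of these $n+1$ independent exponentials is the departure time of the $n$th person in line, which happens with probability $\theta/(n\theta + \mu)$, the conditional probability that this person will be served is 0
- given that this person’s departure time is not the smallest, the conditional probability that this person will be served is the same as if it were initially in position $n - 1$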

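a:
$$
P_n = \frac{(n - 1)\theta + \mu}{n\theta + \mu} P_{n-1}
$$
Applying the same relation with $n - 1$ in place of $n$:
$$
P_n = \frac{(n - 1)\theta + \mu}{n\theta + \mu} \cdot \frac{(n - 2)\theta + \mu}{(n - 1)\theta + \mu} P_{n-2} = \frac{(n - 2)\theta + \mu}{n\theta + \mu} P_{n-2}
$$
Continuing in this fashion, and using $P_1 = \mu/(\theta + \mu)$:
$$
P_{n} = \frac{\theta + \mu}{n\theta + \mu} P_{1} = \frac{\mu}{n\theta + \mu}
$$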

b:
- use the fact that the minimum of independent exponentials is, independent of their rank ordering, exponential with a rate equal to the sum of the rates
- the time until the nth person in line enters service is the minimum of these n + 1 random variables plus the additional time thereafter

$$
W_n = \frac{1}{n\theta + \mu} + W_{n-1}
$$
$$
W_n = \sum_{i=1}^n \frac{1}{i\theta + \mu} 
$$
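The recursions of Example 5.10 are easy to iterate directly. In the sketch below the base case $P_0 = 1$ (the customer already in service is served) is an assumption used to start the recursion; it reproduces $P_1 = \mu/(\theta + \mu)$. The function names are illustrative.

```python
def prob_served(n, theta, mu):
    """P_n from the recursion P_k = ((k-1)*theta + mu) / (k*theta + mu) * P_{k-1}, P_0 = 1."""
    p = 1.0
    for k in range(1, n + 1):
        p *= ((k - 1) * theta + mu) / (k * theta + mu)
    return p

def expected_wait(n, theta, mu):
    """W_n as the sum of 1 / (i*theta + mu) for i = 1, ..., n."""
    return sum(1.0 / (i * theta + mu) for i in range(1, n + 1))

theta, mu = 0.5, 2.0                                      # hypothetical rates
print(prob_served(5, theta, mu), mu / (5 * theta + mu))   # recursion vs. closed form
print(expected_wait(5, theta, mu))
```

The product telescopes, which is why the recursion agrees with the closed form $\mu/(n\theta + \mu)$.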


### ***5.2.4 Convolutions of Exponential Random Variables***

Let $X_i, i = 1, \dots, n$, be independent exponential random variables with respective rates $\lambda_i, i = 1, \dots, n$, and suppose that $\lambda_i \ne \lambda_j$ for $i \ne j$. The random variable $\sum_{i=1}^n X_i$ is said to be a hypoexponential random variable. \
To compute its probability density function, start with the case $n = 2$:

$$
f_{X_1+X_2}(t) = \int_0^t f_{X_1}(s) f_{X_2}(t-s) \, ds = \int_0^t \lambda_1 e^{-\lambda_1 s} \lambda_2 e^{-\lambda_2 (t-s)} \, ds
$$
$$
= \frac{\lambda_1}{\lambda_1 - \lambda_2} \lambda_2 e^{-\lambda_2 t} \left(1 - e^{-(\lambda_1 - \lambda_2) t}\right)
$$
$$
= \frac{\lambda_1}{\lambda_1 - \lambda_2} \lambda_2 e^{-\lambda_2 t} + \frac{\lambda_2}{\lambda_2 - \lambda_1} \lambda_1 e^{-\lambda_1 t}
$$

For $n = 3$, the PDF is given by:
$$
f_{X_1+X_2+X_3}(t) = \sum_{i=1}^3 \lambda_i e^{-\lambda_i t} \left(\prod_{j \neq i} \frac{\lambda_j}{\lambda_j - \lambda_i}\right)
$$

General result for any $n$:
$$
f_{X_1+\cdots+X_n}(t) = \sum_{i=1}^n C_{i,n} \lambda_i e^{-\lambda_i t}
$$
where
$$
C_{i,n} = \prod_{j \neq i} \frac{\lambda_j}{\lambda_j - \lambda_i}
$$
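The general hypoexponential density can be evaluated directly from the coefficients $C_{i,n}$. The sketch below checks it against the $n = 2$ convolution result and verifies numerically that it integrates to roughly one; the rates used are arbitrary illustrative values, and `hypoexp_pdf` and `total_mass` are hypothetical helper names.

```python
import math

def hypoexp_pdf(t, rates):
    """Density of a sum of independent exponentials with distinct rates."""
    total = 0.0
    for i, li in enumerate(rates):
        c = 1.0                                   # C_{i,n} = prod_{j != i} l_j / (l_j - l_i)
        for j, lj in enumerate(rates):
            if j != i:
                c *= lj / (lj - li)
        total += c * li * math.exp(-li * t)
    return total

def total_mass(rates, upper=40.0, steps=40_000):
    """Midpoint-rule approximation of the integral of the density over [0, upper]."""
    h = upper / steps
    return h * sum(hypoexp_pdf((i + 0.5) * h, rates) for i in range(steps))

l1, l2, t = 1.0, 3.0, 0.7
direct = l1 * l2 / (l1 - l2) * (math.exp(-l2 * t) - math.exp(-l1 * t))  # n = 2 convolution
print(hypoexp_pdf(t, [l1, l2]), direct)
print(total_mass([1.0, 2.0, 3.0]))
```

Note that the formula breaks down (division by zero) when two rates coincide, which is why the rates are required to be distinct.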


**Example 5.11:** Let $X_1, \dots, X_m$ be independent exponential random variables with respective rates $\lambda_1, \dots, \lambda_m$, where $\lambda_i \ne \lambda_j$ when $i \ne j$. Let $N$ be independent of these random variables and suppose that $\sum_{n=1}^m P_n = 1$, where $P_n = P\{N = n\}$. The random variable: 

$$
Y = \sum_{j = 1}^N X_j
$$

is said to be a ***Coxian random variable***. \
Conditioning on N gives its density function:

$$
f_Y(t) = \sum_{n=1}^m f_Y(t \mid N = n) P_n = \sum_{n=1}^m P_n \left(\sum_{i=1}^n C_{i,n} \lambda_i e^{-\lambda_i t}\right)
$$

<br>

Let

$$
r(n) = P\{N = n \mid N \geq n\}
$$

If we interpret $N$ as a lifetime measured in discrete time periods, then $r(n)$ denotes the probability that an item will die in its $n$th period of use given that it has survived up to that time. Thus, $r(n)$ is the discrete time analog of the failure rate function $\lambda(t)$, and is correspondingly referred to as the discrete time failure (or hazard) rate function.

Using this interpretation, a Coxian random variable can be thought of as follows:
- suppose that an item must go through $m$ stages of treatment to be cured, where the time spent in stage $i$ is exponential with rate $\lambda_i$
- however, suppose that after each stage there is a probability that the item will quit the program
- the probability that an item that has just completed stage $n$ quits the program is (independent of how long it took to go through the $n$ stages) equal to $r(n)$
- the total time that an item spends in the program is a Coxian random variable

### ***5.2.5 The Dirichlet Distribution***

Consider an experiment with possible outcomes $1, 2, \ldots, n$, having respective probabilities $P_1, \ldots, P_n$, such that $\sum_{i=1}^n P_i = 1$. We assume a probability distribution on the vector $(P_1, \ldots, P_n)$. Because $\sum_{i=1}^n P_i = 1$, we cannot define a density on $P_1, \ldots, P_n$, but what we can do is to define one on $P_1, \ldots, P_{n-1}$ and then take $P_n = 1 - \sum_{i=1}^{n-1} P_i$. The Dirichlet distribution assumes that $(P_1, \ldots, P_{n-1})$ is uniformly distributed over the set $S = \{(p_1, \ldots, p_{n-1}) : \sum_{i=1}^{n-1} p_i < 1, 0 < p_i, i = 1, \ldots, n-1\}$. Thus, the Dirichlet joint density function is 
$$
f_{P_1, \ldots, P_{n-1}}(p_1, \ldots, p_{n-1}) = C, \text{ for } 0 < p_i, \text{ where } i = 1, \ldots, n-1, \text{ and } \sum_{i=1}^{n-1} p_i < 1
$$
Integrating the preceding density over the set $S$ yields
$$
1 = C \, P(U_1 + \ldots + U_{n-1} < 1)
$$
where $U_1, \ldots, U_{n-1}$ are independent uniform $(0, 1)$ random variables. Since $P(U_1 + \ldots + U_{n-1} < 1) = 1/(n-1)!$, it follows that $C = (n-1)!$.


**Proposition 5.3.** Let $X_1, \ldots, X_n$ be independent exponential random variables with rate $\lambda$, and let $S = \sum_{i=1}^n X_i$. Then, $\left( \frac{X_1}{S}, \frac{X_2}{S}, \ldots, \frac{X_{n-1}}{S} \right)$ has a Dirichlet distribution.

## **5.3 The Poisson Process**
### ***5.3.1 Counting Processes***

A stochastic process $\{N(t),t \geq 0\}$ is said to be a counting process if $N(t)$ represents the total number of “events” that occur by time $t$. 
- a: If we let $N(t)$ equal the number of persons who enter a particular store at or prior to time t, then $\{N(t),t ≥ 0\}$ is a counting process in which an event corresponds to a person entering the store. Note that if we had let $N(t)$ equal the number of persons in the store at time $t$, then $\{N(t),t ≥ 0\}$ would not be a counting process (why not?).
- b: If we say that an event occurs whenever a child is born, then $\{N(t),t ≥ 0\}$ is a counting process when $N(t)$ equals the total number of people who were born by time $t$. (Does $N(t)$ include persons who have died by time $t$? Explain why it must.)
- c: If $N(t)$ equals the number of goals that a given soccer player scores by time $t$, then $\{N(t),t \geq 0\}$ is a counting process. An event of this process will occur whenever the soccer player scores a goal.

<br>

A counting process $N(t)$ must satisfy:
- $N(t) \geq 0$
- $N(t)$ is integer valued
- If $s < t$, then $N(s) \leq N(t)$
- For $s < t$, $N(t) - N(s)$ equals the number of events that occur in the interval $(s,t]$

<br>

- A counting process is said to possess independent increments if the numbers of events that occur in disjoint time intervals are independent
  - number of events that occur by time 10 ($N(10)$) must be independent of the number of events that occur between times 10 and 15 ($N(15) - N(10)$)
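  - whether independent increments is a reasonable assumption depends on the application: it is reasonable for example *a*, but probably not for example *b* (a large $N(t)$ suggests a large population at time $t$, and hence more births afterwards)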
- A counting process is said to possess ***stationary increments*** if the distribution of the number of events that occur in any interval of time depends only on the length of the time interval
  - the process has stationary increments if the number of events in the interval $(s,s + t)$ has the same distribution for all $s$
  - example *a* can have reasonable assumption of stationary increments (if there were no times of day at which people were more likely to enter the store i.e no rush hours)

### ***5.3.2 Definition of the Poisson Process***

**Definition 5.1:** The function $f(·)$ is said to be $o(h)$ if

$$
\lim_{h \rightarrow 0} \frac{f(h)}{h} = 0
$$

<br>

**Example 5.12:**

> a) function $f(x) = x^2$ is $o(h)$

$$
\lim_{h \rightarrow 0} \frac{f(h)}{h} = \lim_{h \rightarrow 0} \frac{h^2}{h} = \lim_{h \rightarrow 0} h = 0
$$

> b) function $f(x) = x$ is not $o(h)$

$$
\lim_{h \rightarrow 0} \frac{f(h)}{h} = \lim_{h \rightarrow 0} \frac{h}{h} = 1 \ne 0
$$

> c) If $f(\cdot)$ is $o(h)$ and $g(\cdot)$ is $o(h)$, then so is $f(\cdot) + g(\cdot)$ \
> d) If $f(\cdot)$ is $o(h)$, then so is $g(\cdot) = cf(\cdot)$ \
> e) Any finite linear combination of functions, each of which is $o(h)$, is $o(h)$
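Parts (a) and (b) of the example can be illustrated by tabulating $f(h)/h$ for shrinking $h$. This is only a numerical illustration, not a proof; `ratio_table` and the grid of $h$ values are arbitrary choices.

```python
def ratio_table(f, hs=(1e-1, 1e-2, 1e-3, 1e-4)):
    """Return f(h) / h for a decreasing sequence of h values."""
    return [f(h) / h for h in hs]

square_ratios = ratio_table(lambda h: h**2)   # shrinks toward 0, consistent with o(h)
linear_ratios = ratio_table(lambda h: h)      # stays at 1, so f(h) = h is not o(h)
print(square_ratios)
print(linear_ratios)
```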

The $o(h)$ notation can be used to make statements more precise. For instance, if $X$ is continuous with density $f$ and failure rate function $\lambda(t)$, then the approximate statements
$$
P(t < X < t + h) \approx f(t)h
$$
$$
P(t < X < t + h \mid X > t) \approx \lambda(t)h
$$
can be precisely expressed as
$$
P(t < X < t + h) = f(t)h + o(h)
$$
$$
P(t < X < t + h \mid X > t) = \lambda(t)h + o(h)
$$

**Definition 5.2**: The counting process $\{N(t), t \geq 0\}$ is said to be a Poisson process with rate $\lambda > 0$ if the following axioms hold:
1. $N(0) = 0$
2. $\{N(t), t \geq 0\}$ has independent increments
3. $P(N(t + h) - N(t) = 1) = \lambda h + o(h)$
4. $P(N(t + h) - N(t) \geq 2) = o(h)$

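For $s > 0$, let $N_s(t) = N(s + t) - N(s)$ count the events in the first $t$ time units after time $s$, and let $T_n$ denote the time between the $(n-1)$st and the $n$th event (the interarrival times).

**Lemma 5.1:** $\{N_s(t),t \geq 0\}$ is a Poisson process with rate $\lambda$ \
**Lemma 5.2:** If $T_1$ is the time of the first event of the Poisson process $\{N(t),t \geq 0\}$, then $P(T_1 > t) = P(N(t) = 0) = e^{-\lambda t}$ \
**Proposition 5.4:** The interarrival times $T_1,T_2, \dots$ are independent and identically distributed exponential random variables with rate $\lambda$ 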

<br>

Another quantity of interest is $S_n$, the time of the $n$-th event. Because the interarrival times are the times between successive events, it is easily seen that
$$
S_n = \sum_{i=1}^n T_i, \quad n \geq 1
$$
Thus, from Propositions 5.4 and 5.1, it follows that $S_n$ is a gamma $(n, \lambda)$ random variable with density function
$$
f_{S_n}(s) = \lambda e^{-\lambda s} (\lambda s)^{n-1} \frac{1}{(n-1)!}, \quad s > 0
$$

<br>

**Theorem 5.1:** If $\{N(t),t \geq 0\}$ is a Poisson process with rate $\lambda$, then $N(t)$ is a Poisson random variable with mean $\lambda t$. That is, 
$$
P(N(t) = n) = e^{-\lambda t}\frac{(\lambda t)^n}{n!}, \quad n \geq 0
$$

- A counting process for which the distribution of the number of events in an interval depends only on the length of the interval and not its location is said to have stationary increments. Thus, a Poisson process has stationary increments
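The gamma density of $S_n$ and the Poisson distribution of $N(t)$ are linked by the fact that the $n$-th event has occurred by time $t$ exactly when at least $n$ events have occurred by time $t$, so $P\{S_n \leq t\} = P\{N(t) \geq n\}$. The sketch below checks this numerically; the parameter values and the midpoint-rule integration are illustrative assumptions.

```python
import math

def poisson_pmf(n, lam, t):
    """P(N(t) = n) for a Poisson process with rate lam."""
    return math.exp(-lam * t) * (lam * t) ** n / math.factorial(n)

def gamma_pdf(s, n, lam):
    """Density of S_n, a gamma(n, lam) random variable."""
    return lam * math.exp(-lam * s) * (lam * s) ** (n - 1) / math.factorial(n - 1)

def prob_sn_le_t(n, lam, t, steps=100_000):
    """P{S_n <= t} by midpoint-rule integration of the gamma density."""
    h = t / steps
    return h * sum(gamma_pdf((i + 0.5) * h, n, lam) for i in range(steps))

lam, t, n = 2.0, 3.0, 4                                      # assumed values
lhs = prob_sn_le_t(n, lam, t)                                # P{S_n <= t}
rhs = 1.0 - sum(poisson_pmf(k, lam, t) for k in range(n))    # P{N(t) >= n}
print(lhs, rhs)
```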

**Example 5.13:** Suppose that people immigrate into a territory according to a Poisson process with rate $\lambda = 2$ per day.
> a) Find the probability there are 10 arrivals in the following week (of 7 days) \
> b) Find the expected number of days until there have been 20 arrivals

<br>

- a: number of arrivals in 7 days is Poisson with mean $7 \lambda = 14$,  probability there will be 10 arrivals is $e^{−14}(14)^{10}/10!$
- b: $E[S_{20}] = 20/\lambda = 10$ days
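Both answers of Example 5.13 reduce to one-line computations, sketched here with illustrative variable names.

```python
import math

lam = 2.0                                   # arrivals per day
mean_week = lam * 7                         # mean number of arrivals in 7 days
p_ten = math.exp(-mean_week) * mean_week**10 / math.factorial(10)   # P(N(7) = 10)
expected_days_to_20 = 20 / lam              # E[S_20], mean of a gamma(20, lam)
print(p_ten, expected_days_to_20)
```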


### ***5.3.3 Further Properties of Poisson Processes***

Consider a Poisson process $\{N(t), t \geq 0\}$ having rate $\lambda$, and suppose that each time an event occurs, it is classified as either a type I or a type II event. Suppose further that each event is classified as a type I event with probability $p$ or a type II event with probability $1-p$, independently of all other events. For example, suppose that customers arrive at a store in accordance with a Poisson process having rate $\lambda$; and suppose that each arrival is male with probability $\frac{1}{2}$ and female with probability $\frac{1}{2}$. Then a type I event would correspond to a male arrival and a type II event to a female arrival.

Let $N_1(t)$ and $N_2(t)$ denote respectively the number of type I and type II events occurring in $[0, t]$. Note that $N(t) = N_1(t) + N_2(t)$.

**Proposition 5.5:**  $\{N_1(t), t \geq 0\}$ and  $\{N_2(t), t \geq 0\}$ are both Poisson processes having respective rates $λp$ and $λ(1 − p)$. Furthermore, the two processes are independent.

**Example 5.14:** If immigrants to area A arrive at a Poisson rate of ten per week, and if each immigrant is of English descent with probability $1/12$, then what is the probability that no people of English descent will emigrate to area A during the month of February?

>> the number of Englishmen emigrating to area A during the month of February is Poisson distributed with mean $4 \times 10 \times \frac{1}{12} = \frac{10}{3}$ $\implies$ probability is $e^{-10/3}$ 
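The thinning result of Proposition 5.5 can be illustrated by simulating Example 5.14. The sketch below generates arrivals from exponential interarrival times, classifies each one independently, and estimates the probability of no type I arrivals. It treats February as exactly four weeks, as in the calculation above; the seed and number of trials are arbitrary.

```python
import math
import random

def count_type1(rate, horizon, p, rng):
    """Simulate one path on [0, horizon]; return the number of type I events."""
    t = rng.expovariate(rate)
    count = 0
    while t <= horizon:
        if rng.random() < p:        # each event is type I with probability p
            count += 1
        t += rng.expovariate(rate)
    return count

rng = random.Random(1)
trials = 20_000
zero_runs = sum(count_type1(10.0, 4.0, 1 / 12, rng) == 0 for _ in range(trials))
estimate = zero_runs / trials
exact = math.exp(-10 / 3)           # P(no type I events) for a Poisson mean of 10/3
print(estimate, exact)
```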

**Example 5.15:** Suppose nonnegative offers to buy an item that you want to sell arrive according to a Poisson process with rate λ. Assume that each offer is the value of a continuous random variable having density function f (x). Once the offer is presented to you, you must either accept it or reject it and wait for the next offer. We suppose that you incur costs at a rate c per unit time until the item is sold, and that your objective is to maximize your expected total return, where the total return is equal to the amount received minus the total cost incurred. Suppose you employ the policy of accepting the first offer that is greater than some specified value y. (Such a type of policy, which we call a y-policy, can be shown to be optimal.) What is the best value of y? What is the maximal expected net return?

- $X$ is the value of a random offer
- $\bar{F}(x) = P\{X > x\} = \int_{x}^{\infty} f(u) \, du$ is $X$'s tail distribution function
- each offer is greater than $y$ with probability $\bar{F}(y)$, so by Proposition 5.5 such offers occur according to a Poisson process with rate $\lambda\bar{F}(y)$
  - the time until an offer is accepted is therefore an exponential random variable with rate $\lambda\bar{F}(y)$
- $R(y)$ denotes the total return from the policy that accepts the first offer that is greater than $y$

$$
E[R(y)] = E[\text{accepted offer}] - cE[\text{time to accept}]
= E[X \mid X > y] - \frac{c}{\lambda \overline{F}(y)}
$$
$$
= \int_0^\infty x f_{X \mid X > y}(x) \, dx - \frac{c}{\lambda \overline{F}(y)}
$$
$$
= \int_y^\infty \frac{x f(x)}{\overline{F}(y)} \, dx - \frac{c}{\lambda \overline{F}(y)}
$$
$$
= \frac{1}{\overline{F}(y)} \left(\int_y^\infty x f(x) \, dx - \frac{c}{\lambda} \right)
$$
$$
\frac{d}{dy}E[R(y)] = 0 \implies \text{the optimal value of y satisfies} \quad y \bar{F}(y) = \int_{y}^{\infty} x f(x) dx - \frac{c}{\lambda}
$$

Hence, the optimal policy is the one that accepts the first offer that is greater than $y^∗$, where $y^∗$ is such that
$$
\int_{y^*}^{\infty} (x - y^*) f(x) dx = \frac{c}{\lambda}
$$

$$
\implies
$$

$$
E[R(y^*)] = y^*
$$

- the optimal critical value is also the maximal expected net return
- note that when an offer is rejected the problem basically starts anew and so the maximal expected additional net return from then on is the maximal expected net return
  - implies that it is optimal to accept an offer if and only if it is at least as large as the maximal expected additional net return
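As a concrete instance of the optimality condition, suppose (purely as an assumption for illustration) that offers are exponential with rate $\mu$. Then $\int_{y}^{\infty} (x - y) f(x)\,dx = e^{-\mu y}/\mu$, so $y^* = \frac{1}{\mu}\log\frac{\lambda}{c\mu}$ whenever $\lambda > c\mu$. The sketch below solves the condition by bisection with a generic numerical integral and compares the result with this closed form; the helper names and parameter values are hypothetical.

```python
import math

def expected_excess(y, pdf, upper, steps=20_000):
    """Midpoint-rule approximation of the integral of (x - y) f(x) over [y, upper]."""
    h = (upper - y) / steps
    return h * sum(((i + 0.5) * h) * pdf(y + (i + 0.5) * h) for i in range(steps))

def optimal_threshold(pdf, c, lam, upper, iters=40):
    """Bisection for the y at which the expected excess equals c / lam."""
    lo, hi = 0.0, upper
    for _ in range(iters):
        mid = (lo + hi) / 2
        if expected_excess(mid, pdf, upper) > c / lam:
            lo = mid          # excess still too large: raise the threshold
        else:
            hi = mid
    return (lo + hi) / 2

mu, c, lam = 1.0, 0.5, 2.0    # assumed offer rate, holding cost rate, arrival rate
y_star = optimal_threshold(lambda x: mu * math.exp(-mu * x), c, lam, upper=50.0)
closed_form = math.log(lam / (c * mu)) / mu
print(y_star, closed_form)
```

Bisection works here because the expected excess is decreasing in $y$; the upper limit of 50 simply truncates a negligible tail.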

**Example 5.17 (The Coupon Collecting Problem):** [wiki link](https://en.wikipedia.org/wiki/Coupon_collector%27s_problem)

The probability that n events occur in one Poisson process before m events have occurred in a second and independent Poisson process: 

Let $\{N_1(t),t \geq 0\}$ and $\{N_2(t),t \geq 0\}$ be two independent Poisson processes having respective rates $\lambda_1$ and $\lambda_2$. Also, let $S_n^1$ denote the time of the $n$-th event of the first process, and $S_m^2$ the time of the $m$-th event of the second process. We seek:

$$
P\{S_n^1 < S_m^2 \}
$$

n = m = 1 case:

$$
P\{S_1^1 < S_1^2 \} = \frac{\lambda_1}{\lambda_1 + \lambda_2}
$$

The case in which two events occur in the $N_1(t)$ process before a single event has occurred in the $N_2(t)$ process:

$$
P\{S_2^1 < S_1^2 \} = \left(\frac{\lambda_1}{\lambda_1 + \lambda_2} \right)^2
$$

Each event that occurs is going to be an event of the $N_1(t)$ process with probability $\lambda_1/(\lambda_1 + \lambda_2)$ or an event of the $N_2(t)$ process with probability $\lambda_2/(\lambda_1 + \lambda_2)$, independent of all that has previously occurred. \
The probability that the $N_1(t)$ process reaches $n$ before the $N_2(t)$ process reaches $m$ is just the probability that $n$ heads will appear before $m$ tails if one flips a coin having probability $p = \lambda_1/(\lambda_1 + \lambda_2)$ of a head appearing:

$$
P\{S_n^1 < S_m^2 \} = \sum_{k=n}^{n+m-1} \binom{n+m-1}{k} \left(\frac{\lambda_1}{\lambda_1 + \lambda_2}\right)^k \left(\frac{\lambda_2}{\lambda_1 + \lambda_2}\right)^{n+m-1-k}
$$
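The binomial sum above is straightforward to evaluate, and the two special cases worked out earlier provide a quick check. The function name `prob_n_before_m` and the rates used are illustrative.

```python
from math import comb

def prob_n_before_m(n, m, lam1, lam2):
    """P{S_n^1 < S_m^2}: at least n of the first n + m - 1 events come from process 1."""
    p = lam1 / (lam1 + lam2)
    total = n + m - 1
    return sum(comb(total, k) * p**k * (1 - p) ** (total - k) for k in range(n, total + 1))

lam1, lam2 = 1.0, 3.0                               # assumed rates
print(prob_n_before_m(1, 1, lam1, lam2))            # should match lam1 / (lam1 + lam2)
print(prob_n_before_m(2, 1, lam1, lam2))            # should match (lam1 / (lam1 + lam2))**2
print(prob_n_before_m(3, 4, lam1, lam2))
```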

### ***5.3.4 Conditional Distribution of the Arrival Times***

Suppose we are told that exactly one event of a Poisson process has taken place by time $t$, and we are asked to determine the distribution of the time at which the event occurred. Now, since a Poisson process possesses stationary and independent increments it seems reasonable that each interval in $[0,t]$ of equal length should have the same probability of containing the event. In other words, the time of the event should be uniformly distributed over $[0,t]$. Indeed, for $s \leq t$:

$$
P\{T_1 < s \mid N(t) = 1\} = \frac{P\{1 \text{ event in } [0,s),\ 0 \text{ events in } [s,t]\}}{P\{N(t) = 1\}} = \frac{\lambda s e^{-\lambda s} \, e^{-\lambda (t - s)}}{\lambda t e^{-\lambda t}} = \frac{s}{t}
$$
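The uniformity claim can be illustrated by rejection sampling: simulate the first two interarrival times, keep only the runs with exactly one event in $[0, t]$, and look at where that event fell. This is a sketch with assumed values $\lambda = 1$, $t = 1$ and an arbitrary seed.

```python
import random

def first_arrival_given_one_event(lam, t, trials=50_000, seed=2):
    """Samples of T_1 restricted to runs with exactly one event in [0, t]."""
    rng = random.Random(seed)
    kept = []
    for _ in range(trials):
        t1 = rng.expovariate(lam)            # time of the first event
        t2 = t1 + rng.expovariate(lam)       # time of the second event
        if t1 <= t < t2:                     # exactly one event by time t
            kept.append(t1)
    return kept

samples = first_arrival_given_one_event(lam=1.0, t=1.0)
frac_below = sum(x < 0.3 for x in samples) / len(samples)
print(len(samples), frac_below)              # fraction should be near s / t = 0.3
```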

Let $Y_1, Y_2, \ldots, Y_n$ be $n$ random variables. We say that $Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)}$ are the order statistics corresponding to $Y_1, Y_2, \ldots, Y_n$ if $Y_{(k)}$ is the $k$-th smallest value among $Y_1, \ldots, Y_n$, $k = 1, 2, \ldots, n$. For instance, if $n = 3$ and $Y_1 = 4$, $Y_2 = 5$, $Y_3 = 1$, then $Y_{(1)} = 1$, $Y_{(2)} = 4$, $Y_{(3)} = 5$. If the $Y_i$, $i = 1, \ldots, n$, are independent identically distributed continuous random variables with probability density $f$, then the joint density of the order statistics $Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)}$ is given by
$$
f(y_1, y_2, \ldots, y_n) = n! \prod_{i=1}^n f(y_i), \text{ where } y_1 < y_2 < \ldots < y_n
$$
The preceding follows since:
1. $(Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)})$ will equal $(y_1, y_2, \ldots, y_n)$ if $(Y_1, Y_2, \ldots, Y_n)$ is equal to any of the $n!$ permutations of $(y_1, y_2, \ldots, y_n)$;
2. The probability density that $(Y_1, Y_2, \ldots, Y_n)$ is equal to $(y_{i_1}, \ldots, y_{i_n})$ is $\prod_{j=1}^n f(y_{i_j}) = \prod_{j=1}^n f(y_j)$ when $i_1, \ldots, i_n$ is a permutation of $1, 2, \ldots, n$.

If the $Y_i$, $i = 1, \ldots, n$, are uniformly distributed over $(0, t)$, then we obtain from the preceding that the joint density function of the order statistics $Y_{(1)}, Y_{(2)}, \ldots, Y_{(n)}$ is
$$
f(y_1, y_2, \ldots, y_n) = \frac{n!}{t^n}, \text{ for } 0 < y_1 < y_2 < \ldots < y_n < t
$$


***Theorem 5.2:*** Given that $N(t) = n$, the $n$ arrival times $S_1, \dots, S_n$ have the same distribution as the order statistics corresponding to $n$ independent random variables uniformly distributed on the interval $(0,t)$.
- Suppose that there are $k$ possible types of events and that the probability that an event is classified as a type $i$ event, $i = 1,\dots,k$, depends on the time the event occurs. Specifically, suppose that if an event occurs at time $y$ then it will be classified as a type $i$ event, independently of anything that has previously occurred, with probability $P_i(y), i = 1, \dots, k$, where $\sum_{i=1}^k P_i(y) = 1$. Using Theorem 5.2, we can prove the following useful proposition:

**Proposition 5.6:** If $N_i(t), i =1, \dots, k$, represents the number of type $i$ events occurring by time $t$ then $N_i(t), i = 1, \dots, k$, are independent Poisson random variables having means:
$$
E[N_i(t)] = λ \int_0^t P_i(s)ds
$$


**Example 5.18:** (An Infinite Server Queue). Suppose that customers arrive at a service station in accordance with a Poisson process with rate $λ$. Upon arrival the customer is immediately served by one of an infinite number of possible servers, and the service times are assumed to be independent with a common distribution $G$. What is the distribution of $X(t)$, the number of customers that have completed service by time $t$? What is the distribution of $Y(t)$, the number of customers that are being served at time $t$? 

Consider a customer who arrives at time $s \leq t$:
- type I: the customer has completed service by time $t$ >> the service time is less than $t - s$ >> this has probability $G(t - s)$
- type II: the customer has not completed service by time $t$ >> the service time exceeds $t - s$ >> this has probability $\bar{G}(t - s) = 1 - G(t - s)$

From Prop 5.6, the distribution of $X(t)$, the number of customers that have completed service by time $t$, is Poisson distributed with mean:

$$
E[X(t)] = λ \int_0^t G(t − s) ds = λ \int_0^t G(y)dy
$$

The distribution of $Y(t)$, the number of customers being served at time $t$ is Poisson with mean: 

$$
E[Y(t)] = λ \int_0^t \bar{G}(t − s) ds = λ \int_0^t \bar{G}(y)dy
$$

- $X(t)$ and $Y(t)$ are independent
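For a concrete feel for these means, assume (for illustration only) exponential service times with rate $\mu$, so that $G(y) = 1 - e^{-\mu y}$ and $E[Y(t)] = \lambda(1 - e^{-\mu t})/\mu$. The sketch below evaluates both integrals numerically; `integrate` is a hypothetical midpoint-rule helper and the parameter values are made up.

```python
import math

def integrate(f, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

lam, mu, t = 3.0, 0.5, 4.0                       # assumed arrival rate, service rate, time

def G(y):
    """Assumed service-time distribution function (exponential with rate mu)."""
    return 1.0 - math.exp(-mu * y)

mean_completed = lam * integrate(G, 0.0, t)                        # E[X(t)]
mean_in_service = lam * integrate(lambda y: 1.0 - G(y), 0.0, t)    # E[Y(t)]
print(mean_completed, mean_in_service, mean_completed + mean_in_service)
```

The two means add up to $\lambda t$, the expected number of arrivals by time $t$, as they must since every arrival is either type I or type II.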


Suppose now that we are interested in computing the joint distribution of $Y(t)$ and $Y(t + s)$ — that is, the joint distribution of the number in the system at time $t$ and at time $t + s$. To accomplish this, say that an arrival is
- type 1: if he arrives before time t and completes service between t and t + s,
- type 2: if he arrives before t and completes service after t + s,
- type 3: if he arrives between t and t + s and completes service after t + s,
- type 4: otherwise

Hence, an arrival at time $y$ will be type $i$ with probability $P_i(y)$ given by:
$$
P_1(y) = 
\begin{cases} 
G(t + s - y) - G(t - y), & \text{if } y < t \\
0, & \text{otherwise}
\end{cases}
$$

$$
P_2(y) = 
\begin{cases} 
\bar{G}({t} + s - y), & \text{if } y < t \\
0, & \text{otherwise}
\end{cases}
$$

$$
P_3(y) = 
\begin{cases} 
\bar{G}({t} + s - y), & \text{if } t < y < t + s \\
0, & \text{otherwise}
\end{cases}
$$

$$
P_4(y) = 1 - P_1(y) - P_2(y) - P_3(y)
$$


From Prop 5.6, $N_i = N_i(s + t), i = 1, 2, 3$, are independent Poisson random variables with respective means:
$$
E[N_i] = λ \int_0^{t+s} P_i(y)dy, \quad i = 1, 2, 3
$$

- $Y(t) = N_1 + N_2$
- $Y(t + s) = N_2 + N_3$


Covariance of $Y(t)$ and $Y(t + s)$:
$$
\text{Cov}[Y(t),Y(t + s)] = \text{Cov}(N_1 + N_2, N_2 + N_3)
$$
$$
= \text{Cov}(N_2, N_2) 
$$
- by independence of $N_1, N_2, N_3$
$$
= \text{Var}(N_2)
$$
$$
= \lambda \int_0^t \bar{G}(t + s - y) \, dy = \lambda \int_0^t \bar{G}(u + s) \, du
$$
- since the variance of a Poisson random variable equals its mean


The joint distribution of $Y(t)$ and $Y(t + s)$ is as follows:
$$
P\{Y(t) = i, Y(t + s) = j\} = P\{N_1 + N_2 = i, N_2 + N_3 = j\}
$$
$$
= \sum_{l=0}^{\min(i,j)} P\{N_2 = l, N_1 = i - l, N_3 = j - l\}
$$
$$
= \sum_{l=0}^{\min(i,j)} P\{N_2 = l\} P\{N_1 = i - l\} P\{N_3 = j - l\}
$$
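The sum above can be evaluated once the three Poisson means are known. In the sketch below the means are hypothetical numbers rather than values computed from a particular $G$; the check is that the joint probabilities sum to one and that the covariance comes out equal to $E[N_2]$, in line with the covariance computation earlier in this example.

```python
import math

def poisson_pmf(k, mean):
    """P(N = k) for a Poisson random variable with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def joint_pmf(i, j, m1, m2, m3):
    """P{Y(t) = i, Y(t+s) = j} with N1, N2, N3 independent Poisson (means m1, m2, m3)."""
    return sum(
        poisson_pmf(l, m2) * poisson_pmf(i - l, m1) * poisson_pmf(j - l, m3)
        for l in range(min(i, j) + 1)
    )

m1, m2, m3 = 1.0, 2.0, 1.5     # hypothetical values of E[N_1], E[N_2], E[N_3]
K = 40                         # truncation point; the tail beyond it is negligible here
table = {(i, j): joint_pmf(i, j, m1, m2, m3) for i in range(K) for j in range(K)}

total = sum(table.values())
e_i = sum(i * p for (i, j), p in table.items())
e_j = sum(j * p for (i, j), p in table.items())
e_ij = sum(i * j * p for (i, j), p in table.items())
cov = e_ij - e_i * e_j
print(total, cov)
```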


Given $S_n$, the time of the $n$-th event, the first $n - 1$ event times are distributed as the ordered values of a set of $n - 1$ random variables uniformly distributed on $(0,S_n)$.

**Proposition 5.7.** Given that $S_n = t$, the set $S_1, \ldots, S_{n-1}$ has the distribution of a set of $n - 1$ independent uniform $(0, t)$ random variables.

**Proof.** We can prove the preceding in the same manner as we did for Theorem 5.2, or we can argue more loosely as follows:
$$
S_1, \ldots, S_{n-1} \mid S_n = t \sim S_1, \ldots, S_{n-1} \mid S_n = t, N(t^-) = n - 1 \sim S_1, \ldots, S_{n-1} \mid N(t^-) = n - 1
$$
where $\sim$ means "has the same distribution as" and $t^-$ is infinitesimally smaller than $t$.

