## Motivating Example

Xavier and Yolanda head to the roulette table at a casino. They both place bets on red on 3 spins of the roulette wheel before Xavier has to leave. After Xavier leaves, Yolanda places bets on red on 2 more spins of the wheel. Let X be the number of bets that Xavier wins and Y be the number that Yolanda wins.

We know that $X \sim \text{Binomial}(n=3, p=\frac{18}{38})$, since 18 of the 38 pockets on an American roulette wheel are red, so its p.m.f. is

$$
    f(x) = \binom{3}{x} \left(\frac{18}{38}\right)^x \left(1 - \frac{18}{38}\right)^{3-x}
$$

which we can write in tabular form as

| x | 0 | 1 | 2 | 3 |
| --- | --- | --- | --- | --- |
| f(x) | 0.1458 | 0.3936 | 0.3543 | 0.1063 |

We also know that $Y \sim \text{Binomial}(n=5, p=\frac{18}{38})$, so its p.m.f. is

| y | 0 | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- | --- |
| f(y) | 0.0404 | 0.1817 | 0.3271 | 0.2944 | 0.1325 | 0.0238 |
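
The two tables above can be reproduced in a few lines of Python. This is a minimal sketch that assumes SciPy is installed; the helper name `binomial_table` and the constant `P_RED` are ours, introduced only for illustration.

```python
from scipy.stats import binom

P_RED = 18 / 38  # probability that a single bet on red wins


def binomial_table(n, p=P_RED):
    """Return [P(K=0), ..., P(K=n)] for K ~ Binomial(n, p), rounded to 4 places."""
    return [round(float(binom.pmf(k, n, p)), 4) for k in range(n + 1)]


print(binomial_table(3))  # p.m.f. of X (Xavier, 3 spins)
print(binomial_table(5))  # p.m.f. of Y (Yolanda, 5 spins)
```

Each list is a marginal p.m.f. on its own; neither says anything about how X and Y move together.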

But this does not tell us how X and Y are related to each other. In fact, the two random variables have a very distinctive relationship. For example, Y must be greater than or equal to X, since Yolanda made the same three bets that Xavier did, plus two more. In this lesson, we will learn a way to describe the distribution of two (or more) random variables.




## Theory

- So far, we have learnt how to deal with independent binomial random variables
- But in the motivating example, it is clear that X and Y are NOT independent. Yolanda cannot win fewer bets than Xavier, because they both bet on red on the same first three spins!
    - To see how X and Y are related to each other, we can lay out their possible values in a 2D table

- Definition 18.1: The joint distribution of two random variables X and Y describes how they behave together. It is specified by the joint p.m.f.
    - $f(x,y) = P(X=x, Y=y)$

### Example 18.1

- Example 18.1: Let’s work out the joint p.m.f. of X, the number of bets that Xavier wins, and Y, the number of bets that Yolanda wins. To do this, we will lay out the values of $f(x,y)$ in a table

$$
\begin{equation}
\begin{array}{rr|cccc}
  & 5 & f(0, 5) & f(1, 5) & f(2, 5) & f(3, 5) \\
  & 4 & f(0, 4) & f(1, 4) & f(2, 4) & f(3, 4) \\
y & 3 & f(0, 3) & f(1, 3) & f(2, 3) & f(3, 3) \\
  & 2 & f(0, 2) & f(1, 2) & f(2, 2) & f(3, 2) \\
  & 1 & f(0, 1) & f(1, 1) & f(2, 1) & f(3, 1) \\
  & 0 & f(0, 0) & f(1, 0) & f(2, 0) & f(3, 0) \\
\hline
& & 0 & 1 & 2 & 3\\
& &   & & x
\end{array}
\end{equation}
$$

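- Observe that some combinations in the table above are impossible. For example, y cannot be 0 if x is non-zero! In general, $f(x,y)$ can be non-zero only when $x \leq y \leq x + 2$, because Yolanda wins every bet that Xavier wins, plus at most 2 more. Let's set the impossible ones to 0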

$$
\begin{equation}
\begin{array}{rr|cccc}
  & 5 & 0 & 0 & 0 & f(3, 5) \\
  & 4 & 0 & 0 & f(2, 4) & f(3, 4) \\
y & 3 & 0 & f(1, 3) & f(2, 3) & f(3, 3) \\
  & 2 & f(0, 2) & f(1, 2) & f(2, 2) & 0 \\
  & 1 & f(0, 1) & f(1, 1) & 0 & 0 \\
  & 0 & f(0, 0) & 0 & 0 & 0 \\
\hline
& & 0 & 1 & 2 & 3\\
& &   & & x
\end{array}
\end{equation}
$$
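
One way to double-check which cells are impossible is to enumerate all $2^5$ win/lose sequences for the five spins and record the $(x, y)$ pairs that actually occur. The sketch below uses only the standard library; the variable names are ours.

```python
from itertools import product

# Each spin is recorded as 1 (red, so the bet wins) or 0 (not red).
possible_pairs = set()
for spins in product([0, 1], repeat=5):
    x = sum(spins[:3])  # Xavier's wins: first 3 spins only
    y = sum(spins)      # Yolanda's wins: all 5 spins
    possible_pairs.add((x, y))

# Every pair that never occurs must have f(x, y) = 0.
impossible_pairs = sorted(
    (x, y) for x in range(4) for y in range(6) if (x, y) not in possible_pairs
)
print(sorted(possible_pairs))
print(impossible_pairs)
```

The possible pairs are exactly those with $x \leq y \leq x + 2$, which matches the non-zero cells in the table.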

- So now we understand what a joint distribution is: it is the distribution of two random variables across all their possible pairs of values! In this case the values are discrete, but in other cases they may be continuous

- But how can we compute the joint p.m.f.? Let's take $f(1,2)$ in the table above as an example
    - We know that $f(1,2) = P(X=1 \text{ and } Y=2)$
    - $X=1$ means that there was 1 win in the first 3 spins
    - Since $Y=2$ counts the wins over all 5 spins, there must have been $y - x = 1$ win in the last 2 spins
    - Rewriting:

$$\begin{align}
    f(1,2) &= P(X=1 \text{ and } Y=2) \\
    &= P(\text{1 red in first 3 spins and 1 red in next 2 spins}) \\
    &= P(\text{1 red in first 3 spins}) \cdot P(\text{1 red in next 2 spins}) & \text{(spins are independent)} \\
    &= \binom{3}{1} \left(\frac{18}{38}\right)^1 \left(\frac{20}{38}\right)^2 \cdot \binom{2}{1} \left(\frac{18}{38}\right)^1 \left(\frac{20}{38}\right)^1 \\
    &\approx 0.1963
\end{align}$$
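
The product above can be evaluated directly. This is a small sketch using only the standard library (`math.comb` needs Python 3.8 or later); the variable names are ours.

```python
from math import comb

p = 18 / 38  # probability of red on one spin

first_three = comb(3, 1) * p**1 * (1 - p) ** 2  # exactly 1 red in spins 1-3
last_two = comb(2, 1) * p**1 * (1 - p) ** 1     # exactly 1 red in spins 4-5

f_1_2 = first_three * last_two
print(round(f_1_2, 4))  # 0.1963
```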

- Computing for all values in the table above:

In [27]:
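import pandas as pd
import scipy.stats  # import the submodule explicitly so scipy.stats.binom is available

X = list(range(4))  # possible values of x (Xavier's wins)
Y = list(range(6))  # possible values of y (Yolanda's wins)
joint = [(x, y) for x in X for y in Y]
probs = {}
for x, y in joint:
    if x > y or y - x > 2:
        # Impossible: Yolanda wins every bet Xavier wins, plus at most 2 more.
        probs[(x, y)] = 0
    else:
        # x wins in the first 3 spins, then y - x wins in the last 2 spins.
        probs[(x, y)] = scipy.stats.binom.pmf(n=3, p=18/38, k=x) * scipy.stats.binom.pmf(n=2, p=18/38, k=y-x)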


In [29]:
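import numpy as np
def dict_to_joint_table(probs: dict, X: list, Y: list):
    """Arrange {(x, y): probability} into a table with rows x and columns y."""
    probs_array = np.zeros((len(X), len(Y)))
    for key, val in probs.items():
        # key is an (x, y) tuple; it works as a 2D index because x and y start at 0
        probs_array[key] = val
    # A valid joint p.m.f. must sum to 1 (up to floating-point error).
    assert np.abs(np.sum(probs_array) - 1) < 0.0001
    return pd.DataFrame(probs_array)

dict_to_joint_table(probs, X, Y)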

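| x \ y | 0 | 1 | 2 | 3 | 4 | 5 |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.040386 | 0.072695 | 0.032713 | 0.0 | 0.0 | 0.0 |
| 1 | 0.0 | 0.109042 | 0.196276 | 0.088324 | 0.0 | 0.0 |
| 2 | 0.0 | 0.0 | 0.098138 | 0.176649 | 0.079492 | 0.0 |
| 3 | 0.0 | 0.0 | 0.0 | 0.029441 | 0.052995 | 0.023848 |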
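
A quick sanity check on the table above, sketched under the assumption that NumPy and SciPy are available. Summing each row over y should recover the Binomial(3, 18/38) p.m.f. of X, and summing each column over x should recover the Binomial(5, 18/38) p.m.f. of Y from the motivating example. The array name `joint_table` is ours, and the table is rebuilt here so the snippet runs on its own.

```python
import numpy as np
from scipy.stats import binom

p = 18 / 38

# Rebuild the joint table: rows are x = 0..3, columns are y = 0..5.
joint_table = np.zeros((4, 6))
for x in range(4):
    for y in range(x, x + 3):  # only x <= y <= x + 2 is possible
        joint_table[x, y] = binom.pmf(x, 3, p) * binom.pmf(y - x, 2, p)

x_marginal = joint_table.sum(axis=1)  # sum over y for each x
y_marginal = joint_table.sum(axis=0)  # sum over x for each y

print(np.allclose(x_marginal, binom.pmf(np.arange(4), 3, p)))
print(np.allclose(y_marginal, binom.pmf(np.arange(6), 5, p)))
```

Both checks should print `True`, which confirms that the joint table is consistent with the two individual distributions we started from.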


### Example 18.2

A fair coin is tossed 6 times. Let X be the number of heads in the first 3 tosses. Let Y be the number of heads in the last 3 tosses. Calculate the joint p.m.f. of X and Y, and use it to calculate $P(X+Y \leq 2)$

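- We know that X and Y are independent, because they count heads in disjoint sets of tosses (the first 3 and the last 3), and separate tosses do not influence each other

- We want to know the joint p.m.f. of X and Y, or $f(x,y)$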

- X and Y can both take on values [0,1,2,3]

- So the joint PMF can be represented as:

$$
\begin{array}{rr|cccc}
 & 3 & f(0,3) & f(1,3) & f(2,3) & f(3,3) \\
y  & 2 & f(0,2) & f(1,2) & f(2,2) & f(3,2) \\
 & 1 & f(0,1) & f(1,1) & f(2,1) & f(3,1) \\
 & 0 & f(0,0) & f(1,0) & f(2,0) & f(3,0) \\
\hline
& & 0 & 1 & 2 & 3 \\
& & & & x & 
\end{array}
$$

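- X and Y each follow a $\text{Binomial}(n=3, p=0.5)$ distribution, and by independence the joint p.m.f. is the product of the two individual p.m.f.s
    - $f(x,y) = \binom{3}{x} 0.5^x 0.5^{3-x} \cdot \binom{3}{y} 0.5^y 0.5^{3-y}$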

In [33]:
import scipy.stats

X = Y = list(range(4))  # x and y both range over 0..3
joint = [(x, y) for x in X for y in Y]
probs = {}
for x, y in joint:
    # By independence, f(x, y) is the product of the two Binomial(3, 1/2) p.m.f.s.
    probs[(x, y)] = scipy.stats.binom.pmf(n=3, p=1/2, k=x) * scipy.stats.binom.pmf(n=3, p=1/2, k=y)

# dict_to_joint_table is the helper defined in the earlier cell.
dict_to_joint_table(probs, X, Y)

| x \ y | 0 | 1 | 2 | 3 |
| --- | --- | --- | --- | --- |
| 0 | 0.015625 | 0.046875 | 0.046875 | 0.015625 |
| 1 | 0.046875 | 0.140625 | 0.140625 | 0.046875 |
| 2 | 0.046875 | 0.140625 | 0.140625 | 0.046875 |
| 3 | 0.015625 | 0.046875 | 0.046875 | 0.015625 |


- Using the joint PMF, we can simply sum the cells where X+Y <= 2!

In [38]:
np.sum([value for key, value in probs.items() if key[0]+key[1] <= 2])

0.3437500000000001

- Notice that X + Y is simply the total number of heads in all 6 tosses, so this is the same as the probability of getting 2 or fewer heads in 6 tosses!

In [39]:
import scipy.stats
scipy.stats.binom.cdf(n=6, p=1/2, k=2)

0.34375
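
The small floating-point discrepancy between the two results (0.3437500000000001 versus 0.34375) disappears if the calculation is done with exact fractions. The sketch below uses only the standard library; `binom_pmf_exact` is a helper name of ours, not a SciPy function.

```python
from fractions import Fraction
from math import comb


def binom_pmf_exact(k, n, p):
    """Exact Binomial(n, p) p.m.f. at k, where p is a Fraction."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


half = Fraction(1, 2)
# Independence of X and Y lets us multiply the two individual p.m.f.s.
joint = {
    (x, y): binom_pmf_exact(x, 3, half) * binom_pmf_exact(y, 3, half)
    for x in range(4)
    for y in range(4)
}

answer = sum(prob for (x, y), prob in joint.items() if x + y <= 2)
print(answer)  # 11/32
```

Here $11/32 = 0.34375$ exactly, in agreement with both computations above.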

### Example 18.3

The number of eggs laid by a hen, N, is a Poisson(μ) random variable. Each egg hatches with probability p, independently of any other egg. Let X be the number of eggs that hatch into baby chickens. Find the joint p.m.f. of N and X

- Since $N$ is Poisson, it can take any value in $0, 1, 2, \ldots$ with no upper bound. So there is no way to list the joint probabilities as a table in the same manner we have done so far. Instead, let's write the p.m.f. of $N$ as a formula:
    - $P(N=n) = e^{-\mu} \frac{\mu^{n}}{n!}$

- Given $N=n$, $X$ can take on any value between 0 and $n$, and each of the $n$ eggs hatches independently with probability $p$. So, conditional on $N=n$, $X$ is binomial, which we again express as a formula
    - $P(X=x \mid N=n) = \binom{n}{x} p^{x} (1-p)^{n-x}$

- Unlike the previous 2 examples, it is not obvious how to re-cast $X$ and $N$ in terms of independent quantities! However, recall from the earlier section that $P(X=x \mid N=n) = \frac{P(X=x, N=n)}{P(N=n)}$
    - So $f(x,n) = P(X=x, N=n) = P(X=x \mid N=n) P(N=n)$

- Applying this relationship:
    - $$\begin{align}
        f(x,n) &= e^{-\mu} \frac{\mu^{n}}{n!} \cdot \binom{n}{x} p^{x} (1-p)^{n-x} \\
        &= e^{-\mu} \frac{\mu^{n}}{n!} \cdot \frac{n!}{x! (n-x)!} p^{x} (1-p)^{n-x} \\
        &= e^{-\mu} \frac{\mu^{n}}{x! (n-x)!} p^{x} (1-p)^{n-x}
        \end{align} $$
    - This applies only when $0 \leq x \leq n$, so more precisely:
        - $$
            f(x,n) = \begin{cases}
                e^{-\mu} \frac{\mu^{n}}{x! (n-x)!} p^{x} (1-p)^{n-x} & 0 \leq x \leq n < \infty \\
                0 & \text{otherwise}
            \end{cases}
            $$
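
The piecewise formula above translates directly into a function. The sketch below uses only the standard library; the name `joint_pmf` and the values `mu = 4.0` and `p = 0.3` are illustrative assumptions of ours, not part of the example. Since $N$ is unbounded, the check that the probabilities sum to 1 truncates $N$ at 60, beyond which the Poisson(4) tail is negligible.

```python
from math import comb, exp, factorial


def joint_pmf(x, n, mu, p):
    """f(x, n) = P(X = x | N = n) * P(N = n); zero unless 0 <= x <= n."""
    if not 0 <= x <= n:
        return 0.0
    poisson_part = exp(-mu) * mu**n / factorial(n)          # P(N = n)
    binomial_part = comb(n, x) * p**x * (1 - p) ** (n - x)  # P(X = x | N = n)
    return poisson_part * binomial_part


mu, p = 4.0, 0.3  # illustrative values only

# The joint p.m.f. should sum to (almost exactly) 1 over all valid (x, n).
total = sum(joint_pmf(x, n, mu, p) for n in range(61) for x in range(n + 1))
print(round(total, 6))

print(joint_pmf(1, 3, mu, p))  # 3 eggs laid, 1 hatches
print(joint_pmf(3, 1, mu, p))  # impossible: more chicks than eggs, so 0.0
```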

