# Chapter 7: Joint Distributions

In [1]:
# The source of the content is freely available online
# https://drive.google.com/file/d/1VmkAAGOYCTORq1wxSQqy255qLJjTNvBI/view
# https://projects.iq.harvard.edu/stat110/

<h4>Definition 7.1.1 (Joint CDF)</h4>

The joint CDF of random variables $X$ and $Y$ is the function $F_{X,Y}$ given by:

$F_{X,Y}(x,y) = P(X \le x, Y \le y)$

The joint CDF of $n$ random variables is defined analogously.

<h4>Definition 7.1.2 (Joint PMF)</h4>

The joint PMF of discrete random variables $X$ and $Y$ is the function $p_{X,Y}$ given by:

$p_{X,Y}(x,y) = P(X=x, Y=y)$

The joint PMF of $n$ discrete random variables is defined analogously.

Just as univariate PMFs must be nonnegative and sum to 1, we require valid joint PMFs to be nonnegative and sum to 1, where the sum is over all possible values of $X$ and $Y$.

$\sum_x \sum_y P(X=x, Y=y) = 1$

The joint distribution can be used to find the probability of the event $(X,Y) \in A$ for any set $A$ of points in the support of $(X,Y)$.

$P((X,Y) \in A) = \sum \sum_{(x,y) \in A} P(X=x, Y=y)$

From the joint distribution of $X$ and $Y$, we can get the distribution of $X$ alone by summing over the possible values of $y$.

<h4>Definition 7.1.3 (Marginal PMF)</h4>

For discrete random variables $X$ and $Y$, the marginal PMF of $X$ is:

$P(X=x) = \sum_y P(X=x, Y=y)$

The marginal PMF of $X$ is the PMF of $X$, viewing $X$ individually rather than jointly with $Y$.

If we observe the value of $X$ and want to update our distribution of $Y$ to reflect this information, then instead of the marginal PMF $P(Y=y)$, which does not take the information about $X$ into account, we should use a PMF that conditions on the event $X=x$.

<h4>Definition 7.1.4 (Conditional PMF)</h4>

For discrete random variables $X$ and $Y$, the conditional PMF of $Y$ given $X=x$ is:

$P(Y=y | X=x) = \frac{ P(X=x, Y=y) }{ P(X=x) }$
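The marginal and conditional PMF definitions can be sketched in a few lines of Python. The joint PMF table below is a hypothetical example (not from the source), stored as a dictionary, and the helper names `marginal_x` and `conditional_y_given_x` are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint PMF of (X, Y), stored as {(x, y): P(X=x, Y=y)}.
joint = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8), (1, 1): Fraction(2, 8),
}

def marginal_x(joint):
    """Marginal PMF of X: P(X=x) = sum over y of P(X=x, Y=y)."""
    pmf = defaultdict(Fraction)
    for (x, _y), p in joint.items():
        pmf[x] += p
    return dict(pmf)

def conditional_y_given_x(joint, x):
    """Conditional PMF of Y given X=x: P(X=x, Y=y) / P(X=x), for P(X=x) > 0."""
    px = marginal_x(joint)[x]
    return {y: p / px for (xx, y), p in joint.items() if xx == x}

assert sum(joint.values()) == 1           # a valid joint PMF sums to 1
print(marginal_x(joint))                  # P(X=0) = P(X=1) = 1/2
print(conditional_y_given_x(joint, 0))    # P(Y=0|X=0) = 1/4, P(Y=1|X=0) = 3/4
```

Using `Fraction` keeps the arithmetic exact, so the conditional PMF visibly sums to 1.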

<h4>Definition 7.1.7 (Independence of Discrete Random Variables)</h4>

Random variables $X$ and $Y$ are independent if for all $x$ and $y$:

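$F_{X,Y} (x,y) = F_X(x) F_Y(y)$

If $X$ and $Y$ are discrete, this is equivalent to the condition that for all $x$ and $y$:

$P(X=x, Y=y) = P(X=x) P(Y=y)$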
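For a finite joint PMF table, independence can be checked directly by testing whether $P(X=x, Y=y) = P(X=x) P(Y=y)$ holds for every pair. The sketch below assumes the joint PMF is a dictionary keyed by $(x, y)$; both example tables and the function name `is_independent` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def is_independent(joint):
    """Return True if P(X=x, Y=y) == P(X=x) * P(Y=y) for every pair (x, y)."""
    xs = sorted({x for x, _ in joint})
    ys = sorted({y for _, y in joint})
    px = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    py = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(joint.get((x, y), 0) == px[x] * py[y] for x, y in product(xs, ys))

# Two fair coin flips: every pair has probability 1/4.
fair = {(x, y): Fraction(1, 4) for x, y in product((0, 1), repeat=2)}

# Hypothetical dependent table: Y always equals X.
dependent = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}

print(is_independent(fair))        # True
print(is_independent(dependent))   # False
```

Pairs missing from the dictionary are treated as having probability 0, which is why the dependent table fails the check at $(0, 1)$ and $(1, 0)$ as well as on the diagonal.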

<h4>Definition 7.1.14 (Marginal PDF)</h4>

For continuous random variables $X$ and $Y$ with joint PDF $f_{X,Y}$, the marginal PDF of $X$ is:

$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y) ~dy$

Marginalization works analogously with any number of variables. For example, if we have the joint PDF of $X$, $Y$, $Z$, and $W$, but want the joint PDF of $X$, $W$, we just have to integrate over all possible values of $Y$ and $Z$.

$f_{X,W}(x,w) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y,Z,W} (x,y,z,w) ~dy ~dz$

<h4>Definition 7.1.15 (Conditional PDF)</h4>

For continuous random variables $X$ and $Y$ with joint PDF $f_{X,Y}$, the conditional PDF of $Y$ given $X=x$ is:

$f_{Y|X}(y|x) = \frac{ f_{X,Y}(x,y) }{ f_X(x) }$

for all $x$ with $f_X(x) \gt 0$.

We can recover the joint PDF $f_{X,Y}$ if we have the conditional PDF $f_{Y|X}$ and the corresponding marginal PDF $f_X$:

$f_{X,Y}(x,y) = f_{Y|X} (y|x) f_X(x)$

Similarly, we can recover the joint PDF if we have $f_{X|Y}$ and $f_Y$:

$f_{X,Y}(x,y) = f_{X|Y}(x|y) f_Y(y)$

This allows us to develop continuous versions of Bayes' rule and LOTP.

<h4>Theorem 7.1.18 (Continuous Form of Bayes' Rule and LOTP)</h4>

<h5>Bayes' Rule</h5>

$f_{Y|X}(y|x) = \frac{ f_{X|Y}(x|y) f_Y(y) }{f_X(x)}$

for $f_X(x) > 0$.

<h5>LOTP</h5>

$f_X(x) = \int_{-\infty}^{\infty} f_{X|Y}(x|y) f_Y(y) ~dy$
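As a rough numerical illustration of continuous LOTP and Bayes' rule, the sketch below assumes a model that is not in the source: $Y \text{~} Unif(0,1)$ and $X | Y=y \text{~} Unif(0,y)$. Under that assumption, LOTP gives $f_X(x) = \int_x^1 \frac{1}{y} ~dy = -\ln x$ for $0 < x < 1$, which the code compares against a midpoint Riemann sum. All function names are illustrative.

```python
from math import log

def f_Y(y):
    """Assumed marginal PDF: Y ~ Unif(0, 1)."""
    return 1.0 if 0.0 < y < 1.0 else 0.0

def f_X_given_Y(x, y):
    """Assumed conditional PDF: X | Y=y ~ Unif(0, y)."""
    return 1.0 / y if 0.0 < x < y else 0.0

def midpoint_integral(func, a, b, n=20_000):
    """Midpoint Riemann sum approximating the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def f_X(x):
    """Continuous LOTP: integrate f_{X|Y}(x|y) f_Y(y) over y."""
    return midpoint_integral(lambda y: f_X_given_Y(x, y) * f_Y(y), 0.0, 1.0)

def f_Y_given_X(y, x):
    """Continuous Bayes' rule: f_{X|Y}(x|y) f_Y(y) / f_X(x)."""
    return f_X_given_Y(x, y) * f_Y(y) / f_X(x)

x = 0.5
print(f_X(x), -log(x))          # numerical LOTP vs. closed form, both about 0.6931
print(f_Y_given_X(0.8, x))      # about (1 / 0.8) / ln 2
```

The integration range is $[0, 1]$ rather than the whole real line because the assumed $f_Y$ vanishes elsewhere.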

<h4>Theorem 7.2.1 (2D LOTUS)</h4>

Let $g$ be a function from $\mathbb{R}^2$ to $\mathbb{R}$. If $X$ and $Y$ are discrete, then:

$E(g(X,Y)) = \sum_x \sum_y g(x,y) P(X=x,Y=y)$

If $X$ and $Y$ are continuous, then:

$E(g(X,Y)) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x,y) f_{X,Y}(x,y) ~dx ~dy$

Like its 1D counterpart, this saves us from having to find the distribution of $g(X,Y)$ in order to calculate its expectation. Having the joint PMF or PDF of $X$ and $Y$ is enough.
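A small sketch of discrete 2D LOTUS, assuming $X$ and $Y$ are two independent fair six-sided dice (an example chosen here for illustration). The expectation of $g(X,Y)$ is computed straight from the joint PMF, without deriving the distribution of $g(X,Y)$.

```python
from fractions import Fraction
from itertools import product

# Joint PMF of two independent fair dice: each of the 36 pairs has probability 1/36.
joint = {(x, y): Fraction(1, 36) for x, y in product(range(1, 7), repeat=2)}

def expectation(g, joint):
    """2D LOTUS: E(g(X, Y)) = sum over (x, y) of g(x, y) * P(X=x, Y=y)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

print(expectation(max, joint))                       # 161/36
print(expectation(lambda x, y: abs(x - y), joint))   # 35/18
```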

<h3>7.3 Covariance and Correlation</h3>

<h4>Definition 7.3.1 (Covariance)</h4>

The covariance between $X$ and $Y$ is:

$Cov(X,Y) = E( (X-EX)(Y-EY) ) = E(XY) - E(X)E(Y)$
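The two expressions for covariance can be compared on a concrete joint PMF. The sketch below assumes a hypothetical pair: $X$ is a fair die roll and $Y = X + D$, where $D$ is a second independent fair die roll, so $Cov(X,Y) = Var(X) = \frac{35}{12}$.

```python
from fractions import Fraction
from itertools import product

# Assumed example: X is one fair die roll, Y = X + D with D an independent second roll.
joint = {}
for x, d in product(range(1, 7), repeat=2):
    joint[(x, x + d)] = joint.get((x, x + d), 0) + Fraction(1, 36)

def E(g):
    """Expectation of g(X, Y) under the joint PMF (2D LOTUS)."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

ex = E(lambda x, y: x)
ey = E(lambda x, y: y)

cov_definition = E(lambda x, y: (x - ex) * (y - ey))   # E((X - EX)(Y - EY))
cov_shortcut = E(lambda x, y: x * y) - ex * ey         # E(XY) - E(X)E(Y)

print(cov_definition, cov_shortcut)   # 35/12 35/12
```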

<h4>Theorem 7.4.4 (Multinomial Lumping)</h4>

If $\mathbf{X} \text{~} Mult_k(n, \mathbf{p})$, then for any distinct $i$ and $j$, $X_i + X_j \text{~} Bin(n, p_i + p_j)$. The random vector of counts obtained from merging categories $i$ and $j$ is still Multinomial.

$(X_1 + X_2, X_3, \ldots, X_k) \text{~} Mult_{k-1} (n, (p_1 + p_2, p_3, \ldots, p_k))$
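The lumping property can be checked exactly for a small case. The sketch below assumes $k = 3$, $n = 5$, and a hypothetical probability vector $\mathbf{p} = \left(\frac{1}{2}, \frac{1}{3}, \frac{1}{6}\right)$. It sums the Multinomial PMF over all ways to split $X_1 + X_2 = s$ and compares the result with the $Bin(n, p_1 + p_2)$ PMF.

```python
from fractions import Fraction
from math import comb, factorial

def multinomial_pmf(counts, probs):
    """P(X_1=n_1, ..., X_k=n_k) for X ~ Mult_k(n, p), where n = sum(counts)."""
    coef = factorial(sum(counts))
    prob = Fraction(1)
    for c, p in zip(counts, probs):
        coef //= factorial(c)      # exact at every step: builds n! / (n_1! ... n_k!)
        prob *= p ** c
    return coef * prob

n = 5
probs = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))   # hypothetical p
q = probs[0] + probs[1]                                    # lumped probability p_1 + p_2

def lumped_pmf(s):
    """P(X_1 + X_2 = s), summing the Mult_3 PMF over every split (a, s - a)."""
    return sum(multinomial_pmf((a, s - a, n - s), probs) for a in range(s + 1))

def binomial_pmf(s):
    """PMF of Bin(n, p_1 + p_2) at s."""
    return comb(n, s) * q ** s * (1 - q) ** (n - s)

for s in range(n + 1):
    print(s, lumped_pmf(s), binomial_pmf(s))   # the two columns agree
```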

<h3>7.5 Multivariate Normal</h3>

<h4>Definition 7.5.1 (Multivariate Normal Distribution)</h4>

A $k$-dimensional random vector $\mathbf{X} = (X_1, \ldots, X_k)$ is said to have a Multivariate Normal (MVN) distribution if every linear combination of the $X_j$ has a Normal distribution. In particular, the marginal distribution of each $X_j$ is Normal, since $X_j$ is itself a linear combination of the components.

<h4>Theorem 7.5.4</h4>

If $(X_1, X_2, X_3)$ is MVN, then so is the subvector $(X_1, X_2)$.

<h4>Theorem 7.5.5</h4>

If $\mathbf{X} = (X_1, \ldots, X_n)$ and $\mathbf{Y} = (Y_1, \ldots, Y_m)$ are MVN random vectors with $\mathbf{X}$ independent of $\mathbf{Y}$, then the concatenated random vector $\mathbf{W} = (X_1, \ldots, X_n, Y_1, \ldots, Y_m)$ is MVN.

An MVN is fully specified by knowing the mean of each component, the variance of each component, and the covariance or correlation between any two components. That is, the parameters of an MVN random vector $(X_1, \ldots, X_k)$ are the mean vector $(\mu_1, \ldots, \mu_k)$, where $E(X_j) = \mu_j$, and the covariance matrix, whose $(i,j)$ entry is $Cov(X_i, X_j)$.

# Exercises

<h3>Covariance Exercises</h3>

<h4>Exercise 39</h4>

Two fair six-sided dice are rolled, one green and one orange, with outcomes $X$ and $Y$ respectively for the green and the orange.

<b>Part A:</b>

Compute the covariance of $X+Y$ and $X-Y$.

<i>Answer:</i>

$Cov(X+Y, X-Y) = Cov(X,X) - Cov(X,Y) + Cov(Y,X) - Cov(Y,Y) = Var(X) - Var(Y) = 0$

since $X$ and $Y$ have the same distribution and hence the same variance.
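This result can be confirmed by brute force, enumerating all 36 equally likely outcomes of the two dice. The helper `E` is an illustrative name.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))   # (green, orange) pairs
p = Fraction(1, 36)                               # each pair is equally likely

def E(g):
    """Expectation of g(X, Y) over the 36 outcomes."""
    return sum(g(x, y) * p for x, y in outcomes)

cov = (E(lambda x, y: (x + y) * (x - y))
       - E(lambda x, y: x + y) * E(lambda x, y: x - y))
print(cov)   # 0
```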

<h4>Exercise 41</h4>

Let $X$ and $Y$ be standardized random variables with correlation $\rho \in (-1,1)$. Find $a,b,c,d$ (in terms of $\rho$) such that $Z = aX + bY$ and $W = cX + dY$ are uncorrelated but still standardized.

<i>Answer:</i>

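One solution takes $Z = X$, that is, $a=1$ and $b=0$, so $Z$ is already standardized. Since $Var(X) = 1$ and $Cov(X,Y) = \rho$, requiring $Z$ and $W$ to be uncorrelated gives:

$Cov(Z,W) = Cov(X, cX + dY) = Cov(X, cX) + Cov(X, dY) = c + d \rho = 0$

Also, $Var(W) = c^2 + d^2 + 2 cd \rho = 1$. Solving for $c, d$ gives:

$a=1, b=0, c = -\rho / \sqrt{1 - \rho^2}, d = 1 / \sqrt{1-\rho^2}$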
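A quick numerical check of this solution, taking $c = -\rho / \sqrt{1-\rho^2}$ and $d = 1 / \sqrt{1-\rho^2}$. It uses only the facts $Var(X) = Var(Y) = 1$ and $Cov(X,Y) = \rho$ together with bilinearity of covariance. The function name `check_solution` is illustrative.

```python
from math import isclose, sqrt

def check_solution(rho):
    """Return (Var(Z), Var(W), Cov(Z, W)) for Z = aX + bY and W = cX + dY."""
    a, b = 1.0, 0.0
    c = -rho / sqrt(1 - rho ** 2)
    d = 1 / sqrt(1 - rho ** 2)
    # Bilinearity with Var(X) = Var(Y) = 1 and Cov(X, Y) = rho:
    var_z = a * a + b * b + 2 * a * b * rho
    var_w = c * c + d * d + 2 * c * d * rho
    cov_zw = a * c + b * d + (a * d + b * c) * rho
    return var_z, var_w, cov_zw

for rho in (-0.9, -0.3, 0.0, 0.6):
    var_z, var_w, cov_zw = check_solution(rho)
    print(rho, var_z, var_w, cov_zw)   # variances near 1, covariance near 0
```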

<h4>Exercise 52</h4>

A drunken man wanders around randomly in a large space. At each step, he moves one unit of distance north, south, east, or west, with equal probabilities. Choose coordinates such that his initial position is $(0,0)$ and if he is at $(x,y)$ at some time, then one step later he is at $(x,y+1)$, $(x,y-1)$, $(x+1,y)$, or $(x-1,y)$. Let $(X_n, Y_n)$ and $R_n$ be his position and distance from the origin after $n$ steps, respectively. Find $Cov(X_n, Y_n)$.

<i>Answer:</i>

Write $X_n = \sum_{i=1}^n Z_i$ and $Y_n = \sum_{j=1}^n W_j$, where $Z_i$ is $-1$ if his $i^{th}$ step is westward, $1$ if his $i^{th}$ step is eastward, and $0$ otherwise, and similarly for $W_j$ with north and south. Then $Z_i$ is independent of $W_j$ for $i \neq j$, so $Cov(Z_i, W_j) = 0$ for those pairs. But $Z_i$ and $W_i$ are highly dependent: exactly one of them is $0$, since he moves in one direction at a time. Even so, $Cov(Z_i, W_i) = E(Z_i W_i) - E(Z_i) E(W_i) = 0$, since $Z_i W_i$ is always $0$, and $Z_i$ and $W_i$ have mean $0$. So:

$Cov(X_n, Y_n) = \sum_{i,j} Cov(Z_i, W_j) = 0$
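For small $n$ the covariance can be computed exactly by enumerating all $4^n$ equally likely paths. This is only a sanity check for small cases, not a substitute for the argument above. The names `STEPS` and `walk_covariance` are illustrative.

```python
from fractions import Fraction
from itertools import product

STEPS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # north, south, east, west

def walk_covariance(n):
    """Exact Cov(X_n, Y_n), enumerating all 4**n equally likely paths."""
    p = Fraction(1, 4 ** n)
    e_x = e_y = e_xy = Fraction(0)
    for path in product(STEPS, repeat=n):
        x = sum(dx for dx, _ in path)
        y = sum(dy for _, dy in path)
        e_x += x * p
        e_y += y * p
        e_xy += x * y * p
    return e_xy - e_x * e_y

for n in range(1, 7):
    print(n, walk_covariance(n))   # covariance is 0 for each n
```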

<h4>Exercise 55</h4>

Consider the following method for creating a bivariate Poisson (a joint distribution for two random variables such that both marginals are Poisson). Let $X=V+W$, $Y=V+Z$, where $V$, $W$, $Z$ are IID $Pois(\lambda)$.

<b>Part A:</b>

Find $Cov(X,Y)$

<i>Answer:</i>

Using the properties of covariance, we have:

$Cov(X,Y) = Cov(V,V) + Cov(V,Z) + Cov(W,V) + Cov(W,Z) = Var(V) = \lambda$
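The value $Cov(X,Y) = \lambda$ can be checked numerically by building the joint PMF of $(X,Y)$ through conditioning on $V$ and then summing over a truncated support. The cutoff of 40 and the test value $\lambda = 1.5$ are assumptions made for this sketch; the truncation error is negligible for small $\lambda$.

```python
from math import exp, factorial

def pois_pmf(k, lam):
    """Pois(lam) PMF at k, taken to be 0 for negative k."""
    return exp(-lam) * lam ** k / factorial(k) if k >= 0 else 0.0

def joint_pmf(x, y, lam):
    """P(X=x, Y=y) by conditioning on V=v: W = x - v and Z = y - v, independent of V."""
    return sum(pois_pmf(v, lam) * pois_pmf(x - v, lam) * pois_pmf(y - v, lam)
               for v in range(min(x, y) + 1))

def bivariate_poisson_cov(lam, cutoff=40):
    """Approximate Cov(X, Y) = E(XY) - E(X)E(Y), summing over 0 <= x, y < cutoff."""
    e_x = e_y = e_xy = 0.0
    for x in range(cutoff):
        for y in range(cutoff):
            p = joint_pmf(x, y, lam)
            e_x += x * p
            e_y += y * p
            e_xy += x * y * p
    return e_xy - e_x * e_y

lam = 1.5                           # hypothetical rate
print(bivariate_poisson_cov(lam))   # close to lam
```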

<b>Part B:</b>

Are $X$ and $Y$ independent? Are they conditionally independent given $V$?

<i>Answer:</i>

Since $X$ and $Y$ are correlated (with covariance $\lambda \gt 0$), they are not independent. Alternatively, note that $E(Y) = 2 \lambda$ but $E(Y|X=0) = \lambda$, since if $X=0$ occurs then $V=0$ occurs. But $X$ and $Y$ are conditionally independent given $V$, since the conditional joint PMF is:

$P(X=x, Y=y|V=v) = P(W=x-v, Z=y-v|V=v) = P(W=x-v, Z=y-v) = P(W=x-v)P(Z=y-v) = P(X=x|V=v) P(Y=y|V=v)$

This makes sense intuitively since if we observe that $V=v$, then $X$ and $Y$ are the independent random variables $W$ and $Z$, each shifted by the constant $v$.

<h3>Multinomial Exercises</h3>

<h4>Exercise 65</h4>

Consider the birthdays of $100$ people. Assume people's birthdays are independent, and the $365$ days of the year are equally likely. Find the covariance and correlation between how many of the people were born on January 1 and how many were born on January 2.

<i>Answer:</i>

Let $X_j$ be the number of people born on January $j$. Then:

$Cov(X_1, X_2) = - \frac{100}{365^2}$

using the result about covariances in a Multinomial. Since $X_j \text{~} Bin \left( 100, \frac{1}{365} \right)$, we then have:

$Corr(X_1, X_2) = \frac{Cov(X_1,X_2)}{\sqrt{Var(X_1) Var(X_2)}} = \frac{ -100/365^2 }{ 100 \left( \frac{1}{365} \right) \left( \frac{364}{365} \right) } = - \frac{1}{364}$
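The arithmetic can be reproduced exactly with rational numbers, using the Multinomial covariance $Cov(X_i, X_j) = -n p_i p_j$ and the Binomial variance $n p (1-p)$.

```python
from fractions import Fraction

n, p = 100, Fraction(1, 365)

cov = -n * p * p          # Multinomial: Cov(X_1, X_2) = -n * p_1 * p_2
var = n * p * (1 - p)     # X_j ~ Bin(n, p), so Var(X_j) = n p (1 - p)
corr = cov / var          # sqrt(Var(X_1) Var(X_2)) equals var, since the variances match

print(cov, corr)          # -4/5329 -1/364
```

Here $-\frac{4}{5329}$ is $-\frac{100}{365^2}$ in lowest terms.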

<h4>Exercise 68</h4>

Emails arrive in an inbox according to a Poisson process with rate $\lambda$ per hour, so the numbers of emails arriving in disjoint time intervals are independent. Let $X$, $Y$, $Z$ be the numbers of emails that arrive from 9am to noon, noon to 6pm, and 6pm to midnight, respectively.

<b>Part A:</b>

Find the joint PMF of $X$, $Y$, and $Z$.

<i>Answer:</i>

Since $X \text{~} Pois(3 \lambda), Y \text{~} Pois(6 \lambda), Z \text{~} Pois(6 \lambda)$ independently, the joint PMF is:

$P(X=x, Y=y, Z=z) = \frac{ e^{-3 \lambda}(3 \lambda)^x }{x!} \cdot \frac{ e^{-6 \lambda}(6 \lambda)^y }{y!} \cdot \frac{ e^{-6 \lambda}(6 \lambda)^z }{z!}$

<b>Part B:</b>

Find the conditional PMF of $X+Y$ given that $X+Y+Z = 36$, and find $E(X+Y|X+Y+Z = 36)$ and $Var(X+Y|X + Y + Z = 36)$.

<i>Answer:</i>

Conditional expectation and conditional variance given an event are defined in the same way as expectation and variance, using the conditional distribution given the event in place of the unconditional distribution.

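Let $W=X+Y$ and $T=X+Y+Z$. Given $T=36$, the vector $(X,Y,Z)$ is Multinomial with $36$ trials and probabilities $\left( \frac{3}{15}, \frac{6}{15}, \frac{6}{15} \right)$, since independent Poisson counts conditioned on their total are Multinomial with probabilities proportional to their rates. Using Multinomial lumping, we can merge the categories 9am to noon and noon to 6pm to get: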

$W|T = 36 \text{~} Bin \left( 36, \frac{9}{15} \right)$

Therefore, $E(W|T=36) = 36 \cdot \frac{9}{15} = 21.6$ and $Var(W|T=36) = 36 \cdot \frac{9}{15} \cdot \frac{6}{15} = 8.64$.
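The conditional distribution can also be checked numerically from the Poisson PMFs. Since $W \text{~} Pois(9 \lambda)$ and $Z \text{~} Pois(6 \lambda)$ are independent, and $T = W + Z \text{~} Pois(15 \lambda)$, we have $P(W=w | T=36) = P(W=w) P(Z=36-w) / P(T=36)$. The value $\lambda = 2$ below is a hypothetical choice; the result does not depend on it.

```python
from math import comb, exp, factorial

def pois_pmf(k, mu):
    """Pois(mu) PMF at k."""
    return exp(-mu) * mu ** k / factorial(k)

def cond_pmf_w(w, lam, total=36):
    """P(W=w | T=total) with W ~ Pois(9 lam), Z ~ Pois(6 lam), T = W + Z ~ Pois(15 lam)."""
    return (pois_pmf(w, 9 * lam) * pois_pmf(total - w, 6 * lam)
            / pois_pmf(total, 15 * lam))

def binom_pmf(w, n=36, p=9 / 15):
    """Bin(n, p) PMF at w."""
    return comb(n, w) * p ** w * (1 - p) ** (n - w)

lam = 2.0   # hypothetical rate; it cancels in the conditional PMF
pmf = [cond_pmf_w(w, lam) for w in range(37)]

mean = sum(w * q for w, q in enumerate(pmf))
var = sum((w - mean) ** 2 * q for w, q in enumerate(pmf))
print(mean, var)   # approximately 21.6 and 8.64
```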