# Lecture 3.4: Jointly Distributed Random Variables

## Outline

* The joint distribution function
* Marginal distributions
* Independent random variables
* Conditional distributions
* Conditional expectation
* Combining random variables
    * Covariance
    * Correlation
    * Mean
    * Variance

## The Relationship Between Two Random Variables

<img src="images/relationship.png" width="400">

* Previously, we talked about the distribution, mean and variance for a single random variable.

* For today's class we will mainly focus on the relationship between two *discrete* random variables  

* The same rules hold for *continuous* random variables - refer to [this link](http://www.colorado.edu/economics/morey/6818/jointdensity.pdf) for a detailed discussion on the continuous case along with examples

## The Joint Distribution Function

* When we deal with two discrete random variables, $X$ and $Y$, it is convenient to work with joint probabilities. We define the joint probability distribution to be    


$$ f_{X, Y} (x, y) = P(X = x \text{ and } Y = y) $$

* As usual, we require that  


$$f_{X, Y} (x, y) \geq 0 \text{ for every pair } (x, y)$$  
    
$$\sum_{\text{all } x, y} f_{X, Y} (x, y) = 1$$
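
A joint distribution for two discrete variables can be stored as a table keyed by $(x, y)$ pairs. The sketch below checks the two requirements above. The probabilities and the helper name `is_valid_joint_pmf` are made up for illustration and are not taken from the examples in this lecture.

```python
import math

# Hypothetical joint probabilities f_{X,Y}(x, y), keyed by (x, y).
joint = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

def is_valid_joint_pmf(pmf, tol=1e-9):
    """Return True if every entry is non-negative and the entries sum to 1."""
    non_negative = all(p >= 0 for p in pmf.values())
    sums_to_one = math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
    return non_negative and sums_to_one

print(is_valid_joint_pmf(joint))
```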

### Example 1

Students in a college were classified according to years in school ($X$) and number of visits to a museum in the last year ($Y$: 0 for no visits, 1 for one visit, 2 for more than one visit). The joint probabilities in the accompanying table were estimated for these random variables.

<img src="images/joint_example1.png" width="400">

### Example 2

The accompanying table shows, for credit card holders with one to three cards, the joint probabilities for number of cards owned ($X$) and number of credit purchases made in a week ($Y$).

<img src="images/joint_example2.png" width="500">

## Marginal Distributions

* Suppose we have the joint distribution of $X$ and $Y$, but we are only interested in $X$. We can obtain the marginal distribution of $X$ as follows, 

    $$ f_X(x) = \sum_y f_{X, Y} (x, y) $$

* The marginal probabilities of $Y$ are given by  
    
    $$ f_Y(y) = \sum_x f_{X, Y} (x, y) $$
    

* The term “marginal” merely describes how the distribution of $X$ can be calculated from the joint distribution of $X$ and another variable $Y$: row sums (or column sums) are calculated and placed “in the margin” of the probability table.
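
The row and column sums can be computed in a few lines. This is a minimal sketch, assuming the joint table is stored as a dictionary keyed by $(x, y)$; the probabilities and the helper name `marginals` are hypothetical.

```python
from collections import defaultdict

# Hypothetical joint probabilities f_{X,Y}(x, y), keyed by (x, y).
joint = {
    (1, 0): 0.10, (1, 1): 0.15, (1, 2): 0.05,
    (2, 0): 0.20, (2, 1): 0.30, (2, 2): 0.20,
}

def marginals(pmf):
    """Sum the joint table over y to get f_X, and over x to get f_Y."""
    f_x, f_y = defaultdict(float), defaultdict(float)
    for (x, y), p in pmf.items():
        f_x[x] += p
        f_y[y] += p
    return dict(f_x), dict(f_y)

f_x, f_y = marginals(joint)
print(f_x)
print(f_y)
```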

### Example: Compute the Marginal Distributions

<img src="images/joint_example1.png" width="400">

|  x  | P(X = x) |
|:---:|:--------:|
|  1  |          |
|  2  |          |
|  3  |          |
|  4  |          |

|  y  | P(Y = y) |
|:---:|:--------:|
|  1  |          |
|  2  |          |
|  3  |          |

|  x  | P(X = x) |
|:---:|:--------:|
|  1  |   0.24   |
|  2  |   0.20   |
|  3  |   0.29   |
|  4  |   0.27   |


|  y  | P(Y = y) |
|:---:|:--------:|
|  1  |   0.17   |
|  2  |   0.56   |
|  3  |   0.27   |

## Independence

* Two random variables $X$ and $Y$ are called independent if the events ($X = x$) and ($Y = y$) are independent for all values of $x$ and $y$. That is, for all $x$ and $y$:

$$ f_{X, Y} (x, y) = f_X (x) f_Y(y) $$
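
Checking independence means comparing every cell of the joint table with the product of its marginals. The sketch below does this for two hypothetical tables that share the same marginals; the function name `is_independent` and all numbers are assumptions for illustration.

```python
import math
from collections import defaultdict

def is_independent(pmf, tol=1e-9):
    """Check f_{X,Y}(x, y) == f_X(x) * f_Y(y) for every pair of values."""
    f_x, f_y = defaultdict(float), defaultdict(float)
    for (x, y), p in pmf.items():
        f_x[x] += p
        f_y[y] += p
    # Pairs missing from the table are treated as having probability 0.
    return all(
        math.isclose(pmf.get((x, y), 0.0), f_x[x] * f_y[y], abs_tol=tol)
        for x in f_x for y in f_y
    )

# Hypothetical table built as the product of marginals (0.4, 0.6) and (0.5, 0.5).
independent_joint = {(0, 0): 0.20, (0, 1): 0.20, (1, 0): 0.30, (1, 1): 0.30}
# Hypothetical table with the same marginals but different cell values.
dependent_joint = {(0, 0): 0.30, (0, 1): 0.10, (1, 0): 0.20, (1, 1): 0.40}

print(is_independent(independent_joint))
print(is_independent(dependent_joint))
```

A single cell that fails the product rule is enough to rule out independence, which is why the check must cover all pairs.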

### Example

<img src="images/joint_example2.png" width="500">

* What is $P(X = 2 \text{ and } Y = 3)$?  

* Calculate the marginal probabilities.  

* Are $X$ and $Y$ independent?

## Conditional Distributions

* Let $X$ and $Y$ be jointly distributed random variables. Then, for any $y$ with $f_Y(y) > 0$, the conditional distribution of $X$ given $Y$ is given by

$$ f_{X|Y} (x | y) = \frac{f_{X, Y} (x, y)}{f_Y (y)} $$

* Note that for a given $y$ value, $f_{X|Y} (x | Y = y)$ is a probability distribution. That is, for any value $y$

$$ \sum_{\text{all } x \text{ values}} P(X = x | Y = y) = 1 $$

### Example

Does money make you happy?

<img src="images/money_happy.png" width="600">

* Given the fact that you're very happy (i.e. $Y = 2$), what is the conditional distribution of your salary?

* That is, we want to compute $f_{X|Y} (x | Y = 2)$ (or $P(X = x | Y = 2)$)  


|  $x$ | $P(X = x \mid Y = 2)$ |
|:----:|:------------------:|
|  2.5 |  0.07/0.46 = 0.15  |
|  7.5 |                    |
| 12.5 |                    |
| 17.5 |                    |

|  $x$ | $P(X = x \mid Y = 2)$ |
|:----:|:------------------:|
|  2.5 |  0.07/0.46 = 0.15  |
|  7.5 |  0.11/0.46 = 0.24  |
| 12.5 |  0.14/0.46 = 0.30  |
| 17.5 |  0.14/0.46 = 0.30  |

Note: these probabilities do not add up to exactly 1 because each one has been rounded to two decimal places. With more decimal places, they would add up to 1.
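
The same calculation can be done without rounding. This sketch assumes the $Y = 2$ column of the joint table holds exactly the numerators shown in the worked table (0.07, 0.11, 0.14, 0.14); the variable names are introduced here for illustration.

```python
# Joint probabilities f_{X,Y}(x, 2) for the Y = 2 column, taken from the
# numerators in the worked table above.
joint_y2 = {2.5: 0.07, 7.5: 0.11, 12.5: 0.14, 17.5: 0.14}

# The marginal probability f_Y(2) is the column total.
f_y2 = sum(joint_y2.values())

# Conditional distribution: divide each joint probability by f_Y(2).
cond_x_given_y2 = {x: p / f_y2 for x, p in joint_y2.items()}

for x, p in cond_x_given_y2.items():
    print(f"P(X = {x} | Y = 2) = {p:.4f}")
print("total:", sum(cond_x_given_y2.values()))
```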

## Conditional Expectation

* One useful application of conditional distributions is in calculating conditional expectations. You will see a lot more of this when we get to regression analysis.

* A conditional distribution is simply a probability distribution, so we can find its expectation by:

$$ E(X | Y = y) = \sum_{\text{all } x \text{ values}} x \text{ } f_{X|Y} (x | Y = y) = \sum_{\text{all } x \text{ values}} x \text{ } P(X = x | Y = y) $$

### Example

<img src="images/money_happy.png" width="400">

* What is the expected salary for someone who is depressed?

* We need to compute $E(X | Y = 0)$. How do we do this?

|  $x$ | $P(X = x \mid Y = 0)$ |
|:----:|:------------------:|
|  2.5 |  0.03/0.07 = 0.43  |
|  7.5 |  0.02/0.07 = 0.29  |
| 12.5 |  0.01/0.07 = 0.14  |
| 17.5 |  0.01/0.07 = 0.14  |



$$ \begin{align*} 
     E(X | Y = 0) &= \sum x_i P(X = x_i | Y = 0) \\
                  &= 2.5(0.43) + 7.5(0.29) + 12.5(0.14) + 17.5(0.14) \\
                  &= 7.45
   \end{align*}  $$
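
As a cross-check, the conditional expectation can be computed from the unrounded probabilities. This sketch assumes the $Y = 0$ column holds exactly the numerators from the table above (0.03, 0.02, 0.01, 0.01). Without intermediate rounding the result comes out at about 7.5; the hand calculation differs slightly because each conditional probability was rounded to two decimals first.

```python
# Joint probabilities f_{X,Y}(x, 0) for the Y = 0 column, taken from the
# numerators in the table above.
joint_y0 = {2.5: 0.03, 7.5: 0.02, 12.5: 0.01, 17.5: 0.01}

f_y0 = sum(joint_y0.values())  # marginal probability f_Y(0)

# E(X | Y = 0) = sum over x of x * f_{X,Y}(x, 0) / f_Y(0)
cond_mean = sum(x * p for x, p in joint_y0.items()) / f_y0
print(round(cond_mean, 4))
```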

## Combining Random Variables

To understand how to calculate $E(X + Y)$ and $Var(X + Y)$, we need to first introduce the concepts of covariance and correlation for random variables.  

### Covariance

* The variance of a random variable is a measure of its variability, and the covariance of two random variables is a measure of their joint variability.  

* The covariance is a measure of the linear association of two random variables. Its sign reflects the direction of the association; if the variables tend to move in the same direction the covariance is positive. If the variables tend to move in opposite directions the covariance is negative.

* The covariance is a pain to calculate, but for completeness, here is the theoretical formula,

$$ \begin{align}
Cov(X, Y) &= \sigma_{XY} \\
&= E[(X - E(X))(Y - E(Y))] \\
&= \sum_{i = 1}^N (x_i - E(X))(y_i - E(Y))P(x_i, y_i)
\end{align}$$

* The covariance can also be expressed as

$$ Cov(X, Y) = E(XY) - E(X)E(Y) $$

* Two interesting facts
    * $Cov(X, X) = Var(X)$
    
    * if $X$ and $Y$ are **independent**, $Cov(X, Y) = 0$

#### Example

<img src="images/stock_bond.png" width="600">

$$ \begin{align*} 
     E(S) &= -10(0.10) + 0(0.20) + 10(0.40) + 20(0.30) \\
          &= 9 \\
     E(T) &= 6(0.20) + 8(0.60) + 10(0.20) \\
          &= 8 \\
   \end{align*}  $$

   $$ \begin{align*} 
   Cov(S, T) &= (-10 - 9)(6 - 8)(0) \\
               &  + (0 - 9)(6 - 8)(0) \\
               &  + (10 - 9)(6 - 8)(0.10) \\
               &  + (20 - 9)(6 - 8)(0.10) \\
               &  + (-10 - 9)(8 - 8)(0) \\
               &  + (0 - 9)(8 - 8)(0.10) \\
               &  + (10 - 9)(8 - 8)(0.30) \\
               &  + (20 - 9)(8 - 8)(0.20) \\
               &  + (-10 - 9)(10 - 8)(0.10) \\
               &  + (0 - 9)(10 - 8)(0.10) \\
               &  + (10 - 9)(10 - 8)(0) \\
               &  + (20 - 9)(10 - 8)(0) \\
               &= -8
   \end{align*}  $$
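
A sum with this many terms is easy to get wrong by hand, so it can help to evaluate it mechanically. The sketch below is one way to do that, assuming the joint probabilities are exactly the ones appearing in the terms of the sum; the helper `expectation` is a name introduced here for illustration. It computes the covariance from the definition and from the shortcut $E(ST) - E(S)E(T)$, which should agree up to floating-point error.

```python
# Joint probabilities P(S = s, T = t), read off the terms of the sum above.
joint = {
    (-10, 6): 0.00, (0, 6): 0.00, (10, 6): 0.10, (20, 6): 0.10,
    (-10, 8): 0.00, (0, 8): 0.10, (10, 8): 0.30, (20, 8): 0.20,
    (-10, 10): 0.10, (0, 10): 0.10, (10, 10): 0.00, (20, 10): 0.00,
}

def expectation(pmf, g):
    """E[g(S, T)] under the joint distribution pmf."""
    return sum(g(s, t) * p for (s, t), p in pmf.items())

mean_s = expectation(joint, lambda s, t: s)
mean_t = expectation(joint, lambda s, t: t)

# Definition: E[(S - E(S))(T - E(T))]
cov_definition = expectation(joint, lambda s, t: (s - mean_s) * (t - mean_t))
# Shortcut: E(ST) - E(S)E(T)
cov_shortcut = expectation(joint, lambda s, t: s * t) - mean_s * mean_t

print(mean_s, mean_t)
print(cov_definition, cov_shortcut)
```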

### The Covariance Matrix

Sometimes the covariance between random variables is presented in a table, or matrix of the following form:

$$
\begin{bmatrix}
    Var(X_1) & Cov(X_1, X_2) & Cov(X_1, X_3) \\
    Cov(X_2, X_1) & Var(X_2) & Cov(X_2, X_3) \\
    Cov(X_3, X_1) & Cov(X_3, X_2) & Var(X_3) 
\end{bmatrix}
$$


This is called a covariance matrix.

### Covariance and Independence

* If $X$ and $Y$ are independent, then $Cov(X, Y) = 0$.
* But the converse is not always true!

**Example**: 

* Suppose $X$ is a random variable with  


$$ P(X = -1) = P(X = 0) = P(X = 1) = 1/3 $$

* Then

$$ E(X) = -1(1/3) + 0(1/3) + 1(1/3) = 0 $$
$$ Var(X)= (-1 - 0)^2 (1/3) + (0 - 0)^2 (1/3) + (1 - 0)^2 (1/3) = 2/3$$

* Let   

$$Y = 1 - X^2$$  

* Note that $XY = 0$ always: when $X = \pm 1$ we have $Y = 0$, and when $X = 0$ the product is zero regardless of $Y$.

* So   

$$Cov(X,Y) = E(XY) - E(X)E(Y) = 0$$

* But $Y$ is a function of $X$, so they are not independent.
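
This counterexample is small enough to verify exactly. The sketch below uses Python's `fractions.Fraction` to avoid rounding; it checks that the covariance is zero and that the product rule for independence fails at the pair $(0, 0)$. The variable names are illustrative.

```python
from fractions import Fraction

third = Fraction(1, 3)
# X takes -1, 0, 1 with probability 1/3 each, and Y = 1 - X^2.
joint = {(x, 1 - x**2): third for x in (-1, 0, 1)}

e_x = sum(x * p for (x, y), p in joint.items())
e_y = sum(y * p for (x, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov = e_xy - e_x * e_y

# Independence would require P(X = 0, Y = 0) = P(X = 0) * P(Y = 0).
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)
p_y0 = sum(p for (x, y), p in joint.items() if y == 0)
p_x0_y0 = joint.get((0, 0), Fraction(0))

print("Cov(X, Y) =", cov)
print("P(X=0, Y=0) =", p_x0_y0, "but P(X=0) P(Y=0) =", p_x0 * p_y0)
```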

### Correlation: Covariance Rescaled

* Covariance can indicate whether $X$ and $Y$ have a positive, negative, or zero relation. But it is not a great measure of association since it depends on the units of measurement.

* To eliminate this difficulty, we define the correlation:

$$ \rho = \frac{\sigma_{XY}}{\sigma_X \sigma_Y} $$

* This is a unitless measure of association.

* The correlation is always between -1 and 1, with 1 indicating a perfect positive linear relationship, -1 a perfect negative linear relationship and 0 no linear relationship between $X$ and $Y$.
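
To see the rescaling in practice, the sketch below computes the correlation for the stock and bond example from the covariance section. It assumes the joint probabilities are the nonzero entries appearing in that example's covariance sum; the helper `expectation` is a name introduced here.

```python
import math

# Nonzero joint probabilities P(S = s, T = t) from the stock and bond example.
joint = {
    (10, 6): 0.10, (20, 6): 0.10,
    (0, 8): 0.10, (10, 8): 0.30, (20, 8): 0.20,
    (-10, 10): 0.10, (0, 10): 0.10,
}

def expectation(pmf, g):
    """E[g(S, T)] under the joint distribution pmf."""
    return sum(g(s, t) * p for (s, t), p in pmf.items())

mean_s = expectation(joint, lambda s, t: s)
mean_t = expectation(joint, lambda s, t: t)
var_s = expectation(joint, lambda s, t: (s - mean_s) ** 2)
var_t = expectation(joint, lambda s, t: (t - mean_t) ** 2)
cov = expectation(joint, lambda s, t: (s - mean_s) * (t - mean_t))

# Correlation: covariance divided by the product of standard deviations.
rho = cov / math.sqrt(var_s * var_t)
print(round(rho, 3))
```

Rescaling both variables (for example, measuring returns in fractions instead of percentage points) would change `cov` but leave `rho` unchanged.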

### Combination of Random Variables

* If $X$ and $Y$ are independent  

$$ E(X + Y) = E(X) + E(Y) $$  

$$ Var(X + Y) = Var(X) + Var(Y) $$  

* If $X$ and $Y$ are not independent, the mean is unchanged but the variance gains a covariance term  

$$ E(X + Y) = E(X) + E(Y) $$  

$$ Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y) $$  

* More generally, for constants $a$, $b$, $c$ and $d$  

$$ E((a + bX) + (c + dY)) = a + bE(X) + c + dE(Y) $$  

$$ Var((a + bX) + (c + dY)) = b^2Var(X) + d^2Var(Y) + 2bdCov(X, Y) $$  
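
The covariance term in the variance of a sum follows directly from the definition of variance. A short derivation, using only the linearity of expectation, is sketched here:

$$ \begin{align*}
     Var(X + Y) &= E[((X + Y) - E(X + Y))^2] \\
                &= E[((X - E(X)) + (Y - E(Y)))^2] \\
                &= E[(X - E(X))^2] + E[(Y - E(Y))^2] + 2E[(X - E(X))(Y - E(Y))] \\
                &= Var(X) + Var(Y) + 2Cov(X, Y)
   \end{align*}  $$

When $X$ and $Y$ are independent, $Cov(X, Y) = 0$ and the last term drops out. The same expansion with $bX$ and $dY$ in place of $X$ and $Y$ gives the general formula, since the constants $a$ and $c$ shift the mean but do not affect the variance.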

### Investigating the Sample Mean

* Consider the mean of $n$ independent and identically distributed (i.i.d.) random variables with mean $\mu$ and variance $\sigma^2$

$$ \bar{X} = \frac{1}{n} \sum_{i = 1}^n X_i $$


* Calculate the mean of the mean, $E(\bar{X})$

* Calculate the variance of the mean, $Var(\bar{X})$

$$ \begin{align*} 
     E(\bar{X}) &= E(\frac{1}{n} \sum_{i = 1}^n X_i) \\
                &= \frac{1}{n} E(\sum_{i = 1}^n X_i) \\
                &= \frac{1}{n} \sum_{i = 1}^n E(X_i) \\
                &= \frac{1}{n} \sum_{i = 1}^n \mu \\
                &= \mu
   \end{align*}  $$

$$ \begin{align*} 
     Var(\bar{X}) &= Var(\frac{1}{n} \sum_{i = 1}^n X_i) \\
                  &= \frac{1}{n^2} Var(\sum_{i = 1}^n X_i) \\
                  &= \frac{1}{n^2} \sum_{i = 1}^n Var(X_i) \\
                  &= \frac{1}{n^2} \sum_{i = 1}^n \sigma^2 \\
                  &= \frac{\sigma^2}{n}
   \end{align*}  $$
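
These two results can be checked by simulation. The sketch below draws many samples of size $n$ and compares the average and variance of the resulting sample means with $\mu$ and $\sigma^2 / n$. The normal distribution, the parameter values, the seed, and the function name `simulate_sample_means` are all assumptions made for illustration; the derivation itself does not depend on the distribution being normal.

```python
import random
import statistics

def simulate_sample_means(n, reps, mu, sigma, seed=0):
    """Draw `reps` samples of size n from Normal(mu, sigma); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]

# Hypothetical settings chosen for illustration.
n, mu, sigma = 25, 5.0, 2.0
means = simulate_sample_means(n, reps=20_000, mu=mu, sigma=sigma)

print("average of sample means:", statistics.fmean(means), "vs mu =", mu)
print("variance of sample means:", statistics.pvariance(means),
      "vs sigma^2 / n =", sigma**2 / n)
```

The simulated values will not match the theory exactly; they should get closer as the number of replications grows.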