### Implication

This implies that if $Y_1$ and $Y_2$ are independent, then $\mbox{Cov}(Y_1, Y_2) = 0$. Note, however, that the converse is not true: a covariance of zero does not mean the variables are independent. See the next example.

### Example

Suppose that $Y_1$ and $Y_2$ are uniformly distributed over the triangle given by:  $-1 \leq Y_1 \leq 1 $ and $ 0 \leq Y_2 \leq 1 - |Y_1| $.

- Find the normalizing constant that makes the constant function on this region a valid PDF.
- Find $\mbox{Cov}(Y_1, Y_2) = E( Y_1 Y_2) - E( Y_1) E(Y_2) $.
- Find the coefficient of correlation for $Y_1$ and $Y_2$.
- Discuss:  Are $Y_1$ and $Y_2$ dependent or independent?

In [3]:
import sympy as sp

y1 = sp.Symbol('y1')
y2 = sp.Symbol('y2')
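# Area of the triangle: integrate y1 over [y2-1, 1-y2], then y2 over [0, 1].
# An area of 1 means the normalizing constant is 1, i.e. the PDF is f(y1, y2) = 1.
sp.integrate(1, (y1, y2-1, 1-y2), (y2, 0, 1) )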

1

In [4]:
Ey1 = sp.integrate( y1, (y1, y2-1, 1-y2), (y2, 0, 1) ) 
Ey1

0

In [5]:
Ey2 = sp.integrate( y2, (y1, y2-1, 1-y2), (y2, 0, 1) ) 
Ey2

1/3

In [6]:
Ey1y2 = sp.integrate( y1*y2, (y1, y2-1, 1-y2), (y2, 0, 1) ) 
Ey1y2

0
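The remaining parts of the example can be sketched along the same lines. The block below is a minimal, self-contained sketch rather than the definitive solution: the helper `expect` and the test point $(1/2, 1/4)$ are names and choices introduced here for illustration. It computes the covariance and the correlation coefficient, then compares the joint density with the product of the marginals at one point inside the triangle.

```python
import sympy as sp

y1, y2 = sp.symbols('y1 y2', real=True)

def expect(g):
    """E[g(Y1, Y2)] for the uniform density f = 1 on the triangle (hypothetical helper)."""
    return sp.integrate(g, (y1, y2 - 1, 1 - y2), (y2, 0, 1))

# Covariance and variances from their definitions
cov = expect(y1*y2) - expect(y1)*expect(y2)
var1 = expect(y1**2) - expect(y1)**2      # 1/6
var2 = expect(y2**2) - expect(y2)**2      # 1/18
rho = cov / sp.sqrt(var1*var2)            # 0

# Marginal densities (f1 is written for 0 <= y1 <= 1 only)
f2 = sp.integrate(1, (y1, y2 - 1, 1 - y2))   # 2 - 2*y2
f1_pos = sp.integrate(1, (y2, 0, 1 - y1))    # 1 - y1

# Inside the triangle the joint density is 1, but the product of marginals is not
point = {y1: sp.Rational(1, 2), y2: sp.Rational(1, 4)}
product_of_marginals = (f1_pos*f2).subs(point)   # 3/4
print(cov, rho, product_of_marginals)
```

Since the product of the marginals at that point is $3/4$ while the joint density is $1$, the variables are dependent even though the covariance and the correlation are both zero.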

## Correlation versus Dependence

We say that two variables are uncorrelated if $\mbox{Cov}(Y_1, Y_2) = 0$,

whereas they are independent if $P( Y_1 < y_1, Y_2 < y_2) = P(Y_1 < y_1) P(Y_2 < y_2)$ for all $y_1$ and $y_2$.

Independence implies uncorrelated, but uncorrelated does not imply independence.

# Expected Values, Variances, and Covariances of Linear Functions of Random Variables

We need a motivating example for what we are about to derive, so let's look ahead to what we want to do:

## Sampling

We are interested in understanding what happens when we have a random variable $Y$, sample it $n$ times to get $Y_1, Y_2, \dots, Y_{n}$, and then compute a single statistic from those results, say the mean.

$$ \bar{Y} = \frac{1}{n} Y_1 + \frac{1}{n} Y_2 + \dots + \frac{1}{n} Y_n $$

Written this way, we can think of the $Y_i$ as independent random variables, each with $E(Y_i) = \mu $ and $V(Y_i) = \sigma$ inherited from the original random variable $Y$. (Note that in this section $\sigma$ denotes the variance itself, not the standard deviation.)

**Question:** Find $ E(\bar{Y})$ and $V(\bar{Y})$. I.e. how do we expect the statistics computed from our sample to behave?



$$ E(\bar{Y}) = E( \frac{1}{n} Y_1 + \frac{1}{n} Y_2 + \dots + \frac{1}{n} Y_n ) = E(\frac{1}{n} Y_1) + E( \frac{1}{n} Y_2) + \dots + E( \frac{1}{n} Y_n ) = n \frac{\mu}{n} = \mu $$

*Hint:* we might want to use $$V(\bar{Y}) = E( \bar{Y}^2 ) - [E(\bar{Y})]^2$$

We need to find

$$ E(\bar{Y}^2 ) $$

$$\bar{Y}^2 = \sum_{i, j} \frac{1}{n^2} Y_i Y_j $$

When $i\neq j$, independence gives

$$ E(Y_i Y_j) = E(Y_i) E(Y_j) = \mu^2 $$

and when $i = j$

$$ E(Y_i^2) = V(Y_i) + [E(Y_i)]^2 = \sigma + \mu^2 $$

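The sum has $n(n-1)$ terms with $i \neq j$ and $n$ terms with $i = j$, so inserting these values we get:

$$ E(\bar{Y}^2) = \frac{1}{n^2} \left[ n(n-1) \mu^2 + n (\sigma + \mu^2) \right] = \mu^2 + \frac{1}{n} \sigma $$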

So

$$ V(\bar{Y}) = \mu^2 + \frac{\sigma}{n} - \mu^2 = \frac{\sigma}{n} $$
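As a sanity check, the result can be verified exactly on a small case. The sketch below assumes a fair six-sided die as the underlying variable $Y$ and a sample size of $n = 3$ (both are illustrative choices, not part of the derivation). It enumerates every possible sample and compares the mean and variance of $\bar{Y}$ with $\mu$ and $V(Y)/n$.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)   # assumed example: a fair six-sided die
n = 3                 # assumed sample size

# Mean and variance of a single roll
mu = Fraction(sum(faces), 6)
var = Fraction(sum(f*f for f in faces), 6) - mu**2

# Sample mean for each of the 6**n equally likely samples
means = [Fraction(sum(s), n) for s in product(faces, repeat=n)]
e_bar = sum(means) / len(means)
v_bar = sum(m*m for m in means) / len(means) - e_bar**2

print(e_bar, mu)        # both 7/2
print(v_bar, var / n)   # both 35/36
```

Exact fractions are used instead of floating point so that the two sides agree exactly rather than only approximately.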

## Linear Functions of Random Variables

More generally let the $Y_i$ and $X_j$ be (possibly dependent) random variables, and 

$$ U_1 = \sum_j a_j Y_j $$ and $$ U_2 = \sum_j b_j X_j$$ be linear functions of these random variables. Then

$$ E(U_1) = \sum_j a_j E(Y_j) $$

$$ V(U_1) = E( U_1^2 ) - [E(U_1)]^2 = \sum_j a_j^2 V(Y_j) + 2 \sum_{i<j} a_i a_j \mbox{Cov}(Y_i, Y_j) $$

and 

$$ \mbox{Cov}(U_1, U_2) = \sum_{i, j} a_i b_j \mbox{Cov}(Y_i, X_j) $$
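The variance formula can be checked symbolically on a pair of dependent variables. The sketch below assumes, purely for illustration, that $(Y_1, Y_2)$ is uniform on the triangle $0 \leq y_1 \leq y_2 \leq 1$ (density $2$); this density and the helper names `expect` and `cov` are not from the text above. It computes $V(a_1 Y_1 + a_2 Y_2)$ directly and compares it with the formula.

```python
import sympy as sp

y1, y2, a1, a2 = sp.symbols('y1 y2 a1 a2', real=True)

def expect(g):
    """E[g(Y1, Y2)] under the assumed density 2 on 0 <= y1 <= y2 <= 1."""
    return sp.integrate(2*g, (y1, 0, y2), (y2, 0, 1))

def cov(g, h):
    """Cov(g, h) = E[g h] - E[g] E[h]; cov(g, g) is the variance."""
    return expect(g*h) - expect(g)*expect(h)

u1 = a1*y1 + a2*y2

# Left-hand side: variance computed directly from the definition
direct = sp.expand(expect(u1**2) - expect(u1)**2)

# Right-hand side: sum of a_j^2 V(Y_j) plus twice the cross term
formula = sp.expand(a1**2*cov(y1, y1) + a2**2*cov(y2, y2)
                    + 2*a1*a2*cov(y1, y2))

print(cov(y1, y2))                     # 1/36, so the variables are correlated
print(sp.simplify(direct - formula))   # 0
```

Because the covariance here is nonzero, the cross term genuinely contributes, which is the case the general formula is meant to cover.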

# Conditional Expectations

Finally, there are some neat tricks we can play with conditional probabilities and expectations.

Given two random variables $Y_1$ and $Y_2$, we define the *conditional expectation* of $g(Y_1)$ given $Y_2=y_2$ to be

$$ E(g(Y_1) | Y_2=y_2) = \int g(y_1) f(y_1 | y_2 ) dy_1 $$

where $f(y_1 | y_2) $ is the conditional PDF of $Y_1$ given $Y_2 = y_2$.

## Expected Values 

Note that if we take the expected value of the conditional expectation, we get back the expected value of $Y_1$:

$$ E( E(Y_1 | Y_2) ) = E( Y_1 )$$
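A short derivation for the continuous case shows why this holds. The notation $f_2(y_2)$ for the marginal PDF of $Y_2$ and $f(y_1, y_2)$ for the joint PDF is assumed here; it relies on $f(y_1 | y_2) f_2(y_2) = f(y_1, y_2)$:

$$ E( E(Y_1 | Y_2) ) = \int \left[ \int y_1 f(y_1 | y_2) \, dy_1 \right] f_2(y_2) \, dy_2 = \int \int y_1 f(y_1, y_2) \, dy_1 \, dy_2 = E(Y_1) $$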

## Variances

More interestingly if we consider variances:

$$ V(Y_1) = E[ V(Y_1 | Y_2) ] + V[ E( Y_1 | Y_2 ) ] $$

where $$ V( Y_1 | Y_2) = E( Y_1^2 | Y_2) - [ E(Y_1 | Y_2 ) ]^2$$



### Example

It may not be clear why these conditional expectations and the related theorems are useful. They come up frequently in models where the parameters of the distribution are themselves unknown. Consider:

The viral load $Y$ of a person with COVID-19, in virus particulates per mg of saliva, follows an exponential distribution whose $\beta$ parameter is itself a uniformly distributed random variable on $(0, 200)$. I.e. the evidence is that the $\beta$ parameter changes from infection to infection.

For a given $\beta$ the expected value and variance of $Y$ are known: $$ E(Y | \beta) = \beta $$ and $$ V(Y | \beta) = \beta^2 $$

Find $E(Y)$ and $V(Y)$.
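One way to work this out is to apply the two theorems above directly, averaging the known conditional moments over the uniform distribution of $\beta$. The block below is a minimal sketch of that approach; the helper `expect_beta` and the variable names are introduced here for illustration.

```python
import sympy as sp

b = sp.Symbol('beta', positive=True)
f_beta = sp.Rational(1, 200)   # uniform density of beta on (0, 200)

def expect_beta(g):
    """Expected value of g(beta) over the uniform distribution (hypothetical helper)."""
    return sp.integrate(g*f_beta, (b, 0, 200))

cond_mean = b       # E(Y | beta), as given in the example
cond_var = b**2     # V(Y | beta), as given in the example

# E(Y) = E[ E(Y | beta) ]
EY = expect_beta(cond_mean)

# V(Y) = E[ V(Y | beta) ] + V[ E(Y | beta) ]
VY = expect_beta(cond_var) + (expect_beta(cond_mean**2) - expect_beta(cond_mean)**2)

print(EY)   # 100
print(VY)   # 50000/3
```

The first term of `VY` is the average spread within an infection and the second is the spread of the means across infections; here they come out to $40000/3$ and $10000/3$ respectively.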