# Expectation, Indicator Random Variables, Linearity

## More on Cumulative Distribution Functions

A CDF, $F(x) = P(X \le x)$, is a function of a real number $x$. In the discrete case it is built from a PMF, whose values must

* be non-negative
* add up to 1

In the following discrete case, it is easy to see how the probability mass function (PMF) relates to the CDF:

![title](images/L0901.png)

Therefore, you can compute any probability given a CDF. 

_Ex. Find $P(1 \lt X \le 3)$ using $F$._

\begin{align}
  & &P(X \le 1) + P(1 \lt X \le 3) &= P(X \le 3) \\
  & &\Rightarrow P(1 \lt X \le 3) &= F(3) - F(1)
\end{align}
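
As a quick check of $F(3) - F(1)$, here is a minimal sketch that builds a CDF from a PMF. The fair six-sided die and the names `pmf` and `cdf` are assumptions chosen for illustration; they are not the distribution in the figure.

```python
from fractions import Fraction

# Hypothetical PMF for illustration: a fair six-sided die.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def cdf(x):
    """F(x) = P(X <= x): add up the PMF over all support points <= x."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))

# P(1 < X <= 3) computed from the CDF ...
prob = cdf(3) - cdf(1)
# ... and directly from the PMF, as P(X=2) + P(X=3).
direct = pmf[2] + pmf[3]

print(prob, direct)
```

Both routes should agree, and swapping `cdf(1)` for `cdf(0)` would silently include $P(X=1)$, which is why $\lt$ versus $\le$ matters in the discrete case.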

Note that while we don't need to be so strict in the __continuous case__, for the discrete case you need to be careful about the $\lt$ and $\le$.


### Properties of CDF

A function $F$ is a CDF __iff__ the following three conditions are satisfied.
1. increasing in the weak sense (non-decreasing): $F(x_1) \le F(x_2)$ whenever $x_1 \le x_2$
1. right-continuous (function is continuous as you _approach a point from the right_)
1. $F(x) \rightarrow 0 \text{ as } x \rightarrow - \infty$, and $F(x) \rightarrow 1 \text{ as } x \rightarrow \infty$.


### Independence of Random Variables

$X, Y$ are independent r.v. if

\begin{align}
  \underbrace{P(X \le x, Y \le y)}_{\text{joint CDF}} &= P(X \le x) P(Y \le y) & &\text{ for all } x, y \text{ (in general, including the continuous case)} \\
  \\
  P(X=x, Y=y) &= P(X=x) P(Y=y) & &\text{ for all x, y in the discrete case}
\end{align}
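
The discrete condition can be checked by brute force on a small sample space. The sketch below assumes two fair dice as the example; `prob` and `factorizes` are hypothetical helper names introduced here.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two fair dice (an assumed example).
outcomes = list(product(range(1, 7), repeat=2))
p_each = Fraction(1, len(outcomes))

def prob(event):
    """Probability of the outcomes (a, b) for which event(a, b) is true."""
    return sum((p_each for a, b in outcomes if event(a, b)), Fraction(0))

def factorizes(f, g, f_vals, g_vals):
    """Check P(f=u, g=v) == P(f=u) P(g=v) for every pair (u, v)."""
    return all(
        prob(lambda a, b: f(a, b) == u and g(a, b) == v)
        == prob(lambda a, b: f(a, b) == u) * prob(lambda a, b: g(a, b) == v)
        for u in f_vals for v in g_vals
    )

first = lambda a, b: a       # value of the first die
second = lambda a, b: b      # value of the second die
total = lambda a, b: a + b   # sum of both dice

indep_dice = factorizes(first, second, range(1, 7), range(1, 7))
indep_total = factorizes(first, total, range(1, 7), range(2, 13))
print(indep_dice, indep_total)
```

The two dice pass the check, while the first die and the total do not: for instance, a first roll of 1 makes a total of 12 impossible.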

## Averages of Random Variables (mean, Expected Value)

A mean is... well, the _average of a sequence of values_.

\begin{align}
  1, 2, 3, 4, 5, 6 \rightarrow \frac{1+2+3+4+5+6}{6} = 3.5
\end{align}

In the case where there is repetition in the sequence

\begin{align}
  1,1,1,1,1,3,3,5 \rightarrow & \frac{1+1+1+1+1+3+3+5}{8} \\
  \\
  & \dots \text{ or } \dots \\
  \\
  & \frac{5}{8} ~~ 1 + \frac{2}{8} ~~ 3 + \frac{1}{8} ~~ 5 ~~~~ \text{ ... weighted average}
\end{align}

where the weights are the frequency (fraction) of the unique elements in the sequence, and these weights add up to 1.
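
The two ways of averaging can be compared directly. This is a small sketch using the sequence above; the variable names are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction

values = [1, 1, 1, 1, 1, 3, 3, 5]

# Plain average: add everything up and divide by the count.
plain = Fraction(sum(values), len(values))

# Weighted average: each distinct value times its relative frequency.
weights = {v: Fraction(c, len(values)) for v, c in Counter(values).items()}
weighted = sum(v * w for v, w in weights.items())

print(plain, weighted, sum(weights.values()))
```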

#### Average of a discrete r.v. X

\begin{align}
  \mathbb{E}(X) = \sum_{x} \underbrace{x}_{\text{value}} ~~ \underbrace{P(X=x)}_{\text{PMF}} ~~~~ \text{ ... summed over x with  } P(X=x) \gt 0
\end{align}

#### Average of  $X \sim Bern(p)$

\begin{align}
  \mathbb{E}(X) &= 1 ~~ P(X=1) + 0 ~~ P(X=0) \\
  &= p
\end{align}

#### Average of an Indicator Variable

\begin{align}
  X &=
  \begin{cases}
    1, &\text{ if A occurs} \\
    0, &\text{ otherwise }
  \end{cases} \\
  \\
  \therefore \mathbb{E}(X) &= P(A)
\end{align}

Notice how this lets us relate (bridge) the expected value $\mathbb{E}(X)$ with a probability $P(A)$.
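
A minimal sketch of the definition of $\mathbb{E}(X)$ and of the indicator bridge, assuming a fair die and the event $A$ = "the roll is even" as the example (both are assumptions made here, not part of the notes):

```python
from fractions import Fraction

def expectation(pmf):
    """E(X) = sum of x * P(X = x) over the support of X."""
    return sum(x * p for x, p in pmf.items())

# Assumed example: roll a fair die, A = "the roll is even".
die = {k: Fraction(1, 6) for k in range(1, 7)}
p_A = sum(p for k, p in die.items() if k % 2 == 0)

# PMF of the indicator of A: value 1 with probability P(A), else 0.
indicator_pmf = {1: p_A, 0: 1 - p_A}

print(expectation(die))            # 7/2
print(expectation(indicator_pmf))  # 1/2
```

The indicator is just a $Bern(p)$ random variable with $p = P(A)$, so its expected value equals `p_A`.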

#### Average of $X \sim Bin(n,p)$

There is a hard way to do this, and an easy way.

First the hard way:

\begin{align}
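  \mathbb{E}(X) &= \sum_{k=0}^{n} k \binom{n}{k} p^{k} q^{n-k} ~~~~ \text{ ... where } q = 1-p \\
  &= \sum_{k=1}^{n} n \binom{n-1}{k-1} p^{k} q^{n-k} ~~~~ \text{ ... since } k \binom{n}{k} = n \binom{n-1}{k-1} \text{, and the } k=0 \text{ term is 0} \\
  &= n p \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1} q^{n-k} \\
  &= n p \sum_{j=0}^{n-1} \binom{n-1}{j} p^{j} q^{n-1-j} ~~~~ \text{ let } j = k-1 \rightarrow k = j + 1\\
  &= n p ~~~~ \text{ ... by the binomial theorem, since the sum is } (p+q)^{n-1} = 1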
\end{align}
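
The sum can also be evaluated exactly for specific numbers as a sanity check. In this sketch, `binomial_mean` is a hypothetical helper, and $n = 10$, $p = 3/10$ are arbitrary values picked for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_mean(n, p):
    """Evaluate sum_{k=0}^{n} k * C(n, k) * p^k * q^(n-k) term by term."""
    q = 1 - p
    return sum(k * comb(n, k) * p**k * q**(n - k) for k in range(n + 1))

n, p = 10, Fraction(3, 10)
mean = binomial_mean(n, p)
print(mean, n * p)
```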

Now, what about the _easy way_?

## Linearity of Expected Values

Linearity is this:

\begin{align}
  \mathbb{E}(X+Y) &= \mathbb{E}(X) + \mathbb{E}(Y) \\
  \\
  &\text{... and ...} \\
  \\
  \mathbb{E}(cX) &= c \mathbb{E}(X) \\
  \\
  &\text{... even if X and Y are dependent!}
\end{align}
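
To see that dependence does not matter, here is a minimal sketch with a deliberately dependent pair: $X$ is a fair die roll and $Y = X^2$. This pair and the helper `expect` are assumptions made for illustration.

```python
from fractions import Fraction

# Sample space: one fair die roll, each face with probability 1/6.
die = {k: Fraction(1, 6) for k in range(1, 7)}

def expect(g):
    """Average of g(outcome) over the die's outcomes, weighted by probability."""
    return sum(g(x) * p for x, p in die.items())

e_x = expect(lambda x: x)
e_y = expect(lambda x: x**2)          # Y = X^2 is completely determined by X
e_sum = expect(lambda x: x + x**2)    # E(X + Y)
e_prod = expect(lambda x: x * x**2)   # E(XY), shown for contrast

print(e_sum == e_x + e_y)    # additivity holds despite dependence
print(e_prod == e_x * e_y)   # the product rule does not
```

Additivity holds here even though $Y$ is a function of $X$; the analogous product rule, which is not part of linearity, fails for this pair.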


#### Average of $X \sim Bin(n,p)$ using Linearity

And so the easy way to calculate the average of a binomial r.v. is

\begin{align}
  \mathbb{E}(X) = np 
\end{align}

... since $X = X_1 + X_2 + \dots + X_n$ where each $X_j \sim Bern(p)$ has $\mathbb{E}(X_j)=p$, so by linearity $\mathbb{E}(X) = p + p + \dots + p = np$.
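
The decomposition into Bernoulli trials can be verified by enumerating every outcome of a few trials. This sketch assumes $n = 4$ independent trials with $p = 1/3$, values picked only for illustration.

```python
from fractions import Fraction
from itertools import product

n, p = 4, Fraction(1, 3)
q = 1 - p

mean = Fraction(0)               # accumulates E(X), X = number of successes
per_trial = [Fraction(0)] * n    # accumulates E(X_j) for each trial j

# Enumerate all 2^n success/failure patterns of n independent trials.
for bits in product((0, 1), repeat=n):
    successes = sum(bits)
    prob = p**successes * q**(n - successes)
    mean += successes * prob
    for j, b in enumerate(bits):
        per_trial[j] += b * prob

print(per_trial, mean)
```

Each entry of `per_trial` should come out as $p$, and their total should match both `mean` and $np$.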


#### Average of  Hypergeometric r.v.

Ex. In a 5-card hand, let $X$ be the number of aces, and let $X_j$ be the indicator that the $j^{th}$ card is an ace. The $X_j$ are dependent, but linearity still applies.

\begin{align}
  \mathbb{E}(X) &= \mathbb{E}(X_1 + X_2 + X_3 + X_4 + X_5) \\
  &= \mathbb{E}(X_1) + \mathbb{E}(X_2) + \mathbb{E}(X_3) + \mathbb{E}(X_4) + \mathbb{E}(X_5) ~~~~ \text{ ... by linearity } \\
  &= 5 ~~ \mathbb{E}(X_1) ~~~~ \text{ ... by symmetry } \\
  &= 5 ~~ P(1^{st} \text{ card is ace}) \\
  &= 5 \cdot \frac{4}{52} = \frac{5}{13} ~~~~ \blacksquare
\end{align}
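
For comparison, the same number falls out of the direct sum over the Hypergeometric PMF. This is a sketch; `expected_aces` is a hypothetical helper name, and its defaults encode a standard 52-card deck with 4 aces.

```python
from fractions import Fraction
from math import comb

def expected_aces(hand=5, aces=4, deck=52):
    """Sum k * P(X = k), where P(X = k) = C(aces, k) C(deck-aces, hand-k) / C(deck, hand)."""
    total = comb(deck, hand)
    return sum(
        Fraction(k * comb(aces, k) * comb(deck - aces, hand - k), total)
        for k in range(min(hand, aces) + 1)
    )

print(expected_aces())
```

The indicator argument reaches the same value without ever writing down the PMF.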

#### Average of $Geom(p)$

Consider the $Geom$ distribution, comprising independent $Bern(p)$ trials where we count the number of failures before first success. 

Let $X \sim Geom(p)$.

The PMF is

\begin{align}
  P(X=k) &= q^k p \text{, where you have k failures before seeing a success, } k \in \{0, 1, 2, \dots \}
\end{align}

Is this a valid PMF?

\begin{align}
  \sum_{k=0}^{\infty} p q^k &= p \sum_{k=0}^{\infty} q^k \\
  &= p ~~ \frac{1}{1-q} ~~~~ \text{ ... by the geometric series, valid for } |q| < 1 \\
  &= \frac{p}{p} \\
  &= 1
\end{align}
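
Numerically, the partial sums of the PMF approach 1. The sketch below assumes $p = 1/4$ for illustration; `partial_sum` is a hypothetical helper.

```python
from fractions import Fraction

p = Fraction(1, 4)
q = 1 - p

def partial_sum(n_terms):
    """Sum of p * q^k for k = 0, 1, ..., n_terms - 1."""
    return sum(p * q**k for k in range(n_terms))

# The finite geometric sum has the closed form 1 - q^n, which tends to 1.
for n_terms in (1, 5, 50):
    print(n_terms, float(partial_sum(n_terms)))
```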

So, the hard way to calculate the expected value of a $Geom(p)$ is

\begin{align}
  \mathbb{E}(X) &= \sum_{k=0}^{\infty} k p q^k \\
  &= p \sum_{k=0}^{\infty} k q^k \\
  \\
  \\
  \text{ now ... } \sum_{k=0}^{\infty} q^k &=  \frac{1}{1-q} ~~~~ \text{ ... by the geometric series, valid for } |q| < 1 \\
  \sum_{k=0}^{\infty} k q^{k-1} &= \frac{1}{(1-q)^2} ~~~~ \text{ ... by differentiating both sides with respect to } q \\
  \sum_{k=0}^{\infty} k q^{k} &= \frac{q}{(1-q)^2} ~~~~ \text{ ... by multiplying both sides by } q \\
  \\
  \\
  \text{ and returning, we have ... } \mathbb{E}(X) &= p ~~ \frac{q}{(1-q)^2} \\
  &= p ~~ \frac{q}{p^2} \\
  &= \frac{q}{p}~~~~ \blacksquare
\end{align}
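
As a rough numerical check of $q/p$, the series can be truncated after many terms. This sketch assumes $p = 0.25$ and a cutoff of 2000 terms, both arbitrary choices made here.

```python
p = 0.25
q = 1 - p

# Truncated version of sum_{k >= 0} k * p * q^k; the tail is negligible here.
approx = sum(k * p * q**k for k in range(2000))
exact = q / p

print(approx, exact)
```

The truncated sum and $q/p$ should agree to within floating-point error.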