## Expectations

Definition

If X is a random variable defined on a sample space S, then the expectation of X is

$$E[X] := \sum_{w\in S} X(w).Pr(w)$$

or, equivalently, grouping the outcomes by the value x that X takes, where $p_X$ is the probability mass function of X

$$E[X] := \sum_{x\in range(X)} xp_X(x)$$
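To see that the two definitions agree, here is a minimal sketch on a small sample space. The example (two fair coin tosses, with X counting heads) and the names `sample_space`, `prob`, `pmf` are illustrative assumptions, not part of the notes.

```python
from fractions import Fraction
from itertools import product

# Assumed example: two fair coin tosses, each outcome has probability 1/4.
sample_space = list(product("HT", repeat=2))
prob = {w: Fraction(1, 4) for w in sample_space}

def X(w):
    """Random variable: number of heads in outcome w."""
    return w.count("H")

# Definition 1: sum over outcomes w in S of X(w) * Pr(w).
e_by_outcome = sum(X(w) * prob[w] for w in sample_space)

# Definition 2: build the PMF p_X first, then sum x * p_X(x) over range(X).
pmf = {}
for w in sample_space:
    pmf[X(w)] = pmf.get(X(w), Fraction(0)) + prob[w]
e_by_value = sum(x * p for x, p in pmf.items())

print(e_by_outcome, e_by_value)
```

Both sums give the same value because the second one only regroups the terms of the first by the value of X.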

Linearity

$$E[\alpha X + \beta] = \alpha E[X] + \beta$$

$$E[X + Y + Z] = E[X] + E[Y] + E[Z]$$

Proof: Let $T = R_1 + R_2$. The proof follows by rearranging terms in the definition of expectation

$$\begin{align} E[T] & := \sum_{w\in S} T(w).Pr(w) \\
& = \sum_{w\in S} (R_1(w) + R_2(w)).Pr(w) \\
& = \sum_{w\in S} R_1(w)Pr(w) + \sum_{w\in S} R_2(w)Pr(w) \\
& = E[R_1] + E[R_2]
\end{align}$$

If X, Y are independent

- $E[XY] = E[X].E[Y]$
- $E[g(X)h(Y)] = E[g(X)].E[h(Y)]$
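The product rule, and the fact that it needs independence while linearity does not, can be checked exactly on a small case. This is a sketch assuming two independent fair dice; the helper `expect` is a hypothetical name introduced for illustration.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
joint_p = Fraction(1, 36)  # assumed: independent fair dice, each pair equally likely

def expect(f):
    """E[f(X, Y)] under the joint distribution of the two dice."""
    return sum(f(x, y) * joint_p for x, y in product(faces, faces))

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
e_xy = expect(lambda x, y: x * y)    # independent pair
e_sum = expect(lambda x, y: x + y)   # linearity holds regardless
e_xx = expect(lambda x, y: x * x)    # X paired with itself: not independent

print(e_xy == e_x * e_y)   # product rule holds for independent X, Y
print(e_sum == e_x + e_y)  # linearity
print(e_xx == e_x * e_x)   # fails: X is not independent of itself
```

The last comparison is false, which shows that $E[XY] = E[X].E[Y]$ cannot be used without independence.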

If X is a constant c, then X takes only one value, with probability 1, hence

$E[c] = c$


## Variances

The variance measures how far X spreads around its mean: for every possible value x, take its distance from the mean, square it, and weight it by its probability

$$\begin{align}
Var(X) & = E[(X - E[X])^2] \\
& = \sum_x (x - E[X])^2 p_X(x) \\
\end{align}$$

Next, derive a useful relation between expectation and variance that makes the variance easier to compute

$$\begin{align}
Var(X) & = E[(X - E[X])^2] \\
& = E[X^2 - 2X.E[X] + (E[X])^2] \\
& = E[X^2] - 2E[X.E[X]] + E[(E[X])^2] \\
\end{align}$$

For the second term, $E[X]$ is a number, so by the linearity property it can be factored out. For the third term, $(E[X])^2$ is a number, and the expectation of a number is the number itself

$$\begin{align} & = E[X^2] - 2E[X].E[X] + (E[X])^2 \\
Var(X) & = E[X^2] - (E[X])^2
\end{align}$$
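As a quick check that the shortcut agrees with the definition, the sketch below computes the variance both ways on a small PMF. The PMF values are an assumed example chosen only for illustration.

```python
from fractions import Fraction

# Assumed example PMF: X takes values 0, 1, 2.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

mean = sum(x * p for x, p in pmf.items())

# Definition: E[(X - E[X])^2]
var_by_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Shortcut: E[X^2] - (E[X])^2
second_moment = sum(x * x * p for x, p in pmf.items())
var_by_shortcut = second_moment - mean ** 2

print(var_by_definition, var_by_shortcut)
```

The shortcut needs only the first two moments, which is why it is usually the easier route by hand.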

Properties

$Var(aX) = a^2 Var(X)$

$Var(X + a) = Var(X)$

If X, Y are independent

$Var(X + Y) = Var(X) + Var(Y)$
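A short derivation of the sum rule, using the relation $Var(X) = E[X^2] - (E[X])^2$ and linearity of expectation, shows where independence is needed:

$$\begin{align}
Var(X + Y) & = E[(X + Y)^2] - (E[X + Y])^2 \\
& = E[X^2] + 2E[XY] + E[Y^2] - (E[X])^2 - 2E[X].E[Y] - (E[Y])^2 \\
& = Var(X) + Var(Y) + 2(E[XY] - E[X].E[Y]) \\
& = Var(X) + Var(Y)
\end{align}$$

The last step uses $E[XY] = E[X].E[Y]$, which holds when X and Y are independent.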




## Discrete Uniform Distribution

Example: a fair die, where each face 1 through 6 has probability 1/6
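The fair die can be worked out exactly as an instance of a discrete uniform variable. This is a minimal sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# Fair six-sided die: every face has probability 1/6.
faces = range(1, 7)
pmf = {x: Fraction(1, 6) for x in faces}

mean = sum(x * p for x, p in pmf.items())
second_moment = sum(x * x * p for x, p in pmf.items())
variance = second_moment - mean ** 2  # Var(X) = E[X^2] - (E[X])^2

print(mean, variance)  # 7/2 35/12
```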


## Continuous Uniform Distribution

If X is a uniform random variable, its probability density function is 0 everywhere except on a range [a,b], where it takes a constant value

The height is $\frac{1}{b-a}$, so that width times height has area 1

$$ f_X(x) = \begin{cases} \frac{1}{b-a} & a \leq x \leq b \\
0 & \text{otherwise}
\end{cases}$$

We can get the expectation from the definition, or by noting that the expectation is the center of gravity of the density, which is the midpoint of the interval [a,b]

$$E[X] = \int_a^b x \frac{1}{b-a}\ dx = \frac{x^2}{2(b-a)}|_a^b = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2}$$

This agrees with the midpoint of the interval [a,b]

$$E[X] = \frac{a + b}{2}$$

Variance

$$Var(X) = \int_a^b (x - \frac{a+b}{2})^2 \frac{1}{b-a}\ dx = \frac{(b-a)^2}{12}$$
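The integral can be evaluated with the substitution $u = x - \frac{a+b}{2}$, which centers the interval at 0. A sketch of the omitted steps:

$$\begin{align}
Var(X) & = \int_{-(b-a)/2}^{(b-a)/2} u^2 \frac{1}{b-a}\ du \\
& = \frac{u^3}{3(b-a)}\Big|_{-(b-a)/2}^{(b-a)/2} \\
& = \frac{2}{3(b-a)} . \frac{(b-a)^3}{8} \\
& = \frac{(b-a)^2}{12}
\end{align}$$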

More simply, we can use the Law of the Unconscious Statistician, which states that

$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\ dx$$

Here $g(X) = X^2$, $X \sim \text{Uniform}(0,1)$, and $E[X] = 1/2$

$$E[X^2] = \int_0^1 x^2 f_X(x)\ dx = \int_0^1 x^2 \ dx = \frac{1}{3}$$

because $f_X(x) = 1$ in interval 0 to 1

$$Var(X) = E[X^2] - (E[X])^2 = \frac13 - \frac14 = \frac{1}{12}$$
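The same calculation can be checked numerically. The sketch below approximates the integral with a midpoint Riemann sum; the function name `lotus_uniform` and the step count are assumptions made for illustration, not a standard API.

```python
def lotus_uniform(g, a, b, n=100_000):
    """Approximate E[g(X)] for X ~ Uniform(a, b) with a midpoint Riemann sum."""
    width = (b - a) / n
    density = 1.0 / (b - a)  # f_X(x) on [a, b]
    return sum(g(a + (i + 0.5) * width) * density * width for i in range(n))

# Uniform(0, 1): expect E[X] = 1/2, E[X^2] = 1/3, Var(X) = 1/12
e_x = lotus_uniform(lambda x: x, 0.0, 1.0)
e_x2 = lotus_uniform(lambda x: x * x, 0.0, 1.0)
var_x = e_x2 - e_x ** 2

# General interval, compared against (b - a)^2 / 12
a, b = 2.0, 5.0
var_ab = lotus_uniform(lambda x: x * x, a, b) - lotus_uniform(lambda x: x, a, b) ** 2

print(var_x, var_ab)
```

The results are approximations, so they should be compared with $\frac{1}{12}$ and $\frac{(b-a)^2}{12}$ up to a small tolerance rather than for exact equality.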


## Geometric Distribution

Lecture 6

$X:$ number of independent coin tosses until the first head, where each toss is a head with probability $p$

$$p_X(k) = (1-p)^{k-1}p, \quad k = 1, 2, ...$$

$$E[X] = \sum_{k=1}^\infty k.p_X(k) = \sum_{k=1}^\infty k(1-p)^{k-1}p$$

This sum can be evaluated with an algebraic trick, or by another route that uses the memorylessness of the coin tosses
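One common algebraic route, which may or may not be the trick intended here, differentiates the geometric series. For $|q| < 1$,

$$\sum_{k=0}^\infty q^k = \frac{1}{1-q} \quad \Rightarrow \quad \sum_{k=1}^\infty k q^{k-1} = \frac{1}{(1-q)^2}$$

Setting $q = 1 - p$ gives

$$E[X] = p \sum_{k=1}^\infty k(1-p)^{k-1} = p . \frac{1}{p^2} = \frac{1}{p}$$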

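Suppose the first two tosses are tails, and let $Y = X - 2$ count the tosses still needed after them. Because the coin tosses are independent, the two tails do not affect what happens afterwards, so conditioned on $X > 2$, Y has the same geometric distribution as X

$p_{X-2|X > 2}(k) = p_Y(k) = (1-p)^{k-1} p, \quad k \geq 1$

Alternatively, directly from the definition of conditional probability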


$$\begin{align*}
  p_{X-2|X>2}(k) &\equiv P(X-2 = k | X > 2)\\
           &= P(X = k + 2 | X > 2)\\
           &= \frac{P(X = k + 2, X > 2)}{P(X > 2)}\\
           &= \frac{P(X = k + 2)}{P(X > 2)}\\
           &= \frac{(1-p)^{k+2-1} p}{(1-p)^2}\\
           &= (1-p)^{k-1}p\\
           &= p_X(k)
 \end{align*}$$
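The memorylessness identity derived above can be checked exactly for a particular p. This is a sketch with an assumed value $p = 1/3$; `geom_pmf` is a hypothetical helper name.

```python
from fractions import Fraction

def geom_pmf(k, p):
    """p_X(k) = (1 - p)^(k - 1) * p for k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p

p = Fraction(1, 3)  # assumed value, any 0 < p < 1 works

# P(X > 2) = 1 - P(X = 1) - P(X = 2), which should equal (1 - p)^2
p_gt_2 = 1 - geom_pmf(1, p) - geom_pmf(2, p)

# Conditional PMF of X - 2 given X > 2, for the first few k
conditional = {k: geom_pmf(k + 2, p) / p_gt_2 for k in range(1, 11)}

print(all(conditional[k] == geom_pmf(k, p) for k in conditional))
```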


From the total expectation theorem

$E[X] = P(X = 1)E[X | X = 1] + P(X > 1)E[X|X>1]$

$P(X = 1)$ is the probability that the first toss is a head, which is p. Given $X = 1$, X is the constant 1, and the expectation of a constant is the constant itself, so $E[X | X = 1] = 1$

$P(X > 1) = 1 - p$ is the probability that the first toss is a tail. For the expectation given that the first toss fails, $E[X | X > 1] = E[X-1 | X-1 > 0] + 1 = E[X] + 1$. The +1 accounts for the one wasted toss, and because the process is memoryless, the expected number of remaining tosses is still E[X]

$$E[X] = p + (1-p).(E[X] + 1)$$

$$p.E[X] = p + 1 - p = 1$$

$$E[X] = \frac{1}{p}$$
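The result can be sanity-checked numerically by truncating the infinite sum. This is a sketch with an assumed $p = 0.25$ and an assumed cutoff of 1000 terms, after which the remaining tail is negligible; `geometric_mean_truncated` is a hypothetical helper name.

```python
def geometric_mean_truncated(p, n_terms=1000):
    """Partial sum of k * (1 - p)^(k - 1) * p for k = 1 .. n_terms."""
    return sum(k * (1 - p) ** (k - 1) * p for k in range(1, n_terms + 1))

p = 0.25  # assumed value for illustration
approx = geometric_mean_truncated(p)

# The value 1/p should also satisfy E[X] = p + (1 - p) * (E[X] + 1)
fixed_point_rhs = p + (1 - p) * (1 / p + 1)

print(approx, 1 / p, fixed_point_rhs)
```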
