# 8. Conditional Expectations, Moments, and Moment-Generating Functions (MGFs)

<hr>

Conditional expectation is the expected value of a random variable given that another random variable takes a particular value (or, more generally, given that some event has occurred).

The conditional expectation of a random variable $X$ given $Y=y$ is denoted $E(X \mid Y=y)$. It is the mean of $X$ computed under the conditional distribution of $X$ given $Y=y$. Viewed as a function of $Y$, $E(X \mid Y)$ is itself a random variable.

\begin{align}
E(X \mid Y=y) &= \sum_{\forall{x}} x \, P(X=x \mid Y=y) & \text{(discrete)} \\
&= \sum_{\forall{x}} x \, \frac{P(x,y)}{P_Y(y)} \\
\\
E(X \mid Y=y) &= \int_{\forall{x}} x \, f_{X \mid Y} (x \mid y) \, dx & \text{(continuous)} \\
&= \int_{\forall{x}} x \, \frac{f(x,y)}{f_Y(y)} \, dx
\end{align}

Both forms require $P_Y(y) > 0$ (discrete) or $f_Y(y) > 0$ (continuous).
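As a minimal sketch of the discrete formula, the snippet below computes $E(X \mid Y=y)$ from a small joint pmf. The table `joint_pmf` and the helper names `marginal_y` and `conditional_expectation` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X=x, Y=y), stored as {(x, y): probability}.
# The numbers are invented for illustration; they sum to 1.
joint_pmf = {
    (0, 0): Fraction(1, 8), (1, 0): Fraction(1, 8), (2, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4), (1, 1): Fraction(1, 8), (2, 1): Fraction(1, 8),
}


def marginal_y(pmf, y):
    """P_Y(y): sum the joint pmf over every x."""
    return sum(p for (_, y_val), p in pmf.items() if y_val == y)


def conditional_expectation(pmf, y):
    """E(X | Y=y) = sum_x x * P(x, y) / P_Y(y)."""
    p_y = marginal_y(pmf, y)
    if p_y == 0:
        raise ValueError("E(X | Y=y) is undefined when P(Y=y) = 0")
    return sum(x * p for (x, y_val), p in pmf.items() if y_val == y) / p_y


print(conditional_expectation(joint_pmf, 0))  # 5/4
print(conditional_expectation(joint_pmf, 1))  # 3/4
```

Exact fractions are used so the results can be compared with a hand calculation without rounding noise.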

## 8.1 Computing Expectations using Conditioning

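The law of total expectation states that averaging the conditional expectation of $X$ over the distribution of $Y$ recovers the unconditional mean:

$$E[X] = E_Y [E_X [X \mid Y=y ]]$$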

*Proof.* (discrete case)

\begin{align}
E_Y [ E_X [X \mid Y=y ]] &= \sum_{\forall{y}} E[X \mid Y=y] \, P(Y=y) \\
&= \sum_{\forall{y}} \sum_{\forall{x}} x \, P(X=x \mid Y=y) \, P(Y=y) \\
&= \sum_{\forall{y}} \sum_{\forall{x}} x \, \frac{P(x,y)}{P_Y(y)} \, P_Y(y) \\
&= \sum_{\forall{y}} \sum_{\forall{x}} x \, P(x,y) \\
&= \sum_{\forall{x}} x \sum_{\forall{y}} P(x,y) \\
&= \sum_{\forall{x}} x \, P_X(x) \\
&= E[X]
\end{align}
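The identity can also be checked numerically. The sketch below compares the direct computation of $E[X]$ with the conditioning route $\sum_y E[X \mid Y=y] \, P(Y=y)$ on a hypothetical joint pmf; the table and function names are illustrative assumptions.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint pmf P(X=x, Y=y), chosen only for illustration.
joint_pmf = {
    (0, 0): Fraction(1, 8), (1, 0): Fraction(1, 8), (2, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4), (1, 1): Fraction(1, 8), (2, 1): Fraction(1, 8),
}


def expectation_direct(pmf):
    """E[X] = sum_x sum_y x * P(x, y)."""
    return sum(x * p for (x, _), p in pmf.items())


def expectation_by_conditioning(pmf):
    """E[X] = sum_y E[X | Y=y] * P(Y=y)."""
    p_y = defaultdict(Fraction)     # marginal P_Y(y)
    x_mass = defaultdict(Fraction)  # sum_x x * P(x, y), one entry per y
    for (x, y), p in pmf.items():
        p_y[y] += p
        x_mass[y] += x * p
    # x_mass[y] / p_y[y] is E[X | Y=y]; weight it by P(Y=y) and add up.
    return sum((x_mass[y] / p_y[y]) * p_y[y] for y in p_y)


print(expectation_direct(joint_pmf))           # 1
print(expectation_by_conditioning(joint_pmf))  # 1
```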

If $X$ and $Y$ are independent, then $P(X=x \mid Y=y) = P(X=x)$ for every $y$, so conditioning on $Y$ does not change the mean:

$$E[X \mid Y] = E[X]$$

## 8.2 Moments
<hr>

In statistics, moments are quantitative measures of the shape of a distribution. For each positive integer $n$, the $n$th moment (also called the $n$th raw moment, or moment about the origin) of a random variable $X$ is defined as:

$$\mu'_{n} = E[X^n]$$

The $n$th central moment of $X$, where $\mu = E[X]$, is defined as:

$$\mu_n = E[(X-\mu)^n]$$

- **First Moment (Mean):** The first raw moment, $\mu'_1 = E[X]$, is the expected value or mean of the distribution.
- **Second Central Moment (Variance):** The second central moment, $\mu_2 = E[(X-\mu)^2] = E[X^2] - \mu^2$, is the variance.
- **Higher Moments:** The third central moment, standardized by $\sigma^3$, measures skewness (asymmetry), and the fourth central moment, standardized by $\sigma^4$, measures kurtosis (tailedness). Higher moments are used less frequently but can describe more complex aspects of the distribution's shape.

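For a discrete random variable with pmf $p(x)$, the $n$th moment is computed as shown here; for a continuous random variable with density $f(x)$, the sum is replaced by $\int x^n f(x) \, dx$.

$$E[X^n] = \sum_{\forall{x}} x^n p(x)$$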
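To make the definitions concrete, here is a small sketch that computes raw and central moments of a discrete distribution. The fair six-sided die and the helper names `raw_moment` and `central_moment` are illustrative assumptions, not part of the notes.

```python
from fractions import Fraction


def raw_moment(pmf, n):
    """n-th raw moment E[X^n] = sum_x x^n * p(x)."""
    return sum(x**n * p for x, p in pmf.items())


def central_moment(pmf, n):
    """n-th central moment E[(X - mu)^n], with mu = E[X]."""
    mu = raw_moment(pmf, 1)
    return sum((x - mu) ** n * p for x, p in pmf.items())


# Fair six-sided die: p(x) = 1/6 for x = 1..6 (illustrative choice).
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(raw_moment(die, 1))      # 7/2   (mean)
print(raw_moment(die, 2))      # 91/6
print(central_moment(die, 2))  # 35/12 (variance)
print(central_moment(die, 3))  # 0     (symmetric distribution)
```

The variance agrees with $E[X^2] - \mu^2 = 91/6 - 49/4 = 35/12$.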

## 8.3 Moment-Generating Functions (MGFs)
<hr>

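Let $X$ be a random variable with density (or mass function) $f(x)$. The moment-generating function (MGF) of $X$, denoted $M_X(t)$, is the expected value of $e^{tX}$, defined for every real $t$ at which this expectation is finite. For a continuous $X$, it is the two-sided **Laplace transform** of $f(x)$ evaluated at $-t$.

$$M_X(t) = E[e^{tX}] = \begin{cases} \displaystyle\sum_{\forall{x}} e^{tx} f(x) & \text{(discrete)} \\ \displaystyle\int_{\forall{x}} e^{tx} f(x) \, dx & \text{(continuous)} \end{cases}$$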

**Theorem:** If $X$ is a random variable whose MGF $M_X(t)$ is finite in an open interval around $t=0$, then for every positive integer $n$,

$$ E[X^n] = \left. \frac{d^n}{dt^n} M_X(t) \right|_{t=0} $$

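*Proof.* (continuous case, assuming differentiation and integration can be interchanged)

$$\frac{d}{dt} M_X(t) = \frac{d}{dt} \int_{-\infty}^{\infty} e^{tx} f(x) \, dx = \int_{-\infty}^{\infty} xe^{tx} f(x) \, dx$$

Evaluating at $t=0$:

$$\left. \frac{d}{dt} M_X(t) \right|_{t=0} = \int_{-\infty}^{\infty} x f(x) \, dx = E[X]$$

Each further differentiation brings down another factor of $x$, so

$$\left. \frac{d^n}{dt^n} M_X(t) \right|_{t=0} = \left. \int_{-\infty}^{\infty} x^n e^{tx} f(x) \, dx \right|_{t=0} = \int_{-\infty}^{\infty} x^n f(x) \, dx = E[X^n]$$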
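The theorem can be illustrated numerically. The sketch below approximates the first two derivatives of an MGF at $t=0$ with central finite differences and compares them with moments computed directly from the pmf. The fair die, the step size `h`, and the function names are assumptions made for the example; finite differences are used only as a simple stand-in for exact differentiation.

```python
import math

# Fair six-sided die: P(X=x) = 1/6 for x = 1..6 (illustrative choice).
FACES = range(1, 7)


def mgf(t):
    """M_X(t) = E[e^{tX}] = sum_x e^{tx} p(x)."""
    return sum(math.exp(t * x) / 6 for x in FACES)


def first_derivative_at_zero(f, h=1e-3):
    """Central-difference approximation of f'(0)."""
    return (f(h) - f(-h)) / (2 * h)


def second_derivative_at_zero(f, h=1e-3):
    """Central-difference approximation of f''(0)."""
    return (f(h) - 2 * f(0.0) + f(-h)) / h**2


mean_from_mgf = first_derivative_at_zero(mgf)      # approximates E[X]
second_from_mgf = second_derivative_at_zero(mgf)   # approximates E[X^2]

mean_direct = sum(x / 6 for x in FACES)            # E[X] from the pmf
second_direct = sum(x**2 / 6 for x in FACES)       # E[X^2] from the pmf

print(mean_from_mgf, mean_direct)
print(second_from_mgf, second_direct)
```

The two columns should agree up to the finite-difference error, which shrinks as `h` decreases until floating-point rounding takes over.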

## 8.4 MGFs of Random Variables
<hr>

### 8.4.1 MGF of Binomial
If $X \sim \text{Binomial}(n,p)$, then $p(x) = \binom{n}{x}p^x (1-p)^{n-x}$ for $x = 0, 1, \dots, n$. Therefore,

\begin{align}
M_X(t) = E[e^{tX}] &= \sum_{x=0}^n e^{tx} p(x) \\
&= \sum_{x=0}^n e^{tx} \binom{n}{x}p^x (1-p)^{n-x} \\
&= \sum_{x=0}^n \binom{n}{x} (e^t p)^x (1-p)^{n-x}
\end{align}

By the binomial theorem,

$$(a+b)^n = \sum_{i=0}^n \binom{n}{i}a^i b^{n-i}$$

so, taking $a = e^t p$ and $b = 1-p$, and writing $q = 1-p$:

<hr>

$$M_X(t) = \sum_{x=0}^n \binom{n}{x} (e^t p)^x (1-p)^{n-x} = (e^t p + 1-p)^n = (pe^t + q)^n$$

<hr>

**Verify** that $E(X)=np$ and $\text{Var}(X)=np(1-p)$.

The first moment of $X$ is $E[X]$, and the second central moment is $\text{Var}(X) = E[X^2] - (E[X])^2$, so both follow from the first two derivatives of $M_X(t)$ at $t=0$.

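$$E[X] = \left. \frac{d}{dt} (pe^t + q)^n \right|_{t=0} = \left. npe^t (pe^t + q)^{n-1} \right|_{t=0} = np$$

using $p + q = 1$. Differentiating once more with the product rule:

$$E[X^2] = \left. \frac{d^2}{dt^2} (pe^t + q)^n \right|_{t=0} = \left. \left[ n(n-1)p^2 e^{2t} (pe^t + q)^{n-2} + npe^t (pe^t + q)^{n-1} \right] \right|_{t=0} = n(n-1)p^2 + np = np(np+q)$$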

Therefore,

$$\text{Var}(X) = E[X^2] - \mu^2 = np(np+q) - (np)^2 = npq$$
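As a numerical sanity check, the sketch below compares the MGF computed from the definition with the closed form $(pe^t + q)^n$, and then recovers the mean and variance from finite-difference derivatives at $t=0$. The parameter values `n = 10`, `p = 0.3` and the step `h` are hypothetical choices for illustration.

```python
import math


def binomial_mgf_sum(t, n, p):
    """M_X(t) from the definition: sum_x e^{tx} C(n, x) p^x (1-p)^(n-x)."""
    return sum(
        math.exp(t * x) * math.comb(n, x) * p**x * (1 - p) ** (n - x)
        for x in range(n + 1)
    )


def binomial_mgf_closed(t, n, p):
    """Closed form (p e^t + q)^n with q = 1 - p."""
    return (p * math.exp(t) + (1 - p)) ** n


n, p = 10, 0.3  # hypothetical parameters, chosen only for illustration
q = 1 - p
h = 1e-3        # step size for the finite differences

# The definition and the closed form should agree at any t.
for t in (-1.0, 0.0, 0.5, 1.0):
    assert math.isclose(binomial_mgf_sum(t, n, p), binomial_mgf_closed(t, n, p))

m_plus = binomial_mgf_closed(h, n, p)
m_zero = binomial_mgf_closed(0.0, n, p)
m_minus = binomial_mgf_closed(-h, n, p)

mean = (m_plus - m_minus) / (2 * h)                     # approximates E[X]
second_moment = (m_plus - 2 * m_zero + m_minus) / h**2  # approximates E[X^2]
variance = second_moment - mean**2

print(mean, n * p)          # both close to np
print(variance, n * p * q)  # both close to npq
```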

### 8.4.2 MGF of Geometric
$$X \sim \text{Geometric}(p)$$
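The notes give only the distribution here, so what follows is a sketch of the usual derivation under an assumed parameterization: $X$ counts the number of trials up to and including the first success, so $p(x) = q^{x-1}p$ for $x = 1, 2, \dots$ with $q = 1-p$. The result is a geometric series:

\begin{align}
M_X(t) &= \sum_{x=1}^{\infty} e^{tx} q^{x-1} p \\
&= pe^t \sum_{x=1}^{\infty} (qe^t)^{x-1} \\
&= \frac{pe^t}{1 - qe^t}, \qquad qe^t < 1 \iff t < -\ln q
\end{align}

Under the other common convention, where $X$ counts failures before the first success ($x = 0, 1, \dots$), the same steps give $p / (1 - qe^t)$ instead.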

### 8.4.3 MGF of Poisson
If $X \sim \text{Poisson}(\lambda)$, then $p(x) = e^{-\lambda} \frac{\lambda^x}{x!}$ for $x = 0, 1, 2, \dots$. Therefore,

\begin{align}
M_X(t) = E[e^{tX}] &= \sum_{x=0}^{\infty} e^{tx} p(x) \\
&= \sum_{x=0}^{\infty} e^{tx} e^{-\lambda} \frac{\lambda^x}{x!} \\
&= e^{-\lambda} \sum_{x=0}^{\infty} \frac{(e^t\lambda)^x}{x!}
\end{align}

Using the Taylor series of the exponential function,

$$\sum_{n=0}^\infty \frac{z^n}{n!} = e^z$$

with $z = e^t \lambda$:

<hr>

$$M_X(t) = e^{-\lambda} \sum_{x} \frac{(e^t \lambda)^x}{x!} = e^{-\lambda} e^{e^t \lambda} = e^{\lambda(e^t-1)}$$

<hr>

**Verify** that $E(X)=\lambda$ and $\text{Var}(X)=\lambda$.

The first moment of $X$ is $E[X]$, and the second central moment is $\text{Var}(X) = E[X^2] - (E[X])^2$, so both follow from the first two derivatives of $M_X(t)$ at $t=0$.

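$$E[X] = \left. \frac{d}{dt} e^{\lambda(e^t-1)} \right|_{t=0} = \left. \lambda e^t e^{\lambda(e^t-1)} \right|_{t=0} = \lambda$$

$$E[X^2] = \left. \frac{d^2}{dt^2} e^{\lambda(e^t-1)} \right|_{t=0} = \left. \left( \lambda e^t + \lambda^2 e^{2t} \right) e^{\lambda(e^t-1)} \right|_{t=0} = \lambda^2 + \lambda$$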

Therefore,

$$\text{Var}(X) = E[X^2] - \mu^2 = \lambda^2 + \lambda - \lambda^2 = \lambda$$
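The Poisson results can be checked numerically by truncating the infinite sums. The sketch below compares the truncated series for $M_X(t)$ with the closed form $e^{\lambda(e^t-1)}$ and computes the mean and variance directly from the pmf. The rate `lam = 2.5`, the truncation at 100 terms, and the function names are assumptions made for the example.

```python
import math


def poisson_pmf_terms(lam, n_terms=100):
    """Yield (x, P(X=x)) for x = 0..n_terms-1 via p(x+1) = p(x) * lam / (x+1)."""
    prob = math.exp(-lam)  # p(0) = e^{-lam}
    for x in range(n_terms):
        yield x, prob
        prob *= lam / (x + 1)


def poisson_mgf_sum(t, lam, n_terms=100):
    """Truncated version of sum_x e^{tx} p(x)."""
    return sum(math.exp(t * x) * p for x, p in poisson_pmf_terms(lam, n_terms))


def poisson_mgf_closed(t, lam):
    """Closed form exp(lam * (e^t - 1))."""
    return math.exp(lam * (math.exp(t) - 1))


lam = 2.5  # hypothetical rate, chosen only for illustration

mean = sum(x * p for x, p in poisson_pmf_terms(lam))
second_moment = sum(x**2 * p for x, p in poisson_pmf_terms(lam))
variance = second_moment - mean**2

print(poisson_mgf_sum(0.5, lam), poisson_mgf_closed(0.5, lam))
print(mean, variance)  # both close to lam
```

For a small rate such as this one, the terms beyond $x = 100$ are far below floating-point precision, so the truncation does not affect the comparison.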

### 8.4.4 MGF of Uniform
$$X \sim \text{Uniform}(a, b)$$
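The notes leave this case blank. As a sketch, assuming the continuous uniform distribution on $[a, b]$ with density $f(x) = \frac{1}{b-a}$ for $a \le x \le b$, the MGF follows from a single integration:

\begin{align}
M_X(t) &= \int_a^b e^{tx} \frac{1}{b-a} \, dx = \frac{e^{tb} - e^{ta}}{t(b-a)}, & t \neq 0 \\
M_X(0) &= 1
\end{align}

The value at $t = 0$ is stated separately because the first expression is undefined there; it equals the limit of that expression as $t \to 0$.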

### 8.4.5 MGF of Normal
$$X \sim \text{Normal}(\mu, \sigma^2)$$
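The derivation is not given in the notes; the following is a sketch of the standard argument by completing the square, with density $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}$. The exponent of the integrand can be rewritten as

$$tx - \frac{(x-\mu)^2}{2\sigma^2} = -\frac{\left(x - (\mu + \sigma^2 t)\right)^2}{2\sigma^2} + \mu t + \frac{\sigma^2 t^2}{2}$$

so that

\begin{align}
M_X(t) &= \int_{-\infty}^{\infty} e^{tx} \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)} \, dx \\
&= e^{\mu t + \sigma^2 t^2 / 2} \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} e^{-\left(x - (\mu + \sigma^2 t)\right)^2 / (2\sigma^2)} \, dx \\
&= e^{\mu t + \sigma^2 t^2 / 2}
\end{align}

The remaining integral equals 1 because its integrand is the density of a $\text{Normal}(\mu + \sigma^2 t, \sigma^2)$ random variable.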

### 8.4.6 MGF of Exponential
$$X \sim \text{Exponential}(\lambda)$$
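The notes do not derive this case. As a sketch, assuming $\lambda$ is the rate parameter, so that $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$:

\begin{align}
M_X(t) &= \int_0^{\infty} e^{tx} \lambda e^{-\lambda x} \, dx \\
&= \lambda \int_0^{\infty} e^{-(\lambda - t)x} \, dx \\
&= \frac{\lambda}{\lambda - t}, \qquad t < \lambda
\end{align}

The integral diverges for $t \ge \lambda$, so the MGF exists only on $t < \lambda$.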

### 8.4.7 MGF of Gamma
$$X \sim \text{Gamma}(\lambda)$$
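The notes list a single parameter $\lambda$ and give no derivation. The sketch below therefore rests on an assumption that is not in the notes: a shape parameter $\alpha > 0$ is introduced, $\lambda$ is treated as the rate, and the density is $f(x) = \frac{\lambda^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\lambda x}$ for $x > 0$.

\begin{align}
M_X(t) &= \int_0^{\infty} e^{tx} \frac{\lambda^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\lambda x} \, dx \\
&= \frac{\lambda^\alpha}{\Gamma(\alpha)} \int_0^{\infty} x^{\alpha-1} e^{-(\lambda - t)x} \, dx \\
&= \frac{\lambda^\alpha}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha)}{(\lambda - t)^\alpha} = \left( \frac{\lambda}{\lambda - t} \right)^\alpha, \qquad t < \lambda
\end{align}

With $\alpha = 1$ this reduces to the MGF of an exponential random variable with rate $\lambda$.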

### 8.4.8 MGF of Chi-Square
$$X \sim \chi^2(\nu)$$
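No derivation is given in the notes. As a sketch, assuming $\nu$ denotes the degrees of freedom, so that $f(x) = \frac{1}{2^{\nu/2} \Gamma(\nu/2)} x^{\nu/2 - 1} e^{-x/2}$ for $x > 0$:

\begin{align}
M_X(t) &= \frac{1}{2^{\nu/2} \Gamma(\nu/2)} \int_0^{\infty} x^{\nu/2 - 1} e^{-(1/2 - t)x} \, dx \\
&= \frac{1}{2^{\nu/2} \Gamma(\nu/2)} \cdot \frac{\Gamma(\nu/2)}{(1/2 - t)^{\nu/2}} \\
&= (1 - 2t)^{-\nu/2}, \qquad t < \tfrac{1}{2}
\end{align}

The last step uses $2^{\nu/2} (1/2 - t)^{\nu/2} = (1 - 2t)^{\nu/2}$.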

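### 8.4.9 MGF of Beta
**No elementary closed form.** The MGF of a Beta random variable exists for every $t$, because its support $[0, 1]$ is bounded, but it cannot be written in terms of elementary functions. It is usually expressed as a power series in $t$ or through the confluent hypergeometric function.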