# 8. Conditional Expectations, Moments, and Moment-Generating Functions (MGFs)

<hr>

Conditional expectation refers to the expected value of a random variable given that certain conditions are met or certain values are observed in another random variable.

The conditional expectation of a random variable $X$ given $Y$ is denoted $E(X \mid Y)$. For a fixed value $y$, $E(X \mid Y=y)$ is the mean of $X$ over the outcomes where $Y=y$; viewed as a function of $Y$, $E(X \mid Y)$ is itself a random variable.

\begin{align}
E(X \mid Y=y) &= \sum_{\forall{x}} x \cdot P(X=x \mid Y=y) & \text{(discrete)} \\
&= \sum_{\forall{x}} x \frac{P(x,y)}{P_Y(y)} \\
\\
E(X \mid Y=y) &= \int_{\forall{x}} x \cdot f_{X \mid Y} (x \mid y) \, dx & \text{(continuous)} \\
&= \int_{\forall{x}} x \frac{f(x,y)}{f_Y(y)} \, dx
\end{align}
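As a small illustration of the discrete formula, the sketch below computes $E(X \mid Y=y)$ from a joint pmf stored as a dictionary. The joint table and the function name `conditional_expectation` are made up for this example.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X=x, Y=y), keyed by (x, y); values are illustrative only.
joint_pmf = {
    (0, 0): Fraction(1, 8), (1, 0): Fraction(1, 8), (2, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4), (1, 1): Fraction(1, 8), (2, 1): Fraction(1, 8),
}

def conditional_expectation(joint, y):
    """Return E(X | Y=y) = sum_x x * P(x, y) / P_Y(y) for a discrete joint pmf."""
    # Marginal P_Y(y): sum the joint pmf over all x.
    p_y = sum(p for (_, yy), p in joint.items() if yy == y)
    if p_y == 0:
        raise ValueError("P(Y=y) is zero, so E(X | Y=y) is undefined")
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / p_y

e_x_given_y0 = conditional_expectation(joint_pmf, 0)
e_x_given_y1 = conditional_expectation(joint_pmf, 1)
print(e_x_given_y0, e_x_given_y1)
```

Exact fractions are used so that the result can be compared with a hand calculation without rounding error.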

## 8.1 Computing Expectations using Conditioning
<hr>

$$E[X] = E_Y [E_X [X \mid Y=y ]]$$

*Proof.* 

\begin{align}
E_Y [ E_X [X \mid Y=y ]] &= \sum_{\forall{y}} E[X \mid Y=y] \cdot P(Y=y) \\
&= \sum_{\forall{y}} \sum_{\forall{x}} x P(X=x \mid Y=y) \cdot P(Y=y) \\
&= \sum_{\forall{y}} \sum_{\forall{x}} x \frac{P(x,y)}{P_Y(y)} P_Y(y) \\
&= \sum_{\forall{y}} \sum_{\forall{x}} x P(x,y) \\
&= \sum_{\forall{x}} x \sum_{\forall{y}} P(x,y) \\
&= \sum_{\forall{x}} x P_X(x) \\
&= E[X]
\end{align}
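The identity can also be checked numerically on a small example. The sketch below uses a made-up joint pmf (the values and variable names are assumptions for illustration) and compares $E[X]$ computed directly with the weighted average of the conditional means.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X=x, Y=y), keyed by (x, y); values are illustrative only.
joint_pmf = {
    (0, 0): Fraction(1, 8), (1, 0): Fraction(1, 8), (2, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4), (1, 1): Fraction(1, 8), (2, 1): Fraction(1, 8),
}

# Marginal pmf of Y: P_Y(y) = sum_x P(x, y).
p_y = {}
for (_, y), p in joint_pmf.items():
    p_y[y] = p_y.get(y, 0) + p

# Direct computation: E[X] = sum_x sum_y x * P(x, y).
direct = sum(x * p for (x, _), p in joint_pmf.items())

# Inner step: E[X | Y=y] for each y.
inner = {
    y: sum(x * p for (x, yy), p in joint_pmf.items() if yy == y) / p_y[y]
    for y in p_y
}

# Outer step: average the conditional means over the distribution of Y.
iterated = sum(inner[y] * p_y[y] for y in p_y)

print(direct, iterated)
```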

If $X$ and $Y$ are independent, then:

$$E[X \mid Y] = E[X]$$

## 8.2 Moments
<hr>

In statistics, moments are quantitative measures related to the shape of a distribution's graph. For each positive integer $n$, the $n$th moment (also called the $n$th raw moment) of a random variable $X$ is defined as:

$$\mu'_{n} = E[X^n]$$

The $n$th central moment of $X$ is defined as:

$$\mu_n = E[(X-\mu)^n], \quad \mu = E[X]$$

- **First Moment (Mean):** The first moment, $\mu'_1 = E[X]$, is the expected value or mean of the distribution.
- **Second Central Moment (Variance):** The second central moment, $\mu_2 = E[(X-\mu)^2]$, is the variance.
- **Higher Moments:** The third central moment is related to skewness (asymmetry) and the fourth central moment to kurtosis (tailedness). Higher moments are used less frequently but can describe more complex aspects of the distribution's shape.

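For a discrete $X$ with pmf $p(x)$, or a continuous $X$ with density $f(x)$:

$$E[X^n] = \sum_{\forall{x}} x^n p(x) \quad \text{(discrete)}, \qquad E[X^n] = \int_{\forall{x}} x^n f(x) \, dx \quad \text{(continuous)}$$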
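To make the definitions concrete, here is a minimal sketch that computes raw and central moments of a fair six-sided die. The die is an assumed example, and `raw_moment` and `central_moment` are illustrative helper names.

```python
from fractions import Fraction

# pmf of a fair six-sided die (assumed example distribution).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def raw_moment(pmf, n):
    """n-th raw moment: E[X^n] = sum_x x^n p(x)."""
    return sum(x**n * p for x, p in pmf.items())

def central_moment(pmf, n):
    """n-th central moment: E[(X - mu)^n] with mu = E[X]."""
    mu = raw_moment(pmf, 1)
    return sum((x - mu) ** n * p for x, p in pmf.items())

mean = raw_moment(pmf, 1)
variance = central_moment(pmf, 2)
third_central = central_moment(pmf, 3)
print(mean, variance, third_central)
```

The third central moment is zero here because the die's pmf is symmetric about its mean.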

## 8.3 Moment-Generating Functions (MGFs)
<hr>

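Let $X$ be a random variable with density $f(x)$. The moment-generating function (MGF) of $X$, denoted $M_X(t)$, is the expected value of $e^{tX}$, defined for those $t$ where the expectation is finite. For a continuous $X$ it equals the two-sided **Laplace transform** of $f(x)$ evaluated at $-t$.

$$M_X(t) = E_X(e^{tX}) = \begin{cases} \sum_{\forall{x}} e^{tx} p(x) & \text{(discrete, pmf } p(x) \text{)} \\ \int_{\forall{x}} e^{tx} f(x) \, dx & \text{(continuous, density } f(x) \text{)} \end{cases}$$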

**Theorem:** If $X$ is a random variable with MGF $M_X(t)$, then

$$ E[X^n] = \left. \frac{d^n}{dt^n} M_X(t) \right|_{t=0} $$

*Proof.*

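Differentiating under the integral sign (assuming the interchange of derivative and integral is valid near $t=0$):

$$\frac{d}{dt} M_X(t) = \frac{d}{dt} \int_{-\infty}^{\infty} e^{tx} f(x) \, dx = \int_{-\infty}^{\infty} xe^{tx} f(x) \, dx$$

Evaluating at $t=0$:

$$\left. \frac{d}{dt} M_X(t) \right|_{t=0} = \int_{-\infty}^{\infty} x f(x) \, dx = E[X]$$

Each further derivative brings down another factor of $x$, so

$$\left. \frac{d^n}{dt^n} M_X(t) \right|_{t=0} = \left. \int_{-\infty}^{\infty} x^n e^{tx} f(x) \, dx \right|_{t=0} = E[X^n]$$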
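The theorem can be illustrated numerically by approximating the derivatives of an MGF at $t=0$ with central finite differences. The sketch below assumes a fair six-sided die as the example distribution; the helper names and the step size `h` are illustrative choices, not part of the theorem.

```python
import math

def die_mgf(t):
    """MGF of a fair six-sided die: M(t) = (1/6) * sum_{x=1}^{6} e^{t x}."""
    return sum(math.exp(t * x) for x in range(1, 7)) / 6

def first_two_moments(mgf, h=1e-4):
    """Approximate M'(0) and M''(0) with central finite differences."""
    m1 = (mgf(h) - mgf(-h)) / (2 * h)
    m2 = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2
    return m1, m2

m1, m2 = first_two_moments(die_mgf)
print(m1, m2)  # approximations of E[X] = 7/2 and E[X^2] = 91/6
```

Finite differences only approximate the derivatives, so the results agree with the exact moments up to a small numerical error.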

## 8.4 MGFs of Random Variables
<hr>

### 8.4.1 MGF of Binomial
If $X \sim \text{Binomial}(n,p)$, then $p(x) = \binom{n}{x}p^x (1-p)^{n-x}$. Therefore,

$$M_X(t) = E_X(e^{tx}) = \sum_{x} e^{tx} p(x) = \sum_{x=0}^n e^{tx} \binom{n}{x}p^x (1-p)^{n-x} = \sum_{x=0}^n \binom{n}{x} (e^t p)^x (1-p)^{n-x}$$

Since,

$$(a+b)^n = \sum_{i=0}^n \binom{n}{i}a^i b^{n-i}$$

<hr>
$$M_X(t) = \sum_{x=0}^n \binom{n}{x} (e^t p)^x (1-p)^{n-x} = (e^t p + 1-p)^n = (pe^t + q)^n, \quad q = 1-p$$
<hr>

**Verify** that $E(X)=np$ and $\text{Var}(X)=np(1-p)$.

By definition, the first moment of $X$ is $E[X]$, and the second central moment is $\text{Var}(X)$.

$$E[X] = \left. \frac{d}{dt} (pe^t + q)^n \right|_{t=0} = np$$

$$E[X^2] = \left. \frac{d^2}{dt^2} (pe^t + q)^n \right|_{t=0} = np(np+q)$$

Therefore,

$$\text{Var}(X) = E[X^2] - \mu^2 = np(np+q) - (np)^2 = npq$$
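As a sanity check on the closed form, the sketch below compares $(pe^t+q)^n$ with the MGF computed directly from the pmf, and recovers the mean and variance by summation. The parameters $n=10$, $p=0.3$ and the function names are assumptions chosen for illustration.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def mgf_by_sum(t, n, p):
    """E[e^{tX}] computed directly from the pmf."""
    return sum(math.exp(t * x) * binomial_pmf(x, n, p) for x in range(n + 1))

def mgf_closed_form(t, n, p):
    """Closed form (p e^t + q)^n with q = 1 - p."""
    return (p * math.exp(t) + (1 - p)) ** n

n, p = 10, 0.3  # illustrative parameters

# The two expressions should agree at every t; check a few values.
agree = all(
    math.isclose(mgf_by_sum(t, n, p), mgf_closed_form(t, n, p), rel_tol=1e-10)
    for t in (-1.0, -0.2, 0.0, 0.5, 1.0)
)

# Moments by direct summation, to compare with np and np(1 - p).
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
second = sum(x**2 * binomial_pmf(x, n, p) for x in range(n + 1))
variance = second - mean**2
print(agree, mean, variance)
```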

### 8.4.2 MGF of Poisson
If $X \sim \text{Poisson}(\lambda)$, then $p(x) = e^{-\lambda} \frac{\lambda^x}{x!}$. Therefore,

$$M_X(t) = E_X(e^{tx}) = \sum_{x} e^{tx} p(x) = \sum_{x} e^{tx} e^{-\lambda} \frac{\lambda^x}{x!} = e^{-\lambda} \sum_{x} e^{tx} \frac{\lambda^x}{x!} = e^{-\lambda} \sum_{x} \frac{(e^t\lambda)^x}{x!}$$

Since,

$$\sum_{n=0}^\infty \frac{x^n}{n!} = e^x$$

<hr>

$$M_X(t) = e^{-\lambda} \sum_{x} \frac{(e^t \lambda)^x}{x!} = e^{-\lambda} e^{e^t \lambda} = e^{\lambda(e^t-1)}$$

<hr>

**Verify** that $E(X)=\lambda$ and $\text{Var}(X)=\lambda$.

By definition, the first moment of $X$ is $E[X]$, and the second central moment is $\text{Var}(X)$.

$$E[X] = \left. \frac{d}{dt} e^{\lambda(e^t-1)} \right|_{t=0} = \lambda$$

$$E[X^2] = \left. \frac{d^2}{dt^2} e^{\lambda(e^t-1)} \right|_{t=0} = \lambda^2 + \lambda$$

Therefore,

$$\text{Var}(X) = E[X^2] - \mu^2 = \lambda^2 + \lambda - \lambda^2 = \lambda$$
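The two derivatives used in this verification can be written out with the chain and product rules. The following is a worked sketch of the intermediate steps:

\begin{align}
\frac{d}{dt} e^{\lambda(e^t-1)} &= \lambda e^t \, e^{\lambda(e^t-1)} \\
\frac{d^2}{dt^2} e^{\lambda(e^t-1)} &= \left( \lambda e^t + \lambda^2 e^{2t} \right) e^{\lambda(e^t-1)}
\end{align}

Setting $t=0$ makes every exponential factor equal to 1, which gives $E[X] = \lambda$ and $E[X^2] = \lambda + \lambda^2$.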

### 8.4.3 MGF of Exponential

If $X \sim \text{exp}(\lambda)$, then $f(x) = \lambda e^{-\lambda x} = \frac{1}{\beta} e^{-x/\beta}$, where $\lambda = \frac{1}{\beta}$, $\beta>0$, and $x \geq 0$. For $t < 1/\beta$ (so that the integral converges):

$$M_X(t) = E_X(e^{tX}) = \int_{0}^\infty e^{tx} \left( \frac{1}{\beta} e^{-\frac{x}{\beta}} \right) dx = \frac{1}{\beta} \int_0^\infty e^{-\frac{(1-\beta t)x}{\beta}} dx = \frac{1}{\beta} \int_0^\infty e^{-x / \left( \frac{\beta}{1-\beta t} \right)} dx$$

*Note: $e^{-x / \left( \frac{\beta}{1-\beta t} \right)}$ is the kernel of an exponential density with scale $\frac{\beta}{1-\beta t}$, and such a kernel integrates to its scale. Therefore,*

$$\int_0^\infty e^{-x / \left( \frac{\beta}{1-\beta t} \right)} dx = \frac{\beta}{1- \beta t}$$

<hr>

$$M_X(t) = \frac{1}{\beta} \left( \frac{\beta}{1-\beta t} \right) = \frac{1}{1-\beta t}$$

<hr>

**Verify** that $E(X)=\beta$ and $\text{Var}(X)=\beta^2$.

By definition, the first moment of $X$ is $E[X]$, and the second central moment is $\text{Var}(X)$.

$$E[X] = \left. \frac{d}{dt}  \frac{1}{1-\beta t} \right|_{t=0} = \beta$$

$$E[X^2] = \left. \frac{d^2}{dt^2} \frac{1}{1-\beta t} \right|_{t=0} = 2 \beta^2$$

Therefore,

$$\text{Var}(X) = E[X^2] - \mu^2 = 2 \beta^2 -\beta^2 = \beta^2$$

In general, the moments of the exponential distribution are:

$$E(X^k) = k! \beta^k$$
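The general formula follows from differentiating the MGF $k$ times. A brief sketch, valid for $t < 1/\beta$:

\begin{align}
\frac{d^k}{dt^k} (1-\beta t)^{-1} &= k! \, \beta^k (1-\beta t)^{-(k+1)} \\
E(X^k) = \left. \frac{d^k}{dt^k} M_X(t) \right|_{t=0} &= k! \, \beta^k
\end{align}

With $k=1$ and $k=2$ this reproduces $E[X]=\beta$ and $E[X^2]=2\beta^2$.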

## Table of MGFs
Commonly used distributions and their corresponding MGFs are given below:

<hr>

<div style="text-align:center">
    <img src="media/discrete_mgfs.png" width=900>
</div>

<br>
<hr>

<div style="text-align:center">
    <img src="media/cont_mgfs.png" width=900>
</div>


- $\chi^2$ and Exponential are special cases of the Gamma.
- $\chi^2=\text{Gamma}(\alpha = \nu/2, \beta=2)$
- $\text{Exponential}=\text{Gamma}(\alpha=1,\beta)$
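- For the Poisson, $\mu=\sigma^2$ for every value of $\lambda$. Other distributions can match this only for particular parameter values, e.g. $\text{Gamma}(\alpha, \beta=1)$.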

**Question:** Identify the distribution and values of the mean and variance.

\begin{align}
(1-2t)^{-5} \\
\text{This is Gamma}(5, 2) \\
\mu = \alpha \beta = 10 \\
\sigma^2 = \alpha \beta^2 = 5(4) = 20 \\
\end{align}

<br>

\begin{align}
\frac{0.3e^t}{1-0.7e^t} \\
\text{This is Geometric}(0.3) \\
\mu = \frac{1}{p} = \frac{1}{0.3} \\
\sigma^2 = \frac{1-p}{p^2} = \frac{0.7}{0.3^2} \\
\end{align}
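The two answers above can be cross-checked numerically by differentiating each MGF at $t=0$ with central finite differences. This is only a rough sketch; the helper name `mean_and_variance` and the step size `h` are illustrative assumptions.

```python
import math

def mean_and_variance(mgf, h=1e-4):
    """Approximate the mean M'(0) and the variance M''(0) - M'(0)^2."""
    m1 = (mgf(h) - mgf(-h)) / (2 * h)
    m2 = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2
    return m1, m2 - m1**2

def gamma_mgf(t):
    """(1 - 2t)^(-5), the MGF identified as Gamma(alpha=5, beta=2); needs t < 1/2."""
    return (1 - 2 * t) ** -5

def geometric_mgf(t):
    """0.3 e^t / (1 - 0.7 e^t), the MGF identified as Geometric(p=0.3); needs 0.7 e^t < 1."""
    return 0.3 * math.exp(t) / (1 - 0.7 * math.exp(t))

gamma_mean, gamma_var = mean_and_variance(gamma_mgf)
geom_mean, geom_var = mean_and_variance(geometric_mgf)
print(gamma_mean, gamma_var)  # compare with alpha*beta = 10 and alpha*beta^2 = 20
print(geom_mean, geom_var)    # compare with 1/p and (1 - p)/p^2
```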

<br>

\begin{align}
e^{5t+18t^2} \\
\text{This is Normal}(\mu = 5, \sigma = 6) \text{, since } \frac{\sigma^2}{2} = 18 \\
\mu = 5 \\
\sigma^2 = 36
\end{align}

## Linear Transformations using a Moment-Generating Function
<hr>

**Example:** $Y \sim \text{Poisson}(\lambda=4)$

Consider the transformation $W=2Y+5$. What is the MGF of $W$?

The MGF of $Y$ is $e^{\lambda (e^t - 1)} = e^{4 (e^t -1)}$

$$W=aY+b \quad a=2, b=5$$

***Recall that two random variables have the same distribution if their moment-generating functions exist and agree in a neighborhood of $t=0$.***

$$M_W (t) = M_{aY+b} (t)$$

$$M_W (t) = E(e^{tW}) = E(e^{t(aY+b)}) = E(e^{taY} \cdot e^{tb}) = e^{tb} E(e^{taY}) = e^{tb} M_Y (ta)$$

For $\lambda=4, a=2, b=5, M_Y(t) = e^{4(e^t - 1)}$:

$$M_W(t) = e^{5t} M_Y(2t) = e^{5t} e^{4 (e^{2t} - 1)} = e^{4e^{2t}+5t-4}$$

Since $M_W (t)$ does not have the form of a Poisson MGF, $e^{\lambda(e^t-1)}$, the random variable $W$ is not Poisson, and we cannot read off a named pmf or pdf for $W$ from its MGF.
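The transformation rule $M_W(t) = e^{tb} M_Y(ta)$ can be checked numerically for this example by computing $E(e^{tW})$ directly from the Poisson pmf. The sketch below truncates the infinite sum at 150 terms, which is an assumption that is more than adequate for $\lambda=4$ and the small $t$ values used; all function names are illustrative.

```python
import math

def poisson_pmf_terms(lam, n_terms):
    """Yield (y, P(Y=y)) for y = 0 .. n_terms-1, built up iteratively."""
    p = math.exp(-lam)
    for y in range(n_terms):
        yield y, p
        p *= lam / (y + 1)

def mgf_of_w_by_sum(t, lam=4.0, a=2, b=5, n_terms=150):
    """E[e^{tW}] with W = aY + b, summed over the (truncated) pmf of Y."""
    return sum(math.exp(t * (a * y + b)) * p for y, p in poisson_pmf_terms(lam, n_terms))

def mgf_of_w_closed_form(t, lam=4.0, a=2, b=5):
    """e^{tb} * M_Y(ta) with M_Y(t) = exp(lam * (e^t - 1))."""
    return math.exp(b * t) * math.exp(lam * (math.exp(a * t) - 1))

agree = all(
    math.isclose(mgf_of_w_by_sum(t), mgf_of_w_closed_form(t), rel_tol=1e-9)
    for t in (-0.5, 0.0, 0.1, 0.3)
)

# Mean and variance of W = 2Y + 5, again by direct summation.
mean_w = sum((2 * y + 5) * p for y, p in poisson_pmf_terms(4.0, 150))
var_w = sum((2 * y + 5) ** 2 * p for y, p in poisson_pmf_terms(4.0, 150)) - mean_w**2
print(agree, mean_w, var_w)
```

The mean and variance of $W$ come out different from each other, which is consistent with $W$ not being Poisson.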

<br>

**Example:** $Y \sim N(\mu=3, \sigma=2)$

Consider the transformation $W=4Y-7$. What is the MGF of $W$?

The MGF of $Y$ is $e^{\mu t + \frac{t^2 \sigma^2}{2}} = e^{3t + 2t^2}$

$$W = 4Y-7 \quad a=4, b=-7$$

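$$M_W(t) = e^{tb} M_Y(ta) = e^{-7t} e^{12t+32t^2} = e^{5t+32t^2}$$

This is the MGF of a normal distribution with $\mu=5$ and $\sigma^2=64$ (since $\frac{\sigma^2}{2}=32$), so $W \sim N(\mu=5, \sigma=8)$.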
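The same calculation can be sketched in code: for a normal $Y$, the MGF of $W=aY+b$ obtained from $e^{tb} M_Y(ta)$ should coincide with the MGF of a normal with mean $a\mu+b$ and standard deviation $|a|\sigma$. The function names below are illustrative, and the parameters are those of the example.

```python
import math

def normal_mgf(t, mu, sigma):
    """MGF of N(mu, sigma): exp(mu*t + sigma^2 * t^2 / 2)."""
    return math.exp(mu * t + 0.5 * sigma**2 * t**2)

def linear_transform_normal(mu, sigma, a, b):
    """Mean and standard deviation of W = aY + b when Y ~ N(mu, sigma)."""
    return a * mu + b, abs(a) * sigma

mu_w, sigma_w = linear_transform_normal(3, 2, 4, -7)

# Compare e^{tb} * M_Y(ta) with the MGF of N(mu_w, sigma_w) at a few t values.
checks = [
    math.isclose(
        math.exp(-7 * t) * normal_mgf(4 * t, 3, 2),
        normal_mgf(t, mu_w, sigma_w),
        rel_tol=1e-12,
    )
    for t in (-0.2, 0.0, 0.1, 0.25)
]
print(mu_w, sigma_w**2, all(checks))
```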

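- Any linear transformation $aY+b$ with $a \neq 0$ of a normal random variable is also normal.
- Any linear transformation $aY+b$ with $a \neq 0$ of a uniform random variable is again uniform.
- Other distributions retain their functional form only under certain conditions (for example, scaling an exponential by $a>0$ with $b=0$ gives another exponential).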