In [1]:
from datascience import *
import numpy as np
from math import *

## Transformations

In some cases, we may be interested in the distribution of a transformation of a random variable. For example, if we know the distribution of $X$, we may wish to know the distribution of $X^2$ or $2X$. 

It helps to start from the pmf or cdf of the original random variable. Let $Y=t(X)$, where $X$ is discrete and $t$ is one-to-one:

$$
f_Y(y)=P(Y=y) = P(t(X)=y) = P( X = t^{-1}(y))
$$

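When $t$ is not one-to-one, add the probabilities of every $x$ with $t(x)=y$ (see Example 1).

In the continuous case, work with the cdf. For a strictly increasing $t$: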

$$
F_Y(y)=P(Y\leq y) = P(t(X)\leq y) = P(X \leq t^{-1}(y)) = F_X(t^{-1}(y))
$$

### Discrete

#### Example 1

Suppose the pmf for $X$ is given by the following table: 

 | value of $X$  | -2 | -1 | 0 | 1 | 2 | 
 | ------ | ------ | ----- | ----- | ----- | ----- |
 | probability | 0.05 | 0.10 | 0.35 | 0.30 | 0.20 |

Find the distribution of $X^2$ and calculate $E(X^2)$. Does $E(X^2) = [E(X)]^2$? 

The random variable $X^2$ can only take the values 0, 1 or 4. Both $-1$ and $1$ map to 1, and both $-2$ and $2$ map to 4, so their probabilities add:

In [5]:
fy = Table().with_columns(
    "Y", np.array([0, 1, 4]),
    "probability", np.array([0.35, 0.1 + 0.3, 0.05 + 0.2]),
)
fy

Y,probability
0,0.35
1,0.4
4,0.25


In [8]:
# E(X) from the original pmf
ex = (np.array([-2, -1, 0, 1, 2]) * np.array([0.05, 0.1, 0.35, 0.3, 0.2])).sum()
# E(X^2) from the pmf of Y = X^2
ey = (fy.column(0) * fy.column(1)).sum()
print(ex**2)  # [E(X)]^2
print(ey)     # E(X^2)

0.25
1.4
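
Here $E(X^2)=1.4$ but $[E(X)]^2=0.25$, so the two are not equal. The same bookkeeping works for any transformation of a discrete random variable, including ones that are not one-to-one: add up the probability of every value of $X$ that maps to the same value of $Y$. The sketch below uses plain NumPy instead of a `datascience` table, and the helper name `transform_pmf` is hypothetical.

```python
import numpy as np

def transform_pmf(values, probs, t):
    """Return (y_values, y_probs) for Y = t(X), X discrete (hypothetical helper)."""
    probs = np.asarray(probs)
    y = t(np.asarray(values))
    y_values = np.unique(y)  # distinct values of Y, sorted
    # Sum the probability of every x that maps to each value of Y.
    y_probs = np.array([probs[y == v].sum() for v in y_values])
    return y_values, y_probs

x_values = np.array([-2, -1, 0, 1, 2])
x_probs = np.array([0.05, 0.10, 0.35, 0.30, 0.20])

y_values, y_probs = transform_pmf(x_values, x_probs, lambda x: x**2)
e_x = (x_values * x_probs).sum()
e_y = (y_values * y_probs).sum()
print(y_values, y_probs)
print(e_x**2, e_y)
```

Passing a different function for `t`, such as `lambda x: 2 * x`, gives the pmf of that transformation without changing the helper.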


#### Example 2
Let $X \sim \textsf{Binom}(n,p)$. What is the pmf for $X+3$? Make sure you specify the domain of $X+3$. 

Let $Y=X+3$. Then $f_Y(y)=P(Y=y)=P(X+3=y)=P(X=y-3)$. So,

$$
f_Y(y) = f_X(y-3) = {n\choose {y-3}}p^{y-3}(1-p)^{n-y+3}
$$
for $y = 3, 4, \ldots, n+3$.
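
As a quick sanity check, the shifted pmf should sum to 1 over its domain and have mean $np+3$. The sketch below uses only the standard library; the values of `n` and `p` are illustrative assumptions, and the function names are hypothetical.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binom(n, p); zero outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def shifted_pmf(y, n, p, shift=3):
    """P(Y = y) for Y = X + shift, using f_Y(y) = f_X(y - shift)."""
    return binom_pmf(y - shift, n, p)

n, p = 10, 0.3  # illustrative values, not from the text
support = range(3, n + 3 + 1)  # y = 3, 4, ..., n + 3
total = sum(shifted_pmf(y, n, p) for y in support)
mean_y = sum(y * shifted_pmf(y, n, p) for y in support)
print(total)             # should be very close to 1
print(mean_y, n * p + 3)  # the two values should agree
```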

### Continuous

#### Example 3

Let $X \sim \textsf{Unif}(0,1)$. Let $Y=X^2$. Find the **pdf** of $Y$. Again, specify the domain of $Y$. 

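Note that $f_X(x)=1$ on $[0,1]$ and $F_X(x) = x$ on the same interval. Because $X \geq 0$, the event $X^2 \leq y$ is the same as $X \leq \sqrt{y}$. Using the cdf, for $0 \leq y \leq 1$,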

$$
F_Y(y)=P(Y\leq y) = P(X^2 \leq y) = P\left(X \leq \sqrt{y}\right) = F_X\left(\sqrt{y}\right) = \sqrt{y}
$$

And since $f_Y(y) = \frac{d}{dy} F_Y(y)$, 

$$
f_Y(y)=\frac{d}{dy} \sqrt{y} = \frac{1}{2\sqrt{y}}
$$

for $0 < y \leq 1$ (the density is unbounded as $y \to 0$).
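
A simulation gives a rough check of this result: square a large sample of uniform draws and compare the empirical cdf with $\sqrt{y}$. This is a sketch only; the seed, sample size, and grid of $y$ values are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility
x = rng.uniform(0, 1, size=200_000)
y = x**2

grid = np.array([0.1, 0.25, 0.5, 0.9])
# Fraction of simulated Y values at or below each grid point.
empirical_cdf = np.array([(y <= g).mean() for g in grid])
theoretical_cdf = np.sqrt(grid)  # F_Y(y) = sqrt(y)

print(np.round(empirical_cdf, 3))
print(np.round(theoretical_cdf, 3))
```

The two printed rows should agree to about two decimal places; they will not match exactly because the first row is an estimate.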

## Moment Generating Functions (MGF)

One powerful concept in probability is the moment generating function (mgf). For a random variable $X$, the mgf is denoted by $M_X(t)$. It is useful because it gives a shortcut to the $k$th raw moment of $X$ (the moment about the origin), $E(X^k)$. Specifically,

$$
E(X^k) = \frac{d^k}{dt^k} M_X(t) \bigg |_{t=0}
$$

For example, with $k=1$: if you know the mgf of $X$, differentiate it once with respect to $t$ and evaluate at $t=0$; the result is the expected value $E(X)$.

The mgf of $X$ is defined by

$$
M_X(t) = E(e^{tX})
$$

#### Example 4

Let $X$ be a random variable with the exponential distribution with parameter $\lambda >0$. Recall that $f_X(x) = \lambda e^{-\lambda x}$, for $x>0$. Find the mgf of $X$. Use it to verify that $E(X) = \frac{1}{\lambda}$. 



$$
M_X(t) = E(e^{tX})=\int_0^\infty e^{tx}\lambda e^{-\lambda x} dx = \int_0^\infty \lambda e^{tx-\lambda x}dx=\frac{\lambda}{t-\lambda} e^{(t-\lambda)x}\bigg |_0^\infty
$$

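Assuming $t < \lambda$, the exponent $(t-\lambda)x$ tends to $-\infty$ as $x \to \infty$, so the upper limit contributes 0 and this simplifies to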

$$
M_X(t) = \frac{\lambda}{\lambda-t}
$$

Differentiating with respect to $t$: 

$$
\frac{d}{dt} \frac{\lambda}{\lambda-t} = \frac{\lambda}{(\lambda-t)^2}
$$

When $t=0$, the result is $\frac{\lambda}{\lambda^2} = \frac{1}{\lambda}$, as claimed.
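
The derivative can also be checked numerically. The sketch below approximates $M_X'(0)$ and $M_X''(0)$ with central differences on the closed-form mgf; the rate `lam`, the step size `h`, and the function name `mgf_exp` are illustrative assumptions. The second moment of the exponential distribution is $2/\lambda^2$, which the second difference should reproduce.

```python
lam = 2.0  # illustrative rate, not from the text

def mgf_exp(t, lam):
    """Closed-form mgf of Exp(lam); finite only for t < lam."""
    if t >= lam:
        raise ValueError("mgf of Exp(lam) is finite only for t < lam")
    return lam / (lam - t)

h = 1e-4  # finite-difference step
# Central difference for M'(0), which should approximate E(X) = 1 / lam.
first_moment = (mgf_exp(h, lam) - mgf_exp(-h, lam)) / (2 * h)
# Second central difference for M''(0), which should approximate E(X^2) = 2 / lam**2.
second_moment = (mgf_exp(h, lam) - 2 * mgf_exp(0.0, lam) + mgf_exp(-h, lam)) / h**2

print(first_moment, 1 / lam)
print(second_moment, 2 / lam**2)
```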

#### Example 5

The moment generating function of a random variable with the binomial distribution (with parameters $n$ and $p$) is given by $M_X(t) = (pe^t + 1 - p)^n$. Use the mgf to verify that $E(X)=np$ and $V(X)=np(1-p)$. Note that $V(X)=E(X^2)-[E(X)]^2$. 

$$
M_X'(t)=n(pe^t+1-p)^{n-1}(pe^t)
$$

$$
E(X)= M_X'(0)=n(p+1-p)^{n-1}(p)= np
$$

$$
M_X''(t)=n(n-1)(pe^t+1-p)^{n-2}(p^2e^{2t}) + n(pe^t + 1 - p)^{n-1}(pe^t)
$$

$$
E(X^2)=M_X''(0)=n(n-1)(p+1-p)^{n-2}(p^2) + n(p + 1 - p)^{n-1}(p) = n(n-1)p^2 + np
$$

$$
V(X) = E(X^2)-[E(X)]^2 = n(n-1)p^2 + np - (np)^2 = n^2p^2 - np^2 + np - n^2p^2 = n(p-p^2)=np(1-p)
$$
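
The algebra above can be cross-checked numerically. The sketch below first confirms that the closed-form mgf matches $E(e^{tX})$ computed directly from the binomial pmf, then recovers the mean and variance from finite differences at $t=0$. The values of `n`, `p`, and `h` are illustrative assumptions, and the function names are hypothetical.

```python
from math import comb, exp

n, p = 12, 0.25  # illustrative values, not from the text

def mgf_binom_closed(t):
    """Closed-form mgf (p e^t + 1 - p)^n."""
    return (p * exp(t) + 1 - p) ** n

def mgf_binom_sum(t):
    """E(e^{tX}) computed directly from the binomial pmf."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) * exp(t * k)
        for k in range(n + 1)
    )

h = 1e-4  # finite-difference step
m1 = (mgf_binom_closed(h) - mgf_binom_closed(-h)) / (2 * h)  # approximates E(X)
m2 = (mgf_binom_closed(h) - 2 * mgf_binom_closed(0.0)
      + mgf_binom_closed(-h)) / h**2                          # approximates E(X^2)
variance = m2 - m1**2

print(mgf_binom_closed(0.5), mgf_binom_sum(0.5))
print(m1, n * p)
print(variance, n * p * (1 - p))
```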

### Important Results

1) Uniqueness: Let $X$ and $Y$ be random variables with mgfs $M_X$ and $M_Y$. Then $X$ and $Y$ are identically distributed if and only if $M_X(t) = M_Y(t)$ for all $t$ in some open interval containing 0. 

2) MGF of linear transformation of random variable: If $a$ and $b$ are constants, then 

$$
M_{aX+b}(t) = e^{bt}M_X(at)
$$

3) MGF of sum of independent random variables: If $X$ and $Y$ are independent random variables with mgfs $M_X$ and $M_Y$, then

$$
M_{X+Y}(t)=M_X(t) \cdot M_Y(t)
$$
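
Results 2 and 3 each follow in one line from the definition $M_X(t)=E(e^{tX})$. The first pulls the constant factor $e^{bt}$ out of the expectation; the second uses the fact that the expectation of a product of functions of independent random variables factors:

$$
M_{aX+b}(t) = E\left(e^{t(aX+b)}\right) = e^{bt}E\left(e^{(at)X}\right) = e^{bt}M_X(at)
$$

$$
M_{X+Y}(t) = E\left(e^{t(X+Y)}\right) = E\left(e^{tX}e^{tY}\right) = E\left(e^{tX}\right)E\left(e^{tY}\right) = M_X(t) \cdot M_Y(t)
$$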

 

#### Example 6 

Let $X \sim \textsf{Exp}(\lambda)$. Find the distribution of $Y=3X$.

From above, $M_X(t) = \frac{\lambda}{\lambda-t}$. Applying the second result with $a=3$ and $b=0$, $M_Y(t) = M_X(3t) = \frac{\lambda}{\lambda-3t} = \frac{\lambda/3}{\lambda/3 -t}$ for $t < \lambda/3$. This is the mgf of an exponentially distributed random variable with parameter $\lambda/3$, so by the first result $Y \sim \textsf{Exp}(\lambda/3)$. 
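
A simulation offers a rough check: triple a large sample of exponential draws and compare the empirical cdf with the cdf of an exponential distribution with parameter $\lambda/3$, which is $1-e^{-(\lambda/3)y}$. The rate, seed, sample size, and grid below are arbitrary choices. Note that NumPy parameterizes the exponential distribution by the scale $1/\lambda$, not by the rate.

```python
import numpy as np

lam = 1.5  # illustrative rate, not from the text
rng = np.random.default_rng(1)  # arbitrary seed

# NumPy's exponential sampler takes scale = 1 / rate.
x = rng.exponential(scale=1 / lam, size=200_000)
y = 3 * x

grid = np.array([0.5, 1.0, 2.0, 5.0])
empirical_cdf = np.array([(y <= g).mean() for g in grid])
# cdf of an exponential with parameter lam / 3
theoretical_cdf = 1 - np.exp(-(lam / 3) * grid)

print(np.round(empirical_cdf, 3))
print(np.round(theoretical_cdf, 3))
print(y.mean(), 3 / lam)  # sample mean vs. the mean 1 / (lam / 3)
```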

#### Example 7 

Suppose $X_1, X_2, ..., X_n$ are independent identically distributed $\textsf{Norm}(\mu,\sigma)$. Find the distribution of $S=X_1+X_2+...+X_n$ and $\bar{X} = \frac{X_1+X_2+...+X_n}{n}$. Note that the mgf of a normally distributed random variable is $M_X(t)=e^{\mu t+\sigma^2 t^2/2}$.

From result 3 above, 

$$
M_S(t)= \prod_{i=1}^n M_{X_i}(t) = \prod_{i=1}^n e^{\mu t+\sigma^2 t^2/2} = \left( e^{\mu t+\sigma^2 t^2/2}\right)^n = e^{n\mu t + n\sigma^2 t^2/2}
$$

This is the mgf of a normally distributed random variable with mean $n\mu$ and standard deviation $\sqrt{n} \sigma$. 

Since $\bar{X}=S/n$, we can use result 2 from above with $a=1/n$ and $b=0$. 

$$
M_{\bar{X}}(t) = M_S(t/n) = e^{n\mu t/n + n\sigma^2 (t/n)^2/2}=e^{\mu t + \frac{\sigma^2}{n} \frac{t^2}{2}}
$$

This is the mgf of a normally distributed random variable with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$. 
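
The means and standard deviations found above can be checked by simulation. The sketch below draws many samples of size $n$, then compares the sample mean and standard deviation of $S$ and $\bar{X}$ with the theoretical values. The parameter values, seed, and number of replications are arbitrary, and the check covers only the first two moments, not normality itself.

```python
import numpy as np

mu, sigma, n = 5.0, 2.0, 16  # illustrative values, not from the text
rng = np.random.default_rng(2)  # arbitrary seed

# Each row is one sample X_1, ..., X_n.
samples = rng.normal(loc=mu, scale=sigma, size=(50_000, n))
s = samples.sum(axis=1)      # one value of S per row
xbar = samples.mean(axis=1)  # one value of X-bar per row

print(s.mean(), n * mu)
print(s.std(), np.sqrt(n) * sigma)
print(xbar.mean(), mu)
print(xbar.std(), sigma / np.sqrt(n))
```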