# Moment-generating function

## Moment-generating function: definition

The **moment-generating function** (MGF) of a r.v. $X$ is
$$
M_X(t) = \mathbb{E}\left[e^{tX}\right]
$$

It does not always exist. If it exists and is finite in a neighbourhood of $t = 0$:
- It uniquely determines the distribution of $X$
- $M_X(t) > 0, \forall t$ and $M_X(0) = 1$
- $M_{aX+b}(t) = e^{bt} M_X(at)$
- For every $k$, the $k$-th moment of $X$ exists, is finite, and equals the $k$-th derivative of the MGF at zero: $\mathbb{E}[X^k] = M^{(k)}_X(0)$

The purpose of the MGF is to replace the computation of expectations (sums or integrals) with differentiation.
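
A short sketch of why differentiation recovers the moments, assuming the MGF is finite in a neighbourhood of $t = 0$ so that the expectation and the sum can be interchanged. Expanding the exponential in a power series gives
$$
M_X(t) = \mathbb{E}\left[\sum_{k=0}^\infty \frac{t^k X^k}{k!}\right] = \sum_{k=0}^\infty \frac{t^k}{k!} \mathbb{E}\left[X^k\right]
$$

Differentiating $k$ times and setting $t = 0$ removes every term except $\mathbb{E}[X^k]$, which is the moment property listed above.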

## Example 1: Bernoulli MGF

Consider $X \sim Be(p)$. What is $M_X(t)$? Find expectation and variance using MGF.

## Solution 1

MGF:
$$
M_X(t) = \mathbb{E}\left[e^{tX}\right] = e^{t \cdot 0} \cdot \mathbb{P}(X = 0) + e^{t \cdot 1} \cdot \mathbb{P}(X = 1) = q + pe^t
$$

Both the first and the second derivatives equal $pe^t$, so
$$
\mathbb{E}X = M'_X(0) = pe^0 = p = M''_X(0) = \mathbb{E}\left[X^2\right]
$$
$$
\mathbb{V}\text{ar}(X) = M''_X(0) - \left(M'_X(0)\right)^2 = p - p^2 = p(1-p)
$$
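
The Bernoulli result can be checked numerically. The sketch below approximates the derivatives of $q + pe^t$ at zero with central finite differences; the value `p = 0.3`, the step `h` and the helper names are illustrative assumptions, not part of the example.

```python
import math

def bernoulli_mgf(t, p):
    """MGF of Be(p): q + p * e^t with q = 1 - p."""
    return (1 - p) + p * math.exp(t)

def first_derivative(f, t=0.0, h=1e-4):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

def second_derivative(f, t=0.0, h=1e-4):
    """Central finite-difference approximation of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)

p = 0.3  # illustrative value, not taken from the text

def mgf(t):
    return bernoulli_mgf(t, p)

mean = first_derivative(mgf)            # approximates M'_X(0) = p
second_moment = second_derivative(mgf)  # approximates M''_X(0) = p
variance = second_moment - mean ** 2    # approximates p * (1 - p)
print(mean, variance)
```

Up to finite-difference error, the printed values should be close to $p$ and $p(1-p)$.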

## Example 2: Poisson MGF

Consider $X \sim Pois(\lambda)$. What is $M_X(t)$? Find expectation and variance using MGF.

## Solution 2

MGF:
$$
M_X(t) = \mathbb{E}\left[e^{tX}\right] = \sum\limits_{k=0}^\infty e^{tk} \frac{\lambda^k}{k!} e^{-\lambda} = e^{-\lambda} \sum\limits_{k=0}^\infty \frac{1}{k!} \left( \lambda e^{t}\right)^k = \exp \left( \lambda \left( e^t - 1 \right) \right)
$$

First derivative:
$$
M'_X(t) = \lambda e^t \exp \left( \lambda \left( e^t - 1 \right) \right)
$$

Expectation $M'_X(0) = \lambda$. Second derivative:
$$
M''_X(t) = \lambda e^t \exp \left( \lambda \left( e^t - 1 \right) \right) + \left( \lambda e^t \right)^2 \exp \left( \lambda \left( e^t - 1 \right) \right)
$$

Second moment $M''_X(0) = \lambda + \lambda^2$. Variance $\mathbb{V}\text{ar}(X) = \lambda + \lambda^2 - \lambda^2 = \lambda$.
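
As a sanity check, the series defining the Poisson MGF can be summed numerically and compared with the closed form. This is a minimal sketch: the values of `lam` and `t`, the truncation at 100 terms and the finite-difference step are illustrative assumptions.

```python
import math

def poisson_mgf_series(t, lam, terms=100):
    """Truncated series: sum over k >= 0 of e^{tk} * lam^k / k! * e^{-lam}."""
    total = 0.0
    term = math.exp(-lam)  # the k = 0 term
    for k in range(terms):
        total += term
        term *= lam * math.exp(t) / (k + 1)  # ratio of consecutive terms
    return total

def poisson_mgf_closed(t, lam):
    """Closed form exp(lam * (e^t - 1))."""
    return math.exp(lam * (math.exp(t) - 1))

lam, t = 2.5, 0.7  # illustrative values
series_value = poisson_mgf_series(t, lam)
closed_value = poisson_mgf_closed(t, lam)

# Moments from central finite differences of the closed form at t = 0.
h = 1e-4
m_plus, m_zero, m_minus = (poisson_mgf_closed(s, lam) for s in (h, 0.0, -h))
mean = (m_plus - m_minus) / (2 * h)
second_moment = (m_plus - 2 * m_zero + m_minus) / (h * h)
variance = second_moment - mean ** 2
print(series_value, closed_value, mean, variance)
```

The two MGF values should agree to many digits, and both `mean` and `variance` should be close to `lam`.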

## Example 3: Gaussian MGF

Consider $X \sim \mathcal{N}(\mu, \sigma^2)$. What is $M_X(t)$? Find expectation and variance using MGF.

## Solution 3

First, let's find the MGF of $Y \sim \mathcal{N}(0, 1)$, then apply the properties.
$$
\begin{aligned}
M_Y(t) & = \mathbb{E}\left[e^{tY}\right] = \frac{1}{\sqrt{2\pi}} \int\limits_{-\infty}^\infty e^{tx} e^{-x^2 / 2} dx = \frac{1}{\sqrt{2\pi}} \int\limits_{-\infty}^\infty \exp\left( -\frac{x^2 - 2tx}{2}\right) dx = \\
& = \frac{1}{\sqrt{2\pi}} \int\limits_{-\infty}^\infty \exp\left( -\frac{(x - t)^2 - t^2}{2}\right) dx = \\
& = \exp\left( \frac{t^2}{2} \right) \frac{1}{\sqrt{2\pi}} \int\limits_{-\infty}^\infty \exp\left( -\frac{(x - t)^2}{2}\right) dx = \exp\left( \frac{t^2}{2} \right)
\end{aligned}
$$

The last equality holds because $\frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(x - t)^2}{2}\right)$ is the density of $\mathcal{N}(t, 1)$, so it integrates to $1$.

## Solution 3 (continued)

Since $X = \mu + \sigma Y$, the properties give $M_X(t) = e^{\mu t} M_Y(\sigma t) = \exp \left( \mu t + \frac{t^2 \sigma^2}{2} \right)$. First derivative:
$$
M'_X(t) = \left( \mu + t \sigma^2 \right) \exp \left( \mu t + \frac{t^2 \sigma^2}{2} \right)
$$

Second derivative:
$$
M''_X(t) = \sigma^2 \exp \left( \mu t + \frac{t^2 \sigma^2}{2} \right) + \left( \mu + t \sigma^2 \right)^2  \exp \left( \mu t + \frac{t^2 \sigma^2}{2} \right)
$$

Expectation: $M'_X(0) = \mu$, variance $M''_X(0) - \left( M'_X(0) \right)^2 = \sigma^2$.
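
The closed form can also be compared with a direct numerical evaluation of $\mathbb{E}\left[e^{tX}\right]$. The sketch below uses a midpoint rule on a truncated interval; the parameter values, the truncation width and the number of steps are illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def gaussian_mgf_numeric(t, mu, sigma, width=12.0, steps=20000):
    """Midpoint-rule approximation of the integral of e^{tx} f(x) dx
    over [mu - width * sigma, mu + width * sigma]."""
    lo = mu - width * sigma
    dx = 2 * width * sigma / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += math.exp(t * x) * normal_pdf(x, mu, sigma)
    return total * dx

def gaussian_mgf_closed(t, mu, sigma):
    """Closed form exp(mu * t + t^2 * sigma^2 / 2)."""
    return math.exp(mu * t + 0.5 * (t * sigma) ** 2)

mu, sigma, t = 1.0, 2.0, 0.5  # illustrative values
numeric_value = gaussian_mgf_numeric(t, mu, sigma)
closed_value = gaussian_mgf_closed(t, mu, sigma)
print(numeric_value, closed_value)
```

For these values the exponent is $\mu t + t^2\sigma^2/2 = 1$, so both numbers should be close to $e$.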

# Random vector

## Random vector: definition
Consider a probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Then a **random vector** is a measurable function
$$
\mathbf{X}: \Omega \to \mathbb{R}^n,
$$
where $\mathbf{X} = (X_1, \ldots, X_n)^\top$. Every component $X_i$ of the vector is a random variable. The converse is also true: for any r.v.s $X_1, \ldots, X_n$ defined on the same probability space, the vector $(X_1, \ldots, X_n)^\top$ is a random vector.

## Random vector: distribution

The distribution of a random vector $\mathbf{X} = (X_1, \ldots, X_n)^\top$ can be described via the **multivariate (joint) cumulative distribution function**:
$$
F_{\mathbf{X}}(\mathbf{x}) = \mathbb{P}(X_1 < x_1, X_2 < x_2, \ldots, X_n < x_n)
$$

Properties of multivariate CDF:
- $\lim_{x_i \to -\infty} F_{\mathbf{X}}(\mathbf{x}) = 0$ for any single $i$, and $\lim_{x_1, \ldots, x_n \to \infty} F_{\mathbf{X}}(\mathbf{x}) = 1$
- $\lim_{x_i \to \infty} F_{\mathbf{X}}(\mathbf{x})$ is the joint CDF of the remaining components, i.e. of all components except $X_i$
- $F_{\mathbf{X}}(\mathbf{x})$ is non-decreasing and left-continuous in every component (left-continuity comes from the strict inequalities in the definition)
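- Supermodularity: for any two components $i \neq j$ and any $\varepsilon, \delta > 0$, the mixed difference is non-negative, i.e. $F_{\mathbf{X}}(\ldots, x_i, \ldots, x_j, \ldots) - F_{\mathbf{X}}(\ldots, x_i - \varepsilon, \ldots, x_j, \ldots) - F_{\mathbf{X}}(\ldots, x_i, \ldots, x_j - \delta, \ldots) + F_{\mathbf{X}}(\ldots, x_i - \varepsilon, \ldots, x_j - \delta, \ldots) \geqslant 0$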

## Random vector: distribution

If $\mathbf{X}$ has a continuous distribution, then there exists a **multivariate (joint) probability density function**, i.e. a non-negative function $f_{\mathbf{X}}(\cdot)$ such that, for any Borel set $B \subseteq \mathbb{R}^n$,
$$
    \mathbb{P}(\mathbf{X} \in B) = \int_B f_{\mathbf{X}}(x_1, \ldots, x_n) dx_1 \ldots dx_n
$$

The PDF can also be obtained from the CDF, wherever the mixed derivative exists:
$$
    f_{\mathbf{X}}(\mathbf{x}) = \frac{\partial^n F_{\mathbf{X}}(\mathbf{x})}{\partial x_1 \ldots \partial x_n}
$$

## Random vector: independence

If the r.v.s $X_1, \ldots, X_n$ are independent, then the joint CDF factorizes, and so does the joint PDF when it exists:
$$
\begin{cases}
F_{\mathbf{X}}(\mathbf{x}) & = \prod\limits_{i=1}^n F_{X_i}(x_i), \\
f_{\mathbf{X}}(\mathbf{x}) & = \prod\limits_{i=1}^n f_{X_i}(x_i)
\end{cases}
$$
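
A small worked illustration of the factorization, together with the CDF-to-PDF formula. The exponential distribution is our choice for the example, not part of the statement above. Let $X_1, X_2$ be independent with $F_{X_i}(x) = 1 - e^{-x}$ for $x > 0$. Then, for $x_1, x_2 > 0$,
$$
F_{\mathbf{X}}(x_1, x_2) = \left(1 - e^{-x_1}\right)\left(1 - e^{-x_2}\right), \qquad
\frac{\partial^2 F_{\mathbf{X}}(x_1, x_2)}{\partial x_1 \partial x_2} = e^{-x_1} e^{-x_2} = f_{X_1}(x_1) f_{X_2}(x_2)
$$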

## Random vector: moments

**Mathematical expectation** of a random vector is a vector of mathematical expectations of its components:
$$
\mathbb{E}\left[\mathbf{X}\right] = (\mathbb{E}X_1, \ldots, \mathbb{E}X_n)^\top
$$

Second moments of a random vector are described by the **covariance matrix** $\mathbb{V}\text{ar}(\mathbf{X}) = \Sigma$, where
$$
\Sigma_{ij} = \operatorname{cov}(X_i, X_j) = \mathbb{E} \left[ (X_i - \mathbb{E} X_i) (X_j - \mathbb{E} X_j) \right]
$$

In particular, the diagonal elements are variances: $\Sigma_{ii} = \mathbb{V}\text{ar}(X_i)$.

## Random vector: covariance matrix

In matrix notation, the covariance matrix is $\mathbb{V}\text{ar}(\mathbf{X}) = \mathbb{E}\left[(\mathbf{X} - \mathbb{E}[\mathbf{X}]) (\mathbf{X} - \mathbb{E}[\mathbf{X}])^\top\right]$.

Properties of the covariance matrix:
- Symmetry: $\Sigma^\top = \Sigma$
- Positive semi-definiteness: $a^\top \Sigma a \geqslant 0, \forall a \in \mathbb{R}^n$, because $a^\top \Sigma a = \mathbb{V}\text{ar}(a^\top \mathbf{X})$
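
Both properties can be observed on a sample covariance matrix. The following is a minimal sketch in plain Python: the simulated three-dimensional vector, the seed and the helper names are illustrative assumptions, and the expectation is replaced by a sample average.

```python
import random

def mean_vector(samples):
    """Component-wise sample mean."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(s[i] for s in samples) / n for i in range(dim)]

def covariance_matrix(samples):
    """Sample analogue of E[(X - EX)(X - EX)^T], dividing by the sample size."""
    n = len(samples)
    dim = len(samples[0])
    mu = mean_vector(samples)
    return [[sum((s[i] - mu[i]) * (s[j] - mu[j]) for s in samples) / n
             for j in range(dim)] for i in range(dim)]

def quadratic_form(a, sigma):
    """Computes a^T Sigma a."""
    dim = len(a)
    return sum(a[i] * sigma[i][j] * a[j] for i in range(dim) for j in range(dim))

rng = random.Random(0)  # fixed seed so the run is reproducible
samples = []
for _ in range(2000):
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    samples.append((z1, z1 + 0.5 * z2, -z2))  # correlated components

sigma = covariance_matrix(samples)
is_symmetric = all(abs(sigma[i][j] - sigma[j][i]) < 1e-12
                   for i in range(3) for j in range(3))
# Smallest value of a^T Sigma a over a few random directions a.
directions = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(100)]
min_form = min(quadratic_form(a, sigma) for a in directions)
print(is_symmetric, min_form)
```

Here `min_form` should be non-negative up to floating-point rounding, since $a^\top \Sigma a$ is the sample variance of $a^\top \mathbf{X}$.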

## Random vector: marginal and conditional distributions

A **marginal distribution** is the distribution of a subset of the components of a random vector. For example, consider a random vector $\mathbf{X} \in \mathbb{R}^n$ and view it as two stacked vectors, $\mathbf{Y} \in \mathbb{R}^k$ and $\mathbf{Z} \in \mathbb{R}^{n-k}$: $\mathbf{X} = (\mathbf{Y}^\top, \mathbf{Z}^\top)^\top$. The marginal PDF of $\mathbf{Z}$ is then:
$$
f_{\mathbf{Z}}(\mathbf{z}) = \int_{\mathbb{R}^k} f_{\mathbf{X}}(\mathbf{y}, \mathbf{z}) d \mathbf{y}
$$

In words, we take the distribution of $\mathbf{X}$ and **integrate out** everything not related to $\mathbf{Z}$.

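We may also define the **conditional distribution** of $\mathbf{Y}$ given $\mathbf{Z} = \mathbf{z}$, for any $\mathbf{z}$ with $f_{\mathbf{Z}}(\mathbf{z}) > 0$: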
$$
f_{\mathbf{Y}|\mathbf{Z}=\mathbf{z}}(\mathbf{y}) = \frac{f_{\mathbf{X}}(\mathbf{y}, \mathbf{z})}{f_{\mathbf{Z}}(\mathbf{z})}
$$

## Example 4: joint, marginal and conditional distributions for discrete case

Let $X$ be the indicator of the sampled individual being a current smoker, and let $Y$ be the indicator of his developing lung cancer at some point in his life. Suppose the joint PMF is as follows:

||$Y=1$|$Y=0$|
|--|--|--|
|$X=1$|$\frac{5}{100}$|$\frac{20}{100}$|
|$X=0$|$\frac{3}{100}$|$\frac{72}{100}$|

Find the marginal and conditional distributions.

## Solution 4

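The marginal PMFs are the row and column sums of the joint table:
$$
\mathbb{P}(X = x) = \sum_y \mathbb{P}(X = x, Y = y), \qquad \mathbb{P}(Y = y) = \sum_x \mathbb{P}(X = x, Y = y)
$$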

||$Y=1$|$Y=0$|$\text{Sum}$|
|--|--|--|--|
|$X=1$|$\frac{5}{100}$|$\frac{20}{100}$|$\frac{25}{100}$|
|$X=0$|$\frac{3}{100}$|$\frac{72}{100}$|$\frac{75}{100}$|
|$\text{Sum}$|$\frac{8}{100}$|$\frac{92}{100}$|$\frac{100}{100}$|

## Solution 4 (continued)

$$
\mathbb{P}(Y = y | X = x) = \frac{\mathbb{P}(X = x, Y = y)}{\mathbb{P}(X = x)}
$$

Example: if the person is a smoker ($X = 1$), then $\mathbb{P}(Y = 1 | X = 1) = \frac{\mathbb{P}(X = 1, Y = 1)}{\mathbb{P}(X = 1)} = \frac{5/100}{25/100} = 0.2$.
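
The whole computation can be reproduced with exact fractions. This is a minimal sketch: the dictionary layout and the function names are our own choices, and the probabilities are the ones from the joint table.

```python
from fractions import Fraction

# Joint PMF from the table, keyed by (x, y).
joint = {
    (1, 1): Fraction(5, 100),
    (1, 0): Fraction(20, 100),
    (0, 1): Fraction(3, 100),
    (0, 0): Fraction(72, 100),
}

def marginal(joint, axis):
    """Sum the joint PMF over the other variable (axis 0 -> X, axis 1 -> Y)."""
    out = {}
    for key, prob in joint.items():
        out[key[axis]] = out.get(key[axis], Fraction(0)) + prob
    return out

def conditional_y_given_x(joint, x):
    """P(Y = y | X = x) = P(X = x, Y = y) / P(X = x) for every y."""
    p_x = marginal(joint, 0)[x]
    return {y: prob / p_x for (xi, y), prob in joint.items() if xi == x}

p_x = marginal(joint, 0)
p_y = marginal(joint, 1)
cond_smoker = conditional_y_given_x(joint, 1)
cond_nonsmoker = conditional_y_given_x(joint, 0)
print(p_x, p_y, cond_smoker, cond_nonsmoker)
```

For non-smokers the same formula gives $\mathbb{P}(Y = 1 | X = 0) = \frac{3/100}{75/100} = 0.04$.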

## Example 5 (unit disc)

Consider a point chosen uniformly at random in the unit disc, with random coordinates $(X, Y)$. What are the joint, marginal and conditional PDFs of the coordinates?

## Solution 5

The joint PDF is constant on the disc, whose area is $\pi$:
$$
f_{X, Y}(x, y) = \begin{cases}
\frac1\pi, \text{ if } x^2 + y^2 \leqslant 1, \\
0, \text{ else}
\end{cases}
$$

The marginal for $X$, for $|x| \leqslant 1$, is:
$$
f_X(x) = \int\limits_{-\infty}^\infty f_{X, Y}(x, y) dy = \int\limits_{-\sqrt{1 - x^2}}^{\sqrt{1-x^2}} \frac{1}{\pi} dy = \frac{2}{\pi} \sqrt{1 - x^2}
$$

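The conditional for $Y$ given $X = x$, for $|x| < 1$ and $|y| \leqslant \sqrt{1 - x^2}$, is:
$$
f_{Y|X=x}(y) = \frac{f_{X, Y}(x, y)}{f_{X}(x)} = \frac{\frac{1}{\pi}}{\frac{2}{\pi}\sqrt{1 - x^2}} = \frac{1}{2\sqrt{1 - x^2}}
$$

That is, given $X = x$, the coordinate $Y$ is uniform on $\left[-\sqrt{1 - x^2}, \sqrt{1 - x^2}\right]$.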
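
A quick numerical check that the marginal and conditional densities each integrate to one. The sketch uses a midpoint rule; the number of steps and the conditioning value `x0 = 0.6` are illustrative assumptions.

```python
import math

def marginal_x(x):
    """Marginal density of X: (2 / pi) * sqrt(1 - x^2) on [-1, 1], zero elsewhere."""
    if abs(x) > 1:
        return 0.0
    return 2 / math.pi * math.sqrt(1 - x * x)

def conditional_y_given_x(y, x):
    """Conditional density of Y given X = x, defined only for |x| < 1."""
    if abs(x) >= 1:
        raise ValueError("the conditional density requires |x| < 1")
    half_width = math.sqrt(1 - x * x)
    if abs(y) > half_width:
        return 0.0
    return 1 / (2 * half_width)

def midpoint_integral(f, lo, hi, steps=100000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    dx = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * dx) for i in range(steps)) * dx

x0 = 0.6  # illustrative conditioning value
total_marginal = midpoint_integral(marginal_x, -1.0, 1.0)
total_conditional = midpoint_integral(lambda y: conditional_y_given_x(y, x0), -1.0, 1.0)
print(total_marginal, total_conditional)
```

Both totals should be close to $1$, up to the discretization error of the midpoint rule.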