# HIDDEN
from datascience import *
from prob140 import *
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
%matplotlib inline
from scipy import stats

$$
\newcommand{\mat}[1]{\mathop{#1}_{\sim}}
\newcommand{\bsymb}[1]{\boldsymbol{#1}}
$$


### Multivariate Normal Distribution ###
The parametrization of the bivariate normal distribution by the mean vector and covariance matrix provides a straightforward way to extend the definition to more than two random variables. The result is the *multivariate* normal distribution.

A *vector valued random variable*, or more simply, a *random vector*, is an array of random variables. We will think of it as a column.
$$
\mat{X} ~ = ~ 
\begin{bmatrix}
X_1 \\
X_2 \\
\vdots \\
X_n
\end{bmatrix}
$$

The *mean vector* of $\mat{X}$ is the vector of means
$$
\mat{\mu} ~ = ~
\begin{bmatrix}
\mu_{X_1} \\
\mu_{X_2} \\
\vdots \\
\mu_{X_n}
\end{bmatrix}
$$

The *covariance matrix* of $\mat{X}$ is the matrix whose $(i, j)$th element is $Cov(X_i, X_j)$. For brevity, let $\sigma_{X,Y}$ denote the covariance of any two random variables $X$ and $Y$. With this notation, the covariance matrix of $\mat{X}$ is

$$
\bsymb{\Sigma} ~ = ~ 
\begin{bmatrix}
\sigma_{X_1}^2 & \sigma_{X_1, X_2} & \sigma_{X_1, X_3} & \ldots & \sigma_{X_1, X_n} \\
\sigma_{X_2, X_1} & \sigma_{X_2}^2 & \sigma_{X_2, X_3} & \ldots & \sigma_{X_2, X_n} \\
\sigma_{X_3, X_1} & \sigma_{X_3, X_2} & \sigma_{X_3}^2 & \ldots & \sigma_{X_3, X_n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sigma_{X_n, X_1} & \sigma_{X_n, X_2} & \sigma_{X_n, X_3} & \ldots & \sigma_{X_n}^2 
\end{bmatrix}
$$

The $i$th diagonal element of $\bsymb{\Sigma}$ is the variance of $X_i$. The matrix is symmetric because of the symmetry of covariance.
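As a small illustration of these two facts, here is a sketch that builds a covariance matrix for three random variables and checks its diagonal and its symmetry. The standard deviations and correlations are made-up values chosen only for the example; the construction uses the fact that $Cov(X_i, X_j)$ is the correlation times the two SDs.

```python
import numpy as np

# Hypothetical SDs and correlations for X_1, X_2, X_3 (illustrative values only)
sds = np.array([2.0, 1.0, 3.0])
corr = np.array([[1.0,  0.5,  0.2],
                 [0.5,  1.0, -0.3],
                 [0.2, -0.3,  1.0]])

# Cov(X_i, X_j) = r_ij * SD(X_i) * SD(X_j)
cov = corr * np.outer(sds, sds)

# The diagonal holds the variances SD(X_i)^2
variances = np.diag(cov)

# Symmetry: Cov(X_i, X_j) = Cov(X_j, X_i)
is_symmetric = np.allclose(cov, cov.T)

print(cov)
print(variances, is_symmetric)
```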

To understand more about covariance matrices, it helps to know some linear algebra. We haven't assumed that in this course, but if you have studied linear algebra then you should check that, by the rules of variance and covariance, $\bsymb{\Sigma}$ must be positive definite if the distribution of $\mat{X}$ is $n$-dimensional and not degenerate. Since positive definite matrices are invertible, $\bsymb{\Sigma}$ has an inverse.
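If you would like to see this numerically, the sketch below checks positive definiteness for one example covariance matrix. The matrix entries are hypothetical. A symmetric matrix is positive definite exactly when all of its eigenvalues are positive, so the code looks at the eigenvalues and then confirms that the inverse exists.

```python
import numpy as np

# Hypothetical 3 x 3 covariance matrix (illustrative values only)
cov = np.array([[4.0,  1.0,  1.2],
                [1.0,  1.0, -0.9],
                [1.2, -0.9,  9.0]])

# eigvalsh is meant for symmetric matrices and returns real eigenvalues
eigenvalues = np.linalg.eigvalsh(cov)
is_positive_definite = bool(np.all(eigenvalues > 0))

# A positive definite matrix is invertible
cov_inverse = np.linalg.inv(cov)
recovers_identity = np.allclose(cov @ cov_inverse, np.eye(3))

# For any nonzero vector a, the quadratic form a^T Sigma a is the variance
# of the linear combination a_1 X_1 + a_2 X_2 + a_3 X_3, so it is positive
a = np.array([1.0, -2.0, 0.5])
quadratic_form = a @ cov @ a

print(eigenvalues, is_positive_definite, recovers_identity, quadratic_form)
```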

### Multivariate Normal Density ###
Let $\mat{x}$ be a vector of $n$ real values:
$$
\mat{x} ~ = ~ 
\begin{bmatrix}
x_1 \\
x_2 \\
\vdots \\
x_n
\end{bmatrix}
$$

The *multivariate normal density* with mean vector $\mat{\mu}$ and covariance matrix $\bsymb{\Sigma}$ is the function $f$ whose value at $\mat{x}$ is

$$
f\big{(} \mat{x} \big{)} ~ = ~ 
\frac{1}{(\sqrt{2\pi})^n \sqrt{\lvert \bsymb{\Sigma} \rvert} }
\exp \Big{(} -\frac{1}{2} \big{(}\mat{x} - \mat{\mu} \big{)}^T \bsymb{\Sigma}^{-1} \big{(}\mat{x} - \mat{\mu} \big{)} \Big{)}
$$

Here $(\mat{x} - \mat{\mu})^T$ is the transpose of $\mat{x} - \mat{\mu}$, and $\lvert \bsymb{\Sigma} \rvert$ is the determinant of $\bsymb{\Sigma}$.
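The formula can be coded directly. The sketch below is one way to do it, not the only one: the function name `mvn_density` and the numerical values of the mean vector, covariance matrix, and point are our own choices for illustration. As a check, the result is compared with `stats.multivariate_normal` from SciPy.

```python
import numpy as np
from scipy import stats

def mvn_density(x, mu, cov):
    """Evaluate the multivariate normal density at x using the formula above."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    n = len(mu)
    diff = x - mu
    # (x - mu)^T Sigma^{-1} (x - mu), solving a linear system
    # instead of forming the inverse explicitly
    quad = diff @ np.linalg.solve(cov, diff)
    norm_const = np.sqrt(2 * np.pi) ** n * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm_const

# Hypothetical parameters and evaluation point (illustrative values only)
mu = np.array([1.0, 0.0, -1.0])
cov = np.array([[4.0,  1.0,  1.2],
                [1.0,  1.0, -0.9],
                [1.2, -0.9,  9.0]])
x = np.array([0.5, 0.2, 0.0])

by_formula = mvn_density(x, mu, cov)
by_scipy = stats.multivariate_normal(mean=mu, cov=cov).pdf(x)

print(by_formula, by_scipy)
```

The two values should agree up to floating point error.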

Even if you have not studied linear algebra you can check that the formula is correct in the case $n = 1$. In this case $\bsymb{\Sigma} = [\sigma^2]$ is just a scalar. It is a number, not a larger matrix; its determinant is itself; its inverse is simply $1/\sigma^2$. Also, $\mat{x} = x$ and $\mat{\mu} = \mu$ are just numbers. It is easy to see that the formula above reduces to the familiar normal density with mean $\mu$ and variance $\sigma^2$.
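In case it helps, here is that reduction written out, with $\sigma > 0$ denoting the SD:

$$
f(x) ~ = ~ \frac{1}{(\sqrt{2\pi})^1 \sqrt{\sigma^2}}
\exp \Big{(} -\frac{1}{2} (x - \mu) \cdot \frac{1}{\sigma^2} \cdot (x - \mu) \Big{)}
~ = ~ \frac{1}{\sqrt{2\pi} \, \sigma}
\exp \Big{(} -\frac{1}{2} \Big{(} \frac{x - \mu}{\sigma} \Big{)}^2 \Big{)}
$$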

In the case $n = 2$, the formula is the same as the bivariate normal density of $X$ and $Y$. To see this, just check that it is true in the standard bivariate normal case. 

In that case

$$
\mat{\mu} ~ = ~
\begin{bmatrix}
0 \\
0
\end{bmatrix}
$$
and
$$
\bsymb{\Sigma} ~ = ~ 
\begin{bmatrix}
1 & \rho \\
\rho & 1
\end{bmatrix}
$$
So
$$
\lvert \bsymb{\Sigma} \rvert ~ = ~ 1 - \rho^2
$$
and
$$
\bsymb{\Sigma}^{-1} ~ = ~ \frac{1}{1 - \rho^2}
\begin{bmatrix}
1 & -\rho \\
-\rho & 1
\end{bmatrix}
$$

Set
$$
\mat{x} ~ = ~
\begin{bmatrix}
x \\ 
y
\end{bmatrix}
$$
and do the algebra. The multivariate normal density above is the same as the standard bivariate normal density with correlation $\rho$, derived earlier in this chapter.
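For reference, here is a sketch of that algebra. Since the mean vector is zero, the quadratic form in the exponent is

$$
\mat{x}^T \bsymb{\Sigma}^{-1} \mat{x} ~ = ~ \frac{1}{1 - \rho^2}
\begin{bmatrix}
x & y
\end{bmatrix}
\begin{bmatrix}
1 & -\rho \\
-\rho & 1
\end{bmatrix}
\begin{bmatrix}
x \\
y
\end{bmatrix}
~ = ~ \frac{x^2 - 2\rho xy + y^2}{1 - \rho^2}
$$

With $n = 2$ and $\lvert \bsymb{\Sigma} \rvert = 1 - \rho^2$, the density becomes

$$
f(x, y) ~ = ~ \frac{1}{2\pi \sqrt{1 - \rho^2}}
\exp \Big{(} -\frac{1}{2(1 - \rho^2)} \big{(} x^2 - 2\rho xy + y^2 \big{)} \Big{)}
$$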

### Key Properties of the Multivariate Normal ###
The properties of the multivariate normal distribution are the same as, or extensions of, the properties of the bivariate normal distribution. Think of the multivariate normal distribution as the joint distribution of random variables $X_1, X_2, \ldots, X_n$. We won't prove the properties below, but you will recognize them from the bivariate case and should use them as needed.

- The distribution of a linear combination of multivariate normal variables is normal. In particular, all the marginals are normal.
- The joint distribution of two or more linear combinations of multivariate normal variables is multivariate normal.
- If a random vector is multivariate normal, then the conditional joint distribution of any subset of its components given any other subset is multivariate normal.
- If multivariate normal random variables are *pairwise uncorrelated*, then they are mutually independent. 
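
The first property can be explored by simulation. The sketch below is only an illustration under assumed values: the mean vector, covariance matrix, coefficients, seed, and sample size are all our own choices. By the rules of expectation and covariance, the linear combination $a_1X_1 + a_2X_2 + a_3X_3$ has mean $\mat{a}^T \mat{\mu}$ and variance $\mat{a}^T \bsymb{\Sigma} \mat{a}$, and the simulated values should come close to these.

```python
import numpy as np

# Hypothetical parameters and coefficients (illustrative values only)
mu = np.array([1.0, 0.0, -1.0])
cov = np.array([[4.0,  1.0,  1.2],
                [1.0,  1.0, -0.9],
                [1.2, -0.9,  9.0]])
a = np.array([1.0, -2.0, 0.5])

# Mean and variance of a_1 X_1 + a_2 X_2 + a_3 X_3
theory_mean = a @ mu
theory_var = a @ cov @ a

# Simulate multivariate normal vectors; each row is one simulated vector
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mu, cov, size=200_000)

# Form the linear combination for every simulated vector
combo = samples @ a
sample_mean = combo.mean()
sample_var = combo.var()

print(theory_mean, sample_mean)
print(theory_var, sample_var)
```

A histogram of `combo` should also look roughly normal, though a simulation illustrates the property and does not prove it.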