# Notes on Week 8 Lectures: Building Blocks

## Matrices

### M1: Introduction to Vectors and Matrices

We denote the $i^{\text{th}}$ row of matrix $A$ by

$$A_{i \bullet}$$

Similarly, we denote the $j^{\text{th}}$ column of matrix $A$ by

$$A_{\bullet j}$$

If $v$ is a $(p \times 1)$ column-vector and $x$ is a $(1 \times q)$ row-vector, then

$$v \cdot x = Y$$

is called the **outer product** and $Y$ is a $(p \times q)$ matrix.
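
As a quick numerical illustration, the sketch below multiplies a column of length 3 by a row of length 2 and obtains a $(3 \times 2)$ matrix. It assumes NumPy is available; the entries of `v` and `x` are made-up values, not taken from the lecture.

```python
import numpy as np

# Hypothetical example values: v is a (3 x 1) column, x is a (1 x 2) row.
v = np.array([[1.0], [2.0], [3.0]])
x = np.array([[4.0, 5.0]])

# A column times a row (ordinary matrix product) is the outer product.
Y = v @ x

# np.outer flattens both inputs and builds the same matrix.
Y_check = np.outer(v, x)

print(Y.shape)
print(Y)
```

Each entry is $Y_{ij} = v_i x_j$, so every row of $Y$ is a multiple of $x$ and every column is a multiple of $v$.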

We denote a column-vector whose entries are all 1 by $\iota$. That is,

$$\iota = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}$$

For this course, we sometimes refer to this as **the unit vector** (note that this differs from the usual meaning of a unit vector, namely a vector $v$ with $\lVert v \rVert = 1$).
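
One reason $\iota$ is handy is that $\iota^T x$ sums the entries of $x$. The sketch below shows this, and also that $\iota$ does not have length 1 in the usual sense. It assumes NumPy; the data vector `x` is a made-up example.

```python
import numpy as np

n = 4
iota = np.ones((n, 1))                       # column vector of ones
x = np.array([[2.0], [4.0], [6.0], [8.0]])   # hypothetical data

total = (iota.T @ x).item()   # iota' x adds up the entries of x
mean = total / n              # so iota' x / n is the sample mean

# The Euclidean norm of iota is sqrt(n), not 1.
norm_iota = np.linalg.norm(iota)

print(total, mean, norm_iota)
```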

### M2: Special Matrix Operations

For this course, the transpose of $A$ may be denoted $A'$ or by $A^T$.

### M3: Vectors and Differentiation

The gradient of a function $f(\mathbf{b}) = f(b_1, \ldots, b_q)$ is the column-vector

$$ \frac{\partial f}{\partial \mathbf{b}} (\mathbf{b}) = \begin{bmatrix} \frac{\partial f}{\partial b_1} (\mathbf{b}) \\ \vdots \\ \frac{\partial f}{\partial b_q} (\mathbf{b}) \end{bmatrix} $$

The **Hessian** is given by

$$ \frac{\partial^2 f}{\partial \mathbf{b} \partial \mathbf{b}^T} (\mathbf{b}) = \begin{bmatrix} \frac{\partial^2 f}{\partial b_1 \partial b_1} (\mathbf{b}) & \ldots & \frac{\partial^2 f}{\partial b_1 \partial b_q} (\mathbf{b}) \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial b_q \partial b_1} (\mathbf{b}) & \ldots & \frac{\partial^2 f}{\partial b_q \partial b_q} (\mathbf{b}) \end{bmatrix} $$

When $f$ is twice continuously differentiable, the Hessian is symmetric.
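
To make the definitions concrete, here is a minimal sketch for the quadratic $f(\mathbf{b}) = \mathbf{b}^T A \mathbf{b} + c^T \mathbf{b}$, whose gradient is $(A + A^T)\mathbf{b} + c$ and whose Hessian is $A + A^T$. The function, the matrix `A`, the vector `c`, and the helper `numerical_gradient` are all illustrative assumptions, not part of the lecture. NumPy is assumed.

```python
import numpy as np

# Hypothetical coefficients for f(b) = b' A b + c' b.
A = np.array([[2.0, 1.0], [0.0, 3.0]])
c = np.array([1.0, -1.0])

def f(b):
    return b @ A @ b + c @ b

def gradient(b):
    # Analytic gradient: (A + A') b + c
    return (A + A.T) @ b + c

def hessian(b):
    # Analytic Hessian: A + A' (constant for a quadratic)
    return A + A.T

def numerical_gradient(func, b, h=1e-6):
    # Central finite differences, one coordinate at a time.
    grad = np.zeros_like(b)
    for i in range(b.size):
        step = np.zeros_like(b)
        step[i] = h
        grad[i] = (func(b + step) - func(b - step)) / (2 * h)
    return grad

b0 = np.array([0.5, -2.0])
g = gradient(b0)
H = hessian(b0)

print(g, numerical_gradient(f, b0))
print(H)
```

The analytic and finite-difference gradients should agree closely, and `H` equals its own transpose even though `A` itself is not symmetric.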

## Probability

### P1: Random Variables

Given $n$ random variables, there are $\frac{n(n-1)}{2}$ distinct covariances between pairs of different random variables, since

$$ {n \choose 2} = \frac{n!}{2!(n-2)!} = \frac{n(n-1)}{2} $$

and, of course, $n$ variances.

We denote the covariance matrix by

$$
    \Sigma =
        \begin{bmatrix}
            \sigma_{11}^2 & \ldots & \sigma_{1n}^2 \\
            \vdots & \ddots & \vdots \\
            \sigma_{n1}^2 & \ldots & \sigma_{nn}^2
        \end{bmatrix}
$$
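
The counting argument can be checked against an estimated covariance matrix. The sketch below assumes NumPy and uses simulated data with $n = 4$ variables purely for illustration: the matrix is symmetric, with $n$ variances on the diagonal and $\frac{n(n-1)}{2}$ distinct covariances above it.

```python
import numpy as np
from math import comb

n = 4
n_covariances = comb(n, 2)   # distinct pairs of different variables
n_variances = n              # one variance per variable

# Hypothetical sample: 500 observations of n variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(500, n))

# rowvar=False: columns are variables, rows are observations.
Sigma_hat = np.cov(data, rowvar=False)

# Entries strictly above the diagonal are the distinct covariances.
n_upper = np.triu_indices(n, k=1)[0].size

print(n_covariances, n_variances, n_upper)
print(Sigma_hat.shape)
```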

### P2: Probability Distributions

If $x$ and $y$ are independent, then

$$
\begin{align*}
    E[xy] &= E[x] E[y] \\
    E[g(x) h(y)] &= E[g(x)] E[h(y)]
\end{align*}
$$
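
A small simulation can illustrate both identities. This is only a sketch under assumptions: the distributions (uniform for $x$, normal for $y$) and the functions $g(x) = x^2$, $h(y) = y^2$ are chosen for illustration, and the sample averages only approximate the expectations. NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
draws = 200_000

# Independent draws: x ~ Uniform(0, 1), y ~ N(2, 1) (illustrative choices).
x = rng.uniform(0.0, 1.0, size=draws)
y = rng.normal(2.0, 1.0, size=draws)

# E[xy] versus E[x] E[y]
lhs_product = np.mean(x * y)
rhs_product = np.mean(x) * np.mean(y)

# E[g(x) h(y)] versus E[g(x)] E[h(y)] with g(x) = x^2, h(y) = y^2
lhs_functions = np.mean(x**2 * y**2)
rhs_functions = np.mean(x**2) * np.mean(y**2)

print(lhs_product, rhs_product)
print(lhs_functions, rhs_functions)
```

Each pair of printed numbers should be close but not identical, since they are sample estimates.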

If $x$ is normally distributed with mean $\mu$ and variance $\sigma^2$ we write

$$ x \sim N(\mu, \sigma^2) $$

If $x$ is an $n \times 1$ vector of normally distributed random variables with mean vector $\mu_{n \times 1}$ and variance matrix $\Sigma_{n \times n}$ then we write

$$ x \sim N(\mu, \Sigma) $$

where, in particular, $x_i \sim N(\mu_i, \sigma_{ii}^2)$, with $\sigma_{ii}^2$ the $i^{\text{th}}$ diagonal entry of $\Sigma$.

If $y_{m \times 1} = A_{m \times n} x + b_{m \times 1}$ for a constant matrix $A$ and constant vector $b$, then

$$ y \sim N(A\mu+b, A \Sigma A^T) $$
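
The rule above can be sketched numerically: compute $A\mu + b$ and $A \Sigma A^T$ directly, then compare them with the sample mean and covariance of simulated $y = Ax + b$. All numbers below ($\mu$, $\Sigma$, $A$, $b$) are made-up illustrative values, and NumPy is assumed.

```python
import numpy as np

# Hypothetical parameters with n = 2 and m = 3.
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
A = np.array([[1.0, 1.0], [1.0, -1.0], [0.0, 2.0]])
b = np.array([0.0, 1.0, -1.0])

# Theoretical mean and covariance of y = A x + b.
mean_y = A @ mu + b
cov_y = A @ Sigma @ A.T

# Simulation check: each row of x is one draw from N(mu, Sigma).
rng = np.random.default_rng(0)
x = rng.multivariate_normal(mu, Sigma, size=200_000)
y = x @ A.T + b   # applies y = A x + b to every row

print(mean_y, y.mean(axis=0))
print(cov_y)
print(np.cov(y, rowvar=False))
```

Note that the shift $b$ moves the mean but does not appear in the covariance.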

If the $y_i$ are independent and $y_i \sim N(\mu, \sigma^2)$ for every $i$, then we say the $y_i$ are normally and independently distributed and denote this by NID. That is,

$$ y_i \sim \text{NID}(\mu, \sigma^2) $$

When the distribution is not necessarily normal we write IID and say the variables are independent and identically distributed.

If $z = \sum_{i=1}^n y_i^2$ where $y_i \sim \text{NID}(0,1)$ then $z \sim \chi^2(n)$. That is, $z$ follows a chi-square distribution with $n$ degrees of freedom.
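
As a rough check of this construction, the sketch below sums the squares of $n = 5$ independent standard normal draws many times and compares the sample mean and variance of $z$ with the known chi-square values $n$ and $2n$. The choice of $n$ and the number of draws are arbitrary; NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
draws = 200_000

# Each row holds n independent N(0, 1) draws.
y = rng.standard_normal(size=(draws, n))

# Sum of squares across each row: one chi-square(n) draw per row.
z = (y ** 2).sum(axis=1)

z_mean = z.mean()   # should be close to n
z_var = z.var()     # should be close to 2n

print(z_mean, z_var)
```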

If $y \sim N(0,1)$ and $z \sim \chi^2(\nu)$ with $y,z$ independent, then

$$ \frac{y}{ \sqrt{ \frac{z}{\nu} } } \sim t(\nu) $$

where $t(\nu)$ is the Student's $t$ distribution with $\nu$ degrees of freedom.

Note that

$$ \lim_{\nu \rightarrow \infty} t(\nu) = N(0,1) $$

Let $z_1 \sim \chi^2(d_1)$ and $z_2 \sim \chi^2(d_2)$ be independent. Then

$$ \frac{z_1/d_1}{z_2/d_2} \sim F(d_1,d_2) $$

where $F(d_1,d_2)$ is an $F$-distribution with $(d_1,d_2)$ degrees of freedom.
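
Both ratio constructions can be sketched by simulation. The example below builds $t$ and $F$ draws from independent normal and chi-square draws and compares two sample moments with the known values $\text{Var}[t(\nu)] = \frac{\nu}{\nu - 2}$ and $E[F(d_1, d_2)] = \frac{d_2}{d_2 - 2}$. The degrees of freedom are arbitrary illustrative choices, and NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
draws = 200_000
nu, d1, d2 = 10, 4, 12   # illustrative degrees of freedom

# Student's t: standard normal over the root of an independent chi-square / nu.
y = rng.standard_normal(draws)
z = rng.chisquare(nu, size=draws)
t_draws = y / np.sqrt(z / nu)

# F: ratio of two independent chi-squares, each divided by its degrees of freedom.
z1 = rng.chisquare(d1, size=draws)
z2 = rng.chisquare(d2, size=draws)
f_draws = (z1 / d1) / (z2 / d2)

t_var = t_draws.var()     # compare with nu / (nu - 2) = 1.25
f_mean = f_draws.mean()   # compare with d2 / (d2 - 2) = 1.2

print(t_var, f_mean)
```

The variance of the $t$ draws exceeds 1, reflecting heavier tails than $N(0,1)$; this gap shrinks as $\nu$ grows.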