# Contrasts
Our first step towards trying to reach some conclusions about our data is to take the parameter estimate images and turn them into images of *statistics*. These images of statistics are known as *statistical parametric maps* (SPMs). In this section, we are going to explore how to create these statistical maps, before turning to what we actually do with them in a later section.

## Forming a Statistical Parametric Map
As we know from the world of mathematical statistics, we can use a parameter estimate and its standard error to form a test statistic by dividing one by the other, like so

$$
t = \frac{\hat{\beta}_{k}}{\sigma\{\beta_{k}\}}
$$

where $\sigma\{\beta_{k}\}$ is the standard error of parameter estimate $k$. This value forms a standardised metric of the magnitude of our effect, relative to the degree of uncertainty. We can think of it as how large our effect is per unit of uncertainty, or, in other words, the size of the effect in standard error units. For a given estimate, the smaller the uncertainty, the larger the magnitude of $t$ will become.

From the perspective of hypothesis testing, the value of $t$ represents an implicit comparison with a hypothesised population value of 0 for the parameter in question. So we are assessing how compatible our estimate is with a population value of 0, given our degree of uncertainty. The larger the magnitude of $t$, the greater the discrepancy between the estimated parameter value and the hypothesised parameter value. In this instance, that means that the bigger $t$ becomes, the less plausible it is that the parameter truly is 0 in the population.

When dealing with *images* of parameter estimates, we can perform the same calculation at each voxel, converting each parameter estimate into a $t$-value. This forms a statistical parametric map of $t$-statistics, also known as an SPM{t} image, as shown in {numref}`creating-t-fig`.

```{figure} images/creating-t.png
---
width: 800px
name: creating-t-fig
---
Illustration of how an SPM{t} image can be created by dividing each voxel in the parameter estimate image by the corresponding voxel in the standard error image.
```

Notice already that this gives us some insight into why we want to create an image of statistics. If we just looked at the parameter estimates image, we might see a large effect in the brain stem. However, looking at the standard error image, this region has a large degree of uncertainty around the estimate. As such, once this uncertainty is taken into account, the effect in the brain stem disappears. This nicely highlights why we should take the degree of uncertainty into account before trying to interpret the values of the parameter estimates.
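As a minimal sketch of the voxel-wise division, the snippet below uses two small NumPy arrays as stand-ins for a parameter estimate image and a standard error image. The array values, the names `beta_img`, `se_img` and `t_img`, and the convention that a standard error of 0 marks a voxel outside the brain are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical 2 x 3 "images" standing in for one slice of a brain volume
beta_img = np.array([[0.5, 2.0, 6.0],
                     [1.5, 0.0, 3.0]])
se_img = np.array([[0.25, 0.5, 4.0],
                   [0.5, 0.0, 1.0]])  # assumption: 0 marks a voxel outside the brain

# Divide voxel by voxel, leaving t = 0 wherever the standard error is 0
t_img = np.zeros_like(beta_img)
np.divide(beta_img, se_img, out=t_img, where=se_img > 0)

print(t_img)
```

In this toy example the voxel with the largest estimate (6.0) also has the largest standard error (4.0), so it ends up with the smallest $t$-value of the in-brain voxels (1.5).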

From this image of $t$-values we could then follow classical statistical theory and calculate an image of $p$-values, one for each $t$-statistic. We can then use the $p$-values to assess where in the brain we have significant effects. For instance, {numref}`t-p-thresh-fig` shows an image of $t$-values on the left and its associated image of $p$-values in the middle. The image on the right highlights which voxels have a $p$-value below 0.05. So, following the traditional null hypothesis significance testing (NHST) procedure, we have now localised where in the brain this parameter estimate is significantly different from 0.

```{figure} images/t-p-thresh.png
---
width: 800px
name: t-p-thresh-fig
---
Example of applying traditional significance-testing criteria to an image of $t$-statistics.
```
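The conversion from $t$-values to $p$-values and then to a thresholded image could be sketched as follows, using the survival function of the $t$-distribution from SciPy. The $t$-values, the degrees of freedom `df = 100`, and the choice of a two-sided test are assumptions for illustration only. A one-sided test would drop the absolute value and the factor of 2.

```python
import numpy as np
from scipy import stats

# Hypothetical image of t-values
t_img = np.array([[0.0, 1.0, 2.5],
                  [-3.0, 0.5, 4.0]])
df = 100  # assumed residual degrees of freedom (hypothetical value)

# Two-sided p-value: probability of a |t| at least this large
# when the population parameter is 0
p_img = 2 * stats.t.sf(np.abs(t_img), df)

# Boolean mask of voxels that pass the traditional 0.05 criterion
sig_mask = p_img < 0.05

print(np.round(p_img, 4))
print(sig_mask)
```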

Unfortunately, there is a big problem with calculating $p$-values in this fashion, but we will get to that later in this lesson. For now, let us focus on the process of creating a statistical map. One issue with what we have done so far is that our approach to calculating $t$ is limited to only testing a single parameter at a time. Typically, in fMRI, our parameter estimates will represent signal change in relation to different experimental conditions. Because of this, the *differences* between the experimental conditions are often of interest. We might also want to look at the averages of several conditions, or perhaps look at differences of averages. As such, in order to make our hypothesis testing framework more flexible, we have to turn to the use of *contrasts*.

## Contrasts Theory
A contrast is simply a linear combination of parameter estimates. In the framework of the GLM, we can define a contrast as

$$
c = \mathbf{L}\hat{\boldsymbol{\beta}},
$$

where we can see our parameter estimates are being pre-multiplied by a vector called $\mathbf{L}$. This vector contains *weights*, which work to create a linear combination of our parameter estimates. To see this, consider the following example

$$
\begin{align}
c = \mathbf{L}\hat{\boldsymbol{\beta}} &= 
\begin{bmatrix}
    0 & 1 & -1
\end{bmatrix}
\begin{bmatrix}
    \hat{\beta}_{0} \\
    \hat{\beta}_{1} \\
    \hat{\beta}_{2}
\end{bmatrix} \\
&= (0 \times \hat{\beta}_{0}) + (1 \times \hat{\beta}_{1}) + (-1 \times \hat{\beta}_{2}) \\
&= \hat{\beta}_{1} - \hat{\beta}_{2}.
\end{align}
$$

So, by specifying weights of $[0, 1, -1]$, we subtract the estimate for $\beta_{2}$ from the estimate for $\beta_{1}$, while the weight of 0 removes $\beta_{0}$ from the calculation altogether. Although this may seem an overly complicated way of subtracting two numbers, the point is that this is a *framework* where we can put any weights in $\mathbf{L}$ to form a variety of values from the estimates. For example, we could use a contrast to average the estimates

$$
c = \mathbf{L}\hat{\boldsymbol{\beta}} = 
\begin{bmatrix}
    0 & 1/2 & 1/2
\end{bmatrix}
\begin{bmatrix}
    \hat{\beta}_{0} \\
    \hat{\beta}_{1} \\
    \hat{\beta}_{2}
\end{bmatrix}
= \frac{\hat{\beta}_{1} + \hat{\beta}_{2}}{2}.
$$
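To see the same weights at work on images rather than single numbers, here is a minimal sketch that applies both contrasts at every voxel of a small stack of parameter estimate images. The image values and the names `beta_imgs`, `L_diff` and `L_avg` are hypothetical, chosen only to make the arithmetic easy to check by hand.

```python
import numpy as np

# Hypothetical stack of three parameter estimate images, shape (3, 2, 2):
# the first axis indexes the parameters beta_0, beta_1 and beta_2
beta_imgs = np.array([
    [[10.0, 10.0], [10.0, 10.0]],  # beta_0
    [[3.0, 4.0], [1.0, 2.0]],      # beta_1
    [[1.0, 4.0], [2.0, 0.0]],      # beta_2
])

L_diff = np.array([0.0, 1.0, -1.0])  # beta_1 - beta_2
L_avg = np.array([0.0, 0.5, 0.5])    # (beta_1 + beta_2) / 2

# tensordot sums over the parameter axis, weighting each image by its entry in L
c_diff = np.tensordot(L_diff, beta_imgs, axes=1)
c_avg = np.tensordot(L_avg, beta_imgs, axes=1)

print(c_diff)  # beta_1 image minus beta_2 image
print(c_avg)   # mean of the beta_1 and beta_2 images
```

Because the weight on $\beta_{0}$ is 0 in both cases, the values in the first image have no influence on either result.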