# The Repeated Measures ANOVA


## The Model of *Partitioned Errors*
In the model above, we mentioned that performing the subtraction

$$
y_{ij} - \alpha_{j} = \mu + S_{i} + \epsilon_{ij}
$$

resulted in a model that was simply the grand mean *plus* pure error. In defining this "error", we split it into two parts, one associated with the subjects and one representing everything else. This process of splitting the error into different chunks is known as *partitioning* the error. In our original linear model, we only had a single error term and the model for two conditions was

$$
y_{ij} = \mu + \alpha_{j} + \epsilon_{ij}.
$$

Although we implied that the term $S_{i}$ was *added* to the model, it is more correct to think of *splitting* the original error term. If we rename the error term to $\eta_{ij}$, then this is more correctly expressed as

$$
\begin{alignat*}{1}
    y_{ij}   &= \mu + \alpha_{j} + \eta_{ij} \\
    \eta_{ij} &= S_{i} + \epsilon_{ij}
\end{alignat*}
$$

Remembering that the errors represent the *random* part of our model, the model with the single error term is given by

$$
\begin{alignat*}{1}
    y_{ij}   &= \mu + \alpha_{j} + \eta_{ij} \\
    \eta_{ij} &\sim \mathcal{N}(0, \sigma^{2})
\end{alignat*}
$$

which should be familiar from last semester. If we were to then split $\eta_{ij}$ into different chunks we end up with *multiple* random error terms

$$
\begin{alignat*}{1}
    y_{ij}        &=    \mu + \alpha_{j} + (S_{i} + \epsilon_{ij}) \\
    S_{i}         &\sim \mathcal{N}(0, \sigma^{2}_{b}) \\
    \epsilon_{ij} &\sim \mathcal{N}(0, \sigma^{2}_{w})
\end{alignat*}
$$

So now we can see *explicitly* that there are two sources of error variance, one given by $\sigma^{2}_{b}$ and one given by $\sigma^{2}_{w}$. Because both error terms have a mean of 0, the expected value of $y_{ij}$ stays exactly the same as in any other model of two groups

$$
E\left(y_{ij}\right) = \mu + \alpha_{j} = \mu_{j}.
$$

This connects directly with the idea that repeated measurements do not affect the mean function. However, what changes is the *variance* of $y_{ij}$, which is now given by

$$
\text{Var}\left(y_{ij}\right) = \sigma^{2}_{b} + \sigma^{2}_{w}.
$$

These types of models are also known as *variance components* models, because they work precisely by splitting the variance into multiple components, as shown above. So, we can now express this model in terms of its mean and variance function like so

$$
\begin{alignat*}{2}
    y_{ij}             &\sim \mathcal{N}(\mu_{j}, \sigma^{2})             &\quad \text{(Population distribution)} \\
    E(y_{ij})          &=    \mu_{j} = \mu + \alpha_{j}                   &\quad \text{(Mean function)}           \\
    \text{Var}(y_{ij}) &=    \sigma^{2} = \sigma^{2}_{b} + \sigma^{2}_{w} &\quad \text{(Variance function)}.      \\
\end{alignat*}
$$
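As a rough check on the mean and variance functions above, we can simulate from the partitioned error model and compare the sample moments with the theoretical ones. This is only a sketch: the values chosen for `mu`, `alpha`, `sigma_b` and `sigma_w` are illustrative assumptions and do not come from any dataset used in these materials.

```r
# Simulate y_ij = mu + alpha_j + S_i + e_ij for two conditions.
# All numeric values are illustrative assumptions.
set.seed(1)
n       <- 20000          # number of subjects (large, so the estimates are stable)
mu      <- 10             # grand mean
alpha   <- c(-1, 1)       # condition effects alpha_1 and alpha_2
sigma_b <- 2              # SD of the subject effects S_i
sigma_w <- 1              # SD of the remaining errors e_ij

S <- rnorm(n, mean = 0, sd = sigma_b)           # one S_i per subject
y <- cbind(
  mu + alpha[1] + S + rnorm(n, sd = sigma_w),   # y_i1
  mu + alpha[2] + S + rnorm(n, sd = sigma_w)    # y_i2
)

cond_means <- colMeans(y)        # should sit close to mu + alpha_j
cond_vars  <- apply(y, 2, var)   # should sit close to sigma_b^2 + sigma_w^2
cond_means
cond_vars
```

The subject effects leave the condition means alone and only inflate the variance of each column, which is the behaviour the mean and variance functions describe.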

So, we now have a model with a slightly more complex variance function that accommodates the fact that we have two sources of error whenever there are repeated measurements. This also connects directly with the idea that data from the same subject are *correlated*. The *covariance* between two measurements from the same subject is given by

$$
\text{Cov}(y_{i1},y_{i2}) = \text{Cov}(\mu + \alpha_{1} + S_{i} + \epsilon_{i1}, \mu + \alpha_{2} + S_{i} + \epsilon_{i2}) 
$$

Because $\mu$, $\alpha_{1}$ and $\alpha_{2}$ are *population constants*, they have 0 variance and thus do not contribute to the covariance, leading to

$$
\text{Cov}(y_{i1},y_{i2}) = \text{Cov}(S_{i} + \epsilon_{i1}, S_{i} + \epsilon_{i2}) 
$$

This can be expanded like so

$$
\text{Cov}(y_{i1},y_{i2}) = \text{Cov}(S_{i},S_{i}) + \text{Cov}(S_{i},\epsilon_{i2}) + \text{Cov}(\epsilon_{i1},S_{i}) + \text{Cov}(\epsilon_{i1}, \epsilon_{i2}). 
$$

The subject effects and the errors are not correlated as these represent independent partitions of the overall error. As such, $\text{Cov}(S_{i},\epsilon_{i2}) = \text{Cov}(\epsilon_{i1},S_{i}) = 0$. Similarly, the final errors are uncorrelated because the correlation has been *removed* by partitioning-out the subject effects. So $\text{Cov}(\epsilon_{i1}, \epsilon_{i2}) = 0$. This leaves

$$
\text{Cov}(y_{i1},y_{i2}) = \text{Cov}(S_{i},S_{i}).
$$

A key result from the definition of covariance is that the covariance of a random variable with itself is simply its variance, meaning

$$
\text{Cov}(y_{i1},y_{i2}) = \text{Cov}(S_{i},S_{i}) = \text{Var}(S_{i}) = \sigma^{2}_{b}.
$$

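All of which is to say that the variance associated with the subject-specific deflections *is* the covariance induced by the repeated measurements. Dividing this by the total variance gives the corresponding correlation, $\sigma^{2}_{b} / (\sigma^{2}_{b} + \sigma^{2}_{w})$.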
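The result of this derivation can be checked by simulation. The sketch below draws a shared subject effect for each pair of measurements and then computes the sample covariance, which should land near $\sigma^{2}_{b}$, and the sample correlation, which should land near $\sigma^{2}_{b} / (\sigma^{2}_{b} + \sigma^{2}_{w})$. As before, the parameter values are assumptions chosen purely for illustration.

```r
# Covariance between two measurements that share the same S_i.
# Parameter values are illustrative assumptions.
set.seed(2)
n       <- 20000
sigma_b <- 2                                 # so sigma_b^2 = 4
sigma_w <- 1                                 # so sigma_w^2 = 1

S  <- rnorm(n, sd = sigma_b)                 # subject effects, shared within a pair
y1 <- 10 - 1 + S + rnorm(n, sd = sigma_w)    # condition 1
y2 <- 10 + 1 + S + rnorm(n, sd = sigma_w)    # condition 2

cov_hat <- cov(y1, y2)    # theoretical value: sigma_b^2
cor_hat <- cor(y1, y2)    # theoretical value: sigma_b^2 / (sigma_b^2 + sigma_w^2)
cov_hat
cor_hat
```

Setting `sigma_b <- 0` in this sketch removes the shared term, at which point both quantities should fall to roughly zero.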

...

$$
\begin{alignat*}{1}
    \text{Var}\left(y_{i1} - y_{i2}\right) &= \text{Var}(y_{i1}) + \text{Var}(y_{i2}) - 2\text{Cov}(y_{i1},y_{i2}) \\
                                           &= \left[\sigma^{2}_{b} + \sigma^{2}_{w}\right] + \left[\sigma^{2}_{b} + \sigma^{2}_{w}\right] - 2\sigma^{2}_{b} \\
                                           &= \sigma^{2}_{w} + \sigma^{2}_{w} = 2\sigma^{2}_{w}
\end{alignat*}
$$

So, we can see that the covariance term $\sigma^{2}_{b}$ *cancels out*, which is exactly as expected from our exploration of the *model of paired differences* from earlier.
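The same cancellation can be verified exactly, without simulation, by writing the difference as a contrast applied to the $2 \times 2$ covariance matrix of one subject's pair of measurements. The values of `sigma_b2` and `sigma_w2` below are arbitrary assumptions, and any other positive values should behave the same way.

```r
# Var(y_i1 - y_i2) computed as d' Sigma_i d, with d = (1, -1).
# The variance components are arbitrary illustrative values.
sigma_b2 <- 4
sigma_w2 <- 1

# Covariance matrix of (y_i1, y_i2) for a single subject
Sigma_i <- matrix(c(sigma_b2 + sigma_w2, sigma_b2,
                    sigma_b2,            sigma_b2 + sigma_w2),
                  nrow = 2, byrow = TRUE)

d        <- c(1, -1)                          # contrast defining the difference
var_diff <- drop(t(d) %*% Sigma_i %*% d)      # quadratic form gives the variance
var_diff
all.equal(var_diff, 2 * sigma_w2)
```

Changing `sigma_b2` leaves `var_diff` untouched, which is the algebraic cancellation made concrete.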

### Partitioning the Error as a Decomposition of the Variance-covariance Matrix
Now, we will connect what we have done above with the idea of modelling the variance-covariance matrix. Rather than doing this *explicitly*, the method above was an *implicit* modelling of the covariance structure...

$$
\begin{bmatrix}
    y_{11} \\
    y_{12} \\
    y_{21} \\
    y_{22} \\
\end{bmatrix}
\sim\mathcal{N}\left(
\begin{bmatrix}
    \mu + \alpha_{1} \\
    \mu + \alpha_{2} \\
    \mu + \alpha_{1} \\
    \mu + \alpha_{2} \\
\end{bmatrix}, 
\begin{bmatrix}
    \sigma^{2}_{b} + \sigma^{2}_{w} & \sigma^{2}_{b}                  & 0                               & 0                               \\
    \sigma^{2}_{b}                  & \sigma^{2}_{b} + \sigma^{2}_{w} & 0                               & 0                               \\
    0                               & 0                               & \sigma^{2}_{b} + \sigma^{2}_{w} & \sigma^{2}_{b}                  \\
    0                               & 0                               & \sigma^{2}_{b}                  & \sigma^{2}_{b} + \sigma^{2}_{w} \\
\end{bmatrix}
\right)
$$
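To make the block structure of this matrix concrete, it can be assembled in `R` from the two variance components. This is a minimal sketch that assumes the data are ordered subject-by-subject, as in the vector above, and uses made-up values for `sigma_b2` and `sigma_w2`.

```r
# Build the variance-covariance matrix implied by the partitioned error model.
# Variance components are illustrative assumptions.
sigma_b2 <- 4     # between-subjects variance
sigma_w2 <- 1     # within-subject variance
n_subj   <- 2     # subjects
n_cond   <- 2     # repeated measurements per subject

# Block for one subject: sigma_b2 everywhere, plus sigma_w2 on the diagonal
block <- sigma_b2 * matrix(1, n_cond, n_cond) + sigma_w2 * diag(n_cond)

# Different subjects are independent, so the blocks sit on the diagonal
Sigma <- kronecker(diag(n_subj), block)
Sigma
```

Increasing `n_subj` simply repeats the same block further down the diagonal, so the whole matrix is still governed by only two parameters.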

## The Concept of Blocking

## Adding Between-subjects Factors

... So what is the correct error here? The obvious, and correct, answer is that it is the *between-subjects* error. So how do we use this in our test statistics? At present, all we have done is *remove* the between-subjects error by including the `Subject` factor in our models. However, we somehow have to use this removed error as the denominator in tests that are based on between-subjects effects.

### Partitioned Error Models

... the difference is that, rather than including the `Subject` factor as something to be estimated and tested (like any other regression coefficient), we instead want to use these effects only for the purpose of estimating the *between-subjects* variance. We already saw from previous examples that the effects of `Subject` are not interesting, as we simply ignored them. Indeed, the $t$-tests, $p$-values and other automatic treatments of these coefficients were both *uninteresting* and *unnecessary*, which is a hint that we were not really using this factor correctly when adding it to the linear model in this way. This is because `Subject` was, in fact, something known as a *random-effect*.

...This distinction between something we want to directly estimate and test and something that is used for calculating variance is the difference between *fixed-effects* and *random-effects*. As such, in order to use the `Subject` factor in this way, it must be treated as a *random-effect*.

...We have already seen models that contain random variables, as every single linear model so far has contained a *random error term*. As such, 

...The way the ANOVA calculates this error is through the usual decomposition of sums-of-squares...If we run an ANOVA on the paired model from earlier, we can see the *mean-squares* for the subject effects. This *is* the estimate of the between-subjects variance...
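As a sketch of where this mean square comes from, the code below simulates paired data, fits the model with `subject` included as an ordinary factor and pulls the mean squares from the ANOVA table. The data, the variable names `subject` and `cond`, and all parameter values are assumptions made for illustration. One caveat worth stating: under the partitioned error model with two conditions, the expected value of the subject mean square is $\sigma^{2}_{w} + 2\sigma^{2}_{b}$, so recovering $\sigma^{2}_{b}$ on its own requires subtracting the residual mean square and dividing by 2.

```r
# Mean squares from the paired model, using simulated data.
# All values and names are illustrative assumptions.
set.seed(3)
n       <- 500                                   # subjects
sigma_b <- 2                                     # sigma_b^2 = 4
sigma_w <- 1                                     # sigma_w^2 = 1

subject <- factor(rep(seq_len(n), each = 2))     # two rows per subject
cond    <- factor(rep(c("A", "B"), times = n))   # one row per condition
S       <- rnorm(n, sd = sigma_b)                # subject effects
y       <- 10 + c(-1, 1)[as.integer(cond)] +
           S[as.integer(subject)] + rnorm(2 * n, sd = sigma_w)

tab        <- anova(lm(y ~ cond + subject))      # usual sums-of-squares table
ms_subject <- tab["subject",   "Mean Sq"]        # between-subjects mean square
ms_resid   <- tab["Residuals", "Mean Sq"]        # within-subject mean square

# Method-of-moments estimate of sigma_b^2 (2 = number of conditions)
sigma_b2_hat <- (ms_subject - ms_resid) / 2
c(ms_subject = ms_subject, ms_resid = ms_resid, sigma_b2_hat = sigma_b2_hat)
```

The $F$-test and $p$-value that `anova()` prints for `subject` are exactly the automatic output that is of no interest here; only the mean square is being used.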

## RM ANOVA in `R`

### Using the `aov()` Function

```r
# Assumes y.long (outcome), cond and subject (both factors) already exist in the workspace
summary(aov(y.long ~ cond + Error(subject)))
```


So this is now very explicitly saying that we want to add `subject` to the model, but that we want it to be treated as an *error term*. This tells `R` that we want the sums-of-squares and mean-square to be calculated for `subject`, but what we do not want is for there to be tests on this factor. Instead, we want it to be treated as the *denominator* of the $F$-statistic for certain tests. So, in the ANOVA table, we want `subject` to be an additional error term, rather than an actual effect we are interested in. 
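To see the two error strata in action, here is a self-contained sketch using simulated data. The data frame `dat` and every value used to generate it are hypothetical; only the model formula mirrors the call discussed above. As a sanity check, with two conditions the $F$-statistic for `cond` in the within-subject stratum should equal the square of the paired $t$-statistic.

```r
# RM ANOVA via aov() with subject as an error term, on simulated data.
# `dat` and all generating values are hypothetical.
set.seed(4)
n   <- 30
dat <- data.frame(
  subject = factor(rep(seq_len(n), each = 2)),
  cond    = factor(rep(c("A", "B"), times = n))
)
S          <- rnorm(n, sd = 2)                   # subject effects
dat$y.long <- 10 + c(-1, 1)[as.integer(dat$cond)] +
              S[as.integer(dat$subject)] + rnorm(2 * n, sd = 1)

fit <- aov(y.long ~ cond + Error(subject), data = dat)
summary(fit)    # one table per stratum: "Error: subject" and "Error: Within"

# F for cond comes from the Within stratum
F_within <- summary(fit)[["Error: Within"]][[1]][["F value"]][1]

# Paired t-test on the same data (rows are ordered by subject within condition)
t_paired <- t.test(dat$y.long[dat$cond == "A"],
                   dat$y.long[dat$cond == "B"],
                   paired = TRUE)$statistic
all.equal(F_within, unname(t_paired)^2)
```

In the printed summary, the `Error: subject` stratum contains only a residual line, because this design has no between-subjects factor to test against it.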

### Using the `ezANOVA()` Function
As we can see above, using a partitioned error model with `aov()` is a tricky business and it would be very easy to get this wrong. As an alternative, we can use the `ezANOVA()` function from the `ez` package. As the name implies, this is designed to allow for an RM ANOVA without the usual difficulties associated with the `aov()` or `lm()` functions. Unfortunately, the aim of this package is largely to make the `R` output the same as SPSS, so it does away with the linear model framework. This means no residuals, no parameter estimates, no diagnostic plots or anything else we have made use of so far. If you *have* to use an RM ANOVA, this is the simplest way to get it *right*. However, as we will discuss below, we would dissuade you from ever considering RM ANOVA as an option in the future. About the only utility of this is showing doubtful researchers that our better options of GLS and mixed-effects models are, in fact, giving them the same answer as an RM ANOVA.

```r
library(ez)
# `dat` is a hypothetical long-format data frame holding y.long, cond and subject (a factor)
ezANOVA(data = dat, dv = y.long, wid = subject, within = cond)
```


## Why We Should *Not* Use RM ANOVA
Everything we have discussed above has really been an exercise in telling you why you really do not want to use RM ANOVA. All the unnecessary fiddling with error terms and different tests requiring different errors is a complication that we could simply do without. Even if we do manage to successfully work out what needs to go where (or get a function like `ezANOVA()` to sort it for us), we are still left with a method that has a number of meaningful restrictions. ... Because of this, the RM ANOVA is tricky to understand, tricky to use correctly and massively inflexible. It is no wonder that statisticians abandoned this method decades ago! And yet, this is the method that has persisted in psychology until relatively recently.

This section has largely been motivational, to help us understand why we want to use something more flexible and more modern, but it is important to recognise that you may well end up working with someone who knows nothing beyond the RM ANOVA. In those situations, it is useful to (a) motivate the need for something better and (b) understand how to get the RM ANOVA results in `R`, in case they require further convincing. So, we do not condone the use of the RM ANOVA, but we understand its place in psychology and also understand that there are times when you may want to see what the RM ANOVA says, even if you do not wish to use it.