# The Repeated Measures ANOVA


## The Concept of Blocking

## Adding Between-subjects Factors

... So what is the correct error here? The obvious, and correct, answer is that it is the *between-subjects* error. So how do we use this in our test statistics? At present, all we have done is *remove* the between-subjects error by including the `Subject` factor in our models. However, we somehow have to use this removed error as the denominator in tests of between-subjects effects.

### Partitioned Error Models

... the difference is that, rather than including the `Subject` factor as something to be estimated and tested (like any other regression coefficient), we instead want to use these effects only for the purpose of estimating the *between-subjects* variance. We already saw from previous examples that the effects of `Subject` are not interesting, as we simply ignored them. Indeed, the $t$-tests, $p$-values and other automatic treatments of these coefficients were both *uninteresting* and *unnecessary*, which is a hint that we were not really using this factor correctly when adding it to the linear model in this way. This is because `Subject` was, in fact, something known as a *random-effect*.

...This distinction between something we want to directly estimate and test, and something that is used for calculating variance, is the difference between *fixed-effects* and *random-effects*. As such, in order to use the `Subject` factor in this way, it must be treated as a *random-effect*.

...We have already seen models that contain random variables, as every single linear model so far has contained a *random error term*. As such, 

...The way the ANOVA calculates this error is through the usual decomposition of sums-of-squares...If we run an ANOVA on the paired model from earlier, we can see the *mean-square* for the subject effects. This *is* the estimate of the between-subjects variance...
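To make the decomposition concrete, the following is a minimal sketch using *simulated* paired data. The object names (`y`, `cond`, `subject`) and all numerical values are assumptions made purely for illustration and are not the data used elsewhere in this section. It fits the paired model with `subject` as an ordinary factor, prints the ANOVA table and pulls out the mean-square for the subject effects.

```r
# Simulated paired data: every name and value here is an illustrative assumption
set.seed(1)
n       <- 20                                   # number of subjects
subject <- factor(rep(seq_len(n), times = 2))   # subject label for each observation
cond    <- factor(rep(c("A", "B"), each = n))   # within-subjects condition
u       <- rnorm(n, mean = 0, sd = 2)           # subject-specific offsets
y       <- 10 + 1.5 * (cond == "B") + u[as.integer(subject)] + rnorm(2 * n)

# Paired model with subject entered as an ordinary factor, then the usual
# decomposition into condition, subject and residual sums-of-squares
tab <- anova(lm(y ~ cond + subject))
print(tab)

# Mean-square for the subject effects: the quantity discussed in the text
ms.subject <- tab["subject", "Mean Sq"]
print(ms.subject)

# Check that the three sums-of-squares add up to the total sum-of-squares of y
ss.total <- sum((y - mean(y))^2)
print(all.equal(sum(tab[["Sum Sq"]]), ss.total))
```

Notice that `anova()` also reports an $F$-test for `subject`. This is exactly the sort of automatic output that we do not need here, as the only entry of interest in that row is the mean-square.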

## RM ANOVA in `R`

### Using the `aov()` Function

In [1]:
summary(aov(y.long ~ cond + Error(subject)))



This call now states explicitly that we want `subject` in the model, but treated as an *error term*. It tells `R` to calculate the sums-of-squares and mean-square for `subject`, but not to test this factor. Instead, its mean-square is used as the *denominator* of the $F$-statistic for certain tests. So, in the ANOVA table, `subject` appears as an additional error term, rather than as an effect we are interested in.
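As a hedged illustration of what the `Error()` term does, the sketch below simulates a small mixed design with one between-subjects factor and one within-subjects factor. The data frame `dat`, the factor `group` and all numerical values are hypothetical and only serve to make the example self-contained. For a balanced design like this one, the $F$-statistic for `group` in the `subject` stratum should agree with a one-way ANOVA on the subject means, which is one way of seeing that the between-subjects error is being used as the denominator.

```r
# Simulated mixed design: all names and values are illustrative assumptions
set.seed(2)
n.per.group <- 10
n           <- 2 * n.per.group
dat <- data.frame(
  subject = factor(rep(seq_len(n), each = 2)),                   # 2 rows per subject
  group   = factor(rep(c("g1", "g2"), each = 2 * n.per.group)),  # between-subjects
  cond    = factor(rep(c("A", "B"), times = n))                  # within-subjects
)
u     <- rnorm(n, sd = 2)                                        # subject offsets
dat$y <- 10 + 1 * (dat$group == "g2") + 1.5 * (dat$cond == "B") +
  u[as.integer(dat$subject)] + rnorm(nrow(dat))

# Partitioned error model: subject defines an error stratum, not a tested effect
fit <- aov(y ~ group * cond + Error(subject), data = dat)
print(summary(fit))

# F for the between-subjects effect, taken from the subject stratum
strat   <- summary(fit)[["Error: subject"]][[1]]
F.group <- strat[1, "F value"]

# The same test computed from the subject means alone
subj.means <- aggregate(y ~ subject + group, data = dat, FUN = mean)
F.means    <- anova(lm(y ~ group, data = subj.means))["group", "F value"]
print(all.equal(F.group, F.means))
```

The summary is split into an `Error: subject` stratum, which contains the test of `group`, and an `Error: Within` stratum, which contains the tests of `cond` and the `group:cond` interaction. Each stratum has its own residual mean-square acting as the denominator.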

### Using the `ezANOVA()` Function
As we can see above, using a partitioned error model with `aov()` is a tricky business and it would be very easy to get this wrong. As an alternative, we can use the `ezANOVA()` function from the `ez` package. As the name implies, this is designed to allow for an RM ANOVA without the usual difficulties associated with the `aov()` or `lm()` functions. Unfortunately, the aim of this package is largely to make the `R` output the same as SPSS, so it does away with the linear model framework. This means no residuals, no parameter estimates, no diagnostic plots or anything else we have made use of so far. If you *have* to use an RM ANOVA, this is the simplest way to get it *right*. However, as we will discuss below, we would dissuade you from ever considering RM ANOVA as an option in the future. About the only utility of this is showing doubtful researchers that our better options of GLS and mixed-effects models are, in fact, giving them the same answer as an RM ANOVA.

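In [3]:
library(ez)
ezANOVA(data = dat, dv = y.long, wid = subject, within = cond)

Here `dat` is a placeholder name for a long-format data frame that holds `y.long`, `cond` and `subject` as columns, with `cond` and `subject` coded as factors.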
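For a complete call, the following is a minimal sketch on *simulated* data with a single three-level within-subjects factor. The data frame `dat`, its columns and all numerical values are assumptions for illustration only. The arguments `data`, `dv`, `wid` and `within` are the ones `ezANOVA()` uses to identify the outcome, the subject identifier and the within-subjects factor. The `ez` call is wrapped in `requireNamespace()` so that the code still runs when the package is not installed.

```r
# Simulated one-way repeated measures data: names and values are assumptions
set.seed(3)
n   <- 12
dat <- data.frame(
  subject = factor(rep(seq_len(n), each = 3)),           # 3 rows per subject
  cond    = factor(rep(c("A", "B", "C"), times = n))     # within-subjects factor
)
u     <- rnorm(n, sd = 2)                                # subject offsets
dat$y <- 10 + c(0, 1, 2)[as.integer(dat$cond)] +
  u[as.integer(dat$subject)] + rnorm(nrow(dat))

# Reference result from the partitioned error model with aov()
fit.aov <- aov(y ~ cond + Error(subject), data = dat)
within  <- summary(fit.aov)[["Error: Within"]][[1]]
F.aov   <- within[1, "F value"]
print(F.aov)

# The same analysis through ezANOVA(), if the ez package is available
if (requireNamespace("ez", quietly = TRUE)) {
  res <- ez::ezANOVA(data = dat, dv = y, wid = subject, within = cond)
  print(res$ANOVA)
  # Compare the F-statistic for cond with the one from aov()
  print(all.equal(res$ANOVA$F[res$ANOVA$Effect == "cond"], F.aov,
                  tolerance = 1e-6))
}
```

Because `cond` has more than two levels here, the `ezANOVA()` output also includes Mauchly's test and sphericity corrections alongside the main ANOVA table, which is in keeping with its SPSS-style aims.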


## Why We Should *Not* Use RM ANOVA
Everything we have discussed above has really been an exercise in telling you why you really do not want to use RM ANOVA. All the unnecessary fiddling with error terms, and different tests requiring different errors, is a complication that we could simply do without. Even if we do manage to successfully work out what needs to go where (or get a function like `ezANOVA()` to sort it for us), we are still left with a method that has a number of meaningful restrictions. ... Because of this, the RM ANOVA is tricky to understand, tricky to use correctly and massively inflexible. It is no wonder that statisticians abandoned this method decades ago! And yet, this is the method that has persisted in psychology until relatively recently.

This section has largely been motivational to understand why we want to use something more flexible and more modern, but it is important to recognise that you may well end up working with someone who knows nothing beyond the RM ANOVA. In those situations, it is useful to (a) motivate the need for something better and (b) understand how to get the RM ANOVA results in `R`, in case they require further convincing. So, we do not condone the use of the RM ANOVA, but we understand its place in psychology and also understand that there are times where you may want to see what the RM ANOVA says, even if you do not wish to use it.