# Introduction to Mixed-effects Models
We will start with a brief high-level discussion of mixed-effects models. Because these types of models can be quite complex, we will talk only very generally about a few key topics before turning to the actual theory in the next part of the lesson.

## Multilevel vs Hierarchical vs Mixed-effects
Perhaps one of the more confusing aspects of mixed-effects models is that they have *multiple names*. Worse still, those names come with different ways of *writing* and *thinking about* these models, to the extent that naive researchers sometimes do not even realise they are the same thing. So, it is important that we establish this from the start: a mixed-effects model can also be called a *multilevel model* or a *hierarchical linear model*. This should be burned into your brain by the end of this lesson.

Although it is typical to teach these types of models from only *one* of these perspectives, we are taking the approach of teaching them *all* simultaneously. Though this sounds challenging, there is good reason for doing so:

- Multilevel and hierarchical refer to the *same* perspective, so there are only *two* ways of looking at these models, not *three*.
- Both perspectives have their merit and it is likely you will find one of them more intuitive than the other. If you know *both*, you can choose.
- Different software requires different perspectives. If you know both, you can use anything you want.

From the perspective of this lesson, there is a preference for *thinking about* these models as multilevel/hierarchical, but then *implementing* them as mixed-effects. This involves building the model conceptually as a multilevel/hierarchical model, and then translating it into mixed-effects form for the purpose of fitting it in software. You may disagree, but that is part of the point of presenting *both* perspectives.
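As a minimal sketch of what this translation looks like, consider a random-intercept model for measurement $j$ on subject $i$ with a single predictor $x_{ij}$. The symbols used here are assumptions made for illustration and may differ from the notation used when we reach the theory. In multilevel form, the model is written as a separate set of equations for each level:

$$
\begin{aligned}
\text{Level 1:} \quad y_{ij} &= \beta_{0i} + \beta_{1i} x_{ij} + \epsilon_{ij}, & \epsilon_{ij} &\sim \mathcal{N}(0, \sigma^2) \\
\text{Level 2:} \quad \beta_{0i} &= \gamma_{00} + u_{i}, & u_{i} &\sim \mathcal{N}(0, \sigma_u^2) \\
\beta_{1i} &= \gamma_{10}
\end{aligned}
$$

Substituting the Level 2 equations into Level 1 collapses everything into a single equation, which is the mixed-effects form:

$$
y_{ij} = \underbrace{\gamma_{00} + \gamma_{10} x_{ij}}_{\text{fixed effects}} + \underbrace{u_{i}}_{\text{random effect}} + \epsilon_{ij}
$$

Both forms describe exactly the same model. The multilevel form makes it explicit that each subject has their own intercept, whereas the mixed-effects form is closer to what we type into software.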

In addition, we will use the term *multilevel* throughout. The term *hierarchical* is entirely equivalent, so if anyone ever talks to you about *hierarchical linear models*, they are talking about *multilevel models*. Our preference for *multilevel* comes down to it being (a) more directly descriptive of the model form and (b) easier to say. 

## Estimation of Mixed-effects Models
Before we even get to understanding mixed-effects models, it is important to realise that these types of models can be *computationally challenging*. In general, the examples we provide will not push the computer very hard and will be very quick to fit. However, in real datasets, things can be more challenging. All the software we will present leverages iterative maximum likelihood methods for finding the model parameters. This is important to understand for several reasons:

- Iterative algorithms can fail if they do not converge on a solution within the default number of iterations.
- We are not finding closed-form solutions here. Instead, we are using the computer to search for the "best" solution in an unknown landscape of possibilities.
- Failure can reflect either a *structural* problem with the model or a *computational* problem, such as the algorithm itself struggling or the computer running out of memory.

You should not run into any problems using the examples in these lessons because they are chosen to be *simple*. However, in the real world, using large datasets, these problems can emerge. This is precisely because we have left the world of least-squares certainty and have entered the world of iterative optimisation algorithms.
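As a hedged illustration of what dealing with these problems can look like in practice, the sketch below uses `nlme` to raise the iteration limits of the optimiser and to catch a fitting failure rather than letting it stop the script. The `Orthodont` dataset ships with `nlme` and is used purely as a stand-in. The specific limits chosen here are arbitrary assumptions, not recommendations.

```r
library(nlme)

# Raise the iteration limits above their defaults (values are illustrative only)
ctrl <- lmeControl(maxIter = 200, msMaxIter = 200, msMaxEval = 1000)

# Fit a random intercept and slope model, catching any fitting error
fit <- tryCatch(
  lme(distance ~ age,
      random  = ~ age | Subject,
      data    = Orthodont,
      control = ctrl),
  error = function(e) {
    message("Model fitting failed: ", conditionMessage(e))
    NULL
  }
)

# Only inspect the model if fitting succeeded
if (!is.null(fit)) print(summary(fit))
```

Raising the limits only helps with *computational* failures. If the problem is *structural*, such as a random-effects specification that the data cannot support, then more iterations will not fix it and the model itself needs simplifying.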

In addition, most mixed-effects software will have the option of finding parameters using either *maximum likelihood* (ML) or *restricted maximum likelihood* (REML). As we know from last semester, ML is *biased* when estimating variance terms because it treats the mean structure as *known*. REML removes, or at least reduces, this bias because it is able to accommodate the uncertainty in estimating the mean structure. So, obviously we just want to use REML? In general, yes. However, there is a catch. The restricted likelihood depends on the fixed-effects part of the model, so REML fits of models with *different* fixed effects are not directly comparable. As such, we need to make sure we are using ML when performing model comparisons that involve the fixed effects. We will discuss this in more detail in the associated workshop, but it is worth noting that we cannot ignore the estimation procedure when it comes to interrogating what our models are telling us about the data.
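The sketch below shows how this choice might be made in `nlme`, where the `method` argument of `lme()` switches between the two estimators. It again uses the `Orthodont` dataset that ships with `nlme`, and the models themselves are assumptions chosen only for illustration.

```r
library(nlme)

# Same model fitted with REML (the default in lme) and with ML
fit_reml <- lme(distance ~ age, random = ~ 1 | Subject,
                data = Orthodont, method = "REML")
fit_ml   <- lme(distance ~ age, random = ~ 1 | Subject,
                data = Orthodont, method = "ML")

# Compare the variance component estimates from the two methods
VarCorr(fit_reml)
VarCorr(fit_ml)

# To compare models with different fixed effects, fit both with ML
fit_ml_sex <- lme(distance ~ age + Sex, random = ~ 1 | Subject,
                  data = Orthodont, method = "ML")
anova(fit_ml, fit_ml_sex)
```

A common workflow is to use ML for the comparison and then refit the chosen model with REML for the final variance estimates.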

## Inference in Mixed-effects Models

## Mixed-effects Applied to Repeated Measurements

## Packages for Mixed-effects
As a final discussion, we need to address the fact that there are *two* main packages used in `R` for fitting mixed-effects models. This is unfortunate because it adds a degree of confusion that we could do without. This situation is largely a *historic* one. However, neither package can be considered superior to the other, as they each have their pros and cons. Unfortunately, no single *uber package* yet exists that combines all the advantages of both. Until that day, you need to make an informed decision about which to use for a given analysis.

For the vast majority of these materials, we will stick with `nlme`. However, we may mention `lme4` at certain points. You will likely see more examples of `lme4` in the wild because this is the *newer* of the two. However, it is our opinion that `lme4` has a distinct *disadvantage* that often makes it *less suitable* for behavioural research. We will discuss this more below.


### `nlme`

### `lme4`
... The *disadvantage* of `lme4` is that it has no facility for accommodating between-subjects variance differences. For example, if you had two independent groups of patients and controls, `lme4` would force you into assuming homogeneity of variance. `nlme`, on the other hand, would allow you to estimate a different variance for each group, thus lifting this assumption. The problem is that distinct groups of individuals will often differ in terms of how variable their responses are. Depending upon the condition under study, it may be perfectly reasonable to assume that the patient group will be *more variable* than the controls. Indeed, this additional variability may be a large part of what defines the patient group. Simply *ignoring* this can bias inference. So, in this situation, it may be better to use `nlme` over `lme4`. However, note that this is entirely *data-dependent*. If we have no between-subjects grouping then there is no issue.
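As a hedged sketch of what lifting this assumption can look like, the example below uses the `weights` argument of `lme()` together with `varIdent()` to estimate a separate residual variance for each level of a between-subjects factor. There is no patient/control dataset to hand here, so `Sex` in the `Orthodont` data that ships with `nlme` stands in for the grouping variable. This is an assumption made purely for illustration.

```r
library(nlme)

# Homogeneous model: one residual variance shared by both groups
fit_hom <- lme(distance ~ age * Sex,
               random = ~ 1 | Subject,
               data   = Orthodont)

# Heterogeneous model: a separate residual variance for each level of Sex
fit_het <- lme(distance ~ age * Sex,
               random  = ~ 1 | Subject,
               data    = Orthodont,
               weights = varIdent(form = ~ 1 | Sex))

# Estimated ratio of residual standard deviations between the groups
fit_het$modelStruct$varStruct

# Both models share the same fixed effects, so they can be compared under REML
anova(fit_hom, fit_het)
```

The `varIdent()` structure expresses the residual standard deviation of each group as a ratio relative to a reference group, so a value far from 1 suggests the groups differ in how variable they are.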