# From Multilevel to Mixed-effects
...

## Why Collapse Our Hierarchy?

- The model is always estimated as a single unit, with the levels informing each other
- The term *multilevel* suggests that each level can be estimated separately (known as a *summary statistics* approach), which throws away the main advantage of this framework
- Software is harder to write in a multilevel fashion (though it does exist, e.g. HLM and MLwiN), whereas a single function call for a single model fits naturally within the usual `R` approach

So, very broadly, the hierarchical perspective is useful *intuitively*, whereas the mixed-effects perspective is useful *practically*.

## From Multilevel to Mixed-effects
... So, the key here is recognising that we can collapse the two equations:

$$
\begin{alignat*}{1}
    y_{ij}    &= \mu_{i} + \alpha_{j} + \eta_{ij}  \\
    \mu_{i}   &= \mu + S_{i}
\end{alignat*}
$$

After all, the second equation tells us exactly what $\mu_{i}$ is equal to. So let us replace $\mu_{i}$ in the first equation with its definition from the second. If we do so, we get

$$
\begin{alignat*}{1}
    y_{ij} &= (\mu + S_{i}) + \alpha_{j} + \eta_{ij} \\
           &= \mu + \alpha_{j} + S_{i} + \eta_{ij}
\end{alignat*}
$$

with

$$
\begin{alignat*}{1}
    S_{i}     &\sim \mathcal{N}\left(0,\sigma^{2}_{b}\right) \\
    \eta_{ij} &\sim \mathcal{N}\left(0,\sigma^{2}_{w}\right)
\end{alignat*}
$$

This is *exactly the partitioned error model we saw last week*.
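
As a quick check on the collapsed form, the sketch below simulates data directly from $y_{ij} = \mu + \alpha_{j} + S_{i} + \eta_{ij}$ and looks at the resulting variances. All numerical values (`mu`, `alpha`, `sigma.b`, `sigma.w` and the number of subjects) are arbitrary choices for illustration and are not taken from any course data.

```r
set.seed(1)

n <- 5000                 # subjects (large, so sample moments are stable)
t <- 3                    # repeated measurements per subject
mu      <- 5              # grand mean (assumed value)
alpha   <- c(0, 1, 2)     # effect of each repeat (assumed values)
sigma.b <- 2              # between-subject SD (assumed value)
sigma.w <- 1              # within-subject SD (assumed value)

S   <- rnorm(n, 0, sigma.b)                     # one S_i per subject
eta <- matrix(rnorm(n * t, 0, sigma.w), n, t)   # one eta_ij per observation

# y_ij = mu + alpha_j + S_i + eta_ij (rows = subjects, columns = repeats);
# S is recycled down the columns, so S[i] is added to every entry in row i
y <- mu + matrix(alpha, n, t, byrow = TRUE) + S + eta

# Diagonal should be near sigma.b^2 + sigma.w^2, off-diagonal near sigma.b^2
round(cov(y), 2)
```

Because every measurement from subject $i$ shares the same $S_{i}$, the total variance splits into $\sigma^{2}_{b} + \sigma^{2}_{w}$, and $\sigma^{2}_{b}$ reappears as the covariance between repeats.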

## Mixed-effects Using `nlme`

```r
library('datarium')
library('reshape2')

data('selfesteem')

# repeats and number of subjects
t <- 3
n <- nrow(selfesteem)

# reshape wide -> long
selfesteem.long <- melt(selfesteem,            # wide data frame
                        id.vars='id',          # what stays fixed?
                        variable.name="time",  # name for the new predictor
                        value.name="score")    # name for the new outcome

selfesteem.long           <- selfesteem.long[order(selfesteem.long$id),] # order by ID
rownames(selfesteem.long) <- seq_len(n*t)                                # fix row names
selfesteem.long$id        <- as.factor(selfesteem.long$id)               # convert ID to factor
```
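
With the data in long format, one plausible way to fit the collapsed model is with `lme()` from `nlme`. This is a minimal sketch rather than the definitive analysis: it assumes a fixed effect of `time` (the $\mu + \alpha_{j}$ part) and a random intercept per subject (the $S_{i}$ part). The object name `fit.lme` is made up for illustration, and the data preparation is repeated in compact form only so the block runs on its own.

```r
library(nlme)      # lme()
library(datarium)  # selfesteem data
library(reshape2)  # melt()

# Rebuild the long-format data so that this block is self-contained
data("selfesteem")
selfesteem.long <- melt(selfesteem, id.vars = "id",
                        variable.name = "time", value.name = "score")
selfesteem.long$id <- as.factor(selfesteem.long$id)

# Fixed part: score ~ time; random part: one intercept per subject
fit.lme <- lme(score ~ time, random = ~ 1 | id, data = selfesteem.long)

summary(fit.lme)   # fixed-effects table plus the random-effects summary
VarCorr(fit.lme)   # estimates corresponding to sigma^2_b and sigma^2_w
```

The `random = ~ 1 | id` argument is where the hierarchy enters: it asks for a single random term (the intercept) that varies across levels of `id`.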

## Mixed-effects Using `lme4`
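
A comparable sketch with `lmer()` from `lme4` is given here, under the same assumptions as before: a fixed effect of `time` and a random intercept per subject. The main visible difference is that `lme4` places the random term inside the model formula. The object name `fit.lmer` is made up for illustration, and the data preparation is repeated so the block runs on its own.

```r
library(lme4)      # lmer()
library(datarium)  # selfesteem data
library(reshape2)  # melt()

# Rebuild the long-format data so that this block is self-contained
data("selfesteem")
selfesteem.long <- melt(selfesteem, id.vars = "id",
                        variable.name = "time", value.name = "score")
selfesteem.long$id <- as.factor(selfesteem.long$id)

# (1 | id) requests a random intercept for each subject
fit.lmer <- lmer(score ~ time + (1 | id), data = selfesteem.long)

summary(fit.lmer)  # fixed effects and variance components
VarCorr(fit.lmer)  # between-subject and residual standard deviations
```

Note that `lme4` does not report p-values for the fixed effects by default, so the summary table will look slightly different from the `nlme` one.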

## The Implied Marginal Model
... From this perspective, a mixed-effects model is simply GLS, but with a more sophisticated way of constructing the variance-covariance matrix using the random effects. This leads to some important consequences:

1. The inferential issues around GLS apply *identically* to mixed-effects models
2. The covariance structure is *more restricted* under mixed-effects, because it is constrained by the random effects themselves
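
To make the link to GLS concrete: for a random-intercept model, the implied marginal covariance of one subject's measurements is $\sigma^{2}_{b}\mathbf{J} + \sigma^{2}_{w}\mathbf{I}$ (with $\mathbf{J}$ a matrix of ones), which is compound symmetry with correlation $\rho = \sigma^{2}_{b}/(\sigma^{2}_{b} + \sigma^{2}_{w})$. The sketch below uses simulated data with arbitrary, assumed parameter values and fits the same model twice with `nlme`: once via `lme()` and once via `gls()` with `corCompSymm()`. The two fits should agree whenever the estimated correlation is positive.

```r
library(nlme)

set.seed(2)
n <- 30; t <- 3                                    # subjects and repeats (assumed)
dat <- data.frame(id   = factor(rep(seq_len(n), each = t)),
                  time = factor(rep(paste0("t", seq_len(t)), times = n)))
S <- rnorm(n, 0, 2)                                # subject effects, sigma_b = 2 (assumed)
dat$score <- 5 + c(0, 1, 2)[as.integer(dat$time)] +
             S[as.integer(dat$id)] + rnorm(n * t, 0, 1)   # sigma_w = 1 (assumed)

# Mixed-effects: covariance built from a random intercept
fit.lme <- lme(score ~ time, random = ~ 1 | id, data = dat)

# GLS: compound-symmetric covariance specified directly
fit.gls <- gls(score ~ time, correlation = corCompSymm(form = ~ 1 | id), data = dat)

# Correlation implied by the variance components versus the GLS estimate
sigma2.b <- getVarCov(fit.lme)[1, 1]
sigma2.w <- fit.lme$sigma^2
rho.lme  <- sigma2.b / (sigma2.b + sigma2.w)
rho.gls  <- coef(fit.gls$modelStruct$corStruct, unconstrained = FALSE)

c(lme = rho.lme, gls = unname(rho.gls))
c(lme = as.numeric(logLik(fit.lme)), gls = as.numeric(logLik(fit.gls)))
```

The restriction in point 2 is visible in this setup as well: a random-intercept variance cannot be negative, so the implied correlation cannot fall below zero, whereas `corCompSymm()` is free to estimate a negative correlation.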

## Advantages of Mixed-effects

### Modelling Data Structure
... In our previous discussion, GLS was used to get around the correlation present in repeated measurement data: we impose a covariance structure for GLS to remove. Linear mixed-effects (LME) models instead get us to model the *structure* of the data. We do not need to worry about dependence, because it is accommodated *automatically* as a consequence of that structure. This is an entirely different perspective: we are not trying to model correlation, we are trying to get the structure correct. Once we do, everything else follows, correlation included.

### Pooling and Shrinkage
...

### The Bias-Variance Tradeoff
... So, from this perspective, LME models provide a solution to this issue via pooling and shrinkage. GLS provides *no pooling* ....
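
Shrinkage can be seen directly by comparing each subject's raw deviation from the grand mean (no pooling) with the predicted random intercept from an LME fit (partial pooling). The sketch below uses simulated, balanced data with arbitrary, assumed parameter values. For this balanced random-intercept case, the predicted intercepts are the raw deviations multiplied by a factor $\lambda = \sigma^{2}_{b} / (\sigma^{2}_{b} + \sigma^{2}_{w}/t)$, evaluated at the estimated variances.

```r
library(nlme)

set.seed(42)
n <- 20; t <- 3                                    # subjects and repeats (assumed)
dat <- data.frame(id   = factor(rep(seq_len(n), each = t)),
                  time = factor(rep(paste0("t", seq_len(t)), times = n)))
S <- rnorm(n, 0, 2)                                # subject effects, sigma_b = 2 (assumed)
dat$score <- 5 + c(0, 1, 2)[as.integer(dat$time)] +
             S[as.integer(dat$id)] + rnorm(n * t, 0, 2)   # sigma_w = 2 (assumed)

fit <- lme(score ~ time, random = ~ 1 | id, data = dat)

# No pooling: each subject's mean, expressed as a deviation from the grand mean
no.pool <- tapply(dat$score, dat$id, mean) - mean(dat$score)

# Partial pooling: predicted random intercepts, matched to subjects by name
partial <- ranef(fit)[names(no.pool), "(Intercept)"]

# Shrinkage factor computed from the estimated variance components
sigma2.b <- getVarCov(fit)[1, 1]
sigma2.w <- fit$sigma^2
lambda   <- sigma2.b / (sigma2.b + sigma2.w / t)

head(cbind(no.pool, partial, shrunk = lambda * no.pool))
```

The `partial` and `shrunk` columns should match, and both sit closer to zero than `no.pool`: subjects are pulled towards the overall mean, more strongly when the within-subject variance is large relative to the between-subject variance.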

### Regularisation