# From Multilevel to Mixed-effects
...

## Why Collapse Our Hierarchy?

- The model is always estimated as a single unit, with the levels informing each other
- The term *multilevel* implies that we can estimate each level separately (this is known as a *summary statistics* approach), which loses the whole advantage of this framework
- Software is harder to write in a multilevel fashion (though it does exist, e.g. MLM, MLwiN), whereas a single function call for a single model fits happily inside the usual `R` approach

So we can think, very broadly, that the hierarchical perspective is useful *intuitively*, but the mixed-effects perspective is useful *practically*.

## From Multilevel to Mixed-effects
... So, the key here is recognising that we can collapse the two equations

$$
\begin{alignat*}{1}
    y_{ij}    &= \mu_{i} + \alpha_{j} + \eta_{ij}  \\
    \mu_{i}   &= \mu + S_{i}                       \\
\end{alignat*}
$$

After all, we can see exactly what $\mu_{i}$ is equal to. So let us replace $\mu_{i}$ in the first equation with the right-hand side of the second equation. If we do so, we get

$$
\begin{alignat*}{1}
    y_{ij} &= (\mu + S_{i}) + \alpha_{j} + \eta_{ij} \\
           &= \mu + \alpha_{j} + S_{i} + \eta_{ij}
\end{alignat*}
$$

with

$$
\begin{alignat*}{1}
    S_{i}     &\sim \mathcal{N}\left(0,\sigma^{2}_{b}\right) \\
    \eta_{ij} &\sim \mathcal{N}\left(0,\sigma^{2}_{w}\right) \\
\end{alignat*}
$$

This is *exactly the partitioned error model we saw last week*.
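To see what the collapsed model implies for the repeated measurements, it can help to write out the variances and covariances. The following is a sketch under one additional assumption that is not stated above, namely that $S_{i}$ and $\eta_{ij}$ are independent of each other and across subjects and observations. The symbol $\rho$ is introduced here purely for illustration.

$$
\begin{alignat*}{1}
    \mathrm{Var}\left(y_{ij}\right)          &= \mathrm{Var}\left(S_{i}\right) + \mathrm{Var}\left(\eta_{ij}\right) = \sigma^{2}_{b} + \sigma^{2}_{w} \\
    \mathrm{Cov}\left(y_{ij},y_{ij'}\right)  &= \mathrm{Var}\left(S_{i}\right) = \sigma^{2}_{b} \quad (j \neq j') \\
    \mathrm{Cov}\left(y_{ij},y_{i'j'}\right) &= 0 \quad (i \neq i') \\
    \rho                                     &= \frac{\sigma^{2}_{b}}{\sigma^{2}_{b} + \sigma^{2}_{w}}
\end{alignat*}
$$

Under that assumption, every measurement has the same variance and every pair of measurements from the same subject has the same covariance, which is the compound symmetric pattern.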

## Mixed-effects Using `lme()`

```r
library('datarium')
library('reshape2')

data('selfesteem')

# repeats and number of subjects
t <- 3
n <- nrow(selfesteem)

# reshape wide -> long
selfesteem.long <- melt(selfesteem,            # wide data frame
                        id.vars='id',          # what stays fixed?
                        variable.name="time",  # name for the new predictor
                        value.name="score")    # name for the new outcome

selfesteem.long           <- selfesteem.long[order(selfesteem.long$id),] # order by ID
rownames(selfesteem.long) <- seq_len(n*t)                                # fix row names
selfesteem.long$id        <- as.factor(selfesteem.long$id)               # convert ID to factor
```
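To make the wide-to-long step concrete, here is a minimal sketch of the same kind of reshape on a tiny data frame, using only base `R`. The data frame `wide` and its values are invented purely for illustration, and `reshape()` is used in place of `melt()` so that the example has no package dependencies.

```r
# hypothetical toy data: 3 subjects, 3 repeated measurements (values made up)
wide <- data.frame(id = 1:3,
                   t1 = c(4.0, 2.5, 3.1),
                   t2 = c(5.0, 4.2, 4.9),
                   t3 = c(7.1, 6.8, 7.5))

# reshape wide -> long: one row per subject per time point
long <- reshape(wide,
                direction = "long",
                varying   = c("t1", "t2", "t3"),  # columns to stack
                v.names   = "score",              # name for the new outcome
                timevar   = "time",               # name for the new predictor
                times     = c("t1", "t2", "t3"),  # values of the new predictor
                idvar     = "id")                 # what stays fixed?

long           <- long[order(long$id), ]                          # order by ID
rownames(long) <- seq_len(nrow(long))                             # fix row names
long$id        <- as.factor(long$id)                              # convert ID to factor
long$time      <- factor(long$time, levels = c("t1", "t2", "t3")) # convert time to factor

print(long)
```

Each subject now occupies three consecutive rows, one per level of `time`, which is the layout that model-fitting functions such as `lme()` work with.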

## Mixed-effects Models Applied to Repeated Measurements
... So, in fact, applying the model above effectively takes us back to the *repeated measures ANOVA*. This is not a coincidence. The repeated measures ANOVA *is* a very basic multilevel/mixed-effects model. So, it may appear that despite all the discussion above, we have not actually made any progress. In a way, this is *true*. ...

This alone reveals a truth about multilevel/mixed-effects models that is not always appreciated: if you have *no* replications per subject and per repeated measurement, a multilevel/mixed-effects model will do *no better* than a repeated measures ANOVA. In fact, it is arguably *worse*, because the mixed-effects model has no correction for violations of sphericity. The covariance structure simply is what it is, and the inference is based on this assumption. These corrections exist for the repeated measures ANOVA precisely because it has no flexibility in its covariance structure. Because mixed-effects models have much more flexibility, there is no sense in applying a correction for misspecification. If we want a more flexible structure, we just need to specify it. However, because this structure is *implicit*, the data must be able to support it. When the data cannot, we fall back on a simpler structure. So, when mixed-effects and the repeated measures ANOVA converge, there is an argument to be made that mixed-effects is actually *less precise* for inference.

Now, there *are* advantages to the mixed-effects framework with a compound symmetric covariance structure. For instance, we do not need to manually assign error terms for tests...

```r
library(nlme)
library(car)

# random intercept per subject, fixed effect of time
lme.mod <- lme(score ~ time, random= ~ 1|id, data=selfesteem.long)
print(summary(lme.mod))

# omnibus test of the time effect
Anova(lme.mod)
```

```
Linear mixed-effects model fit by REML
  Data: selfesteem.long 
       AIC      BIC    logLik
  86.99346 93.47264 -38.49673

Random effects:
 Formula: ~1 | id
         (Intercept)  Residual
StdDev: 2.772358e-05 0.8859851

Fixed effects:  score ~ time 
               Value Std.Error DF   t-value p-value
(Intercept) 3.140122 0.2801731 18 11.207793   0e+00
timet2      1.793820 0.3962246 18  4.527281   3e-04
timet3      4.496220 0.3962246 18 11.347655   0e+00
 Correlation: 
       (Intr) timet2
timet2 -0.707       
timet3 -0.707  0.500

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max 
-1.4987927 -0.6279054 -0.0321792  0.4530803  2.4177254 

Number of Observations: 30
Number of Groups: 10 


Analysis of Deviance Table (Type II tests)

Response: score
      Chisq Df Pr(>Chisq)    
time 130.52  2  < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```
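The claim that the random-intercept model takes us back to the repeated measures ANOVA can be checked on simulated data. The sketch below generates data from $y_{ij} = \mu + \alpha_{j} + S_{i} + \eta_{ij}$ and compares the $F$-statistic for `time` from `lme()` with the one from `aov()` using an `Error()` term. All parameter values, the seed, and the names `sim`, `lme.sim` and `aov.sim` are assumptions made for illustration and are not part of the `selfesteem` analysis. The two statistics are only expected to agree for balanced data where the between-subject variance is estimated above zero.

```r
library(nlme)

set.seed(42)          # arbitrary seed, for reproducibility only
n       <- 10         # number of subjects (hypothetical)
t       <- 3          # number of repeats (hypothetical)
mu      <- 3          # grand mean (hypothetical)
alpha   <- c(0, 2, 4) # time effects (hypothetical)
sigma.b <- 2          # between-subject SD (hypothetical)
sigma.w <- 1          # within-subject SD (hypothetical)

# long-format design: t consecutive rows per subject
sim <- data.frame(id   = factor(rep(seq_len(n), each=t)),
                  time = factor(rep(paste0("t", seq_len(t)), times=n)))

# y_ij = mu + alpha_j + S_i + eta_ij
S         <- rnorm(n, 0, sigma.b)                  # one S_i per subject
sim$score <- mu + alpha[as.integer(sim$time)] +
             S[as.integer(sim$id)] +
             rnorm(n*t, 0, sigma.w)                # one eta_ij per observation

# same model fitted two ways
lme.sim <- lme(score ~ time, random= ~ 1|id, data=sim)
aov.sim <- aov(score ~ time + Error(id), data=sim)

# F-statistic for time from each fit
F.lme <- anova(lme.sim)["time", "F-value"]
F.aov <- summary(aov.sim)[["Error: Within"]][[1]][["F value"]][1]
print(c(lme=F.lme, aov=F.aov))
```

Note that `aov()` needs the error term to be assigned manually through `Error(id)`, whereas `lme()` works out the appropriate test from the random-effects specification.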