-
Notifications
You must be signed in to change notification settings - Fork 6
Finding an optimal mixed effects model
After processing your MRI data you may be wondering to yourself, what is the most appropriate statistical model to explain the variance in my data?
Here I will go through the steps necessary to compare models based on best fit of the data. For example, if you have longitudinal data and are unsure how to fit the effect of time, you may want to compare a linear to a quadratic fit (old example). Alternatively, you could compare two models with varying complexity, as in the first example. The preferred method implements the AIC as described first, but the older method is included at the bottom as well.
Here we compare one model that interacts timepoint (age) and treatment including sex as a covariate with a second model that interacts age, sex, and treatment. While this example is written for mincLmer
, it applies mostly the same for vertexLmer
or anatLmer
.
You will need to load the following libraries in R:
> library(RMINC)
> library(lme4)
First define a function to calculate AIC:
AIC_summary <-
function(mmod)
c(RMINC:::fixef_summary(mmod), AIC = extractAIC(mmod)[2])
Then, run both of the models you are interested in comparing.
mlm <-
mincLmer(jacobians ~ Treatment * age + Sex + (1|ID)
, data = data
, REML = F
, summary_type = AIC_summary)
mlm2 <-
mincLmer(jacobians ~ Treatment * age * Sex + (1|ID)
, data = data
, REML = F
, summary_type = AIC_summary)
When they have run completely, write the AIC columns to minc files thus:
mincWriteVolume(mlm, “model1_AIC.mnc”, column = ‘AIC’)
mincWriteVolume(mlm2, “model2_AIC.mnc”, column = ‘AIC’)
Then, open a terminal and load minc-toolkit
In your terminal, type:
minccalc -expression 'A[0]<A[1]?1:2' model1_AIC.mnc model2_AIC.mnc winnermap.mnc
In this expression, A[0] is the first .mnc file you enter (model1_AIC.mnc). A[1] is the second (model2_AIC.mnc). This expression will compare A[0] to A[1] at each voxel and create an output mnc (winnermap.mnc) showing at each voxel which model is better. For AIC, a smaller value indicates a better model. If A[0] is less (better) than A[1], that voxel receives a value of 1. If not, that voxel receives a value of 2. Then, we can Display winnermap.
Display winnermap.mnc
Areas that are black indicate model1 is a better fit. Areas that are white indicate model2 is a better fit.
In some cases it may be obvious which model is better, but in this example both maps account for a fairly equal amount of the image. Two things can help you decide which model to end up using. First: pay attention to areas that may be subject to motion artefacts, for example dorsal, posterior areas. Second: check if one model is more symmetrical. In this case, I chose model 1.
If you are simply working with a linear model, and not a linear mixed effects model, you can do the following: First run your two models and store them as model1 and model 2:
model1 = mincLm(rel_jac_overall ~ new_groups*sex,filtered.data, mask=votedmask)
model2 = mincLm(rel_jac_overall ~ new_groups+sex,filtered.data, mask=votedmask)
Then you can run compare_models
to compare the two models using AIC:
model_comparison = compare_models(model1, model2, metric=AICc)
The output looks as follows, where the first column is the AIC for the first model (for each voxel), and the second column corresponds to the second model:
model_comparison
model_comparison matrix, showing 6 of 24749075 rows, and 2 of 2 columns
print more by supplying n and width to the print function
showing models:
showing models:
1: rel_jac_overall ~ new_groups * sex
2: rel_jac_overall ~ new_groups + sex
[,1] [,2]
[1,] 6.189573 8.285714
[2,] 6.189573 8.285714
[3,] 6.189573 8.285714
[4,] 6.189573 8.285714
[5,] 6.189573 8.285714
[6,] 6.189573 8.285714
If you want the AIC map for each model you can store model_comparison[,1] or model_comparison[,2] as minc files:
mincWriteVolume(buffer=model_comparison[,1], output.filename = "AIC_model1.mnc", like.filename = "your_nlin3_average.mnc")
Alternatively, if you want a map as the one described for the lmer method, you can run the following line:
winner = apply(compare_models(model1,model2), 1, which.min)
As above, areas in black will be best explained by model 1, and areas in white are best explained by model 2.
If your AIC winnermap looks pretty evenly split between your tested models, it might be challenging to see where exactly your brain region of interest is to decide on your winning model. In this example, I'm interested in choosing a model for the hippocampus only.
Step 1: Select your area of interest using the DSURQE label csv file
Open DSURQE_40micron_R_mapping.csv and make note of the right and left label numbers in the region. For example, I took the labels highlighted in the image below:
Step 2: make a lookup table with the chosen labels on one column
Label each row as 1 on a separate column to note that these are the regions that are significant in your mask.
Step 3: Run minclookup
This command performs a lookup table operation on each voxel of a minc file. Here, I set my input lookup table as the csv file we just created above. This will point the command to look for the voxels specified in the table within the input file. For example:
minclookup -discrete -lookup_table hc_lookup.csv labels_on_average.mnc hc_label_on_average.mnc
I used the option -discrete
to specify that my lookup table is discrete. This will make sure that each input value is treated as an integer, and if it is found in the lookup table, then the corresponding value is written to the output file. If it is not found, then a null value is written out (zero by default). For example:
mincmath -mult hc_label_on_average.mnc winnermap_linearvspoly2.mnc hc_winnermap_linearvspoly2.mnc
Final hippocampus label file will look something like this:
Step 4: run mincmath
Use this command to multiply each value of each voxel in the lable file we just created with the values in your winnermap. This way, you're basically croppping out all the AIC labels that aren't part of the hippocampus.
mincmath -mult hc_label_on_average.mnc winnermap_linearvspoly2.mnc hc_winnermap_linearvspoly2.mnc
Final output winnermap will look like this, where gray regions indicate model 1 and white regions indicate model 2.
What you can also do instead of making a histogram is just re-running your AIC with a mask that only covers your area of interest, not the whole brain. To do this, you would still need to use minclookup
to create your region mask. Once you have your region mask, run AIC and make the winnermap as described above under "AIC". The final winnermap would look like this for the hippocampus:
To make your life easier, you can display your winnermap with the region label file so that you can easily tell whether the white regions are those outside of the hippocampus or regions within the hippocampus where the second model won.
You will need to load the following libraries in R:
> library(RMINC)
> library(lme4)
Once you've set up your data frames, you will run your first, most simple model. Here I am modelling a linear age term using a polynomial. When comparing models with different fixed effects, as we will in this example, you should set REML=FALSE
. You would use REML=TRUE
if you are comparing models with different random effects, or when performing your actual statistical analysis.
mod_lin = mincLmer(abs_jac ~ poly(age,1) + (1|mouse_id), mask=mask, REML=FALSE)
Next, you will estimate your degrees of freedom, and perform a False Discovery Rate (FDR) correction on your model:
fdr_mod_lin = mincFDR(mincLmerEstimateDF(mod_lin))
Next, you will run your more complex model. In this case, it I am fitting a quadratic age term followed by the degrees of freedom estimation and FDR correction:
mod_quad = mincLmer(abs_jac ~ poly(age,2) + (1|mouse_id), mask=mask, REML=FALSE)
fdr_mod_quad = mincFDR(mincLmerEstimateDF(mod_quad))
Now you are ready to compare models. We will use the log likelihood ratio test. The simpler model goes first, the more complex model goes second:
linvsquad = mincFDR(mincLogLikRatio(mod_lin,mod_quad))
Here is an example result from this comparison. In this case, the quadratic model is significantly better at explaining the variance in my data at q=0.01:
Degrees of Freedom: 3
FDR Thresholds:
mod_quad
0.01 11.955936
0.05 8.221689
0.1 6.567381
0.15 5.577953
0.2 4.862967
Next, you have to visualize where in the brain your new model best fits your data. To do so, you will output your model comparison results as a minc file:
mincWriteVolume(buffer = linvsquad, column = "mod_quad", output.filename = "model_comp_lin_vs_quad.mnc", like.filename = "second_level-nlin-3.mnc")
Finally, you will use Display to overlay your model_comp_lin_vs_quad.mnc file over your nlin-3.mnc average.
Here I have set my q-value threshold to 0.05. Areas in white/color are areas in which the more complex model (i.e. quadratic) better fits the data.