PermutationSchema: terBraak #4

behinger · 2021-02-10T07:35:08Z

We want to implement terBraak permutation for MixedModels.jl

Problems: How to deal with ranef-variance = 0

First method: using CGR

We propose first a method inspired by the CGR-boostrap and the ter-braak permutation method.

The CGR bootstrap (doi: 10.1111/1467-9876.00415) also includes a transformation to deal with correlation in the random effects (eg: correlation between random slope and random intercept).
We will describe the 2 methods separately. However the method dealing with correlated random effect is a generalisation of the independant case.

For the independant case:

For the model:
y = D*eta + X*beta + Z*gamma + epsilon

where Z*gamma is assumed to be composed of independant random effects: Z*gamma = Sum_i Z_i*gamma_i.

In the lme4 formula, in correpond for instance to:

+ (1|subj) + (1|item) + (1|subj:fact) + (cov+0|item)

where fact is a within subject factor (eg: type of images) and cov is a within-item covariate (eg: age of participant)

each random effect is associated to ONE variance parameter (sigma_i)

We compute the estimated model:
y = D*e + X*b + Z*g + r

with variance paramters estimated s_i.

We compute the k_i factors by matching the empirical covariance to the estimated covariance. s_i = k_i * sd(g_i)
We compute resamples (under H0, i.e. without X*b) by using signflipping matrices P_i (diagonal with +/- 1)

y* = D*e + Sum_i k_i*Z_i*P_i*g_i + P_0*r

For the model with covariance between random effects:

The equation of the model is the same but we allow correlation between random effects:

y = D*eta + X*beta + Z*gamma + epsilon

In the lme4 formula, in correpond for instance to:

+(1+fact|subj) + (1+cov|item)

Here with have only 2 independant random effects. Z*gamma = Sum_i Z_i*gamma_i, for i = 1,2 corresponding to subjects and items.

gamma_i is a vector associated to a covariance matrix Sigma_i (is formally Id (x) Sigma_i, where (x) is Kronecker product)

We compute the estimated model:
y = D*e + X*b + Z*g + r

with covariance matrices estimated S_i.

We transform the random effects to match the estimated covariance matrices.

(These steps have already been written in the RGL licensed script, line 71: https://github.com/aloy/lmeresampler/blob/master/R/resamplers.R)

We can compute empirical covariances with the rearranged g_i's. Here we rearrange the g_i's to have in rows each subject/item and
in column each type of random effect (intercept/slopes). With these rearranged data we can compute the empirical covariance matrices

Ls_i*Ls_i' = (g_i' *g_i)'/N_i

where N_i corresponds to the number of subjects/items. Ls_i*Ls_i' is the empirical covariance matrix of the random effects and Ls_i its Cholesky decomposition.
In the example for subject (g_i' *g_i)'/N_i is a square matrix with size (# levels of fact) and for the item 2x2.

Then we perform the Cholesky decompostion of the estimated covariance matrices:

Lr_i*Lr_i' = S_i

We transform g_i into:

u_i = Lr_i*Ls_i^{-1}*g_i

Finally, we compute resamples:

y* = D*e + Sum_i Z_i*P_i*u_i + P_0*r

Note that the P_i matrices should signflip the whole subject/item together (in contrast to the independant case and also to the P_0 matrix that signflips residuals.

Finally for BOTH cases we can re-estimate the original model using:
y* as dependant variable,
D AND X in the fixed part of the design matrix and
the same correlation structure.

Second method: including OLS

The second approach would be to include OLS to take care of the problem of sigma_i estimated to 0.

Fiting using mixed model the model: y = D*eta + X*beta + Z*gamma + epsilon
Refit using OLS:

r_f ~ Z

where r_f = y - D*e - X*b.

Then we would signflipping estimated parameters in the OLS model.

However, the Z matrix is more complicated to construct when we include interactions with factors in the random effects (eg: +(1+fact|subj)). Indeed, we need to remove some dimensions, which correspond to product with contrasts matrices.

We need to discuss with you the problem more in details.

behinger · 2021-02-12T23:41:05Z

Very cool write-up. I could follow it nicely. Thanks Jaromil!

maybe this is a non-problem due to my lacking math skills ;) I also couldnt follow the whole intuition of the GCR method. I will ask on monday. The principle is clear though.
so my potential naive question: if variances are 0, we could run into issues using cholesky on the covariance matrix for GCR. Correct? If so, we could run the rePCA before it, resample the rePCA and push it back into original space. Would that work? That should circumvent the variance=0 problem. Not sure how "fair" it is though, I still have issues with variance=0 in Julia (I have the feeling the solutions could be numerically instable, potentially pushing around the 0 to other predictors).

@palday the R package linked above is GPL licensed, is that good enough for you to get inspired from it? @jaromilfrossard what is RGL licensed?

lastly: The issue with LSQ is that Z can be overdetermined, @palday, is that what Reinhold uses in his "strange" models? I forgot the syntax. The thing Dave implemented for him

PS: as always, sorry if I ask naive things - but if I dont ask them now I will be way behind you guys ;)

Edit: Thought about it more and the rePCA probably doesnt solve the problem, because obviously if we back-calculate the rePCA, the resulting random effect predictions are still 0.

This was referenced Feb 10, 2021

PermutationSchema: completly-pooled LeastSquares #3

Closed

Implementation Overview #6

Open

This was referenced Feb 15, 2021

Types and CGR nonparametric bootstrap #9

Merged

Permutation #12

Merged

palday closed this as completed in #12 Feb 16, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PermutationSchema: terBraak #4

PermutationSchema: terBraak #4

behinger commented Feb 10, 2021 •

edited

Loading

jaromilfrossard commented Feb 12, 2021

behinger commented Feb 12, 2021 •

edited

Loading

PermutationSchema: terBraak #4

PermutationSchema: terBraak #4

Comments

behinger commented Feb 10, 2021 • edited Loading

jaromilfrossard commented Feb 12, 2021

First method: using CGR

For the independant case:

For the model with covariance between random effects:

Second method: including OLS

behinger commented Feb 12, 2021 • edited Loading

behinger commented Feb 10, 2021 •

edited

Loading

behinger commented Feb 12, 2021 •

edited

Loading