Translation from MERF random effects nomenclature to align with other implementations #14

dstanner · 2018-09-05T23:56:36Z

First, thank you for creating this package! It is kind of exactly what I am looking for. However, I have some questions that stem from my prior experience with mixed models on other platforms (notably the lme4 package for R), and the documentation and examples aren't helping me translate my knowledge of how random effects are specified/named in MERF.

For example, in lme4, random effects are designated as having slopes and intercepts (which I think correspond to "clusters" and "covariates" in MERF? With 1s in the covariates matrix indicating the intercepts?).

So if "subject" (in an experiment) or "county" (as in the radon example) are grouping variables over which there are multiple observations, one could specify a random intercept for subject (or county). Then, one could additionally specify a random slope for some variable by the grouping variable (for example, a random slope for experimental condition by subject, to allow the model to estimate the variance of how much subjects differ in their response to the experimental manipulation, or a random slope for floor by county, allowing counties to differ in how much floor affects radon levels in the model). I'm not quite sure if I'm translating these to the MERF nomenclature correctly.

Moreover, lme4 allows the researcher to specify multiple, crossed random effects (e.g., random intercepts for both experimental subjects and experimental items, as well as random slopes for variables by both subject and item).

Getting to the point:

My looking through the examples leads me to think that the clusters argument is the column containing the IDs for which random intercepts are generated: Is this the case?
I get the impression that the Z matrix includes 1s for the random intercepts, but can include a second column for random slopes (i.e., the covariates): Is this the case? Can there be more columns?

More generally, some comments in the notebooks or documentation about what these variables are (concretely, and what form they must/can/cannot take, and possibly relationships to how random effects are specified in other mixed modeling packages) would be very helpful.

Can MERF handle crossed random effects structures?
Finally, does MERF provide variable importance measures from the fit forest (analogous to those produced in sklearn, and from randomForest and party::cforest in R)? I couldn't find mention of that in the readme or the notebooks.

Thanks!

resdntalien · 2018-10-01T22:28:11Z

@dstanner Sorry for the super late response to this. You bring up very good points. Some comments covering some (maybe not all) of your points:

This adheres to the sklearn model interface as much as possible. The goal was to make it easily usable in the Python community.
I have implemented this so that Z (the random effects features) can have multiple dimensions. In my notebook examples I've only ever made Z a vector of all 1's -- which effectively means we're only allowing random intercepts. You can add random slopes as well by adding another feature variable to Z, e.g. the floor in the Minnesota radon example. You can put in crossed variables as well -- whatever you put into the Z feature matrix will be modeled as a random effect. Usually these variables are also in the model as fixed effects.
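A minimal sketch of the two Z layouts described above, using NumPy only. The helper name `build_random_effects_design` and the toy `county` / `floor` values are made up for illustration and are not part of MERF. Roughly, the intercept-only Z plays the role of `(1 | county)` in lme4, and the two-column Z plays the role of `(1 + floor | county)`.

```python
import numpy as np


def build_random_effects_design(slope_columns, n_rows):
    """Build Z: a column of 1s (random intercept) plus one column per random slope.

    Hypothetical helper for illustration; not part of the merf package.
    """
    columns = [np.ones(n_rows)]  # the all-1s column gives each cluster its own intercept
    for col in slope_columns:
        col = np.asarray(col, dtype=float)
        if col.shape != (n_rows,):
            raise ValueError("each slope column must be 1-D with n_rows entries")
        columns.append(col)  # each extra column gets a per-cluster random slope
    return np.column_stack(columns)


# Toy data (invented values): five observations from two counties.
county = np.array(["A", "A", "B", "B", "B"])  # cluster IDs, i.e. the grouping variable
floor = np.array([0, 1, 0, 0, 1])             # covariate that may get a random slope

# Random intercept per county only.
Z_intercept_only = build_random_effects_design([], len(county))

# Random intercept plus a random slope for floor per county.
Z_intercept_and_slope = build_random_effects_design([floor], len(county))

# Assumption: these would then be passed to the model together with the cluster
# IDs, along the lines of model.fit(X, Z, clusters, y). Check the signature and
# the expected input types in your installed version of merf.
```

More slope columns can be appended the same way; each one adds a dimension to the per-cluster random effect.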

And I like your comments. I will try to make cleaner notebooks in the upcoming months. Of course, I am going to now put this back on you -- if you want to take a swag at updating some of the notebooks with the clearer nomenclature, please by all means do so and submit a PR. I would really appreciate that.

eacton · 2019-05-28T16:07:35Z

Hi! In a similar vein, I was wondering how to extract information about feature importances, which can be easily accessed with other random forests in Python.

resdntalien added the documentation label Mar 28, 2019

resdntalien closed this as completed May 19, 2020
