add fixed (definition variable) covariates to umxACE #21
So to do this across all twin models, we need a method that can handle ordinal and continuous data, still handle models with no covariates, and works at the level of
So... need to add average effects matrices and beta matrices to
Binary variables can be included. The trick is that the means formula contains only the regressions on the covariates, with no grand mean/intercept parameter. In other cases, the mean can be a free parameter (assuming that one uses the Mehta et al trick of fixing two adjacent thresholds). Yes this would be very nice to have!
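A minimal sketch of the means formula described above, in base R. The helper `expected_means` and all numeric values are hypothetical and only illustrate the algebra; this is not how umx builds its means matrices internally. For a binary variable the intercept is held at 0, so the expected mean is purely the covariate regression; for a continuous variable a free intercept is added.

```r
# Hypothetical helper (not part of umx): row-wise expected mean
# mu_i = intercept + sum_k beta_k * covariate_ik
expected_means <- function(covs, betas, intercept = 0) {
  as.vector(intercept + as.matrix(covs) %*% betas)
}

# Made-up definition variables for three people
covs  <- data.frame(age = c(20, 35, 50), cohort = c(0, 1, 1))
betas <- c(age = 0.02, cohort = -0.30)  # illustrative values only

# Binary outcome: no grand mean, only the regressions on the covariates
mu_binary <- expected_means(covs, betas)

# Continuous outcome: same regressions plus a free intercept (160 is made up)
mu_cont <- expected_means(covs, betas, intercept = 160)
```

The two vectors differ only by the intercept, which is the single extra parameter the continuous case estimates.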
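So...
library(umx)    # assumed setup: umx supplies umxACE and the twinData example data
data(twinData)
# Make twin-specific copies of a covariate (here the existing "part" column)
twinData$cohort1 = twinData$cohort2 = twinData$part
mzData = twinData[twinData$zygosity %in% "MZFF", ]
dzData = twinData[twinData$zygosity %in% "DZFF", ]
m2 = umxACE(selDVs = "ht", selCovs = c("age", "cohort"), sep = "", dzData = dzData, mzData = mzData)
umxSummaryACE(m2, digits = 3)
ACE -2 × log(Likelihood) = 5944.831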
Means: Intercept and (raw) betas from model$top$intercept and model$top$meansBetas
Interesting downside: having def vars in a model increases model run time 20-fold... 4sec ACE -> 90s with def.covariates in the means model. But: All working, and now
Great that selCovs is working more broadly! It is unsurprising that using definition variables slows things down. Remember that with FIML, each row of the data has its own set of path coefficients (some may be the same across different rows, others may differ on an individual basis). So computationally, the expected covariance matrix has to be rebuilt and inverted for each data row. OpenMx has some economies in doing this, looking at whether the definition variables or the pattern of observed variables differs from the previous row, and not bothering to reconstruct or invert if the result is already known. So the slowdown largely depends on the number of unique covariance matrices the algorithm has to invert.
So it seems that using covariates with ordinal variables analyzed by FIML is good to go. Of course, there's a limit to the number of variables that can reasonably be jointly analyzed as ordinal, due to the curse of dimensionality. I'd probably not go further than about a dozen total.
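As a rough way to anticipate that cost, one could count the distinct definition-variable patterns in the data before fitting. The sketch below uses base R only; `n_patterns` is a hypothetical helper and the data are made up. It is only a proxy: it ignores differing patterns of missingness in the observed variables, and it does not reproduce OpenMx's actual row-comparison logic.

```r
# Hypothetical helper: number of distinct rows of definition variables
n_patterns <- function(defvars) nrow(unique(defvars))

n <- 1000

# Coarse covariates: age in whole years (10 values) and a binary cohort
covs_coarse <- data.frame(
  age    = rep(20:29, length.out = n),
  cohort = rep(0:1, each = n / 2)
)

# Fine-grained covariates: every person has a distinct age
covs_fine <- data.frame(
  age    = seq(20, 30, length.out = n),
  cohort = rep(0:1, each = n / 2)
)

coarse_count <- n_patterns(covs_coarse)  # few patterns: more reuse possible
fine_count   <- n_patterns(covs_fine)    # one pattern per row: little reuse
```

Under this reading, rounding or binning a covariate such as age may reduce run time, at the price of a coarser covariate.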
Yes, multiple covariates are working for most models and for binary, ordinal, and continuous variables, and for mixtures of these.
The speed comment was more a prompt to consider implementing a regression-based method under the hood for the all-continuous case, or at least to note to the user that umx_residualize will be many times faster.
Yep. I note that it would be possible to residualize the continuous variables and only apply the definition approach to the ordinal ones, via a residualizeContinuousVars=TRUE or some such argument. In practice this would make the modeling steps faster because there would be fewer parameters to optimize. It would not permit testing of whether different variables' regressions on covariates are equal, although I don't think I've seen such usage. In factor analysis a Rasch model essentially equates factor loadings, but it's not the situation here.
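A minimal base-R sketch of that idea, assuming long-format data with one column per variable (real twin data would have per-twin columns such as ht1 and ht2). The function `residualize_continuous` is hypothetical; it is not the proposed umx argument and does not call umx_residualize.

```r
# Hypothetical helper: replace numeric variables by their residuals after
# regressing on the covariates; factors (binary/ordinal) are left untouched
residualize_continuous <- function(data, vars, covs) {
  for (v in vars) {
    if (is.numeric(data[[v]])) {
      fit <- lm(reformulate(covs, response = v), data = data,
                na.action = na.exclude)  # keeps NA rows aligned with the data
      data[[v]] <- as.numeric(residuals(fit))
    }
  }
  data
}

# Made-up example data: one continuous and one ordinal (binary) outcome
set.seed(42)
d <- data.frame(age = rnorm(200, mean = 40, sd = 10))
d$ht <- 160 + 0.1 * d$age + rnorm(200)
d$dx <- factor(d$age + rnorm(200, sd = 10) > 40,
               levels = c(FALSE, TRUE), ordered = TRUE)

out <- residualize_continuous(d, vars = c("ht", "dx"), covs = "age")
```

After this step only dx would still need age as a definition variable in the means model, while ht enters the model already adjusted.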
Yeah: will do that - not always a win, but for the “lots of ord and lots of cont” case it would be a dealmaker. Good suggestion!
Great. Situations with many continuous and only a few ordinal variables would see the greatest performance improvements. Neuroimaging & diagnostic outcome analyses are good examples of the need.
Currently, users wanting to use covariates are encouraged to use umx_residualize on their data. This doesn't work for ordinal variables (it's not good to turn sex from a binary into a continuous-bimodal distribution...), and it's also nice to have the means in the model and to retain the raw data.
v 2.0 of umx should support covariates, report how many rows were lost, have the means and covariate betas printed in summary,
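The problem with residualizing a binary variable can be seen in a small simulation. This is an illustrative base-R sketch with made-up data, not output from umx: regressing a 0/1 variable on age and keeping the residuals yields a continuous variable with two clusters instead of two categories.

```r
# Made-up data: a binary variable whose probability depends on age
set.seed(7)
n   <- 500
age <- rnorm(n, mean = 40, sd = 10)
sex <- rbinom(n, size = 1, prob = plogis((age - 40) / 10))

# Residualize the binary variable on age with an ordinary linear regression
sex_resid <- residuals(lm(sex ~ age))

n_levels_raw   <- length(unique(sex))                  # categories before
n_levels_resid <- length(unique(round(sex_resid, 8)))  # distinct values after
```

A histogram of sex_resid shows the two clusters, and the variable can no longer be treated as ordinal with a threshold model.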