Fast estimation of Gaussian Mixture Copula Models
The GMCM package (Bilgrau et al., 2016) offers fast R functions for high-dimensional meta-analysis (Li et al., 2011) and general unsupervised cluster analysis (Tewari et al., 2011) using Gaussian mixture copula models. Online documentation is available here.
Gaussian mixture copula models (GMCMs) are a flexible alternative to regular Gaussian mixture models (GMMs) for unsupervised cluster analysis of continuous data in which non-normal clusters are present. GMCMs model the ranks of the observed data and are thus invariant to monotone increasing transformations of the data; that is, they are semi-parametric and only the ordering of the data matters, which provides the needed flexibility. A special case of GMCMs can be used for a novel meta-analysis approach in high-dimensional settings. In this context, the model tries to cluster results into two groups: those that agree and those that do not agree on statistical evidence. These two groups correspond to a reproducible and an irreproducible group.
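The rank invariance described above can be checked with a few lines of base R. This is a minimal sketch and not the package's own code: `scaled_ranks` is a hypothetical helper introduced here for illustration, and the toy data are arbitrary.

```r
# Toy data: two skewed (non-normal), strictly positive variables.
set.seed(1)
x <- cbind(rexp(10), rlnorm(10))

# Hypothetical helper: rank each column and scale the ranks into (0, 1).
# Dividing by n + 1 keeps the values away from the boundaries 0 and 1.
scaled_ranks <- function(x) apply(x, 2, rank) / (nrow(x) + 1)

u_raw <- scaled_ranks(x)
# log() is strictly increasing on positive values, so the ordering is kept.
u_transformed <- scaled_ranks(log(x))

# The scaled ranks agree before and after the transformation.
print(all.equal(u_raw, u_transformed))
```

Because any model fitted to the scaled ranks sees the same input in both cases, its result cannot depend on the marginal scale of the data.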
However, optimizing the complicated likelihood function is difficult. GMCM uses Rcpp and RcppArmadillo to evaluate the likelihood function quickly and arrives at a parameter estimate using either standard numerical optimization routines or a pseudo EM algorithm.
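To give a feel for what fitting by a standard numerical optimization routine looks like, the sketch below maximizes the likelihood of a toy one-dimensional, two-component Gaussian mixture with base R's `optim`. It is only an illustration under stated assumptions: the toy model is an ordinary mixture and not the GMCM likelihood, `neg_loglik` is a hypothetical helper, and the starting values are arbitrary.

```r
set.seed(42)
# Toy data from a two-component 1D Gaussian mixture (not a GMCM).
x <- c(rnorm(150, mean = 0, sd = 1), rnorm(50, mean = 4, sd = 1))

# Negative log-likelihood. Parameters are kept unconstrained by using
# a logit scale for the proportion and a log scale for the sds.
neg_loglik <- function(par, x) {
  p  <- plogis(par[1])   # mixture proportion in (0, 1)
  mu <- par[2:3]         # component means
  s  <- exp(par[4:5])    # component standard deviations (> 0)
  dens <- p * dnorm(x, mu[1], s[1]) + (1 - p) * dnorm(x, mu[2], s[2])
  -sum(log(dens))
}

start <- c(0, -1, 3, 0, 0)  # arbitrary starting values
fit <- optim(start, neg_loglik, x = x, method = "Nelder-Mead",
             control = list(maxit = 2000))

# Map the estimates back to their natural scales.
est <- c(prop = plogis(fit$par[1]), mu = fit$par[2:3], sd = exp(fit$par[4:5]))
print(round(est, 2))
```

Each iteration evaluates the likelihood many times, which is why a fast likelihood evaluation matters for this kind of routine.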
Additional information, documentation, help, and examples can be found here or by running
?GMCM in R.
The paper is also available as a vignette via
vignette("GMCM-JStatSoft") or on the official website.
The core user functions of GMCM are
The released and tested version of GMCM is available on CRAN (Comprehensive R Archive Network). It can be installed from within R by running
install.packages("GMCM")
If you wish to install the latest version of GMCM directly from the master branch at GitHub, run
#install.packages("remotes")  # Install remotes if needed
remotes::install_github("AEBilgrau/GMCM")
Note that this version is in development and likely differs from the version on CRAN. As such, it may be unstable. Make sure that you have the package development prerequisites if you wish to install the package from source.
When installed, run
GMCM::runGMCM() to launch a local instance of the GMCM Shiny application, which is also available online at shinyapps.io.
news(package = "GMCM") to view the latest changes to GMCM, or visit here.
As noted above, GMCM has two applications: one general and one special.
An example of using the package to fit the special GMCM for meta-analysis is described in
vignette("usage-example-special-model"). This model is a special case of the general GMCM.
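As a rough orientation before opening the vignette, a meta-analysis fit might look like the sketch below. It rests on assumptions that should be checked against the package help pages: that `SimulateGMCMData`, `Uhat`, `fit.meta.GMCM`, and `get.IDR` take the arguments shown, that the parameter vector is ordered as mixture proportion, mean, standard deviation, and correlation, and that the `Khat` element holds the estimated group labels. The parameter values are arbitrary.

```r
# Run only if GMCM is installed, so the sketch never fails on load.
if (requireNamespace("GMCM", quietly = TRUE)) {
  library(GMCM)
  set.seed(1)

  # Assumed order: mixture proportion, mean, sd, correlation.
  true_par <- c(0.9, 2, 0.7, 0.6)
  sim <- SimulateGMCMData(n = 1000, par = true_par)

  u_hat <- Uhat(sim$u)             # scaled ranks of the simulated data
  init_par <- c(0.5, 1, 0.5, 0.9)  # arbitrary starting values

  # Fit the special model with Nelder-Mead.
  est_par <- fit.meta.GMCM(u_hat, init.par = init_par, method = "NM",
                           verbose = FALSE)
  print(est_par)

  # Classify observations into the two groups at a 0.05 threshold.
  idr <- get.IDR(sim$u, est_par, threshold = 0.05)
  print(table(idr$Khat))  # Khat is assumed to hold the group labels
}
```

The vignette remains the authoritative walkthrough; this sketch only shows the overall order of steps: simulate or load data, rank it, fit, and classify.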
An example of unsupervised clustering with the general model is found in
vignette("usage-example-general-model").
The package also provides a graphical user interface via Shiny for both its uses. See
- Anders Ellern Bilgrau, Poul Svante Eriksen, Jakob Gulddahl Rasmussen, Hans Erik Johnsen, Karen Dybkaer, Martin Boegsted (2016). GMCM: Unsupervised Clustering and Meta-Analysis Using Gaussian Mixture Copula Models. Journal of Statistical Software, 70(2), 1-23. doi:10.18637/jss.v070.i02