
Inference of multiscale graphical models with neighborhood selection approach

desanou/mglasso

Method

This repository provides an implementation of the MGLasso (Multiscale Graphical Lasso) algorithm: an approach for estimating sparse Gaussian Graphical Models with the addition of a group-fused Lasso penalty.

MGLasso is described in the paper Inference of Multiscale Gaussian Graphical Model. Its main contributions are:

  • We simultaneously infer a network and estimate a clustering structure by combining the neighborhood selection approach (Meinshausen and Bühlmann, 2006) with convex clustering (Hocking et al., 2011).

  • We use continuation with Nesterov smoothing in a shrinkage-thresholding algorithm (CONESTA; Hadj-Selem et al., 2018) to solve the optimization problem.

To solve the MGLasso problem, we seek the regression vectors \beta^i that minimize

J_{\lambda_1, \lambda_2}(\boldsymbol{\beta}; \mathbf{X} ) =  \frac{1}{2}  \sum_{i=1}^p  \left \lVert  \mathbf{X}^i - \mathbf{X}^{\setminus i} \boldsymbol{\beta}^i  \right \rVert_2 ^2 +  \lambda_1  \sum_{i = 1}^p  \left \lVert  \boldsymbol{\beta}^i \right \rVert_1 +  \lambda_2  \sum_{i < j}  \left \lVert  \boldsymbol{\beta}^i - \tau_{ij}(\boldsymbol{\beta}^j)  \right \rVert_2.
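
As an illustration only, here is a minimal R sketch of this objective (not the package's solver), assuming each \beta^i is stored as column i of a p x p matrix Beta with a zero diagonal, and that \tau_{ij} swaps the i-th and j-th entries of \beta^j so the two vectors become comparable.

# Sketch of the MGLasso objective (illustrative only, not the package's internal code).
# Beta: p x p matrix with zero diagonal; column i holds beta^i indexed by all variables.
mglasso_objective <- function(X, Beta, lambda1, lambda2) {
  p <- ncol(X)
  fit <- sum(sapply(1:p, function(i)
    0.5 * sum((X[, i] - X[, -i, drop = FALSE] %*% Beta[-i, i])^2)))
  l1 <- lambda1 * sum(abs(Beta))
  fuse <- 0
  for (i in 1:(p - 1)) for (j in (i + 1):p) {
    bj <- Beta[, j]
    bj[c(i, j)] <- bj[c(j, i)]  # tau_ij: swap entries i and j of beta^j
    fuse <- fuse + sqrt(sum((Beta[, i] - bj)^2))
  }
  fit + l1 + lambda2 * fuse
}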

The MGLasso package relies on the Python implementation of the CONESTA solver available in the pylearn-parsimony library.

Package requirements and installation

  • Install the reticulate package and Miniconda if no conda distribution is available on the system.
install.packages('reticulate')
reticulate::install_miniconda()
  • Install MGLasso and its Python dependencies, and configure the conda environment rmglasso.
# install.packages('mglasso')
remotes::install_github("desanou/mglasso")
library(mglasso)
install_pylearn_parsimony(envname = "rmglasso", method = "conda")
#> + "C:/Users/doedm/AppData/Local/r-miniconda/condabin/conda.bat" "install" "--yes" "--name" "rmglasso" "python=3.8"
reticulate::use_condaenv("rmglasso", required = TRUE)
reticulate::py_config()
#> python:         C:/Users/doedm/AppData/Local/r-miniconda/envs/rmglasso/python.exe
#> libpython:      C:/Users/doedm/AppData/Local/r-miniconda/envs/rmglasso/python38.dll
#> pythonhome:     C:/Users/doedm/AppData/Local/r-miniconda/envs/rmglasso
#> version:        3.8.13 (default, Mar 28 2022, 06:59:08) [MSC v.1916 64 bit (AMD64)]
#> Architecture:   64bit
#> numpy:          C:/Users/doedm/AppData/Local/r-miniconda/envs/rmglasso/Lib/site-packages/numpy
#> numpy_version:  1.22.4
#> conesta_solver: [NOT FOUND]
#> 
#> NOTE: Python version was forced by use_python function

The conesta_solver module is delay-loaded. See reticulate::import_from_path for details.
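
As an optional sanity check (not required by the package), you can verify that the Python backend is visible from the rmglasso environment; the module name parsimony below is assumed to be the import name of the pylearn-parsimony library.

# Optional check that the pylearn-parsimony backend can be found (illustrative).
reticulate::py_module_available("parsimony")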

An example of use is given below.

Illustration on a simple model

Simulate a block diagonal model

We simulate a 3-block diagonal model where each block contains 3 variables. The intra-block correlation level is set to 0.85, while correlations outside the blocks are set to 0.

library(Matrix)
n = 50
K = 3
p = 9
rho = 0.85
blocs <- list()

for (j in 1:K) {
  bloc <- matrix(rho, nrow = p/K, ncol = p/K)
  for(i in 1:(p/K)) { bloc[i,i] <- 1 }
  blocs[[j]] <- bloc
}

mat.correlation <- Matrix::bdiag(blocs)
corrplot::corrplot(as.matrix(mat.correlation), method = "color", tl.col="black")

Simulate Gaussian data from the covariance matrix

set.seed(11)
X <- mvtnorm::rmvnorm(n, mean = rep(0,p), sigma = as.matrix(mat.correlation))
colnames(X) <- LETTERS[1:9]
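
Before fitting, one can check that the empirical correlations reproduce the block pattern; this uses base R only and is not part of the package workflow.

# Empirical correlations of the first block (variables A, B, C) should be close to 0.85.
round(cor(X)[1:3, 1:3], 2)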

Run mglasso()

We set the sparsity level \lambda_1 to 0.2 and rescale it by the sample size.

X <- scale(X)    
res <- mglasso(X, lambda1 = 0.2*n, lambda2_start = 0.1, fuse_thresh = 1e-3, verbose = FALSE)
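
To get a quick overview of the fitted object before plotting, you can inspect its structure with generic R tooling; the exact field names depend on the installed version of the package.

# Quick look at the structure of the mglasso output.
names(res)
str(res, max.level = 2)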

To launch a single run of the solver on the objective function, call the conesta function.

temp <- mglasso::conesta(X, lam1 = 0.2*n, lam2 = 0.1)
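
The single-run output can be inspected in the same way; it is expected to hold the estimated regression vectors for this (\lambda_1, \lambda_2) pair, but check ?conesta for the exact return value.

# Inspect the single-run estimate; see ?conesta for the exact return value.
str(temp)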

Estimated clustering path

We plot the clustering path of the mglasso method on the first two principal component axes of X. The path is drawn on the fitted (predicted) values of X.

library(ggplot2)
library(ggrepel)
mglasso:::plot_clusterpath(as.matrix(X), res)

Estimated adjacency matrices along the clustering path

As the fusion penalty increases from level9 to level1, we observe a progressive fusion of adjacent edges.

plot_mglasso(res)
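
If you need the estimated coefficients at a given level rather than the plot, they can be pulled out of the result object. The field names used below (out, level9, beta) are assumptions based on the level labels above; check str(res) for the exact structure in your installed version.

# Sketch: extract the regression coefficients at one level and build a symmetric
# adjacency matrix with the AND rule (an edge is kept only if both coefficients
# are non-zero). Field names are assumed; verify them with str(res).
beta_hat <- res$out$level9$beta
adj <- (abs(beta_hat) > 1e-8) & t(abs(beta_hat) > 1e-8)
diag(adj) <- FALSE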

Reference

Sanou, Edmond; Ambroise, Christophe; Robin, Geneviève (2022). Inference of Multiscale Gaussian Graphical Model. arXiv preprint. https://doi.org/10.48550/arXiv.2202.05775
