Biomarker networks measured by different data modalities (e.g., structural magnetic resonance imaging (sMRI) and diffusion tensor imaging (DTI)) may share the same true underlying biological model. In this work, we propose a node-wise biomarker graphical model that leverages the mechanism shared across multi-modality data to provide a more reliable estimate of the target modality network, while accounting for heterogeneity in networks due to differences between subjects and to the networks of the external modality. Latent variables are introduced to represent the shared, unobserved biological network, and information from the external modality is incorporated to model the distribution of this underlying network. An approximation approach is used to calculate the posterior expectations of the latent variables, which reduces computation time.
-
Title: Integrative Network Learning for Multi-modality Biomarker Data
-
Authors: Shanghong Xie (a) (sx2168@columbia.edu), Donglin Zeng (b), and Yuanjia Wang (a)
-
Affiliations:
-
- (a) Department of Biostatistics, Mailman School of Public Health, Columbia University, New York
-
- (b) Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina
-
-
Manuscript: Xie S, Zeng D and Wang Y (2021). Integrative Network Learning for Multi-modality Biomarker Data. Annals of Applied Statistics 15(1), 64-87.
- R
- Install Rcpp and RcppEigen packages
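Whether both packages are already available can be checked before sourcing any files. The following is a minimal sketch rather than part of the repository code; the variable names `required`, `is_installed`, and `missing_pkgs` are illustrative only.

```r
# Packages named in the requirements list above
required <- c("Rcpp", "RcppEigen")

# TRUE/FALSE for each package, without attaching it
is_installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!is_installed]

if (length(missing_pkgs) > 0) {
  # Report what is missing; install with install.packages(missing_pkgs)
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("Rcpp and RcppEigen are available.")
}
```

Compiling the .cpp files with sourceCpp also needs a working C++ toolchain (for example Rtools on Windows).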
The code for the proposed methodology is in the Code folder. Download all files in that folder to run the method.
To run the proposed method with approximated posterior expectations of the latent variables, source the following files:
library(Rcpp)  # provides sourceCpp()
sourceCpp('~/INLApproxC.cpp')
source('~/INLApproxRcode.R')
source('~/INLApproxHardThrRcode.R')
To run the proposed method with exact calculation of the posterior expectations of the latent variables, source the following files:
library(Rcpp)  # provides sourceCpp()
sourceCpp('~/INLDirectC.cpp')
source('~/INLDirectRcode.R')
source('~/INLDirectHardThrRcode.R')
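The paths above assume the files sit in the home directory. If they were downloaded elsewhere, the same three files can be loaded from any folder. The helpers below are a hypothetical convenience sketch (`inl_files` and `load_inl` are not part of the repository); they only rebuild the file names listed above and check that the files exist before sourcing them.

```r
# Build the three file paths for one variant ("Approx" or "Direct").
inl_files <- function(code_dir, method = c("Approx", "Direct")) {
  method <- match.arg(method)
  file.path(code_dir,
            paste0("INL", method, c("C.cpp", "Rcode.R", "HardThrRcode.R")))
}

# Compile the C++ file, then source the two R files, in that order.
load_inl <- function(code_dir, method = c("Approx", "Direct")) {
  files <- inl_files(code_dir, method)
  not_found <- files[!file.exists(files)]
  if (length(not_found) > 0) {
    stop("Files not found: ", paste(not_found, collapse = ", "))
  }
  Rcpp::sourceCpp(files[1])
  source(files[2])
  source(files[3])
  invisible(files)
}
```

For example, `load_inl("path/to/Code", "Direct")` would load the exact-calculation variant, where `path/to/Code` stands for the local download location.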
-
INLApproxRcode.R: estimates the network without pruning; the posterior expectations of the latent variables are calculated with the approximation approach. It requires sourceCpp('~/INLApproxC.cpp').
-
INLApproxHardThrRcode.R: applies hard thresholding, based on EBIC, to the network estimated by INLApproxRcode.R.
-
INLDirectRcode.R: estimates the network without pruning; the posterior expectations of the latent variables are calculated directly (exact calculation). It requires sourceCpp('~/INLDirectC.cpp').
-
INLDirectHardThrRcode.R: applies hard thresholding, based on EBIC, to the network estimated by INLDirectRcode.R.
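As a generic illustration of what hard thresholding does to an estimated network, the sketch below zeroes out every edge weight whose absolute value falls below a cutoff. It is not the repository's implementation: the function `hard_threshold`, the matrix `W`, and the cutoff `tau` are hypothetical, and the EBIC-based choice of cutoff made by the scripts above is not reproduced here.

```r
# Set entries with absolute value below tau to zero (hard thresholding).
hard_threshold <- function(W, tau) {
  W_thr <- W
  W_thr[abs(W_thr) < tau] <- 0
  W_thr
}

# Toy symmetric 3 x 3 edge-weight matrix (made-up values)
W <- matrix(c(0,    0.8,  0.05,
              0.8,  0,   -0.3,
              0.05, -0.3, 0), nrow = 3)

W_pruned <- hard_threshold(W, tau = 0.1)
W_pruned
```

In this toy example the weak edge between nodes 1 and 3 (weight 0.05) is removed, while the two stronger edges are kept unchanged.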
Viegnette.R provides an example of running the methods on simulated data.