This package computes Multiple Co-Inertia Analysis (MCIA) on multi-block data using the Nonlinear Iterative Partial Least Squares (NIPALS) method.
Features include:
- Efficient computation of deflation and variance enabling embedding of high-volume (e.g. single-cell) datasets.
- Functionality to perform out-of-sample embedding.
- Easy-to-adjust options for deflation and pre-processing.
- Multiple visualization and analysis options for sample- and feature-level embedding results.
- Streamlined, well-documented, and supported code that is consistent with published theory, enabling more efficient algorithm development and extension.
Citation
For more information on the methodology used in nipalsMCIA, and to cite the package, please see:
- Mattessich et al. (2024) nipalsMCIA: Flexible Multi-Block Dimensionality Reduction in R via Non-linear Iterative Partial Least Squares, bioRxiv 2024.06.07.597819 (https://doi.org/10.1101/2024.06.07.597819).
This package can be installed via Bioconductor:
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("nipalsMCIA")
You can install the development version of nipalsMCIA from GitHub with:
# install.packages("devtools")
devtools::install_github("Muunraker/nipalsMCIA", ref = "devel",
                         force = TRUE, build_vignettes = TRUE)
The package currently includes one test dataset: data_blocks. This is
a list of data frames containing observations of variables from three
omics types (mRNA, proteins, and microRNA) on 21 cell lines from the
NCI60 cancer cell line panel. The data file also includes a metadata
data frame containing the cancer type associated with each cell line.
# load the package and set a seed for reproducibility
library(nipalsMCIA)
set.seed(42)
data(NCI60) # import data as "data_blocks" and metadata as "metadata_NCI60"
# examine the data and metadata
summary(data_blocks)
#>       Length Class      Mode
#> mrna  12895  data.frame list
#> miRNA   537  data.frame list
#> prot   7016  data.frame list
head(metadata_NCI60)
#>            cancerType
#> CNS.SF_268        CNS
#> CNS.SF_295        CNS
#> CNS.SF_539        CNS
#> CNS.SNB_19        CNS
#> CNS.SNB_75        CNS
#> CNS.U251          CNS
table(metadata_NCI60)
#> cancerType
#>      CNS Leukemia Melanoma
#>        6        6        9
Note: this dataset is reproduced from the omicade4 package (Meng et al., 2014). This package assumes all input datasets are in sample-by-feature format.
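Because the package expects samples in rows and features in columns, a block stored the other way round needs transposing before use. The following is a minimal base R sketch on a made-up toy block; the object names (feature_by_sample, sample_by_feature) and the gene and sample labels are hypothetical and are not part of nipalsMCIA.

```r
# Hypothetical toy block stored as features (rows) by samples (columns)
feature_by_sample <- data.frame(
  sample_A = c(1.2, 0.4, 3.1),
  sample_B = c(0.9, 0.7, 2.8),
  row.names = c("gene1", "gene2", "gene3")
)

# t() returns a matrix, so convert back to a data frame afterwards;
# row and column names are swapped along with the values
sample_by_feature <- as.data.frame(t(feature_by_sample))

dim(sample_by_feature)       # 2 samples (rows) by 3 features (columns)
rownames(sample_by_feature)  # "sample_A" "sample_B"
```

Checking that every block has the same sample names in its rows after this step is a cheap way to catch orientation mistakes early.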
The main MCIA function can be called on data_blocks and can optionally
include metadata_NCI60 for plot coloring by cancer type:
# to convert data_blocks into an MAE object we provide the simple_mae() function
data_blocks_mae <- simple_mae(data_blocks, row_format = "sample",
                              colData = metadata_NCI60)
mcia_results <- nipals_multiblock(data_blocks_mae = data_blocks_mae,
                                  col_preproc_method = "colprofile",
                                  num_PCs = 10, tol = 1e-12,
                                  color_col = "cancerType")
Here, num_PCs is the user-chosen dimension of the low-dimensional
embedding of the data.
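To give a rough feel for what a component count and a convergence tolerance control in a NIPALS-style iteration, here is a simplified single-block sketch in base R. It is an illustration only and not the nipalsMCIA implementation: the function nipals_pca_sketch, its max_iter argument, the choice of starting vector, and the deflation step are all assumptions made for this example, and real MCIA operates on several blocks at once.

```r
# Illustrative single-block NIPALS sketch (hypothetical; not the package's code)
nipals_pca_sketch <- function(X, num_PCs = 2, tol = 1e-12, max_iter = 1000) {
  X <- scale(as.matrix(X), center = TRUE, scale = FALSE)  # column-center the data
  scores <- matrix(0, nrow(X), num_PCs)
  loadings <- matrix(0, ncol(X), num_PCs)
  for (k in seq_len(num_PCs)) {
    t_k <- X[, which.max(colSums(X^2))]       # start from the highest-variance column
    for (iter in seq_len(max_iter)) {
      p_k <- crossprod(X, t_k)                # loadings direction: X' t
      p_k <- p_k / sqrt(sum(p_k^2))           # normalize to unit length
      t_new <- X %*% p_k                      # updated scores: X p
      converged <- sum((t_new - t_k)^2) < tol # stop when the scores stop changing
      t_k <- t_new
      if (converged) break
    }
    scores[, k] <- t_k
    loadings[, k] <- p_k
    X <- X - tcrossprod(t_k, p_k)             # deflate: remove the fitted component
  }
  list(scores = scores, loadings = loadings)
}

set.seed(1)
toy <- matrix(rnorm(20 * 5), nrow = 20)       # toy data: 20 samples by 5 features
fit <- nipals_pca_sketch(toy, num_PCs = 3)
dim(fit$scores)                               # 20 x 3: one column per component
```

In this sketch, increasing num_PCs adds one column of scores and loadings per extra component, while a smaller tol demands more iterations before each component is accepted.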