Muunraker/nipalsMCIA

nipalsMCIA: Software to Compute Multi-Block Dimensionality Reduction

This package computes Multiple Co-Inertia Analysis (MCIA) on multi-block data using the Nonlinear Iterative Partial Least Squares (NIPALS) method.

Features include:

  • Efficient computation of deflation and variance enabling embedding of high-volume (e.g. single-cell) datasets.
  • Functionality to perform out-of-sample embedding.
  • Easy-to-adjust options for deflation and pre-processing.
  • Multiple visualization and analysis options for sample- and feature-level embedding results.
  • Streamlined, well-documented, and supported code that is consistent with published theory, enabling more efficient algorithm development and extension.

References

Installation

This package can be installed via Bioconductor:

if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

BiocManager::install("nipalsMCIA")

You can install the development version of nipalsMCIA from GitHub with:

# install.packages("devtools")
devtools::install_github("Muunraker/nipalsMCIA", ref = "devel",
                         force = TRUE, build_vignettes = TRUE)

Basic Example

The package currently includes one test dataset: data_blocks. This is a list of data frames containing observations of variables from three omics types (mRNA, proteins, and microRNA) on 21 cell lines from the NCI60 cancer cell line panel. The data file also includes a metadata data frame, metadata_NCI60, containing the cancer type associated with each cell line.

# load the package and set a seed for reproducibility
library(nipalsMCIA)
set.seed(42)
data(NCI60) # import data as "data_blocks" and metadata as "metadata_NCI60"

# examine the data and metadata
summary(data_blocks)
#>       Length Class      Mode
#> mrna  12895  data.frame list
#> miRNA   537  data.frame list
#> prot   7016  data.frame list
head(metadata_NCI60)
#>            cancerType
#> CNS.SF_268        CNS
#> CNS.SF_295        CNS
#> CNS.SF_539        CNS
#> CNS.SNB_19        CNS
#> CNS.SNB_75        CNS
#> CNS.U251          CNS
table(metadata_NCI60)
#> cancerType
#>      CNS Leukemia Melanoma 
#>        6        6        9

Note: this dataset is reproduced from the omicade4 package (Meng et al., 2014). This package assumes all input datasets are in sample-by-feature format, i.e. rows are samples and columns are features.
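
If your own data are stored as feature-by-sample matrices, they need to be reoriented first. The following is a minimal base R sketch of that step on made-up toy data; the object names (feat_by_sample, sample_by_feat, blocks) are hypothetical and not part of the package.

```r
# Toy feature-by-sample matrix: 3 features (rows) measured on 4 samples (columns).
feat_by_sample <- matrix(1:12, nrow = 3,
                         dimnames = list(paste0("gene", 1:3),
                                         paste0("sample", 1:4)))

# Transpose so that rows are samples and columns are features.
sample_by_feat <- as.data.frame(t(feat_by_sample))

dim(sample_by_feat)       # rows = samples, columns = features
rownames(sample_by_feat)  # sample names now index the rows

# Hypothetical list of blocks; every block should describe the same samples
# in the same order before being combined into a multi-block object.
blocks <- list(blockA = sample_by_feat,
               blockB = sample_by_feat[, 1:2])
same_samples <- all(vapply(blocks, function(b) {
  identical(rownames(b), rownames(blocks[[1]]))
}, logical(1)))
same_samples
```

Checking that all blocks share the same sample names before running the analysis is a cheap way to catch orientation mistakes early.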

The main MCIA function, nipals_multiblock(), is called on a MultiAssayExperiment (MAE) object built from data_blocks. The metadata in metadata_NCI60 can optionally be included so that plots are colored by cancer type:

# to convert data_blocks into an MAE object we provide the simple_mae() function
data_blocks_mae <- simple_mae(data_blocks, row_format = "sample",
                              colData = metadata_NCI60)

mcia_results <- nipals_multiblock(data_blocks_mae = data_blocks_mae,
                                  col_preproc_method = "colprofile",
                                  num_PCs = 10, tol = 1e-12,
                                  color_col = "cancerType")

Here, num_PCs is the user-chosen dimension of the low-dimensional embedding of the data.
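
To give some intuition for what num_PCs and tol control, here is a minimal sketch of the NIPALS iteration on a single matrix, written in base R. It is not the nipalsMCIA implementation and does not perform multi-block MCIA. The function name nipals_pca, the max_iter argument, the squared-difference stopping rule, and the toy data are all assumptions made for illustration. It assumes that tol acts as a convergence tolerance on successive score vectors.

```r
# Illustrative single-block NIPALS (hypothetical helper, not the package code).
nipals_pca <- function(X, num_PCs = 2, tol = 1e-12, max_iter = 1000) {
  X <- scale(X, center = TRUE, scale = FALSE)   # column-center the data
  scores   <- matrix(0, nrow(X), num_PCs)
  loadings <- matrix(0, ncol(X), num_PCs)
  for (k in seq_len(num_PCs)) {
    # Start from the column with the largest variance.
    t_vec <- X[, which.max(apply(X, 2, var))]
    for (iter in seq_len(max_iter)) {
      p_vec <- crossprod(X, t_vec)              # loadings direction: X' t
      p_vec <- p_vec / sqrt(sum(p_vec^2))       # normalize to unit length
      t_new <- X %*% p_vec                      # updated scores: X p
      converged <- sum((t_new - t_vec)^2) < tol # stop when scores stabilize
      t_vec <- t_new
      if (converged) break
    }
    scores[, k]   <- t_vec
    loadings[, k] <- p_vec
    X <- X - t_vec %*% t(p_vec)                 # deflate: remove this component
  }
  list(scores = scores, loadings = loadings)
}

# Toy data: 20 samples x 5 features with clearly separated variances.
set.seed(42)
toy <- matrix(rnorm(100), 20, 5) %*% diag(c(5, 3, 2, 1, 0.5))
res <- nipals_pca(toy, num_PCs = 2)
dim(res$scores)   # one row per sample, one column per component

# Up to sign, the first NIPALS score vector should match ordinary PCA.
first_pc_agreement <- abs(cor(res$scores[, 1], prcomp(toy)$x[, 1]))
```

In this sketch, each component is extracted one at a time and removed from the data (deflation) before the next one is computed, so num_PCs sets how many extract-and-deflate rounds are run.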