The factorial single-cell latent variable model (slalom)
What is slalom?
slalom is a scalable modelling framework for single-cell RNA-seq data that uses gene set annotations to dissect single-cell transcriptome heterogeneity. This allows biological drivers of cell-to-cell variability to be identified and confounding factors to be modelled.
Observed heterogeneity in single-cell profiling data is multi-factorial. slalom provides an efficient framework for unravelling this heterogeneity by simultaneously inferring latent factors that reflect annotated factors from pathway databases, as well as unannotated factors that capture variation outside the annotation. slalom builds on sparse factor analysis models, for which this implementation provides efficient approximate inference using Variational Bayes, allowing the application of slalom to very large datasets containing up to 100,000 cells.
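To make the structure described above concrete, the following toy simulation generates data from a sparse factor model with annotated and unannotated factors. It is an illustrative sketch only, not slalom's implementation or API: the function name `simulate_sparse_factor_data`, all parameter names, the hard binary annotation mask and the Gaussian noise level are assumptions chosen for simplicity, and no inference is performed.

```python
import numpy as np


def simulate_sparse_factor_data(n_cells=50, n_genes=30, n_annotated=3,
                                n_unannotated=1, seed=0):
    """Simulate expression data from a toy sparse factor model (illustrative only)."""
    rng = np.random.default_rng(seed)
    n_factors = n_annotated + n_unannotated

    # Binary gene-set membership: mask[g, k] = 1 if gene g is annotated to factor k.
    mask = np.zeros((n_genes, n_factors), dtype=int)
    for k in range(n_annotated):
        members = rng.choice(n_genes, size=n_genes // 3, replace=False)
        mask[members, k] = 1
    # Assumption: unannotated factors have no gene set, so they may affect any gene.
    mask[:, n_annotated:] = 1

    # Per-cell factor activations and per-gene loadings (zeroed outside the mask).
    activations = rng.normal(size=(n_cells, n_factors))
    loadings = rng.normal(size=(n_genes, n_factors)) * mask

    # Observed expression = low-rank signal + Gaussian noise.
    noise = 0.1 * rng.normal(size=(n_cells, n_genes))
    expression = activations @ loadings.T + noise
    return expression, activations, loadings, mask


if __name__ == "__main__":
    expression, activations, loadings, mask = simulate_sparse_factor_data()
    print(expression.shape, loadings.shape)
```

In this sketch the annotation acts as a fixed mask on the loadings. A model fitted to such data would have to recover the activations and loadings from `expression` and the annotation alone, which is the inference problem slalom addresses with Variational Bayes.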
We provide two implementations of the slalom model: an R/C++ implementation that is available on Bioconductor and a Python implementation. Both the R and Python packages implement the model described in the accompanying publication (Buettner et al., 2017).
Software by Florian Buettner, Davis McCarthy and Oliver Stegle.
The slalom R package is available on Bioconductor, so the most reliable way to install it is the usual Bioconductor method:
## try http:// if https:// URLs are not supported
source("https://bioconductor.org/biocLite.R")
biocLite("slalom")
The source code for the R package can be found in the R_package folder of this repository.
The vignette supplied with the R package provides an overview of how to use that implementation of slalom.
Installation requirements for the Python implementation:
slalom requires Python 2.7 or newer with
- scipy, h5py, numpy, matplotlib, scikit-learn, re
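Before installing, it can be useful to check which of these dependencies are already importable. The snippet below is a minimal sketch using only the standard library; the helper name `find_missing_modules` is hypothetical and not part of slalom. Note that scikit-learn is imported under the module name `sklearn`, and `re` is part of the Python standard library.

```python
import importlib

# Import names of the requirements listed above (scikit-learn imports as "sklearn").
REQUIRED_MODULES = ["scipy", "h5py", "numpy", "matplotlib", "sklearn", "re"]


def find_missing_modules(module_names):
    """Return the subset of module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules are importable.")
```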
slalom can be installed via pip with
pip install slalom
For best results, we recommend the Anaconda Python distribution.
How to use slalom?
The current software version should be considered as beta. More extensive documentation, tutorials and examples will be available soon.
For an illustration of how slalom can be applied to the mESC data considered in Buettner et al. (2017), we have prepared a notebook. Along with other notebooks, it illustrates example analyses and workflows with slalom that you can read, download and adapt for your own analyses. These notebooks can be viewed and downloaded from here or here.
Documentation of the code can be found here.
Buettner, F., Pratanwanich, N., McCarthy, D.J., Marioni, J., Stegle, O. f-scLVM: scalable and versatile factor analysis for single-cell RNA-seq, 2017, Genome Biology.