
scLVM

What is scLVM?

scLVM is a modelling framework for single-cell RNA-seq data that can be used to dissect the observed heterogeneity into different sources, thereby allowing for the correction of confounding sources of variation.

scLVM was primarily designed to account for cell-cycle-induced variation in single-cell RNA-seq data where the cell cycle is the primary source of variability. Tutorials for other use cases will follow shortly.

Software by Florian Buettner, Paolo Casale and Oliver Stegle. scLVM is explained in more detail in the accompanying publication [1].

Philosophy

Observed heterogeneity in single-cell profiling data is multi-factorial. scLVM provides an efficient framework for unravelling this heterogeneity, correcting for confounding factors and facilitating unbiased downstream analyses. scLVM builds on Gaussian process latent variable models and linear mixed models. The underlying models are based on inference schemes implemented in LIMIX.
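
The idea of attributing part of a gene's variance to a confounding factor and then removing it can be illustrated with a small toy example. This is only a sketch under strong simplifying assumptions, not scLVM's inference: it uses simulated data, a single known factor and an ordinary least-squares fit, whereas scLVM fits latent variable and linear mixed models through LIMIX. The function `regress_out` and the variables `cell_cycle` and `gene` are hypothetical names introduced for this illustration. The code uses only the standard library and is written to run under both Python 2.7 and Python 3.

```python
import random


def regress_out(expression, factor):
    """Remove the linear effect of a known factor from one gene's expression.

    Returns (corrected_expression, fraction_of_variance_explained).
    This is a plain least-squares fit used as a stand-in for illustration;
    it is not the mixed-model fit that scLVM performs.
    """
    n = float(len(factor))
    mean_x = sum(factor) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in factor)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(factor, expression))
    slope = sxy / sxx
    # Subtract the fitted factor effect while keeping the gene's mean level.
    corrected = [y - slope * (x - mean_x) for x, y in zip(factor, expression)]
    # Share of the gene's variance attributed to the factor (squared correlation).
    explained = slope * sxy / syy
    return corrected, explained


# Simulated toy data (assumption): one confounding factor and one gene.
rng = random.Random(0)
n_cells = 200
cell_cycle = [rng.gauss(0.0, 1.0) for _ in range(n_cells)]
gene = [5.0 + 2.0 * c + rng.gauss(0.0, 1.0) for c in cell_cycle]

corrected, explained = regress_out(gene, cell_cycle)
print("fraction of variance attributed to the factor: %.2f" % explained)
```

After the correction, the remaining expression values are uncorrelated with the simulated factor, which is the property that downstream analyses rely on.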

Installation

  • scLVM can be installed using pip install scLVM on most systems. If you have trouble using pip, have a look at the detailed instructions in the wiki.

  • It requires Python 2.7 with

    • scipy, h5py, numpy, pylab (provided by matplotlib)
  • In addition, scLVM relies heavily on limix (version 1.0.8 or higher).

  • If you would like to use the non-linear GPLVM for visualisation, we suggest installing the GPy package. This can be installed using pip install GPy.

  • Preprocessing steps are executed in R and require R > 3.0. They can be performed either as part of the R package (see the next bullet point) or via scripts. For an example of how raw counts can be processed appropriately, see our markdown vignette.

  • For users who prefer to run the entire scLVM pipeline in R, we also provide an R package, which is based on rPython. The scLVM R package can be downloaded here.
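
A quick way to confirm that the Python dependencies listed above are present is to try importing each one. The following is a minimal sketch, not part of scLVM: the helper `missing_modules` is a hypothetical name, and the import names (`limix`, `GPy`, and `matplotlib` standing in for pylab) are assumed to match the package names. It uses only `importlib.import_module` from the standard library and should run under both Python 2.7 and Python 3.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Import names are assumptions based on the requirements listed above.
REQUIRED = ["scipy", "h5py", "numpy", "matplotlib", "limix"]
OPTIONAL = ["GPy"]  # only needed for the non-linear GPLVM visualisation

for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
    missing = missing_modules(names)
    status = "all found" if not missing else "missing: " + ", ".join(missing)
    print(label + " modules: " + status)
```

This check only confirms that a module can be imported; it does not verify versions, such as the limix version requirement.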

How to use scLVM?

The current software version should be considered beta. Still, the method works and can be used to reproduce the results of the accompanying publication [1]. More extensive documentation, tutorials and examples will be available soon.

The tutorials for our R package and for the Python implementation are a good starting point.

For an illustration of how scLVM can be applied to the T-cell data considered in Buettner et al. [1], we have prepared a notebook that can be viewed interactively or as a PDF export. A corresponding illustration is also available for the R package.

While in principle the R package and the Python package have the same functionality, we recommend using the R package, as more extensive documentation is available for it and development currently focuses on it.

Problems?

If you want to use scLVM and encounter any issues, please contact us by email: scLVM-dev@ebi.ac.uk

License

See LICENSE

References

[1] Buettner F, Natarajan KN, Casale FP, Proserpio V, Scialdone A, Theis FJ, Teichmann SA, Marioni JC & Stegle O, 2015. Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells. Nature Biotechnology, doi: 10.1038/nbt.3102.