Feature Selection and Dimension Reduction for Single Cell RNA-Seq based on a Multinomial Model
This repository contains supporting code to facilitate reproducible analysis. For details, see the bioRxiv preprint. If you find bugs, please create a GitHub issue.
GLM-PCA (dimension reduction for generalized linear model likelihoods) is now available as a standalone R package. This method is highlighted in the paper as being suitable for single cell RNA-Seq data.
Will Townes, Stephanie Hicks, Martin Aryee, and Rafa Irizarry
Description of Repository Contents
Implementations of dimension reduction algorithms
- existing.R - wrapper functions for PCA, t-SNE, ZINB-WaVE, etc.
- glmpca.R - placeholder file that just loads the glmpca package.
Analysis of various real scRNA-Seq datasets. The Rmarkdown files can be used to produce the figures in the manuscript.
Systematic assessment of clustering performance of a variety of normalization, feature selection, and dimension reduction algorithms using ground-truth datasets.
- clustering.R - wrappers for Seurat clustering, model-based clustering, and k-means
- functions.R - Poisson and binomial deviance and residual functions, plus a function for loading 10x read counts from molecule information files.
- functions_genefilter.R - convenience functions for gene filtering (feature selection) based on highly variable genes, highly expressed genes, and deviance.
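To illustrate the idea behind deviance-based feature selection mentioned in the list above, here is a minimal sketch in Python. It is not the code in functions.R or functions_genefilter.R; the function names (`binomial_deviance`, `poisson_deviance`, `rank_genes_by_deviance`) and the toy data layout are hypothetical. It assumes the null model in which each gene makes up a constant proportion of every cell's total count, so genes that fit this model poorly (high deviance) are treated as informative.

```python
import math


def xlogy_ratio(y, mu):
    """Return y * log(y / mu), using the convention 0 * log(0) = 0."""
    return y * math.log(y / mu) if y > 0 else 0.0


def binomial_deviance(counts, totals):
    """Binomial deviance of one gene under a constant-proportion null model.

    counts[i] is the gene's count in cell i; totals[i] is the total count of cell i.
    """
    pi_hat = sum(counts) / sum(totals)  # pooled estimate of the gene's proportion
    if pi_hat in (0.0, 1.0):
        return 0.0  # degenerate gene: the null model fits exactly
    dev = 0.0
    for y, n in zip(counts, totals):
        dev += xlogy_ratio(y, n * pi_hat)
        dev += xlogy_ratio(n - y, n * (1.0 - pi_hat))
    return 2.0 * dev


def poisson_deviance(counts, totals):
    """Poisson deviance of one gene with expected count mu_i = totals[i] * pi_hat."""
    pi_hat = sum(counts) / sum(totals)
    if pi_hat == 0.0:
        return 0.0
    dev = 0.0
    for y, n in zip(counts, totals):
        mu = n * pi_hat
        dev += xlogy_ratio(y, mu) - (y - mu)
    return 2.0 * dev


def rank_genes_by_deviance(gene_counts, top_n, deviance=binomial_deviance):
    """Return the names of the top_n genes with the largest deviance.

    gene_counts maps a gene name to its list of per-cell counts (hypothetical layout).
    """
    n_cells = len(next(iter(gene_counts.values())))
    # Per-cell totals are the column sums over all genes.
    totals = [sum(row[i] for row in gene_counts.values()) for i in range(n_cells)]
    scores = {gene: deviance(row, totals) for gene, row in gene_counts.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Toy data: four cells with total counts 100, 200, 100, 200.
toy = {
    "housekeeping": [10, 20, 10, 20],   # constant 10% of every cell
    "marker": [0, 20, 10, 0],           # proportion varies strongly across cells
    "background": [90, 160, 80, 180],   # remaining counts
}
selected = rank_genes_by_deviance(toy, top_n=2)
```

In this toy example the "housekeeping" gene has a deviance of essentially zero because its proportion is the same in every cell, so it is the gene that gets filtered out. Ranking by deviance works directly on raw counts and needs no log transform or pseudocount.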