Home
Welcome to our project wiki!
There are important differences among individual cells. Recent advances allow many molecular parameters, such as the expression of surface markers, to be measured at single-cell resolution. Analyzing, visualizing, and interpreting these data presents many challenges, but could lead to a better understanding of disease processes.
Given a large number of high-dimensional single-cell observations, we would like to:
- Characterize rare cellular subtypes
  - How many cells are in the subtype?
  - What are the average values of observables within the subtype? How do they differ from those of other cells in the population?
- Visualize all observations in 2D
- Illustrate relations between subtypes
- Summarize the full dataset in terms of distinct clusters and/or continuous spectra of variation, as appropriate
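As a minimal sketch of the first goal, assuming subtype labels have already been assigned by some earlier clustering step, the following counts the cells in one subtype and compares its per-marker means with the rest of the population. The function name `summarize_subtype`, the toy marker values, and the label names are illustrative only and do not come from any tool referenced on this page.

```python
from statistics import mean


def summarize_subtype(cells, labels, subtype):
    """Return size, fraction, and per-marker mean difference for one subtype.

    cells  : list of equal-length lists of marker values, one row per cell
    labels : list of subtype labels, one per cell (assumed to be given)
    """
    inside = [row for row, lab in zip(cells, labels) if lab == subtype]
    outside = [row for row, lab in zip(cells, labels) if lab != subtype]
    if not inside or not outside:
        raise ValueError("subtype must contain some, but not all, cells")
    n_markers = len(cells[0])
    # Column-wise means inside and outside the subtype.
    mean_in = [mean(row[j] for row in inside) for j in range(n_markers)]
    mean_out = [mean(row[j] for row in outside) for j in range(n_markers)]
    return {
        "count": len(inside),
        "fraction": len(inside) / len(cells),
        "mean_in": mean_in,
        "mean_out": mean_out,
        "difference": [a - b for a, b in zip(mean_in, mean_out)],
    }


# Toy data: four cells, two markers, one cell in the hypothetical "rare" subtype.
cells = [[1.0, 5.0], [1.2, 5.2], [0.8, 4.8], [9.0, 0.5]]
labels = ["common", "common", "common", "rare"]
summary = summarize_subtype(cells, labels, "rare")
```

Raw mean differences are only a starting point. With real cytometry data one would probably also want to scale each marker or report a measure of spread.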
In addition to applications of generic algorithms like PCA, these new datasets have motivated the development / adaptation of domain-specific algorithms.
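For reference, a generic PCA projection to 2D can be sketched with the standard library alone. This is a teaching sketch that uses power iteration with Gram-Schmidt orthogonalization, not an optimized implementation. All function names here (`center`, `covariance`, `leading_direction`, `pca_2d`) are made up for illustration. It assumes at least two markers and at least two cells.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(d)] for i in range(d)]


def remove_projections(v, basis):
    """Make v orthogonal to every unit vector in basis (Gram-Schmidt step)."""
    for b in basis:
        scale = dot(v, b)
        v = [x - scale * y for x, y in zip(v, b)]
    return v


def leading_direction(matrix, previous, iterations=200, seed=0):
    """Power iteration for the top eigenvector orthogonal to `previous`."""
    rng = random.Random(seed)
    d = len(matrix)
    v = remove_projections([rng.gauss(0.0, 1.0) for _ in range(d)], previous)
    norm = math.sqrt(dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = remove_projections([dot(row, v) for row in matrix], previous)
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # no variance left in this subspace
            break
        v = [x / norm for x in w]
    return v


def pca_2d(rows):
    """Project rows onto their first two principal components."""
    centered = center(rows)
    cov = covariance(centered)
    components = []
    for _ in range(2):
        components.append(leading_direction(cov, components))
    return [[dot(row, c) for c in components] for row in centered]


# Toy data: most variance lies along the first axis.
points = [[2.0, 0.0], [-2.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
coords = pca_2d(points)
```

The sign of each principal component is arbitrary, so the projected coordinates are only defined up to a reflection of each axis.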
SPADE paper: http://www.nature.com/nbt/journal/v29/n10/full/nbt.1991.html
SPADE code: https://github.com/nolanlab/spade
Limitations of SPADE:
- Generates qualitatively different outputs when run multiple times on the same data (due to stochastic down-sampling step)
- The number of cellular subtypes identified is a user-defined parameter, not an output
- Always returns a "progression tree", even when the cluster centers are, for example, mutually equidistant
- The local density estimator used is non-standard and is a nonlinear function of actual local density
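The first point can be illustrated with a toy density-dependent down-sampling step. This sketch is not SPADE's estimator or its down-sampling rule. It assumes a simple neighbor count within a fixed radius as the density, and keeps each cell with probability `min(1, target_density / density)`. The function names, the radius, and the target density are all hypothetical. The point is only that any random thinning step yields a different subset for each seed, so everything computed downstream can differ between runs.

```python
import math
import random


def local_density(points, radius):
    """Count neighbours within `radius` of each point (the point itself included)."""
    return [sum(1 for q in points if math.dist(p, q) <= radius) for p in points]


def density_dependent_downsample(points, radius, target_density, seed):
    """Return indices of the points kept after random, density-dependent thinning."""
    rng = random.Random(seed)
    densities = local_density(points, radius)
    kept = []
    for index, density in enumerate(densities):
        # Sparse regions are always kept; dense regions are thinned at random.
        keep_probability = min(1.0, target_density / density)
        if rng.random() < keep_probability:
            kept.append(index)
    return kept


# Toy data: one dense population of 200 cells plus three isolated "rare" cells.
data_rng = random.Random(42)
dense = [(data_rng.gauss(0.0, 0.1), data_rng.gauss(0.0, 0.1)) for _ in range(200)]
rare = [(10.0, 10.0), (-10.0, 10.0), (10.0, -10.0)]
points = dense + rare
rare_indices = {200, 201, 202}

run_a = density_dependent_downsample(points, radius=0.5, target_density=20, seed=0)
run_b = density_dependent_downsample(points, radius=0.5, target_density=20, seed=1)
```

Fixing the seed makes a single run reproducible, but it does not remove the underlying sensitivity to which cells happen to be retained.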
Dana Pe'er's lab used t-distributed stochastic neighbor embedding (t-SNE, re-branded as "viSNE") to visualize cytometry data in 2D.
viSNE wraps Laurens van der Maaten's original C++ implementation, described in the paper "Barnes-Hut SNE" and available at: http://lvdmaaten.github.io/tsne/
- t-SNE learns an embedding, not a map: it cannot be extended to new, out-of-sample observations
- t-SNE is a force-based method and is sensitive to its "resolution parameters": it can create artificial clusters if the attractive forces between similar points are too strong relative to the repulsive forces between dissimilar points (cf. http://www.pnas.org/content/108/41/16916.full)
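The force picture can be made concrete with the t-SNE gradient as written in the Barnes-Hut SNE paper. There the gradient on point i splits into an attractive term, the sum over j of p_ij w_ij (y_i - y_j), and a repulsive term, the sum over j of q_ij w_ij (y_i - y_j), where w_ij = 1 / (1 + ||y_i - y_j||^2) and q_ij = w_ij / Z. The sketch below evaluates both terms for a tiny hand-made example. The function name `tsne_forces` is hypothetical. The `exaggeration` argument is an illustrative multiplier on p_ij, used here only to show what happens when attraction is scaled up.

```python
def tsne_forces(p, y, i, exaggeration=1.0):
    """Return the (attractive, repulsive) gradient terms acting on point i.

    p : symmetric joint-probability matrix in the input space (entries sum to 1)
    y : current low-dimensional coordinates, one list per point
    The full gradient is 4 * (attractive - repulsive).
    """
    n = len(y)
    dim = len(y[i])

    def weight(k, l):
        # Unnormalised Student-t kernel between embedded points k and l.
        return 1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(y[k], y[l])))

    # Normalising constant Z over all ordered pairs of distinct points.
    z = sum(weight(k, l) for k in range(n) for l in range(n) if k != l)

    attractive = [0.0] * dim
    repulsive = [0.0] * dim
    for j in range(n):
        if j == i:
            continue
        w = weight(i, j)
        q = w / z
        for d in range(dim):
            diff = y[i][d] - y[j][d]
            attractive[d] += exaggeration * p[i][j] * w * diff
            repulsive[d] += q * w * diff
    return attractive, repulsive


# Two points whose input-space similarity already matches the embedding.
p = [[0.0, 0.5], [0.5, 0.0]]
y = [[0.0, 0.0], [1.0, 0.0]]

balanced_attr, balanced_rep = tsne_forces(p, y, i=0)
strong_attr, strong_rep = tsne_forces(p, y, i=0, exaggeration=4.0)
```

In the balanced case the two terms cancel and the point stays put. With the attraction scaled up, the net gradient pulls the two points together even though the input similarities are unchanged. This is a small-scale version of the clustering artifact described above.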
- Theano Homepage: http://deeplearning.net/software/theano/index.html
- Hinton Science Paper: http://www.cs.toronto.edu/~hinton/science.pdf
- Hinton Training RBMs Paper: https://www.cs.toronto.edu/~hinton/absps/guideTR.pdf
- deeplearning.net Theano RBM tutorial: http://deeplearning.net/tutorial/rbm.html