Testing version of scanpy that solely includes DPT and diffusion maps.

ShHsLin/scanpy

Examples | Tools | Comparable Software | Installation | References

scanpy - single-cell analysis in python

!! This is just a testing version that solely includes DPT and Diffusion Maps. !!

!! Comments are welcome. !!

Tools for analyzing and simulating single-cell data that aim at an understanding of dynamic biological processes from snapshots of transcriptome or proteome.

Examples

The following examples assume you use the Python scripts in tools, which work without installation. You can adapt these scripts to your needs, for example, by adding more examples in tools/preprocess.py. If you prefer working with Jupyter notebooks, see the examples in examples/examples.ipynb.

Download or clone the repository and cd into its root directory. The package has been tested using Anaconda environments for Python 2 and 3.

Segment 1 corresponds to a branch of granulocyte/macrophage progenitors (GMP), segment 3 corresponds to a branch of megakaryocyte/erythrocyte progenitors (MEP).

$ python tools/dpt.py paul15

Segment 3 corresponds to a branch of erythrocytes, segments 1 and 2 to a branch of endothelial cells.

$ python tools/dpt.py moignard15

If you just want a quick visualization using the diffusion map representation, run

$ python tools/diffmap.py moignard15

Your own example

Suppose we are not satisfied with taking the logarithm of the count matrix before running DPT on the data of Paul et al. (2015), as in example paul15 above. We copy the entry paul15 from the dictionary examples in scanpy/preprocess.py and paste it into the dictionary examples in tools/preprocess.py. We then rename the key of the new entry to "paul15_nolog". We do the same with the function paul15: we copy it, remove the log transform, and rename it to paul15_nolog.

Running paul15_nolog, we observe a considerably changed representation. Here, we identify segment 3 with the branch of granulocyte/macrophage progenitors (GMP) and segment 2 with the branch of megakaryocyte/erythrocyte progenitors (MEP).

$ python tools/dpt.py paul15_nolog
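The copy-and-rename step described above can be sketched as follows. This is only an illustration: the contents of the dictionary entry and the signature of the preprocessing functions are hypothetical placeholders, and the actual code in scanpy/preprocess.py and tools/preprocess.py may look different.

```python
import copy
import numpy as np

# Hypothetical stand-in for the `examples` dictionary; the real entry
# for paul15 may hold different keys and values.
examples = {'paul15': {'note': 'placeholder parameters'}}

def paul15(X):
    """Hypothetical preprocessing: log-transform a count matrix."""
    return np.log(np.asarray(X, dtype=float) + 1)

def paul15_nolog(X):
    """Copy of paul15 with the log transform removed."""
    return np.asarray(X, dtype=float)

# Copy the dictionary entry under the new key.
examples['paul15_nolog'] = copy.deepcopy(examples['paul15'])

# Toy counts (not the Paul et al. data) to compare both variants.
counts = np.array([[0, 3], [10, 1]])
logged = paul15(counts)
raw = paul15_nolog(counts)
```

Keeping the original entry untouched and adding a renamed copy means both examples stay runnable side by side.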

Simulated myeloid progenitor data (Krumsiek et al., 2011)

Here, we simulate data using a literature-curated Boolean gene regulatory network, which is believed to describe myeloid differentiation (Krumsiek et al., 2011). Using sim.py, the Boolean model is translated into a stochastic ordinary differential equation (Wittman et al., 2009). Simulations result in branching time series of gene expression, where each branch corresponds to a certain cell fate of common myeloid progenitors (megakaryocytes, erythrocytes, granulocytes and monocytes).

$ python tools/sim.py krumsiek11
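As a rough illustration of the idea, and not the implementation in sim.py, the following sketch relaxes a two-gene Boolean toggle (A = NOT B, B = NOT A) into a stochastic differential equation using Hill functions and integrates it with the Euler-Maruyama scheme. The network, the function names hill and simulate_toggle, and all parameter values are assumptions chosen for illustration; the network of Krumsiek et al. (2011) is considerably larger.

```python
import numpy as np

def hill(x, k=0.5, n=4):
    """Sigmoidal Hill function used as a continuous stand-in for 'x is ON'."""
    return x**n / (k**n + x**n)

def simulate_toggle(n_steps=2000, dt=0.01, tau=1.0, noise=0.05, seed=0):
    """Euler-Maruyama integration of a noisy two-gene toggle switch."""
    rng = np.random.default_rng(seed)
    x = np.empty((n_steps, 2))
    x[0] = [0.5, 0.5]  # undecided progenitor-like starting state
    for t in range(1, n_steps):
        a, b = x[t - 1]
        # Boolean rules A = NOT B and B = NOT A, relaxed to 1 - hill(.)
        target = np.array([1.0 - hill(b), 1.0 - hill(a)])
        drift = (target - x[t - 1]) / tau
        step = dt * drift + noise * np.sqrt(dt) * rng.standard_normal(2)
        # keep expression levels in the unit interval
        x[t] = np.clip(x[t - 1] + step, 0.0, 1.0)
    return x

trajectory = simulate_toggle()
```

Different seeds can push the trajectory toward either of the two stable states, which is the toy analogue of different realizations ending in different cell fates.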

If the order is shuffled, as in a snapshot, the same data looks as follows

Let us reconstruct an order by estimating geodesic distance with DPT. This recovers the branching lineage:

$ python tools/dpt.py krumsiek11

The left panel illustrates how the data is organized according to a pseudotime and different segments. Pseudotime is an estimator of geodesic distance on the manifold from an initial point. Segments are discrete partitions of the data. Both can be visualized in the diffusion map representation.

Tools

Here, each tool is described in more detail.

diffmap and dpt

diffmap.py implements diffusion maps (Coifman et al., 2005), which were proposed for visualizing single-cell data by Haghverdi et al. (2015). diffmap.py also incorporates the modifications to the original algorithm proposed by Haghverdi et al. (2016).

dpt.py implements Diffusion Pseudotime as introduced by Haghverdi et al. (2016).
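To make the two methods more concrete, here is a minimal NumPy sketch of a diffusion map with a Gaussian kernel and density normalization, followed by a DPT-style distance from a root cell computed in the eigenbasis of the symmetric transition matrix. It is a simplified illustration under stated assumptions, not the code in diffmap.py or dpt.py: the function names diffusion_map and dpt_from_root, the fixed kernel width sigma, and the truncation to a few components are all choices made for this example.

```python
import numpy as np

def diffusion_map(X, sigma=1.0, n_comps=5):
    """Return the leading eigenvalues/eigenvectors of a symmetric
    diffusion operator built from a Gaussian kernel."""
    sq = np.sum(X**2, axis=1)
    # pairwise squared Euclidean distances
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X.dot(X.T), 0)
    K = np.exp(-d2 / (2 * sigma**2))
    # density normalization to reduce the effect of sampling density
    q = K.sum(axis=1)
    K = K / np.outer(q, q)
    # symmetric normalization: T_sym = D^{-1/2} K D^{-1/2}
    d = K.sum(axis=1)
    T_sym = K / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh(T_sym)
    order = np.argsort(evals)[::-1][:n_comps]
    return evals[order], evecs[:, order]

def dpt_from_root(evals, evecs, root):
    """DPT-style distance of every cell to the root cell."""
    # drop the first component (eigenvalue 1, stationary state)
    lam, psi = evals[1:], evecs[:, 1:]
    weights = lam / (1 - lam)  # sums the contributions of all walk lengths
    diff = psi - psi[root]
    return np.sqrt(np.sum((weights * diff)**2, axis=1))

# Toy data: cells along a one-dimensional trajectory embedded in 2D.
t = np.linspace(0, 1, 50)
X = np.column_stack([t, np.zeros_like(t)])
evals, evecs = diffusion_map(X, sigma=0.1)
pseudotime = dpt_from_root(evals, evecs, root=0)
```

The columns of evecs after the first give the diffusion map coordinates used for plotting, and pseudotime orders the cells by their estimated distance from the chosen root.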

The functionality of these two tools is comparable to that of the R package destiny of Angerer et al. (2015).

Comparable Software

This section compiles software packages that are comparable to scanpy, but differ substantially in implementation, usage and tools provided. A more comprehensive list can be found here.

Installation

For usage of the scripts in tools from the root of the repository, no installation is needed.

If you want to import scanpy from anywhere on your system, you can install it locally via

$ pip install .

You can also install the package with symlinks, so that changes to your copy of the package become immediately available

$ pip install -e .

Your work on the scripts in tools will not be affected by installation. These scripts insert the root of the repository at the beginning of the search path via sys.path.insert(0,'.') and hence load scanpy locally.
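The effect of that line can be checked with a short snippet. It only manipulates the search path and does not import scanpy itself, so it runs from any directory.

```python
import sys

# Put the current working directory first on the module search path,
# as the scripts in tools do when run from the repository root.
sys.path.insert(0, '.')

# Python searches sys.path in order, so a package found in './'
# shadows an installed package of the same name.
first_entry = sys.path[0]
```

Because the current directory comes first, running a script from the repository root picks up the local scanpy directory even if another copy has been installed with pip.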

References

Angerer et al. (2015), destiny - diffusion maps for large-scale single-cell data in R, Bioinformatics 32, 1241.

Coifman et al. (2005), Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps, PNAS 102, 7426.

Haghverdi et al. (2015), Diffusion maps for high-dimensional single-cell analysis of differentiation data, Bioinformatics 31, 2989.

Haghverdi et al. (2016), Diffusion pseudotime robustly reconstructs branching cellular lineages, Nature Methods 13, 845.

Moignard et al. (2015), Decoding the regulatory network of early blood development from single-cell gene expression measurements, Nature Biotechnology 33, 269.

Paul et al. (2015), Transcriptional Heterogeneity and Lineage Commitment in Myeloid Progenitors, Cell 163, 1663.

Wittman et al. (2009), Transforming Boolean models to continuous models: methodology and application to T-cell receptor signaling, BMC Systems Biology 3, 98.
