PROSSTT (PRObabilistic Simulations of ScRNA-seq Tree-like Topologies) is a package with code for the simulation of scRNAseq data for dynamic processes such as cell differentiation. PROSSTT is open source GPL-licensed software implemented in Python.
Single-cell RNAseq is revolutionizing cellular biology, and many algorithms are being developed for the analysis of scRNAseq data. PROSSTT provides an easy way to test the performance of trajectory inference methods on realistic data with a known "gold standard". The algorithm can produce datasets with user-defined topologies while simulating any number of sampled cells and genes.
PROSSTT can be installed using the pip package manager or any pip-compatible package manager:
pip install git+https://github.com/soedinglab/prosstt.git
Installation from source
git clone https://github.com/soedinglab/prosstt.git
cd prosstt
python setup.py install
PROSSTT requires the following libraries:
numpy, for data structures
scipy, for probabilistic distributions and special functions
pandas, for I/O
newick, for the Newick tree file format
We also recommend the following libraries:
matplotlib, for plotting
jupyter notebooks, for demonstration and development purposes
scanpy, for the visualization of simulations via diffusion maps. This requires anndata and Python 3.6 to work.
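To see which of the libraries listed above are available in the current environment, one option is to probe for them without importing them. The following is a minimal sketch and not part of PROSSTT: the helper name missing_modules is hypothetical, and the lists assume that each library is importable under the name given above.

```python
from importlib.util import find_spec

# Module names are assumed to match the library names listed above.
# Jupyter is left out because it is a tool rather than an import.
REQUIRED = ["numpy", "scipy", "pandas", "newick"]
RECOMMENDED = ["matplotlib", "scanpy", "anndata"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    print("missing required:", missing_modules(REQUIRED))
    print("missing recommended:", missing_modules(RECOMMENDED))
```

An empty list for the required libraries means that the core dependencies can be located; the recommended ones only matter for plotting and visualization.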
How to use
In PROSSTT, topologies are described in terms of branches and their connectivity. All simulated genes are affected by the differentiation process; no nonsense genes are included, although they may be in the future.
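To make the idea of "branches and their connectivity" concrete, the sketch below describes a single bifurcation as a list of (parent, child) branch pairs and serializes it as a Newick string, the tree format mentioned in the dependency list. It is an illustration in plain Python and does not use the PROSSTT API: the function to_newick, the branch names A, B and C, and the branch lengths of 50 are all assumptions made for this example.

```python
def to_newick(connectivity, lengths, root):
    """Serialize a branch topology as a Newick string.

    connectivity: list of (parent, child) branch pairs
    lengths: dict mapping each branch name to its length
    root: name of the branch the process starts from
    """
    # Collect the child branches of every parent branch.
    children = {}
    for parent, child in connectivity:
        children.setdefault(parent, []).append(child)

    def render(branch):
        label = f"{branch}:{lengths[branch]}"
        kids = children.get(branch, [])
        if not kids:
            return label  # terminal branch
        return "(" + ",".join(render(kid) for kid in kids) + ")" + label

    return render(root) + ";"


# Hypothetical example: branch A splits into branches B and C.
connectivity = [("A", "B"), ("A", "C")]
lengths = {"A": 50, "B": 50, "C": 50}

newick_string = to_newick(connectivity, lengths, root="A")
print(newick_string)  # (B:50,C:50)A:50;
```

More complex topologies follow the same pattern: each additional branch point adds further (parent, child) pairs to the connectivity list.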
We also include a Python script that can be run from the command line (examples/generate_simN.py) to produce simulations like the ones used in the MERLoT paper.
For more information please refer to the documentation.