jessemzhang/dendrosplit

DendroSplit

This repository provides the full source code for the DendroSplit framework described in the paper "An Interpretable Framework for Clustering Single-Cell RNA-Seq Datasets" by Zhang, Fan, Fan, Rosenfeld, and Tse. It also contains the scripts necessary for reproducing the results in the paper. Please see this Bitbucket repository for the version of the package used and maintained by BD Genomics.

Overview

In our paper we analyzed 9 publicly available single-cell RNA-Seq datasets:

  1. Biase et al.: paper, data
  2. Yan et al.: paper, data
  3. Pollen et al.: paper, data
  4. Kolodziejczyk et al.: paper, data
  5. Patel et al.: paper, data
  6. Zeisel et al.: paper, data
  7. Macosko et al.: paper, data
  8. Birey et al.: paper, data
  9. Zheng et al.: paper, data

We also analyzed some synthetic datasets. The Jupyter notebooks in the Figures directory contain the code used to reproduce all the figures in the paper, and the wrapper code used by the notebooks is also provided. Processing each dataset requires four input files, saved in the directory DATAPREFIX/ as:

  1. DATAPREFIX_expr.txt (or DATAPREFIX_expr.h5 for larger datasets): a matrix of gene/transcript expression values where the rows correspond to cells and the columns correspond to features
  2. DATAPREFIX_labels.txt: a set of labels for all the cells
  3. DATAPREFIX_features.txt: a set of feature names
  4. DATAPREFIX_reducedim_coor.txt: a 2D representation of the data for visualizing results
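
The file layout above can be illustrated with a small loading sketch. It is not the wrapper code shipped with the notebooks: load_dataset and write_toy_dataset are hypothetical helpers, and the sketch assumes that the expression and coordinate files are whitespace-delimited numeric text and that the label and feature files hold one entry per line. The .h5 variant for larger datasets is not covered.

```python
import os
import tempfile

import numpy as np


def read_lines(path):
    """Read a text file as a list of stripped, non-empty lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def load_dataset(data_dir):
    """Load the four text inputs for one dataset (hypothetical helper).

    Assumes data_dir is named DATAPREFIX and holds DATAPREFIX_*.txt files.
    """
    prefix = os.path.basename(os.path.normpath(data_dir))
    base = os.path.join(data_dir, prefix)
    # Assumption: whitespace-delimited matrix, rows = cells, columns = features.
    expr = np.loadtxt(base + '_expr.txt', ndmin=2)
    # Assumption: one label per cell and one feature name per line.
    labels = read_lines(base + '_labels.txt')
    features = read_lines(base + '_features.txt')
    # Assumption: one row of 2D coordinates per cell.
    coords = np.loadtxt(base + '_reducedim_coor.txt', ndmin=2)
    # Check that the four inputs describe the same cells and features.
    if expr.shape[0] != len(labels) or expr.shape[0] != coords.shape[0]:
        raise ValueError('number of cells differs between input files')
    if expr.shape[1] != len(features):
        raise ValueError('number of features differs between input files')
    return expr, labels, features, coords


def write_toy_dataset(root, prefix='toy'):
    """Write a made-up 4-cell, 3-feature dataset in the layout above."""
    data_dir = os.path.join(root, prefix)
    os.makedirs(data_dir)
    base = os.path.join(data_dir, prefix)
    expr = np.array([[1.0, 0.0, 3.0],
                     [0.0, 2.0, 0.0],
                     [4.0, 0.0, 1.0],
                     [0.0, 5.0, 0.0]])
    np.savetxt(base + '_expr.txt', expr)
    with open(base + '_labels.txt', 'w') as handle:
        handle.write('\n'.join(['A', 'B', 'A', 'B']) + '\n')
    with open(base + '_features.txt', 'w') as handle:
        handle.write('\n'.join(['g1', 'g2', 'g3']) + '\n')
    # The first two columns stand in for a real 2D embedding.
    np.savetxt(base + '_reducedim_coor.txt', expr[:, :2])
    return data_dir


toy_dir = write_toy_dataset(tempfile.mkdtemp())
expr, labels, features, coords = load_dataset(toy_dir)
print(expr.shape, len(labels), len(features), coords.shape)
```

The consistency checks are the main point of the sketch: the rows of the expression matrix, the labels, and the 2D coordinates must all refer to the same cells in the same order.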

Dependencies

DendroSplit is written in Python 2.7 and has the following dependencies (Python modules):

  • numpy (1.12.1)
  • scipy (0.19.0)
  • matplotlib (1.5.3)
  • sklearn (0.18.1)
  • networkx (1.11)
  • community

The tutorial Jupyter notebook also uses tsne (0.1.7) and pandas (0.20.1) for preparing the example data.
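
To compare an existing environment against the versions listed above, a short check along the following lines may help. It is only a sketch: report_versions is a hypothetical helper, it reads each module's __version__ attribute where one exists, and it does not decide whether a different version is compatible.

```python
import importlib

# Module names and versions as listed above; no version is listed for community.
LISTED = [
    ('numpy', '1.12.1'),
    ('scipy', '0.19.0'),
    ('matplotlib', '1.5.3'),
    ('sklearn', '0.18.1'),
    ('networkx', '1.11'),
    ('community', None),
]


def report_versions(listed=LISTED):
    """Return {module: (listed_version, installed_version)}.

    installed_version is None when the module cannot be imported and
    'unknown' when the module exposes no __version__ attribute.
    """
    report = {}
    for name, wanted in listed:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = (wanted, None)
            continue
        report[name] = (wanted, getattr(module, '__version__', 'unknown'))
    return report


for name, (wanted, found) in sorted(report_versions().items()):
    installed = 'missing' if found is None else found
    print('%-12s listed: %-8s installed: %s' % (name, wanted or '-', installed))
```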

Instructions

DendroSplit can be installed via pip:

pip install dendrosplit

Import DendroSplit by adding the following line of code to your Python script:

from dendrosplit import split, merge, utils

A tutorial for using the main DendroSplit functions is given in the tutorial Jupyter notebook. Please refer to the Jupyter notebooks used to generate the figures in the paper for more examples.

License

DendroSplit is licensed and distributed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.

Method

[Figure: overview of the DendroSplit method]
