A series of analyses related to species compendia

Exploring species compendia

Once reaches production, we will periodically release compendia composed of all the samples from a species that we were able to process. We refer to these as species compendia and envision that these collections will be useful for extracting features from a diverse set of biological conditions. Creating a species compendia pipeline required us to tackle various problems, such as selecting a method for imputing missing values. This repository holds a series of analyses related to species compendia, divided into related modules. See the README files in the individual directories for more information.


  • select_imputation_method - A series of experiments/evaluations for selecting a method for imputing missing values.
  • human_missingness - Typically, genes that are measured in fewer than 30% of samples are removed before imputing missing values in gene expression data. How many genes would be left in the human compendium using this cutoff?
  • impute_requirements - How long does it take to run KNN impute? (Too long for our use case.)
  • quality_check - Exploring test zebrafish compendia.
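
As an illustration of the 30% cutoff described under human_missingness, here is a minimal Python sketch, not the code used in this repository. The function names (`measured_fraction`, `filter_genes`), the toy gene labels, and the dictionary layout (gene mapped to one value per sample, with `None` or NaN marking a missing measurement) are all assumptions made for this example.

```python
import math


def measured_fraction(values):
    """Return the fraction of samples in which a gene has a non-missing value."""
    if not values:
        return 0.0
    measured = sum(1 for v in values if v is not None and not math.isnan(v))
    return measured / len(values)


def filter_genes(expression, min_fraction=0.30):
    """Keep genes measured in at least `min_fraction` of samples.

    Genes measured in fewer than 30% of samples are dropped by default,
    matching the cutoff mentioned above.
    """
    return {
        gene: values
        for gene, values in expression.items()
        if measured_fraction(values) >= min_fraction
    }


# Hypothetical toy matrix: gene -> one value per sample (None = not measured).
toy = {
    "GENE_A": [1.2, 0.8, None, 2.1, 1.9],    # measured in 4 of 5 samples
    "GENE_B": [None, None, None, None, 0.4],  # measured in 1 of 5 samples
    "GENE_C": [None, 3.3, None, None, 2.7],   # measured in 2 of 5 samples
}

kept = filter_genes(toy)
print(sorted(kept))  # ['GENE_A', 'GENE_C']
```

Counting `len(kept)` on a real expression matrix would answer the question posed above: how many genes survive the cutoff before imputation.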