A series of analyses related to refine.bio species compendia

Exploring refine.bio species compendia

Once refine.bio reaches production, we will periodically release compendia composed of all the samples from a species that we were able to process. We refer to these as species compendia and envision that these collections will be useful for extracting features from a diverse set of biological conditions. Creating a species compendium pipeline required us to tackle various problems, such as selecting a method for imputing missing values. This repository holds a series of analyses related to refine.bio species compendia, divided into related modules. See the README files in the individual directories for more information.

Modules

  • select_imputation_method - A series of experiments/evaluations for selecting a method for imputing missing values.
  • human_missingness - Typically, genes that are measured in fewer than 30% of samples are removed before imputing missing values in gene expression data. How many genes would be left in the human compendium using this cutoff?
  • impute_requirements - How long does it take to run KNN impute? (Too long for our use case.)
  • quality_check - Exploring test zebrafish compendia.
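
The 30% cutoff mentioned under human_missingness can be illustrated with a minimal sketch. This is not the code used in that module. The function names (`measured_fraction`, `filter_genes`), the dictionary layout, and the toy values are all hypothetical, and the sketch assumes missing measurements are encoded as NaN.

```python
import math


def measured_fraction(values):
    """Return the fraction of samples in which a gene has a non-missing value."""
    if not values:
        return 0.0
    measured = sum(1 for v in values if v is not None and not math.isnan(v))
    return measured / len(values)


def filter_genes(expression, min_fraction=0.30):
    """Keep only genes measured in at least `min_fraction` of samples.

    `expression` maps a gene identifier to a list with one value per sample
    (a hypothetical layout chosen for illustration).
    """
    return {
        gene: values
        for gene, values in expression.items()
        if measured_fraction(values) >= min_fraction
    }


# Toy data (made up): three genes across five samples; NaN marks a missing value.
nan = math.nan
toy = {
    "GENE_A": [1.2, 0.8, 2.1, 1.7, 0.9],  # measured in 5 of 5 samples
    "GENE_B": [nan, nan, 3.0, nan, 2.2],  # measured in 2 of 5 samples
    "GENE_C": [nan, nan, nan, nan, 0.5],  # measured in 1 of 5 samples
}

kept = filter_genes(toy)
print(sorted(kept))  # ['GENE_A', 'GENE_B']
```

Counting `len(kept)` on a real gene-by-sample matrix would answer the question posed in that module. The threshold is exposed as a parameter so that other cutoffs can be compared.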