DarwinAwardWinner/shoal

 
 

shoal

Improved multi-sample transcript abundance estimates using adaptive priors

Image: a shoal [1]

What is shoal?

shoal is a tool that jointly quantifies transcript abundances across multiple samples. Specifically, shoal learns an empirical prior on transcript-level abundances across all of the samples in an experiment, and then uses a variant of the variational Bayesian expectation-maximization (VBEM) algorithm to apply this prior adaptively across multi-mapping groups of reads.
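To make the idea of a prior acting on multi-mapping reads more concrete, here is a toy sketch. It is not shoal's algorithm: it swaps in plain EM with pseudo-counts for the VBEM variant, ignores effective lengths and equivalence-class weights, and applies the prior uniformly rather than adaptively. The function names, the `prior_weight` parameter, and the representation of equivalence classes as `(transcript_ids, read_count)` pairs are all assumptions made for illustration.

```python
def empirical_prior(per_sample_theta):
    """Average per-sample abundance fractions into one prior vector (assumed scheme)."""
    n_samples = len(per_sample_theta)
    n_tx = len(per_sample_theta[0])
    return [sum(s[t] for s in per_sample_theta) / n_samples for t in range(n_tx)]


def em_with_prior(eq_classes, n_tx, prior, prior_weight=1.0, n_iter=200):
    """Toy EM over equivalence classes with prior pseudo-counts.

    eq_classes: list of (tuple_of_transcript_ids, read_count) pairs.
    prior: per-transcript prior fractions; prior_weight scales them into pseudo-counts.
    Returns estimated abundance fractions that sum to 1.
    """
    theta = [1.0 / n_tx] * n_tx
    for _ in range(n_iter):
        # M-step accumulator starts from the prior pseudo-counts.
        alloc = [prior_weight * p for p in prior]
        # E-step: split each class's reads in proportion to current abundances.
        for txs, count in eq_classes:
            denom = sum(theta[t] for t in txs)
            if denom == 0.0:
                continue
            for t in txs:
                alloc[t] += count * theta[t] / denom
        norm = sum(alloc)
        theta = [a / norm for a in alloc]
    return theta
```

With a single class of reads that map equally well to two transcripts, the data alone cannot separate them, so any non-zero `prior_weight` pulls the estimate toward whichever transcript the prior favors. That is the intuition the paragraph above describes, under the simplifications stated.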

shoal can increase quantification accuracy and inter-sample consistency, and reduce false positives in downstream differential analysis, when applied to multi-condition RNA-seq experiments. Moreover, shoal runs downstream of Salmon and requires less than a minute per sample to re-estimate transcript abundances while accounting for the learned empirical prior.

shoal is designed and developed by Avi Srivastava, Michael Love and Rob Patro.

Using shoal

shoal requires Salmon output for every sample in the experiment, with each sample quantified separately using the latest version of Salmon (either build it from the develop branch of the Salmon repo, or grab a pre-compiled binary for Linux from here). Please run Salmon with the --dumpEqWeights option, which will produce output suitable for shoal.
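As a rough sketch of the Salmon step, the helper below builds one `salmon quant` command per sample with --dumpEqWeights included. The index path, the FASTQ naming scheme (`<sample>_1.fastq.gz` / `<sample>_2.fastq.gz`), the paired-end assumption, and the thread count are all hypothetical; only the --dumpEqWeights requirement comes from the text above.

```python
from pathlib import Path


def salmon_command(index, reads_1, reads_2, out_dir, threads=8):
    """Build a `salmon quant` argument list for one paired-end sample."""
    return [
        "salmon", "quant",
        "-i", str(index),      # Salmon index directory (hypothetical path)
        "-l", "A",             # let Salmon infer the library type
        "-1", str(reads_1),
        "-2", str(reads_2),
        "-p", str(threads),
        "--dumpEqWeights",     # needed so shoal can use the equivalence-class weights
        "-o", str(out_dir),
    ]


def commands_for_experiment(index, fastq_dir, quant_root, samples):
    """Return {sample: command}, one output subdirectory per sample under quant_root."""
    fastq_dir, quant_root = Path(fastq_dir), Path(quant_root)
    return {
        s: salmon_command(
            index,
            fastq_dir / f"{s}_1.fastq.gz",   # assumed naming scheme
            fastq_dir / f"{s}_2.fastq.gz",
            quant_root / s,
        )
        for s in samples
    }
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. Writing every sample into its own subdirectory of a single root keeps the results in the layout that the shoal script expects.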

  • Clone shoal onto your local machine:
git clone https://github.com/COMBINE-lab/shoal.git
  • Run shoal [2]:
./run_shoal.sh -q <salmon_quant_directory_path> -o <output_directory_path>

This script assumes that all of the Salmon quantification directories are subdirectories of the path that you provide via the -q option. For example, if you have an experiment with six samples across two conditions (say, A{1,2,3} and B{1,2,3}), then the shoal script would expect a layout like:

exp_quants
  |
  |--- A1
  |    |
  |    |--- quant.sf
  |--- A2
  |    |
  |    |--- quant.sf
  |--- A3
  |    |
  |    |--- quant.sf
  |--- B1
  |    |
  |    |--- quant.sf
  |--- B2
  |    |
  |    |--- quant.sf
  |--- B3
       |
       |--- quant.sf

The script would then be invoked by passing -q exp_quants to provide the top-level quantification directory for the entire experiment. Specifically, a command like ./run_shoal.sh -q exp_quants -o exp_shoal_quants would produce a modified (Salmon-format) quantification file for each of the samples ({A,B}{1,2,3}) in the directory exp_shoal_quants, as described below (the script will create the output directory if it does not already exist).
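Before invoking the script, it can help to confirm that the quantification directory matches the layout above. The following is a minimal sketch; `find_samples` is a hypothetical helper and not part of shoal. It only checks for quant.sf, the one file shown in the layout. Whether shoal reads other Salmon output files is not covered here.

```python
from pathlib import Path


def find_samples(quant_root):
    """Split subdirectories of quant_root into those with and without quant.sf."""
    ready, missing = [], []
    for sub in sorted(p for p in Path(quant_root).iterdir() if p.is_dir()):
        if (sub / "quant.sf").is_file():
            ready.append(sub.name)
        else:
            missing.append(sub.name)
    return ready, missing
```

For the example experiment, `find_samples("exp_quants")` should list all six samples as ready and none as missing. Any name in the second list points to a sample whose Salmon run is absent or incomplete.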

  • shoal output:
    -- shoal generates one .sf file per sample in the experiment, named as follows:
<output_directory>/<sample_name>_adapt.sf
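The re-estimated abundances can then be gathered for downstream analysis. This sketch assumes the _adapt.sf files keep the usual tab-separated Salmon quant.sf header with `Name` and `TPM` columns; the text describes the output as Salmon-format, but the exact columns are an assumption, and both function names are hypothetical.

```python
import csv
from pathlib import Path


def read_tpm(sf_path):
    """Read one Salmon-format .sf file into {transcript_name: TPM}."""
    with open(sf_path, newline="") as fh:
        reader = csv.DictReader(fh, delimiter="\t")
        return {row["Name"]: float(row["TPM"]) for row in reader}


def collect_shoal_tpm(output_dir):
    """Map each sample name to its TPM table, using the <sample_name>_adapt.sf convention."""
    suffix = "_adapt.sf"
    tables = {}
    for sf in sorted(Path(output_dir).glob("*" + suffix)):
        sample = sf.name[: -len(suffix)]   # strip the suffix to recover the sample name
        tables[sample] = read_tpm(sf)
    return tables
```

Calling `collect_shoal_tpm("exp_shoal_quants")` on the example experiment would return one entry per sample, keyed by the sample name taken from the file name.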

Footnotes:

[1] This image is from the Wikipedia article on shoaling. It is licensed under CC-BY-SA.

[2] The shell script can be given executable permission with the command: chmod +x run_shoal.sh
