Personalized cancer epitope discovery and peptide vaccine prediction pipeline
Latest commit 0e768f4 (Feb 16, 2017) by @ihodes: Merge pull request #155 from hammerlab/refactor (Refactor & improve docs)


Epidisco is a highly configurable genomic pipeline. It supports alignment, the GATK, variant calling, epitope discovery, and vaccine generation.

It uses Biokepi to construct Ketrew workflows, which can use Torque, YARN, and even Kubernetes on Google Cloud (via Coclobas) to schedule on many kinds of clusters.

Pipeline Overview

Note on Multiple Samples

You can pass multiple samples into Epidisco, but they will be merged into a single sample per type (tumor, normal, or tumor RNA) after the alignment and mark-duplicates step. Use this option only to pass in data from, e.g., biological replicates (or samples you wish to treat as such): the pipeline fundamentally operates on one tumor, normal, and tumor RNA sample set.
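The merging behaviour can be pictured with a small OCaml sketch. It is illustrative only: the types `role` and `aligned` and the function `merge_by_role` are hypothetical names invented for this example, not part of Epidisco's or Biokepi's API, and the file names are placeholders.

```ocaml
(* Illustrative only: these types and names are hypothetical,
   not Epidisco's or Biokepi's API. *)
type role = Normal | Tumor | Tumor_rna

(* An aligned, duplicate-marked input, identified here by a file name. *)
type aligned = { role : role; bam : string }

(* Collect the BAMs belonging to one role, preserving input order. *)
let bams_for role inputs =
  List.filter_map
    (fun a -> if a.role = role then Some a.bam else None)
    inputs

(* Group all inputs into exactly one entry per role, so that downstream
   steps see a single tumor, normal, and tumor RNA sample. *)
let merge_by_role inputs =
  List.map (fun r -> (r, bams_for r inputs)) [ Normal; Tumor; Tumor_rna ]

(* Two tumor replicates end up in the same merged tumor sample. *)
let merged =
  merge_by_role
    [ { role = Tumor; bam = "tumor_rep1.bam" };
      { role = Normal; bam = "normal.bam" };
      { role = Tumor; bam = "tumor_rep2.bam" };
      { role = Tumor_rna; bam = "rna.bam" } ]
```

However many inputs are supplied, the result always has three entries, which is why unrelated samples should not be passed together: they would be treated as one.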


The easiest way to get started with Epidisco is to set up a GCloud cluster following these instructions, which also cover how to submit an Epidisco job.

Once compiled, epidisco --help provides extensive instructions on how to invoke the pipeline.

Advanced Usage

For more advanced uses, you can build Epidisco with omake, and then run it using an OCaml script like the following (calling it, say,

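(* Load findlib, then the Epidisco library. *)
#use "topfind";;
#require "epidisco";;

(* Load the file expected to define `biokepi_machine`. *)
#use "./";;

(* Hand the machine configuration to Epidisco's command-line entry point. *)
let () =
  Epidisco.Command_line.main ~biokepi_machine ()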

Call it with ocaml to see the possible options.