systemPipeR: NGS workflow and report generation environment
systemPipeR is an R/Bioconductor package for building and running automated end-to-end analysis workflows for a wide range of next-generation sequencing (NGS) applications such as RNA-Seq, ChIP-Seq, VAR-Seq and Ribo-Seq. Important features include a uniform workflow interface across different NGS applications, automated report generation, and support for running both R and command-line software, such as NGS aligners or peak/variant callers, on local computers or compute clusters. On clusters, both interactive job submissions and batch submissions to queuing systems are supported. Efficient handling of complex sample sets and experimental designs is facilitated by a well-defined sample annotation infrastructure, which improves the reproducibility and user-friendliness of many typical NGS analysis workflows.
To install the package, please use the biocLite method as instructed here.
To obtain the most recent updates immediately, one can install it directly from GitHub as follows:
source("http://bioconductor.org/biocLite.R")
biocLite("tgirke/systemPipeR", build_vignettes=TRUE, dependencies=TRUE)
Instructions for running systemPipeR are given in its main vignette (manual). The sample data used in the vignette are provided by the data package systemPipeRdata.
The expected format for defining NGS samples (e.g. FASTQ files) and their labels is given in targets.txt and targetsPE.txt (the latter is for paired-end reads). The run parameters of command-line software are defined by param files that have a simplified JSON-like name/value structure. Here is a sample param file for Tophat2: tophat.param.
Templates for setting up custom project reports are provided by systemPipeRdata. The corresponding PDFs of these report templates are linked here: systemPipeRNAseq, systemPipeRIBOseq, systemPipeChIPseq and systemPipeVARseq.
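To give a rough idea of what a targets file like the ones mentioned above looks like in practice, here is a minimal sketch in base R. It assumes a tab-delimited table with optional header lines starting with "#". The column names (FileName, SampleName, Factor), the file paths and the sample labels are illustrative assumptions, not copied from the package; see targets.txt itself for the authoritative layout.

```r
# Build a tiny, hypothetical targets file in a temporary location.
# Column names and values are assumptions chosen for illustration only.
targets_lines <- c(
  "# Example project (lines starting with # are treated as comments)",
  "FileName\tSampleName\tFactor",
  "./data/sample1.fastq.gz\tM1A\tM1",
  "./data/sample2.fastq.gz\tM1B\tM1",
  "./data/sample3.fastq.gz\tA1A\tA1"
)
targets_path <- tempfile(fileext = ".txt")
writeLines(targets_lines, targets_path)

# Read the tab-delimited table, skipping the "#" comment lines.
targets <- read.delim(targets_path, comment.char = "#",
                      stringsAsFactors = FALSE)

# Sanity check: every sample label should be unique.
stopifnot(!anyDuplicated(targets$SampleName))

# Number of samples per experimental factor.
table(targets$Factor)
```

Reading the file with plain read.delim keeps the sketch independent of systemPipeR itself, so the sample annotation can be inspected before any workflow is run.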