remiolsen/radQC

radseqQC

Pipeline for QC of RAD-seq data

Introduction

A work-in-progress pipeline for QC of RAD-seq data generated by the National Genomics Infrastructure in Stockholm. The pipeline was primarily made for an in-house RAD-seq protocol that resembles ezRAD; other flavours of RAD-seq might work as long as Stacks supports them. RAD-seq allows for deep yet sparsely sampled sequencing of many individuals in a highly multiplexed manner. Typical applications include QTL mapping, GWAS, high-resolution population differentiation/phylogeny, pedigree reconstruction and SNP discovery for other, higher-throughput assays. The steps below are designed only to characterize the data and attempt to correct defects (e.g. adapter contamination and restriction-site sequencing errors) ahead of downstream analysis, not to draw any biologically relevant conclusions:

  • FastQC
  • Adapter trimming using Trimmomatic
  • Stacks de novo pipeline with the most lenient parameters
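
As a rough illustration of the three steps above, the following sketch builds (but does not run) one command line per tool for a single sample. It is not taken from the pipeline's own Nextflow code: the function name `qc_commands`, the single-end layout, the file names and the parameter values are all assumptions made for the example. In particular, the `-M` / `-n` values are placeholders, since the "most lenient" Stacks settings the pipeline actually uses are not stated here.

```python
from pathlib import PurePosixPath


def qc_commands(fastq, adapters, popmap, outdir, threads=4, big_m=2, small_n=2):
    """Build the FastQC, Trimmomatic and Stacks command lines for one FASTQ file.

    Hypothetical helper for illustration only; every default is a placeholder.
    """
    fastq, outdir = PurePosixPath(fastq), PurePosixPath(outdir)
    trimmed_dir = outdir / "trimmed"
    trimmed = trimmed_dir / fastq.name

    # 1. FastQC report on the raw reads (the output directory must already exist).
    fastqc = ["fastqc", "--outdir", str(outdir / "fastqc"), str(fastq)]

    # 2. Adapter trimming in single-end mode (an assumption; the real data may
    #    be paired-end). 2:30:10 are example ILLUMINACLIP settings, not
    #    necessarily the pipeline's.
    trimmomatic = [
        "trimmomatic", "SE", "-threads", str(threads),
        str(fastq), str(trimmed),
        f"ILLUMINACLIP:{adapters}:2:30:10",
    ]

    # 3. Stacks de novo assembly over the directory of trimmed reads.
    #    -M and -n are placeholder values, not the pipeline's settings.
    stacks = [
        "denovo_map.pl",
        "--samples", str(trimmed_dir),
        "--popmap", str(popmap),
        "-o", str(outdir / "stacks"),
        "-M", str(big_m), "-n", str(small_n),
        "-T", str(threads),
    ]
    return [fastqc, trimmomatic, stacks]
```

Each returned list can be printed with `shlex.join` for inspection or passed to `subprocess.run` once the tools are installed. In practice the pipeline handles these steps itself, so this is only meant to make the order of operations concrete.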

The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It comes with Docker / Singularity containers, which make installation trivial and results highly reproducible.

Documentation

The radseqQC pipeline documentation is found in the docs/ directory:

  1. Installation
  2. Pipeline configuration
  3. Running the pipeline
  4. Output and how to interpret the results
  5. Troubleshooting

Credits

This pipeline was written by Remi-Andre Olsen (remiolsen) at SciLifeLab.