A fully reproducible and state-of-the-art ancient DNA analysis pipeline.
Latest commit bc55df3 Mar 6, 2019

nf-core/eager

Introduction

nf-core/eager is a bioinformatics best-practice analysis pipeline for NGS-based ancient DNA (aDNA) data analysis.

The pipeline is built with Nextflow, a bioinformatics workflow tool. It pre-processes raw data from FASTQ inputs, aligns the reads, and performs extensive general NGS and aDNA-specific quality control on the results. It ships with Docker and Singularity containers and a conda environment, making installation trivial and results highly reproducible.

Pipeline steps

By default the pipeline currently performs the following:

  • Create reference genome indices for mapping (bwa, samtools, and picard)
  • Sequencing quality control (FastQC)
  • Sequencing adapter removal and, for paired-end data, read merging (AdapterRemoval)
  • Read mapping to the reference (bwa aln, bwa mem or CircularMapper)
  • Post-mapping processing, statistics and conversion to BAM (samtools)
  • Ancient DNA C-to-T damage pattern visualisation (DamageProfiler)
  • PCR duplicate removal (DeDup or MarkDuplicates)
  • Post-mapping statistics and BAM quality control (Qualimap)
  • Library complexity estimation (preseq)
  • Overall pipeline statistics summaries (MultiQC)

The pipeline currently also offers the following additional functionality:

  • Poly-G tail removal for data from Illumina two-colour sequencers (fastp)
  • Automatic conversion of unmapped reads to FASTQ (samtools)
  • Damage removal/clipping for UDG+/UDG-half treatment protocols (BamUtil)
  • Damaged read extraction and assessment (PMDTools)
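
The first default step, reference indexing, runs inside the pipeline and needs no manual action. As a rough illustration of what that step involves, the sketch below prints the usual standalone invocations of the three tools for a given FASTA file. The helper name `print_index_cmds` and the file name `reference.fasta` are hypothetical, and the exact commands and options the pipeline itself runs may differ.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) typical standalone indexing commands.
# print_index_cmds is a hypothetical helper, not part of nf-core/eager.

print_index_cmds() {
    fasta="$1"
    dict="${fasta%.*}.dict"   # e.g. reference.fasta -> reference.dict

    echo "bwa index $fasta"                                  # index used by bwa aln / bwa mem
    echo "samtools faidx $fasta"                             # FASTA index (.fai)
    echo "picard CreateSequenceDictionary R=$fasta O=$dict"  # sequence dictionary (.dict)
}

print_index_cmds reference.fasta
```

Because the function only prints the commands, it can be run without any of the tools installed.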

Quick Start

  1. Install nextflow

  2. Install one of docker, singularity or conda

  3. Download the EAGER pipeline

nextflow pull nf-core/eager

  4. Test the pipeline using the provided test data

nextflow run nf-core/eager -profile <docker/singularity/conda>,test --pairedEnd

  5. Start running your own ancient DNA analysis!

nextflow run nf-core/eager -profile <docker/singularity/conda> --reads '*_R{1,2}.fastq.gz' --fasta '<REFERENCE>.fasta'
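
The final command has three parts to fill in: the profile, the read pattern and the reference. The sketch below assembles it from those three values and prints it instead of running it, so it can be checked before a real run. The helper name `eager_cmd` and the file name `reference.fasta` are hypothetical. The read pattern is kept in single quotes so that the shell passes it to Nextflow unexpanded.

```shell
#!/bin/sh
# Dry-run sketch: build the Quick Start command and print it.
# eager_cmd is a hypothetical helper, not part of nf-core/eager.

eager_cmd() {
    profile="$1"   # one of: docker, singularity, conda
    reads="$2"     # read pattern, e.g. '*_R{1,2}.fastq.gz'
    fasta="$3"     # reference genome in FASTA format

    # Reject anything other than the three profiles named in the Quick Start.
    case "$profile" in
        docker|singularity|conda) ;;
        *) echo "unknown profile: $profile" >&2; return 1 ;;
    esac

    # Single quotes in the printed command keep the pattern literal for the shell.
    printf "nextflow run nf-core/eager -profile %s --reads '%s' --fasta '%s'\n" \
        "$profile" "$reads" "$fasta"
}

eager_cmd docker '*_R{1,2}.fastq.gz' 'reference.fasta'
```

Once the printed line looks right, it can be pasted into the terminal to start the run.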

NB. You can see an overview of the run in the MultiQC report located at <OUTPUT_DIR>/MultiQC/multiqc_report.html
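
As a small convenience, the report location given above can be resolved and checked from the shell. This is only a sketch: the helper name `find_multiqc_report` and the output directory name `results` are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: resolve the MultiQC report path for a given output directory.
# find_multiqc_report is a hypothetical helper, not part of nf-core/eager.

find_multiqc_report() {
    outdir="${1%/}"                                # tolerate a trailing slash
    report="$outdir/MultiQC/multiqc_report.html"   # layout taken from the note above
    if [ -f "$report" ]; then
        echo "$report"
    else
        echo "no MultiQC report at $report (did the run finish?)" >&2
        return 1
    fi
}

# Example call; "results" is an assumed output directory name.
find_multiqc_report results || true
```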

The default pipeline can be modified with the options described in the documentation.

Documentation

The nf-core/eager pipeline comes with documentation about the pipeline, found in the docs/ directory:

  1. Installation
  2. Pipeline configuration
  3. Running the pipeline
  4. Output and how to interpret the results
  5. Troubleshooting

Credits

This pipeline was written by Alexander Peltzer (apeltzer), with major contributions from Stephen Clayton, and ideas and documentation from James Fellows Yates, Raphael Eisenhofer and Judith Neukamm. If you want to contribute, please open an issue and ask to be added to the project. I am happy to do so, and everyone is welcome to contribute here!

Contributors

If you've contributed and are missing from this list, please let me know and I'll add you.

Tool References
