UNDER CONSTRUCTION: MNase-seq analysis pipeline using BWA and DANPOS2.
Latest commit 7cb1a10 Jun 27, 2019


nf-core/mnaseseq



Introduction

nf-core/mnaseseq is a bioinformatics analysis pipeline for DNA sequencing data obtained via micrococcal nuclease (MNase) digestion.

The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It comes with Docker containers, making installation trivial and results highly reproducible.

Pipeline summary

  1. Raw read QC (FastQC)
  2. Adapter trimming (Trim Galore!)
  3. Alignment (BWA)
  4. Mark duplicates (picard)
  5. Merge alignments from multiple libraries of the same sample (picard)
    1. Re-mark duplicates (picard)
    2. Filtering to remove:
      • reads mapping to blacklisted regions (SAMtools, BEDTools)
      • reads that are marked as duplicates (SAMtools)
      • reads that aren't marked as primary alignments (SAMtools)
      • reads that are unmapped (SAMtools)
      • reads that map to multiple locations (SAMtools)
      • reads containing > 4 mismatches (BAMTools)
      • reads that are soft-clipped (BAMTools)
      • reads that have an insert size outside the specified range (BAMTools; paired-end only)
      • reads that map to different chromosomes (Pysam; paired-end only)
      • reads that aren't in FR orientation (Pysam; paired-end only)
      • reads where only one read of the pair fails the above criteria (Pysam; paired-end only)
    3. Alignment-level QC and estimation of library complexity (picard, Preseq)
    4. Create normalised bigWig files scaled to 1 million mapped reads (BEDTools, bedGraphToBigWig)
    5. Calculate genome-wide coverage assessment (deepTools)
    6. Call nucleosome positions and generate smoothed, normalised coverage bigWig files that can be used to generate occupancy profile plots between samples across features of interest (DANPOS2)
    7. Generate gene-body meta-profile from DANPOS2 smoothed bigWig files (deepTools)
  6. Create IGV session file containing bigWig tracks for data visualisation (IGV)
  7. Present QC for raw read and alignment results (MultiQC)
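
As a rough illustration of the read-level filtering and bigWig scaling described in step 5, the Python sketch below applies several of the listed criteria to a simplified read record. It is not the pipeline's own code: the `Read` class, the function names and the `min_mapq` threshold (used here as a stand-in for "maps to multiple locations") are assumptions made for this example. Only the flag bit values come from the SAM specification. Blacklist and insert-size filtering are left out.

```python
from dataclasses import dataclass

# Flag bits defined by the SAM specification.
FLAG_PAIRED = 0x1
FLAG_UNMAPPED = 0x4
FLAG_REVERSE = 0x10
FLAG_MATE_REVERSE = 0x20
FLAG_SECONDARY = 0x100
FLAG_DUPLICATE = 0x400


@dataclass
class Read:
    """Hypothetical, simplified stand-in for one alignment record."""
    flag: int = 0
    chrom: str = "chr1"
    pos: int = 100
    mate_chrom: str = "chr1"
    mate_pos: int = 300
    mapq: int = 60
    mismatches: int = 0
    soft_clipped: bool = False


def is_fr_pair(read: Read) -> bool:
    """Simplified FR test: mates on opposite strands, forward mate leftmost."""
    reverse = bool(read.flag & FLAG_REVERSE)
    mate_reverse = bool(read.flag & FLAG_MATE_REVERSE)
    if reverse == mate_reverse:
        return False
    if reverse:
        forward_pos, reverse_pos = read.mate_pos, read.pos
    else:
        forward_pos, reverse_pos = read.pos, read.mate_pos
    return forward_pos <= reverse_pos


def should_remove(read: Read, max_mismatches: int = 4, min_mapq: int = 1) -> bool:
    """Return True if the read fails any of the sketched criteria."""
    # Unmapped, not a primary alignment, or marked as a duplicate.
    if read.flag & (FLAG_UNMAPPED | FLAG_SECONDARY | FLAG_DUPLICATE):
        return True
    # Assumption: low mapping quality is used as a proxy for multi-mapping.
    if read.mapq < min_mapq:
        return True
    if read.mismatches > max_mismatches or read.soft_clipped:
        return True
    # Paired-end only checks.
    if read.flag & FLAG_PAIRED:
        if read.chrom != read.mate_chrom or not is_fr_pair(read):
            return True
    return False


def scale_factor(mapped_reads: int, target: int = 1_000_000) -> float:
    """Factor that scales coverage to `target` mapped reads."""
    if mapped_reads <= 0:
        raise ValueError("mapped_reads must be positive")
    return target / mapped_reads
```

Under these assumptions, a library with 2,000,000 mapped reads would have its coverage multiplied by 0.5 before the bigWig file is written.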

Quick Start

i. Install nextflow

ii. Install one of docker, singularity or conda

iii. Download the pipeline and test it on a minimal dataset with a single command

nextflow run nf-core/mnaseseq -profile test,<docker/singularity/conda>

iv. Start running your own analysis!

nextflow run nf-core/mnaseseq -profile <docker/singularity/conda> --design design.csv --genome GRCh37

See usage docs for all of the available options when running the pipeline.
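
When launching the pipeline from a script, it can help to assemble the command programmatically. The Python sketch below builds the two commands shown above. The helper `build_command` and its parameters are hypothetical names for this example, and only the flags that appear in this README (`-profile`, `--design`, `--genome`) are used.

```python
import shlex

# Software-packaging profiles named in the Quick Start.
ALLOWED_PROFILES = ("docker", "singularity", "conda")


def build_command(profile, design=None, genome=None, test=False):
    """Return the nextflow invocation as an argument list (hypothetical helper)."""
    if profile not in ALLOWED_PROFILES:
        raise ValueError(f"profile must be one of {ALLOWED_PROFILES}")
    # The test run combines the 'test' profile with the chosen one.
    profiles = f"test,{profile}" if test else profile
    cmd = ["nextflow", "run", "nf-core/mnaseseq", "-profile", profiles]
    if design is not None:
        cmd += ["--design", design]
    if genome is not None:
        cmd += ["--genome", genome]
    return cmd


if __name__ == "__main__":
    # Print the command for inspection instead of running it.
    print(shlex.join(build_command("docker", design="design.csv", genome="GRCh37")))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues if a design file path contains spaces.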

Documentation

The nf-core/mnaseseq pipeline comes with documentation about the pipeline, found in the docs/ directory:

  1. Installation
  2. Pipeline configuration
  3. Running the pipeline
  4. Output and how to interpret the results
  5. Troubleshooting

Credits

The pipeline was originally written by The Bioinformatics & Biostatistics Group for use at The Francis Crick Institute, London.

The pipeline was developed by Harshil Patel.

Citation

You can cite the nf-core pre-print as follows:
Ewels PA, Peltzer A, Fillinger S, Alneberg JA, Patel H, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. nf-core: Community curated bioinformatics pipelines. bioRxiv. 2019. p. 610741. doi: 10.1101/610741.
