
EpiDiverse-WGBS Pipeline


EpiDiverse/wgbs is a bioinformatics analysis pipeline for aligning whole-genome bisulfite sequencing (WGBS) data from non-model plant species.

The workflow pre-processes raw reads from FastQ inputs (FastQC, cutadapt), aligns them (erne-bs5 or segemehl), and performs extensive quality control on the alignments using custom scripts and Picard MarkDuplicates. Methylation calling and M-bias correction are performed with MethylDackel.

See the output documentation for more details of the results.

The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It comes with Docker containers, making installation trivial and results highly reproducible.

Quick Start

  1. Install nextflow

  2. Install one of Docker, Singularity or Conda

  3. Start running your own analysis!

NXF_VER=20.07.1 nextflow run epidiverse/wgbs -profile <docker|singularity|conda> \
--input /path/to/reads/directory --reference /path/to/reference.fasta

See the usage documentation for all of the available options when running the pipeline.
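
When the same command is launched repeatedly with different inputs, it can help to assemble it in one place and reject a mistyped profile before Nextflow starts. The following is a minimal sketch only: the function name `wgbs_cmd` and its argument order are hypothetical and not part of the pipeline, and it uses only the options shown in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the pipeline): prints the Quick Start
# command for a given profile, reads directory and reference FASTA.
wgbs_cmd() {
  local profile="$1" input="$2" reference="$3"

  # Accept only the three profiles named in the Quick Start section.
  case "$profile" in
    docker|singularity|conda) ;;
    *) echo "unknown profile: $profile" >&2; return 1 ;;
  esac

  # Print the command rather than running it, so it can be inspected first.
  printf 'NXF_VER=20.07.1 nextflow run epidiverse/wgbs -profile %s --input %s --reference %s\n' \
    "$profile" "$input" "$reference"
}
```

Calling `wgbs_cmd docker /path/to/reads/directory /path/to/reference.fasta` prints the command for review; once it looks right, it can be pasted into the terminal or a job script. Paths containing spaces would need additional quoting, which this sketch does not handle.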

Test data

A minimal example dataset for testing purposes can be found in the EpiDiverse/datasets repository. You can either download the files manually and run the pipeline as shown above, or run it directly with the test profile, which downloads the data automatically:

NXF_VER=20.07.1 nextflow run epidiverse/wgbs -profile test,<docker|singularity|conda>
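
The second half of the `-profile` value depends on which software environment is installed on the machine. As a rough sketch, the choice can be automated by checking which tool is on the `PATH`. The function name `detect_profile` is hypothetical, and the preference order (docker, then singularity, then conda) is an assumption made here for illustration, not a recommendation from the pipeline authors.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the pipeline): prints the name of the
# first tool found on PATH, matching the -profile values used in this README.
# The docker > singularity > conda order is an assumption of this sketch.
detect_profile() {
  local tool
  for tool in docker singularity conda; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      return 0
    fi
  done
  # None of the three tools is installed.
  echo "no docker, singularity or conda found on PATH" >&2
  return 1
}
```

With this in place, the test run could be started as `NXF_VER=20.07.1 nextflow run epidiverse/wgbs -profile "test,$(detect_profile)"`. Note that the check only confirms the executable exists, not that, for example, the Docker daemon is running.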

Wiki Documentation

The EpiDiverse/wgbs pipeline is part of the EpiDiverse Toolkit, a best-practice suite of tools intended for the study of Ecological Plant Epigenetics. Links to general guidelines and pipeline-specific documentation can be found below:

  1. Installation
  2. Pipeline configuration
  3. Running the pipeline
  4. Understanding the results
  5. Runtime and memory usage guidelines
  6. Troubleshooting

Credits

These scripts were originally written for use by the EpiDiverse European Training Network, by Adam Nunn (@bio15anu) and Nilay Can (@nilaycan).

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 764965.

Citation

If you use epidiverse/wgbs for your analysis, please cite it using the following doi:
