DMRworkflow: automating differential methylation analysis of DNA methylation sequencing data

This workflow is being developed as part of a graduate research project to analyze whole genome bisulfite sequencing (WGBS) data from the Sequence Read Archive (SRA).

DMRworkflow uses Snakemake workflow management in conjunction with Slurm workload manager to improve the automation and reproducibility of genomic data analysis on an HPC.

Analysis steps included in pipeline

The snakemake workflow includes a subworkflow component to download fastq files from SRA, run QC checks, trimming, alignment, and methylation calling. For human WGBS data analysis, these subworkflow steps require compute resources that are generally not available locally so the subworkflow has been configured to run on an HPC cluster. The output files from the subworkflow are portable .beta files that contain the methylation beta values for each CpG site in the genome. The primary snakemake workflow requires far less computing resources and can be carried out locally to identify differentially methylated regions (DMRs) associated with a group of samples relative to the background/control group. Differentially methylated genes (DMGs) are identified as the gene associated with promoters within DMRs.

Simplified overview of pipeline

Setup

This pipeline is a work and progress and instructions for setup will be updated as the pipeline is developed.

Dependencies

python 3.11.4
conda 23.5.0
- Automatically manages all Snakemake rule dependencies as specified in .yaml files.
- The mamba package manager could be used in place of conda for faster dependency resolution.
snakemake version 7.28.3
- Create a snakemake environment with conda create -n snakemake -c conda-forge -c bioconda snakemake
wgbstools version 0.2.0
- This package must be set up, compiled, and in path following the instructions in the wgbstools repository.
- You do not need to initialize a reference genome with wgbstools, as the pipeline will do this for you.

More set up and usage instructions to come...

Name		Name	Last commit message	Last commit date
Latest commit History 40 Commits
environment_files		environment_files
pre-processing		pre-processing
rules		rules
script_dev		script_dev
supplementary		supplementary
.gitignore		.gitignore
Snakefile		Snakefile
cluster.yaml		cluster.yaml
config.yaml		config.yaml
readme.md		readme.md
workflow_functions.py		workflow_functions.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

environment_files

environment_files

pre-processing

pre-processing

rules

rules

script_dev

script_dev

supplementary

supplementary

.gitignore

.gitignore

Snakefile

Snakefile

cluster.yaml

cluster.yaml

config.yaml

config.yaml

readme.md

readme.md

workflow_functions.py

workflow_functions.py

Repository files navigation

DMRworkflow: automating differential methylation analysis of DNA methylation sequencing data

Analysis steps included in pipeline

Simplified overview of pipeline

Setup

Dependencies

About

Releases

Packages

Languages

MSleeper1/dmr_workflow

Folders and files

Latest commit

History

Repository files navigation

DMRworkflow: automating differential methylation analysis of DNA methylation sequencing data

Analysis steps included in pipeline

Simplified overview of pipeline

Setup

Dependencies

About

Resources

Stars

Watchers

Forks

Languages