genomic-medicine-sweden/pgx

pgx

Pharmacogenomics pipeline implemented in hydra

License: GPL-3

💬 Introduction

❗ Dependencies

To run this workflow, the following tools need to be available:

python, snakemake, singularity
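A quick way to confirm that the tools listed above are available is to look them up on `PATH`. The following is a minimal sketch, not part of the workflow itself; the function name `missing_tools` is hypothetical, and on systems where the interpreter is only installed as `python3` the `python` entry will be reported as missing.

```python
import shutil

# Tool names taken from the dependency list above.
REQUIRED_TOOLS = ("python", "snakemake", "singularity")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

This only checks that an executable with each name exists; it does not check versions.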

🎒 Preparations

Sample data

  1. Add all sample ids to samples.tsv in the column sample.
  2. Add all sample data to units.tsv. Each row represents a fastq file pair with the corresponding forward and reverse reads. For each row, also indicate the sample id, run id, lane number and adapter.
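The two files described above can be sketched as tab-separated text generated from Python. Only the `sample` column name comes from the text; the remaining column names (`run`, `lane`, `fastq1`, `fastq2`, `adapter`), the file paths and the adapter value are assumptions for illustration and should be checked against what the workflow actually expects.

```python
import csv
import io

# Assumed column names; only `sample` is named in the text above.
UNIT_COLUMNS = ["sample", "run", "lane", "fastq1", "fastq2", "adapter"]


def samples_tsv(sample_ids):
    """Render samples.tsv content: one sample id per row."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample"])
    for sample_id in sample_ids:
        writer.writerow([sample_id])
    return buf.getvalue()


def units_tsv(units):
    """Render units.tsv content: one fastq pair per row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=UNIT_COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(units)
    return buf.getvalue()


# Hypothetical example data; paths and adapter are placeholders.
example_units = [
    {
        "sample": "sample1",
        "run": "run1",
        "lane": "L001",
        "fastq1": "fastq/sample1_R1.fastq.gz",
        "fastq2": "fastq/sample1_R2.fastq.gz",
        "adapter": "ACGT,ACGT",
    },
]

if __name__ == "__main__":
    sample_ids = sorted({unit["sample"] for unit in example_units})
    print(samples_tsv(sample_ids), end="")
    print(units_tsv(example_units), end="")
```

Deriving the sample ids from the unit rows keeps the two files consistent, so every sample in units.tsv also appears in samples.tsv.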

Reference data

  1. A BAM file with duplicates marked
  2. A reference genome
  3. A dbSNP database in VCF format

✅ Testing

The workflow repository contains a small test dataset in .tests/integration, which can be run as follows:

cd .tests/integration
snakemake -s ../../Snakefile -j1 --use-singularity

# alternative command:
snakemake -s ../../workflow/Snakefile -j1 --use-singularity --configfile config/config.yaml

# generate the rule graph:
snakemake --cores 1 -s workflow/Snakefile --configfile config/config.yaml --rulegraph | dot -Tsvg > ./images/dag.svg

🚀 Usage

The workflow is designed for WGS data, meaning large datasets that require a lot of compute power. On HPC clusters, it is recommended to use a cluster profile and run something like:

snakemake -s /path/to/Snakefile --profile my-awesome-profile
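When the workflow is launched from a script, the command line can be assembled in Python. This is a minimal sketch that uses only the flags shown in this README (`-s`, `--profile`, `-j`, `--use-singularity`, `--configfile`); the helper name `snakemake_command` is hypothetical, and `/path/to/Snakefile` and `my-awesome-profile` are the same placeholders as above.

```python
import shlex


def snakemake_command(snakefile, profile=None, jobs=None,
                      configfile=None, use_singularity=False):
    """Build a snakemake argument list from the options used in this README."""
    cmd = ["snakemake", "-s", snakefile]
    if profile is not None:
        cmd += ["--profile", profile]
    if jobs is not None:
        cmd.append(f"-j{jobs}")
    if use_singularity:
        cmd.append("--use-singularity")
    if configfile is not None:
        cmd += ["--configfile", configfile]
    return cmd


# Cluster run with a profile, as in the command above.
cluster = snakemake_command("/path/to/Snakefile", profile="my-awesome-profile")
print(shlex.join(cluster))

# Local integration test run, as in the Testing section.
local = snakemake_command("../../Snakefile", jobs=1, use_singularity=True)
print(shlex.join(local))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to start the run. Building a list rather than a single string avoids shell quoting problems with paths that contain spaces.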

🧑‍⚖️ Rule Graph

[Image: rule_graph]