Snakemake module containing different analyses provided by parabricks.


🐍 hydra-genetics/parabricks

Snakemake module containing an array of steps provided by the Parabricks toolkit

License: GPL-3

💬 Introduction

The module contains rules to align .fastq files and call variants in the resulting .bam files using Clara Parabricks. Using this module requires a server with access to one or more compatible NVIDIA GPUs. Input data should be trimmed .fastq files; we recommend generating these with hydra-genetics/prealignment for a smooth transition. To make use of read group information, add machine, flowcell and library specifics to units.tsv.

❗ Dependencies

In order to use this module, the following dependencies are required:

  • hydra-genetics
  • pandas
  • parabricks
  • python
  • snakemake

🎒 Preparations

Sample and unit data

Input data should be added to samples.tsv and units.tsv. The following information needs to be added to these files:

| File | Column Id | Description |
| --- | --- | --- |
| samples.tsv | sample | unique sample/patient id, one per row |
| samples.tsv | tumor_content | ratio of tumor cells to total cells |
| units.tsv | sample | same sample/patient id as in samples.tsv |
| units.tsv | type | data type identifier (one letter) for Tumor, Normal or RNA, e.g. T as in {sample}_T.vcf |
| units.tsv | platform | type of sequencing platform, e.g. NovaSeq |
| units.tsv | machine | specific machine id, e.g. NovaSeq instruments have @Axxxxx |
| units.tsv | flowcell | identifier of the flowcell used |
| units.tsv | lane | flowcell lane number |
| units.tsv | barcode | sequence library barcode/index; connect forward and reverse indices with +, e.g. ATGC+ATGC |
| units.tsv | fastq1/2 | absolute paths to forward and reverse reads |
| units.tsv | adapter | adapter sequences to be trimmed, separated by commas |
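
Before starting a run it can help to check that units.tsv has the expected shape. The following is a minimal sketch, not part of the module: the function name `check_units` is hypothetical, the one-letter codes `T`, `N` and `R` are an assumption based on the Tumor/Normal/RNA description, `fastq1/2` is assumed to mean two columns named `fastq1` and `fastq2`, and all values in the example row are placeholders.

```python
import csv
import io

# Columns described for units.tsv (assumption: fastq1/2 means two columns).
UNITS_COLUMNS = [
    "sample", "type", "platform", "machine", "flowcell",
    "lane", "barcode", "fastq1", "fastq2", "adapter",
]

# Assumption: one-letter codes for Tumor, Normal and RNA.
VALID_TYPES = {"T", "N", "R"}


def check_units(tsv_text):
    """Return a list of problems found in the text of a units.tsv file."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in UNITS_COLUMNS if col not in header]
    if missing:
        return ["missing columns: " + ", ".join(missing)]

    problems = []
    # Data rows start on line 2, after the header line.
    for line_no, row in enumerate(reader, start=2):
        if row["type"] not in VALID_TYPES:
            problems.append(f"line {line_no}: unknown type {row['type']!r}")
        for col in ("fastq1", "fastq2"):
            if not (row[col] or "").startswith("/"):
                problems.append(f"line {line_no}: {col} is not an absolute path")
    return problems


# Placeholder row; none of these values come from real data.
example = "\t".join(UNITS_COLUMNS) + "\n" + "\t".join([
    "sample1", "T", "NovaSeq", "A00000", "FLOWCELL1", "1", "ATGC+ATGC",
    "/data/sample1_R1.fastq.gz", "/data/sample1_R2.fastq.gz", "ADAPTER1,ADAPTER2",
]) + "\n"

print(check_units(example))
```

An empty list means no problems were found; each entry otherwise names the offending line.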

Reference data

Reference files should be specified in config.yaml in the section reference. A .fasta-file is needed as well as a .vcf file containing known indels used during the alignment process. For the RNA alignment part, genome_dir should specify a directory containing reference files generated by STAR.
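
As a rough illustration, the reference section can be checked for completeness after the config is loaded. This is a sketch under assumptions: only `reference` and `genome_dir` are named in the text above, while the key names `fasta` and `sites`, the helper `missing_reference_keys` and all paths are hypothetical placeholders. Check the module's own config for the actual keys.

```python
# Assumption: "fasta" and "sites" are placeholder key names for the .fasta file
# and the known-indels .vcf file; only "genome_dir" is named in the README.
REFERENCE_KEYS = ("fasta", "sites", "genome_dir")


def missing_reference_keys(config, keys=REFERENCE_KEYS):
    """Return the reference keys that are absent or empty in a config dict."""
    reference = config.get("reference") or {}
    return [key for key in keys if not reference.get(key)]


# A dict mirroring what a loaded config.yaml might look like (placeholder paths).
config = {
    "reference": {
        "fasta": "/ref/genome.fasta",
        "sites": "/ref/known_indels.vcf",
        "genome_dir": "/ref/star_index",
    }
}

print(missing_reference_keys(config))
```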

🚀 Usage

To use this module in your workflow, follow the description of modules in the Snakemake documentation. Add the module to your Snakefile like so:

module parabricks:
    snakefile:
        github(
            "hydra-genetics/parabricks",
            path="workflow/Snakefile",
            tag="1.0.0",
        )
    config:
        config


use rule * from parabricks as parabricks_*

Compatibility

Latest:

  • prealignment:v1.1.0

See COMPATIBLITY.md file for a complete list of module compatibility.

Output files

The following output files should be targeted via another rule:

| File | Description |
| --- | --- |
| parabricks/pbrun_deepvariant/{sample}.vcf | variant call file generated by deepvariant |
| parabricks/pbrun_fq2bam/{sample}_{type}.bam | alignment file generated by BWA-mem |
| parabricks/pbrun_mutectcaller_t/{sample}_T.vcf | variant call file generated by Mutect2 using tumor-only mode |
| parabricks/pbrun_mutectcaller_tn/{sample}.vcf | variant call file generated by Mutect2 using tumor/normal mode |
| parabricks/pbrun_rna_fq2bam/{sample}_R.bam | alignment file generated by STAR |
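
One way to target these files is to build the list of paths from the sample/type pairs in units.tsv and pass it as the input of a top-level rule. The sketch below is an assumption about how this could be done, not the module's own code: the helper `parabricks_targets` is hypothetical, the type codes `T`, `N` and `R` are assumed, and the mapping of unit types to outputs (fq2bam for `T` and `N`, tumor/normal calling only when both are present) is a guess from the file name patterns. The deepvariant output is left out because the table does not say which unit type it applies to.

```python
def parabricks_targets(units):
    """Build output paths from (sample, type) pairs.

    Assumption: types are the one-letter codes T, N and R.
    """
    types_by_sample = {}
    for sample, unit_type in units:
        types_by_sample.setdefault(sample, set()).add(unit_type)

    targets = []
    for sample in sorted(types_by_sample):
        types = types_by_sample[sample]
        # DNA alignments, one per DNA unit type present for the sample.
        for unit_type in sorted(types & {"T", "N"}):
            targets.append(f"parabricks/pbrun_fq2bam/{sample}_{unit_type}.bam")
        # Tumor-only variant calls.
        if "T" in types:
            targets.append(f"parabricks/pbrun_mutectcaller_t/{sample}_T.vcf")
        # Tumor/normal variant calls need both unit types.
        if {"T", "N"} <= types:
            targets.append(f"parabricks/pbrun_mutectcaller_tn/{sample}.vcf")
        # RNA alignment.
        if "R" in types:
            targets.append(f"parabricks/pbrun_rna_fq2bam/{sample}_R.bam")
    return targets


for path in parabricks_targets([("s1", "T"), ("s1", "N"), ("s2", "R")]):
    print(path)
```

In a Snakefile, the returned list could serve as the `input` of a `rule all`.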

🧑‍⚖️ Rule Graph

(Figure: rule graph of the module)
