dbespiatykh/TBvar

A Snakemake workflow for variant calling and lineage barcoding of Mycobacterium tuberculosis samples

Installation

The usage of this workflow is described in the Snakemake Workflow Catalog. Alternatively, it can be installed as described below.

Use the Conda package manager and the Bioconda channel to install TBvar.

If you do not have conda installed, do the following:

# Download Conda installer
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
# Make the installer executable
chmod +x Miniconda3-latest-Linux-x86_64.sh
# Install
bash Miniconda3-latest-Linux-x86_64.sh
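
Before downloading the installer, it may be worth checking whether conda is already available on your PATH. The following is a minimal sketch; the helper name `have_cmd` is hypothetical and not part of TBvar.

```shell
# Hypothetical helper (not part of TBvar): succeed if a command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd conda; then
    # conda is already installed, so the Miniconda step can be skipped
    echo "conda found: $(conda --version)"
else
    echo "conda not found; install Miniconda as described above"
fi
```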

Set up channels:

conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict

Get the TBvar pipeline:

git clone https://github.com/dbespiatykh/TBvar.git

Install the required dependencies:

conda install -c conda-forge mamba
mamba env create --file environment.yml

Usage

Activate the TBvar environment:

conda activate TBvar
cd TBvar

👉 In the config folder, edit config.yml and set the location of your samples.tsv table. The table should be formatted like this:

Run_accession R1 R2
SRR2024996 /path/to/SRR2024996_1.fastq.gz /path/to/SRR2024996_2.fastq.gz
SRR2024925 /path/to/SRR2024925_1.fastq.gz /path/to/SRR2024925_2.fastq.gz
SRR12882189 /path/to/SRR12882189.fastq.gz

Run_accession - Run accession number or sample name;
R1 - Path to the first read file of the pair (or to the only file for single-end data, as in the last row above);
R2 - Path to the second read file of the pair.
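
For a directory with many FASTQ files, the table can be generated rather than typed by hand. The sketch below is only an illustration and rests on assumptions: the function name `make_samples_tsv` is hypothetical, columns are assumed to be tab-separated (suggested by the .tsv extension), paired files are assumed to be named `<sample>_1.fastq.gz` and `<sample>_2.fastq.gz`, and single-end files `<sample>.fastq.gz`, as in the example rows above.

```shell
# Hypothetical helper (not part of TBvar): print a samples table for the
# FASTQ files found in one directory.
# Assumptions: tab-separated columns; paired reads named <sample>_1.fastq.gz
# and <sample>_2.fastq.gz; single-end reads named <sample>.fastq.gz.
make_samples_tsv() {
    dir=$1
    printf 'Run_accession\tR1\tR2\n'
    for r1 in "$dir"/*.fastq.gz; do
        [ -e "$r1" ] || continue          # nothing matched: skip the literal pattern
        name=$(basename "$r1" .fastq.gz)
        case $name in
            *_2)
                continue ;;               # second mates are listed with their _1 file
            *_1)
                sample=${name%_1}
                r2="$dir/${sample}_2.fastq.gz"
                [ -e "$r2" ] || r2=""     # no mate found: leave R2 empty
                printf '%s\t%s\t%s\n' "$sample" "$r1" "$r2" ;;
            *)
                printf '%s\t%s\t\n' "$name" "$r1" ;;   # single-end: R2 left empty
        esac
    done
}
```

A possible invocation is `make_samples_tsv /path/to/fastq > samples.tsv`, with the output path then set in config.yml. A `_2` file without a matching `_1` file is skipped silently, so it is worth reviewing the result before running the pipeline.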

Run the pipeline:

snakemake --conda-frontend mamba --use-conda -j 48 -c 48 --max-threads 48 -k --rerun-incomplete

If you are running the pipeline for the first time, a dry run is recommended to check that everything is in working order. Use the -n flag for this:

snakemake --conda-frontend mamba -np
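
The two commands above can be chained so that the real run starts only if the dry run succeeds, with the core count taken from the machine instead of the hard-coded 48. This is a sketch under assumptions: the function names `detect_cores` and `run_tbvar` are hypothetical, and the Snakemake flags are copied unchanged from the commands above.

```shell
# Hypothetical helpers (not part of TBvar).

# Print the number of available CPU cores.
# nproc comes from GNU coreutils; getconf is used as a fallback.
detect_cores() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN
}

# Dry-run first; start the real run only if the dry run exits successfully.
run_tbvar() {
    cores=$(detect_cores)
    snakemake --conda-frontend mamba -np &&
    snakemake --conda-frontend mamba --use-conda \
        -j "$cores" -c "$cores" --max-threads "$cores" \
        -k --rerun-incomplete
}
```

Calling `run_tbvar` from the TBvar directory with the TBvar environment active would then perform both steps.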