WideVariant: Lieberman Lab SNP calling pipeline

Overview

This pipeline and toolkit detects and analyzes single-nucleotide differences between closely related bacterial isolates.

Notable features
- Avoids false-negative mutations due to low coverage; if a mutation is found in at least one isolate in a set, the evidence at that position will be investigated to make a best-guess call.
- Avoids false-positive mutations by facilitating visualization of raw data across samples (whereas pileup formats must be investigated on a sample-by-sample basis) and by letting you adjust thresholds to best fit your use case.
- Enables easy evolutionary analysis, including phylogenetic tree construction, nonsynonymous vs. synonymous mutation counting, and parallel evolution analysis.

Inputs (to Snakemake cluster step):
- short-read sequencing data of closely related bacterial isolates
- an annotated reference genome

Outputs (of local analysis step):
- table of high-quality SNVs that differentiate isolates from each other
- parsimony tree of how the isolates are related to each other
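
As an illustration of the nonsynonymous vs. synonymous counting mentioned in the feature list, the sketch below classifies a single SNV inside a codon. It is not taken from the WideVariant code: the function name `classify_snv` and its arguments are hypothetical, and it assumes the standard genetic code and a codon already read in the coding direction.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
# (assumption: standard code; '*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snv(ref_codon, pos_in_codon, alt_base):
    """Return 'S' if the substitution keeps the amino acid, else 'N'.

    Hypothetical helper: ref_codon is a 3-letter codon, pos_in_codon is
    0, 1, or 2, and alt_base is the mutated nucleotide.
    """
    alt_codon = ref_codon[:pos_in_codon] + alt_base + ref_codon[pos_in_codon + 1:]
    return "S" if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon] else "N"


# Made-up mutations, for illustration only.
mutations = [("GAA", 2, "G"), ("GAA", 1, "T"), ("TGG", 2, "A")]
labels = [classify_snv(*m) for m in mutations]
n_count = labels.count("N")
s_count = labels.count("S")
```

Counting the `N` and `S` labels over all SNVs in coding regions gives the raw tallies that an N vs. S comparison starts from.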

The pipeline is split into two main components, as described below. A complete tutorial can be found at the bottom of this page.

1. Snakemake pipeline

The first portion of WideVariant aligns raw sequencing data from bacterial isolates to a reference genome, identifies candidate SNV positions, and creates data structures for supervised local filtering. This step is implemented in Snakemake, a workflow management system, and is executed on a SLURM cluster. More information is available here.
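
The idea of candidate positions can be sketched in a few lines of Python. This is a simplified illustration, not the pipeline's implementation: the function names, the dictionary layout, and the example data are all assumptions. A position becomes a candidate if any isolate has a confident non-reference call there, and the read evidence at that position is then gathered for every isolate, including those with low coverage.

```python
def find_candidate_positions(confident_calls, reference):
    """Positions where at least one isolate differs from the reference.

    confident_calls: {sample: {position: base}} (hypothetical layout).
    reference: reference sequence as a string, 0-based positions.
    """
    candidates = set()
    for sample_calls in confident_calls.values():
        for pos, base in sample_calls.items():
            if base != reference[pos]:
                candidates.add(pos)
    return sorted(candidates)


def collect_evidence(read_counts, candidates):
    """Gather per-base read counts at each candidate for ALL samples.

    read_counts: {sample: {position: {base: count}}} (hypothetical layout).
    """
    return {
        pos: {sample: per_pos.get(pos, {}) for sample, per_pos in read_counts.items()}
        for pos in candidates
    }


# Made-up example: isolate_B has too little coverage at position 2 for a
# confident call, but its single read is still kept as evidence.
reference = "ACGTAC"
confident_calls = {"isolate_A": {2: "T", 4: "A"}, "isolate_B": {4: "A"}}
read_counts = {
    "isolate_A": {2: {"T": 30}, 4: {"A": 28}},
    "isolate_B": {2: {"T": 1}, 4: {"A": 25}},
}
candidates = find_candidate_positions(confident_calls, reference)
evidence = collect_evidence(read_counts, candidates)
```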

2. Local Python analysis

The second portion of WideVariant filters candidate SNVs based on data arrays generated in the first portion and generates a high-quality SNV table and a parsimony tree. This step is implemented with a custom Python script. More information can be found here.
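
A minimal sketch of threshold-based filtering is shown below. The function names, the threshold parameters (`min_coverage`, `min_major_freq`), their default values, and the input layout are assumptions made for illustration and do not reflect the actual script. Each isolate gets a call per candidate position, ambiguous calls are masked as `N`, and only positions where isolates carry different confident alleles are kept.

```python
def call_base(base_counts, min_coverage=3, min_major_freq=0.9):
    """Return the major allele, or 'N' if the evidence fails a threshold."""
    total = sum(base_counts.values())
    if total == 0 or total < min_coverage:
        return "N"
    base, count = max(base_counts.items(), key=lambda item: item[1])
    if count / total < min_major_freq:
        return "N"
    return base


def build_snv_table(evidence, **thresholds):
    """Keep positions where isolates differ in their confident calls.

    evidence: {position: {sample: {base: read_count}}} (hypothetical layout).
    """
    table = {}
    for pos, per_sample in evidence.items():
        calls = {s: call_base(c, **thresholds) for s, c in per_sample.items()}
        if len(set(calls.values()) - {"N"}) > 1:
            table[pos] = calls
    return table


# Made-up evidence for three isolates at two candidate positions.
evidence = {
    2: {"A": {"T": 30}, "B": {"G": 25, "T": 1}, "C": {"T": 2}},
    5: {"A": {"C": 20}, "B": {"C": 18}, "C": {"C": 22}},
}
strict = build_snv_table(evidence)                   # isolate C masked at position 2
lenient = build_snv_table(evidence, min_coverage=1)  # isolate C called as T
```

Comparing `strict` and `lenient` shows how changing a threshold alters individual calls, which is why inspecting the raw evidence before fixing thresholds is useful.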

Tutorial Table of Contents

Main WideVariant pipeline README

Example use cases

Previous iterations of this pipeline have been used to study:
