Assemble bacterial genomes and call variants.
This project contains two pipelines managed by Snakemake:
- A de novo bacterial genome assembler coupled with gene annotation, designed for paired-end reads.
- A variant calling pipeline that compares paired-end reads against a reference genome. It can optionally be used with a metadata file listing the bacterial strains and the experimental conditions, which is used to group the output (example: `variant_calling/metadata.csv`).
Clone the latest version of the BAoBAb tools into your working directory:
git clone https://github.com/BAoBAb-biofilm/BAoBAb.git
Install all the dependencies manually:
Software | Version |
---|---|
abritAMR | 1.0.13 |
FastQC | 0.11.9 |
Prokka | 1.14.6 |
QUAST | 5.2.0 |
Trimmomatic | 0.39 |
Unicycler | 0.5.0 |
Snippy | 4.6.0 |
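
Before launching either pipeline, it can help to confirm that the tools in the table above are reachable from your shell. The following is a minimal sketch, not part of BAoBAb: the executable names in `EXPECTED_TOOLS` are assumptions based on how these tools are commonly installed (for example, QUAST is often exposed as `quast.py`), so adjust them to your setup. It only checks that each command is on `PATH`; it does not verify versions.

```python
import shutil

# Assumed executable names (not stated in this README); edit to match your install.
EXPECTED_TOOLS = [
    "snakemake",
    "abritamr",
    "fastqc",
    "prokka",
    "quast.py",
    "trimmomatic",
    "unicycler",
    "snippy",
]


def missing_tools(tools=EXPECTED_TOOLS):
    """Return the commands from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All expected tools were found on PATH.")
```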
To use either pipeline, you need paired-end sequencing read files in FASTQ format (they can be gzip-compressed, with a .gz extension). Forward and reverse files must be named as follows:
strain_name_R1.fastq.gz
strain_name_R2.fastq.gz
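
The naming convention above can be checked before a run with a short script. This is a hedged sketch rather than part of the pipelines: `pair_reads` and `READ_PATTERN` are hypothetical names introduced here, and the script assumes the reads sit in a `raw_reads` folder in the current directory. It groups files by strain name and reports any file that does not match the pattern or lacks its mate.

```python
import re
from pathlib import Path

# Convention from this README: <strain_name>_R1.fastq(.gz) and <strain_name>_R2.fastq(.gz)
READ_PATTERN = re.compile(r"^(?P<strain>.+)_R(?P<mate>[12])\.fastq(?:\.gz)?$")


def pair_reads(filenames):
    """Group read file names by strain.

    Returns (pairs, problems): `pairs` maps each strain to its
    (forward, reverse) file names; `problems` lists names that do not
    follow the convention or whose mate is missing.
    """
    mates = {}
    problems = []
    for name in sorted(filenames):
        match = READ_PATTERN.match(name)
        if match is None:
            problems.append(name)
            continue
        mates.setdefault(match["strain"], {})[match["mate"]] = name

    pairs = {}
    for strain, found in mates.items():
        if set(found) == {"1", "2"}:
            pairs[strain] = (found["1"], found["2"])
        else:
            problems.extend(found.values())
    return pairs, sorted(problems)


if __name__ == "__main__":
    raw = Path("raw_reads")  # assumed location of the input reads
    if raw.is_dir():
        pairs, problems = pair_reads(p.name for p in raw.iterdir())
        print(f"{len(pairs)} complete pair(s) found")
        for name in problems:
            print("Unpaired or misnamed file:", name)
```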
The pipelines are designed for Escherichia coli strains. For other organisms, change the species and genus in the prokka and abritamr rules of the genome assembly pipeline.
snakemake -s snakefile_assembly.py
- Input: all paired reads need to be located in a `raw_reads` folder.
- The Trimmomatic parameters need to be adjusted to match the quality results reported by FastQC.
snakemake -s snakefile_variant_calling.py --configfile config.yaml
- Input: all paired reads need to be located in a `raw_reads` folder.
- The Trimmomatic parameters need to be adjusted to match the quality results reported by FastQC.