This Python script runs a Docker container to assemble genomes de novo using SPAdes (the version is listed under third-party software below).
- Genome Sequences of Two Lytic Staphylococcus aureus Bacteriophages Isolated from Wastewater
- Genome Sequence of a Lytic Staphylococcus aureus Bacteriophage Isolated from Breast Milk
- Complete Genome Sequences of Four Pseudomonas aeruginosa Bacteriophages: Kara-mokiny 8, Kara-mokiny 13, Kara-mokiny 16, and Boorn-mokiny 1
The main functions of phanatic are:
- De novo assembly for phages
- Read quality checks using FastQC
- Assembly quality and completeness check using CheckV
- Extraction of 'Complete' and 'High-quality' contigs (determined via assembly QC)
- OPTIONAL: Read mapping to host strains to check for assembly contamination and generalised transduction
- Log file with each sample process detailed (phanatic_log.tsv)
If you use this software, please cite one of the papers listed in this README, and check the third-party software table so that you cite the correct versions of the software used by this container.
To run this pipeline you first need a working docker installation.
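A quick way to confirm the Docker requirement before installing is sketched below. It is not part of Phanatic: the helper name `docker_available` is hypothetical, and the check simply assumes that a working installation means the `docker` CLI is on the PATH and `docker info` exits successfully.

```python
import shutil
import subprocess


def docker_available(timeout: float = 10.0) -> bool:
    """Return True if the docker CLI is on PATH and 'docker info' succeeds."""
    # Hypothetical helper, not provided by Phanatic.
    if shutil.which("docker") is None:
        return False
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # A non-zero exit code usually means the daemon is not reachable.
    return result.returncode == 0


if docker_available():
    print("Docker looks usable.")
else:
    print("Docker was not found or the daemon is not responding.")
```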
Install using pip
pip install Phanatic==2.2.4
Run the help command to see options -h
Run a basic assembly with default configuration -i <PATH TO READS DIR> -o <PATH TO OUTPUT DIR>
- Assembled genome
- CheckV analysis files
- SPAdes assembly files
- Reads QC files (fastqc)
| Software | Version | Description | Please cite |
| --- | --- | --- | --- |
| SPAdes | 3.15.4 | The St.Petersburg genome assembler containing various pipelines released under GPLv2 | |
| bbmap | 38.18 | U.S Department of Energy (DOE) Joint Genome Institute (JGI) toolset containing a set of fast bioinformatic tools for DNA/RNA sequencing data | |
| biopython | 1.78 | A set of tools written in python for biological computation | |
| checkv | 1.0.1 | CheckV quality and completeness analysis for viral genomes | |
| checkv-db | 1.5 | Database version in this container | |
| fastqc | 0.11.9 | Quality control for reads | |
| Software | Version | Description | Please cite |
| --- | --- | --- | --- |
| bwa | 0.7.17 | -------- | -------- |
| samtools | 1.9 | -------- | -------- |
This is the default config file. To use your own settings, copy it, adjust the values, and specify the location of your copy using '-c'.
image = iszatt/phanatic:2.2.4
author = 'Joshua J Iszatt'
citation = 'pending'
normalise = True
filter = True
fastqc = True
barcode = False
mapping = True
re_assembly = True
identify_termini = False
RAM = 24000m
r1_ext = _R1.fastq.gz
r2_ext = _R2.fastq.gz
read_length = 150
trim_length = 12
minimum_length = 100
read_quality = 15
minimum_insert = 120
minimum_overlap = 20
target_coverage = 250
memory_gb = 24
threads = 24
filter_length = 1000
prefix = phage
barcode_length = 5
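One way to derive a custom config from the defaults is sketched below. It is an illustration only, not how Phanatic reads its config: the helpers `parse_config` and `override` are hypothetical, and the sketch assumes the file is plain `key = value` lines (any section headers or comments are skipped or passed through untouched). Only a few of the default keys are reproduced to keep the example short.

```python
def parse_config(text: str) -> dict:
    """Parse 'key = value' lines into a dict of strings (hypothetical helper)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and any [section] headers.
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def override(text: str, changes: dict) -> str:
    """Return config text with selected values replaced; other lines are kept."""
    out = []
    for raw in text.splitlines():
        key, sep, _ = raw.partition("=")
        name = key.strip()
        if sep and name in changes:
            out.append(f"{name} = {changes[name]}")
        else:
            out.append(raw)
    return "\n".join(out) + "\n"


# A subset of the default values shown above.
default_text = """\
mapping = True
RAM = 24000m
threads = 24
target_coverage = 250
"""

# Example adjustment for a smaller machine (values chosen for illustration).
custom_text = override(default_text, {"threads": 8, "RAM": "16000m"})
settings = parse_config(custom_text)
print(custom_text)
```

The resulting text can be saved to a file of your choosing and that path passed with '-c'.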
An example CSV-formatted mapping file. Note that multiple sets of reads can be mapped to a single host genome. To use it, specify its path using the '--host_mapping' flag.
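The many-to-one relationship between read sets and host genomes can be checked with a short script like the one below. The column layout is an assumption made purely for illustration (read set name first, host genome file second, no header row), and the sample and host names are invented; consult the real example mapping file for the layout Phanatic expects.

```python
import csv
import io
from collections import defaultdict

# Hypothetical layout and names: read set, host genome file (no header row).
example_csv = """\
phage_A,host_1.fasta
phage_B,host_1.fasta
phage_C,host_2.fasta
"""


def group_by_host(handle) -> dict:
    """Group read set names by the host genome they are mapped to."""
    hosts = defaultdict(list)
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        sample, host = row[0].strip(), row[1].strip()
        hosts[host].append(sample)
    return dict(hosts)


groups = group_by_host(io.StringIO(example_csv))
for host, samples in groups.items():
    print(f"{host}: {', '.join(samples)}")
```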