A hecatomb is a great sacrifice or an extensive loss. Hecatomb the software empowers an analyst to make data-driven decisions to 'sacrifice' false-positive viral reads from metagenomes to enrich for true-positive viral reads. This process frequently results in a great loss of suspected viral sequences and contigs.
For a detailed pipeline overview and instructions on installation, usage, and customisation, please refer to the documentation hosted at Read the Docs.
Hecatomb is currently on bioRxiv!
Hecatomb is powered by Snakemake and greatly benefits from the use of Snakemake profiles for HPC clusters. More information and examples for setting up Snakemake profiles for Hecatomb are available in the documentation.
# create conda env and install
conda create -n hecatomb -c conda-forge -c bioconda hecatomb
# activate conda env
conda activate hecatomb
# check the installation
hecatomb -h
# download the databases - you only have to do this once
# locally: using 8 threads (default is 32 threads)
hecatomb install --threads 8
# HPC: using a snakemake profile named 'slurm'
hecatomb install --profile slurm
# locally: uses 32 threads and 64 GB RAM by default
hecatomb run --test
# HPC: using a profile named 'slurm'
hecatomb run --test --profile slurm
Hecatomb is currently designed to only work with paired-end reads. We have considered making a branch for single-end reads, but that is not currently available.
When you specify a directory of reads with --reads, Hecatomb expects paired sequencing reads in the format sampleName_R1/R2.fastq(.gz). e.g.
sample1_R1.fastq.gz
sample1_R2.fastq.gz
sample2_R1.fastq.gz
sample2_R2.fastq.gz
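Before starting a run, it can help to confirm that a reads directory follows this naming scheme. The following is a minimal sketch, not part of Hecatomb: `check_read_pairs` is a hypothetical helper name, and it only checks that each `_R1` file has a matching `_R2` file (it does not look for orphaned `_R2` files or inspect file contents).

```shell
#!/usr/bin/env bash
# check_read_pairs: hypothetical helper, NOT part of Hecatomb.
# Returns non-zero if any *_R1.fastq(.gz) file in the given directory
# lacks a matching *_R2.fastq(.gz) file.
check_read_pairs() {
    local reads_dir="$1"
    local status=0
    local r1 r2
    for r1 in "$reads_dir"/*_R1.fastq "$reads_dir"/*_R1.fastq.gz; do
        # Skip glob patterns that matched nothing.
        [ -e "$r1" ] || continue
        # Derive the expected reverse-read filename from the forward one.
        case "$r1" in
            *.gz) r2="${r1%_R1.fastq.gz}_R2.fastq.gz" ;;
            *)    r2="${r1%_R1.fastq}_R2.fastq" ;;
        esac
        if [ ! -e "$r2" ]; then
            echo "missing mate for: $r1" >&2
            status=1
        fi
    done
    return "$status"
}

# Example usage (the path is a placeholder):
# check_read_pairs /path/to/reads && echo "all forward reads are paired"
```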
When you specify a TSV file with --reads, Hecatomb expects a 3-column tab-separated file with the first column specifying a sample name, and the other columns the relative or full paths to the forward and reverse read files. e.g.
sample1 /path/to/reads/sample1.1.fastq.gz /path/to/reads/sample1.2.fastq.gz
sample2 /path/to/reads/sample2.1.fastq.gz /path/to/reads/sample2.2.fastq.gz
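A TSV in this layout can be generated rather than written by hand. The sketch below assumes the input files use the `_R1.fastq.gz` / `_R2.fastq.gz` naming; `make_samples_tsv` is a hypothetical helper, not a Hecatomb command, and the file name `samples.tsv` is only an illustration. Adjust the suffixes if your files are named differently (for example `.1.fastq.gz` / `.2.fastq.gz`).

```shell
#!/usr/bin/env bash
# make_samples_tsv: hypothetical helper, NOT part of Hecatomb.
# Prints one tab-separated line per sample to stdout:
#   sampleName <TAB> forward reads path <TAB> reverse reads path
make_samples_tsv() {
    local reads_dir="$1"
    local r1 r2 sample
    for r1 in "$reads_dir"/*_R1.fastq.gz; do
        # Skip the unexpanded glob when no files match.
        [ -e "$r1" ] || continue
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
        sample="$(basename "$r1" _R1.fastq.gz)"
        # Only paired samples are written; unpaired ones are reported on stderr.
        if [ ! -e "$r2" ]; then
            echo "warning: no reverse reads for $sample, skipping" >&2
            continue
        fi
        printf '%s\t%s\t%s\n' "$sample" "$r1" "$r2"
    done
}

# Example usage (paths are placeholders):
# make_samples_tsv /path/to/reads > samples.tsv
# hecatomb run --reads samples.tsv
```

Using `printf` with explicit `\t` separators guarantees real tab characters, which are easy to lose when a TSV is typed or pasted in an editor.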
The only dependency you need to get up and running with Hecatomb is conda. Hecatomb relies on conda (and mamba) to ensure portability and ease of installation of its dependencies. All of Hecatomb's dependencies are installed during installation or runtime, so you don't have to worry about a thing!