GitHub - shandley/hecatomb at 554da4875851ef84c7ecab3a62ae760e4c9bdc7d

A hecatomb is a great sacrifice or an extensive loss. Hecatomb the software empowers an analyst to make data-driven decisions to 'sacrifice' false-positive viral reads from metagenomes to enrich for true-positive viral reads. This process frequently results in a great loss of suspected viral sequences / contigs.

For a detailed pipeline overview and for installation, usage, and customisation instructions, please refer to the documentation hosted at Read the Docs.

Citation

Hecatomb is currently on bioRxiv!

Quick start guide

Running on HPC

Hecatomb is powered by Snakemake and greatly benefits from the use of Snakemake profiles for HPC clusters. More information and an example of setting up Snakemake profiles for Hecatomb are available in the documentation.

Install

# create conda env and install
conda create -n hecatomb -c conda-forge -c bioconda hecatomb

# activate conda env
conda activate hecatomb

# check the installation
hecatomb -h

# download the databases - you only have to do this once
  # locally: using 8 threads (default is 32 threads)
hecatomb install --threads 8

  # HPC: using a snakemake profile named 'slurm'
hecatomb install --profile slurm

Run the test dataset

# locally: uses 32 threads and 64 GB RAM by default
hecatomb run --test

# HPC: using a profile named 'slurm'
hecatomb run --test --profile slurm

Current limitations

Hecatomb is currently designed to work only with paired-end reads. We have considered making a branch for single-end reads, but it is not currently available.

When you specify a directory of reads with --reads, Hecatomb expects paired sequencing reads named in the format sampleName_R1/R2.fastq(.gz), e.g.

sample1_R1.fastq.gz
sample1_R2.fastq.gz
sample2_R1.fastq.gz
sample2_R2.fastq.gz
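
Before pointing --reads at a directory, it can help to confirm that every forward file has a mate. The following is a minimal sketch, not part of Hecatomb: check_pairs is a hypothetical helper name, and it assumes only the sampleName_R1/R2.fastq(.gz) naming convention shown above.

```shell
# Hypothetical helper (not a Hecatomb command): report every *_R1 file in a
# directory that has no matching *_R2 file. Returns non-zero if any mate is missing.
check_pairs() {
    local dir="$1" r1 r2 missing=0
    for r1 in "$dir"/*_R1.fastq "$dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue    # skip glob patterns that matched nothing
        # Swap _R1 for _R2 while keeping the original extension ("" or ".gz")
        r2="${r1%_R1.fastq*}_R2.fastq${r1##*_R1.fastq}"
        if [ ! -e "$r2" ]; then
            echo "missing mate for: $r1"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling check_pairs /path/to/reads prints one line per unpaired forward file and returns a non-zero status if any are found, so it can gate a later step in a script.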

When you specify a TSV file with --reads, Hecatomb expects a 3-column, tab-separated file in which the first column gives the sample name and the other two columns give the relative or full paths to the forward and reverse read files, e.g.

sample1    /path/to/reads/sample1.1.fastq.gz    /path/to/reads/sample1.2.fastq.gz
sample2    /path/to/reads/sample2.1.fastq.gz    /path/to/reads/sample2.2.fastq.gz
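
When read files do not follow the _R1/_R2 convention, a table like the one above can be generated rather than typed by hand. This is a minimal sketch under stated assumptions: make_samples_tsv is a hypothetical helper name, the forward and reverse suffixes are supplied by the caller, and the sample name is taken to be the file name with the forward suffix removed.

```shell
# Hypothetical helper (not a Hecatomb command): print a 3-column, tab-separated
# table of sample name, forward reads, and reverse reads.
# Arguments: directory, forward suffix, reverse suffix.
make_samples_tsv() {
    local dir="$1" fwd="$2" rev="$3" r1 r2 name
    for r1 in "$dir"/*"$fwd"; do
        [ -e "$r1" ] || continue        # skip a glob pattern that matched nothing
        r2="${r1%"$fwd"}$rev"           # replace the forward suffix with the reverse one
        [ -e "$r2" ] || continue        # skip files without a mate
        name="$(basename "$r1" "$fwd")" # sample name = file name minus the suffix
        printf '%s\t%s\t%s\n' "$name" "$r1" "$r2"
    done
}
```

For example, make_samples_tsv /path/to/reads .1.fastq.gz .2.fastq.gz > samples.tsv would write one row per complete pair, and the resulting file could then be supplied via --reads. Unpaired files are skipped silently, so it is worth comparing the row count against the expected number of samples.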

Dependencies

The only dependency you need to get up and running with Hecatomb is conda. Hecatomb relies on conda (and mamba) to ensure portability and ease of installation of its dependencies. All of Hecatomb's dependencies are installed during installation or runtime, so you don't have to worry about a thing!

Links

Hecatomb @ bio.tools

Hecatomb @ WorkflowHub
