Sunbeam: a robust, extensible metagenomic sequencing pipeline
Sunbeam is a pipeline written in Snakemake that simplifies and automates many of the steps in metagenomic sequencing analysis. It uses Conda to manage dependencies, so it requires neither pre-installed dependencies nor admin privileges, and can be deployed on most Linux workstations and clusters.
Sunbeam currently automates the following tasks:
- Quality control, including adaptor trimming, host read removal, and quality filtering;
- Taxonomic assignment of reads to databases using Kraken;
- Assembly of reads into contigs using Megahit;
- Contig annotation using BLAST[n/p/x];
- Mapping of reads to target genomes; and
- ORF prediction using Prodigal.
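To illustrate how tasks like these can be chained, here is a minimal Python sketch that orders the stages by their prerequisites. The stage names and the dependencies between them are assumptions made for illustration only; they are not taken from Sunbeam's actual Snakemake rules.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stage names; each maps to the stages assumed to run before it.
# These dependencies are an illustration, not Sunbeam's actual rule graph.
STAGE_DEPENDENCIES = {
    "quality_control": set(),
    "taxonomic_assignment": {"quality_control"},
    "assembly": {"quality_control"},
    "contig_annotation": {"assembly"},
    "read_mapping": {"quality_control"},
    "orf_prediction": {"assembly"},
}


def execution_order(dependencies):
    """Return one valid order in which every stage follows its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for stage in execution_order(STAGE_DEPENDENCIES):
        print(stage)
```

In a real Snakemake workflow this ordering is not written by hand: the workflow engine infers it from the input and output files that each rule declares.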
Sunbeam was designed to be modular and extensible. Some extensions have been built for:
- IGV for viewing read alignments
- KrakenHLL, an alternate read classifier
- Kaiju, a read classifier that uses the Burrows-Wheeler transform (BWT) rather than k-mers
- Anvi'o, a downstream analysis and visualization platform for 'omics data
To get started, see our documentation!
v1.2.1 (May 24, 2018)
- Minor bugfixes
v1.2.0 (May 2, 2018)
- Low-complexity reads are now removed by default rather than masked
- Bug fixes related to single-end sequencing experiments
- Documentation updates
v1.1.0 (April 8, 2018)
- Reports include number of filtered reads per host, rather than in aggregate
- Static binary dependency for komplexity for easier deployment
- Remove max length filter for contigs
v1.0.0 (March 22, 2018)
- First stable release!
- Support for single-end sequencing experiments
- Low-complexity read masking via komplexity
- Support for extensions
- Documentation on ReadTheDocs.io
- Better assembler (Megahit)
- Better ORF finder (Prodigal)
- Can remove reads from any number of host/contaminant genomes
- Semantic versioning checks
- Integration tests and continuous deployment