
Sunbeam: a robust, extensible metagenomic sequencing pipeline


Sunbeam is a pipeline written in Snakemake that simplifies and automates many of the steps in metagenomic sequencing analysis. It uses conda to manage dependencies, so it requires no pre-installed dependencies or admin privileges and can be deployed on most Linux workstations and clusters.

Sunbeam currently automates the following tasks:

  • Quality control, including adaptor trimming, host read removal, and quality filtering;
  • Taxonomic assignment of reads to databases using Kraken;
  • Assembly of reads into contigs using Megahit;
  • Contig annotation using BLAST[n/p/x];
  • Mapping of reads to target genomes; and
  • ORF prediction using Prodigal.

Sunbeam was designed to be modular and extensible. Some extensions have been built for:

  • IGV for viewing read alignments
  • KrakenHLL, an alternate read classifier
  • Kaiju, a read classifier that uses the Burrows-Wheeler transform (BWT) rather than k-mers
  • Anvi'o, a downstream analysis and visualization platform

More extensions can be found at the extension page:

To get started, see our documentation!


Development Version (as of October 29, 2019)

  • Integration test updates to schedule weekly builds (#222)
  • Script updates to use conda commands instead of source commands (#220)
  • Add h5py package explicitly to avoid dependency metadata problem (#219)
  • Add multiQC to build QC report (#203)
  • Use multithreading for cutadapt in QC (#202)
  • Correct conda channel priority during install (#201)
  • Update documentation to spell out requirements (#199)
  • New megahit failure handling (#194)
  • Enforce sample wildcard constraints in Snakemake rules (#190)
  • Run megahit multithreaded (#189)
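
To illustrate why wildcard constraints (#190) matter, here is a minimal Python sketch of how a file pattern behaves with and without them. It relies on Snakemake's documented default of matching an unconstrained wildcard with the regex .+; the helper pattern_to_regex, the file pattern, and the constraint values are hypothetical and are not taken from Sunbeam's rules.

```python
import re

def pattern_to_regex(pattern, constraints=None):
    """Compile a '{name}'-style file pattern into a regex (hypothetical helper).

    Wildcards without an entry in `constraints` fall back to '.+',
    which mirrors Snakemake's default wildcard regex.
    """
    constraints = constraints or {}
    # re.split with a capture group alternates literal text and wildcard names
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)  # literal path fragment
        else:
            regex += "(?P<%s>%s)" % (part, constraints.get(part, ".+"))
    return re.compile(regex + "$")

# Assumed pattern and constraints, for illustration only
pattern = "qc/{sample}_{rp}.fastq.gz"
loose = pattern_to_regex(pattern)
strict = pattern_to_regex(pattern, {"sample": "[^/]+", "rp": "[12]"})

nested = "qc/decontam/stool_1.fastq.gz"
# Unconstrained: the sample wildcard swallows a directory separator
print(loose.match(nested).group("sample"))
# Constrained: the nested path no longer matches this pattern
print(strict.match(nested))
print(strict.match("qc/stool_1.fastq.gz").groupdict())
```

Without constraints, a sample name can absorb path separators, so one rule may claim files that belong to another; constraining the wildcard removes that ambiguity.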

v2.0.2 (August 28, 2019)

  • Add implicit dependencies (samtools and bcftools) to environment file to make them explicit

v2.0.1 (July 24, 2019)

  • Increment Snakemake version requirement for compatibility with recent conda
  • Specify earlier megahit version to ensure compatibility with existing assembly behavior
  • Integration test improvements

v2.0.0 (January 22, 2019)

  • Start a project directly from SRA resources using sunbeam init --data_acc [SRA ###]. For more information, see the docs.
  • New extension website:
  • Improved documentation
  • Numerous bugfixes and optimizations

v1.2.1 (May 24, 2018)

  • Minor bugfixes

v1.2.0 (May 2, 2018)

  • Low-complexity reads are now removed by default rather than masked
  • Bug fixes related to single-end sequencing experiments
  • Documentation updates
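
The difference between masking and removing low-complexity reads can be sketched in Python as follows. This is an illustration only: the scoring function (distinct k-mers divided by read length), the threshold, and the whole-read masking are simplifying assumptions, not a description of how komplexity or Sunbeam work internally.

```python
def kmer_complexity(seq, k=4):
    """Distinct k-mers divided by sequence length (0.0 if the read is shorter than k)."""
    if len(seq) < k:
        return 0.0
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return len(kmers) / len(seq)

def mask_low_complexity(reads, threshold=0.55, k=4):
    """Keep every read, but replace low-complexity ones with N's."""
    return [
        (name, seq if kmer_complexity(seq, k) >= threshold else "N" * len(seq))
        for name, seq in reads
    ]

def remove_low_complexity(reads, threshold=0.55, k=4):
    """Drop low-complexity reads entirely."""
    return [(name, seq) for name, seq in reads if kmer_complexity(seq, k) >= threshold]

# Made-up reads: r1 is diverse, r2 is a homopolymer
reads = [
    ("r1", "ACGTTGCAAGCTTAGGCTAACCGATT"),
    ("r2", "AAAAAAAAAAAAAAAAAAAAAAAAAA"),
]
masked = mask_low_complexity(reads)
removed = remove_low_complexity(reads)
print(len(masked), len(removed))
```

Masking preserves the read count (useful when downstream tools expect paired files to stay in sync), while removal shrinks the dataset and keeps uninformative reads out of later steps.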

v1.1.0 (April 8, 2018)

  • Reports include number of filtered reads per host, rather than in aggregate
  • Static binary dependency for komplexity for easier deployment
  • Remove max length filter for contigs

v1.0.0 (March 22, 2018)

  • First stable release!
  • Support for single-end sequencing experiments
  • Low-complexity read masking via komplexity
  • Support for extensions
  • Documentation on
  • Better assembler (megahit)
  • Better ORF finder (prodigal)
  • Can remove reads from any number of host/contaminant genomes
  • Semantic versioning checks
  • Integration tests and continuous deployment
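
As a rough idea of what a semantic versioning check involves, the Python sketch below parses MAJOR.MINOR.PATCH strings and compares them numerically. The function names and the compatibility rule (same major version, installed at least as new as required) are assumptions for illustration; the source does not say what Sunbeam's check compares.

```python
import re

def parse_version(text):
    """Parse 'v2.0.2' or '2.0.2' into (2, 0, 2); any suffix after PATCH is ignored."""
    m = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text)
    if m is None:
        raise ValueError("not a MAJOR.MINOR.PATCH version: %r" % text)
    return tuple(int(part) for part in m.groups())

def is_compatible(installed, required):
    """Assumed rule: same major version, and installed >= required."""
    inst, req = parse_version(installed), parse_version(required)
    return inst[0] == req[0] and inst >= req

print(is_compatible("v2.0.2", "2.0.0"))
print(is_compatible("v2.0.2", "1.2.1"))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating 1.10.0 as older than 1.9.0.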

