0. Introduction

kbseah edited this page Jun 28, 2018 · 4 revisions

Various tools and approaches exist for metagenomic binning, the process of defining individual genomes in a metagenomic assembly. Genome-bin-tools is a set of tools designed for interactive exploration and binning of low-diversity microbial metagenomes in R.

A useful way to visualize a metagenomic assembly is to plot the coverage (depth) against the GC% of the assembled scaffolds. Scaffolds from the same genome tend to have similar coverage and GC%, and so form clusters in the plot. To help distinguish the clusters, the taxonomic affiliation of each scaffold can be evaluated, either by searching the entire scaffold sequence against a database such as NCBI nr or by searching for specific marker genes.

Examples of tools that use GC-coverage plots and taxonomic annotation include Blobology and Metawatt.
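As a rough illustration of the GC-coverage idea, the sketch below computes the GC% of a few scaffolds and pairs each value with a coverage figure, giving the (GC%, coverage) points that such a plot would show. It is written in Python purely for illustration and is not part of genome-bin-tools; the function name `gc_percent`, the scaffold names, the sequences, and the coverage values are all made up for this example.

```python
def gc_percent(seq: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Sequences with no unambiguous bases (e.g. all N) are reported as 0.0
    return 100.0 * gc / total if total else 0.0


# Hypothetical scaffolds: (name, sequence, mean coverage from read mapping)
scaffolds = [
    ("scaffold_1", "GCGCGGCCATGC", 52.0),
    ("scaffold_2", "GGCCGCGCTAGC", 49.5),
    ("scaffold_3", "ATATTAGCATAT", 8.0),
    ("scaffold_4", "TTAATAGCATTA", 7.2),
]

# One (GC%, coverage) point per scaffold, i.e. the coordinates of the plot
points = {name: (gc_percent(seq), cov) for name, seq, cov in scaffolds}

for name, (gc, cov) in points.items():
    print(f"{name}\tGC%={gc:.1f}\tcoverage={cov:.1f}")
```

In this toy data set, scaffolds 1 and 2 have high GC% and high coverage while scaffolds 3 and 4 have low GC% and low coverage, so they would appear as two separate clusters.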

Another visualization and binning method relies on differences in the coverage of each genome between samples. If the coverage of each scaffold in one sample is plotted against its coverage in another sample, scaffolds from the same genome again tend to cluster together.

Examples of tools that use differential coverage binning include Multi-metagenome and GroopM.
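The differential coverage idea can be sketched in a similar way. The Python example below takes per-scaffold coverage in two samples and computes a log2 ratio; scaffolds from the same genome should have similar ratios. The names `log2_coverage_ratio` and `coverage`, the pseudocount, and all values are assumptions for illustration, and the split by sign at the end is only a toy stand-in for the clustering that a person or a tool such as Multi-metagenome or GroopM would actually perform.

```python
import math


def log2_coverage_ratio(cov_a: float, cov_b: float, pseudocount: float = 1.0) -> float:
    """Log2 ratio of coverage in sample A to sample B.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for scaffolds with no coverage in one of the samples.
    """
    return math.log2((cov_a + pseudocount) / (cov_b + pseudocount))


# Hypothetical per-scaffold coverage: (coverage in sample A, coverage in sample B)
coverage = {
    "scaffold_1": (50.0, 10.0),
    "scaffold_2": (48.0, 11.0),
    "scaffold_3": (5.0, 40.0),
    "scaffold_4": (6.0, 42.0),
}

ratios = {name: log2_coverage_ratio(a, b) for name, (a, b) in coverage.items()}

# Toy grouping: split scaffolds by whether they are more abundant in A or in B
groups = {
    "higher_in_A": sorted(n for n, r in ratios.items() if r > 0),
    "higher_in_B": sorted(n for n, r in ratios.items() if r <= 0),
}

for label, names in groups.items():
    print(label, names)
```

Plotting the two coverage values directly against each other, as described above, conveys the same information; the ratio is just a one-number summary that makes the two groups easy to see in text output.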

Genome-bin-tools builds on concepts from Multi-metagenome, but it offers more:

  • Higher-level functions for plotting - save time otherwise spent typing and copy-pasting commands
  • Designed to work with free assembly and annotation tools (BBMap for mapping, barrnap for finding rRNAs, tRNAscan-SE for finding tRNAs, AMPHORA2 for finding marker genes)
  • Minimal software installation - start R, import some tables, load R functions, and go!
  • Interactive - select bins, see summary statistics for bins, save scaffold lists for later processing

Next step

Produce and annotate metagenomic assembly