AMBER: Assessment of Metagenome BinnERs



AMBER (Assessment of Metagenome BinnERs) is an evaluation package for the comparative assessment of genome reconstructions from metagenome benchmark datasets. It provides performance metrics, results rankings, and comparative visualizations for assessing multiple programs or parameter effects. The provided metrics were used in the first community benchmarking challenge of the initiative for the Critical Assessment of Metagenomic Interpretation.

AMBER produces print-ready and interactive plots. See an example page at


See default.txt for all dependencies.



Install pip first (tested on Linux Ubuntu 17.10):

sudo apt install python3-pip

Then run:

pip3 install cami-amber 

Make sure to add AMBER to your PATH:

echo 'PATH=$PATH:${HOME}/.local/bin' >> ~/.bashrc
source ~/.bashrc

You can also run AMBER as a Biobox.

User Guide


As input, AMBER's main tool uses three files:

  1. A gold standard mapping of contigs or read IDs to genomes in the CAMI binning Bioboxes format. Columns are tab separated. Example:

RH|P|C37126  Sample6_89 25096
RH|P|C3274   Sample9_91 10009
RH|P|C26099  1053046    689201
RH|P|C35075  1053046    173282
RH|P|C20873  1053046    339258

See here for another example. Note: the column _LENGTH is optional, but it eliminates the need for a FASTA or FASTQ file (input 3 below).

  2. One or more files with bin assignments for the sequences, also in the CAMI binning Bioboxes format, with each file containing all the bin assignments from one binning program. A tool for converting FASTA files, such that each file represents a bin, is available (see src/utils/
  3. A FASTA or FASTQ file with the sequences, for obtaining their lengths. Optionally, the lengths may be added to the gold standard mapping file in column _LENGTH using tool src/utils/ In this way, AMBER no longer requires a FASTA or FASTQ file.
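For illustration, a gold standard mapping that includes the optional _LENGTH column might look as follows. The @Version and @SampleID header lines follow the Bioboxes binning format convention; the sample ID and length values shown here are hypothetical:

```
@Version:0.9.0
@SampleID:gsa

@@SEQUENCEID	BINID	_LENGTH
RH|P|C37126	Sample6_89	25096
RH|P|C3274	Sample9_91	10009
```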

Additional parameters may be specified; see below.

List of metrics and abbreviations

  • avg_purity: purity averaged over genome bins
  • std_dev_purity: standard deviation of the purity over genome bins
  • sem_purity: standard error of the mean purity over genome bins
  • avg_completeness: completeness averaged over genome bins
  • std_dev_completeness: standard deviation of the completeness over genome bins
  • sem_completeness: standard error of the mean completeness over genome bins
  • avg_purity_per_bp: average purity per base pair
  • avg_completeness_per_bp: average completeness per base pair
  • rand_index_by_bp: Rand index weighted by base pairs
  • rand_index_by_seq: Rand index weighted by sequence counts
  • a_rand_index_by_bp: adjusted Rand index weighted by base pairs
  • a_rand_index_by_seq: adjusted Rand index weighted by sequence counts
  • percent_assigned_bps: percentage of base pairs that were assigned to bins
  • accuracy: fraction of all base pairs in the gold standard that were assigned to the correct genome bins
  • >0.5compl<0.1cont: number of bins with more than 50% completeness and less than 10% contamination
  • >0.7compl<0.1cont: number of bins with more than 70% completeness and less than 10% contamination
  • >0.9compl<0.1cont: number of bins with more than 90% completeness and less than 10% contamination
  • >0.5compl<0.05cont: number of bins with more than 50% completeness and less than 5% contamination
  • >0.7compl<0.05cont: number of bins with more than 70% completeness and less than 5% contamination
  • >0.9compl<0.05cont: number of bins with more than 90% completeness and less than 5% contamination
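To make the per-bin metrics concrete, here is a minimal sketch of how purity and completeness per genome bin could be computed from bp-weighted assignments. The input tuples and IDs are hypothetical, and the bin-to-genome mapping rule is simplified to "genome contributing the most base pairs" (AMBER's -m option maps by maximizing completeness instead); this is an illustration, not AMBER's exact implementation:

```python
from collections import defaultdict

# Hypothetical input: (sequence_id, true_genome, predicted_bin, length_in_bp).
# In AMBER these come from the gold standard mapping and a binning file.
assignments = [
    ("c1", "gA", "bin1", 1000),
    ("c2", "gA", "bin1", 500),
    ("c3", "gB", "bin1", 100),
    ("c4", "gB", "bin2", 800),
]

bin_genome_bp = defaultdict(int)  # base pairs of each genome inside each bin
bin_bp = defaultdict(int)         # total base pairs per bin
genome_bp = defaultdict(int)      # total base pairs per genome
for _seq, genome, bin_id, length in assignments:
    bin_genome_bp[(bin_id, genome)] += length
    bin_bp[bin_id] += length
    genome_bp[genome] += length

purity, completeness = {}, {}
for bin_id in bin_bp:
    # Map each bin to the genome contributing the most base pairs to it.
    genome = max(genome_bp, key=lambda g: bin_genome_bp.get((bin_id, g), 0))
    purity[bin_id] = bin_genome_bp[(bin_id, genome)] / bin_bp[bin_id]
    completeness[bin_id] = bin_genome_bp[(bin_id, genome)] / genome_bp[genome]

avg_purity = sum(purity.values()) / len(purity)
avg_completeness = sum(completeness.values()) / len(completeness)
```

Here bin1 contains 1500 bp of genome gA and 100 bp of gB, so its purity is 1500/1600 and its completeness (for gA) is 1.0; avg_purity and avg_completeness then average these per-bin values, matching the avg_* metrics above.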


usage: amber.py [-h] -g GOLD_STANDARD_FILE [-f FASTA_FILE] [-l LABELS]
                [-p FILTER] [-r REMOVE_GENOMES] [-k KEYWORD] -o OUTPUT_DIR
                [-m] [-x MIN_COMPLETENESS] [-y MAX_CONTAMINATION]
                bin_files [bin_files ...]

Compute all metrics and figures for one or more binning files; output summary
to screen and results per binning file to chosen directory

positional arguments:
  bin_files             Binning files

optional arguments:
  -h, --help            show this help message and exit
  -g GOLD_STANDARD_FILE, --gold_standard_file GOLD_STANDARD_FILE
                        Gold standard - ground truth - file
  -f FASTA_FILE, --fasta_file FASTA_FILE
                        FASTA or FASTQ file with sequences of gold standard
                        (required if gold standard file misses column _LENGTH)
  -l LABELS, --labels LABELS
                        Comma-separated binning names
  -p FILTER, --filter FILTER
                        Filter out [FILTER]% smallest bins (default: 0)
  -r REMOVE_GENOMES, --remove_genomes REMOVE_GENOMES
                        File with list of genomes to be removed
  -k KEYWORD, --keyword KEYWORD
                        Keyword in the second column of file with list of
                        genomes to be removed (no keyword=remove all genomes
                        in list)
  -o OUTPUT_DIR, --output_dir OUTPUT_DIR
                        Directory to write the results to
  -m, --map_by_completeness
                        Map genomes to bins by maximizing completeness
  -x MIN_COMPLETENESS   Comma-separated list of min. completeness thresholds
                        (default %: 50,70,90)
  -y MAX_CONTAMINATION  Comma-separated list of max. contamination thresholds
                        (default %: 10,5)


python3 amber.py -g test/gsa_mapping.binning \
-l "MaxBin 2.0, CONCOCT, MetaBAT" \
-p 1 \
-r test/unique_common.tsv \
-k "circular element" \
test/naughty_carson_2 \
test/goofy_hypatia_2 \
test/elated_franklin_0 \
-o output_dir/


tool       avg_purity std_dev_purity sem_purity avg_completeness std_dev_completeness sem_completeness avg_purity_per_bp avg_completeness_per_bp rand_index_by_bp rand_index_by_seq a_rand_index_by_bp a_rand_index_by_seq percent_assigned_bps accuracy >0.5compl<0.1cont >0.7compl<0.1cont >0.9compl<0.1cont >0.5compl<0.05cont >0.7compl<0.05cont >0.9compl<0.05cont
MaxBin 2.0 0.948      0.095          0.016      0.799            0.364                0.058            0.934             0.838                   0.995            0.951             0.917              0.782               0.864                0.807    28                28                24                23                 23                 21
CONCOCT    0.837      0.266          0.052      0.517            0.476                0.069            0.684             0.936                   0.972            0.946             0.644              0.751               0.967                0.661    18                17                15                16                 16                 14
MetaBAT    0.822      0.256          0.047      0.57             0.428                0.065            0.724             0.825                   0.976            0.965             0.674              0.860               0.917                0.664    17                16                12                17                 16                 12

Directory output_dir will contain:

  • summary.tsv: contains the same table as the output above with tab-separated values
  • summary.html: HTML page with results summary and interactive graphs
  • avg_purity_completeness.png + .pdf: figure of average purity vs. average completeness
  • avg_purity_completeness_per_bp.png + .pdf: figure of purity vs. completeness per base pair
  • ari_vs_assigned_bps.png + .pdf: figure of adjusted Rand index weighted by number of base pairs vs. percentage of assigned base pairs
  • rankings.txt: tools sorted by average purity, average completeness, and sum of average purity and average completeness
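Since summary.tsv is tab-separated, it can be processed with standard tooling. The sketch below parses a hypothetical excerpt of such a file (only three of the columns, with the values from the example output above) and ranks the tools by the sum of average purity and average completeness, one of the orderings reported in rankings.txt:

```python
import csv
import io

# Hypothetical excerpt of summary.tsv (tab-separated, columns as above).
summary_tsv = (
    "tool\tavg_purity\tavg_completeness\n"
    "MaxBin 2.0\t0.948\t0.799\n"
    "CONCOCT\t0.837\t0.517\n"
    "MetaBAT\t0.822\t0.57\n"
)

rows = list(csv.DictReader(io.StringIO(summary_tsv), delimiter="\t"))
# Rank tools by the sum of average purity and average completeness.
ranked = sorted(
    rows,
    key=lambda r: float(r["avg_purity"]) + float(r["avg_completeness"]),
    reverse=True,
)
print([r["tool"] for r in ranked])
# → ['MaxBin 2.0', 'MetaBAT', 'CONCOCT']
```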

In the same directory, subdirectories naughty_carson_2, goofy_hypatia_2, and elated_franklin_0 will be created with the following files:

  • rand_index.tsv: contains value of (adjusted) Rand index and percentage of assigned/binned bases. Rand index is both weighted and unweighted by base pairs
  • purity_completeness.tsv: contains purity and completeness per genome bin
  • purity_completeness_avg.tsv: contains purity and completeness averaged over genome bins. Includes standard deviation and standard error of the mean
  • purity_completeness_by_bpcount.tsv: contains purity and completeness weighted by base pairs
  • heatmap.png + .pdf: heatmap representing base pair assignments to predicted bins vs. their true origins from the underlying genomes
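The bp-weighted adjusted Rand index in rand_index.tsv can be understood by treating every base pair as one element of the two clusterings being compared (genomes vs. predicted bins). The sketch below applies the standard adjusted Rand index formula to a bp-weighted contingency table built from hypothetical assignments; it illustrates the idea rather than reproducing AMBER's code:

```python
from collections import defaultdict
from math import comb

# Hypothetical (true_genome, predicted_bin, length_in_bp) triples.
assignments = [("gA", "bin1", 1000), ("gA", "bin1", 500), ("gB", "bin2", 800)]

# Contingency table of base pair counts: genome x bin.
contingency = defaultdict(int)
for genome, bin_id, length in assignments:
    contingency[(genome, bin_id)] += length

genome_bp, bin_bp, total = defaultdict(int), defaultdict(int), 0
for (genome, bin_id), bp in contingency.items():
    genome_bp[genome] += bp
    bin_bp[bin_id] += bp
    total += bp

# Standard adjusted Rand index over pairs of base pairs.
index = sum(comb(bp, 2) for bp in contingency.values())
sum_genomes = sum(comb(bp, 2) for bp in genome_bp.values())
sum_bins = sum(comb(bp, 2) for bp in bin_bp.values())
expected = sum_genomes * sum_bins / comb(total, 2)
max_index = (sum_genomes + sum_bins) / 2
ari = (index - expected) / (max_index - expected)
```

In this example each genome ends up entirely in its own bin, so the binning agrees perfectly with the gold standard and the adjusted Rand index is 1.0.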

For a complete list of tools, see

Run AMBER as a Biobox

Build and run the AMBER docker image with the commands:

docker build -t cami/amber:latest .
docker run -v $(pwd)/input/gold_standard.fasta:/bbx/input/gold_standard.fasta -v $(pwd)/input/gsa_mapping.binning:/bbx/input/gsa_mapping.binning  -v  $(pwd)/input/test_query.binning:/bbx/input/test_query.binning  -v  $(pwd)/output:/bbx/output -v $(pwd)/input/biobox.yaml:/bbx/input/biobox.yaml cami/amber:latest default

where biobox.yaml contains the following:

version: 0.11.0
arguments:
  - fasta:
      value: /bbx/input/gold_standard.fasta
      type: contig
  - labels:
      value: /bbx/input/gsa_mapping.binning
      type: binning
  - predictions:
      value: /bbx/input/test_query.binning
      type: binning

Developer Guide

We are using tox for project automation.


If you want to run tests, just type tox in the project's root directory:

tox
You can use all libraries that AMBER depends on by activating tox's virtual environment with the command:

source <project_directory>/.tox/py35/bin/activate

Update GitHub page

To update the GitHub page, modify the file index.html.

Make a Release

If the dev branch is merged into the master branch:

  1. Update the version number according to semantic versioning on the dev branch.

  2. Merge the dev branch into the master branch.

  3. Make a release on GitHub with the same version number provided in .

  4. Create package and upload it to PyPI:

python3 setup.py sdist bdist_wheel
twine upload dist/*


Please cite AMBER as:

  • Fernando Meyer, Peter Hofmann, Peter Belmann, Ruben Garrido-Oter, Adrian Fritz, Alexander Sczyrba, and Alice C. McHardy. (2018). AMBER: Assessment of Metagenome BinnERs. GigaScience, giy069. doi:10.1093/gigascience/giy069

The metrics implemented in AMBER were used and described in the CAMI manuscript, so you may also cite:

  • Sczyrba, Hofmann, Belmann, et al. (2017). Critical Assessment of Metagenome Interpretation—a benchmark of metagenomics software. Nature Methods, 14, 11:1063–1071. doi:10.1038/nmeth.4458