4less/benchpro

Benchpro

Benchpro is an automated pipeline of Python and R scripts for benchmarking tools for metagenomic profiling and strain-level analysis. Reconstructing the taxonomic composition of microbiomes is a fundamental task in metagenomics and strongly affects the robustness of downstream statistics (numerical ecology). Errors include both missed taxa caused by a lack of sensitivity and wrong results caused by false-positive detections. Because tool performance varies across environments, thorough benchmarking is needed to help users make an informed decision. In addition, existing benchmarks of taxonomic profilers do not account for the noise that differing taxonomies and phylogenies introduce. Benchpro corrects for this error signal to give a more accurate picture of each tool's performance.

For species-level profiling, Benchpro provides plots and statistics for binary classification (assessment of TP, FP and FN), metrics that measure how well abundances are reconstructed, and abundance thresholds for detecting taxa. In addition, corrected benchmarks, reported as the separate rank "Species Adj", show each tool's adjusted performance in a GTDB context. The following picture shows the different error types in a tree context for the predicted and gold-standard taxonomic profiles of a single sample. Both profiles use the GTDB taxonomy and are visualized with respect to their classification (TP, FP, TN) and whether the profiler's database contains the exact GTDB taxon.

[Figure: error_types]
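To make the binary-classification and abundance metrics more concrete, here is a minimal Python sketch. It is not Benchpro's implementation. The function names (`binary_stats`, `l1_distance`), the dictionary-based profile representation, the `threshold` parameter and the species labels are all assumptions chosen for illustration, and the L1 distance is only one possible abundance metric.

```python
from typing import Dict, NamedTuple


class BinaryStats(NamedTuple):
    tp: int
    fp: int
    fn: int
    precision: float
    recall: float
    f1: float


def binary_stats(gold: Dict[str, float],
                 pred: Dict[str, float],
                 threshold: float = 0.0) -> BinaryStats:
    """Compare the taxa present in a gold-standard and a predicted profile.

    Profiles map a taxon name to its relative abundance (hypothetical format).
    A predicted taxon counts as detected only if its abundance exceeds
    `threshold`, which mimics an abundance cutoff for detection.
    """
    gold_taxa = {taxon for taxon, abundance in gold.items() if abundance > 0}
    pred_taxa = {taxon for taxon, abundance in pred.items() if abundance > threshold}

    tp = len(gold_taxa & pred_taxa)   # present in both profiles
    fp = len(pred_taxa - gold_taxa)   # predicted but not in the gold standard
    fn = len(gold_taxa - pred_taxa)   # in the gold standard but missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return BinaryStats(tp, fp, fn, precision, recall, f1)


def l1_distance(gold: Dict[str, float], pred: Dict[str, float]) -> float:
    """Sum of absolute abundance differences over all taxa in either profile."""
    taxa = set(gold) | set(pred)
    return sum(abs(gold.get(t, 0.0) - pred.get(t, 0.0)) for t in taxa)


# Made-up profiles for a single sample (labels are placeholders, not real taxa).
gold_profile = {"s__Species_A": 0.5, "s__Species_B": 0.3, "s__Species_C": 0.2}
pred_profile = {"s__Species_A": 0.6, "s__Species_B": 0.35, "s__Species_D": 0.05}

print(binary_stats(gold_profile, pred_profile))
print(binary_stats(gold_profile, pred_profile, threshold=0.1))
print(l1_distance(gold_profile, pred_profile))
```

Raising `threshold` removes the low-abundance false positive in this toy example while leaving recall unchanged, which is the kind of trade-off an abundance threshold for detecting taxa is meant to expose.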

The following screenshot shows the Benchpro HTML output.

[Figure: Screenshot from 2024-05-06 19-30-42]

Installation

ToDo

Run example

ToDo

Benchmark your own tool

ToDo

About

Benchpro is an alternative to OPAL that also accepts GTDB-style taxonomy.
