Benchpro is an automated pipeline of Python and R scripts for benchmarking tools for metagenomic profiling and strain-level analysis. Reconstructing the taxonomic composition of microbiomes is a fundamental task in metagenomics, and its accuracy has a large downstream effect on the robustness of statistical analyses (numerical ecology). Errors include both missed taxa caused by a lack of sensitivity and wrong results caused by false-positive detections. Because tool performance varies across environments, thorough benchmarking is needed to help users make an informed decision. In addition, existing benchmarks of taxonomic profilers do not account for the noise that different taxonomies and phylogenies introduce. Benchpro corrects for this error signal and reports benchmarking performance with that noise removed.
For species-level profiling, Benchpro provides plots and statistics for binary classification (assessment of TP, FP, FN), metrics that measure how well abundances are reconstructed, and abundance thresholds for detecting taxa. In addition, corrected benchmarks, reported as the separate rank "Species Adj", show each tool's adjusted performance in a GTDB context. The following picture shows the different error types in a tree context for the predicted and gold-standard taxonomic profiles of a single sample. Both the gold-standard and predicted profiles use the GTDB taxonomy and are visualized with respect to their classification (TP, FP, TN) and whether the profiler's database contains the exact GTDB taxon.
This screenshot shows the Benchpro HTML output.
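To make the species-level metrics described above concrete, here is a minimal Python sketch of how TP, FP, and FN counts, an abundance threshold, and an abundance error could be computed for one sample. It is an illustration only and not Benchpro's actual code: the function names `classify` and `l1_distance`, the dictionary-based profile format, and the taxon names and abundances are all assumptions made for this example.

```python
from typing import Dict

# Hypothetical profile format: taxon name -> relative abundance (sums to ~1).
Profile = Dict[str, float]


def classify(gold: Profile, pred: Profile, threshold: float = 0.0) -> Dict[str, float]:
    """Count TP/FP/FN, ignoring predicted taxa at or below `threshold`."""
    gold_taxa = {taxon for taxon, abundance in gold.items() if abundance > 0}
    pred_taxa = {taxon for taxon, abundance in pred.items() if abundance > threshold}

    tp = len(gold_taxa & pred_taxa)  # present in both profiles
    fp = len(pred_taxa - gold_taxa)  # predicted, but not in the gold standard
    fn = len(gold_taxa - pred_taxa)  # in the gold standard, but missed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


def l1_distance(gold: Profile, pred: Profile) -> float:
    """Sum of absolute abundance differences over all taxa in either profile."""
    taxa = set(gold) | set(pred)
    return sum(abs(gold.get(t, 0.0) - pred.get(t, 0.0)) for t in taxa)


# Invented example data (not from any real benchmark).
gold = {"s__Species_A": 0.5, "s__Species_B": 0.3, "s__Species_C": 0.2}
pred = {"s__Species_A": 0.6, "s__Species_B": 0.35, "s__Species_D": 0.05}

print(classify(gold, pred))                  # no abundance threshold
print(classify(gold, pred, threshold=0.1))   # drops the low-abundance false positive
print(l1_distance(gold, pred))
```

In this toy sample, raising the threshold to 0.1 removes the single low-abundance false positive without losing any true positive, which is the kind of trade-off an abundance threshold is meant to expose.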
ToDo
ToDo
ToDo