Systems Biology Model AnnoTation Evaluator
SBMate evaluates the quality of annotations in SBML model elements, especially libsbml.Model, libsbml.Species, libsbml.Compartment, and libsbml.Reaction. It currently examines annotations from five knowledge resources: CHEBI, GO, KEGG, SBO, and UNIPROT.
SBMate calculates three metrics:
- Coverage checks how many model elements of the above four types (model, reaction, species, and compartment) are actually annotated.
- Consistency computes how many of these annotated entities have proper annotations. For example, a reaction object should not have a GO cellular component term (GO:0005575 or its children). SBMate identifies such instances and calculates the proportion of model entities whose annotations are consistent.
- Finally, specificity is a measure of how 'precise' such consistent annotations are. This is obtained by utilizing the hierarchical structures of knowledge resource terms, such as the directed acyclic graphs of SBO, GO and CHEBI.
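To make the three metrics concrete, here is a minimal sketch on toy data. It is not SBMate's implementation: the EX: term identifiers, the tiny parent map, the single consistency rule, and the depth-based specificity proxy are all assumptions made for illustration. Only GO:0005575 comes from the description above.

```python
# Toy illustration only; this does not call or reproduce SBMate.
# Hypothetical mini-hierarchy: term -> parent term (None marks a root).
PARENT = {
    "GO:0005575": None,          # cellular component root named in the text
    "EX:0000001": "GO:0005575",  # hypothetical child term
    "EX:0000002": "EX:0000001",  # hypothetical grandchild term
    "EX:0000010": None,          # hypothetical root of another branch
    "EX:0000011": "EX:0000010",  # hypothetical child term
}
COMPONENT_ROOT = "GO:0005575"

# (element id, element type, annotation terms); all values are made up.
ELEMENTS = [
    ("c1", "compartment", ["EX:0000002"]),
    ("r1", "reaction", ["EX:0000011"]),
    ("r2", "reaction", ["EX:0000001"]),  # component term on a reaction
    ("s1", "species", []),               # not annotated
]


def ancestors(term):
    """Return the term followed by all of its ancestors, root last."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT[term]
    return chain


def is_consistent(kind, terms):
    """Toy rule: a reaction must not carry a cellular component term."""
    if kind == "reaction":
        return all(COMPONENT_ROOT not in ancestors(t) for t in terms)
    return True


def depth(term):
    """Number of steps from the term up to its root."""
    return len(ancestors(term)) - 1


annotated = [e for e in ELEMENTS if e[2]]
consistent = [e for e in annotated if is_consistent(e[1], e[2])]

coverage = len(annotated) / len(ELEMENTS)
consistency = len(consistent) / len(annotated)
# Assumed proxy: mean hierarchy depth of the terms on consistent elements.
depths = [depth(t) for _, _, terms in consistent for t in terms]
specificity = sum(depths) / len(depths)

print(coverage, consistency, specificity)
```

Deeper terms count as more specific in this sketch. SBMate's actual specificity calculation may weigh the hierarchy differently.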
More detailed discussions can be found in our manuscript (in preparation).
SBMate is easy to use because there is just one main method, sbmate.AnnotationMetrics.getMetrics.
By default, SBMate produces a report summarizing the three scores:
Another option is to create a pandas DataFrame, as below:
The result is a pandas DataFrame of the metric values.
You can add metrics of your own by creating a class that calculates them.
Metric values are contained in a pandas DataFrame.
See metric_calculator.py for how to write a class that calculates metrics.
When you construct AnnotationMetrics, pass your class through the keyword argument metric_calculator_classes of the constructor.
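The plug-in pattern can be sketched with stand-in classes. Everything below is hypothetical: ToyAnnotationMetrics, the two calculator classes, the calculate method name, and the constructor arguments are invented for illustration and are not the sbmate interface. Only the keyword name metric_calculator_classes is taken from the text above. A plain dict stands in for the pandas DataFrame to keep the sketch dependency-free. Consult metric_calculator.py for the real base class and method names.

```python
# Hypothetical stand-ins; these are NOT the sbmate classes or their real API.
class AnnotatedCountCalculator:
    """Counts elements that carry at least one annotation."""

    def __init__(self, annotations):
        self.annotations = annotations  # element id -> list of term ids

    def calculate(self):
        count = sum(1 for terms in self.annotations.values() if terms)
        return {"annotated_entities": count}


class MeanTermsCalculator:
    """Averages the number of terms over annotated elements."""

    def __init__(self, annotations):
        self.annotations = annotations

    def calculate(self):
        annotated = [t for t in self.annotations.values() if t]
        if not annotated:
            return {"mean_terms_per_entity": 0.0}
        total = sum(len(t) for t in annotated)
        return {"mean_terms_per_entity": total / len(annotated)}


class ToyAnnotationMetrics:
    """Stand-in container that runs every calculator class it is given."""

    def __init__(self, annotations, metric_calculator_classes=None):
        self.values = {}
        for calculator_class in metric_calculator_classes or []:
            # Each calculator contributes one or more named metric values.
            self.values.update(calculator_class(annotations).calculate())


# Made-up annotations with hypothetical term identifiers.
toy_annotations = {"s1": ["EX:1", "EX:2"], "s2": [], "r1": ["EX:3"]}

toy = ToyAnnotationMetrics(
    toy_annotations,
    metric_calculator_classes=[AnnotatedCountCalculator, MeanTermsCalculator],
)
print(toy.values)
```

Passing classes rather than instances lets the container decide when and with what data each calculator is built, which is one plausible reason for a keyword argument of this shape.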