# Validating models (e.g., CellML, SBML files)

This tutorial illustrates how to check whether model files are consistent with the specifications of their associated formats.

BioSimulators currently supports several model languages, including:
* [BioNetGen Language (BNGL)](https://bionetgen.org)
* [CellML](https://cellml.org): 1.0 and 2.0 (validation for 1.1 is not available)
* [NeuroML](https://neuroml.org/)
* [Low Entropy Model Specification (LEMS)](https://lems.github.io/LEMS/)
* [Smoldyn simulation configurations](http://www.smoldyn.org/)
* [Systems Biology Markup Language (SBML)](http://sbml.org), including all packages and versions
* [XML format for Resource Balance Analysis (RBA) models](https://github.com/SysBioInra/RBApy/blob/master/docs/XML_format%20(RBApy.xml).pdf)
* [XPP ODE format](http://www.math.pitt.edu/~bard/xpp/help/xppodes.html)

<div class="alert alert-block alert-info">
    BioSimulators integrates community-contributed validators for each model language. For some model languages, these validators provide limited validation and/or limited reports of errors. We welcome contributions of improved validation tools.
</div>

## 1. Validate a model online

The easiest way to validate models is to use the web interface at https://run.biosimulations.org. An HTTP API for validating models is also available at [https://combine.api.biosimulations.org](https://combine.api.biosimulations.org/).

## 2. Validate a model with the BioSimulators command-line application

First, install [BioSimulators-utils](https://github.com/biosimulators/Biosimulators_utils). Installation instructions are available at [https://docs.biosimulators.org](https://docs.biosimulators.org/Biosimulators_utils). Note that BioSimulators-utils must be installed with the installation options for the model languages that you wish to validate. A Docker image with BioSimulators-utils and all dependencies is also available ([`ghcr.io/biosimulators/biosimulators`](https://github.com/biosimulators/Biosimulators/pkgs/container/biosimulators)).

Inline help for the `biosimulators-utils` command-line program is available by running the program with the `--help` option.

In [1]:
!biosimulators-utils --help

usage: biosimulators-utils [-h] [-d] [-q] [-v]
                           {convert,exec,validate-project,validate-metadata,validate-simulation,validate-model,build-project}
                           ...

Utilities for working with containerized biosimulation tools

optional arguments:
  -h, --help            show this help message and exit
  -d, --debug           full application debug mode
  -q, --quiet           suppress all console output
  -v, --version         show program's version number and exit

sub-commands:
  {convert,exec,validate-project,validate-metadata,validate-simulation,validate-model,build-project}
    convert             Convert files among formats
    exec                Execute a model project (COMBINE/OMEX archive)
    validate-project    Validate a model project (COMBINE/OMEX archive)
    validate-metadata   Validate metadata (OMEX Metadata file)
    validate-simulation
                        Validate a simulation experiment (SED-ML file)
    validate-model   

In [2]:
!biosimulators-utils validate-model --help

usage: biosimulators-utils validate-model [-h] language filename

Validate a model (e.g., CellML, SBML file)

positional arguments:
  language    Model language (`BNGL`, `CellML`, `LEMS`, `RBA`, `SBML`,
              `Smoldyn`, or `XPP`)
  filename    Path to model

optional arguments:
  -h, --help  show this help message and exit


Next, use the command-line program to validate the [model](../_data/Ciliberto-J-Cell-Biol-2003-morphogenesis-checkpoint-continuous.xml).

In [3]:
!biosimulators-utils validate-model SBML ../_data/Ciliberto-J-Cell-Biol-2003-morphogenesis-checkpoint-continuous.xml

  - The value of the 'sboTerm' attribute on a <species> is expected to be an SBO identifier (http://www.biomodels.net/SBO/). In SBML Level 2 prior to Version 4 it is expected to refer to a participant physical type (i.e., terms derived from SBO:0000236, "participant physical type"); in Versions 4 and above it is expected to refer to a material entity (i.e., terms derived from SBO:0000240, "material entity").
    Reference: L2V4 Section 5
     SBO term 'SBO:0000014' on the <species> is not in the appropriate branch.
    
  - The value of the 'sboTerm' attribute on a <species> is expected to be an SBO identifier (http://www.biomodels.net/SBO/). In SBML Level 2 prior to Version 4 it is expected to refer to a participant physical type (i.e., terms derived from SBO:0000236, "participant physical type"); in Versions 4 and above it is expected to refer to a material entity (i.e., terms derived from SBO:0000240, "material entity").
    Reference: L2V4 Section 5
     SBO term 'SBO:0000236' on t

If the model is invalid, a list of errors will be printed to your console.
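
The same command can also be driven from a script, for example to check a model as part of an automated build. The sketch below is a minimal illustration rather than part of BioSimulators-utils: the helper names `build_validate_command` and `run_validation` are hypothetical, the list of language keys is copied from the help text above, and treating a non-zero exit code as "invalid" is an assumption that this tutorial does not confirm.

```python
import shutil
import subprocess

# Language keys accepted by `validate-model`, copied from the help text above.
CLI_LANGUAGES = ('BNGL', 'CellML', 'LEMS', 'RBA', 'SBML', 'Smoldyn', 'XPP')


def build_validate_command(language, filename):
    """Return the argument list for `biosimulators-utils validate-model`."""
    if language not in CLI_LANGUAGES:
        raise ValueError(f'Unsupported language: {language!r}')
    return ['biosimulators-utils', 'validate-model', language, filename]


def run_validation(language, filename):
    """Run the validator and return (exit code, combined console output).

    Assumption: a non-zero exit code signals an invalid model. The tutorial
    only states that errors are printed to the console.
    """
    if shutil.which('biosimulators-utils') is None:
        raise RuntimeError('biosimulators-utils is not installed or not on PATH')
    result = subprocess.run(
        build_validate_command(language, filename),
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode the output as text
        check=False,          # inspect the exit code ourselves
    )
    return result.returncode, result.stdout + result.stderr
```

Calling `run_validation('SBML', '../_data/Ciliberto-J-Cell-Biol-2003-morphogenesis-checkpoint-continuous.xml')` would then return the exit code together with the messages that the command prints.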

## 3. Validate a model programmatically with Python

First, install [BioSimulators-utils](https://github.com/biosimulators/Biosimulators_utils). Installation instructions are available at [https://docs.biosimulators.org](https://docs.biosimulators.org/Biosimulators_utils). Note that BioSimulators-utils must be installed with the installation options for the model languages that you wish to validate. A Docker image with BioSimulators-utils and all dependencies is also available ([`ghcr.io/biosimulators/biosimulators`](https://github.com/biosimulators/Biosimulators/pkgs/container/biosimulators)).

Next, import the BioSimulators-utils enumeration of model languages and its model validation function.

In [4]:
from biosimulators_utils.sedml.data_model import ModelLanguage
from biosimulators_utils.sedml.validation import validate_model_with_language

This enumeration can be inspected to determine the key for each model language.

In [5]:
print('\n'.join(sorted('ModelLanguage.' + lang for lang in ModelLanguage.__members__.keys())))

ModelLanguage.BNGL
ModelLanguage.CellML
ModelLanguage.CopasiML
ModelLanguage.GINML
ModelLanguage.HOC
ModelLanguage.Kappa
ModelLanguage.LEMS
ModelLanguage.MASS
ModelLanguage.MorpheusML
ModelLanguage.NeuroML
ModelLanguage.RBA
ModelLanguage.SBML
ModelLanguage.Smoldyn
ModelLanguage.VCML
ModelLanguage.XPP
ModelLanguage.ZGINML
ModelLanguage.pharmML


Next, use the `validate_model_with_language` function to check the validity of a model file and retrieve lists of errors and warnings, along with information about the model.

In [6]:
model_filename = '../_data/Ciliberto-J-Cell-Biol-2003-morphogenesis-checkpoint-continuous.xml'
model_language = ModelLanguage.SBML
errors, warnings, model = validate_model_with_language(model_filename, model_language)
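
When several model files need to be checked, the same call can be wrapped in a loop. The sketch below is a minimal illustration: `validate_many` and `invalid_files` are hypothetical helpers, not part of BioSimulators-utils. The validator is passed in as an argument, on the assumption that it is called as `validate(filename, language)` and returns `(errors, warnings, model)` like the call above.

```python
def validate_many(filenames, language, validate):
    """Validate each file and return {filename: (errors, warnings)}.

    `validate` is any callable with the same call shape as
    `validate_model_with_language(filename, language)` that returns
    `(errors, warnings, model)`.
    """
    report = {}
    for filename in filenames:
        errors, warnings, _model = validate(filename, language)
        report[filename] = (errors, warnings)
    return report


def invalid_files(report):
    """Return the filenames whose error list is non-empty, sorted."""
    return sorted(name for name, (errors, _warnings) in report.items() if errors)
```

For example, `validate_many([model_filename], ModelLanguage.SBML, validate_model_with_language)` would produce a one-entry report for the model used in this tutorial. Passing the validator as a parameter also makes it possible to exercise the helpers with a stub when BioSimulators-utils is not installed.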

The first and second outputs (`errors` and `warnings`) are nested lists of error and warning messages. Next, use the `flatten_nested_list_of_strings` function to convert them into human-readable messages.

In [7]:
from biosimulators_utils.utils.core import flatten_nested_list_of_strings
from warnings import warn

if warnings:
    warn(flatten_nested_list_of_strings(warnings), UserWarning)

if errors:
    raise ValueError(flatten_nested_list_of_strings(errors))

  Reference: L2V4 Section 5
   SBO term 'SBO:0000014' on the <species> is not in the appropriate branch.
  
- The value of the 'sboTerm' attribute on a <species> is expected to be an SBO identifier (http://www.biomodels.net/SBO/). In SBML Level 2 prior to Version 4 it is expected to refer to a participant physical type (i.e., terms derived from SBO:0000236, "participant physical type"); in Versions 4 and above it is expected to refer to a material entity (i.e., terms derived from SBO:0000240, "material entity").
  Reference: L2V4 Section 5
   SBO term 'SBO:0000236' on the <species> is not in the appropriate branch.
  
- The value of the 'sboTerm' attribute on a <species> is expected to be an SBO identifier (http://www.biomodels.net/SBO/). In SBML Level 2 prior to Version 4 it is expected to refer to a participant physical type (i.e., terms derived from SBO:0000236, "participant physical type"); in Versions 4 and above it is expected to refer to a material entity (i.e., terms derived fr
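
To make the idea of a nested list of messages more concrete, the sketch below renders one as an indented bullet list. It is a stand-in written for illustration, not the library's implementation. `flatten_messages` is a hypothetical name, the example messages are invented, and the layout (each entry is a list whose first item is a message and whose optional second item is a list of more detailed entries) is an assumption about the structure.

```python
def flatten_messages(nested, prefix='- ', indent='  ', _depth=0):
    """Render a nested list of messages as an indented bullet list."""
    lines = []
    for entry in nested:
        # The first item of each entry is the message itself.
        lines.append(indent * _depth + prefix + entry[0])
        # An optional second item holds more detailed, nested entries.
        if len(entry) > 1:
            lines.append(flatten_messages(entry[1], prefix, indent, _depth + 1))
    return '\n'.join(lines)


# Invented messages, used only to show the assumed structure.
example = [
    ['The model file is invalid.', [
        ['Species `A` has no compartment.'],
        ['Reaction `r1` has no kinetic law.'],
    ]],
    ['The model has no name.'],
]
print(flatten_messages(example))
```

Each level of nesting adds one level of indentation, which matches the shape of the bulleted messages shown above.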

The third output of `validate_model_with_language` (`model`) contains information about the model. The type of this output depends on the model language. For SBML, this output is an instance of `libsbml.SBMLDocument`.

In [8]:
model.__class__

libsbml.SBMLDocument

`get_parameters_variables_outputs_for_simulation` uses this third output to identify the inputs (e.g., constants, initial conditions) and outputs (observables, such as concentrations of species and velocities of reactions, that could be recorded from simulations) of models. See the [model introspection tutorial](../1.%20Introspecting%20models/Introspecting%20models.ipynb) for more information.