# Checking Model Quality

This notebook example demonstrates the various methods for ensuring quality and consistency in models. Here, the functions of the `qcqa` submodule are used to inspect a broken model and identify the issues that need attention.

In [1]:
import mass.test

from mass import MassConfiguration
from mass.util import qcqa

model = mass.test.create_test_model("Model_To_Repair")



## Inspecting a Model

To quickly identify all issues in a model, the `qcqa_model` function of the `qcqa` submodule can be used. The function takes a `MassModel` and booleans for various flags, identifies issues in the model based on the flags, and prints a report outlining possible issues.

In [2]:
qcqa.qcqa_model(
    model,
    parameters=True,        # Check for undefined but necessary parameters in the model
    concentrations=True,    # Check for undefined but necessary concentrations in the model
    fluxes=True,            # Check for undefined steady state fluxes for reactions in the model
    superfluous=True,       # Check for excess parameters and ensure they are consistent.
    elemental=True,         # Check mass and charge balancing of reactions in the model
    simulation_only=True,   # Check for values necessary for simulation only
)

╒═══════════════════════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                                             │
│ SIMULATABLE: False                                            │
│ PARAMETERS NUMERICALY CONSISTENT: False                       │
╞═══════════════════════════════════════════════════════════════╡
│                      MISSING PARAMETERS                       │
│ Reaction Parameters    Custom Parameters    S.S. Fluxes       │
│ ---------------------  -------------------  -------------     │
│ PGI: Keq; kf           PFK_R01: Keq_PFK_A   GAPD              │
│ PGK: kf                PFK_R11: Keq_PFK_A                     │
│ PGM: Keq               PFK_R21: Keq_PFK_A                     │
│                        PFK_R31: Keq_PFK_A                     │
│                        PFK_R41: Keq_PFK_A                     │
├───────────────────────────────────────────────────────────────┤
│                    MISSING CONCENTRATIONS                     │
│ Initial 

Setting the `simulation_only` flag to `True` ensures that the missing values identified in the report (excluding steady state fluxes) are those necessary for simulation. As seen above, there are a number of missing values and consistency issues that need to be addressed.

## Identifying Missing Values

The report printed by the `qcqa_model` function shows that there are a number of values in the model that have not yet been defined. Here, the functions of the `qcqa` submodule will be used to retrieve the objects in the model that are missing values so that they can be defined.

### Missing Parameters

To identify the reactions that have missing parameter values, the `parameters` flag is set as `True`. Reaction parameters for mass action rate laws (e.g. forward and reverse rate constants, equilibrium constants) and custom parameters are both checked for undefined numerical values.

In [3]:
qcqa.qcqa_model(model, parameters=True)

╒══════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                            │
│ SIMULATABLE: False                           │
│ PARAMETERS NUMERICALY CONSISTENT: False      │
╞══════════════════════════════════════════════╡
│             MISSING PARAMETERS               │
│ Reaction Parameters    Custom Parameters     │
│ ---------------------  -------------------   │
│ PGI: Keq; kf           PFK_R01: Keq_PFK_A    │
│ PGK: kf                PFK_R11: Keq_PFK_A    │
│ PGM: Keq               PFK_R21: Keq_PFK_A    │
│                        PFK_R31: Keq_PFK_A    │
│                        PFK_R41: Keq_PFK_A    │
╘══════════════════════════════════════════════╛


The report shows that the PGI, PGK, and PGM reactions are missing numerical values for forward rate and equilibrium constants. The `get_missing_reaction_parameters` function can be used to get these reaction objects from the model:

In [4]:
qcqa.get_missing_reaction_parameters(model)

{<MassReaction PGI at 0x11adb9d90>: 'Keq; kf',
 <MassReaction PGK at 0x11e345950>: 'kf',
 <MassReaction PGM at 0x11e345bd0>: 'Keq'}

The `get_missing_reaction_parameters` function returns a `dict` that maps each reaction object to a string listing its missing parameters. To get a subset of these reactions, a list of reaction identifiers can be passed to the `reaction_list` argument. For example, to separate the reactions missing forward rate constants from those that are missing equilibrium constants:

In [5]:
missing_kfs = qcqa.get_missing_reaction_parameters(model, reaction_list=["PGI", "PGK"])
missing_Keqs = qcqa.get_missing_reaction_parameters(model, reaction_list=["PGI", "PGM"])

print("Missing forward rate constants: {0!r}".format(list(missing_kfs)))
print("Missing equilibrium constants: {0!r}".format(list(missing_Keqs)))

Missing forward rate constants: [<MassReaction PGI at 0x11adb9d90>, <MassReaction PGK at 0x11e345950>]
Missing equilibrium constants: [<MassReaction PGI at 0x11adb9d90>, <MassReaction PGM at 0x11e345bd0>]
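
The same split can also be derived from the returned strings instead of hard-coding the identifiers. The following is a minimal sketch, not part of the `qcqa` submodule: the helper `group_by_missing_parameter` is hypothetical, plain string identifiers stand in for the reaction objects, and the `"; "` separator is assumed from the output shown above.

```python
# Stand-in for the dict returned by get_missing_reaction_parameters.
# Plain string identifiers replace the reaction objects (assumption for this sketch).
missing = {"PGI": "Keq; kf", "PGK": "kf", "PGM": "Keq"}


def group_by_missing_parameter(missing):
    """Map each missing parameter type to the reactions that lack it (hypothetical helper)."""
    grouped = {}
    for reaction, parameters in missing.items():
        # Parameter types are assumed to be separated by "; " as in the printed output
        for parameter in parameters.split("; "):
            grouped.setdefault(parameter, []).append(reaction)
    return grouped


grouped = group_by_missing_parameter(missing)
print("Missing forward rate constants: {0!r}".format(grouped["kf"]))
print("Missing equilibrium constants: {0!r}".format(grouped["Keq"]))
```

With the real return value, the keys would be reaction objects rather than strings, but the grouping logic would stay the same.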


To identify missing custom parameters and the reactions that utilize them, the `get_missing_custom_parameters` function can be used.

In [6]:
qcqa.get_missing_custom_parameters(model)

{<EnzymeModuleReaction PFK_R01 at 0x11e3ca5d0>: 'Keq_PFK_A',
 <EnzymeModuleReaction PFK_R11 at 0x11e3cac10>: 'Keq_PFK_A',
 <EnzymeModuleReaction PFK_R21 at 0x11e3d47d0>: 'Keq_PFK_A',
 <EnzymeModuleReaction PFK_R31 at 0x11e3d42d0>: 'Keq_PFK_A',
 <EnzymeModuleReaction PFK_R41 at 0x11e3dcf10>: 'Keq_PFK_A'}

Once defined, the parameters no longer appear in the returned `dict`. An empty `dict` means no undefined values were found.

In [7]:
# Define missing parameters and update model
missing_parameters = {
    "kf_PGI": 2961.11, "Keq_PGI": 0.41,
    "kf_PGK": 1061655.085,
    "Keq_PGM": 0.147059,
    "Keq_PFK_A": 14.706}
model.update_parameters(missing_parameters)

print("Missing reaction parameters: {0!r}".format(qcqa.get_missing_reaction_parameters(model)))
print("Missing custom parameters: {0!r}".format(qcqa.get_missing_custom_parameters(model)))

Missing reaction parameters: {}
Missing custom parameters: {}


### Missing Fluxes

To identify the reactions that have missing steady state flux values, the `fluxes` flag is set as `True`.

In [8]:
qcqa.qcqa_model(model, fluxes=True)

╒══════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                            │
│ SIMULATABLE: False                           │
│ PARAMETERS NUMERICALY CONSISTENT: False      │
╞══════════════════════════════════════════════╡
│             MISSING PARAMETERS               │
│ S.S. Fluxes                                  │
│ -------------                                │
│ GAPD                                         │
╘══════════════════════════════════════════════╛


To get the reaction objects that are missing steady state fluxes, the `get_missing_steady_state_fluxes` function can be used. An empty `list` indicates no missing values were found.

In [9]:
missing_fluxes = qcqa.get_missing_steady_state_fluxes(model)
print("Before: {0!r}".format(missing_fluxes))

# Define missing flux value
missing_fluxes[0].steady_state_flux = 2.305

missing_fluxes = qcqa.get_missing_steady_state_fluxes(model)
print("After: {0!r}".format(missing_fluxes))

Before: [<MassReaction GAPD at 0x11e345350>]
After: []


### Missing Concentrations

To identify the metabolites that have missing concentrations, the `concentrations` flag is set as `True`. Metabolite concentrations refer to the initial conditions and the boundary conditions defined in the model.

In [10]:
qcqa.qcqa_model(model, concentrations=True)

╒══════════════════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                                        │
│ SIMULATABLE: False                                       │
│ PARAMETERS NUMERICALY CONSISTENT: False                  │
╞══════════════════════════════════════════════════════════╡
│                 MISSING CONCENTRATIONS                   │
│ Initial Conditions               Boundary Conditions     │
│ -------------------------------  ---------------------   │
│ glc__D_c (in HEX1, SK_glc__D_c)  h2o_b (in SK_h2o_c)     │
╘══════════════════════════════════════════════════════════╛


The `get_missing_initial_conditions` function can be used to return a list of metabolite objects that have undefined initial conditions:

In [11]:
missing_ics = qcqa.get_missing_initial_conditions(model)
print(missing_ics)

[<MassMetabolite glc__D_c at 0x11e31fc50>]


The `get_missing_boundary_conditions` function can be used to return a list of the boundary metabolites that have undefined boundary conditions:

In [12]:
qcqa.get_missing_boundary_conditions(model)

['h2o_b']

Once defined, the metabolites no longer appear in the returned `list`. An empty `list` means no undefined concentrations were found.

In [13]:
# Define missing initial condition
missing_ics[0].initial_condition = 1.3
# Define missing boundary condition
model.boundary_conditions["h2o_b"] = 1

# Check model to ensure they have been defined
print("Missing initial conditions: {0!r}".format(qcqa.get_missing_initial_conditions(model)))
print("Missing boundary conditions: {0!r}".format(qcqa.get_missing_boundary_conditions(model)))

Missing initial conditions: []
Missing boundary conditions: []


After defining the missing values, the report displayed by the `qcqa_model` function shows that the model is simulatable. However, the model parameters are not considered numerically consistent, which may present some problems during the simulation processes.

In [14]:
qcqa.qcqa_model(model, parameters=True, concentrations=True, fluxes=True)

╒═══════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                         │
│ SIMULATABLE: True                         │
│ PARAMETERS NUMERICALY CONSISTENT: False   │
╞═══════════════════════════════════════════╡
╘═══════════════════════════════════════════╛


## Consistency Checks

In addition to the undefined numerical values in the model, the initial report printed by the `qcqa_model` function also indicated some issues in parameter consistency and elemental balancing. Here, the functions of the `qcqa` submodule will be used to retrieve the objects in the model that have consistency issues so that they can be corrected.

### Elemental

To identify the reactions that are not elementally balanced, the `elemental` flag is set as `True`. Note that pseudoreactions are typically not balanced, and although boundary reactions are excluded by default, other pseudoreactions may exist in the system. In this model, the two pseudoreactions expected to be unbalanced are the `DM_nadh` and the `GSHR` reactions.

In [15]:
qcqa.qcqa_model(model, elemental=True)

╒══════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                            │
│ SIMULATABLE: True                            │
│ PARAMETERS NUMERICALY CONSISTENT: False      │
╞══════════════════════════════════════════════╡
│             CONSISTENCY CHECKS               │
│ Elemental                                    │
│ ---------------------------------            │
│ HEX1: {H: -3.0; O: -4.0; P: -1.0}            │
│ PGI: {H: 3.0; O: 4.0; P: 1.0}                │
│ G6PDH2r: {H: 3.0; O: 4.0; P: 1.0}            │
│ DM_nadh: {charge: 2.0}                       │
│ GSHR: {charge: 2.0}                          │
╘══════════════════════════════════════════════╛


However, as seen above, reactions other than the two expected pseudoreactions appear in the printed report. Specifically, these are reactions with an imbalance in phosphoric acid (H3PO4). To get the imbalanced reaction objects, the `check_elemental_consistency` function can be used.

In [16]:
imbalanced_reactions = qcqa.check_elemental_consistency(
    model, reaction_list=["HEX1", "PGI", "G6PDH2r"])
imbalanced_reactions

{<MassReaction HEX1 at 0x11e345290>: 'H: -3.0; O: -4.0; P: -1.0',
 <MassReaction PGI at 0x11adb9d90>: 'H: 3.0; O: 4.0; P: 1.0',
 <MassReaction G6PDH2r at 0x11e3942d0>: 'H: 3.0; O: 4.0; P: 1.0'}

By looking at the reactions, their stoichiometries, and which elements/moieties are not balanced, it is clear that glucose 6-phosphate (G6P) is missing a phosphoric acid in its chemical formula.

In [17]:
for reaction, unbalanced in imbalanced_reactions.items():
    print(reaction)

g6p_c = model.metabolites.get_by_id("g6p_c")
print("\n{0} formula before: {1}".format(g6p_c.id, repr(g6p_c.formula)))

HEX1: atp_c + glc__D_c <=> adp_c + g6p_c + h_c
PGI: g6p_c <=> f6p_c
G6PDH2r: g6p_c + nadp_c <=> _6pgl_c + h_c + nadph_c

g6p_c formula before: 'C6H8O5'
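
To make the reported imbalance concrete, the PGI entry can be reproduced by hand. This is a minimal sketch of an element balance, not the implementation used by `check_elemental_consistency`: the helpers `parse_formula` and `element_imbalance` are hypothetical, charge is ignored, and the formula `C6H11O9P` for `f6p_c` is an assumption based on F6P being an isomer of correctly defined G6P.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a simple formula such as 'C6H8O5' into element counts (no parentheses)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def element_imbalance(stoichiometry, formulas):
    """Sum coefficient * element count over all metabolites in a reaction.

    Reactants carry negative coefficients and products positive ones, so a
    balanced reaction yields an empty dict.
    """
    totals = Counter()
    for metabolite, coefficient in stoichiometry.items():
        for element, count in parse_formula(formulas[metabolite]).items():
            totals[element] += coefficient * count
    return {element: total for element, total in totals.items() if total != 0}


# PGI: g6p_c <=> f6p_c
pgi = {"g6p_c": -1, "f6p_c": 1}
# g6p_c uses the formula currently in the model; f6p_c is assumed (see lead-in)
formulas = {"g6p_c": "C6H8O5", "f6p_c": "C6H11O9P"}

imbalance = element_imbalance(pgi, formulas)
print(imbalance)
```

The surplus on the product side is exactly one H3PO4, which agrees with the `PGI` entry in the report and points at the G6P formula as the source of the error.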


The current elemental composition of G6P can be combined with the elemental composition of phosphoric acid:

In [18]:
# Get existing formula composition
formula_composition = g6p_c.elements

# Update with the phosphoric acid
phosphoric_acid = {"H": 3, "P": 1, "O": 4}
for element, to_add in phosphoric_acid.items():
    if element in formula_composition:
        formula_composition[element] += to_add
    else:
        formula_composition[element] = to_add

# Change the existing formula to the new one
g6p_c.elements = formula_composition

print("{0} formula after: {1}".format(g6p_c.id, repr(g6p_c.formula)))

g6p_c formula after: 'C6H11O9P'
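
The merge loop above can be written more compactly with `collections.Counter`. The sketch below works on plain dictionaries and does not touch the model; the helpers `add_compositions` and `hill_formula` are hypothetical, and the formula string is assembled in Hill order as an illustration rather than as a description of how the metabolite object builds its `formula`.

```python
from collections import Counter


def add_compositions(base, addition):
    """Return a new element composition with the counts of both inputs summed."""
    combined = Counter(base)
    combined.update(addition)  # Counter.update adds counts instead of replacing them
    return dict(combined)


def hill_formula(composition):
    """Build a formula string in Hill order.

    With carbon present: C, then H, then the remaining elements alphabetically.
    Without carbon: all elements alphabetically.
    """
    if "C" in composition:
        leading = [e for e in ("C", "H") if e in composition]
        rest = sorted(e for e in composition if e not in ("C", "H"))
        ordered = leading + rest
    else:
        ordered = sorted(composition)
    # A count of 1 is omitted from the string, as in 'C6H11O9P'
    return "".join(
        "{0}{1}".format(e, composition[e] if composition[e] != 1 else "")
        for e in ordered
    )


g6p_before = {"C": 6, "H": 8, "O": 5}
phosphoric_acid = {"H": 3, "P": 1, "O": 4}

g6p_after = add_compositions(g6p_before, phosphoric_acid)
print(hill_formula(g6p_after))
```

Because `add_compositions` returns a new dictionary, the original composition is left unchanged, which makes it easy to compare the formula before and after the correction.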


The reactions are no longer considered imbalanced:

In [19]:
imbalanced_reactions = qcqa.check_elemental_consistency(
    model, reaction_list=["HEX1", "PGI", "G6PDH2r"])
imbalanced_reactions

{}

### Superfluous Parameters

To identify the reactions with superfluous parameters, the `superfluous` flag is set as `True`. If a reaction has superfluous parameters, they are checked for numerical consistency:

In [20]:
qcqa.qcqa_model(model, superfluous=True)

╒══════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                            │
│ SIMULATABLE: True                            │
│ PARAMETERS NUMERICALY CONSISTENT: False      │
╞══════════════════════════════════════════════╡
│             CONSISTENCY CHECKS               │
│ Superfluous Parameters                       │
│ ------------------------                     │
│ HEX1: Inconsistent                           │
│ PYK: Consistent                              │
╘══════════════════════════════════════════════╛


The pyruvate kinase reaction (PYK) contains a consistent superfluous parameter. A consistent superfluous parameter indicates that, although an extra parameter is defined, the forward rate constant, reverse rate constant, and equilibrium constant are numerically consistent, with consistency defined as $|k_{f} / K_{eq} - k_{r}| \le \text{tolerance}$. The tolerance is determined by the `decimal_precision` of the `MassConfiguration` object (e.g. a `decimal_precision` of 8 corresponds to rounding at the 8th digit right of the decimal, equivalent to $|k_{f} / K_{eq} - k_{r}| \le 10^{-8}$).

In [21]:
PYK = model.reactions.get_by_id("PYK")
print(abs(PYK.kf / PYK.Keq - PYK.kr))

0.0


The hexokinase reaction (HEX1) contains an inconsistent superfluous parameter.

In [22]:
HEX1 = model.reactions.get_by_id("HEX1")
print(abs(HEX1.kf / HEX1.Keq - HEX1.kr))

10.0
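
The consistency criterion can be expressed as a small standalone check. This is a minimal sketch under stated assumptions rather than the code used by `qcqa`: the function `is_numerically_consistent` is hypothetical, the numerical values are made up for illustration (they are not the HEX1 or PYK parameters), and the precision of 8 is taken from the example given earlier.

```python
def is_numerically_consistent(kf, Keq, kr, decimal_precision=8):
    """Return True if |kf / Keq - kr| <= 10**-decimal_precision (hypothetical helper)."""
    tolerance = 10 ** -decimal_precision
    return abs(kf / Keq - kr) <= tolerance


# Illustrative values only, not taken from the model
kf, Keq = 100.0, 4.0

consistent = is_numerically_consistent(kf, Keq, kr=25.0)    # kr equals kf / Keq
inconsistent = is_numerically_consistent(kf, Keq, kr=35.0)  # kr is off by 10.0

print(consistent, inconsistent)

# One way to repair an inconsistent value is to recompute it from the other two
kr_fixed = kf / Keq
print(is_numerically_consistent(kf, Keq, kr_fixed))
```

Recomputing the reverse rate constant from `kf` and `Keq` is one way of defining a consistent value; the alternative is to drop the superfluous parameter altogether.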


Inconsistent superfluous parameters can quickly be fixed by defining them to be a consistent value, or ignored by setting the value as `None`.

In [23]:
HEX1.kr = None
qcqa.qcqa_model(model, superfluous=True)

╒══════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                            │
│ SIMULATABLE: True                            │
│ PARAMETERS NUMERICALY CONSISTENT: True       │
╞══════════════════════════════════════════════╡
│             CONSISTENCY CHECKS               │
│ Superfluous Parameters                       │
│ ------------------------                     │
│ PYK: Consistent                              │
╘══════════════════════════════════════════════╛


After addressing several of the model issues, the `qcqa_model` function from the beginning of the notebook can be run again. This time, the report indicates that the model is elementally balanced, apart from the two expected pseudoreactions, and contains the numerical values necessary for simulation.

In [24]:
qcqa.qcqa_model(
    model,
    parameters=True,        # Check for undefined but necessary parameters in the model
    concentrations=True,    # Check for undefined but necessary concentrations in the model
    fluxes=True,            # Check for undefined steady state fluxes for reactions in the model
    superfluous=True,       # Check for excess parameters and ensure they are consistent.
    elemental=True,         # Check mass and charge balancing of reactions in the model
)

╒════════════════════════════════════════════════════╕
│ MODEL ID: RBC_PFK                                  │
│ SIMULATABLE: True                                  │
│ PARAMETERS NUMERICALY CONSISTENT: True             │
╞════════════════════════════════════════════════════╡
│                CONSISTENCY CHECKS                  │
│ Superfluous Parameters    Elemental                │
│ ------------------------  ----------------------   │
│ PYK: Consistent           DM_nadh: {charge: 2.0}   │
│                           GSHR: {charge: 2.0}      │
╘════════════════════════════════════════════════════╛
