# GEnome-scale Regulatory and Metabolic (GERM) models

MEWpy supports the integration of regulatory and metabolic models at the genome-scale.
All tools required to build, simulate, and analyze GEnome-scale Regulatory and Metabolic (GERM) models
are available in the `mewpy.germ` module.

This example uses the integrated _E. coli_ core model published by [Orth _et al_, 2010](https://doi.org/10.1128/ecosalplus.10.2.1).
This model includes a standard Genome-Scale Metabolic (GEM) model for the central carbon metabolism of _E. coli_. The GEM model includes several reactions (with GPRs), metabolites and genes associated with the central carbon metabolism in _E. coli_. It also includes exchange reactions defining the environmental conditions of the system.

In addition, this example uses a Transcriptional Regulatory Network (TRN) for the central carbon metabolism of _E. coli_. The TRN includes several interactions (with boolean algebra expressions), target genes and regulators associated with the central carbon metabolism and linked to genes in the metabolic model. It also includes external stimuli (effectors) associated with metabolite concentrations, reaction rates or environmental conditions.

In [1]:
# imports
import os
from pathlib import Path

from mewpy.io import read_model, Engines, Reader

## Reading GERM models

The integrated _E. coli_ model is available in two separate files:
- metabolic model _models/germ/e_coli_core.xml_
- regulatory model _models/germ/e_coli_core_trn.csv_

To assemble a GERM model, use the `mewpy.io.read_model` function. This function accepts multiple readers, each configured with its own engine.
MEWpy contains the following engines that can be used in the `Reader` object:
- `BooleanRegulatoryCSV`
- `CoExpressionRegulatoryCSV`
- `TargetRegulatorRegulatoryCSV`
- `RegulatorySBML`
- `MetabolicSBML`
- `CobraModel`
- `ReframedModel`
- `JSON`

In addition, the `Reader` accepts other arguments, such as _filename_ and _sep_.
Although the `mewpy.io.read_model` function is the preferred interface for reading models, MEWpy contains other read/write methods available at `mewpy.io`.

In [2]:
# current directory
path = Path(os.getcwd())
reg_path = path.joinpath('models', 'germ')

# a reader for the E. coli core GEM model
gem_model = reg_path.joinpath('e_coli_core.xml')
gem_reader = Reader(Engines.MetabolicSBML, gem_model)

# a reader for the E. coli core TRN model
# (it accepts specific parameters for reading the TRN CSV file)
trn_model = reg_path.joinpath('e_coli_core_trn.csv')
trn_reader = Reader(Engines.BooleanRegulatoryCSV,
                    trn_model,
                    sep=',',
                    id_col=0,
                    rule_col=2,
                    aliases_cols=[1],
                    header=0)

# reading the integrated regulatory-metabolic model
model = read_model(gem_reader, trn_reader)
model

0,1
Model,e_coli_core
Name,E. coli core model - Orth et al 2010
Types,"metabolic, regulatory"
Compartments,"c, e"
Reactions,95
Metabolites,72
Genes,137
Exchanges,20
Demands,0
Sinks,0
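
To make the column arguments passed to the TRN reader more concrete, the following is a minimal standard-library sketch of how `sep`, `id_col`, `rule_col`, `aliases_cols` and `header` could map onto a CSV file. The rows in `TRN_CSV` and the helper `read_boolean_rules` are hypothetical and only illustrate the idea; this is not MEWpy's parser, and the real file layout may differ.

```python
import csv
import io

# Hypothetical TRN rows; the layout (identifier, alias, rule) is assumed to
# match id_col=0, aliases_cols=[1] and rule_col=2 from the Reader call above.
TRN_CSV = """target,alias,rule
b0113,PdhR,not surplusPYR
b0114,aceE,(not b0113) or b3261
b0116,lpd,
"""


def read_boolean_rules(text, sep=',', id_col=0, rule_col=2, aliases_cols=(1,), header=0):
    """Return {identifier: {'aliases': [...], 'rule': str}} from CSV text."""
    rows = list(csv.reader(io.StringIO(text), delimiter=sep))
    # skip everything up to and including the header row
    data_rows = rows[header + 1:] if header is not None else rows
    rules = {}
    for row in data_rows:
        if not row:
            continue
        rules[row[id_col]] = {
            'aliases': [row[col] for col in aliases_cols],
            'rule': row[rule_col].strip(),
        }
    return rules


rules = read_boolean_rules(TRN_CSV)
for identifier, entry in rules.items():
    print(identifier, entry['aliases'], entry['rule'] or '<no rule>')
```

A target with an empty rule, such as the last row, is kept in the result so that it can still be linked to a gene of the metabolic model.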


## Working with GERM models

A GERM model contains relevant metabolic information:
- `objective` - Attribute with the current objective function; a dictionary of type variable-coefficient
- `reactions` - Attribute/container with the model reactions; a dictionary of type reaction identifier-reaction variable
- `metabolites` - Attribute/container with the model metabolites; a dictionary of type metabolite identifier-metabolite variable
- `genes` - Attribute/container with the model genes; a dictionary of type gene identifier-gene variable
- `gprs` - Attribute/container with the model GPRs; a dictionary of type reaction identifier-GPR expression
- `compartments` - Attribute/container with the model compartments; a dictionary of type compartment identifier-compartment name
- `exchanges` - Attribute/container with the model exchanges; a dictionary of type reaction identifier-reaction variable
- `demands` - Attribute/container with the model demands; a dictionary of type reaction identifier-reaction variable
- `sinks` - Attribute/container with the model sinks; a dictionary of type reaction identifier-reaction variable
- `external_compartment` - Attribute with the model external compartment, i.e., the compartment with the largest number of exchange reactions

A GERM model contains relevant regulatory information:
- `interactions` - Attribute/container with the model interactions; a dictionary of type interaction identifier-interaction variable
- `targets` - Attribute/container with the model targets; a dictionary of type target identifier-target variable
- `regulators` - Attribute/container with the model regulators; a dictionary of type regulator identifier-regulator variable
- `regulatory_reactions` - Attribute/container with the model regulatory reactions; a dictionary of type reaction identifier-reaction variable
- `regulatory_metabolites` - Attribute/container with the model regulatory metabolites; a dictionary of type metabolite identifier-metabolite variable
- `environmental_stimuli` - Attribute/container with the model stimuli; a dictionary of type variable identifier-variable variable.

In [3]:
# the objective function
model.objective

{Biomass_Ecoli_core || 1.496 3pg_c + 3.7478 accoa_c + 59.81 atp_c + 0.361 e4p_c + 0.0709 f6p_c + 0.129 g3p_c + 0.205 g6p_c + 0.2557 gln__L_c + 4.9414 glu__L_c + 59.81 h2o_c + 3.547 nad_c + 13.0279 nadph_c + 1.7867 oaa_c + 0.5191 pep_c + 2.8328 pyr_c + 0.8977 r5p_c -> 59.81 adp_c + 4.1182 akg_c + 3.7478 coa_c + 59.81 h_c + 3.547 nadh_c + 13.0279 nadp_c + 59.81 pi_c: 1.0}

In [4]:
# reactions
model.reactions

{'ACALD': ACALD || 1.0 acald_c + 1.0 coa_c + 1.0 nad_c <-> 1.0 accoa_c + 1.0 h_c + 1.0 nadh_c,
 'ACALDt': ACALDt || 1.0 acald_e <-> 1.0 acald_c,
 'ACKr': ACKr || 1.0 ac_c + 1.0 atp_c <-> 1.0 actp_c + 1.0 adp_c,
 'ACONTa': ACONTa || 1.0 cit_c <-> 1.0 acon_C_c + 1.0 h2o_c,
 'ACONTb': ACONTb || 1.0 acon_C_c + 1.0 h2o_c <-> 1.0 icit_c,
 'ACt2r': ACt2r || 1.0 ac_e + 1.0 h_e <-> 1.0 ac_c + 1.0 h_c,
 'ADK1': ADK1 || 1.0 amp_c + 1.0 atp_c <-> 2.0 adp_c,
 'AKGDH': AKGDH || 1.0 akg_c + 1.0 coa_c + 1.0 nad_c -> 1.0 co2_c + 1.0 nadh_c + 1.0 succoa_c,
 'AKGt2r': AKGt2r || 1.0 akg_e + 1.0 h_e <-> 1.0 akg_c + 1.0 h_c,
 'ALCD2x': ALCD2x || 1.0 etoh_c + 1.0 nad_c <-> 1.0 acald_c + 1.0 h_c + 1.0 nadh_c,
 'ATPM': ATPM || 1.0 atp_c + 1.0 h2o_c -> 1.0 adp_c + 1.0 h_c + 1.0 pi_c,
 'ATPS4r': ATPS4r || 1.0 adp_c + 4.0 h_e + 1.0 pi_c <-> 1.0 atp_c + 1.0 h2o_c + 3.0 h_c,
 'Biomass_Ecoli_core': Biomass_Ecoli_core || 1.496 3pg_c + 3.7478 accoa_c + 59.81 atp_c + 0.361 e4p_c + 0.0709 f6p_c + 0.129 g3p_c + 0.205 g6p

In [5]:
# interactions
model.interactions

{'b0008_interaction': b0008 || 1 = 1,
 'b0080_interaction': b0080 || 1 = ( ~ surplusFDP),
 'b0113_interaction': b0113 || 1 = ( ~ surplusPYR),
 'b0114_interaction': b0114 || 1 = (( ~ b0113) | b3261),
 'b0115_interaction': b0115 || 1 = (( ~ b0113) | b3261),
 'b0116_interaction': b0116 || 1 = 1,
 'b0118_interaction': b0118 || 1 = 1,
 'b0351_interaction': b0351 || 1 = 1,
 'b0356_interaction': b0356 || 1 = 1,
 'b0399_interaction': b0399 || 1 = b0400,
 'b0400_interaction': b0400 || 1 = ( ~ (pi_e > 0)),
 'b0451_interaction': b0451 || 1 = 1,
 'b0474_interaction': b0474 || 1 = 1,
 'b0485_interaction': b0485 || 1 = 1,
 'b0720_interaction': b0720 || 1 = 1,
 'b0721_interaction': b0721 || 1 = (( ~ (b4401 | b1334)) | b3357 | b3261),
 'b0722_interaction': b0722 || 1 = (( ~ (b4401 | b1334)) | b3357 | b3261),
 'b0723_interaction': b0723 || 1 = (( ~ (b4401 | b1334)) | b3357 | b3261),
 'b0724_interaction': b0724 || 1 = (( ~ (b4401 | b1334)) | b3357 | b3261),
 'b0726_interaction': b0726 || 1 = 1,
 'b0727_

### Access variables in GERM models
A GERM model includes several containers to store reactions, metabolites, genes, interactions, targets, and regulators.
These containers are regular Python dictionaries, so one can **access** (**and only access**) variables by their identifier through the dictionary interface, e.g., `model.reactions['REACTION_IDENTIFIER']`.
One can also iterate over the variables of a model using its `yield_...` methods, such as `model.yield_regulators()`.

**IMPORTANT NOTE**:
Although one can add or remove entries in the dictionary returned by a container, these changes do not alter the model itself. Use `model.add()` and `model.remove()` to change the model.
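
One way such read-only behavior can arise is a property that hands out a copy of the internal container. The sketch below uses a hypothetical `ToyModel` class to illustrate this; it is an assumption made for illustration and does not describe MEWpy's internals.

```python
class ToyModel:
    """Hypothetical stand-in for a model; not MEWpy's implementation."""

    def __init__(self, reactions):
        self._reactions = dict(reactions)

    @property
    def reactions(self):
        # hand out a copy, so callers cannot mutate the internal container
        return dict(self._reactions)

    def remove(self, identifier):
        # the supported way to change the model
        self._reactions.pop(identifier, None)


toy = ToyModel({'PDH': 'pyruvate dehydrogenase', 'PFK': 'phosphofructokinase'})

container = toy.reactions
del container['PDH']            # only edits the returned dictionary
print('PDH' in toy.reactions)   # True

toy.remove('PDH')               # goes through the model interface
print('PDH' in toy.reactions)   # False
```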

In [6]:
# get PDH reaction from the model
pdh = model.reactions['PDH']
pdh

0,1
Identifier,PDH
Name,
Aliases,
Model,e_coli_core
Types,reaction
Equation,1.0 coa_c + 1.0 nad_c + 1.0 pyr_c -> 1.0 accoa_c + 1.0 co2_c + 1.0 nadh_c
Bounds,"(0.0, 1000.0)"
Reversibility,False
Metabolites,"coa_c, nad_c, pyr_c, accoa_c, co2_c, nadh_c"
Boundary,False


In [7]:
# get the PdhR regulator
pdh_r = model.regulators.get('b0113')
pdh_r

0,1
Identifier,b0113
Name,b0113
Aliases,"PdhR, b0113"
Model,e_coli_core
Types,"regulator, target"
Coefficients,"(0.0, 1.0)"
Active,True
Interactions,"b0114_interaction, b0115_interaction"
Targets,"b0114, b0115"
Environmental stimulus,False


In [8]:
# iterate over the first regulators
for i, regulator in enumerate(model.yield_regulators()):
    print(regulator)
    if i == 20:
        break

Biomass_Ecoli_core || 1.496 3pg_c + 3.7478 accoa_c + 59.81 atp_c + 0.361 e4p_c + 0.0709 f6p_c + 0.129 g3p_c + 0.205 g6p_c + 0.2557 gln__L_c + 4.9414 glu__L_c + 59.81 h2o_c + 3.547 nad_c + 13.0279 nadph_c + 1.7867 oaa_c + 0.5191 pep_c + 2.8328 pyr_c + 0.8977 r5p_c -> 59.81 adp_c + 4.1182 akg_c + 3.7478 coa_c + 59.81 h_c + 3.547 nadh_c + 13.0279 nadp_c + 59.81 pi_c
FBP || 1.0 fdp_c + 1.0 h2o_c -> 1.0 f6p_c + 1.0 pi_c
GLCpts || 1.0 glc__D_e + 1.0 pep_c -> 1.0 g6p_c + 1.0 pyr_c
LDH_D || 1.0 lac__D_c + 1.0 nad_c <-> 1.0 h_c + 1.0 nadh_c + 1.0 pyr_c
ME1 || 1.0 mal__L_c + 1.0 nad_c -> 1.0 co2_c + 1.0 nadh_c + 1.0 pyr_c
ME2 || 1.0 mal__L_c + 1.0 nadp_c -> 1.0 co2_c + 1.0 nadph_c + 1.0 pyr_c
PFK || 1.0 atp_c + 1.0 f6p_c -> 1.0 adp_c + 1.0 fdp_c + 1.0 h_c
PGI || 1.0 g6p_c <-> 1.0 f6p_c
PYK || 1.0 adp_c + 1.0 h_c + 1.0 pep_c -> 1.0 atp_c + 1.0 pyr_c
SUCCt2_2 || 2.0 h_e + 1.0 succ_e -> 2.0 h_c + 1.0 succ_c
TALA || 1.0 g3p_c + 1.0 s7p_c <-> 1.0 e4p_c + 1.0 f6p_c
TKT2 || 1.0 e4p_c + 1.0 xu5p__D_c <-

### Manipulate variables in GERM models
A GERM model contains a simple interface to add/remove variables.
A GERM model supports the following operations:
- `get(identifier, default=None)` - It retrieves the variable by its identifier
- `add(variables)` - It adds new variables to the model; variables are added to containers according to the variable types
- `remove(variables)` - It removes variables from the model; variables are removed from containers according to the variable types
- `update(variables, objective, ...)` - It updates variables, compartments, objective, etc, in the model
- `copy()` - It makes a shallow copy of the model
- `deepcopy()` - It makes a deep copy of the model
- `to_dict()` - It exports the model to a dictionary

In [9]:
# get the Crp regulator
crp = model.get('b3357')
crp

0,1
Identifier,b3357
Name,b3357
Aliases,"b3357, Crp"
Model,e_coli_core
Types,"regulator, target"
Coefficients,"(0.0, 1.0)"
Active,True
Interactions,"b0721_interaction, b0722_interaction, b0723_interaction, b0724_interaction, b0902_interaction, b0903_interaction, b0904_interaction, b1524_interaction, b2492_interaction, b3114_interaction, b3115_interaction, b3870_interaction, b4122_interaction"
Targets,"b0721, b0722, b0723, b0724, b0902, b0903, b0904, b1524, b2492, b3114, b3115, b3870, b4122"
Environmental stimulus,False


In [10]:
# remove the regulator from the model
model.remove(crp)
'b3357' in model.regulators

False

In [11]:
# add the regulator back to the model
model.add(crp)
'b3357' in model.regulators

True

In [12]:
# shallow copy only performs a copy of the containers
model_copy = model.copy()
model is model_copy

False

In [13]:
# variables are still the same
crp is model_copy.regulators['b3357']

True

In [14]:
# deep copy performs a copy of the containers and variables
model_copy = model.deepcopy()
crp is model_copy.regulators['b3357']

False
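
The difference between the two kinds of copy can be reproduced with plain Python objects. The sketch below uses a hypothetical `ToyVariable` class and the standard `copy` module; it mirrors the identity checks above but is not MEWpy code.

```python
import copy


class ToyVariable:
    """Hypothetical variable holding a mutable attribute."""

    def __init__(self, identifier, coefficients):
        self.identifier = identifier
        self.coefficients = list(coefficients)


containers = {'regulators': {'b3357': ToyVariable('b3357', (0, 1))}}

# shallow copy: new containers, same variable objects
shallow = {name: dict(container) for name, container in containers.items()}
# deep copy: new containers and new variable objects
deep = copy.deepcopy(containers)

original = containers['regulators']['b3357']
print(shallow['regulators']['b3357'] is original)  # True
print(deep['regulators']['b3357'] is original)     # False

# a change made through the shallow copy is visible in the original
shallow['regulators']['b3357'].coefficients = [0]
print(original.coefficients)                        # [0]
print(deep['regulators']['b3357'].coefficients)     # [0, 1]
```

In practice this means that a shallow copy is cheap but shares state with the original, whereas a deep copy is safe to modify independently.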

In [15]:
# export the model to a dictionary
model_dict = model.to_dict()
model_dict

{'types': ('metabolic', 'regulatory'),
 'id': 'e_coli_core',
 'name': 'E. coli core model - Orth et al 2010',
 'genes': {'b0351': b0351 || 1 = 1,
  'b1241': b1241 || 1 = (( ~ (o2_e > 0)) | ( ~ ((o2_e > 0) & b0080)) | b3261),
  's0001': s0001 || 1 = 1,
  'b2296': b2296 || 1 = 1,
  'b3115': b3115 || 1 = (b3357 | b1334),
  'b1849': b1849 || 1 = 1,
  'b0118': b0118 || 1 = 1,
  'b1276': b1276 || 1 = 1,
  'b0474': b0474 || 1 = 1,
  'b0726': b0726 || 1 = 1,
  'b0116': b0116 || 1 = 1,
  'b0727': b0727 || 1 = 1,
  'b2587': b2587 || 1 = 1,
  'b1478': b1478 || 1 = 1,
  'b0356': b0356 || 1 = 1,
  'b3738': b3738 || 1 = 1,
  'b3736': b3736 || 1 = 1,
  'b3737': b3737 || 1 = 1,
  'b3735': b3735 || 1 = 1,
  'b3733': b3733 || 1 = 1,
  'b3731': b3731 || 1 = 1,
  'b3732': b3732 || 1 = 1,
  'b3734': b3734 || 1 = 1,
  'b3739': b3739 || 1 = 1,
  'b0720': b0720 || 1 = 1,
  'b0978': b0978 || 1 = 1,
  'b0979': b0979 || 1 = 1,
  'b0733': b0733 || 1 = (( ~ b1334) | b4401),
  'b0734': b0734 || 1 = (( ~ b1334) | b4

### Temporary changes in a GERM model
GERM models support temporary changes using the `with` context manager. In addition, one can manually `undo()`, `redo()`, `reset()` and `restore()` a GERM model.

In [16]:
pfk = model.get('PFK')

with model:
    model.remove(pfk)
    print('Is PFK in the model?', 'PFK' in model.reactions)

print('Has PFK removal been reverted?', 'PFK' in model.reactions)

Is PFK in the model? False
Has PFK removal been reverted? True


In [17]:
# let's reset the model to the initial state
model.objective = {pfk: 1}
print('New objective function:', model.objective)
print()

model.reset()
print('Original objective function:', model.objective)

New objective function: {PFK || 1.0 atp_c + 1.0 f6p_c -> 1.0 adp_c + 1.0 fdp_c + 1.0 h_c: 1}

Original objective function: {Biomass_Ecoli_core || 1.496 3pg_c + 3.7478 accoa_c + 59.81 atp_c + 0.361 e4p_c + 0.0709 f6p_c + 0.129 g3p_c + 0.205 g6p_c + 0.2557 gln__L_c + 4.9414 glu__L_c + 59.81 h2o_c + 3.547 nad_c + 13.0279 nadph_c + 1.7867 oaa_c + 0.5191 pep_c + 2.8328 pyr_c + 0.8977 r5p_c -> 59.81 adp_c + 4.1182 akg_c + 3.7478 coa_c + 59.81 h_c + 3.547 nadh_c + 13.0279 nadp_c + 59.81 pi_c: 1.0}
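
The idea behind temporary changes can be sketched with a small undo history. The `ToyModel` class below is hypothetical: it records how to revert each change and replays those steps when the `with` block ends. It illustrates the pattern only and makes no claim about how MEWpy implements it.

```python
class ToyModel:
    """Hypothetical model with an undo history; not MEWpy's implementation."""

    def __init__(self, reactions):
        self.reactions = dict(reactions)
        self._history = []       # stack of callables that revert one change
        self._checkpoints = []   # history length at each `with` entry

    def remove(self, identifier):
        value = self.reactions.pop(identifier)
        # remember how to put the reaction back
        self._history.append(lambda: self.reactions.__setitem__(identifier, value))

    def undo(self):
        if self._history:
            self._history.pop()()

    def __enter__(self):
        self._checkpoints.append(len(self._history))
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        checkpoint = self._checkpoints.pop()
        # revert every change recorded inside the block, newest first
        while len(self._history) > checkpoint:
            self.undo()
        return False  # do not swallow exceptions


toy = ToyModel({'PFK': 'phosphofructokinase', 'PGI': 'isomerase'})
with toy:
    toy.remove('PFK')
    inside = 'PFK' in toy.reactions
after = 'PFK' in toy.reactions
print(inside, after)  # False True
```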


### Working with multi-type GERM models

A GERM model is by default a multi-type model supporting manipulation of both a **metabolic** and **regulatory** model at the same time.
However, one can manipulate a **single** regulatory or metabolic model.

MEWpy allows building single- or multi-type models easily. To find out which kind of model you are working with, inspect `model.types` or use a type checker such as `model.is_regulatory()`.

In [18]:
from mewpy.germ.models import RegulatoryModel

# creating a new regulatory model
reg_model = RegulatoryModel(identifier='my_regulatory_model')
reg_model

0,1
Model,my_regulatory_model
Name,my_regulatory_model
Types,regulatory
Compartments,
Regulatory interactions,0
Targets,0
Regulators,0
Regulatory reactions,0
Regulatory metabolites,0
Environmental stimuli,0


In [19]:
# check if the model is metabolic
reg_model.is_metabolic()

False

In [20]:
# IMPORTANT: a variable must belong to a single model,
# so we deep-copy the interaction before adding it to another model
interaction = model.get('b0721_interaction').deepcopy()

# with comprehensive=True, the interaction's children are added to the model too
reg_model.add(interaction, comprehensive=True)

reg_model

0,1
Model,my_regulatory_model
Name,my_regulatory_model
Types,regulatory
Compartments,
Regulatory interactions,1
Targets,1
Regulators,4
Regulatory reactions,0
Regulatory metabolites,0
Environmental stimuli,4


In [21]:
from mewpy.germ.models import Model

pfk = model.get('PFK').deepcopy()

# one can build GERM models in many ways
met_model_1 = Model.from_types(('metabolic', ), 
                               identifier='my_metabolic_model', 
                               reactions={'pfk': pfk})
met_model_1

0,1
Model,my_metabolic_model
Name,my_metabolic_model
Types,metabolic
Compartments,
Reactions,1
Metabolites,5
Genes,2
Exchanges,0
Demands,0
Sinks,0


In [22]:
pfk = model.get('PFK').deepcopy()

met_model_2 = Model.from_metabolic(identifier='my_metabolic_model', 
                                   reactions={'pfk': pfk})
met_model_2

0,1
Model,my_metabolic_model
Name,my_metabolic_model
Types,metabolic
Compartments,
Reactions,1
Metabolites,5
Genes,2
Exchanges,0
Demands,0
Sinks,0


In [23]:
# One can read a regulatory model only
e_coli_trn = read_model(trn_reader)
e_coli_trn

0,1
Model,e_coli_core_trn
Name,model
Types,regulatory
Compartments,
Regulatory interactions,159
Targets,159
Regulators,45
Regulatory reactions,0
Regulatory metabolites,0
Environmental stimuli,23


In [24]:
# or the metabolic one
e_coli_gem = read_model(gem_reader)
e_coli_gem

0,1
Model,e_coli_core
Name,E. coli core model - Orth et al 2010
Types,metabolic
Compartments,"c, e"
Reactions,95
Metabolites,72
Genes,137
Exchanges,20
Demands,0
Sinks,0


## Working with GERM model variables

MEWpy contains several **metabolic** and **regulatory** variables having the following main attributes:
- `Reaction` - Object to represent metabolic reactions having bounds, stoichiometry (metabolite/coefficient) and GPRs
- `Metabolite` - Object to represent metabolic compounds having charge, compartment, formula and reactions
- `Gene` - Object to represent metabolic genes having coefficients and reactions (found in GPR expressions)
- `Interaction` - Object to represent regulatory interactions having a target and associated regulatory events (coefficient/boolean rule)
- `Target` - Object to represent regulatory targets having coefficients and an interaction
- `Regulator` - Object to represent regulators having coefficients and interactions

As we can see, variables have different attributes that can be inspected and changed using several methods. Variables are often connected to other variables and have special attributes, such as boolean expressions, coefficients and dictionaries of metabolites (stoichiometry).

Variables also share some of the GERM model interfaces. All GERM model variables support:
- `copy()` - It makes a shallow copy of the variable
- `deepcopy()` - It makes a deep copy of the variable
- **Temporary changes** using `with`, `undo()`, `redo()`, `reset()`, `restore()`
- **yield linked variables**, such as `yield_metabolites()`

### Reactions, Metabolites and Genes

**Reactions** have the following **attributes**:
- identifier - id of the variable
- _**name**_ - name of the variable
- _**aliases**_ - aliases of the variable
- _**bounds**_ - reaction bounds; it must be a tuple with both values; (-1000, 1000) by default
- _**lower_bound**_ - reaction lower bound
- _**upper_bound**_ - reaction upper bound
- reversibility - whether the reaction is reversible
- _**stoichiometry**_ - reaction stoichiometry; a dictionary of metabolite variable-coefficient
- _**gpr**_ - a symbolic expression containing the boolean logic of the gene variables; AND (symbolic &); OR (symbolic |)
- gene_protein_reaction_rule - symbolic representation of the GPR expression
- metabolites - reaction metabolites; a dictionary of metabolite identifier-metabolite variable
- reactants - reaction reactants; a dictionary of metabolite identifier-metabolite variable
- products - reaction products; a dictionary of metabolite identifier-metabolite variable
- compartments - all compartments associated with the reaction metabolites
- boundary - whether the reaction is exchange, demand or sink
- equation - notation with reactants, products and reversibility
- charge_balance - charge balance of the reaction
- mass_balance - mass balance of the reaction

and the following **methods**:
- `ko()` - reaction deletion; it sets the bounds to zero
- `add_metabolites(stoichiometry)` - add metabolites to the reaction
- `remove_metabolites(metabolite)` - remove metabolites from the reaction
- `add_gpr(gpr)` - add a GPR to the reaction, replacing the current one
- `remove_gpr()` - remove gpr from the reaction

**Metabolites** have the following **attributes**:
- identifier - id of the variable
- _**name**_ - name of the variable
- _**aliases**_ - aliases of the variable
- _**charge**_ - metabolite charge
- _**compartment**_ - metabolite compartment
- _**formula**_ - metabolite chemical formula
- atoms - frequency of each atom in the chemical formula
- molecular_weight - metabolite molecular weight
- exchange_reaction - the first exchange reaction associated with the metabolite
- exchange_reactions - the list of all exchange reactions associated with the metabolite
- reactions - the reactions associated with this metabolite; a dictionary of reaction identifier-reaction variable

**Genes** have the following **attributes**:
- identifier - id of the variable
- _**name**_ - name of the variable
- _**aliases**_ - aliases of the variable
- _**coefficients**_ - the gene coefficients; all possible values that a gene can take during GPR evaluation; (0, 1) by default
- is_active - whether the maximum coefficient is greater than zero
- reactions - the reactions associated with this gene; a dictionary of reaction identifier-reaction variable

and the following **methods**:
- `ko()` - gene deletion; it sets the coefficients to zero

Bold-italicized properties can be set with new values (e.g., `reaction.bounds = (0, 1000)`).

In [25]:
# inspecting a reaction
ack = model.get('ACKr')
ack

0,1
Identifier,ACKr
Name,
Aliases,
Model,e_coli_core
Types,reaction
Equation,1.0 ac_c + 1.0 atp_c <-> 1.0 actp_c + 1.0 adp_c
Bounds,"(-1000.0, 1000.0)"
Reversibility,True
Metabolites,"ac_c, atp_c, actp_c, adp_c"
Boundary,False


In [26]:
# inspecting a metabolite
acetate = model.get('ac_c')
acetate

0,1
Identifier,ac_c
Name,Acetate
Aliases,"ac_c, Acetate"
Model,e_coli_core
Types,metabolite
Compartment,c
Formula,C2H3O2
Molecular weight,59.04402
Charge,-1
Reactions,"ACKr, ACt2r"
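
The `atoms` and `molecular_weight` attributes can be understood with a short formula parser. The sketch below is a standard-library illustration, not MEWpy's implementation; the helper names are hypothetical, and the atomic weights in `ATOMIC_WEIGHTS` are an assumption chosen here because they reproduce the molecular weight reported for acetate above.

```python
import re

# Atomic weights assumed for this sketch (rounded standard values).
ATOMIC_WEIGHTS = {'C': 12.0107, 'H': 1.00794, 'O': 15.9994}


def count_atoms(formula):
    """Count atoms in a simple formula such as 'C2H3O2' (no brackets or charges)."""
    atoms = {}
    for element, count in re.findall(r'([A-Z][a-z]?)(\d*)', formula):
        atoms[element] = atoms.get(element, 0) + int(count or 1)
    return atoms


def molecular_weight(formula):
    """Sum the atomic weights of all atoms in the formula."""
    return sum(ATOMIC_WEIGHTS[element] * n for element, n in count_atoms(formula).items())


print(count_atoms('C2H3O2'))                  # {'C': 2, 'H': 3, 'O': 2}
print(round(molecular_weight('C2H3O2'), 5))   # 59.04402
```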


In [27]:
# inspecting a gene
b3115 = model.get('b3115')
b3115

0,1
Identifier,b3115
Name,b3115
Aliases,"tdcD, b3115"
Model,e_coli_core
Types,"target, gene"
Coefficients,"(0.0, 1.0)"
Active,True
Interaction,b3115 || 1 = (b3357 | b1334)
Regulators,"b3357, b1334"
Reactions,ACKr


One can create Reactions, Metabolites and Genes using the objects mentioned above.

In [28]:
# imports
from mewpy.germ.algebra import Expression, parse_expression
from mewpy.germ.variables import Reaction, Metabolite, Gene

In [29]:
# creating the Genes
g1 = Gene(identifier='b4067', name='actP', coefficients=(0, 1))
g2 = Gene(identifier='b0010', name='satP', coefficients=(0, 1))
g1

0,1
Identifier,b4067
Name,actP
Aliases,
Model,
Types,gene
Coefficients,"(0, 1)"
Active,True
Reactions,


In [30]:
# Creating the GPR. A GPR is a boolean algebra expression
boolean_rule = parse_expression('b4067 and b0010')
genes = {'b4067': g1, 'b0010': g2}
gpr = Expression(symbolic=boolean_rule, variables=genes)
gpr

In [31]:
# creating the metabolites
m1 = Metabolite(identifier='ac_c', name='acetate cytoplasm', compartment='c', formula='C2H3O2', charge=-1)
m2 = Metabolite(identifier='ac_e', name='acetate extracellular', compartment='e', formula='C2H3O2', charge=-1)
m1

0,1
Identifier,ac_c
Name,acetate cytoplasm
Aliases,
Model,
Types,metabolite
Compartment,c
Formula,C2H3O2
Molecular weight,59.04402
Charge,-1
Reactions,


In [32]:
# creating the reaction
stoichiometry = {m1: -1, m2: 1}
rxn = Reaction(identifier='ac_t', 
               name='acetate transport',
               bounds=(0, 1000),
               stoichiometry=stoichiometry,
               gpr=gpr)
rxn

0,1
Identifier,ac_t
Name,acetate transport
Aliases,
Model,
Types,reaction
Equation,1 ac_c -> 1 ac_e
Bounds,"(0, 1000)"
Reversibility,False
Metabolites,"ac_c, ac_e"
Boundary,False


In [33]:
# copying the acetate transport and creating the acetate exchange
rxn2 = rxn.deepcopy()
rxn2.name = 'acetate exchange'
rxn2.stoichiometry = {m2: -1}
rxn2.remove_gpr()
rxn2

0,1
Identifier,ac_t
Name,acetate exchange
Aliases,
Model,
Types,reaction
Equation,1 ac_e ->
Bounds,"(0, 1000)"
Reversibility,False
Metabolites,ac_e
Boundary,True
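
The `equation`, `reversibility` and `boundary` attributes shown above can all be derived from the stoichiometry and the bounds. The following sketch uses a hypothetical `describe` helper and plain identifiers instead of `Metabolite` objects. It assumes that a reaction is reversible when its lower bound is negative and its upper bound is positive, and that a boundary reaction has metabolites on one side only; MEWpy's own rules may be more detailed.

```python
def describe(stoichiometry, bounds):
    """Return (equation, reversible, boundary) for a stoichiometry dictionary."""
    reactants = {met: -coef for met, coef in stoichiometry.items() if coef < 0}
    products = {met: coef for met, coef in stoichiometry.items() if coef > 0}

    # assumption: flux can go both ways only if the bounds straddle zero
    reversible = bounds[0] < 0 < bounds[1]
    arrow = '<->' if reversible else '->'

    left = ' + '.join(f'{coef} {met}' for met, coef in reactants.items())
    right = ' + '.join(f'{coef} {met}' for met, coef in products.items())
    equation = f'{left} {arrow} {right}'.strip()

    # assumption: exchange, demand and sink reactions have one empty side
    boundary = not reactants or not products
    return equation, reversible, boundary


print(describe({'ac_c': -1, 'ac_e': 1}, (0, 1000)))  # ('1 ac_c -> 1 ac_e', False, False)
print(describe({'ac_e': -1}, (0, 1000)))             # ('1 ac_e ->', False, True)
```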


Reactions can be created automatically from GPRs in a string format. This avoids creating GPR expressions manually using the boolean expression parser. Note that Genes are also created automatically using the identifiers in the string.

In [34]:
# from a GPR string
rxn3 = Reaction.from_gpr_string(identifier='ac_t2',
                                name='a second reaction for acetate transport having different genes',
                                rule='b0001 and b0002',
                                bounds=(0, 1000),
                                stoichiometry=stoichiometry)
rxn3

0,1
Identifier,ac_t2
Name,a second reaction for acetate transport having different genes
Aliases,
Model,
Types,reaction
Equation,1 ac_c -> 1 ac_e
Bounds,"(0, 1000)"
Reversibility,False
Metabolites,"ac_c, ac_e"
Boundary,False


A Reaction's GPR is a boolean algebra expression that can be evaluated using regular boolean operators or custom operators (useful to evaluate gene expression data).

In [35]:
# gpr is a boolean algebra expression that can be evaluated
rxn3.gpr.evaluate(values={'b0001': 1, 'b0002': 1})

1

In [36]:
rxn3.gpr.evaluate(values={'b0001': 1, 'b0002': 0})

0

In [37]:
rxn3.gpr.evaluate(values={'b0001': True, 'b0002': True})

1

In [38]:
from mewpy.germ.algebra import And

# using a custom operator for AND
rxn3.gpr.evaluate(values={'b0001': 100, 'b0002': 50}, operators={And: min})

50
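
To show why replacing AND with `min` is useful for gene expression data, here is a small self-contained evaluator. The nested-tuple representation and the `evaluate` function are hypothetical and unrelated to MEWpy's `Expression` class. The choice of `max` for OR is also an assumption made for this sketch.

```python
# A tiny expression tree: ('and', left, right), ('or', left, right), or a gene identifier.
gpr = ('or', ('and', 'b0001', 'b0002'), 'b0003')


def evaluate(node, values, and_op=min, or_op=max):
    """Evaluate a nested GPR tuple with replaceable AND/OR operators."""
    if isinstance(node, str):
        return values[node]
    operator, left, right = node
    left_value = evaluate(left, values, and_op, or_op)
    right_value = evaluate(right, values, and_op, or_op)
    if operator == 'and':
        return and_op(left_value, right_value)
    return or_op(left_value, right_value)


# boolean states: min/max behave like AND/OR on 0/1 values
print(evaluate(gpr, {'b0001': 1, 'b0002': 0, 'b0003': 0}))        # 0
# expression levels: AND -> min (limiting subunit), OR -> max (best isozyme)
print(evaluate(gpr, {'b0001': 100, 'b0002': 50, 'b0003': 20}))    # 50
```

With 0/1 inputs the result is the usual boolean outcome, while with continuous inputs the same rule yields an expression-like score for the reaction.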

### Interactions, Targets and Regulators

**Interactions** have the following **attributes**:
- identifier - id of the variable
- _**name**_ - name of the variable
- _**aliases**_ - aliases of the variable
- _**target**_ - interaction target; Interactions can only have a single target gene!
- _**regulatory_events**_ - a dictionary of coefficient-symbolic expressions. The symbolic expressions contain the boolean logic of regulators to activate or not the target gene; the key of a regulatory event is the expression coefficient that the target can take if the expression is evaluated to True.
- regulators - interaction regulators; a dictionary of regulator identifier-regulator variable
- regulatory_truth_table - a table with the possible coefficients of the target variable according to the regulatory events and regulators' coefficients


and the following **methods**:
- `add_target(target)` - add the target to the interaction. It removes the current target.
- `remove_target(target)` - remove the target from the interaction.
- `add_regulatory_event(coefficient, expression)` - add a new regulatory event for a target coefficient. It replaces the existing event for that coefficient, if any.
- `remove_regulatory_event(coefficient)` - remove the regulatory event for the target coefficient.

**Targets** have the following **attributes**:
- identifier - id of the variable
- _**name**_ - name of the variable
- _**aliases**_ - aliases of the variable
- _**coefficients**_ - the target coefficients; all possible values that a target can take during expression evaluation; (0, 1) by default
- is_active - whether the maximum coefficient is greater than zero
- _**interaction**_ - the target interaction.
- regulators - target regulators; a dictionary of regulator identifier-regulator variable

and the following **methods**:
- `ko()` - target deletion; it sets the coefficients to zero

**Regulators** have the following **attributes**:
- identifier - id of the variable
- _**name**_ - name of the variable
- _**aliases**_ - aliases of the variable
- _**coefficients**_ - the regulator coefficients; all possible values that a regulator can take during expression evaluation; (0, 1) by default
- is_active - whether the maximum coefficient is greater than zero
- interactions - regulator interactions; a dictionary of interaction identifier-interaction variable
- targets - regulator targets; a dictionary of target identifier-target variable

and the following **methods**:
- `ko()` - regulator deletion; it sets the coefficients to zero

Bold-italicized properties can be set with new values (e.g., `regulator.coefficients = (1,)`).

In [39]:
# inspecting an interaction
sdhc_interaction = model.get('b0721_interaction')
sdhc_interaction

0,1
Identifier,b0721_interaction
Name,b0721_interaction
Aliases,b0721
Model,e_coli_core
Types,interaction
Target,b0721 || 1 = (( ~ (b4401 | b1334)) | b3357 | b3261)
Regulators,"b4401, b1334, b3357, b3261"
Regulatory events,1 = (( ~ (b4401 | b1334)) | b3357 | b3261)


In [40]:
sdhc_interaction.regulatory_truth_table

Unnamed: 0,result,b4401,b1334,b3357,b3261
b0721,0,,,,
b0721,1,1.0,1.0,1.0,1.0
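
For comparison, a full truth table of the same rule can be enumerated with the standard library. The sketch below hard-codes the rule `(( ~ (b4401 | b1334)) | b3357 | b3261)` as a hypothetical Python function and lists every 0/1 combination of the four regulators. It is a generic enumeration and is not meant to reproduce the layout of `regulatory_truth_table`.

```python
from itertools import product

regulators = ['b4401', 'b1334', 'b3357', 'b3261']


def b0721_rule(state):
    """(~(b4401 | b1334)) | b3357 | b3261, written with Python booleans."""
    return int((not (state['b4401'] or state['b1334'])) or state['b3357'] or state['b3261'])


truth_table = []
for combination in product((0, 1), repeat=len(regulators)):
    state = dict(zip(regulators, combination))
    truth_table.append({**state, 'result': b0721_rule(state)})

# the target is inactive only when a repressor is on and both activators are off
inactive = [row for row in truth_table if row['result'] == 0]
print(len(truth_table), len(inactive))  # 16 3
```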


In [41]:
# inspecting a regulator
fnr = model.get('b1334')
fnr

0,1
Identifier,b1334
Name,b1334
Aliases,"b1334, Fnr"
Model,e_coli_core
Types,"regulator, target"
Coefficients,"(0.0, 1.0)"
Active,True
Interactions,"b0721_interaction, b0722_interaction, b0723_interaction, b0724_interaction, b0733_interaction, b0734_interaction, b0902_interaction, b0903_interaction, b0904_interaction, b1612_interaction, b2276_interaction, b2277_interaction, b2278_interaction, b2279_interaction, b2280_interaction, b2281_interaction, b2282_interaction, b2283_interaction, b2284_interaction, b2285_interaction, b2286_interaction, b2287_interaction, b2288_interaction, b2492_interaction, b3114_interaction, b3115_interaction, b3951_interaction, b3952_interaction, b4122_interaction, b4151_interaction, b4152_interaction, b4153_interaction, b4154_interaction"
Targets,"b0721, b0722, b0723, b0724, b0733, b0734, b0902, b0903, b0904, b1612, b2276, b2277, b2278, b2279, b2280, b2281, b2282, b2283, b2284, b2285, b2286, b2287, b2288, b2492, b3114, b3115, b3951, b3952, b4122, b4151, b4152, b4153, b4154"
Environmental stimulus,False


In [42]:
# inspecting a target
sdhc = model.get('b0721')
sdhc

0,1
Identifier,b0721
Name,b0721
Aliases,"sdhC, b0721"
Model,e_coli_core
Types,"target, gene"
Coefficients,"(0.0, 1.0)"
Active,True
Interaction,b0721 || 1 = (( ~ (b4401 | b1334)) | b3357 | b3261)
Regulators,"b4401, b1334, b3357, b3261"
Reactions,SUCDi


One can create Interactions, Targets and Regulators using the objects mentioned above.

In [43]:
# imports
from mewpy.germ.algebra import Expression, parse_expression
from mewpy.germ.variables import Target, Interaction, Regulator

In [44]:
# creating the regulators
b0001 = Regulator(identifier='b0001', name='thrL', coefficients=(0, 1))
b0002 = Regulator(identifier='b0002', name='thrA', coefficients=(0, 1))
b0002

0,1
Identifier,b0002
Name,thrA
Aliases,
Model,
Types,regulator
Coefficients,"(0, 1)"
Active,True
Interactions,
Targets,
Environmental stimulus,True


In [45]:
# creating the target
b0003 = Target(identifier='b0003', name='thrB', coefficients=(0, 1))
b0003

0,1
Identifier,b0003
Name,thrB
Aliases,
Model,
Types,target
Coefficients,"(0, 1)"
Active,True
Interaction,
Regulators,


In [46]:
# creating a regulatory event
b0003_expression = Expression(symbolic=parse_expression('b0002 and not b0001'),
                              variables={'b0001': b0001, 'b0002': b0002})
b0003_expression

In [47]:
# creating the interaction
# it is good practice to build the expression of a given interaction first, and then use it in the
# Interaction constructor. Alternatively, Interaction has other constructors (from_expression or from_string)
b0003_interaction = Interaction(identifier='interaction_b0003',
                                regulatory_events={1.0: b0003_expression},
                                target=b0003)
b0003_interaction

0,1
Identifier,interaction_b0003
Name,
Aliases,
Model,
Types,interaction
Target,b0003 || 1.0 = (b0002 & ( ~ b0001))
Regulators,"b0001, b0002"
Regulatory events,1.0 = (b0002 & ( ~ b0001))


In [48]:
b0003_interaction.regulatory_truth_table

Unnamed: 0,b0001,b0002,result
b0003,1,1,0


Interactions can be created automatically from a regulatory rule in a string format. This avoids creating regulatory expressions manually using the boolean expression parser. Note that Regulators are also created automatically using the identifiers in the string.

In [49]:
b0004 = Target(identifier='b0004')
# creating an interaction from string. Note that propositional logic is also accepted
b0004_interaction = Interaction.from_string(identifier='b0004_interaction',
                                            name='interaction from string creates new genes',
                                            rule='(b0005 and b0006) or (b0007 > 0)',
                                            target=b0004)
b0004_interaction

0,1
Identifier,b0004_interaction
Name,interaction from string creates new genes
Aliases,
Model,
Types,interaction
Target,b0004 || 1.0 = ((b0005 & b0006) | (b0007 > 0))
Regulators,"b0005, b0006, b0007"
Regulatory events,1.0 = ((b0005 & b0006) | (b0007 > 0))


One can change the outcome of a regulatory expression by changing the coefficients of the regulators.

In [50]:
# changing the regulatory expression by altering the regulators coefficients
b0005 = b0004_interaction.regulators['b0005']
b0005.coefficients = (0,)

b0007 = b0004_interaction.regulators['b0007']
b0007.coefficients = (0,)
b0004_interaction.regulatory_truth_table

Unnamed: 0,b0005,b0006,b0007,result
b0004,0,1.0,0,0


In [51]:
# evaluating the regulatory expression with different values for the regulators
# (this does not change the regulators' coefficients)
b0004_expression = b0004_interaction.regulatory_events.get(1)
b0004_expression.evaluate(values={'b0005': 1})

1
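
Note that the event above is displayed with the coefficient `1.0` but retrieved with the key `1`. Assuming `regulatory_events` behaves like a plain Python dictionary, this works because equal numbers hash identically in Python. The dictionary below is a hypothetical stand-in that stores the rule as a string.

```python
# Hypothetical stand-in for regulatory_events, keyed by a float coefficient.
regulatory_events = {1.0: '((b0005 & b0006) | (b0007 > 0))'}

# int 1 and float 1.0 compare equal and share the same hash,
# so either can be used to look up the same dictionary entry
print(1 == 1.0, hash(1) == hash(1.0))   # True True
print(regulatory_events.get(1))
```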

### Working with multi-type GERM model variables
A GERM model variable is by default a multi-type variable. Integrated models often include multi-type variables representing simultaneously regulators and metabolites or targets and metabolic genes, among others. A single GERM model variable can store the information of a multi-type variable. For instance, a single variable object can share attributes and methods of a metabolite and regulator. More importantly, genes associated with reactions in a metabolic model often correspond to target genes having a regulatory interaction in the regulatory model.

MEWpy builds multi-type variables when reading GERM models. To check the types of a variable, inspect `variable.types` or use a type checker such as `variable.is_regulator()`.

In [52]:
# reading the integrated regulatory-metabolic model again
model = read_model(gem_reader, trn_reader)
model

0,1
Model,e_coli_core
Name,E. coli core model - Orth et al 2010
Types,"metabolic, regulatory"
Compartments,"c, e"
Reactions,95
Metabolites,72
Genes,137
Exchanges,20
Demands,0
Sinks,0


In [53]:
# access a reaction and find all regulators associated with its genes
pdh = model.get('PDH')

pdh_regulators = []
for gene in pdh.yield_genes():
    if gene.is_target():
        pdh_regulators.extend(gene.yield_regulators())
print('PDH regulators: ', ', '.join(reg.id for reg in pdh_regulators))
pdh_regulators[0]

PDH regulators:  b0113, b3261, b0113, b3261


0,1
Identifier,b0113
Name,b0113
Aliases,"PdhR, b0113"
Model,e_coli_core
Types,"regulator, target"
Coefficients,"(0.0, 1.0)"
Active,True
Interactions,"b0114_interaction, b0115_interaction"
Targets,"b0114, b0115"
Environmental stimulus,False
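
The printed list contains each regulator twice, presumably because more than one PDH gene is a target of the same regulators (for instance, `b0114` and `b0115` share the same rule in the interactions listed earlier). A minimal way to remove such duplicates while keeping the original order is sketched below with plain identifiers.

```python
# identifiers as printed by the cell above
regulator_ids = ['b0113', 'b3261', 'b0113', 'b3261']

# dict keys are unique and preserve insertion order
unique_ids = list(dict.fromkeys(regulator_ids))
print(unique_ids)  # ['b0113', 'b3261']
```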


In [54]:
from mewpy.germ.variables import Variable

# one can create multi-type variables as follows
Variable.from_types(types=('target', 'gene'), identifier='b0001')

0,1
Identifier,b0001
Name,
Aliases,
Model,
Types,"target, gene"
Coefficients,"(0.0, 1.0)"
Active,True
Interaction,
Regulators,
Reactions,
