In [1]:
# expand cells to 95% of the display width
from IPython.display import display, HTML
display(HTML("<style>.container { width: 95% !important; }</style>"))

# Tutorial: Automatic rule-based modeling of the *Escherichia coli* lactose metabolism, including protein-protein interactions and regulation of gene expression employing *Atlas*

Authors: Rodrigo Santibáñez[1,2], Daniel Garrido[2], and Alberto Martín[1]

Date: August 2020

Affiliations:
1. Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, Santiago, 8580745, Chile.
2. Department of Chemical and Bioprocess Engineering, School of Engineering, Pontificia Universidad Católica de Chile, Santiago, 7820436, Chile

Notes: This tutorial was created for the manuscript "*Atlas*: Automatic modeling of regulation of bacterial gene expression and metabolism using rule-based languages", first submitted for peer review to Bioinformatics in May 2020.

## Prerequisites

0. The tutorial was prepared and executed on Ubuntu 20.04, PathwayTools version 24, and docker engine version 19.03.8.<br/><br/>

1. PathwayTools must be installed and running to obtain data from the EcoCyc database. Please run ```pathway-tools -lisp -python-local-only``` before requesting any data.<br/>
   (Optional) PathwayTools can be run in the background with ```nohup pathway-tools -lisp -python-local-only > /dev/null 2> /dev/null &```.<br/>
   Please follow the instructions at http://pathwaytools.org/ to obtain a licensed copy of the software from https://biocyc.org/download-bundle.shtml. Alternatively, the data can be formatted manually with a text editor or spreadsheet software.
   
   Note: If you run into the ```pathway-tools/aic-export/pathway-tools/ptools/22.5/exe/aclssl.so: undefined symbol: CRYPTO_set_locking_callback``` error, please follow the instructions at https://github.com/glucksfall/atlas/tree/master/PTools. They guide you through installing a Docker image that is able to run PathwayTools but does not include it, so you still need to obtain the software with a valid license.<br/><br/>
   
2. (Highly recommended) Install Docker. Please follow instructions for a supported Operating System https://docs.docker.com/engine/install/:<br/>
   On Ubuntu, install it with ```apt-get install docker.io```.<br/>
   On Win10, install Docker Desktop with WSL2 support https://docs.docker.com/docker-for-windows/wsl/.<br/>
   On MacOS, install Docker Desktop https://docs.docker.com/docker-for-mac/install/.<br/><br/>
   The Docker image ```networkbiolab/pleiades``` installs the Python packages, the Jupyter server, and the stochastic simulators.<br/><br/>

3. (Recommended) Jupyter notebook. We recommend the use of Anaconda3 https://www.anaconda.com/products/individual because of the easier installation of the stochastic simulators from https://anaconda.org/alubbock.<br/><br/>

4. (Optional) A stochastic simulator, supported by the pySB python package ([BNG2](https://github.com/RuleWorld/bionetgen), [NFsim](https://github.com/ruleworld/nfsim/tree/9178d44455f6e27a81f398074eeaafb2a1a4b4bd), [KaSim](https://github.com/Kappa-Dev/KappaTools) or [Stochkit](https://github.com/StochSS/StochKit)). pySB requires BNG2 to simulate models with NFsim.<br/><br/>

5. (Optional) Cytoscape to visualize metabolic networks and others.
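
Before requesting data (prerequisite 1), it can help to confirm that a local server is accepting connections. The sketch below only tests whether a TCP port is open. The port number 5008 is an assumption (believed to be the PythonCyc default), and `port_is_open` is a hypothetical helper that is not part of Atlas; the notebook itself relies on ```utils.checkPathwayTools()``` for this check.

```python
import socket

def port_is_open(host="localhost", port=5008, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False

# Port 5008 is an assumed default; adjust it to your PathwayTools setup.
print("PathwayTools port reachable:", port_is_open())
```

A `False` result only says that nothing answered on that port; it does not distinguish a stopped server from a different port configuration.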

## Installation

0. If you are running the docker image "pleiades", please go directly to the section "Preamble".<br/><br/>
1. To install, please follow one of the following steps:<br/><br/>
   1. Install the docker image "pleiades" using ```docker pull networkbiolab/pleiades```. The container is based on the Anaconda3 software and installs Atlas and the stochastic simulators BNG2, NFsim, KaSim, and Stochkit. After pulling the image, please run the container with ```docker run --detach --publish 10000:8888 networkbiolab/pleiades``` and go to ```localhost:10000``` in your preferred browser. The required password is ```pleiades```.<br/><br/>
   2. Download or clone the Github repository from https://github.com/networkbiolab/pleiades with ```git clone https://github.com/networkbiolab/pleiades foo``` (where ```foo``` is an absolute or relative path). Then, you could build the docker image with ```docker build foo --tag pleiades``` and run it with ```docker run --detach --publish 10000:8888 pleiades```. Finally, go to ```localhost:10000``` in your preferred browser. The required password is ```pleiades```.<br/><br/>
   3. Install with pip3: ```sudo -H python3 -m pip install pleiades``` or ```python3 -m pip install pleiades --user```. Pleiades is a meta-package that installs Atlas (the rule-based modeller), Pleione (a genetic algorithm for parameter calibration of RBMs), Alcyone (to perform identifiability analysis of parameters), and Sterope (to perform sensitivity analysis of parameters in kappa RBMs).<br/>
      You should install, configure, and run the Jupyter notebook on your own, for example: ```sudo -H pip3 install jupyter && nohup python3 -m jupyter notebook --port=8888 --no-browser --port-retries=0 > /dev/null 2> /dev/null &```.<br/><br/>
   4. Download or clone the GitHub repository from https://github.com/networkbiolab/atlas with ```git clone https://github.com/networkbiolab/atlas foo``` (where ```foo``` is an absolute or relative path). The requirements must be installed manually with pip3: ```sudo -H python3 -m pip install pandas pysb pythoncyc jupyter seaborn``` or ```python3 -m pip install pandas pysb pythoncyc jupyter seaborn --user```.
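
After a manual installation (options 3 and 4), a quick check of which packages can be located may save time. This is a minimal sketch that assumes each package's import name equals its pip name; `missing_packages` is a hypothetical helper, not an Atlas function.

```python
from importlib.util import find_spec

# Packages listed in installation option 4.
REQUIRED = ["pandas", "pysb", "pythoncyc", "jupyter", "seaborn"]

def missing_packages(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if find_spec(name) is None]

print("Missing packages:", missing_packages(REQUIRED) or "none")
```

The check only locates the modules; it does not import them, so a broken installation could still fail later.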

## Preamble: load *Atlas*

In [2]:
# testing source code
import sys
sys.path.append("..") # If installed from GitHub and this notebook is executed from the tutorial directory.

import atlas_rbm.atlas as atlas
import atlas_rbm.utils as utils
import atlas_rbm.export as export
import atlas_rbm.simulation as simulation

In [3]:
utils.checkPathwayTools()

PathwayTools is running. Available PGDB are: META, ECOLI


True

In [4]:
utils.execPToolsDocker('ptools-v24')

Doing nothing since PathwayTools is running.


## Modeling metabolism

In this tutorial, we model lactose degradation in *Escherichia coli* as a test bed for the software Atlas. Moreover, we couple the metabolism to the protein-protein interactions and to the gene expression and regulation that occur naturally in the bacterium. We chose the lactose metabolism because it was discovered in the 1960s and is a common model of gene regulation, backed by more than 50 years of biochemical information. As a side note, the characterization of the lactose operon and other systems earned its authors the 1965 Nobel Prize in Physiology or Medicine (https://www.nobelprize.org/prizes/medicine/1965/summary/).

The lactose operon from *E. coli* consists of three genes: the $\beta$-galactosidase gene lacZ, the lactose permease gene lacY (also known as lactose-proton symporter), and the galactoside O-acetyltransferase gene lacA. We can obtain the metabolic reactions from the ```ECOLI``` database of PathwayTools with the help of the ```utils.metabolicNetwork.FromGeneList``` function:

In [5]:
# df_genes = utils.returnCommonNames('ECOLI')
# %time network = utils.metabolicNetwork.FromGeneList('ECOLI', ['lacZ', 'lacA', 'lacY'], fmt = 'genes', precalculated = df_genes)
# network

The output is a pandas dataframe that can be exported with ```network.to_csv(path)``` or in a two-column format that Cytoscape can interpret as a network. The ```utils.metabolicNetwork.expand_network``` function reorders and exports the dataframe as a text file (in this case to ```./lactose-metabolism-cytoscape-v1.txt```).

In [6]:
# %time utils.metabolicNetwork.expand_network(network, './lactose-metabolism-cytoscape-v1.txt')
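
To illustrate the kind of table Cytoscape imports, the sketch below expands one reaction into edges with the five columns that the Cytoscape import steps in this tutorial expect (SOURCE, TARGET, EDGE ATTRIBUTE, SOURCE NODE ATTRIBUTE, TARGET NODE ATTRIBUTE). The layout is an assumption made for illustration; `reaction_edges` is a hypothetical helper, and the file written by ```expand_network``` may differ in detail.

```python
def reaction_edges(reaction, substrates, products, reversible):
    """Expand one reaction into (source, target, edge_attr, source_attr, target_attr) rows."""
    edge_attr = "REVERSIBLE" if reversible else "NO_REVERSIBLE"
    # Substrates point to the reaction node; the reaction node points to products.
    rows = [(met, reaction, edge_attr, "MET", "RXN") for met in substrates]
    rows += [(reaction, met, edge_attr, "RXN", "MET") for met in products]
    return rows

rows = reaction_edges(
    "BETAGALACTOSID-RXN",
    substrates=["beta-lactose", "WATER"],
    products=["beta-GALACTOSE", "beta-glucose"],
    reversible=False,
)
for row in rows:
    print("\t".join(row))
```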

The following image was prepared from the ```lactose-metabolism-cytoscape-v1.txt``` file, and you can reproduce it with Cytoscape:<br/>
1. Click on the ```Import Network from File System``` icon or click on ```File -> Import -> Network from File...```.
2. Navigate to the file and click on ```Open```.
3. SOURCE, TARGET, and EDGE ATTRIBUTE are OK, but the 4th column must be the SOURCE NODE ATTRIBUTE and the 5th column the TARGET NODE ATTRIBUTE. Click on each header and change it to the correct attribute. The attributes will help later to filter and format nodes and edges.
4. Click on ```Filter``` (on the right), then on the ```+``` icon and finally on ```Column Filter```:
  1. On the selector, click on ```Edge: EDGE_ATTRIBUTE``` and change ```contains``` to ```is```.
    1. Write ```NO_REVERSIBLE``` that will select edges that correspond to irreversible reactions. Click on ```Style```, then ```Edge``` (in the bottom), and click on the 3rd column to bypass the format of the ```Target Arrow Shape``` and select your favorite arrow shape.
    2. Write ```REVERSIBLE``` and bypass the format of the ```Source Arrow Shape``` AND ```Target Arrow Shape```, and select your favorite arrow shape.
  2. On the selector, click on ```Node: SOURCE_NODE_ATTRIBUTE```:
    1. Write ```RXN``` that will select nodes encoding the reactions. Click on ```Style```, then on ```Node``` and bypass the ```Fill Color```. In the new window, you could set up the color, e.g. #00AA50
    2. Write ```GENE_PROD``` that will select nodes encoding the gene name, protein name, or the enzyme name. Click on ```Style```, then on ```Node``` and bypass the ```Fill Color```. In the new window, you could set-up the color, e.g. #CC0033
    3. Write ```MET``` that will select nodes encoding substrate metabolites. Click on ```Style```, then on ```Node``` and bypass the ```Fill Color```. In the new window, you could set-up the color, e.g. #00ABDD
  3. On the selector, click on ```Node: TARGET_NODE_ATTRIBUTE```:
    1. Write ```MET``` that will select nodes encoding product metabolites. Click on ```Style```, then on ```Node``` and bypass the ```Fill Color```. In the new window, you could set-up the color, e.g. #00ABDD
    
<a id='figS1'></a>The result will be similar to <img src="lactose-metabolism-cytoscape-v1.png" alt="drawing" width="750"/>

If we inspect the network, we can highlight four things:
1. The lacA reaction is disconnected from the network formed by the lacZ and lacY reactions;
2. The lacY reactions do not specify the metabolite compartment, so substrates and products refer to the same node;
3. Certain compounds appear under database identifiers rather than common names; and
4. alpha-lactose cannot be degraded into glucose (glucopyranose) and galactose.

The advantage of the procedure is the ability to modify the data manually using Python functions (https://pandas.pydata.org/) or to export the data and manipulate it with a text editor or spreadsheet software. For routine changes, we included utility functions to modify the data:

In [7]:
# # Transport reactions
# %time network = utils.metabolicNetwork.setTransport(network, geneLst = ['lacY'], fromLst = ['PER'], toLst = ['CYT'])
# network

**Note**: By default, Atlas interprets the location of monomers as cytoplasmic. When the location is set to ```CYT```, the ```setTransport()``` function removes a previous compartment, if any, and appends nothing to the name of the monomer.
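
The naming convention in the note can be sketched as follows. This is only an illustration under stated assumptions: compartments are written as a prefix such as ```PER-``` (as in ```PER-PROTON```), the list of known prefixes is made up for the example, and `relocate` is a hypothetical helper rather than the code inside ```setTransport()```.

```python
# Assumed compartment prefixes; only 'PER' is used in this tutorial.
KNOWN_PREFIXES = ("PER", "EX")

def relocate(name, location):
    """Return the metabolite name tagged with `location`; 'CYT' means no prefix."""
    for prefix in KNOWN_PREFIXES:
        if name.startswith(prefix + "-"):
            name = name[len(prefix) + 1:]  # drop a previous compartment tag
            break
    return name if location == "CYT" else f"{location}-{name}"

print(relocate("alpha-lactose", "PER"))
print(relocate("PER-PROTON", "CYT"))
```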

In [8]:
# # Irreversibility of reactions per gene
# %time network = utils.metabolicNetwork.setIrreversibility(network, geneLst = ['lacY', 'lacA'])
# network

In [9]:
# # Irreversibility of reactions per reaction. The beta-galactosidase also has isomerase activity. Note: The data from BioCyc correctly shows the reversibility of lacZ reactions
# %time network = utils.metabolicNetwork.setIrreversibility(network, rxnLst = ['BETAGALACTOSID-RXN', 'RXN-17726', 'RXN0-7219'])
# network

In [10]:
# # Compartment of reactions. lacY encodes an inner membrane protein. For teaching purposes only: BioCyc correctly reports the location of lacY
# %time network = utils.metabolicNetwork.setEnzymeLocation(network, geneLst = ['lacY'], compartmentLst = ['MEM'])
# network

In [11]:
%time utils.metabolicNetwork.expand_network(network, './lactose-metabolism-cytoscape-v2.txt')

NameError: name 'network' is not defined

<img src="lactose-metabolism-cytoscape-v2.png" alt="drawing" width="750"/>

We also considered anomers and non-enzymatic reactions to show that the method is valid for that kind of reaction, since the EcoCyc database reports 145 spontaneous reactions:

<!---
The modeling of similar corrections for other enzymes could be useful to understand the dynamic properties of metabolic pathways before experimental validation of the kinetics properties of each enzyme. For the case of lactose degradation, simulations of the curated metabolic network are shown in Fig.~\ref{fig:01}B for the two anomers of glucose, galactose, and allolactose produced from a source of 100 molecules (or an arbitrary concentration unit) of $\beta$-lactose. % for the metabolic reactions considered up to now. %As expected and with $\beta$-lactose, protons, and water as the only available substrates in the model, the dead rules reported by KaSA refer to the consumption of melibiose,do not represent any biological system lactulose, 3-O-galactosylarabinose, and melibionate; and the acetylation of galactose (because of unavailability of acetyl-CoA in the model).
--->

In [12]:
import pythoncyc
len(pythoncyc.select_organism('ECOLI').all_rxns(type_of_reactions = 'spontaneous'))

145

Data from the literature (e.g. Huber et al., 1981; Juers et al., 2012) was used to complete the data derived from the BioCyc database and was added manually to the network (Supplementary Table S1, Figure S1, and Table S2). The final network is depicted in Fig. 1A and here.

In [13]:
%time network = atlas.read_network('network-lactose-metabolism-complex.tsv')
%time utils.metabolicNetwork.expand_network(network, './lactose-metabolism-cytoscape-v3.txt')
network

CPU times: user 4.55 ms, sys: 143 µs, total: 4.69 ms
Wall time: 4.17 ms
CPU times: user 0 ns, sys: 2.08 ms, total: 2.08 ms
Wall time: 2.07 ms


Unnamed: 0,GENE OR COMPLEX,ENZYME LOCATION,REACTION,SUBSTRATES,PRODUCTS,FWD_RATE,RVS_RATE
0,spontaneous,"[cytosol,periplasmic space]",LACTOSE-MUTAROTATION,alpha-lactose,beta-lactose,1,1
1,spontaneous,"[cytosol,periplasmic space]",GALACTOSE-MUTAROTATION,alpha-GALACTOSE,beta-GALACTOSE,1,1
2,spontaneous,"[cytosol,periplasmic space]",GLUCOSE-MUTAROTATION,alpha-glucose,beta-glucose,1,1
3,lacY,inner membrane,TRANS-RXN-24,"PER-PROTON,PER-alpha-lactose","PROTON,alpha-lactose",1,0
4,lacY,inner membrane,TRANS-RXN-24-beta,"PER-PROTON,PER-beta-lactose","PROTON,beta-lactose",1,0
5,lacY,inner membrane,TRANS-RXN-94,"PER-PROTON,PER-MELIBIOSE","PROTON,MELIBIOSE",1,0
6,lacY,inner membrane,RXN0-7215,"PER-PROTON,PER-CPD-3561","PROTON,CPD-3561",1,0
7,lacY,inner membrane,RXN0-7217,"PER-PROTON,PER-CPD-3785","PROTON,CPD-3785",1,0
8,lacY,inner membrane,RXN-17755,"PER-PROTON,PER-CPD-3801","PROTON,CPD-3801",1,0
9,"[lacZ,lacZ,lacZ,lacZ]","[cytosol,cytosol,cytosol,cytosol]",BETAGALACTOSID-RXN,"beta-lactose,WATER","beta-GALACTOSE,beta-glucose",1,0


<img src="lactose-metabolism-cytoscape-v3.png" alt="drawing" width="750"/>

In [47]:
# %time model = atlas.construct_model_from_metabolic_network(network, verbose = False) # use the dataframe to build the model
%time model = atlas.construct_model_from_metabolic_network('network-lactose-metabolism-complex.tsv', verbose = False) # read a file to build the model

CPU times: user 10.9 s, sys: 31.8 ms, total: 10.9 s
Wall time: 10.9 s


We can inspect the model by accessing its properties:

In [15]:
# model.monomers
model.monomers[0] # for simplicity of the output

Monomer('met', ['name', 'loc', 'prot'], {'name': ['ACETYL_COA', 'CO_A', 'CPD_3561', 'CPD_3785', 'CPD_3801', 'D_ARABINOSE', 'Fructofuranose', 'MELIBIOSE', 'PROTON', 'WATER', '_6_Acetyl_beta_D_Galactose', 'alpha_ALLOLACTOSE', 'alpha_GALACTOSE', 'alpha_glucose', 'alpha_lactose', 'beta_ALLOLACTOSE', 'beta_GALACTOSE', 'beta_glucose', 'beta_lactose'], 'loc': ['cyt', 'cytosk', 'ex', 'mem', 'per', 'wall', 'bnuc', 'cproj', 'imem', 'omem']})
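
The names in this output differ from those in the network file (for instance, ```alpha-lactose``` becomes ```alpha_lactose```), presumably because they must be valid Python identifiers. The sketch below shows one plausible mapping inferred only from the output above; the input spelling ```6-Acetyl-beta-D-Galactose``` is an assumption, `to_identifier` is a hypothetical helper, and the conversion Atlas actually applies may differ.

```python
import re

def to_identifier(name):
    """Turn a database name into a valid Python identifier."""
    ident = re.sub(r"\W", "_", name)  # replace every non-word character
    if ident[:1].isdigit():
        ident = "_" + ident           # identifiers cannot start with a digit
    return ident

for name in ["alpha-lactose", "ACETYL-COA", "6-Acetyl-beta-D-Galactose"]:
    print(name, "->", to_identifier(name))
```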

In [48]:
# model.rules
model.rules[0] # for simplicity of the output

Rule('LACTOSE_MUTAROTATION_CYT', met(name='alpha_lactose', loc='cyt', prot=None) | met(name='beta_lactose', loc='cyt', prot=None), fwd_LACTOSE_MUTAROTATION_CYT, rvs_LACTOSE_MUTAROTATION_CYT)

In [17]:
# model.initials
model.initials[0] # for simplicity of the output

Initial(met(name='ACETYL_COA', loc='cyt', prot=None), t0_met_ACETYL_COA_cyt)

In [18]:
# model.parameters
print(model.parameters[0]) # for simplicity of the output

Parameter('t0_met_ACETYL_COA_cyt', 0.0)


In [19]:
# model.observables
print(model.observables[0]) # for simplicity of the output

Observable('obs_met_ACETYL_COA_cyt', met(name='ACETYL_COA', loc='cyt', prot=None))


In [20]:
utils.analyseConnectivity(model, '/opt/git-repositories/KaSim.Kappa-Dev/KaSa')

Every rule may be applied.
Every monomer and complex of monomers may occur in the model.


To simulate, we need to set the initial condition:

In [21]:
# # initial condition
# # for metabolites
# simulation.set_initial.met(model, 'beta_lactose', 'per', 100)
# simulation.set_initial.met(model, 'PROTON', 'per', 100) # required for lactose transport
# simulation.set_initial.met(model, 'WATER', 'cyt', 100) # required for lactose hydrolysis

# # for proteins
# simulation.set_initial.prot(model, 'lacY', 'imem', 1)

# # and for complexes
# simulation.set_initial.cplx(model, 'lacAx3', 'cyt', 1) # the code name for complexes is their monomers times its stoichiometry
# simulation.set_initial.cplx(model, 'lacZx4', 'cyt', 1)
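
The code names used for complexes (monomer name times stoichiometry) can be derived from the list notation of the network file, e.g. ```[lacZ,lacZ,lacZ,lacZ]```. The following is a minimal sketch, not the Atlas implementation: `complex_code_name` is a hypothetical helper, and the separator used for complexes with different monomers is an assumption, since the tutorial only shows homomers.

```python
from collections import Counter

def complex_code_name(cell):
    """Build a code name such as 'lacZx4' from a cell such as '[lacZ,lacZ,lacZ,lacZ]'."""
    monomers = [m.strip() for m in cell.strip("[]").split(",") if m.strip()]
    counts = Counter(monomers)  # keeps first-seen order of monomer names
    # Joining different monomers with '_' is an assumption for illustration.
    return "_".join(f"{name}x{n}" for name, n in counts.items())

print(complex_code_name("[lacZ,lacZ,lacZ,lacZ]"))
print(complex_code_name("[lacA,lacA,lacA]"))
```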

In [22]:
# # simulation
# bng = '/opt/git-repositories/bionetgen.RuleWorld/bng2/'
# kasim = '/opt/git-repositories/KaSim4.Kappa-Dev/'

# %time data0 = simulation.scipy(model, start = 0, finish = 2, points = 200)
# %time data1 = simulation.bngODE(model, start = 0, finish = 2, points = 200, path = bng)
# %time data2 = simulation.bngSSA(model, start = 0, finish = 2, points = 200, n_runs = 20, path = bng)
# %time data3 = simulation.kasim(model, start = 0, finish = 2, points = 200, n_runs = 20, path = kasim)

Finally, we plot the simulation results. The result of the ```simulation.ode()``` function is a pandas dataframe. In the case of stochastic simulations (SSA, KaSim, NFsim, Stochkit), the function returns a dictionary with a list of dataframes, one per simulation (```sims``` key), a dataframe with the average (```avrg``` key), and a dataframe with the standard deviation (```stdv``` key) of those simulations. Currently, we include three kinds of plots, although the user can access the dataframes and plot directly with methods in the seaborn package (https://seaborn.pydata.org/), in the pandas package (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.plot.html), or with matplotlib (https://matplotlib.org/).
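
The structure of the stochastic result can be pictured with a small stand-in. In this sketch plain lists replace the dataframes, the trajectories are made-up numbers, and the use of the sample standard deviation is an assumption; `summarize` is a hypothetical helper, not an Atlas function.

```python
from statistics import fmean, stdev

def summarize(runs):
    """Collect runs with their point-wise mean and sample standard deviation."""
    points = list(zip(*runs))  # regroup values by time point
    return {
        "sims": runs,
        "avrg": [fmean(p) for p in points],
        "stdv": [stdev(p) for p in points],
    }

# Three made-up trajectories of one observable at three time points.
runs = [[100, 90, 80], [100, 94, 84], [100, 92, 82]]
result = summarize(runs)
print(result["avrg"])
print(result["stdv"])
```

A ```fill_between``` style plot would then typically shade a band around ```avrg``` whose width is derived from ```stdv```.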

In [23]:
# import seaborn
# import matplotlib.pyplot as plt

# palette = seaborn.color_palette('colorblind')

# for kind in ['scatter', 'plot']:
#     # first plot, periplasmic concentration
#     fig, ax = plt.subplots(1, 2, figsize = (4*2, 3*1), dpi = 100)
#     simulation.plot.metabolite(data2['avrg'], 'alpha-lactose', 'per', ax = ax[0], **{'kind' : kind}, 
#                                plt_kws = {'s' : 2, 'color' : palette[0], 'label' : r'$\alpha$-lactose', 'alpha' : .5})

#     simulation.plot.metabolite(data2['avrg'], 'beta-lactose', 'per', ax = ax[0], **{'kind' : kind}, 
#                                plt_kws = {'s' : 2, 'color' : palette[1], 'label' : r'$\beta$-lactose', 'alpha' : .5})

#     # second plot, cytoplasmic concentration
#     simulation.plot.metabolite(data2['avrg'], 'alpha-GALACTOSE', 'cyt', ax = ax[1], **{'kind' : kind}, 
#                                plt_kws = {'s' : 2, 'color' : palette[2], 'label' : r'$\alpha$-galactose', 'alpha' : .5})

#     simulation.plot.metabolite(data2['avrg'], 'beta-GALACTOSE', 'cyt', ax = ax[1], **{'kind' : kind}, 
#                                plt_kws = {'s' : 2, 'color' : palette[3], 'label' : r'$\beta$-galactose', 'alpha' : .5})

#     ax[0].set_ylim(top = 100)
#     ax[1].set_ylim(top = 100)

#     seaborn.despine()

In [24]:
# import seaborn
# import matplotlib.pyplot as plt

# palette = seaborn.color_palette('colorblind')

# # first plot, periplasmic concentration
# fig, ax = plt.subplots(1, 2, figsize = (4*2, 3*1), dpi = 100)
# simulation.plot.metabolite(data2, 'alpha-lactose', 'per', ax = ax[0], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[0], 'label' : r'$\alpha$-lactose', 'alpha' : .5})

# simulation.plot.metabolite(data2, 'beta-lactose', 'per', ax = ax[0], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[1], 'label' : r'$\beta$-lactose', 'alpha' : .5})

# simulation.plot.metabolite(data2, 'alpha-GALACTOSE', 'cyt', ax = ax[1], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[2], 'label' : r'$\alpha$-galactose', 'alpha' : .5})

# simulation.plot.metabolite(data2, 'beta-GALACTOSE', 'cyt', ax = ax[1], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[3], 'label' : r'$\beta$-galactose', 'alpha' : .5})

# ax[0].set_ylim(top = 100)
# ax[1].set_ylim(top = 100)

# seaborn.despine()

As expected, the degradation of lactose into glucose and galactose is complete because most reactions are irreversible, while mutarotation allows the anomers to equilibrate. However, note that we modeled the enzymatic reactions as performed by the whole complex (e.g. four monomers of LacZ catalyze one reaction), although the biochemical evidence indicates that the monomers are catalytically active only when the complex is assembled (e.g. one LacZ tetramer catalyzes four reactions). We address this next, when we model protein-protein interactions.
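
The difference between the two conventions can be read as a counting problem. Under mass-action kinetics, the propensity of a rule scales with the number of enzyme entities that match it. The sketch below uses illustrative numbers (12 LacZ monomers fully assembled into tetramers, 100 molecules of each substrate, rate constant 1) and ignores symmetry factors; both functions are hypothetical helpers.

```python
def catalytic_matches(n_monomers, subunits_per_complex, per_monomer):
    """Count enzyme-side rule matches once all monomers are assembled."""
    n_complexes = n_monomers // subunits_per_complex
    return n_complexes * subunits_per_complex if per_monomer else n_complexes

def propensity(k, n_catalysts, n_substrate, n_water):
    """Mass-action propensity of enzyme + substrate + water -> products."""
    return k * n_catalysts * n_substrate * n_water

per_complex = catalytic_matches(12, 4, per_monomer=False)  # 3 tetramers
per_subunit = catalytic_matches(12, 4, per_monomer=True)   # 12 bound subunits
print(propensity(1.0, per_complex, 100, 100))
print(propensity(1.0, per_subunit, 100, 100))
```

With these numbers, the per-subunit convention gives a propensity four times larger than the per-complex convention, which is the factor expected for a tetramer.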

<!---
%Although system parameters could be found in databases or calibrated (e.g. with \textit{Pleione}, \citealp{Santibanez2019Pleione:}), the results show that modeling of RBMs for metabolic networks can be done in an automatized manner %and is a valid methodology to obtain genome-scale kinetic models of metabolism for deterministic and stochastic simulation.
--->

## Modeling protein-protein interactions

In [25]:
atlas.read_network('network-lactose-operon-protprot.tsv')

Unnamed: 0,SOURCE,TARGET,FWD_RATE,RVS_RATE,LOCATION
0,lacZ,lacZ,1.0,0.0,cytosol
1,"[lacZ,lacZ]","[lacZ,lacZ]",1.0,0.0,cytosol
2,lacA,lacA,1.0,0.0,cytosol
3,lacA,"[lacA,lacA]",1.0,0.0,cytosol


In [26]:
%time model2 = atlas.construct_model_from_interaction_network('network-lactose-operon-protprot.tsv')
model2

CPU times: user 1.66 s, sys: 15.5 ms, total: 1.68 s
Wall time: 1.68 s


<Model 'atlas_rbm.construct_model_from_interaction_network' (monomers: 1, rules: 4, parameters: 68, expressions: 0, compartments: 0) at 0x7f3bc94abc70>

We created a second model; next, we need to combine both models into one:

In [27]:
%time combined = atlas.combine_models([model, model2]) # you will note that model2 and combined refer to the same object
combined

CPU times: user 11.6 s, sys: 21.5 ms, total: 11.6 s
Wall time: 11.6 s


<Model 'atlas_rbm.atlas' (monomers: 2, rules: 25, parameters: 320, expressions: 0, compartments: 0) at 0x7f3c2c705af0>

In [28]:
utils.analyzeConnectivity(combined)

Every rule may be applied.
Every monomer and complex of monomers may occur in the model.


In [29]:
# # initial condition
# # for metabolites (set again doesn't hurt)
# simulation.set_initial.met(combined, 'beta_lactose', 'per', 100)
# simulation.set_initial.met(combined, 'PROTON', 'per', 100) # required for lactose transport
# simulation.set_initial.met(combined, 'WATER', 'cyt', 100) # required for lactose hydrolysis

# # for proteins
# simulation.set_initial.prot(combined, 'lacY', 'imem', 1)
# simulation.set_initial.prot(combined, 'lacZ', 'cyt', 12)
# simulation.set_initial.prot(combined, 'lacA', 'cyt', 12)

# # and for complexes. We set to zero to simulate complex assembly as a requisite for metabolic activity
# simulation.set_initial.cplx(combined, 'lacAx3', 'cyt', 0) # the code name for complexes is their monomer names times its stoichiometry
# simulation.set_initial.cplx(combined, 'lacZx4', 'cyt', 0)

# # simulation
# bng = '/opt/git-repositories/bionetgen.RuleWorld/bng2/'
# kasim = '/opt/git-repositories/KaSim4.Kappa-Dev/'

# %time data0 = simulation.scipy(combined, start = 0, finish = 2, points = 200)
# %time data1 = simulation.bngODE(combined, start = 0, finish = 2, points = 200, path = bng)
# %time data2 = simulation.bngSSA(combined, start = 0, finish = 2, points = 200, n_runs = 20, path = bng)
# %time data3 = simulation.kasim(combined, start = 0, finish = 2, points = 200, n_runs = 20, path = kasim)

In [30]:
# import seaborn
# import matplotlib.pyplot as plt

# palette = seaborn.color_palette('colorblind')

# # first plot, free monomers; second plot, assembled complexes
# fig, ax = plt.subplots(1, 2, figsize = (4*2, 3*1), dpi = 100)
# simulation.plot.protein(data2, 'lacA', 'cyt', ax = ax[0], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[0], 'label' : 'lacA monomer', 'alpha' : .5})

# simulation.plot.protein(data2, 'lacZ', 'cyt', ax = ax[0], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[1], 'label' : 'lacZ monomer', 'alpha' : .5})

# simulation.plot.cplx(data2, 'lacAx3', 'cyt', ax = ax[1], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[2], 'label' : 'lacA trimer', 'alpha' : .5})

# simulation.plot.cplx(data2, 'lacZx4', 'cyt', ax = ax[1], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[3], 'label' : 'lacZ tetramer', 'alpha' : .5})

# ax[0].set_ylim(top = 12)
# ax[1].set_ylim(top = 4)

# seaborn.despine()

In [31]:
# import seaborn
# import matplotlib.pyplot as plt

# palette = seaborn.color_palette('colorblind')

# # first plot, periplasmic concentration
# fig, ax = plt.subplots(1, 2, figsize = (4*2, 3*1), dpi = 100)
# simulation.plot.metabolite(data2, 'alpha_lactose', 'per', ax = ax[0], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[0], 'label' : r'$\alpha$-lactose', 'alpha' : .5})

# simulation.plot.metabolite(data2, 'beta_lactose', 'per', ax = ax[0], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[1], 'label' : r'$\beta$-lactose', 'alpha' : .5})

# simulation.plot.metabolite(data2, 'alpha_GALACTOSE', 'cyt', ax = ax[1], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[2], 'label' : r'$\alpha$-galactose', 'alpha' : .5})

# simulation.plot.metabolite(data2, 'beta_GALACTOSE', 'cyt', ax = ax[1], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[3], 'label' : r'$\beta$-galactose', 'alpha' : .5})

# ax[0].set_ylim(top = 100)
# ax[1].set_ylim(top = 100)

# seaborn.despine()

## Correcting rules: $\beta$-galactosidase and galactoside O-acetyltransferase activity per monomer, not per complex.

In [32]:
# %time model = atlas.construct_model_from_metabolic_network(network, verbose = False)
%time model = atlas.construct_model_from_metabolic_network('network-lactose-metabolism-genes.tsv', verbose = False)

CPU times: user 9.63 s, sys: 19 ms, total: 9.65 s
Wall time: 9.66 s


In [33]:
model.rules[12]

Rule('BETAGALACTOSID_RXN', prot(name='lacZ', loc='cyt') + met(name='beta_lactose', loc='cyt', prot=None) + met(name='WATER', loc='cyt', prot=None) | prot(name='lacZ', loc='cyt') + met(name='beta_GALACTOSE', loc='cyt', prot=None) + met(name='beta_glucose', loc='cyt', prot=None), fwd_BETAGALACTOSID_RXN, rvs_BETAGALACTOSID_RXN)

In [34]:
model = atlas.replace_rule(model, 'BETAGALACTOSID_RXN', "Rule('BETAGALACTOSID_RXN', prot(name='lacZ', loc='cyt', up = ANY, dw = ANY) + met(name='beta_lactose', loc='cyt', prot=None) + met(name='WATER', loc='cyt', prot=None) | prot(name='lacZ', loc='cyt', up = ANY, dw = ANY) + met(name='beta_GALACTOSE', loc='cyt', prot=None) + met(name='beta_glucose', loc='cyt', prot=None), fwd_BETAGALACTOSID_RXN, rvs_BETAGALACTOSID_RXN)")

In [35]:
model.rules[-1] # the new rule appears at the end of the "rules" list

Rule('BETAGALACTOSID_RXN', prot(name='lacZ', loc='cyt', up=ANY, dw=ANY) + met(name='beta_lactose', loc='cyt', prot=None) + met(name='WATER', loc='cyt', prot=None) | prot(name='lacZ', loc='cyt', up=ANY, dw=ANY) + met(name='beta_GALACTOSE', loc='cyt', prot=None) + met(name='beta_glucose', loc='cyt', prot=None), fwd_BETAGALACTOSID_RXN, rvs_BETAGALACTOSID_RXN)

In [36]:
# %time model = atlas.construct_model_from_metabolic_network(network, verbose = False)
%time model = atlas.construct_model_from_metabolic_network('network-lactose-metabolism-genes.tsv', verbose = False)

CPU times: user 9.76 s, sys: 31.6 ms, total: 9.79 s
Wall time: 9.79 s


In [37]:
model.rules[12:20]

[Rule('BETAGALACTOSID_RXN', prot(name='lacZ', loc='cyt') + met(name='beta_lactose', loc='cyt', prot=None) + met(name='WATER', loc='cyt', prot=None) | prot(name='lacZ', loc='cyt') + met(name='beta_GALACTOSE', loc='cyt', prot=None) + met(name='beta_glucose', loc='cyt', prot=None), fwd_BETAGALACTOSID_RXN, rvs_BETAGALACTOSID_RXN),
 Rule('BETAGALACTOSID_RXN_alpha', prot(name='lacZ', loc='cyt') + met(name='alpha_lactose', loc='cyt', prot=None) + met(name='WATER', loc='cyt', prot=None) | prot(name='lacZ', loc='cyt') + met(name='alpha_GALACTOSE', loc='cyt', prot=None) + met(name='alpha_glucose', loc='cyt', prot=None), fwd_BETAGALACTOSID_RXN_alpha, rvs_BETAGALACTOSID_RXN_alpha),
 Rule('RXN0_5363', prot(name='lacZ', loc='cyt') + met(name='alpha_lactose', loc='cyt', prot=None) | prot(name='lacZ', loc='cyt') + met(name='alpha_ALLOLACTOSE', loc='cyt', prot=None), fwd_RXN0_5363, rvs_RXN0_5363),
 Rule('RXN0_5363_beta', prot(name='lacZ', loc='cyt') + met(name='beta_lactose', loc='cyt', prot=None) | pr

In [38]:
model = atlas.modify_rules(
    model, 
    oldString = "prot(name='lacZ', loc='cyt')",
    newString = "prot(name='lacZ', loc='cyt', up = ANY, dw = ANY)",
    names = [
        'BETAGALACTOSID_RXN',
        'BETAGALACTOSID_RXN_alpha', 
        'RXN0_5363', 
        'RXN0_5363_beta', 
        'ALLOLACTOSE_DEG_alpha', 
        'ALLOLACTOSE_DEG_beta', 
        'RXN_17726', 
        'RXN0_7219'])
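
The parameter names ```oldString``` and ```newString``` suggest a textual substitution applied to the selected rules. The following sketch illustrates that idea on plain strings; it is an assumption about the behavior, not the Atlas implementation, and `substitute_in_rules` and `OTHER_RULE` are hypothetical names.

```python
def substitute_in_rules(rules, old, new, names):
    """Return a copy of `rules` (name -> text) with `old` replaced by `new` in the named rules."""
    return {
        name: text.replace(old, new) if name in names else text
        for name, text in rules.items()
    }

rules = {
    "RXN0_5363": "prot(name='lacZ', loc='cyt') + met(name='alpha_lactose', loc='cyt', prot=None) | "
                 "prot(name='lacZ', loc='cyt') + met(name='alpha_ALLOLACTOSE', loc='cyt', prot=None)",
    "OTHER_RULE": "prot(name='lacZ', loc='cyt')",  # not listed, so left untouched
}
updated = substitute_in_rules(
    rules,
    old="prot(name='lacZ', loc='cyt')",
    new="prot(name='lacZ', loc='cyt', up = ANY, dw = ANY)",
    names=["RXN0_5363"],
)
print(updated["RXN0_5363"])
```

Both sides of the rule receive the new site conditions, so the enzyme pattern stays identical on the reactant and product sides.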

In [39]:
model.rules[13:21] # the new rules appear at the end of the list

[Rule('BETAGALACTOSID_RXN', prot(name='lacZ', loc='cyt', up=ANY, dw=ANY) + met(name='beta_lactose', loc='cyt', prot=None) + met(name='WATER', loc='cyt', prot=None) | prot(name='lacZ', loc='cyt', up=ANY, dw=ANY) + met(name='beta_GALACTOSE', loc='cyt', prot=None) + met(name='beta_glucose', loc='cyt', prot=None), fwd_BETAGALACTOSID_RXN, rvs_BETAGALACTOSID_RXN),
 Rule('BETAGALACTOSID_RXN_alpha', prot(name='lacZ', loc='cyt', up=ANY, dw=ANY) + met(name='alpha_lactose', loc='cyt', prot=None) + met(name='WATER', loc='cyt', prot=None) | prot(name='lacZ', loc='cyt', up=ANY, dw=ANY) + met(name='alpha_GALACTOSE', loc='cyt', prot=None) + met(name='alpha_glucose', loc='cyt', prot=None), fwd_BETAGALACTOSID_RXN_alpha, rvs_BETAGALACTOSID_RXN_alpha),
 Rule('RXN0_5363', prot(name='lacZ', loc='cyt', up=ANY, dw=ANY) + met(name='alpha_lactose', loc='cyt', prot=None) | prot(name='lacZ', loc='cyt', up=ANY, dw=ANY) + met(name='alpha_ALLOLACTOSE', loc='cyt', prot=None), fwd_RXN0_5363, rvs_RXN0_5363),
 Rule('RXN

In [40]:
# %time model = atlas.construct_model_from_metabolic_network(network, verbose = False)
%time model_ppi = atlas.construct_model_from_interaction_network('network-lactose-operon-protprot.tsv', verbose = False)

CPU times: user 1.34 s, sys: 0 ns, total: 1.34 s
Wall time: 1.34 s


In [41]:
model_ppi.rules[:]

[Rule('PhysicalInteractionRule_1', prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=None, dw=None) + prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=None, dw=None) | prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=None, dw=1) % prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=1, dw=None), fwd_PhysicalInteractionRule_1, rvs_PhysicalInteractionRule_1),
 Rule('PhysicalInteractionRule_2', prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=None, dw=2) % prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=2, dw=None) + prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=None, dw=1) % prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=1, dw=None) | prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=None, dw=1) % prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=1, dw=

In [42]:
model_ppi = atlas.replace_rule(model_ppi, 'PhysicalInteractionRule_2', 
                                "Rule('PhysicalInteractionRule_2', \
                                    prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=None, dw=2) % \
                                    prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=2, dw=None) + \
                                    prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=None, dw=1) % \
                                    prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=1, dw=None) | \
                                    prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=4, dw=1) % \
                                    prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=1, dw=2) % \
                                    prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=2, dw=3) % \
                                    prot(name='lacZ', loc='cyt', dna=None, met=None, prot=None, rna=None, up=3, dw=4), \
                                    fwd_PhysicalInteractionRule_2, rvs_PhysicalInteractionRule_2)")

In [43]:
model_ppi = atlas.replace_rule(model_ppi, 'PhysicalInteractionRule_4',
                                "Rule('PhysicalInteractionRule_4', \
                                    prot(name='lacA', loc='cyt', dna=None, met=None, prot=None, rna=None, up=None, dw=None) + \
                                    prot(name='lacA', loc='cyt', dna=None, met=None, prot=None, rna=None, up=None, dw=1) % \
                                    prot(name='lacA', loc='cyt', dna=None, met=None, prot=None, rna=None, up=1, dw=None) | \
                                    prot(name='lacA', loc='cyt', dna=None, met=None, prot=None, rna=None, up=3, dw=1) % \
                                    prot(name='lacA', loc='cyt', dna=None, met=None, prot=None, rna=None, up=1, dw=2) % \
                                    prot(name='lacA', loc='cyt', dna=None, met=None, prot=None, rna=None, up=2, dw=3), \
                                    fwd_PhysicalInteractionRule_4, rvs_PhysicalInteractionRule_4)")
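
The two replacement rules above close the lacZ tetramer and the lacA trimer into rings by reusing integer bond labels on the `up` and `dw` sites. In rule-based patterns, a bond label is expected to appear exactly twice within a complex, once on each partner. The following is a minimal, standalone sketch of that sanity check in plain Python. It does not call *Atlas* or PySB, the helper `bond_label_counts` is hypothetical, and the patterns are abbreviated to the sites relevant for the check.

```python
import re
from collections import Counter

def bond_label_counts(pattern):
    """Count how often each integer bond label occurs on the up/dw sites."""
    labels = re.findall(r"\b(?:up|dw)=(\d+)", pattern)
    return Counter(int(label) for label in labels)

# Product side of PhysicalInteractionRule_2 (sites abbreviated for the sketch)
tetramer = (
    "prot(name='lacZ', loc='cyt', up=4, dw=1) % "
    "prot(name='lacZ', loc='cyt', up=1, dw=2) % "
    "prot(name='lacZ', loc='cyt', up=2, dw=3) % "
    "prot(name='lacZ', loc='cyt', up=3, dw=4)"
)

# Product side of PhysicalInteractionRule_4 (sites abbreviated for the sketch)
trimer = (
    "prot(name='lacA', loc='cyt', up=3, dw=1) % "
    "prot(name='lacA', loc='cyt', up=1, dw=2) % "
    "prot(name='lacA', loc='cyt', up=2, dw=3)"
)

# A well-formed complex uses every bond label exactly twice
tetramer_ok = all(n == 2 for n in bond_label_counts(tetramer).values())
trimer_ok = all(n == 2 for n in bond_label_counts(trimer).values())
print(tetramer_ok, trimer_ok)
```

A label that appears only once would leave a dangling bond, so a check of this kind can be useful before passing a hand-written rule string to `atlas.replace_rule`.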

In [44]:
combined = atlas.combine_models([model, model_ppi])
combined

<Model 'atlas_rbm.atlas' (monomers: 2, rules: 25, parameters: 320, expressions: 0, compartments: 0) at 0x7f3bc91b9070>
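
The combined model holds the rules of both the metabolic and the protein-protein interaction models. As a rough illustration only, the sketch below merges two lists of rule names and rejects duplicates, because components in one model need unique names. The function `merge_rule_names` is hypothetical and is not the implementation of `atlas.combine_models`; the names are a subset of the rules shown above.

```python
def merge_rule_names(*rule_lists):
    """Concatenate lists of rule names, rejecting names that occur twice."""
    merged, seen = [], set()
    for rules in rule_lists:
        for name in rules:
            if name in seen:
                raise ValueError(f"duplicate rule name: {name}")
            seen.add(name)
            merged.append(name)
    return merged

# Subsets of the rule names shown above, used only for illustration
metabolic = ['BETAGALACTOSID_RXN', 'BETAGALACTOSID_RXN_alpha', 'RXN0_5363']
interaction = ['PhysicalInteractionRule_1', 'PhysicalInteractionRule_2']

merged = merge_rule_names(metabolic, interaction)
print(len(merged), merged[-1])
```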

In [45]:
# # initial conditions
# # for metabolites (setting them again is harmless)
# simulation.set_initial.met(combined, 'beta_lactose', 'per', 100)
# simulation.set_initial.met(combined, 'PROTON', 'per', 100) # required for lactose transport
# simulation.set_initial.met(combined, 'WATER', 'cyt', 100) # required for lactose hydrolysis

# # for proteins
# simulation.set_initial.prot(combined, 'lacY', 'imem', 1)
# simulation.set_initial.prot(combined, 'lacZ', 'cyt', 12)
# simulation.set_initial.prot(combined, 'lacA', 'cyt', 12)

# # and for complexes. We set them to zero so that complex assembly is a prerequisite for metabolic activity
# simulation.set_initial.cplx(combined, 'lacAx3', 'cyt', 0) # the code name of a complex is built from its monomer names and their stoichiometry (here, lacA x 3)
# simulation.set_initial.cplx(combined, 'lacZx4', 'cyt', 0)

# # simulation
# bng = '/opt/git-repositories/bionetgen.RuleWorld/bng2/'
# kasim = '/opt/git-repositories/KaSim4.Kappa-Dev/'

# %time data0 = simulation.scipy(combined, start = 0, finish = 2, points = 200)
# %time data1 = simulation.bngODE(combined, start = 0, finish = 2, points = 200, path = bng)
# %time data2 = simulation.bngSSA(combined, start = 0, finish = 2, points = 200, n_runs = 20, path = bng)
# %time data3 = simulation.kasim(combined, start = 0, finish = 2, points = 200, n_runs = 20, path = kasim)
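
The cell above names complexes such as `lacAx3` and `lacZx4` and starts with 12 free copies each of lacZ and lacA. As a small sketch under the assumption that a homomeric code name has the form `<monomer>x<copies>`, the hypothetical helper `parse_homomer_code` below splits such a name and computes how many complete complexes the initial monomer counts could form at most. It is plain Python and does not depend on *Atlas*.

```python
import re

def parse_homomer_code(code):
    """Split a code name such as 'lacZx4' into ('lacZ', 4)."""
    match = re.fullmatch(r"(?P<monomer>\w+?)x(?P<copies>\d+)", code)
    if match is None:
        raise ValueError(f"not a homomer code name: {code}")
    return match.group('monomer'), int(match.group('copies'))

# Initial monomer counts taken from the commented cell above
initial_monomers = {'lacZ': 12, 'lacA': 12}

for code in ('lacZx4', 'lacAx3'):
    monomer, copies = parse_homomer_code(code)
    # Integer division gives the upper bound on fully assembled complexes
    max_complexes = initial_monomers[monomer] // copies
    print(code, monomer, copies, max_complexes)
```

With these counts, at most three lacZ tetramers and four lacA trimers can assemble, which bounds the amount of active enzyme in the simulations.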

In [46]:
# import seaborn
# import matplotlib.pyplot as plt

# palette = seaborn.color_palette('colorblind')

# # left panel: periplasmic lactose; right panel: cytoplasmic galactose
# fig, ax = plt.subplots(1, 2, figsize = (4*2, 3*1), dpi = 100)
# simulation.plot.metabolite(data2, 'alpha_lactose', 'per', ax = ax[0], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[0], 'label' : r'$\alpha$-lactose', 'alpha' : .5})

# simulation.plot.metabolite(data2, 'beta_lactose', 'per', ax = ax[0], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[1], 'label' : r'$\beta$-lactose', 'alpha' : .5})

# simulation.plot.metabolite(data2, 'alpha_GALACTOSE', 'cyt', ax = ax[1], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[2], 'label' : r'$\alpha$-galactose', 'alpha' : .5})

# simulation.plot.metabolite(data2, 'beta_GALACTOSE', 'cyt', ax = ax[1], **{'kind' : 'fill_between', 'weight' : .5}, 
#     plt_kws = {'s' : 2, 'color' : palette[3], 'label' : r'$\beta$-galactose', 'alpha' : .5})

# ax[0].set_ylim(top = 100)
# ax[1].set_ylim(top = 100)

# seaborn.despine()

## Modeling DNA transcription and RNA translation: Coupling gene expression to metabolism