# Functions

In [1]:
from IPython.display import display
from cobramod import __version__
print(__version__)
# From Escher:
# This option turns off the warning message if you leave or refresh this page
import escher
escher.rc['never_ask_before_quit'] = True

Scaling...
 A: min|aij| =  1.000e+00  max|aij| =  1.000e+00  ratio =  1.000e+00
Problem data seem to be well scaled
0.5.4


## Retrieving metabolic pathway information

CobraMod can obtain metabolic pathway information from multiple databases. It supports all databases from the BioCyc collection, the
KEGG database, Plant Metabolic Pathway Databases and the BiGG Models repository.
Metabolic pathway information comes in the form of metabolites, reactions and
pathways. The users can import `cobramod.available_databases` and display it to
see the supported databases.

Each database uses its own internal identifiers. CobraMod uses
these internal identifiers to download and store the metabolic pathway
information.

In [2]:
from cobramod import available_databases
                                         
available_databases

0,1
Database,URL with identifier (bold)
"BioCyc, sub-database ECOLI",https://biocyc.org/compound?orgid=ECOLI&id=PPI
"Plant Metabolic Network, sub-database CORN",https://pmn.plantcyc.org/compound?orgid=CORN&id=PPI
KEGG,https://www.genome.jp/entry/C00013
"BiGG Models Repository, universal model",http://bigg.ucsd.edu/universal/metabolites/ppi

0,1
Database,Abbreviation
BioCyc,"META or identifier of sub-database e.g: ECOLI, ARA, GCF_000010885"
Plant Metabolic Network,"Prefix ""pmn:"" with the sub-database identifier, e.g pmn:PLANT, pmn:ARA, pmn:CORN"
KEGG,KEGG
BiGG Models Repository,BIGG
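
The abbreviation rules in the table above can be pictured with a small lookup. This is only an illustrative sketch: `classify_database` is a hypothetical helper, not part of CobraMod, and it assumes that every abbreviation other than `KEGG`, `BIGG` or a `pmn:` prefix names a BioCyc sub-database.

```python
def classify_database(abbreviation: str) -> str:
    """Guess the database family from an abbreviation (hypothetical helper)."""
    if abbreviation.startswith("pmn:"):
        return "Plant Metabolic Network"
    if abbreviation == "KEGG":
        return "KEGG"
    if abbreviation == "BIGG":
        return "BiGG Models Repository"
    # Assumption: anything else (META, ECOLI, ARA, ...) is a BioCyc sub-database.
    return "BioCyc"


for name in ["META", "ECOLI", "pmn:CORN", "KEGG", "BIGG"]:
    print(name, "->", classify_database(name))
```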




The users can download the metabolic pathway information using the
`cobramod.get_data` function. In this example we download information from 
MetaCyc:


In [3]:
from cobramod import get_data
from pathlib import Path

dir_data = Path.cwd().resolve().joinpath("data")
identifiers = [
    "CPD-14074",
    "CPD-14075",
    "CPD-14076",
    "CPD-14553",
    "CPD-15317",
    "CPD-15322",
    "CPD-15323",
]

for metabolite in identifiers:
    get_data(
        directory=dir_data,
        identifier=metabolite,
        database="META"
    )

The first argument in [cobramod.get_data()](
module/cobramod/index.html#cobramod.get_data) is the system path where
CobraMod stores the metabolic pathway information. We use [pathlib](
https://docs.python.org/3/library/pathlib.html#pathlib.Path) for path 
representation.

The second argument is the identifier as used in the given database. In
this example we retrieve the data from MetaCyc (META). The last argument
corresponds to the abbreviation of the database.

CobraMod creates a directory with the name of the database and stores the
metabolic pathway information in it:

```
data
`-- META
    |-- CPD-14074.xml
    |-- CPD-14075.xml
    |-- CPD-14076.xml
    |-- CPD-14553.xml
    |-- CPD-15317.xml
    |-- CPD-15322.xml
    `-- CPD-15323.xml
```
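
Given the directory layout above, it is straightforward to check which identifiers are already stored locally. The following is a minimal sketch using only `pathlib`; `missing_identifiers` is a hypothetical helper, and the `.xml` extension is an assumption taken from the tree shown above (other databases may use a different file format).

```python
from pathlib import Path
from typing import Iterable, List


def missing_identifiers(
    directory: Path, database: str, identifiers: Iterable[str]
) -> List[str]:
    """Return identifiers without a `<directory>/<database>/<identifier>.xml` file."""
    return [
        identifier
        for identifier in identifiers
        # Assumption: one XML file per identifier, as in the tree above.
        if not (directory / database / f"{identifier}.xml").is_file()
    ]


dir_data = Path.cwd().resolve().joinpath("data")
print(missing_identifiers(dir_data, "META", ["CPD-14074", "CPD-14075"]))
```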

## Converting the stored data to COBRApy objects

If the users want to analyze models using Constraint-Based Reconstruction and
Analysis (COBRA) methods, they can use COBRApy. Our package is able to
convert the metabolic pathway information into COBRApy objects so they can be
incorporated into the model of interest.

The function [cobramod.create_object()](
module/cobramod/index.html#cobramod.create_object) creates COBRApy objects from 
the metabolic pathway information retrieved using [cobramod.get_data](
module/cobramod/index.html#cobramod.get_data). If the metabolic
pathway information has not been downloaded yet, this function retrieves it automatically.

The users can create Reactions or Metabolites with this function instead of
building them from scratch.

In this example, we convert the metabolite *2-Oxoglutarate* with the
KEGG identifier [C00026](
https://www.genome.jp/dbget-bin/www_bget?C00026) to a COBRApy object.
CobraMod can identify the KEGG entry as a metabolite and converts it into a
COBRApy object. 

The first argument is the database-specific identifier (`C00026`), followed by
the database abbreviation (`KEGG`). The third argument is the path
to the directory that holds the metabolic pathway information. The last argument is the compartment of the metabolite (`c` for cytosol).

In [4]:
from cobramod import create_object
from pathlib import Path

# Path for the metabolic pathway information directory                                                                        
dir_data = Path.cwd().resolve().joinpath("data")

new_object = create_object(
    identifier="C00026",
    database="KEGG",
    directory=dir_data,
    compartment="c"
)
                                             
print(type(new_object))
new_object

<class 'cobra.core.metabolite.Metabolite'>


0,1
Metabolite identifier,C00026_c
Name,2-Oxoglutarate;
Memory address,0x07f20d88ef650
Formula,C5H6O5
Compartment,c
In 0 reaction(s),


In this other example, we convert the reaction [RXN-11501](
https://pmn.plantcyc.org/CORN/NEW-IMAGE?object=RXN-11501) from the Plant
Metabolic Network sub-database CORN to a COBRApy Reaction. The
first argument is the database-specific identifier (`RXN-11501`), followed by
the database identifier (`pmn:CORN`). CobraMod uses the
reversibility and genes stated in the original database entry.

In [5]:
new_object = create_object(
    identifier="RXN-11501",
    database="pmn:CORN",
    directory=dir_data,
    compartment="c"
)
                                             
print(type(new_object))
display(new_object)
new_object.genes

<class 'cobra.core.reaction.Reaction'>


0,1
Reaction identifier,RXN_11501_c
Name,alkaline α- galactosidase
Memory address,0x07f20d88f52d0
Stoichiometry,CPD_170_c + WATER_c --> ALPHA_D_GALACTOSE_c + CPD_1099_c  stachyose + H2O --> alpha-D-galactopyranose + raffinose
GPR,ZM00001D031300 or ZM00001D031303 or ZM00001D003279
Lower bound,0
Upper bound,1000


frozenset({<Gene ZM00001D003279 at 0x7f20d88fbe90>,
           <Gene ZM00001D031300 at 0x7f20d88fbf90>,
           <Gene ZM00001D031303 at 0x7f20d88fbed0>})


## Adding metabolites

The function [cobramod.add_metabolites()](
module/cobramod/index.html#cobramod.add_metabolites)
extends the COBRApy function [model.add_metabolites()](
https://cobrapy.readthedocs.io/en/latest/autoapi/cobra/core/model/index.html#cobra.core.model.Model.add_metabolites
) and can be used with a simple syntax. It accepts a single string, a list of strings, a file path or a COBRApy Metabolite object. In the next
examples we showcase these options. We use the *E. coli* core model from COBRApy as test model. This core model can be found under `cobramod.test.textbook`. 

When the users use `obj` as a string, this string can be the
database-specific identifier of the metabolite of interest and its compartment.
It is also possible to add user-curated metabolites. This argument uses the following syntax:

------

**SYNTAX**  
To retrieve metabolic pathway information from a database:

    database-specific_identifier, compartment

To add user-curated metabolites:

    user-curated_identifier, name, compartment, chemical_formula, molecular_charge

------
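
To make the two forms concrete, the sketch below splits such a string into named fields. It is an illustration only: `parse_metabolite_line` is a hypothetical helper and does not reflect how CobraMod parses the string internally. It assumes that names do not contain commas.

```python
from typing import Dict, Union


def parse_metabolite_line(line: str) -> Dict[str, Union[str, float]]:
    """Split an `obj` string into its fields (hypothetical helper)."""
    fields = [field.strip() for field in line.split(",")]
    if len(fields) == 2:
        # Form 1: database-specific_identifier, compartment
        identifier, compartment = fields
        return {"identifier": identifier, "compartment": compartment}
    if len(fields) == 5:
        # Form 2: identifier, name, compartment, chemical_formula, molecular_charge
        identifier, name, compartment, formula, charge = fields
        return {
            "identifier": identifier,
            "name": name,
            "compartment": compartment,
            "formula": formula,
            "charge": float(charge),
        }
    raise ValueError(
        f"Expected 2 or 5 comma-separated fields, got {len(fields)}: {line!r}"
    )


print(parse_metabolite_line("MET, c"))
print(parse_metabolite_line("MALTOSE_c, MALTOSE[c], c, C12H22O11, 1"))
```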

In the first example we add the metabolite *L-methionine* with the MetaCyc
identifier [MET](
https://metacyc.org/compound?orgid=META&id=MET) to the test model. The first
argument is the model to extend. The `obj` argument uses the identifier
`MET` and the compartment `c`. The next argument is the database identifier
(`META`), and the last one is the directory where CobraMod stores and reads the
metabolic pathway information.


In [6]:
from cobramod import add_metabolites
from cobramod.test import textbook_biocyc
from pathlib import Path

# Path for the metabolic pathway information directory                                                                        
dir_data = Path.cwd().resolve().joinpath("data")
# Using copy
test_model = textbook_biocyc.copy()

add_metabolites(
    model=test_model,
    obj="MET, c",
    database="META",
    directory=dir_data,
)
print(type(test_model.metabolites.get_by_id("MET_c")))
test_model.metabolites.get_by_id("MET_c")

<class 'cobra.core.metabolite.Metabolite'>


0,1
Metabolite identifier,MET_c
Name,L-methionine
Memory address,0x07f20d81d5d10
Formula,C5H11N1O2S1
Compartment,c
In 0 reaction(s),


In this second example we add two metabolites ([methionine](
https://metacyc.org/compound?orgid=META&id=MET) and [sucrose](
https://metacyc.org/compound?orgid=META&id=SUCROSE
)) from MetaCyc. We pass
a list with the database-specific identifiers and their
compartments to the argument `obj`. The rest of the arguments remain the same as in the previous example.
CobraMod skips metabolites that are already included in
the model and shows a warning.


In [7]:
add_metabolites(
    model=test_model,
    obj=["MET, c", "SUCROSE, c"],
    database="META",
    directory=dir_data,
)
# Show metabolites in jupyter
display(test_model.metabolites.get_by_id("MET_c"))  
test_model.metabolites.get_by_id("SUCROSE_c")



0,1
Metabolite identifier,MET_c
Name,L-methionine
Memory address,0x07f20d81d5d10
Formula,C5H11N1O2S1
Compartment,c
In 0 reaction(s),


0,1
Metabolite identifier,SUCROSE_c
Name,sucrose
Memory address,0x07f20d81d7d10
Formula,C12H22O11
Compartment,c
In 0 reaction(s),


In this third example, we use a text file to add metabolites to the test
model. The file *metabolites.txt* in the data directory (`dir_data`) has the following content:

    SUCROSE, c  
    MET, c  
    MALTOSE_c, MALTOSE[c], c, C12H22O11, 1

CobraMod downloads the first two metabolites from MetaCyc, while `MALTOSE_c`
is a user-curated metabolite. 

The users can pass the path of this file to the `obj` argument to add
the metabolites to the test model. The next arguments are the same as in
the previous examples. The two print statements show that CobraMod adds
the metabolites to the model.

In [8]:
# Path for the metabolic pathway information directory
dir_data = Path.cwd().resolve().joinpath("data")
# This is our file
file = dir_data.joinpath("metabolites.txt")
# Using a copy
test_model = textbook_biocyc.copy()

print(f'Number of metabolites prior addition: {len(test_model.metabolites)}')
# Using CobraMod
add_metabolites(
    model=test_model,
    obj=file,
    directory=dir_data,
    database="META",
)
print(f'Number of metabolites after addition: {len(test_model.metabolites)}')
# Show metabolites in jupyter
display(test_model.metabolites.get_by_id("MET_c"))
display(test_model.metabolites.get_by_id("SUCROSE_c"))
test_model.metabolites.get_by_id("MALTOSE_c")

Number of metabolites prior addition: 72
Number of metabolites after addition: 75


0,1
Metabolite identifier,MET_c
Name,L-methionine
Memory address,0x07f20d8112d90
Formula,C5H11N1O2S1
Compartment,c
In 0 reaction(s),


0,1
Metabolite identifier,SUCROSE_c
Name,sucrose
Memory address,0x07f20d8904890
Formula,C12H22O11
Compartment,c
In 0 reaction(s),


0,1
Metabolite identifier,MALTOSE_c
Name,MALTOSE[c]
Memory address,0x07f20d88fb410
Formula,C12H22O11
Compartment,c
In 0 reaction(s),


Since this function is an extension of the original COBRApy function [model.add_metabolites()](
https://cobrapy.readthedocs.io/en/latest/autoapi/cobra/core/model/index.html#cobra.core.model.Model.add_metabolites),
the users can also pass COBRApy Metabolites. In this example, we use a variation of the test model (`textbook_biocyc`) which uses
BioCyc identifiers for its metabolites. We take a COBRApy Metabolite from
the original test model and add it to the BioCyc test model. 

In [9]:
from cobramod import add_metabolites
from cobramod.test import textbook, textbook_biocyc
                        
# Copying Metabolite from original model
metabolite = textbook.metabolites.get_by_id("xu5p__D_c")
# Using a copy
test_model = textbook_biocyc.copy()
add_metabolites(
    model=test_model,
    obj=metabolite
)
                                                               
test_model.metabolites.get_by_id("xu5p__D_c")

0,1
Metabolite identifier,xu5p__D_c
Name,D-Xylulose 5-phosphate
Memory address,0x07f20d86981d0
Formula,C5H9O8P
Compartment,c
In 3 reaction(s),"RPE, TKT2, TKT1"


The users receive a warning if CobraMod detects large molecules (e.g. enzymes) or if the metabolite information does not include a chemical
formula. In this example, we add the enzyme with the MetaCyc identifier
`Red-NADPH-Hemoprotein-Reductases` to the test model. In this case,
CobraMod raises a warning due to the missing chemical formula.

In [10]:
# Using a copy
test_model = textbook.copy()

add_metabolites(
    model=test_model,
    obj="Red-NADPH-Hemoprotein-Reductases, c",
    directory=dir_data,
    database="META",
)
test_model.metabolites.get_by_id("Red_NADPH_Hemoprotein_Reductases_c")

  warn(msg)


0,1
Metabolite identifier,Red_NADPH_Hemoprotein_Reductases_c
Name,Red-NADPH-Hemoprotein-Reductases
Memory address,0x07f20d820ffd0
Formula,X
Compartment,c
In 0 reaction(s),


----------------

**NOTES**

- CobraMod replaces hyphens (`-`) with underscores (`_`) in the identifiers when
creating COBRApy Metabolites.
- The users must use the same database identifier when adding multiple
metabolites, i.e. it is not possible to use two databases in the same
function call. In that case, the users should call the function twice with the
respective database identifier.

----------------
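
The identifier convention from the first note can be reproduced in one line. This is a sketch of the naming pattern visible in the outputs above (for example `Red_NADPH_Hemoprotein_Reductases_c`); `to_cobra_identifier` is a hypothetical helper, not a CobraMod function.

```python
def to_cobra_identifier(identifier: str, compartment: str) -> str:
    """Replace hyphens with underscores and append the compartment suffix."""
    return f"{identifier.replace('-', '_')}_{compartment}"


print(to_cobra_identifier("Red-NADPH-Hemoprotein-Reductases", "c"))
print(to_cobra_identifier("MET", "c"))
```

Knowing this pattern helps when looking up an added object with `get_by_id()`, since the identifier in the model differs from the one in the database.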

## Adding reactions

The function [cobramod.add_reactions](
module/cobramod/index.html#cobramod.add_reactions) extends the COBRApy
function [model.add_reactions()](
https://cobrapy.readthedocs.io/en/latest/autoapi/cobra/index.html?highlight=optimize#cobra.Model.add_reactions
) and can be used with a simple syntax. It accepts a single string, a 
list of strings, a file path or a COBRApy Reaction object. In the examples we
showcase these options. We use the *E. coli* core model from COBRApy as test model. This core model can be found
under `cobramod.test.textbook`. 

When the users use `obj` as a string, this string can be the
database-specific identifier of the reaction of interest and its
compartment. It is also possible to add user-curated reactions. This argument
uses the following syntax:

--------

**SYNTAX**  

To retrieve reactions from a database:

    database-specific_identifier, compartment

In case of user-curated reactions, the users can specify the identifier and the name of the reaction, following the [COBRApy reaction string syntax](
  https://cobrapy.readthedocs.io/en/latest/autoapi/cobra/core/reaction/index.html#cobra.core.reaction.Reaction.build_reaction_from_string
):

    user-curated_identifier, name | coefficient_1 metabolite_1 <-> coefficient_2 metabolite_2

Each metabolite must include its compartment as a suffix, which consists of an underscore (`_`) and a letter, e.g.:

    TRANS_H2O_ec, Oxygen Transport | 2 OXYGEN-MOLECULE_e <-> 2 OXYGEN-MOLECULE_c

-------
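
The structure of a user-curated reaction string can be illustrated with a small parser. This is a simplified sketch, not CobraMod's or COBRApy's parser: `parse_side` and `parse_reaction_line` are hypothetical helpers, and only the `<->` arrow is handled, whereas the COBRApy reaction string syntax accepts further arrow types.

```python
from typing import Dict, Tuple


def parse_side(side: str) -> Dict[str, float]:
    """Turn '2 A_c + B_c' into {'A_c': 2.0, 'B_c': 1.0}."""
    metabolites: Dict[str, float] = {}
    if not side.strip():
        return metabolites
    for term in side.split(" + "):
        parts = term.split()
        if len(parts) == 2:
            coefficient, identifier = float(parts[0]), parts[1]
        elif len(parts) == 1:
            # A missing coefficient is read as 1.
            coefficient, identifier = 1.0, parts[0]
        else:
            raise ValueError(f"Cannot parse term: {term!r}")
        metabolites[identifier] = metabolites.get(identifier, 0.0) + coefficient
    return metabolites


def parse_reaction_line(
    line: str,
) -> Tuple[str, str, Dict[str, float], Dict[str, float]]:
    """Split 'identifier, name | left <-> right' into its parts."""
    header, _, equation = line.partition("|")
    identifier, _, name = header.partition(",")
    left, arrow, right = equation.partition("<->")
    if not arrow:
        raise ValueError("This sketch only supports the '<->' arrow")
    return identifier.strip(), name.strip(), parse_side(left), parse_side(right)


print(
    parse_reaction_line(
        "C06118_ce, digalacturonate transport | 1 C06118_c <-> 1 C06118_e"
    )
)
```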

In the first example we add the KEGG reaction [R04382](
https://www.kegg.jp/dbget-bin/www_bget?rn:R04382
) to the test model. The first argument is the model to extend. The `obj`
argument uses the identifier `R04382` and the compartment `c`. The next argument is the database identifier (`KEGG`), followed by the directory where CobraMod stores and reads the metabolic pathway information. The argument `genome` is a 
KEGG-specific argument. Please read the notes below for more information about it.

CobraMod parses the metabolic pathway information and creates the corresponding
genes. In this example, CobraMod will create the gene `c0319` and add it to the
COBRApy reaction.

In [11]:
from cobramod.test import textbook_kegg
from cobramod import add_reactions
from pathlib import Path
                                                           
dir_data = Path.cwd().resolve().joinpath("data")
# Using copy
test_model = textbook_kegg.copy()
                                                           
add_reactions(
    model=test_model,
    obj="R04382, c",
    database="KEGG",
    directory=dir_data,
    genome="ecc"
)
                                                           
display(test_model.reactions.get_by_id("R04382_c"))
print(test_model.reactions.get_by_id("R04382_c").genes)

0,1
Reaction identifier,R04382_c
Name,4-(4-deoxy-alpha-D-galact-4-enuronosyl)-D-galacturonate lyase
Memory address,0x07f20d5f81850
Stoichiometry,C06118_c <=> 2.0 C04053_c  4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate; <=> 2.0 5-Dehydro-4-deoxy-D-glucuronate;
GPR,c0319
Lower bound,-1000
Upper bound,1000


frozenset({<Gene c0319 at 0x7f20d8393810>})


In this second example we add two reactions ([R04382](
  https://www.kegg.jp/entry/R04382
) and [R02736](
  https://www.kegg.jp/entry/R02736
 )) from KEGG. We pass a list with the database-specific identifiers and their
compartments to the argument `obj`. The
rest of the arguments remain the same as in the previous example. CobraMod skips
reactions that are already included in the model and shows
a warning.

In [12]:
add_reactions(
    model=test_model,
    obj=["R04382, c", "R02736, c"],
    directory=dir_data,
    database="KEGG",
    genome="ecc"
)
                                                            
display(test_model.reactions.get_by_id("R04382_c"))
test_model.reactions.get_by_id("R02736_c")



0,1
Reaction identifier,R04382_c
Name,4-(4-deoxy-alpha-D-galact-4-enuronosyl)-D-galacturonate lyase
Memory address,0x07f20d5f81850
Stoichiometry,C06118_c <=> 2.0 C04053_c  4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate; <=> 2.0 5-Dehydro-4-deoxy-D-glucuronate;
GPR,c0319
Lower bound,-1000
Upper bound,1000


0,1
Reaction identifier,R02736_c
Name,beta-D-glucose-6-phosphate:NADP+ 1-oxoreductase
Memory address,0x07f20d85b2850
Stoichiometry,"C00006_c + C01172_c --> C00005_c + C00080_c + C01236_c  Nicotinamide adenine dinucleotide phosphate + beta-D-Glucose 6-phosphate --> Nicotinamide adenine dinucleotide phosphate - reduced + H+ + 6-phospho-D-glucono-1,5-lactone"
GPR,c2265
Lower bound,0
Upper bound,1000


In this new example, we use a text file to add reactions to the test model.
The file *reactions.txt* in the data directory (`dir_data`) contains:

    R04382, c  
    R02736, c  
    C06118_ce, digalacturonate transport | 1 C06118_c <-> 1 C06118_e

CobraMod downloads the first two reactions from KEGG, while `C06118_ce` is a
user-curated reaction.

The users can pass the path of this file to the `obj` argument to add
the reactions to the test model. The next arguments are the same as in the
previous examples. The two print statements show that CobraMod adds
the reactions to the model.

In [13]:
from cobramod.test import textbook_kegg
from cobramod import add_reactions
from pathlib import Path
                                                                     
dir_data = Path.cwd().resolve().joinpath("data")
test_model = textbook_kegg.copy()
# This is the file with text
file = dir_data.joinpath("reactions.txt")

print(f'Number of reactions prior addition: {len(test_model.reactions)}')
                                                                     
add_reactions(
    model=test_model,
    obj=file,
    directory=dir_data,
    database="KEGG",
    genome="ecc"
)

print(f'Number of reactions after addition: {len(test_model.reactions)}')
# Show in jupyter
display(test_model.reactions.get_by_id("R04382_c"))
display(test_model.reactions.get_by_id("R02736_c"))
test_model.reactions.get_by_id("C06118_ce")

Number of reactions prior addition: 95
Number of reactions after addition: 98


0,1
Reaction identifier,R04382_c
Name,4-(4-deoxy-alpha-D-galact-4-enuronosyl)-D-galacturonate lyase
Memory address,0x07f20d80763d0
Stoichiometry,C06118_c <=> 2.0 C04053_c  4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate; <=> 2.0 5-Dehydro-4-deoxy-D-glucuronate;
GPR,c0319
Lower bound,-1000
Upper bound,1000


0,1
Reaction identifier,R02736_c
Name,beta-D-glucose-6-phosphate:NADP+ 1-oxoreductase
Memory address,0x07f20d862f5d0
Stoichiometry,"C00006_c + C01172_c --> C00005_c + C00080_c + C01236_c  Nicotinamide adenine dinucleotide phosphate + beta-D-Glucose 6-phosphate --> Nicotinamide adenine dinucleotide phosphate - reduced + H+ + 6-phospho-D-glucono-1,5-lactone"
GPR,c2265
Lower bound,0
Upper bound,1000


0,1
Reaction identifier,C06118_ce
Name,digalacturonate transport
Memory address,0x07f20d5f81f50
Stoichiometry,C06118_c <=> C06118_e  4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate; <=> 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate;
GPR,
Lower bound,-1000
Upper bound,1000


Since this function is an extension of the original COBRApy function
[model.add_reactions()](https://cobrapy.readthedocs.io/en/latest/autoapi/cobra/index.html?highlight=optimize#cobra.Model.add_reactions), the users can also
pass COBRApy Reactions. In this example, we use a variation of the test model
(`textbook_kegg`) which uses KEGG identifiers for its metabolites. We take a
COBRApy Reaction from the original test model and add it to the KEGG test model.

In [14]:
from cobramod.test import textbook_kegg, textbook
from cobramod import add_reactions
from pathlib import Path

# Using copy of test model
test_model = textbook_kegg.copy()
# Obtaining a reaction
reaction = textbook.reactions.get_by_id("ACALDt")
                                                                  
add_reactions(model=test_model, obj=reaction)

test_model.reactions.get_by_id("ACALDt")



0,1
Reaction identifier,ACALDt
Name,R acetaldehyde reversible - transport
Memory address,0x07f20d80a74d0
Stoichiometry,C00084_e <=> C00084_c  Acetaldehyde <=> Acetaldehyde
GPR,s0001
Lower bound,-1000.0
Upper bound,1000.0


By default, COBRApy ignores metabolites that appear on
both sides of a reaction equation. CobraMod identifies these reactions, assigns one of these metabolites to the extracellular compartment and raises a warning asking the user for manual curation. In this example, we add a
[transport reaction for acetic acid](
https://biocyc.org/META/new-image?object=TRANS-RXN-455
) from BioCyc sub-database `YEAST` to the test model.

In [15]:
test_model = textbook_kegg.copy()
                                                           
add_reactions(
    model=test_model,
    obj="TRANS-RXN-455, c",
    database="YEAST",
    directory=dir_data,
)
# Show in jupyter
test_model.reactions.get_by_id("TRANS_RXN_455_c")



0,1
Reaction identifier,TRANS_RXN_455_c
Name,acetic acid uptake
Memory address,0x07f20d5ec4690
Stoichiometry,CPD_24335_e --> CPD_24335_c  acetic+acid --> acetic+acid
GPR,G3O-32144
Lower bound,0
Upper bound,1000
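
One way to picture why such a transport reaction needs special handling: if the stoichiometry is stored per metabolite identifier, the same identifier on both sides cancels out. The sketch below uses plain dictionaries and a hypothetical helper `net_stoichiometry`; it does not use COBRApy and only illustrates the idea.

```python
from typing import Dict, List, Tuple

Side = List[Tuple[str, float]]


def net_stoichiometry(reactants: Side, products: Side) -> Dict[str, float]:
    """Sum coefficients per identifier; reactants count as negative."""
    net: Dict[str, float] = {}
    for identifier, coefficient in reactants:
        net[identifier] = net.get(identifier, 0.0) - coefficient
    for identifier, coefficient in products:
        net[identifier] = net.get(identifier, 0.0) + coefficient
    # Identifiers whose coefficients cancel disappear from the reaction.
    return {key: value for key, value in net.items() if value != 0.0}


# Same identifier on both sides: nothing is left of the reaction.
print(net_stoichiometry([("CPD_24335_c", 1.0)], [("CPD_24335_c", 1.0)]))

# Moving the reactant to the extracellular compartment keeps both metabolites.
print(net_stoichiometry([("CPD_24335_e", 1.0)], [("CPD_24335_c", 1.0)]))
```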


---

**NOTES**

- CobraMod replaces hyphens (`-`) with underscores (`_`) in the identifiers when
creating COBRApy Reactions.
- The users must use the same database identifier when adding multiple
reactions, i.e. it is not possible to use two databases in the same function call.
In that case, the users should call the function twice with the respective
database identifier.
- CobraMod tries to identify reactions or metabolites that are already present
in the model. The metabolic pathway information contains multiple
cross-references to other database entries. If such an entry is found in the model,
CobraMod uses it instead of creating a new COBRApy object.
- The argument `genome` can be used with the database `KEGG` and specifies the genome for which gene information will be retrieved. The complete list is available [here](
https://www.genome.jp/kegg/catalog/org_list.html).
If no argument is given, no gene information will be retrieved and
a warning is printed as shown below:

In [16]:
test_model = textbook_kegg.copy()
                                                           
add_reactions(
    model=test_model,
    obj="R04382, c",
    database="KEGG",
    directory=dir_data,
)
test_model.reactions.get_by_id("R04382_c")



0,1
Reaction identifier,R04382_c
Name,4-(4-deoxy-alpha-D-galact-4-enuronosyl)-D-galacturonate lyase
Memory address,0x07f20d5e52b90
Stoichiometry,C06118_c <=> 2.0 C04053_c  4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate; <=> 2.0 5-Dehydro-4-deoxy-D-glucuronate;
GPR,
Lower bound,-1000
Upper bound,1000


---

## Adding pathways
 
CobraMod can add metabolic pathways to a given model. The function
[cobramod.add_pathway()](
module/cobramod/index.html#cobramod.add_pathway) takes as arguments a sequence of database-specific reaction identifiers or a pathway identifier. In the
examples we showcase these two options. We use the *E. coli* core model from
COBRApy as test model. This core model can be found under 
`cobramod.test.textbook`.

In the first example, we add the [acetoacetate degradation pathway](
https://biocyc.org/ECOLI/new-image?object=ACETOACETATE-DEG-PWY
) from the BioCyc sub-database `ECOLI` to the test model. This pathway has two
reactions and six metabolites.

<img src="https://websvc.biocyc.org/ECOLI/diagram-only?type=PATHWAY&object=ACETOACETATE-DEG-PWY&pfontsize=normal"/>



The first argument is the model to extend. The `pathway` argument uses the
database-specific identifier `ACETOACETATE-DEG-PWY` and the `database` argument the identifier
`ECOLI`. We define the compartment as `c` (cytosol), i.e. all COBRApy Reactions
and Metabolites are placed in that compartment. With the argument `filename` the users
can specify the file where the summary of the changes is written. All COBRApy
Reactions included in the Pathway are tested for a non-zero flux. Read more
about it in the section *Non-zero flux test*.
Displaying the [cobramod.Pathway](
module/cobramod/index.html#cobramod.Pathway
) outputs a table with a summary of the object. 

In [17]:
from pathlib import Path
from cobramod import add_pathway
from cobramod.test import textbook
# Defining directory
dir_data = Path.cwd().resolve().joinpath("data")
                                   
# Using copy of test model
test_model = textbook.copy()

add_pathway(
    model=test_model,
    pathway="ACETOACETATE-DEG-PWY",
    database="ECOLI",
    compartment="c",
    filename="summary.txt",
    directory=dir_data,
)

# Display in jupyter
test_model.groups.get_by_id("ACETOACETATE-DEG-PWY")



Quantity of     new   | removed entities in
Reactions        2    |    0              
Metabolites      2    |    0              
Exchange         0    |    0              
Demand           0    |    0              
Sinks            1    |    0              
Genes            4    |    0              
Groups           1    |    0              



0,1
Pathway identifier,ACETOACETATE-DEG-PWY
Name,
Memory address,0x0139779039495504
Reactions involved,"ACETOACETYL_COA_TRANSFER_RXN_c, ACETYL_COA_ACETYLTRANSFER_RXN_c"
Genes involved,"EG11669, EG11670, EG12432, EG11672"
Visualization attributes,vertical = False color_negative = None color_positive = None color_quantile = False


Below is an example of the summary in the form of a text file. The first
part shows all the names of the reactions, metabolites, exchange reactions,
auxiliary demand and sink reactions, genes and groups included in the model.
The second part of the summary shows the additions and removals from using
the function `add_pathway()`.

In [18]:
%cat summary.txt

Summary:
Model identifier: e_coli_core
Model name:

Reactions:
['ACALD', 'ACALDt', 'ACKr', 'ACONTa', 'ACONTb', 'ACt2r', 'ADK1', 'AKGDH', 'AKGt2r', 'ALCD2x', 'ATPM', 'ATPS4r', 'Biomass_Ecoli_core', 'CO2t', 'CS', 'CYTBD', 'D_LACt2', 'ENO', 'ETOHt2r', 'FBA', 'FBP', 'FORt2', 'FORti', 'FRD7', 'FRUpts2', 'FUM', 'FUMt2_2', 'G6PDH2r', 'GAPD', 'GLCpts', 'GLNS', 'GLNabc', 'GLUDy', 'GLUN', 'GLUSy', 'GLUt2r', 'GND', 'H2Ot', 'ICDHyr', 'ICL', 'LDH_D', 'MALS', 'MALt2_2', 'MDH', 'ME1', 'ME2', 'NADH16', 'NADTRHD', 'NH4t', 'O2t', 'PDH', 'PFK', 'PFL', 'PGI', 'PGK', 'PGL', 'PGM', 'PIt2r', 'PPC', 'PPCK', 'PPS', 'PTAr', 'PYK', 'PYRt2', 'RPE', 'RPI', 'SUCCt2_2', 'SUCCt3', 'SUCDi', 'SUCOAS', 'TALA', 'THD2', 'TKT1', 'TKT2', 'TPI', 'ACETOACETYL_COA_TRANSFER_RXN_c', 'ACETYL_COA_ACETYLTRANSFER_RXN_c']
Metabolites:
['13dpg_c', '2pg_c', '3pg_c', '6pgc_c', '6pgl_c', 'ac_c', 'ac_e', 'acald_c', 'acald_e', 'accoa_c', 'acon_C_c', 'actp_c', 'adp_c', 'akg_c', 'akg_e', 'amp_c', 'atp_c', 'cit_c', 'co2_c', 'co2_e', 'c
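
The "new" and "removed" counts of such a summary can be understood as set differences between the identifiers before and after the change. The following is a minimal sketch of that idea with a hypothetical helper `summarize_changes`; it is not how CobraMod builds its summary.

```python
from typing import Dict, Iterable, List


def summarize_changes(
    before: Iterable[str], after: Iterable[str]
) -> Dict[str, List[str]]:
    """Compare two collections of identifiers and list additions and removals."""
    before_set, after_set = set(before), set(after)
    return {
        "new": sorted(after_set - before_set),
        "removed": sorted(before_set - after_set),
    }


reactions_before = ["ACALD", "ACALDt", "ACKr"]
reactions_after = reactions_before + [
    "ACETOACETYL_COA_TRANSFER_RXN_c",
    "ACETYL_COA_ACETYLTRANSFER_RXN_c",
]
print(summarize_changes(reactions_before, reactions_after))
```

With a COBRApy model, the identifier lists could be collected before and after the call, for example from `model.reactions`.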

In this new example, we define a list with database-specific identifiers and pass
it to the argument `pathway`. We use the database identifier `ECOLI` and the
compartment `c` (cytosol). Additionally, we can define the name of the pathway
with the argument `group`. The users can also merge
pathways by passing the same group name.

In [19]:
from pathlib import Path
from cobramod import add_pathway
from cobramod.test import textbook_biocyc
# Defining directory
dir_data = Path.cwd().resolve().joinpath("data")

test_model = textbook_biocyc.copy()
# Defining database-specific identifiers
sequence = ["PEPDEPHOS-RXN", "PYRUVFORMLY-RXN", "FHLMULTI-RXN"]
                                                                
print(f'Number of reaction prior addition: {len(test_model.reactions)}')
                                                                
add_pathway(
    model=test_model,
    pathway=sequence,
    directory=dir_data,
    database="ECOLI",
    compartment="c",
    group="curated_pathway"
)

print(f'Number of reactions after addition: {len(test_model.reactions)}')
# Display in jupyter
test_model.groups.get_by_id("curated_pathway")

Number of reaction prior addition: 95




Quantity of     new   | removed entities in
Reactions        3    |    0              
Metabolites      2    |    0              
Exchange         0    |    0              
Demand           0    |    0              
Sinks            1    |    0              
Genes           11    |    0              
Groups           1    |    0              

Number of reactions after addition: 99


0,1
Pathway identifier,curated_pathway
Name,
Memory address,0x0139779002437712
Reactions involved,"PEPDEPHOS_RXN_c, PYRUVFORMLY_RXN_c, FHLMULTI_RXN_c"
Genes involved,"EG10803, EG10804, EG10701, G7627, EG10477, EG10285, EG10478, EG10476, EG10475, EG10480, EG10479"
Visualization attributes,vertical = False color_negative = None color_positive = None color_quantile = False


--------------------

**NOTES**

- A pathway is a set of COBRApy Reactions. All the notes about
`add_metabolites()` and `add_reactions()` also apply to pathways, i.e. the notes on duplicate
elements, transport reactions and the argument `genome` for KEGG.

--------------------


## Curation process

During the creation of COBRApy Reactions and Metabolites, and of the CobraMod 
Pathway, the users can expect the following:

1. CobraMod tries to identify metabolites that are already included in the 
model.
2. CobraMod reads the metadata of the metabolic pathway information and tries
to find duplicates in the model.
3. If CobraMod encounters large molecules or data with missing properties,
users are warned about them.
4. CobraMod tries to find COBRApy Reactions that are already in the model 
instead of creating them.
5. CobraMod utilizes the COBRApy method [cobra.Reaction.check_mass_balance()](
https://cobrapy.readthedocs.io/en/latest/autoapi/cobra/core/reaction/index.html#cobra.core.reaction.Reaction.check_mass_balance 
) and returns warnings if imbalances are found.
6. This package respects the reaction reversibility stated in the
metabolic pathway information of the reaction. If the reversibility is missing, CobraMod raises a warning.
7. When CobraMod adds a pathway, every reaction undergoes a *non-zero flux test*. This test ensures that the added reactions can carry a
non-zero flux. If a reaction fails the test, CobraMod creates
auxiliary sink reactions and suggests manual curation steps based on these
auxiliary modifications.
8. All the information about the download, the creation of every single object,
the warnings and exceptions is written to a log file with the name
`debug.log`. This file should help users keep track of the changes to the model.
Below, we show part of the log file as an example.

In [20]:
!tail debug.log -n 20

2021-07-25 09:57:08,524 INFO Reaction "PEPDEPHOS_RXN_c" added to group "curated_pathway".
2021-07-25 09:57:08,525 INFO Reaction "PYRUVFORMLY_RXN_c" was added to model.
2021-07-25 09:57:08,525 INFO Test to carry non-zero fluxes for "PYRUVFORMLY_RXN_c" started
2021-07-25 09:57:08,526 INFO Reaction "PYRUVFORMLY_RXN_c" passed the non-zero flux test.
2021-07-25 09:57:08,527 INFO Reaction "PYRUVFORMLY_RXN_c" added to group "curated_pathway".
2021-07-25 09:57:08,528 INFO Reaction "FHLMULTI_RXN_c" was added to model.
2021-07-25 09:57:08,528 INFO Test to carry non-zero fluxes for "FHLMULTI_RXN_c" started
2021-07-25 09:57:08,529 INFO Reaction "FHLMULTI_RXN_c" passed the non-zero flux test.
2021-07-25 09:57:08,530 INFO Reaction "FHLMULTI_RXN_c" added to group "curated_pathway".
2021-07-25 09:57:08,531 INFO Pathway "curated_pathway" added to Model.
2021-07-25 09:57:08,542 INFO Data for "PEPDEPHOS-RXN" retrieved from "ECOLI".
2021-07-25 09:57:08,552 INFO Data for "PYRUVFORMLY-RXN" retrie
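
The log lines above follow a regular layout (date, time, level, message), so they can be filtered with a few lines of Python. The following is a minimal sketch that assumes this layout holds for the whole file; `group_by_level` is a hypothetical helper written for this illustration and is not part of CobraMod.

```python
import re
from collections import defaultdict

# Assumed layout, based on the lines shown above:
# <date> <time>,<milliseconds> <LEVEL> <message>
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<message>.*)$"
)


def group_by_level(lines):
    """Return a dictionary mapping each log level to its messages."""
    grouped = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:  # Lines that do not follow the layout are skipped
            grouped[match["level"]].append(match["message"])
    return dict(grouped)


sample = [
    '2021-07-25 09:57:08,528 INFO Reaction "FHLMULTI_RXN_c" was added to model.',
    '2021-07-25 09:57:08,529 INFO Reaction "FHLMULTI_RXN_c" passed the non-zero flux test.',
    '2021-07-25 09:57:08,531 INFO Pathway "curated_pathway" added to Model.',
]
messages = group_by_level(sample)
passed = [
    message
    for message in messages.get("INFO", [])
    if "passed the non-zero flux test" in message
]
print(passed)
```

To inspect the real file, the lines can be read with `Path("debug.log").read_text().splitlines()` and passed to the same function.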

## Non-zero flux test

When users call the function `add_pathway()`, CobraMod tests each COBRApy
Reaction of the `Pathway` for its capability to carry a non-zero flux.
Additionally, users can test individual COBRApy Reactions by using the
function [cobramod.test_non_zero_flux()](
module/cobramod/index.html#cobramod.test_non_zero_flux()
).

When using this function, CobraMod tests if the metabolites of the given
reaction can be turned over, i.e. the reaction can carry a non-zero flux. If
the test initially fails, auxiliary sink reactions are added to the model, and
CobraMod raises a warning and suggests manual curation steps. If the test keeps
failing, CobraMod raises a `NotInRangeError`. Otherwise, if no
message appears, the test is passed.
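
To give an intuition for why a metabolite that cannot be turned over blocks a reaction, the sketch below uses a simple structural heuristic: at steady state, a metabolite that appears in only one reaction can be neither supplied nor removed by the rest of the network. This is not the optimization-based test that CobraMod performs; the toy network and the helper `dead_end_metabolites` are hypothetical and only serve as an illustration.

```python
from collections import Counter

# Hypothetical toy network: reaction identifier -> metabolites involved
toy_reactions = {
    "R1": {"A_c", "B_c"},
    "R2": {"B_c", "C_c"},
    "R3": {"C_c", "A_c", "D_c"},
}


def dead_end_metabolites(reactions, reaction_id, ignore_list=()):
    """List metabolites of a reaction that appear in no other reaction."""
    # Count in how many reactions each metabolite participates
    counts = Counter(
        metabolite
        for metabolites in reactions.values()
        for metabolite in metabolites
    )
    return sorted(
        metabolite
        for metabolite in reactions[reaction_id]
        if counts[metabolite] == 1 and metabolite not in ignore_list
    )


print(dead_end_metabolites(toy_reactions, "R3"))
print(dead_end_metabolites(toy_reactions, "R3", ignore_list=["D_c"]))
```

In this toy network only `D_c` is reported for `R3`. Listing it in `ignore_list` excludes it from the report, in the same spirit as the `ignore_list` argument of `test_non_zero_flux()`.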

In this example, we test the glutathione synthase reaction
(`GLUTATHIONE-SYN-RXN`) for its capability to carry a non-zero flux.
Using the function `add_reactions()`, we add reactions whose metabolites
participate in the synthase reaction. In the argument `ignore_list`, users can
specify the metabolites that should not be turned over. Here, CobraMod
does not create an auxiliary sink reaction for `PROTON_p`. The original reaction
cannot carry a flux since protons cannot be transferred to the reaction.

In [21]:
from cobramod import test_non_zero_flux, add_reactions
from cobramod.test import textbook_biocyc

test_model = textbook_biocyc.copy()

add_reactions(
    model=test_model,
    # These reactions will break the model and raise errors
    obj=[
        "Redox_ADP_ATP_p, Redox_ADP_ATP_p | ADP_p <-> ATP_p",
        "TRANS_Pi_cp, Transport Phosphate_cp | Pi_c <-> Pi_p",
        "TRANS_GLUTATHIONE_cp, Transport GLUTATHIONE_cp | "
        + "GLUTATHIONE_c <-> GLUTATHIONE_p",
        "GLUTATHIONE-SYN-RXN, p",
    ],
    directory=dir_data,
    database="ECOLI",
    replacement={},
)
test_non_zero_flux(
    model=test_model,
    reaction="GLUTATHIONE_SYN_RXN_p",
    ignore_list=["PROTON_p"],
)


{'charge': -1.0, 'O': 3.0, 'P': 1.0}


NotInRangeError: Non-zero flux test for reaction "GLUTATHIONE_SYN_RXN_p" failed multiple times. Flux value results lower than solver tolerance. Please add manually reactions that can turn over the metabolites of this reaction.
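
The dictionary printed before the error has the form of a `check_mass_balance()` result: each key is an element (or `charge`) and each value is the amount that does not cancel out between substrates and products. The sketch below reproduces this bookkeeping for `ADP_p <-> ATP_p`. The formulas and charges (C10H12N5O10P2 with charge -3 for ADP, C10H12N5O13P3 with charge -4 for ATP) are assumptions and are not taken from this document. `mass_balance` is a hypothetical helper, not a COBRApy or CobraMod function.

```python
from collections import defaultdict

# Assumed formulas and charges (not taken from the document)
ADP = {"C": 10, "H": 12, "N": 5, "O": 10, "P": 2, "charge": -3}
ATP = {"C": 10, "H": 12, "N": 5, "O": 13, "P": 3, "charge": -4}


def mass_balance(stoichiometry):
    """Sum elements and charge weighted by stoichiometric coefficients.

    `stoichiometry` is a list of (composition, coefficient) pairs with
    negative coefficients for substrates and positive ones for products.
    An empty result means that the reaction is balanced.
    """
    totals = defaultdict(float)
    for composition, coefficient in stoichiometry:
        for key, amount in composition.items():
            totals[key] += coefficient * amount
    # Keep only the entries that do not cancel out
    return {key: value for key, value in totals.items() if value != 0}


# ADP_p <-> ATP_p: one ADP consumed, one ATP produced
print(mass_balance([(ADP, -1), (ATP, 1)]))
```

Under these assumptions, the result contains the same entries as the dictionary above, which suggests that the imbalance may come from the user-defined reaction `Redox_ADP_ATP_p`.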

## Converting Group back to Pathway

The users can use the function [cobra.io.write_sbml_model](
  https://cobrapy.readthedocs.io/en/latest/autoapi/cobra/io/index.html#cobra.io.write_sbml_model
) to save their models. However, if the model has `Pathway` objects, they will
be loaded as regular `cobra.core.group.Group` objects and lose their
functionalities. We developed the function [cobramod.model_convert()](
module/cobramod/core/pathway/index.html#model_convert
) to convert these groups into proper CobraMod Pathways.

In this example, we simulate the loading of a `Group` and add four reactions
to it. We only need to specify the model to convert with the argument
`model`. Finally, we call the CobraMod Pathway and observe the specific Pathway
HTML output.

In [22]:
from cobramod import model_convert
from cobramod.test import textbook_biocyc
from cobra.core.group import Group

test_model = textbook_biocyc.copy()

test_group = Group(id="curated_pathway")
for reaction in ("GLCpts", "G6PDH2r", "PGL", "GND"):
    test_group.add_members([test_model.reactions.get_by_id(reaction)])
test_model.add_groups([test_group])

# Conversion to a Pathway
model_convert(model=test_model)
test_model.groups.get_by_id("curated_pathway")

0,1
Pathway identifier,curated_pathway
Name,
Memory address,0x0139779516962256
Reactions involved,"GLCpts, G6PDH2r, PGL, GND"
Genes involved,"b1819, b1817, b2415, b1818, b1621, b1101, b2416, b2417, b1852, b0767, b2029"
Visualization attributes,vertical = False color_negative = None color_positive = None color_quantile = False


## Visualization with Escher

CobraMod uses [Escher](https://escher.readthedocs.io/en/latest/) to visualize
pathways. Each CobraMod Pathway includes a visualization method
[Pathway.visualize()](
module/cobramod/core/pathway/index.html#cobramod.core.pathway.Pathway.visualize
) which automatically generates pathway maps of the respective set of
reactions. These pathway maps can be easily customized to visualize flux
distributions using default or user-defined colors and gradients (linear or
quantile normalized).

In this example, we call the method `visualize()` without any arguments.

In [23]:
test_model.groups.get_by_id("curated_pathway").visualize()

Builder(never_ask_before_quit=True, reaction_scale={}, reaction_styles=['color', 'text'])

We can modify the orientation of our Pathway by changing the attribute 
`vertical` to `True`.

In [24]:
test_model.groups.get_by_id("curated_pathway").vertical = True
test_model.groups.get_by_id("curated_pathway").visualize()

Builder(never_ask_before_quit=True, reaction_scale={}, reaction_styles=['color', 'text'])

The visualization method can also be called with the argument `solution_fluxes`.
This argument takes either a dictionary with the fluxes of the reactions
or a [COBRApy Solution](
https://cobrapy.readthedocs.io/en/latest/autoapi/cobra/core/solution/index.html#cobra.core.solution.Solution
). With either of them, CobraMod colors the reactions according to their flux
values. In this new example, we create a dictionary with fluxes and pass it
to the visualization method.

In [25]:
# Fluxes to visualize for the reactions of the group
solution = {
    "GLCpts": -2, "G6PDH2r": -2, "PGL": 0.4, "GND": 1
}
# Passing the fluxes to the visualization method
test_model.groups.get_by_id("curated_pathway").visualize(
    solution_fluxes=solution
)

Builder(never_ask_before_quit=True, reaction_data={'GLCpts': -2, 'G6PDH2r': -2, 'PGL': 0.4, 'GND': 1}, reactio…

We can change the colors of the fluxes by changing the attributes
`color_negative` and `color_positive`. In this example, we use red for
negative fluxes and green for positive fluxes.

In [26]:
# Modifying attributes
test_model.groups.get_by_id("curated_pathway").color_negative = "red"
test_model.groups.get_by_id("curated_pathway").color_positive = "green"
test_model.groups.get_by_id("curated_pathway").visualize(
    solution_fluxes=solution
)

Builder(never_ask_before_quit=True, reaction_data={'GLCpts': -2, 'G6PDH2r': -2, 'PGL': 0.4, 'GND': 1}, reactio…

In this example we change the color of the fluxes to orange for negative values
and light blue for positive values. Available colors are found [here](
https://www.w3schools.com/cssref/css_colors.asp
)

In [27]:
# New flux with high value
solution =  {
    "GLCpts": -2, "G6PDH2r": -2, "PGL": 0.4, "GND": 1, "Other": 1000
}

test_model.groups.get_by_id("curated_pathway").color_negative = "orange"
test_model.groups.get_by_id("curated_pathway").color_positive = "lightskyblue"
test_model.groups.get_by_id("curated_pathway").visualize(
    solution_fluxes=solution
)

Builder(never_ask_before_quit=True, reaction_data={'GLCpts': -2, 'G6PDH2r': -2, 'PGL': 0.4, 'GND': 1, 'Other':…

By default, the colors of the fluxes are linearly normalized. The users can
normalize the colors to the quantiles of the fluxes by changing the attribute
`color_quantile` to `True`. This is useful when the flux values vary by
multiple orders of magnitude.

In the previous example, we added a new entry to the dictionary with a flux
of 1000. We can see that the positive colors are very dim. In the next example,
we set the attribute `color_quantile` to `True` and the colors become very
bright.

In [28]:
test_model.groups.get_by_id("curated_pathway").color_quantile = True
test_model.groups.get_by_id("curated_pathway").visualize(
    solution_fluxes=solution
)

Builder(never_ask_before_quit=True, reaction_data={'GLCpts': -2, 'G6PDH2r': -2, 'PGL': 0.4, 'GND': 1, 'Other':…
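
The difference between the two normalizations can be sketched with plain Python. This is a simplified illustration and not the implementation used by CobraMod or Escher: it works on absolute flux values, ignores the separate color scales for negative and positive fluxes, and the helper names `linear_scale` and `quantile_scale` are hypothetical.

```python
def linear_scale(fluxes):
    """Scale absolute flux values linearly to the range [0, 1]."""
    largest = max(abs(value) for value in fluxes.values())
    if largest == 0:
        return {key: 0.0 for key in fluxes}
    return {key: abs(value) / largest for key, value in fluxes.items()}


def quantile_scale(fluxes):
    """Scale absolute flux values by their rank among the distinct values."""
    ordered = sorted(set(abs(value) for value in fluxes.values()))
    rank = {value: index + 1 for index, value in enumerate(ordered)}
    return {
        key: rank[abs(value)] / len(ordered) for key, value in fluxes.items()
    }


solution = {"GLCpts": -2, "G6PDH2r": -2, "PGL": 0.4, "GND": 1, "Other": 1000}
print(linear_scale(solution))
print(quantile_scale(solution))
```

With the linear scale, the single flux of 1000 pushes every other intensity below 0.01, which corresponds to the dim colors seen before. With the rank-based scale, the intensities are spread evenly between 0.25 and 1.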

The users can call the `Pathway` to have a quick summary of the current 
attributes.

In [29]:
test_model.groups.get_by_id("curated_pathway")

0,1
Pathway identifier,curated_pathway
Name,
Memory address,0x0139779516962256
Reactions involved,"GLCpts, G6PDH2r, PGL, GND"
Genes involved,"b1819, b1817, b2415, b1818, b1621, b1101, b2416, b2417, b1852, b0767, b2029"
Visualization attributes,vertical = True color_negative = orange color_positive = lightskyblue color_quantile = True


The visualization of a CobraMod Pathway is saved as an HTML file with the
default name `pathway.html`.
This is useful for users who do not use interactive platforms like Jupyter.
The users can specify the file name by using the argument `filename`. In this
example, we name the HTML file `curated_pathway.html`.


In [30]:
test_model.groups.get_by_id("curated_pathway").visualize(
    solution_fluxes=solution, filename="curated_pathway.html"
)

Builder(never_ask_before_quit=True, reaction_data={'GLCpts': -2, 'G6PDH2r': -2, 'PGL': 0.4, 'GND': 1, 'Other':…

We can verify that the file exists by using the `ls` command.


In [31]:
!ls curated_pathway.html

curated_pathway.html
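
On systems without the `ls` command, the same check can be done from Python. This is a minimal sketch using only the standard library; `is_non_empty_file` is a hypothetical helper and not part of CobraMod.

```python
from pathlib import Path


def is_non_empty_file(path):
    """Return True if `path` is an existing file with a non-zero size."""
    file = Path(path)
    return file.is_file() and file.stat().st_size > 0


print(is_non_empty_file("curated_pathway.html"))
```

The size check additionally guards against an empty file left behind by an interrupted write.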
