# Getting started with cameo 

**cameo** reuses and extends model data structures defined by [cobrapy](https://opencobra.github.io/cobrapy/) (**CO**nstraints-**B**ased **R**econstruction and **A**nalysis tool for **Py**thon). So, in addition to following this quick start guide and other **cameo** tutorials, we encourage you to explore cobrapy's [documentation](https://cobrapy.readthedocs.org/en/latest/cobra.core.html) as well.

## Step 1: Load a model

Loading a model is easy. Just import the `load_model` function (defined in `cameo.io`).

In [1]:
from cameo import load_model

For example, load iJO1366, a genome-scale metabolic reconstruction of _Escherichia coli_.

In [2]:
model = load_model("iJO1366")

Models, reactions, metabolites, etc., return HTML when evaluated in Jupyter notebooks and can thus be easily inspected.

In [3]:
model

| | |
|---|---|
| Name | iJO1366 |
| Memory address | 0x0115c8ceb8 |
| Number of metabolites | 1805 |
| Number of reactions | 2583 |
| Objective expression | `-1.0*BIOMASS_Ec_iJO1366_core_53p95M_reverse_5c8b1 + 1.0*BIOMASS_Ec_iJO1366_core_53p95M` |
| Compartments | periplasm, cytosol, extracellular space |


## Step 2: Simulate a model

The model can be simulated by calling its `optimize` method.

In [4]:
solution = model.optimize()

A quick overview of the solution can be obtained in the form of a pandas [DataFrame](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html) (all solution objects in cameo provide access to data frames through a `data_frame` attribute).

In [5]:
solution

| | fluxes | reduced_costs |
|---|---|---|
| DM_4crsol_c | 2.1907e-04 | 0.0000 |
| DM_5drib_c | 2.2103e-04 | 0.0000 |
| DM_aacald_c | -0.0000e+00 | 0.0000 |
| DM_amob_c | 1.9647e-06 | 0.0000 |
| DM_mththf_c | 4.4010e-04 | 0.0000 |
| ... | ... | ... |
| ZN2abcpp | 0.0000e+00 | -0.0083 |
| ZN2t3pp | 0.0000e+00 | -0.0021 |
| ZN2tpp | 3.3499e-04 | 0.0000 |
| ZNabcpp | 0.0000e+00 | -0.0083 |


A data frame representation of the solution is accessible via `solution.to_frame()`.

In [6]:
solution.to_frame()

| | fluxes | reduced_costs |
|---|---|---|
| DM_4crsol_c | 2.1907e-04 | 0.0000 |
| DM_5drib_c | 2.2103e-04 | 0.0000 |
| DM_aacald_c | -0.0000e+00 | 0.0000 |
| DM_amob_c | 1.9647e-06 | 0.0000 |
| DM_mththf_c | 4.4010e-04 | 0.0000 |
| ... | ... | ... |
| ZN2abcpp | 0.0000e+00 | -0.0083 |
| ZN2t3pp | 0.0000e+00 | -0.0021 |
| ZN2tpp | 3.3499e-04 | 0.0000 |
| ZNabcpp | 0.0000e+00 | -0.0083 |


Data frames make it very easy to process results. For example, let's take a look at the reactions with non-zero flux.

In [7]:
solution.to_frame().query('fluxes != 0')

| | fluxes | reduced_costs |
|---|---|---|
| DM_4crsol_c | 2.1907e-04 | 0.0000e+00 |
| DM_5drib_c | 2.2103e-04 | 0.0000e+00 |
| DM_amob_c | 1.9647e-06 | 0.0000e+00 |
| DM_mththf_c | 4.4010e-04 | 0.0000e+00 |
| BIOMASS_Ec_iJO1366_core_53p95M | 9.8237e-01 | 1.8492e-15 |
| ... | ... | ... |
| UPPDC1 | 2.1907e-04 | 0.0000e+00 |
| USHD | 1.9113e-02 | 0.0000e+00 |
| VALTA | -4.1570e-01 | 0.0000e+00 |
| ZN2tpp | 3.3499e-04 | 0.0000e+00 |
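
Exact comparison with zero can be brittle, because numerical solvers may return tiny values that are zero for all practical purposes. The following stand-alone sketch rebuilds a small, hand-copied subset of the tables above (it is not a live cameo solution) and compares the exact filter with a tolerance-based one. The `tolerance` value of `1e-9` is an arbitrary assumption, not a cameo default.

```python
import pandas as pd

# Hand-copied subset of the flux tables shown above (not a live solution).
frame = pd.DataFrame(
    {
        "fluxes": [2.1907e-04, -0.0, 9.8237e-01, -4.1570e-01, 0.0],
        "reduced_costs": [0.0, 0.0, 1.8492e-15, 0.0, -0.0083],
    },
    index=[
        "DM_4crsol_c",
        "DM_aacald_c",
        "BIOMASS_Ec_iJO1366_core_53p95M",
        "VALTA",
        "ZN2abcpp",
    ],
)

# Same exact filter as in the cell above.
nonzero = frame.query("fluxes != 0")

# Tolerance-based variant: treat very small magnitudes as zero.
tolerance = 1e-9  # assumed threshold, adjust to your solver's precision
active = frame[frame["fluxes"].abs() > tolerance]

# Order the active reactions by absolute flux, largest first.
order = active["fluxes"].abs().sort_values(ascending=False).index
ranked = active.loc[order]
print(ranked)
```

With a real solution, the same expressions should work unchanged on `solution.to_frame()` in place of the hand-built `frame`.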


## Step 3: Explore a model

Objects—models, reactions, metabolites, genes—can easily be explored in the Jupyter notebook, taking advantage of tab completion. For example, place your cursor after the period in `model.reactions.` and press the TAB key. A dialog will appear that allows you to navigate the list of reactions encoded in the model. 

In [8]:
model.reactions.PGK # delete PGK, place your cursor after the period and press the TAB key.

| | |
|---|---|
| Reaction identifier | PGK |
| Name | Phosphoglycerate kinase |
| Stoichiometry | `3pg_c + atp_c <=> 13dpg_c + adp_c`<br>`3-Phospho-D-glycerate + ATP <=> 3-Phospho-D-glyceroyl phosphate + ADP` |
| GPR | b2926 |
| Lower bound | -1000.0 |
| Upper bound | 1000.0 |


For example, you can access the E4PD (_Erythrose 4-phosphate dehydrogenase_) reaction in the model.

In [9]:
model.reactions.E4PD

| | |
|---|---|
| Reaction identifier | E4PD |
| Name | Erythrose 4-phosphate dehydrogenase |
| Stoichiometry | `e4p_c + h2o_c + nad_c <=> 4per_c + 2.0 h_c + nadh_c`<br>`D-Erythrose 4-phosphate + H2O + Nicotinamide adenine dinucleotide <=> 4-Phospho-D-erythronate + 2.0 H+ + Nicotinamide adenine dinucleotide - reduced` |
| GPR | b2927 or b1779 |
| Lower bound | -1000.0 |
| Upper bound | 1000.0 |


Be aware, though, that because of Python's variable naming restrictions, dot-notation access to reactions (and other objects) does not work for every identifier, for example those that start with a digit.

In [10]:
# model.reactions.12DGR120tipp  # uncommenting and running this cell will produce a syntax error

In these cases, use `model.reactions.get_by_id` instead.

In [11]:
model.reactions.get_by_id('12DGR120tipp')

| | |
|---|---|
| Reaction identifier | 12DGR120tipp |
| Name | 1,2 diacylglycerol transport via flipping (periplasm to cytoplasm, n-C12:0) |
| Stoichiometry | `12dgr120_p --> 12dgr120_c`<br>`1,2-Diacyl-sn-glycerol (didodecanoyl, n-C12:0) --> 1,2-Diacyl-sn-glycerol (didodecanoyl, n-C12:0)` |
| GPR | |
| Lower bound | 0.0 |
| Upper bound | 1000.0 |
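
Whether dot notation will work can be checked up front: a name can follow a dot only if it is a valid Python identifier and not a reserved keyword. Here is a minimal sketch using only the standard library. `dot_accessible` is a hypothetical helper written for this guide, not part of cameo or cobrapy.

```python
import keyword


def dot_accessible(identifier):
    """Return True if `identifier` can be written after a dot in Python source."""
    return identifier.isidentifier() and not keyword.iskeyword(identifier)


# Identifiers taken from the examples above.
for reaction_id in ["PGK", "E4PD", "12DGR120tipp", "glc__D_c"]:
    print(reaction_id, dot_accessible(reaction_id))
```

Identifiers for which the check fails, such as `12DGR120tipp`, need `get_by_id`.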


Metabolites are accessible through `model.metabolites`. For example, the following cell retrieves D-glucose in the cytosolic compartment.

In [12]:
model.metabolites.glc__D_c

| | |
|---|---|
| Metabolite identifier | glc__D_c |
| Name | D-Glucose |
| Formula | C6H12O6 |


It is also easy to find the reactions a metabolite participates in.

In [13]:
model.metabolites.glc__D_c.reactions

frozenset({<Reaction MLTG1 at 0x1163779b0>,
           <Reaction MLTG2 at 0x1163779e8>,
           <Reaction TRE6PH at 0x116557ac8>,
           <Reaction G6PP at 0x116206b00>,
           <Reaction MLTG3 at 0x116377ba8>,
           <Reaction GLCabcpp at 0x11623f3c8>,
           <Reaction MLTG4 at 0x116377c18>,
           <Reaction AMALT2 at 0x11605b438>,
           <Reaction GLCt2pp at 0x11623f470>,
           <Reaction MLTG5 at 0x116377c88>,
           <Reaction TREH at 0x116557d30>,
           <Reaction AMALT1 at 0x11605b588>,
           <Reaction AMALT3 at 0x11605b6a0>,
           <Reaction GLCATr at 0x1162336a0>,
           <Reaction AMALT4 at 0x11605b710>,
           <Reaction XYLI2 at 0x1165a1f28>,
           <Reaction HEX1 at 0x1162a7748>,
           <Reaction LACZ at 0x1162f0f98>,
           <Reaction GALS3 at 0x1162157f0>})

A list of the genes encoded in the model can be accessed via `model.genes`.

In [14]:
model.genes[0:10]

[<Gene b2215 at 0x10c6f3780>,
 <Gene b1377 at 0x10950b4e0>,
 <Gene b0241 at 0x109351be0>,
 <Gene b0929 at 0x109351048>,
 <Gene b4035 at 0x109351d68>,
 <Gene b4033 at 0x109344b38>,
 <Gene b4034 at 0x115e60518>,
 <Gene b4032 at 0x115e60550>,
 <Gene b4036 at 0x115e60588>,
 <Gene b4213 at 0x115e605c0>]

A few additional attributes have been added that are not available in a [cobrapy](https://opencobra.github.io/cobrapy/) model. For example, exchange reactions that allow certain metabolites to enter or leave the model can be accessed through `model.exchanges`.

In [15]:
model.exchanges[0:10]

[<Reaction DM_4crsol_c at 0x115f44390>,
 <Reaction DM_5drib_c at 0x115f443c8>,
 <Reaction DM_aacald_c at 0x115f44400>,
 <Reaction DM_amob_c at 0x115f44438>,
 <Reaction DM_mththf_c at 0x115f44470>,
 <Reaction DM_oxam_c at 0x115f444a8>,
 <Reaction EX_12ppd__R_e at 0x115f44550>,
 <Reaction EX_12ppd__S_e at 0x115f44588>,
 <Reaction EX_14glucan_e at 0x115f445c0>,
 <Reaction EX_15dap_e at 0x115f445f8>]
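
Note that the list above mixes two kinds of boundary reactions. Assuming the BiGG naming convention, in which the `DM_` prefix marks demand reactions and the `EX_` prefix marks exchange reactions, the identifiers can be separated by prefix. The sketch below works on identifier strings copied from the output above rather than on live reaction objects, and the variable names are illustrative only.

```python
# Identifiers copied from the output of model.exchanges[0:10] above.
boundary_ids = [
    "DM_4crsol_c",
    "DM_5drib_c",
    "DM_aacald_c",
    "DM_amob_c",
    "DM_mththf_c",
    "DM_oxam_c",
    "EX_12ppd__R_e",
    "EX_12ppd__S_e",
    "EX_14glucan_e",
    "EX_15dap_e",
]

# Split by prefix (assumed BiGG convention: DM_ = demand, EX_ = exchange).
demand_ids = [rid for rid in boundary_ids if rid.startswith("DM_")]
exchange_ids = [rid for rid in boundary_ids if rid.startswith("EX_")]

# By the same convention, the remainder of an exchange identifier
# names the exchanged metabolite.
exchanged_metabolites = [rid[len("EX_"):] for rid in exchange_ids]

print(demand_ids)
print(exchanged_metabolites)
```

With a loaded model, the same filter could be written against the reaction objects, for example `[r for r in model.exchanges if r.id.startswith("EX_")]`.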

Or, the current medium can be accessed through `model.medium`.

In [16]:
model.medium

| | bound |
|---|---|
| EX_ca2_e | 1000.00 |
| EX_cbl1_e | 0.01 |
| EX_cl_e | 1000.00 |
| EX_co2_e | 1000.00 |
| EX_cobalt2_e | 1000.00 |
| ... | ... |
| EX_sel_e | 1000.00 |
| EX_slnt_e | 1000.00 |
| EX_so4_e | 1000.00 |
| EX_tungs_e | 1000.00 |


It is also possible to get a list of essential reactions ...

In [17]:
from cameo.flux_analysis.analysis import find_essential_reactions
find_essential_reactions(model)[0:10]

[<Reaction DM_4crsol_c at 0x115f44390>,
 <Reaction DM_5drib_c at 0x115f443c8>,
 <Reaction DM_amob_c at 0x115f44438>,
 <Reaction DM_mththf_c at 0x115f44470>,
 <Reaction BIOMASS_Ec_iJO1366_core_53p95M at 0x115f44518>,
 <Reaction EX_ca2_e at 0x115f5c3c8>,
 <Reaction EX_cl_e at 0x115f5c588>,
 <Reaction EX_cobalt2_e at 0x115f5c668>,
 <Reaction EX_cu2_e at 0x115f5c860>,
 <Reaction EX_glc__D_e at 0x115f697b8>]

... and essential genes.

In [18]:
from cameo.flux_analysis.analysis import find_essential_genes
find_essential_genes(model)[0:10]

[<Gene b4245 at 0x115e90048>,
 <Gene b0109 at 0x115f08080>,
 <Gene b2838 at 0x115ea80f0>,
 <Gene b0423 at 0x115f380f0>,
 <Gene b2574 at 0x115e90128>,
 <Gene b3809 at 0x115ea8128>,
 <Gene b4407 at 0x115f38128>,
 <Gene b0175 at 0x115ea8160>,
 <Gene b3992 at 0x115f38160>,
 <Gene b0928 at 0x115e90198>]
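
Gene and reaction essentiality are linked through the GPR rules shown in the reaction summaries above, for example `b2926` for PGK and `b2927 or b1779` for E4PD. The following toy sketch illustrates how a gene knockout propagates through such a rule. `reaction_active` is a hypothetical helper written for illustration; it makes no claim about how cameo implements `find_essential_genes`.

```python
import re


def reaction_active(gpr, knocked_out):
    """Evaluate a GPR rule given a set of knocked-out gene identifiers."""
    if not gpr.strip():
        return True  # no gene association: unaffected by gene knockouts
    # Replace every gene identifier with True/False, leaving 'and'/'or' alone.
    expression = re.sub(
        r"\b(?!(?:and|or)\b)\w+\b",
        lambda match: str(match.group(0) not in knocked_out),
        gpr,
    )
    # The expression now contains only True, False, and, or, and parentheses.
    return eval(expression, {"__builtins__": {}}, {})


# GPR rules copied from the reaction summaries above.
print(reaction_active("b2926", {"b2926"}))                     # PGK, single gene
print(reaction_active("b2927 or b1779", {"b2927"}))            # E4PD, one isozyme left
print(reaction_active("b2927 or b1779", {"b2927", "b1779"}))   # E4PD, both removed
print(reaction_active("", {"b2926"}))                          # 12DGR120tipp, no GPR
```

A knockout disables a reaction only if its rule evaluates to false. Whether the gene is then essential depends on whether the model can still grow without the disabled reactions, which has to be determined by simulation.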