# Introduction
Optimizing the model always produces a lot of pyruvate, more than should be possible.
In this notebook I investigate the pyruvate node to find out where this discrepancy comes from.

I have also seen before that a cycle forms around F1P, which should be fixed. I will try to do that here as well.

In [3]:
import cameo
import pandas as pd
import cobra.io
import escher
from escher import Builder

In [4]:
model = cobra.io.read_sbml_model('../model/g-thermo.xml')

In [16]:
model_e_coli = cameo.load_model('iML1515')

In [17]:
model_b_sub = cameo.load_model('iYO844')

__Duplicate metabolite alac__S_c__
By chance, I spotted the duplicate metabolites alac__S_b_c and alac__S_c. These should be merged and the faulty one then removed.

In [None]:
# found two duplicate reactions; remove_reactions expects a list
model.remove_reactions([model.reactions.PYRACT])

In [None]:
model.remove_reactions([model.reactions.PYRACTT])

In [None]:
# APLh is the same as ALACPH, so ALACPH should be removed
model.remove_reactions([model.reactions.ALACPH])

In [None]:
model.reactions.DMORh.id = 'KARA1'

In [None]:
model.reactions.DHMBISO.add_metabolites({model.metabolites.alac__S_c:-1, model.metabolites.alac__S_b_c:1})

In [None]:
model.remove_metabolites(model.metabolites.alac__S_b_c)

In [None]:
#save&commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

__PFL__
In our model PFL is still reversible, which it should not be.

In [None]:
model.reactions.PFL.bounds=(-1000,0)

__SERD_L__ should also be irreversible.

In [None]:
model.reactions.SERD_L.bounds = (0,1000)

In [None]:
#save&commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

## Pyruvate 
Here I will check which reactions associated with pyruvate are reversible. I will go through the list by hand and modify them so that they reflect the correct thermodynamic boundaries.

In [None]:
rev_id = []
rev_reaction = []
for rct in model.metabolites.pyr_c.reactions:
    if rct.upper_bound > 0 and rct.lower_bound < 0:
        rev_id.append(rct.id)
        rev_reaction.append(rct.reaction)
    else: 
        continue

In [None]:
rev = pd.DataFrame({'ID':rev_id, 'Reaction':rev_reaction})
rev
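The same reversibility filter is reused further down for CO2 and O2, so it could be factored into a small helper. The following is a minimal sketch and not part of the original notebook: `reversible_reactions` is a hypothetical name, and the `Rxn` tuple is only a stand-in for cobra `Reaction` objects (it mimics the `id`, `reaction`, `lower_bound` and `upper_bound` attributes used in the loop) so that the snippet runs without the model file. The toy reactions and their bounds are made up for illustration.

```python
from collections import namedtuple

# Stand-in for a cobra Reaction; only the attributes the filter needs.
Rxn = namedtuple("Rxn", ["id", "reaction", "lower_bound", "upper_bound"])


def reversible_reactions(reactions):
    """Return (id, reaction string) pairs for reactions that can run both ways."""
    return [
        (rct.id, rct.reaction)
        for rct in reactions
        if rct.lower_bound < 0 < rct.upper_bound
    ]


# Hypothetical toy reactions, not taken from the model.
toy_reactions = [
    Rxn("R_REV", "a_c <=> b_c", -1000, 1000),
    Rxn("R_FWD", "a_c --> c_c", 0, 1000),
    Rxn("R_BWD", "d_c <-- a_c", -1000, 0),
]

rev_pairs = reversible_reactions(toy_reactions)
print(rev_pairs)
```

With the model loaded, `reversible_reactions(model.metabolites.pyr_c.reactions)` should select the same reactions as the loop, and the pairs can be passed to `pd.DataFrame(rev_pairs, columns=['ID', 'Reaction'])`.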

In [None]:
# This reaction produces CO2. As Geobacillus doesn't assimilate CO2, I expect it to be irreversible.
model.reactions.PYRLLOR.bounds = (0,1000)

In [None]:
#this reaction should also be irreversible, as CO2 is produced.
model.reactions.ME1.bounds = (-1000,0)

The __DAPOP__ reaction is pretty much the same as PYK, but with dATP instead of ATP. The PYK enzyme is known to be promiscuous and can convert dNTPs as well, but it strongly favors ATP/ADP: conversion is around 5x slower with dATP (see doi:10.1016/j.enzmictec.2008.06.004). For that reason this reaction will be removed, to prevent flux from passing through this node instead of through the main PYK activity, where it is expected.

In [None]:
model.remove_reactions([model.reactions.DAPOP])

In [None]:
# CYSDS should be irreversible, as our strain doesn't assimilate hydrogen
model.reactions.CYSDS.bounds = (0,1000)

In [None]:
model.reactions.APLh.id = 'APL'

In [None]:
model.metabolites.thmpp_c.name = 'Thiamine diphosphate'

In [None]:
model.metabolites.get_by_id('2ahethmpp_c').name = '2-Hydroxyethyl-ThPP'

In [None]:
#in this reaction CO2 is formed, so it is to be expected that this is irreversible
model.reactions.PDHam1hi.bounds = (0,1000)

In [None]:
#save& commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

## Reversibility of CO2 and O2 reactions
The previous work got me thinking that reactions producing CO2 and consuming O2 should be irreversible too. So here I will look into that as well.

In [12]:
model = cobra.io.read_sbml_model('../model/g-thermo.xml')

In [13]:
rev_co2_id = []
rev_co2_reaction = []
for rct in model.metabolites.co2_c.reactions:
    if rct.upper_bound > 0 and rct.lower_bound < 0:
        rev_co2_id.append(rct.id)
        rev_co2_reaction.append(rct.reaction)
    else: 
        continue

In [14]:
rev_co2 = pd.DataFrame({'ID':rev_co2_id, 'Reaction':rev_co2_reaction})
rev_co2

| | ID | Reaction |
|---|---|---|
| 0 | HDEACPT | hdeacp_c + malACP_c <=> ACP_c + co2_c + oxstacp_c |
| 1 | AKGDEHY | akg_c + h_c + thmpp_c <=> 3c1ht_c + co2_c |
| 2 | OAACOLY | h_c + osuc_c <=> akg_c + co2_c |
| 3 | MMSAD3 | coa_c + msa_c + nad_c <=> accoa_c + co2_c + na... |
| 4 | KAS14 | acACP_c + malACP_c <=> ACP_c + actACP_c + co2_c |
| 5 | MMACPAT | malACP_c + myracp_c <=> 3opalmACP_c + ACP_c + ... |
| 6 | ACEDIA | alac__S_c <=> co2_c + diact_c + h_c + hacc_c |
| 7 | PPNDH | co2_c + h2o_c + phpyr_c <=> h_c + pphn_c |
| 8 | OXGDC | akg_c + h_c <=> co2_c + sucsal_c |
| 9 | MALACPAT | 3omrsACP_c + malACP_c <=> ACP_c + co2_c + tdde... |


In [15]:
model.reactions.PPNDH.bounds = (-1000,0)

3OAR40_1 seems to be a wrong reaction. If you look up its KEGG ID, the information attached to it describes a different reaction. So I will fix the KEGG annotations here and change the name to prevent further confusion.
Again, as CO2 is produced here, I will restrict this reaction to make it irreversible.

In [108]:
model.reactions.get_by_id('3OAR40_1').notes ={}

KeyError: '3OAR40_1'

In [17]:
model.reactions.get_by_id('3OAR40_1').name = 'R10707'

In [18]:
model.reactions.get_by_id('3OAR40_1').id = 'AMACT'

In [19]:
model.reactions.AMACT.notes['ENZYME'] = 'E.C.2.3.1.180'
model.reactions.AMACT.notes['KEGG'] = 'R10707'
model.reactions.AMACT.notes['NAME'] = '3-ketoacyl-acyl carrier protein synthase III'
model.reactions.AMACT.notes['DEFINITION'] = 'Acetyl-CoA + Malonyl-[acyl-carrier protein] <=> Acetoacetyl-[acp] + CoA + CO2'

In [20]:
model.reactions.AMACT.bounds = (0,1000)

In [21]:
#KAS14
model.reactions.KAS14.bounds = (0,1000)

In [22]:
model.reactions.MMSAD3.bounds = (0,1000)

In [23]:
model.metabolites.get_by_id('3opalmACP_c').id = '3oxhdacp'

In [24]:
# Again we produce CO2 here, so this should be irreversible. It also seems to be a condensation, so it would make sense to expel CO2.
model.reactions.MMACPAT.bounds = (0,1000)

In [25]:
model.metabolites.sucsal_c.name = 'Succinate semialdehyde'

In [26]:
#likely irreversible
model.reactions.OXGDC.bounds = (0,1000)

In [27]:
model.reactions.PPND.bounds = (0,1000)

In [28]:
model.reactions.MALACPAT.bounds = (0,1000)

In [29]:
model.reactions.OOR3r.bounds = (0,1000)

In [30]:
model.reactions.DECACPAT.bounds = (0,1000)

In [31]:
model.reactions.OCTACPAT.bounds = (0,1000)

In [32]:
model.reactions.MMSAD1.bounds = (0,1000)

In [33]:
model.reactions.HEXACPAT.bounds = (0,1000)

In [34]:
model.reactions.CPPPGO2.bounds = (-1000,0)

In [35]:
model.reactions.IGPS.bounds = (0,1000)

In [36]:
model.reactions.UPPDC1.bounds = (-1000,0)

In [37]:
model.reactions.BUTACPAT.bounds = (0,1000)

In [38]:
model.reactions.MAHMPDC.bounds = (0,1000)

In [39]:
#based on thermodynamics
model.reactions.OAACOLY.bounds = (0,1000)

In [40]:
#based on thermodynamics
model.reactions.AKGDEHY.bounds = (0,1000)

In [41]:
model.reactions.HDEACPT.bounds = (0,1000)
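On a plain Python object, a misspelled attribute name (for example `bountds` instead of `bounds`) raises no error; it silently creates a new attribute and leaves the real bounds untouched. One way to reduce that risk is to write the attribute name once and drive the changes from a dictionary. The snippet below is a minimal sketch under stated assumptions: `apply_bounds` is a hypothetical helper, and `ToyReaction` is a stand-in for cobra reactions so the example runs on its own.

```python
class ToyReaction:
    """Stand-in for a cobra Reaction with only an id and bounds."""

    def __init__(self, rid, bounds=(-1000, 1000)):
        self.id = rid
        self.bounds = bounds


def apply_bounds(reactions_by_id, new_bounds):
    """Set bounds from a {reaction id: (lower, upper)} dictionary."""
    for rid, bounds in new_bounds.items():
        rct = reactions_by_id[rid]  # raises KeyError for an unknown id
        rct.bounds = bounds         # the attribute name is written only here


# Toy lookup keyed by reaction id; all start out reversible.
toy_model = {rid: ToyReaction(rid) for rid in ("DECACPAT", "OCTACPAT", "CPPPGO2")}

apply_bounds(
    toy_model,
    {"DECACPAT": (0, 1000), "OCTACPAT": (0, 1000), "CPPPGO2": (-1000, 0)},
)

applied = {rid: rct.bounds for rid, rct in toy_model.items()}
print(applied)
```

With the real model, the lookup inside the helper would be `model.reactions.get_by_id(rid)`, which also raises a `KeyError` when an id is mistyped.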

In [42]:
#save& commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

Error encountered trying to <Setting notes on sbase: <Reaction R_AMACT "R10707">>.
LibSBML error code -3: The requested action could not be performed. This can occur in a variety of contexts, such as passing a null object as a parameter in a situation where it does not make sense to permit a null object.


### Reactions involving oxygen
Similar to the motivation about CO2, here I will investigate which reactions use oxygen and which can produce oxygen, as the latter is unrealistic to expect in many cases.

In [45]:
rev_o2_id = []
rev_o2_reaction = []
for rct in model.metabolites.o2_c.reactions:
    if rct.upper_bound > 0 and rct.lower_bound < 0:
        rev_o2_id.append(rct.id)
        rev_o2_reaction.append(rct.reaction)
    else: 
        continue

In [46]:
rev_o2 = pd.DataFrame({'ID':rev_o2_id, 'Reaction':rev_o2_reaction})
rev_o2

| | ID | Reaction |
|---|---|---|
| 0 | GLYO1 | gly_c + h2o_c + o2_c <=> glx_c + h2o2_c + nh4_c |
| 1 | DHORDfum | dhor__S_c + o2_c <=> h2o2_c + orot_c |
| 2 | MOX | mal__L_c + o2_c <=> h2o2_c + oaa_c |
| 3 | NODOy | nadph_c + 2.0 no_c + 2.0 o2_c <=> h_c + nadp_c... |
| 4 | PPPGO_1 | 3.0 o2_c + pppg9_c <=> 3.0 h2o2_c + ppp9_c |


In [54]:
model.metabolites.o2_c.summary()

| RXN_STAT | ID | PERCENT | FLUX | REACTION_STRING |
|---|---|---|---|---|
| PRODUCING | ASPO6 | 81.939985 | 33.769262 | h2o2_c + h_c + iasp_c --> asp__L_c + o2_c |
| PRODUCING | GLYO1 | 15.714488 | 6.476284 | gly_c + h2o_c + o2_c <=> glx_c + h2o2_c + nh4_c |
| PRODUCING | DHORDfum | 1.172764 | 0.483322 | dhor__S_c + o2_c <=> h2o2_c + orot_c |
| PRODUCING | NODOx | 1.172764 | 0.483322 | h_c + nad_c + 2.0 no3_c --> nadh_c + 2.0 no_c ... |
| CONSUMING | ASPO1 | 98.827236 | 40.728868 | asp__L_c + h2o_c + o2_c --> h2o2_c + nh4_c + o... |


In [59]:
# should be irreversible, but that kills all biomass.
#model.reactions.GLYO1.bounds = (0,1000)

It seems the DHORDfum reaction is wrong: it should contain fumarate instead of oxygen. I will change this and make it irreversible, as it should be.
Further inspection: the original EC 1.3.3.1 reaction no longer exists in the databases; it has been shown to be coupled to fumarate and succinate instead, i.e. EC 1.3.98.1.

In [67]:
model.reactions.DHORDfum.add_metabolites({model.metabolites.o2_c:1, model.metabolites.h2o2_c:-1, model.metabolites.fum_c: -1, model.metabolites.succ_c:1})
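By default `add_metabolites` adds the given coefficients to the ones already in the reaction, so the call above cancels oxygen and hydrogen peroxide and introduces fumarate and succinate. The arithmetic can be checked with plain dictionaries. This is only an illustrative sketch: `add_coefficients` is a hypothetical helper, and the starting stoichiometry is copied from the DHORDfum row in the O2 table above.

```python
def add_coefficients(stoichiometry, delta):
    """Add delta to a {metabolite: coefficient} dict and drop zero entries."""
    result = dict(stoichiometry)
    for met, coeff in delta.items():
        result[met] = result.get(met, 0) + coeff
    return {met: coeff for met, coeff in result.items() if coeff != 0}


# dhor__S_c + o2_c <=> h2o2_c + orot_c
# (negative = consumed, positive = produced)
dhord_before = {"dhor__S_c": -1, "o2_c": -1, "h2o2_c": 1, "orot_c": 1}

# Same coefficients as in the add_metabolites call
delta = {"o2_c": 1, "h2o2_c": -1, "fum_c": -1, "succ_c": 1}

dhord_after = add_coefficients(dhord_before, delta)
print(dhord_after)
```

The result corresponds to dhor__S_c + fum_c <=> orot_c + succ_c, the fumarate-coupled reaction described above.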

In [73]:
#this reaction should be irreversible but kills biomass
#model.reactions.DHORDfum.bounds = (0,1000)

In [81]:
model.reactions.MOX.bounds = (0,1000)

In [88]:
model.metabolites.no_c.name = 'Nitric oxide'

In [89]:
model.reactions.NODOy.bounds = (0,1000)

In [103]:
model.reactions.PPPGO_1.bounds = (0,1000)

In [111]:
model.reactions.AMACT.annotation = {}

In [112]:
model.reactions.AMACT.annotation['kegg.reaction'] = 'R10707'
model.reactions.AMACT.annotation['sbo'] = 'SBO:0000176'

In [114]:
model.reactions.AMACT.notes ={}

In [122]:
model.reactions.AMACT.notes['ENZYME'] = 'E.C.2.3.1.180'
model.reactions.AMACT.notes['KEGG ID'] = 'R10707'
model.reactions.AMACT.notes['NAME'] = '3-ketoacyl-acyl carrier protein synthase III'
model.reactions.AMACT.notes['DEFINITION'] = 'Acetyl-CoA + Malonyl-[acyl-carrier protein] <=> Acetoacetyl-[acp] + CoA + CO2'

In [121]:
#save & commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

Error encountered trying to <Setting notes on sbase: <Reaction R_AMACT "R10707">>.
LibSBML error code -3: The requested action could not be performed. This can occur in a variety of contexts, such as passing a null object as a parameter in a situation where it does not make sense to permit a null object.


In [123]:
model.optimize()

| | fluxes | reduced_costs |
|---|---|---|
| IDPh | 0.00000 | -7.068747e-03 |
| CAT | 480.67777 | -8.840469e-18 |
| PDHam1hi | 0.00000 | -1.737434e-17 |
| CCP | 0.00000 | -4.213574e-19 |
| HYDA | 0.00000 | -0.000000e+00 |
| ... | ... | ... |
| EX_amylose_e | 0.00000 | -5.340831e-02 |
| EX_pyr_e | 0.00000 | -7.854163e-04 |
| FORt | 0.00000 | 1.355253e-20 |
| PIt | -1000.00000 | -0.000000e+00 |


### Propanoate cycle
When looking into AMP biosynthesis, I ran into a cycle that just dissipates ATP and produces AMP. It involves propanoate and looks as follows: prpnte_c --> ppcoa_c --> ppap_c --> prpnte_c.
Here I need to find a way to fix that.

After looking into this issue, it seems to come from the reaction PRPNTELIG, which converts propanoate into propanoyl-CoA using ATP. It was added to the model because the gene was automatically annotated as E.C. 6.2.1.13. I think this annotation is wrong. Looking into the genome, I cannot find this enzyme (which forms ADP), but I can find the EC 6.2.1.1 reaction, which is AMP-forming and reflects the PRPNTELIG reaction.

The preferred substrate of this enzyme is acetate, not propanoate. In some distant organisms it has been shown to act on propanoate, but this has not been shown for B. subtilis, so I think it makes sense to exclude this reaction from the model. This strain is not known to grow on or produce large amounts of propanoate, so even if the activity is present it will likely carry very little flux, unlike what is happening in the model now.

This is a knowledge gap, and so we have to make the best decision we can with available information. 

In [32]:
model = cobra.io.read_sbml_model('../model/g-thermo.xml')

In [33]:
#saw incorrectly named reaction that should also be irreversible.
model.reactions.PCPPTh.id = 'PTA2'

In [34]:
model.reactions.PTA2.bounds = (0,1000)

In [45]:
model.metabolites.ppap_c.name = 'Propanoyl phosphate'

In [63]:
# should this reaction be in the model? I will remove it (remove_reactions expects a list)
model.remove_reactions([model.reactions.PRPNTELIG])



In [62]:
# check that EC 6.2.1.1 is there for acetate conversion: it is.
model.reactions.ACS

| | |
|---|---|
| Reaction identifier | ACS |
| Name | R00235 |
| Memory address | 0x027e39cb6b88 |
| Stoichiometry | ac_c + atp_c + coa_c --> accoa_c + amp_c + ppi_c  Acetate + ATP + CoA --> Acetyl-CoA + AMP + Pyrophosphate |
| GPR | RTMO00940 or RTMO01624 or RTMO02405 or RTMO02238 |
| Lower bound | 0.0 |
| Upper bound | 1000.0 |


In [66]:
#save& commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

### Acetyl adenylate
Acetyl adenylate (aad_c) is an intermediate in the formation of acetyl-CoA. It seems there is also a cycle going on here.
- accoa + amp <--> aad + coa (rct AADCOAT)
- aad + ppi <--> ac + atp (rct ATPACAT)

So the net reaction is accoa + amp + ppi <--> ac + atp + coa.

At the same time the model also has the ACS reaction (also E.C. 6.2.1.1):
ac_c + atp_c + coa_c --> accoa_c + amp_c + ppi_c
which we know should be irreversible. So with this combination you can see a cycle starts to arise.
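That the three reactions close a loop can be checked by adding up their stoichiometries: if every coefficient cancels, the combination can carry flux without any net conversion. The snippet below is a small sketch using plain dictionaries rather than the model itself. `net_stoichiometry` is a hypothetical helper, and the coefficients are copied from the reactions as written in this section (the unsuffixed metabolite names are shorthand, and anything not written there, such as protons, is left out).

```python
from collections import Counter

# Negative = consumed, positive = produced, as written in the text above.
aadcoat = {"accoa": -1, "amp": -1, "aad": 1, "coa": 1}
atpacat = {"aad": -1, "ppi": -1, "ac": 1, "atp": 1}
acs = {"ac": -1, "atp": -1, "coa": -1, "accoa": 1, "amp": 1, "ppi": 1}


def net_stoichiometry(*reactions):
    """Sum reaction stoichiometries and keep only the non-zero net coefficients."""
    total = Counter()
    for rxn in reactions:
        for met, coeff in rxn.items():
            total[met] += coeff
    return {met: coeff for met, coeff in total.items() if coeff != 0}


# AADCOAT and ATPACAT alone give the net reaction quoted above
net_two = net_stoichiometry(aadcoat, atpacat)
# Adding ACS cancels everything, i.e. the three reactions form a closed loop
net_three = net_stoichiometry(aadcoat, atpacat, acs)
print(net_two)
print(net_three)
```

An empty result for the three reactions together means nothing is consumed or produced overall, which is why the combination behaves as a cycle once AADCOAT and ATPACAT are allowed to run in the direction opposite to ACS.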

To solve that, we need to define the reversibility of the first two reactions, or even remove them completely. Acetyl adenylate is a metabolite that is not often included and doesn't connect anywhere else in the model. So for simplicity's sake (and to keep the model more BiGG compliant), I will remove the two reactions and the aad_c metabolite.

You can observe that doing this decreases the biomass accumulation to 12/h, which is a good sign.

In [78]:
model.remove_reactions([model.reactions.AADCOAT])



In [79]:
model.remove_reactions([model.reactions.ATPACAT])



In [80]:
model.remove_metabolites(model.metabolites.aad_c)

In [82]:
#save & commit
cobra.io.write_sbml_model(model,'../model/g-thermo.xml')

### F1P cycle (issue #34)

It seems that this may come from the excessive amounts of G3P that are somehow made in the PPP.
There appears to be a lot of r5p_c, coming from r1p_c. The r1p_c comes from two sources: adn_c and gsn_c. These in turn are made from AMP and GMP respectively. So the problem is that the model makes so much of these that it influences the PPP and glycolysis.
__So there is some fundamental problem with the nucleotides that causes these fluxes. This may tie into the RNA/DNA node problem too.__

Also, this has highlighted to me the problem with thiamin: it is imported and converted into thmpp_c, and then the majority is converted further into thmtp_c and exported. This just consumes some ATP and ends up forming huge amounts of AMP. Why does the model do this?
A little bit of the thmpp_c is used for converting adhlam to alac__S_b_c, but also at higher levels than it should be.

In [None]:
model.metabolites.get_by_id('2ahethmpp_c').summary()

In [None]:
model.metabolites.thmpp_c.summary()

In [137]:
model.optimize()

| | fluxes | reduced_costs |
|---|---|---|
| IDPh | 513.506272 | 0.000000 |
| CAT | 0.000000 | 0.000000 |
| PDHam1hi | 0.000000 | 0.000000 |
| CCP | -0.185396 | -0.000000 |
| HYDA | 0.000000 | -0.000000 |
| ... | ... | ... |
| FORt | 0.000000 | 0.000000 |
| PIt | -1000.000000 | -0.007913 |
| EX_pi_e | 1000.000000 | 0.000000 |
| DCTUP | 0.000000 | 0.000000 |
