# Introduction
In issue #18, we observed that some exchanges were still critical to biomass formation. Here I will revisit those and check whether they are all still required. If not, I will remove them from the medium.

In [1]:
import cameo
import pandas as pd
import cobra.io
import escher
from escher import Builder
from cobra import Metabolite, Reaction

In [3]:
model = cobra.io.read_sbml_model('../model/g-thermo.xml')

In [4]:
model_e_coli = cameo.load_model('iML1515')

In [5]:
model_b_sub = cameo.load_model('iYO844')

__NADP__ We do not expect any exchange or transport of NADP. If closing the NADP exchange doesn't kill the biomass prediction, I will remove both the exchange and the transport of NADP from our model.

In [5]:
model.optimize().objective_value

1.7026543554461888

In [6]:
model.reactions.EX_nadp_e.bounds = (0,0)

In [7]:
model.optimize().objective_value

1.6952458831276083

So our model is no longer dependent on an NADP(H) supply, as it should be. I will remove the exchange and transport reactions, as we do not expect them to actually be present.
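The decision above rests on comparing the two objective values. A minimal sketch of that comparison in plain Python follows; the helper name `fraction_retained` and the 0.95 cut-off are illustrative assumptions, not thresholds used anywhere in this notebook.

```python
def fraction_retained(growth_before, growth_after):
    """Return the share of the original growth rate that remains."""
    if growth_before <= 0:
        raise ValueError("reference growth rate must be positive")
    return growth_after / growth_before


# Objective values printed in the cells above
with_nadp = 1.7026543554461888
without_nadp = 1.6952458831276083

retained = fraction_retained(with_nadp, without_nadp)

# Hypothetical cut-off: call the exchange dispensable if at least
# 95 % of the original growth rate remains after closing it.
dispensable = retained >= 0.95
print(f"{retained:.4f} of growth retained, dispensable: {dispensable}")
```

The growth rate drops by well under one percent, which is why the exchange can go.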

In [8]:
model.remove_reactions([model.reactions.EX_nadp_e])

In [9]:
model.remove_reactions([model.reactions.NADPt])

In [16]:
#save&commit
cobra.io.write_sbml_model(model,'../../model/g-thermo.xml')

__Quinone__ qh2 is currently also supplied, and I still cannot remove it, so we need to check the quinone biosynthesis pathway. According to the KEGG pathway and some literature, our strain has all the genes for the menaquinone biosynthesis pathway. Until now the quinone in the model was a generic, unspecified quinone; here I will change the IDs and related fields to specify it as menaquinone.

In [10]:
#quinone becomes menaquinone

In [11]:
model.metabolites.ubiquin_c.id = 'mqn_c'

In [12]:
model.metabolites.mqn_c.name = 'Menaquinone'

In [13]:
model.metabolites.mqn_c.formula = 'C51H72O2'

In [14]:
model.metabolites.mqn_c.annotation['kegg.compound'] = 'C00828'
model.metabolites.mqn_c.annotation['chebi'] = 'CHEBI:16374'
model.metabolites.mqn_c.annotation['metanetx.chemical'] = 'MNXM97071'

In [15]:
model.metabolites.qh2_c.name = 'Menaquinol'

In [16]:
model.metabolites.qh2_c.charge = 0
model.metabolites.mqn_c.charge = 0

In [17]:
model.metabolites.qh2_c.formula = 'C51H74O2'

In [18]:
model.metabolites.qh2_c.annotation['kegg.compound'] = 'C05819'
model.metabolites.qh2_c.annotation['chebi'] = 'CHEBI:18151'
model.metabolites.qh2_c.annotation['metanetx.chemical'] = 'MNXM442'

In [19]:
#check menaquinol biosynthesis from chorismate. 
#We know chorismate production is possible.

In [20]:
model.reactions.SBZCOAHYDRO.id = 'DHNCOAS'

In [21]:
# NPHS seems to be an incorrect reaction covering two other reactions.
# I will remove it.
model.remove_reactions([model.reactions.NPHS])

In [22]:
#missing conversion from 14dhncoa_c to dhna_c. Add that here

In [23]:
model.add_reaction(Reaction(id='DHNCOAT', name = '1,4-dihydroxy-2-naphthoyl-CoA thioesterase'))

In [24]:
model.reactions.DHNCOAT.annotation = model_e_coli.reactions.DHNCOAT.annotation

In [25]:
model.reactions.DHNCOAT.add_metabolites({
    model.metabolites.get_by_id('14dhncoa_c'): -1,
    model.metabolites.h2o_c: -1,
    model.metabolites.coa_c: 1,
    model.metabolites.dhna_c: 1
})

The pathway seems to lead to the mql7_c metabolite: a quinone that has been added but is not used in any other reaction. Therefore, I will modify this so that menaquinol and menaquinone are the outcomes of the pathway.

In [26]:
model.metabolites.get_by_id('2dmmq_c').name = 'Demethylmenaquinol'

In [27]:
model.metabolites.get_by_id('2dmmq_c').annotation['kegg.compound'] = 'C19847'
model.metabolites.get_by_id('2dmmq_c').annotation['chebi'] = 'CHEBI:55437'

In [28]:
model.metabolites.get_by_id('octdp_c').charge = -3

In [29]:
model.reactions.AMETDMMQ.id = 'AMMQLT'

In [30]:
#change this reaction so it produces menaquinol (qh2_c) instead of mql7_c
model.reactions.AMMQLT.add_metabolites({
    model.metabolites.mql7_c:-1,
    model.metabolites.qh2_c:1,
    model.metabolites.h_c: 2
})
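As far as I know, cobra's `Reaction.add_metabolites` adds the given coefficients to the ones already present by default and drops any metabolite whose coefficient ends up at zero. That is why passing `mql7_c: -1` removes it as a product. A plain-Python sketch of that bookkeeping is given here; the starting stoichiometry is an assumption made up for illustration, not the real content of AMMQLT.

```python
def combine_stoichiometry(existing, delta):
    """Add coefficients per metabolite and drop entries that cancel to zero."""
    combined = dict(existing)
    for met_id, coefficient in delta.items():
        combined[met_id] = combined.get(met_id, 0) + coefficient
    return {m: c for m, c in combined.items() if c != 0}


# ASSUMED starting stoichiometry, for illustration only; the real
# coefficients are whatever the model held before this cell ran.
before = {"2dmmq_c": -1, "amet_c": -1, "ahcys_c": 1, "mql7_c": 1}

# The same deltas as passed to add_metabolites in the cell above
delta = {"mql7_c": -1, "qh2_c": 1, "h_c": 2}

after = combine_stoichiometry(before, delta)
print(after)
```

The same additive behaviour explains later cells where a proton coefficient is adjusted in two steps.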

Now that we've fixed this, removing qh2 from the medium still kills biomass. Looking through the pathway, I spotted that adding a sink for ahcys_c restores growth. So I need to investigate how ahcys is handled in metabolism and fix that here.

ahcys is the demethylated version of SAM, so I need to check that the pathway to regenerate SAM is present. The regeneration cycle does seem to be present, but it has one byproduct that may not be converted elsewhere in metabolism: dhptd. This metabolite is currently a dead end. I need to find out what happens to it and connect it to the rest of metabolism, as right now it blocks all SAM-related reactions.
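The dead-end reasoning can be sketched on a toy network in plain Python. Everything apart from the `dhptd_c` and `dhptdp_c` ids is a hypothetical placeholder (`SUPPLY`, `CYCLE_STEP`, `RECYCLE`, `precursor_c`, `recycled_c`), so this only illustrates the idea and is not a description of the actual model.

```python
from collections import defaultdict


def find_dead_ends(reactions):
    """Return metabolites that are only produced or only consumed.

    `reactions` maps a reaction id to (stoichiometry, reversible).
    Reversible reactions count as both producing and consuming.
    """
    produced, consumed = defaultdict(set), defaultdict(set)
    for rxn_id, (stoichiometry, reversible) in reactions.items():
        for met_id, coefficient in stoichiometry.items():
            if coefficient > 0 or reversible:
                produced[met_id].add(rxn_id)
            if coefficient < 0 or reversible:
                consumed[met_id].add(rxn_id)
    all_mets = set(produced) | set(consumed)
    return sorted(m for m in all_mets
                  if not produced.get(m) or not consumed.get(m))


# Hypothetical toy network: a cycle step that releases dhptd_c as byproduct
toy = {
    "SUPPLY": ({"precursor_c": 1}, False),
    "CYCLE_STEP": ({"precursor_c": -1, "dhptd_c": 1, "recycled_c": 1}, False),
    "RECYCLE": ({"recycled_c": -1}, False),
}
dead_before = find_dead_ends(toy)

# Adding a single consuming step only moves the dead end downstream
toy["KINASE_STEP"] = ({"dhptd_c": -1, "dhptdp_c": 1}, False)
dead_after = find_dead_ends(toy)
print(dead_before, dead_after)
```

This is why a whole route down to central metabolites is needed, not just one extra reaction.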

In E. coli, dhptd is converted into dhptdp and then dhptdd. This is in turn converted into accoa and dhap, and so linked to further metabolism.

For the first reaction, converting dhptd into dhptdp: there is an annotated gene that shows significant similarity in a BLAST search to the Bacillus subtilis autoinducer-2 kinase. There is no other indication of what may happen to this metabolite, so I will add this reaction.

In [31]:
#add dhptdp_c metabolites
model.add_metabolites(Metabolite(id='dhptdp_c', name = '(4S)-4-hydroxy-5-phosphonooxypentane-2,3-dione', compartment = 'c', formula = 'C5H7O7P',charge = -2 ))

In [32]:
model.metabolites.dhptdp_c.annotation = model_e_coli.metabolites.dhptdp_c.annotation

In [33]:
#add AI2K reaction
model.add_reaction(Reaction(id='AI2K', name = 'Autoinducer-2 kinase'))

In [34]:
model.reactions.AI2K.annotation = model_e_coli.reactions.AI2K.annotation

In [35]:
model.reactions.AI2K.add_metabolites({
    model.metabolites.atp_c:-1,
    model.metabolites.dhptd_c:-1,
    model.metabolites.adp_c:1,
    model.metabolites.dhptdp_c:1,
    model.metabolites.h_c: 1
})

The next step in the pathway converts dhptdp to dhptdd via a phospho-AI-2 isomerase. It is a very short protein, so it is not surprising that it was not automatically annotated in the genome. A BLAST of the E. coli protein against the Geobacillus genome, with slightly changed algorithm parameters to account for the short protein, finds a significant hit. Again, there is no other indication of how this metabolite could be converted further (that we know of), so I will incorporate this reaction here.

In [36]:
#add dhptdd_c metabolites
model.add_metabolites(Metabolite(id='dhptdd_c', name = '3-hydroxy-5-phosphonooxypentane-2,4-dione', compartment = 'c', formula = 'C5H7O7P',charge = -2 ))

In [37]:
model.metabolites.dhptdd_c.annotation = model_e_coli.metabolites.dhptdd_c.annotation

In [38]:
#add PAI2I reaction
model.add_reaction(Reaction(id='PAI2I', name = 'Phospho-AI-2 isomerase'))

In [39]:
model.reactions.PAI2I.annotation = model_e_coli.reactions.PAI2I.annotation

In [40]:
model.reactions.PAI2I.add_metabolites({
    model.metabolites.dhptdp_c:-1,
    model.metabolites.dhptdd_c:1
})
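A quick way to sanity-check newly added reactions is an element and charge balance. cobra reactions offer `check_mass_balance()` for this; the standalone sketch below does the same bookkeeping in plain Python, using only the formulas and charges entered in this notebook. The helper names are my own.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count the elements in a formula such as 'C5H7O7P'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def imbalance(stoichiometry, metabolites):
    """Return net element and charge totals; an empty dict means balanced."""
    totals = Counter()
    for met_id, coefficient in stoichiometry.items():
        formula, charge = metabolites[met_id]
        for element, count in parse_formula(formula).items():
            totals[element] += coefficient * count
        totals["charge"] += coefficient * charge
    return {key: value for key, value in totals.items() if value != 0}


# (formula, charge) pairs as set in the cells of this notebook
metabolites = {
    "dhptdp_c": ("C5H7O7P", -2),
    "dhptdd_c": ("C5H7O7P", -2),
    "mqn_c": ("C51H72O2", 0),
    "qh2_c": ("C51H74O2", 0),
}

pai2i_check = imbalance({"dhptdp_c": -1, "dhptdd_c": 1}, metabolites)
# Menaquinone to menaquinol on its own is short of two hydrogens
quinone_check = imbalance({"mqn_c": -1, "qh2_c": 1}, metabolites)
print(pai2i_check, quinone_check)
```

The isomerase balances as written, while the quinone pair shows the two hydrogens that any reaction reducing menaquinone has to supply.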

Final part of the pathway: the E. coli enzyme gave a significant hit in the genome of our strain, so I will again include the reaction.

In [41]:
#add the final conversion of dhptdd to accoa and dhap
model.add_reaction(Reaction(id = 'PAI2T', name = '3-hydroxy-2,4-pentadione 5-phosphate thiolase'))

In [42]:
model.reactions.PAI2T.annotation = model_e_coli.reactions.PAI2T.annotation

In [43]:
model.reactions.PAI2T.add_metabolites({
    model.metabolites.coa_c:-1,
    model.metabolites.dhptdd_c:-1,
    model.metabolites.accoa_c:1,
    model.metabolites.dhap_c:1,
    model.metabolites.h_c:-1
})

All of the above fixes the qh2 dependency. So now I can remove the qh2 exchange and transport reactions.

Also I should remove the MENAOR reaction and mql7 and menqui metabolites now. 

In [44]:
model.remove_reactions([model.reactions.EX_qh2_e])

In [45]:
model.remove_reactions([model.reactions.QH2t])

In [46]:
model.remove_reactions([model.reactions.MENAOR])

In [47]:
model.remove_metabolites(model.metabolites.mql7_c)

In [48]:
model.remove_metabolites(model.metabolites.menqui_c)

In [112]:
#save&commit
cobra.io.write_sbml_model(model,'../../model/g-thermo.xml')

__THM__
Thiamine also still needs to be supplied. This could be realistic, as we have seen that supplementing with thiamine really increases the growth rate on minimal medium. It would be nice if the model could still grow without thiamine, just much more slowly, with the limitation alleviated when thiamine is added. But I am not sure this is possible.

For this, I will investigate thiamine biosynthesis. The genome seems to have the majority of the genes needed so it should be present. In Bacillus subtilis, thiamine biosynthesis occurs from the linking of thiazole (THZ) phosphate (THZ-P) and pyrimidine (HMP) pyrophosphate (HMP-PP). So I will add this reaction and then in turn check the biosynthesis of both of these components. (see https://www.ncbi.nlm.nih.gov/pmc/articles/PMC179069/pdf/1793030.pdf and https://link.springer.com/content/pdf/10.1007/s002030050713.pdf and https://www.annualreviews.org/doi/pdf/10.1146/annurev.biochem.78.072407.102340) 

In [451]:
model= cobra.io.read_sbml_model('../../model/g-thermo.xml')

In [49]:
#reaction that makes thmp: 
model.reactions.MAHMPDC

| Attribute | Value |
|---|---|
| Reaction identifier | MAHMPDC |
| Name | 2-methyl-4-amino-5-hydroxymethylpyrimidine-diphosphate:2-(2-carboxy-4-methylthiazol-5-yl)ethyl ph... |
| Memory address | 0x02425ff85448 |
| Stoichiometry | 2mahmp_c + cthzp_c --> co2_c + h_c + ppi_c + thmmp_c |
| | 2-Methyl-4-amino-5-hydroxymethylpyrimidine + 2-(2-Carboxy-4-methylthiazol-5-yl)ethyl phosphate --> CO2 + H+ + Pyrophosphate + Thiamin |
| GPR | |
| Lower bound | 0.0 |
| Upper bound | 1000.0 |


__HMP-P synthesis__ The HMP-PP that condenses into thiamin is present: 2mahmp_c. This can be made from 4ampm_c, but that metabolite cannot be produced. So I need to add the reaction that converts air_c into it.

In [50]:
#add reaction to convert AIR to 4ampm_c
model.add_reaction(Reaction(id='HMP', name = '5-amino-1-(5-phospho-D-ribosyl)imidazole formate-lyase'))

In [51]:
model.reactions.HMP.annotation['sbo'] = 'SBO:0000247'
model.reactions.HMP.annotation['kegg.reaction'] = 'R03472'
model.reactions.HMP.annotation['ec-code'] = '4.1.99.17'

In [52]:
model.reactions.HMP.add_metabolites({
    model.metabolites.air_c:-1,
    model.metabolites.amet_c:-1,
    model.metabolites.get_by_id('4ampm_c'): 1,
    model.metabolites.met__L_c:1,
    model.metabolites.for_c:1,
    model.metabolites.co_c:1,
    model.metabolites.dad_5_c:1,
    model.metabolites.h_c: 2
})

In [53]:
model.metabolites.get_by_id('4ampm_c').formula = 'C6H8N3O4P'

In [54]:
model.metabolites.get_by_id('4ampm_c').charge = -2

In [90]:
# Correction: I later found the existing AMETLY reaction, which duplicates the HMP reaction
# added above, so it needs to be removed or we have duplicates.
model.remove_reactions([model.reactions.AMETLY])

In [55]:
#add 4ampmp metabolite
model.add_metabolites(Metabolite(id='4ampmp_c', name = '4-Amino-5-hydroxymethyl-2-methylpyrimidine diphosphate', compartment = 'c', formula = 'C6H9N3O7P2', charge = -2))

In [56]:
model.metabolites.get_by_id('4ampmp_c').annotation['sbo'] = 'SBO:0000247'
model.metabolites.get_by_id('4ampmp_c').annotation['kegg.compound'] = 'C04752'
model.metabolites.get_by_id('4ampmp_c').annotation['chebi'] = 'CHEBI:16629'

__THZ-P__ Above we enabled the synthesis of one of the metabolites needed to make thiamin. Now I have to go through the other branch, which is responsible for the production of the thiazole ring.

First I need to check the thiazole tautomerase (EC:5.3.99.10). This reaction should run between the cmtdepp_c and cthzp_c metabolites. cthzp_c is the metabolite that will be used to form thiamine. I have found it as reaction CMTEPISO.

Synthesis of cmtdepp_c occurs in reaction DXYL5PTST. The problem here is the recycling of tcscp_c and scpgg_c. scpgg_c gets converted into ascp_c. This should then react with an enzyme-S-sulfanylcysteine (enzscys_c) to give tcscp_c again. The E-sulfanylcysteine is made from the reaction of the enzyme-SH (enzcys_c) with L-cysteine (encoded in LCTST).
So I only need to add the reaction that uses ascp_c and enzscys_c to form tcscp_c.
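The recycling argument boils down to asking whether the sulfur carrier can get back to its starting form. A plain-Python sketch of that check follows. The conversion map is my reading of the paragraph above (which reaction performs the scpgg_c to ascp_c step is not stated there), and `can_regenerate` is a hypothetical helper name.

```python
def can_regenerate(start, conversions):
    """Return True if `start` can be reached again by following conversions.

    `conversions` maps a metabolite id to the ids it can be converted into.
    """
    stack, seen = list(conversions.get(start, [])), set()
    while stack:
        current = stack.pop()
        if current == start:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(conversions.get(current, []))
    return False


# Carrier conversions as read from the text, before the fix
carrier_steps = {
    "tcscp_c": ["scpgg_c"],  # carrier is used up during cmtdepp_c synthesis
    "scpgg_c": ["ascp_c"],   # conversion mentioned in the text
}
closed_before = can_regenerate("tcscp_c", carrier_steps)

# Adding the missing sulfur transfer step closes the loop
carrier_steps["ascp_c"] = ["tcscp_c"]
closed_after = can_regenerate("tcscp_c", carrier_steps)
print(closed_before, closed_after)
```

Only with the third step in place does the carrier return to tcscp_c, which is the gap the new reaction fills.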

In [57]:
#dhgly synthesis is only possible aerobically, which may also explain why complete anaerobic fermentation on minimal medium is not possible.

In [58]:
#biosynthesis of tcscp is not possible
model.metabolites.tcscp_c.annotation['sbo'] = 'SBO:0000247'

In [59]:
#add reaction
model.add_reaction(Reaction(id = 'THIOD', name = 'Sulfur transferase'))

In [60]:
model.reactions.THIOD.annotation['sbo'] = 'SBO:0000176'
model.reactions.THIOD.bounds = (-1000,1000)
model.reactions.THIOD.annotation['kegg.reaction'] = 'R07461'
model.reactions.THIOD.annotation['ec-code'] = '2.8.1.11'

In [61]:
model.reactions.THIOD.add_metabolites({
    model.metabolites.ascp_c:-1,
    model.metabolites.enzscys_c:-1,
    model.metabolites.tcscp_c:1,
    model.metabolites.enzcys_c:1,
    model.metabolites.amp_c:1
})

In [62]:
model.reactions.ATPTAT.add_metabolites({model.metabolites.h_c:1})

In [63]:
model.metabolites.ascp_c.formula = 'C14H18N7O9PR'
model.metabolites.ascp_c.charge = -1

In [64]:
model.metabolites.tcscp_c.formula = 'RCOS'
model.metabolites.tcscp_c.charge = -1

In [65]:
model.metabolites.scpgg_c.formula = 'RCO2'
model.metabolites.scpgg_c.charge = -1

In [66]:
model.metabolites.ascp_c.formula = 'RC11O8H13PN5'
model.metabolites.ascp_c.charge = 0

In [67]:
model.reactions.ATPTAT.add_metabolites({model.metabolites.h_c:-2})

The THIOD reaction is missing an electron donor. There is no indication of what it is, so I will add an ambiguous one (similar to the strategy in ACEDIA and SELCYSLY) and regenerate it with NADP(H). As this is an anabolic reaction, I would say this makes the most sense.
Here these will be acc3_c and hacc3_c.

In [68]:
model.add_metabolites(Metabolite(id='acc3_c', name = 'Acceptor', compartment = 'c', formula = 'R', charge = 0 ))

In [69]:
model.metabolites.acc3_c.annotation['sbo'] = 'SBO:0000247'
# cobra stores notes as a dict; the 'NOTES' key is an arbitrary label
model.metabolites.acc3_c.notes = {'NOTES': 'Unknown electron acceptor'}

In [70]:
# annotate the other placeholder acceptor metabolites, skipping any that are not in the model
for met_id in ['acc_c', 'acc2_c', 'hacc_c', 'hacc2_c']:
    if model.metabolites.has_id(met_id):
        model.metabolites.get_by_id(met_id).annotation['sbo'] = 'SBO:0000247'

In [71]:
model.add_metabolites(Metabolite(id='hacc3_c', name = 'Hydrogen-Acceptor', compartment = 'c', formula = 'HR', charge = -1))

In [72]:
model.metabolites.hacc3_c.annotation['sbo'] = 'SBO:0000247'

In [73]:
#add to THIOD reaction
model.reactions.THIOD.add_metabolites({
    model.metabolites.hacc3_c:-1,
    model.metabolites.acc3_c:1,
    model.metabolites.h_c:2
})

In [74]:
#need to add acceptor regeneration
model.add_reaction(Reaction(id='HACC3R', name ='Regeneration of hacc3'))

In [75]:
# cobra stores notes as a dict; the 'NOTES' key is an arbitrary label
model.reactions.HACC3R.notes = {'NOTES': 'Assumed with NADP(H)'}
model.reactions.HACC3R.annotation['sbo']='SBO:0000176'

In [76]:
model.reactions.HACC3R.add_metabolites({
    model.metabolites.acc3_c:-1,
    model.metabolites.hacc3_c:1,
    model.metabolites.nadph_c:-1,
    model.metabolites.nadp_c:1
})
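To see what the placeholder acceptor pair does overall, the two stoichiometries can be summed. This is a plain-Python sketch using the coefficients entered in the cells above; `net_stoichiometry` is a hypothetical helper name.

```python
from collections import Counter


def net_stoichiometry(*reactions):
    """Sum several stoichiometries and drop metabolites that cancel out."""
    totals = Counter()
    for stoichiometry in reactions:
        for met_id, coefficient in stoichiometry.items():
            totals[met_id] += coefficient
    return {m: c for m, c in totals.items() if c != 0}


# Coefficients as entered for THIOD (including the acceptor pair) and HACC3R
thiod = {"ascp_c": -1, "enzscys_c": -1, "tcscp_c": 1, "enzcys_c": 1,
         "amp_c": 1, "hacc3_c": -1, "acc3_c": 1, "h_c": 2}
hacc3r = {"acc3_c": -1, "hacc3_c": 1, "nadph_c": -1, "nadp_c": 1}

net = net_stoichiometry(thiod, hacc3r)
print(net)
```

The acceptor pair cancels out, so at steady state the sulfur transfer is effectively driven by NADPH.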

In [77]:
#everything is now mass balanced, should solve the issue of thm production?

In [78]:
model.reactions.DXYL5PTST.bounds = (0,1000)

In [79]:
model.reactions.ATPTAT.bounds = (0,1000)

In [80]:
model.reactions.TMPPP.bounds = (0,1000)

In [84]:
model.metabolites.thmmp_c.name = 'Thiamin monophosphate'

With these fixes, the only thing that still kills biomass lies in the HMP reaction. For the quinone above, I checked that regeneration of SAM (amet_c) from the produced ahcys is possible. The problem in this reaction is not the amet or air supply, but the removal of one of the products: dad_5_c. I will fix this here, as right now it can only be produced and not consumed (i.e. it is a dead-end metabolite).

It seems that our model is missing the 5DOAN reaction that degrades dad_5 in E. coli. The E. coli enzyme responsible for this is a 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase (mtnN). Experimental validation has shown that, although this is not the main reaction of the enzyme, it can also convert dad_5_c in the 5DOAN reaction that is included in the E. coli model. In our strain, an ortholog of the mtnN enzyme is encoded in the genome (as in Bacillus subtilis). Therefore I would expect this conversion to be possible in our strain.

So here I will add the 5DOAN reaction, which hydrolyses dad_5 into 5'-deoxyribose and adenine. Our model has a ribokinase that has been shown to work on deoxyribose (DRBK reaction). Therefore, adding the 5DOAN reaction will also connect this blocked reaction, solving another problem in the meantime.

In [131]:
#add 5DOAN reaction
model.add_reaction(Reaction(id='5DOAN', name = '5-deoxyadenosine nucleosidase'))

In [133]:
model.reactions.get_by_id('5DOAN').annotation = model_e_coli.reactions.get_by_id('5DOAN').annotation

In [135]:
model.reactions.get_by_id('5DOAN').add_metabolites({
    model.metabolites.dad_5_c:-1,
    model.metabolites.h2o_c:-1,
    model.metabolites.drib_c:1,
    model.metabolites.ade_c:1
})

This solves the thiamine dependency in our model, so we can now remove thiamine from the default medium.

In [141]:
model.reactions.EX_thm_e.bounds = (0,1000)

In [146]:
#add new reactions to the correct groups
reactions = [model.reactions.HMP, model.reactions.THIOD, model.reactions.HACC3R]

In [147]:
model.groups.get_by_id('00730 - Thiamine metabolism').add_members(reactions)

In [None]:
#save&commit
cobra.io.write_sbml_model(model,'../../model/g-thermo.xml')

# CHECK THIS

In [None]:
model.medium