In [3]:
import pandas as pd
import cobra
from pickle import load

In [4]:
# load the dataframe of important features
imp_frame = pd.read_csv('../results/ensemble_learning_important_reactions_round2.csv',sep=',')

# load the current version of the reconstruction that was used to generate the ensemble
model = cobra.io.read_sbml_model('../results/reconstructions/psy_cor_6.xml')

# read the ensemble so we can pull the gapfilled reactions to add to the draft reconstruction
with open("../results/ensembles/psy_ensemble_500_SEED_biomass_round2.pickle",'rb') as infile:
    ensemble = load(infile)

In [5]:
imp_frame.head(20)

Unnamed: 0.1,Unnamed: 0,importance,fraction active in 0,fraction active in 1
0,rxn12060_c_lower_bound,0.027648,0.164557,0.824561
1,rxn00394_c_upper_bound,0.026927,0.303797,0.011696
2,rxn12060_c_upper_bound,0.025805,0.164557,0.824561
3,rxn00399_c_upper_bound,0.024043,0.512658,0.032164
4,rxn00399_c_lower_bound,0.020256,0.512658,0.032164
5,rxn03990_c_lower_bound,0.013899,0.310127,0.011696
6,rxn11674_c_lower_bound,0.013391,0.487342,0.081871
7,rxn05303_c_lower_bound,0.011722,0.183544,0.0
8,rxn04702_c_upper_bound,0.011248,0.360759,0.035088
9,rxn11674_c_upper_bound,0.011186,0.487342,0.081871
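
Each feature name pairs a reaction ID with the bound it refers to, so the same reaction can appear twice in the ranking. The following is a minimal sketch, not part of the original notebook, that collapses the rows printed above to one entry per reaction. The helper names `split_feature` and `rank_reactions` are hypothetical, and the rows are copied by hand from the output rather than read from the CSV.

```python
# Rows copied from the imp_frame.head(20) output:
# (feature, importance, fraction active in 0, fraction active in 1)
rows = [
    ("rxn12060_c_lower_bound", 0.027648, 0.164557, 0.824561),
    ("rxn00394_c_upper_bound", 0.026927, 0.303797, 0.011696),
    ("rxn12060_c_upper_bound", 0.025805, 0.164557, 0.824561),
    ("rxn00399_c_upper_bound", 0.024043, 0.512658, 0.032164),
    ("rxn00399_c_lower_bound", 0.020256, 0.512658, 0.032164),
    ("rxn03990_c_lower_bound", 0.013899, 0.310127, 0.011696),
    ("rxn11674_c_lower_bound", 0.013391, 0.487342, 0.081871),
    ("rxn05303_c_lower_bound", 0.011722, 0.183544, 0.0),
    ("rxn04702_c_upper_bound", 0.011248, 0.360759, 0.035088),
    ("rxn11674_c_upper_bound", 0.011186, 0.487342, 0.081871),
]


def split_feature(name):
    """Split 'rxn12060_c_lower_bound' into ('rxn12060_c', 'lower_bound')."""
    for suffix in ("_lower_bound", "_upper_bound"):
        if name.endswith(suffix):
            return name[: -len(suffix)], suffix[1:]
    raise ValueError(f"unexpected feature name: {name}")


def rank_reactions(rows):
    """Keep the most important feature per reaction, sorted by importance."""
    best = {}
    for name, importance, frac0, frac1 in rows:
        rxn_id, bound = split_feature(name)
        if rxn_id not in best or importance > best[rxn_id]["importance"]:
            best[rxn_id] = {
                "bound": bound,
                "importance": importance,
                # cluster in which the reaction is active more often
                "more_active_in": 1 if frac1 > frac0 else 0,
            }
    return sorted(best.items(), key=lambda kv: kv[1]["importance"], reverse=True)


ranked = rank_reactions(rows)
for rxn_id, info in ranked:
    print(rxn_id, info["bound"], info["importance"], info["more_active_in"])
```

In this sample only rxn12060_c is active more often in cluster 1; every other reaction is active more often in cluster 0.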


First, we'll curate a few reactions from the previous round that we finished curating after starting the second round of ensemble generation. The first is rxn01423, L-2-aminoadipate:2-oxoglutarate aminotransferase. There is evidence that this reaction exists in Pseudomonas putida (doi: 10.1128/JB.187.21.7500–7510.2005), and two candidates match in the P. syringae pv. tomato DC3000 genome: PSPTO_4775 (https://www.uniprot.org/uniprot/Q87W08) and PSPTO_5504 (https://www.uniprot.org/uniprot/Q87U11). Both are GntR transcriptional regulators that also have aminotransferase activity (MocR family, which transfers amino groups to keto-acid acceptors), and there is significant sequence similarity to a Pseudomonas fluorescens 2-aminoadipate aminotransferase (UniProt accession, P. fluorescens: A0A0K1QLZ8). BLAST of the two putative P. syringae pv. tomato DC3000 genes suggests that the aminotransferase domain is conserved, although the proteins differ in length by ~100 amino acids. We hypothesize that a duplication event led to insertion of the aminotransferase in a transcription factor gene, so we will add both genes in an OR relationship for this reaction.

In [6]:
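# copy the gapfilled reaction from the ensemble's base model
rxn1 = ensemble.base_model.reactions.get_by_id('rxn01423_c').copy()
# either candidate gene is sufficient for the reaction (OR relationship)
rxn1.gene_reaction_rule = 'PSPTO_4775 or PSPTO_5504'
rxn1.notes = {'ensemble_curation_step': 1}
model.add_reactions([rxn1])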
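
To illustrate what the OR relationship implies, here is a minimal sketch, independent of COBRApy, that evaluates a gene-reaction rule against a set of knocked-out genes. The function `gpr_active` is a hypothetical helper written for this note. It assumes the rule uses only gene IDs, `and`, `or`, and parentheses, and that gene IDs are valid Python identifiers (true for the PSPTO locus tags used here).

```python
import ast


def gpr_active(rule, knocked_out=frozenset()):
    """Return True if the rule is still satisfied after removing some genes."""

    def visit(node):
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.BoolOp):
            values = [visit(v) for v in node.values]
            # 'or': any alternative gene suffices; 'and': all genes are required
            return any(values) if isinstance(node.op, ast.Or) else all(values)
        if isinstance(node, ast.Name):
            return node.id not in knocked_out
        raise ValueError("unsupported element in gene-reaction rule")

    return visit(ast.parse(rule.strip(), mode="eval"))


rule = "PSPTO_4775 or PSPTO_5504"
print(gpr_active(rule))                                # both genes present
print(gpr_active(rule, {"PSPTO_4775"}))                # one gene removed
print(gpr_active(rule, {"PSPTO_4775", "PSPTO_5504"}))  # both genes removed
```

Under an OR rule the reaction is lost only when both candidates are removed, which matches the hypothesis that either gene may carry the aminotransferase activity.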

rxn00394 has evidence for function in various Pseudomonas species, including Pseudomonas putida (https://www.microbiologyresearch.org/docserver/fulltext/micro/130/1/mic-130-1-69.pdf?expires=1548948345&id=id&accname=sgid024758&checksum=52D0F9E337305B0E5083196A4CEE3C14). However, sequence similarity searches have only revealed the presence of an aminotransferase domain in Pto. A group (University of Toronto) recently annotated this feature as having arginase, agmatinase, and fumarase activity, without clear agreement between the annotations. We decided that this reaction does not exist in Pst.

rxn00245 has evidence for being non-functional in Pseudomonas syringae pv. syringae. The evidence is that tartrate as a sole carbon source produces no growth, which is considered a hallmark of Psm (https://aem.asm.org/content/59/4/1018.short).

rxn00399 - Nitric oxide synthase. There is no significant sequence homology within the genome to the NO synthase from Nocardia, the first microbe in which NOS was found (10.1006/bbrc.1994.2317), or to the NOS of any other microbe. NO also induces the plant defense response, so this enzyme may have become an evolutionary detriment for phytopathogens. Further, there is no sequence similarity to PAO1 arginine deiminase, which shares a similar function.

rxn05303 - arginine transport reaction. A lower bound of 0 is supported because this is a transport reaction into the cell. However, all support in the literature points to energy-dependent ABC transport of arginine, with mild suggestions that GABA permease can accommodate arginine.

In [8]:
rxn2 = ensemble.base_model.reactions.get_by_id('rxn05303_c').copy()
# no gene identified yet; 'Unknown' is a placeholder rule
rxn2.gene_reaction_rule = 'Unknown'
rxn2.notes = {'ensemble_curation_step':1}
model.add_reactions([rxn2])
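
The ranked features refer to lower and upper bounds separately because the pair of bounds determines which directions a reaction may carry flux in. A minimal sketch of that mapping is given below. The function name `allowed_direction` and the ±1000 bound values are assumptions for illustration only. Whether "forward" corresponds to import depends on how the transport reaction is written in the model, which this sketch does not check.

```python
def allowed_direction(lower_bound, upper_bound):
    """Classify the flux directions permitted by a pair of bounds."""
    if lower_bound > upper_bound:
        raise ValueError("lower bound exceeds upper bound")
    if lower_bound == 0 and upper_bound == 0:
        return "blocked"
    if lower_bound >= 0:
        return "forward only"
    if upper_bound <= 0:
        return "reverse only"
    return "reversible"


# Illustrative bound values, not taken from the reconstruction
print(allowed_direction(0, 1000))      # lower bound of 0: no reverse flux
print(allowed_direction(-1000, 1000))  # both directions allowed
print(allowed_direction(-1000, 0))     # reverse flux only
print(allowed_direction(0, 0))         # no flux at all
```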

rxn03990 - no evidence supports the existence of this reaction. The reverse reaction is possible, but as written the reaction does not proceed in the direction presented.
Further, a strain is characterized as not producing nitrate: https://www.ndrs.org.uk/pdfs/022/NDR_022010.pdf

rxn27379 - no evidence in Pseudomonas or other bacterial species; it only seems to be present in humans as yLAT.

rxn04702 - no evidence in Pseudomonas species.

Working down the round 2 list, let's investigate rxn12060, a multistep reaction that converts xanthine and S-adenosyl-L-methionine to theophylline and S-adenosyl-L-homocysteine. Across the papers we reviewed, we found no evidence of this particular reaction occurring. However, there is evidence of another form of theobromine/caffeine degradation in /Pseudomonas putida/: doi 10.1590/S0001-37141999000100013 (Yamaoka-Yano and Mazzafera, Brasil).

rxn00245 has evidence of existing in multiple /Pseudomonas/ species, as per Ornston, 1971 (PMCID PMC378377).

In [9]:
rxn3 = ensemble.base_model.reactions.get_by_id('rxn00245_c').copy()
rxn3.gene_reaction_rule = '(PSPTO_2662)'
rxn3.notes = {'ensemble_curation_step' :1}
model.add_reactions([rxn3])

rxn01934 is present in /Pseudomonas convexa/ and was originally curated in /Pseudomonas putida/ (PMID: 976259).
Here, we also found no annotations for this specific alpha-hydroxy-acid (AHA) dehydrogenase activity, but BLAST local alignment suggests the presence of an AHA dehydrogenase in the genome.

In [10]:
rxn4 = ensemble.base_model.reactions.get_by_id('rxn01934_c').copy()
rxn4.gene_reaction_rule = '(PSPTO_2424 or PSPTO_2434 or PSPTO_3287 or PSPTO_3460 or PSPTO_3920)'
rxn4.notes = {'ensemble_curation_step' :1}
model.add_reactions([rxn4])

rxn01827 shows evidence of existing, based on the presence of the catalyzing enzyme in the Pto genome: PSPTO_2346

In [11]:
rxn5 = ensemble.base_model.reactions.get_by_id('rxn01827_c').copy()
rxn5.gene_reaction_rule = '(PSPTO_2346)'
rxn5.notes = {'ensemble_curation_step' :1}
model.add_reactions([rxn5])

In [12]:
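# NOTE: rxn01423_c was already added to the model as rxn1, so add_reactions
# skips this copy and the PSPTO_0096 rule is not applied to the model
rxn6 = ensemble.base_model.reactions.get_by_id('rxn01423_c').copy()
rxn6.gene_reaction_rule = '(PSPTO_0096)'
rxn6.notes = {'ensemble_curation_step': 1}
model.add_reactions([rxn6])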

Ignoring reaction 'rxn01423_c' since it already exists.
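
The message above shows that a reaction ID was submitted twice with different gene rules. One way to catch this before touching the model is to list the planned additions and look for IDs that carry more than one distinct rule. The sketch below is a hypothetical pre-check written for this note, not part of the original workflow. The `additions` list is transcribed from the cells in this notebook, and `find_conflicts` is an assumed helper name.

```python
from collections import defaultdict

# (notebook variable, reaction ID, gene rule assigned in the corresponding cell)
additions = [
    ("rxn1", "rxn01423_c", "PSPTO_4775 or PSPTO_5504"),
    ("rxn2", "rxn05303_c", "Unknown"),
    ("rxn3", "rxn00245_c", "(PSPTO_2662)"),
    ("rxn4", "rxn01934_c",
     "(PSPTO_2424 or PSPTO_2434 or PSPTO_3287 or PSPTO_3460 or PSPTO_3920)"),
    ("rxn5", "rxn01827_c", "(PSPTO_2346)"),
    ("rxn6", "rxn01423_c", "(PSPTO_0096)"),
]


def find_conflicts(additions):
    """Return reaction IDs that were given more than one distinct gene rule."""
    rules_by_id = defaultdict(list)
    for variable, rxn_id, rule in additions:
        rules_by_id[rxn_id].append((variable, rule.strip()))
    return {
        rxn_id: entries
        for rxn_id, entries in rules_by_id.items()
        if len({rule for _, rule in entries}) > 1
    }


conflicts = find_conflicts(additions)
for rxn_id, entries in conflicts.items():
    print(rxn_id, entries)
```

Here the check flags only rxn01423_c, whose two rules (from rxn1 and rxn6) would need to be reconciled by hand.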


In [15]:
cobra.io.write_sbml_model(model, '../results/reconstructions/psy_6_arg.xml')