# MICOM medium for VMH [description]

Here we will build up the environmental conditions used for modeling. We start from the metabolite abundances obtained from the VMH diet designer and go through the following steps.

1. Convert to fluxes and adjust very low-abundance compounds.
2. Dilute metabolites absorbed in the small intestine.
3. Add primary bile acids and mucin cores.
4. Add missing components to allow at least slow growth for all known taxa residing in the human gut.

That should leave us with a set of usable media for all later simulation steps.

But first, let us inspect the actual diet data we got. For that we will read the diet data, rearrange it a bit, and load the metabolite annotations.

In [1]:
import pandas as pd


# VMH diet designer export without a header row: exchange reaction ID and flux
diet = pd.read_csv("../data/[DIET].tsv", sep="\t", header=None)
diet.columns = ["reaction", "flux"]
annotations = pd.read_csv("../data/agora_metabolites.csv")

# derive metabolite IDs by stripping the "EX_" prefix and the "(e)"/"[e]" suffix
diet["metabolite"] = diet.reaction.str.replace("^EX_", "", regex=True).str.replace(r"\[e\]|\(e\)", "", regex=True)
diet.loc[diet.metabolite == "4hpro", "metabolite"] = "4hpro_LT"  # fix name for hydroxyproline
diet.loc[diet.flux == 0, "flux"] = 1e-4  # bug in VMH designer where everything <1e-4 gets truncated to 0

diet

Unnamed: 0,reaction,flux,metabolite
0,EX_lcts(e),230.350004,lcts
1,EX_but(e),0.0001,but
2,EX_octa(e),4.22783,octa
3,EX_dca(e),2.630904,dca
4,EX_ddca(e),17.660948,ddca
5,EX_ttdca(e),6.873942,ttdca
6,EX_hdca(e),32.524183,hdca
7,EX_ocdca(e),5.36439,ocdca
8,EX_hdcea(e),0.0001,hdcea
9,EX_ocdcea(e),51.526825,ocdcea
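
The ID cleanup in the cell above can be checked in isolation. The following is a minimal sketch using only the standard library `re` module. The helper name `exchange_to_metabolite` is hypothetical and simply mirrors the two `str.replace` calls and the hydroxyproline rename.

```python
import re


def exchange_to_metabolite(reaction_id: str) -> str:
    """Strip the "EX_" prefix and the "(e)" or "[e]" compartment suffix."""
    met = re.sub(r"^EX_", "", reaction_id)
    met = re.sub(r"\[e\]|\(e\)", "", met)
    # Same special case as in the notebook: hydroxyproline is renamed.
    return "4hpro_LT" if met == "4hpro" else met


for rid in ["EX_lcts(e)", "EX_but[e]", "EX_4hpro(e)"]:
    print(rid, "->", exchange_to_metabolite(rid))
```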


## Adjust for intestinal absorption

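To achieve this we will load the Recon3 human model. Any diet metabolite that has an exchange reaction in Recon3 is treated as absorbable by the host, and its flux is reduced to 20% of the original value. AGORA and Recon IDs are very similar, so we should be able to match them after adjusting the Recon3 ones a bit. We start by identifying all available exchanges in Recon3 and adjusting the IDs.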

In [2]:
from cobra.io import read_sbml_model
import pandas as pd

recon3 = read_sbml_model("../data/Recon3D.xml.gz")
exchanges = pd.Series([r.id for r in recon3.exchanges])
exchanges = exchanges.str.replace("__", "_").str.replace("_e$|EX_", "", regex=True)
exchanges.head()

0     5adtststerone
1    5adtststerones
2             5fthf
3             5htrp
4             5mthf
dtype: object
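
To make the ID adjustment concrete, here is a minimal standard-library sketch of the same two replacements applied to single IDs. The function name `recon_to_agora` is hypothetical, and the example input IDs are assumptions about what Recon3 exchange IDs look like rather than values read from the model file.

```python
import re


def recon_to_agora(exchange_id: str) -> str:
    """Mirror of the notebook's ID adjustment for Recon3 exchange IDs."""
    # Collapse double underscores (e.g. in stereoisomer suffixes) ...
    collapsed = exchange_id.replace("__", "_")
    # ... then drop the "EX_" prefix and the trailing "_e" compartment tag.
    return re.sub(r"_e$|EX_", "", collapsed)


# Example IDs are assumptions for illustration.
for rid in ["EX_5htrp_e", "EX_asp__L_e"]:
    print(rid, "->", recon_to_agora(rid))
```

Note that the `EX_` part of the pattern is not anchored to the start of the string, so it would also remove an `EX_` occurring elsewhere in an ID. Anchoring it as `^EX_` would be a stricter variant.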

In [5]:
diet["dilution"] = 1.0
diet.loc[diet.metabolite.isin(exchanges), "dilution"] = 0.2
diet["flux"] = diet["flux"] * diet["dilution"] 
diet[["metabolite", "dilution"]].drop_duplicates().dilution.value_counts()

0.2    41
1.0     6
Name: dilution, dtype: int64
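
As a worked example of the dilution arithmetic, the sketch below applies the 0.2 factor to two fluxes from the diet table and leaves a third, made-up metabolite untouched. The helper `dilute` and the metabolite `fiber_x` are hypothetical and only serve the illustration.

```python
def dilute(fluxes, absorbed, factor=0.2):
    """Scale the flux of host-absorbed metabolites, leave the rest unchanged."""
    return {
        met: flux * factor if met in absorbed else flux
        for met, flux in fluxes.items()
    }


# Fluxes for lcts and octa are taken from the diet table above;
# "fiber_x" and its flux are made up to show the unabsorbed case.
fluxes = {"lcts": 230.350004, "octa": 4.22783, "fiber_x": 1.0}
absorbed = {"lcts", "octa"}  # assumed to have a Recon3 exchange

for met, flux in dilute(fluxes, absorbed).items():
    print(f"{met}: {flux:.6f}")
```

The diluted values for `lcts` and `octa` can be compared against the flux column shown after the host components are added.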

## Adding host supplied components

Next we add host-supplied metabolites such as primary bile acids and mucin cores, together with a small amount of cellulose as fiber and a minuscule amount of oxygen.

In [6]:
diet.set_index("metabolite", inplace=True)

# mucin
for met in annotations.loc[annotations.metabolite.str.contains("core"), "metabolite"]:
    diet.loc[met, "flux"] = 1

# primary BAs
for met in ["gchola", "tchola"]:
    diet.loc[met, "flux"] = 1

# fiber
diet.loc["cellul", "flux"] = 0.1

# near-anaerobic: only a trace amount of oxygen
diet.loc["o2", "flux"] = 0.001

diet.reset_index(inplace=True)
diet["reaction"] = "EX_" + diet.metabolite + "(e)"
diet

Unnamed: 0,metabolite,reaction,flux,dilution
0,lcts,EX_lcts(e),46.070001,0.2
1,but,EX_but(e),0.000020,0.2
2,octa,EX_octa(e),0.845566,0.2
3,dca,EX_dca(e),0.526181,0.2
4,ddca,EX_ddca(e),3.532190,0.2
...,...,...,...,...
59,core7,EX_core7(e),1.000000,
60,gchola,EX_gchola(e),1.000000,
61,tchola,EX_tchola(e),1.000000,
62,cellul,EX_cellul(e),0.100000,


We will now merge this table with some annotations to make it more accessible.

In [7]:
skeleton = pd.merge(diet, annotations, on="metabolite")

skeleton["global_id"] = skeleton.reaction
skeleton["reaction"] = "EX_" + skeleton.metabolite + "_m"
skeleton.head()

Unnamed: 0,metabolite,reaction,flux,dilution,name,hmdb,kegg.compound,pubchem.compound,inchi,chebi,global_id
0,lcts,EX_lcts_m,46.070001,0.2,Lactose,HMDB00186,C00243,440995.0,,,EX_lcts(e)
1,but,EX_but_m,2e-05,0.2,butyrate,HMDB00039,C00246,264.0,"InChI=1S/C4H8O2/c1-2-3-4(5)6/h2-3H2,1H3,(H,5,6...",,EX_but(e)
2,octa,EX_octa_m,0.845566,0.2,octanoate (n-C8:0),HMDB00482,C06423,379.0,"InChI=1S/C8H16O2/c1-2-3-4-5-6-7-8(9)10/h2-7H2,...",,EX_octa(e)
3,ddca,EX_ddca_m,3.53219,0.2,laurate,HMDB00638,C02679,3893.0,InChI=1S/C12H24O2/c1-2-3-4-5-6-7-8-9-10-11-12(...,,EX_ddca(e)
4,ttdca,EX_ttdca_m,1.374788,0.2,tetradecanoate (n-C14:0),HMDB00806,C06424,11005.0,InChI=1S/C14H28O2/c1-2-3-4-5-6-7-8-9-10-11-12-...,,EX_ttdca(e)
5,hdca,EX_hdca_m,6.504837,0.2,Hexadecanoate (n-C16:0),HMDB00220,C00249,985.0,InChI=1S/C16H32O2/c1-2-3-4-5-6-7-8-9-10-11-12-...,,EX_hdca(e)
6,ocdca,EX_ocdca_m,1.072878,0.2,octadecanoate (n-C18:0),HMDB00827,C01530,3033836.0,InChI=1S/C18H36O2/c1-2-3-4-5-6-7-8-9-10-11-12-...,,EX_ocdca(e)
7,hdcea,EX_hdcea_m,2e-05,0.2,Hexadecenoate (n-C16:1),HMDB03229,C08362,445638.0,InChI=1S/C16H30O2/c1-2-3-4-5-6-7-8-9-10-11-12-...,,EX_hdcea(e)
8,ocdcea,EX_ocdcea_m,10.305365,0.2,octadecenoate (n-C18:1),HMDB00207,C00712,5460221.0,InChI=1S/C18H34O2/c1-2-3-4-5-6-7-8-9-10-11-12-...,,EX_ocdcea(e)
9,chsterol,EX_chsterol_m,0.012381,0.2,cholesterol,HMDB00067,C00187,5997.0,,,EX_chsterol(e)
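
One thing to keep in mind is that `pd.merge` performs an inner join by default, so any metabolite without a row in the annotation table is dropped from the skeleton. In the outputs shown here, `dca` is present in the diet table but absent from the first rows of the skeleton, which would be consistent with such a drop. A minimal sketch of a check for this, using plain Python and a made-up annotation list (the helper name `missing_annotations` is hypothetical):

```python
def missing_annotations(diet_metabolites, annotated_metabolites):
    """Return diet metabolites an inner join on "metabolite" would drop."""
    annotated = set(annotated_metabolites)
    # Preserve the diet order so the report is easy to compare with the table.
    return [m for m in diet_metabolites if m not in annotated]


# Toy inputs: the annotation list here is an assumption, not the real file.
diet_mets = ["lcts", "but", "octa", "dca", "ddca"]
annotated = ["lcts", "but", "octa", "ddca", "ttdca"]

print(missing_annotations(diet_mets, annotated))
```

With the notebook's data frames the equivalent call would be `missing_annotations(diet.metabolite, annotations.metabolite)`.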


## Complete the medium

Great, we now have a pretty good skeleton. One issue is that it will never be fully complete: there will always be some components missing that are essential for microbial growth. Fortunately, we provide an algorithm in MICOM to complete a medium with the smallest set of additional components needed to provide growth to all intestinal taxa.

In [27]:
from micom.workflows.db_media import complete_db_medium

manifest, imports = complete_db_medium(
    "../data/agora103_strain.qza",
    skeleton,
    growth=0.01,
    threads=12,
    max_added_import=10,
    weights="mass",
)

Output()

In [28]:
manifest.can_grow.value_counts()

True     532
False    286
Name: can_grow, dtype: int64

In [34]:
filled = imports.max()
added = filled[~filled.index.isin(skeleton.reaction)]

print(f"Added flux is {added.sum():.2f}/{filled.sum():.2f} mmol/h.")

Added flux is 16.10/175.13 mmol/h.
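
The bookkeeping behind this number can be illustrated with a toy example: take the largest import flux per reaction across taxa, then split the result by whether the reaction was already in the skeleton. The sketch below uses plain dictionaries with made-up taxa and fluxes, treats a missing import as zero, and uses the hypothetical helper name `split_added`.

```python
def split_added(imports_by_taxon, skeleton_reactions):
    """Take the max import per reaction and separate non-skeleton reactions."""
    filled = {}
    for fluxes in imports_by_taxon.values():
        for rxn, flux in fluxes.items():
            filled[rxn] = max(filled.get(rxn, 0.0), flux)
    added = {r: f for r, f in filled.items() if r not in skeleton_reactions}
    return filled, added


# Toy import fluxes for two hypothetical taxa (all numbers are made up).
imports_by_taxon = {
    "taxon_A": {"EX_lcts_m": 2.0, "EX_h2_m": 0.5},
    "taxon_B": {"EX_lcts_m": 3.0, "EX_urea_m": 1.0},
}
skeleton_reactions = {"EX_lcts_m"}

filled, added = split_added(imports_by_taxon, skeleton_reactions)
print(f"Added flux is {sum(added.values()):.2f}/{sum(filled.values()):.2f}.")
```

For the run reported above, 16.10 of 175.13 mmol/h means that roughly 9% of the total flux comes from added components.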


Let's see what was added in large amounts.

In [42]:
added.sort_values(ascending=False)[0:20]

EX_h_m        2.254048
EX_h2_m       1.438089
EX_no_m       1.398173
EX_urea_m     1.088698
EX_gcald_m    0.593231
EX_acald_m    0.527849
EX_no3_m      0.489774
EX_ph2s_m     0.423410
EX_no2_m      0.356260
EX_nh4_m      0.354688
EX_asp_L_m    0.336766
EX_thr_L_m    0.314275
EX_fru_m      0.287669
EX_xyl_D_m    0.282403
EX_co2_m      0.272776
EX_glyc_m     0.253462
EX_n2o_m      0.250353
EX_ac_m       0.232344
EX_fum_m      0.225022
EX_etha_m     0.222551
dtype: float64

Looks okay. We will now assemble the final medium. For this we use the largest import flux observed for each reaction (`filled`), which covers both the skeleton and the newly added components, and rebuild the annotations for a nicely formatted medium.

In [36]:
added_df = filled.reset_index() 
added_df.iloc[:, 0] = added_df.iloc[:, 0].str.replace("EX_|_m$", "", regex=True)
added_df.columns = ["metabolite", "flux"]

completed = pd.merge(added_df, annotations, on="metabolite", how="left")
completed["reaction"] = "EX_" + completed.metabolite + "_m"
completed["global_id"] = "EX_" + completed.metabolite + "(e)"
completed

Unnamed: 0,metabolite,flux,name,hmdb,kegg.compound,pubchem.compound,inchi,chebi,reaction,global_id
0,26dap_M,0.003626,"meso-2,6-Diaminoheptanedioate",,,,,,EX_26dap_M_m,EX_26dap_M(e)
1,2dmmq8,0.000159,2-Demethylmenaquinone 8,,,,,,EX_2dmmq8_m,EX_2dmmq8(e)
2,2obut,0.007933,2-Oxobutanoate,HMDB00005,C00109,58.0,"InChI=1S/C4H6O3/c1-2-3(5)4(6)7/h2H2,1H3,(H,6,7...",,EX_2obut_m,EX_2obut(e)
3,3mop,0.007876,3-methyl-2-oxopentanoate,HMDB00491,C03465,47.0,"InChI=1S/C6H10O3/c1-3-4(2)5(7)6(8)9/h4H,3H2,1-...",,EX_3mop_m,EX_3mop(e)
4,acgam,0.012285,N-acetyl-D-glucosamine,HMDB00215,C00140,439174.0,InChI=1S/C8H15NO6/c1-4(12)9-5(2-10)7(14)8(15)6...,,EX_acgam_m,EX_acgam(e)
...,...,...,...,...,...,...,...,...,...,...
211,gal,0.003088,D-Galactose,HMDB00143,C00984,439357.0,,,EX_gal_m,EX_gal(e)
212,stys,0.044406,Stachyose,,,,,,EX_stys_m,EX_stys(e)
213,so3,0.160953,Sulfite,HMDB00240,C00094,1100.0,"InChI=1S/H2O3S/c1-4(2)3/h(H2,1,2,3)/p-2",,EX_so3_m,EX_so3(e)
214,oaa,0.006627,Oxaloacetate,HMDB00223,C00036,970.0,"InChI=1S/C4H4O5/c5-2(4(8)9)1-3(6)7/h1H2,(H,6,7...",,EX_oaa_m,EX_oaa(e)


## Validate the medium

We will now validate that the completed medium supports growth of the taxa in the database.

In [32]:
from micom.workflows.db_media import check_db_medium

check = check_db_medium("../data/agora103_strain.qza", medium=completed, threads=12)

Output()

In [43]:
check.growth_rate.describe()

count    818.000000
mean       0.015362
std        0.012921
min        0.000000
25%        0.007142
50%        0.010000
75%        0.025641
max        0.064667
Name: growth_rate, dtype: float64
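
The summary shows a minimum of 0 and a lower quartile below the 0.01 growth target, so not every taxon reaches the target on this medium. One way to quantify this is sketched below with the standard library only. The helper `fraction_reaching` is hypothetical, the tolerance is an assumption to absorb solver round-off, and the rates are made up.

```python
import math


def fraction_reaching(rates, target=0.01, rel_tol=1e-6):
    """Fraction of taxa whose growth rate meets the target within a tolerance."""
    hits = sum(
        1 for r in rates
        if r > target or math.isclose(r, target, rel_tol=rel_tol)
    )
    return hits / len(rates)


# Toy growth rates (made up); 0.01 matches the growth target used above.
rates = [0.0, 0.007, 0.01, 0.0256, 0.0647]
print(f"{fraction_reaching(rates):.0%} of taxa reach the target")
```

On the real results the call would be `fraction_reaching(check.growth_rate)`.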

We are done now and will save the final medium.

In [None]:
import qiime2 as q2

arti = q2.Artifact.import_data("MicomMedium[Global]", completed)
arti.save("../media/vmh_[DIET]_agora.qza")