# Biomass composition ➟  biomass reaction

By Garrett Roell and Christina Schenk

Tested on biodesign_3.7 kernel on jprime

This notebook converts biomass composition and amino acid composition measurements into biomass reactions for R. opacus growing on phenol and on glucose.

### Method
<ol>
<li>Setup imports</li>
<li>Load model and relevant data</li>
<li>Calculate the macromolecule distribution in the B. subtilis biomass reaction</li>
<li>Determine the mass fraction of each macromolecule class in R. opacus in phenol and glucose</li>
<li>Calculate each precursor's coefficient in phenol and glucose biomass reactions</li>
<li>Make biomass reactions and add them to the model</li>
<li>Check that phenol and glucose biomass reactions match expectations</li>
<li>Save model with new biomass reactions</li>
<li>Save biomass composition data as a csv</li>
</ol>

### 1. Setup imports

In [1]:
import cobra
import pandas as pd
from IPython.display import display  # explicit import so display() also works outside a notebook

### 2. Load model and relevant data

In [2]:
# load model from notebook A
model = cobra.io.read_sbml_model("../models/r_opacus_model_A.xml")

# load data on biomass composition
biomass_comp_df = pd.read_csv('../data/biomass_composition/r_opacus_experimental_data.csv')

# load data on amino acid composition
amino_acid_df = pd.read_csv('../data/biomass_composition/r_opacus_amino_acid_data.csv')

# load data describing the Bacillus subtilis biomass reaction
b_subtilis_biomass_df = pd.read_csv('../data/biomass_composition/b_subtilis_biomass_data.csv')

display(biomass_comp_df.head(10))
display(amino_acid_df.head(10))
display(b_subtilis_biomass_df.head(10))

Unnamed: 0,condition,time_hrs,lipid_percent,carbohydrate_percent,protein_percent
0,wt_phenol,25.5,14.08,17.21,38.84
1,P1_phenol_low,25.5,16.44,12.47,39.43
2,P1_phenol_high,46.5,14.94,13.68,45.23
3,wt_pv,27.0,15.28,9.37,43.11
4,PV1_pv_low,27.0,20.27,10.11,38.9
5,PV1_pv_high,55.5,16.67,8.74,46.24
6,wt_pvh,18.0,14.86,14.02,36.06
7,PVH5_pvh_low,18.0,20.25,19.08,38.37
8,PVH5_pvh_high,28.5,15.07,16.61,46.7
9,wt_pvhg,15.0,9.09,14.83,39.08


Unnamed: 0,amino_acid,metabolite_id,molecular_weight,glucose,phenol
0,alanine,ala__L_c,89.094,14.49,14.62
1,arginine,arg__L_c,174.203,3.89,4.39
2,asparagine,asn__L_c,132.119,3.43,4.0
3,aspartate,asp__L_c,133.104,3.43,4.0
4,cysteine,cys__L_c,121.154,0.8,0.81
5,glutamine,gln__L_c,146.146,10.68,8.1
6,glutamate,glu__L_c,147.131,10.68,8.1
7,glycine,gly_c,75.067,7.03,8.23
8,histidine,his__L_c,155.156,1.31,1.51
9,isoleucine,ile__L_c,131.175,3.87,4.07


Unnamed: 0,name,metabolite_id,category,coefficient,molecular_weight
0,10-Formyltetrahydrofolate,10fthf_c,carbon carrier,-0.000216,471.43
1,L-Alanine,ala__L_c,protein,-0.498716,89.094
2,S-Adenosyl-L-methionine,amet_c,carbon carrier,-0.000216,399.446
3,L-Arginine,arg__L_c,protein,-0.28717,175.212
4,L-Asparagine,asn__L_c,protein,-0.234029,132.119
5,L-Aspartate,asp__L_c,protein,-0.234029,132.095
6,ATP,atp_c,energy molecule,-52.547151,507.18
7,Calcium,ca2_c,salt,-0.005053,40.078
8,Chloride,cl_c,salt,-0.005053,35.45
9,CoenzymeA,coa_c,carbon carrier,-0.000559,767.535


### 3. Calculate the macromolecule distribution in the B. subtilis biomass reaction

Define a helper function for getting the macromolecule mass fraction from the B. subtilis dataframe

In [3]:
# A function that takes in a biomass dataframe and a macromolecule class and
# returns that macromolecule's fraction of biomass in milligrams/gram of dry cell weight
def macro_mass_in_mg_from_biomass_df(biomass_df, macromolecule):
    # isolate just reactants (coefficients less than 0 means a chemical is consumed)
    biomass_df = biomass_df[biomass_df.coefficient < 0]
    
    # isolate just metabolites from that macromolecule type
    macro_df = biomass_df[biomass_df.category == macromolecule]
    
    # return the sum of the products of coefficients and molecular_weights
    # the -1 is needed since the coefficients of reactants are negative
    return sum([-1*row.coefficient * row.molecular_weight for _, row in macro_df.iterrows()])
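The same sum can be written without an explicit row loop. The sketch below is a minimal vectorized variant run on a toy table; the table values, the `lipid_x_c` identifier, and the name `macro_mass_vectorized` are hypothetical and not part of the notebook's data. It assumes, as is conventional for biomass reactions, that coefficients are in mmol per gram of dry cell weight, so that coefficient × molecular weight (g/mol) gives mg per gram.

```python
import pandas as pd

# Toy biomass table (hypothetical values, not the B. subtilis data)
toy_biomass_df = pd.DataFrame({
    "metabolite_id": ["ala__L_c", "gly_c", "lipid_x_c", "adp_c"],
    "category": ["protein", "protein", "lipid", "energy molecule"],
    "coefficient": [-0.5, -0.25, -0.01, 52.0],  # the positive value is a product
    "molecular_weight": [89.094, 75.067, 700.0, 424.18],
})

def macro_mass_vectorized(biomass_df, macromolecule):
    """Sum |coefficient| * molecular weight over the reactants of one class."""
    # keep reactants (negative coefficients) that belong to the requested class
    mask = (biomass_df["coefficient"] < 0) & (biomass_df["category"] == macromolecule)
    selected = biomass_df[mask]
    # negate the coefficients so the resulting mass is positive
    return float((-selected["coefficient"] * selected["molecular_weight"]).sum())

protein_mg = macro_mass_vectorized(toy_biomass_df, "protein")
lipid_mg = macro_mass_vectorized(toy_biomass_df, "lipid")
print(protein_mg, lipid_mg)
```

On a table with a few dozen rows the loop and the vectorized form should behave the same; the vectorized form mainly keeps the filter and the arithmetic in one place.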

Use a dictionary to store the number of milligrams each macromolecule class makes up per gram of dry cell weight

In [4]:
macros = ['protein', 'lipid', 'carbohydrate']
b_subtilis_macros = {macro:macro_mass_in_mg_from_biomass_df(b_subtilis_biomass_df, macro) for macro in macros}

b_subtilis_macros

{'protein': 655.9755072570001,
 'lipid': 175.241005523,
 'carbohydrate': 9.621337439000001}

### 4. Determine the mass fraction of each macromolecule class in R. opacus in phenol and glucose

Define a helper function for getting mass fraction from R. opacus experimental data

In [5]:
# Gets the mass fraction of a given macro for a given condition from the biomass composition data
def macro_mass_in_mg_from_biomass_comp_df(condition, macro):
    column_name = f'{macro}_percent'
    mass_percent = biomass_comp_df[biomass_comp_df.condition == condition][column_name]
    
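    # return mass percentage multiplied by 10 to convert from percent to mg macro/g biomass
    # .iloc[0] takes the single matching value (calling float() on a Series is deprecated in recent pandas)
    return 10*float(mass_percent.iloc[0])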
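The factor of 10 follows from 1% of 1 g being 10 mg. As a minimal sketch, the conversion can be checked on the `wt_phenol` row shown in section 2; the function name `percent_to_mg_per_g` and the single-row guard are illustrative additions, not part of the notebook.

```python
import pandas as pd

# One row copied from the biomass composition table displayed in section 2
example_df = pd.DataFrame({
    "condition": ["wt_phenol"],
    "lipid_percent": [14.08],
    "carbohydrate_percent": [17.21],
    "protein_percent": [38.84],
})

def percent_to_mg_per_g(df, condition, macro):
    """Convert a mass percentage into mg of macromolecule per g of biomass."""
    values = df.loc[df["condition"] == condition, f"{macro}_percent"]
    # guard (an assumption of this sketch): expect exactly one row per condition
    if len(values) != 1:
        raise ValueError(f"expected one row for {condition!r}, found {len(values)}")
    # 1 percent of 1 g is 10 mg
    return 10 * float(values.iloc[0])

protein_mg = percent_to_mg_per_g(example_df, "wt_phenol", "protein")
print(protein_mg)
```

The guard makes a misspelled or duplicated condition name fail with a clear message instead of an indexing error.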

Get phenol macro dictionary

In [6]:
phenol_macros = {macro:macro_mass_in_mg_from_biomass_comp_df('wt_phenol', macro) for macro in macros}
phenol_macros

{'protein': 388.40000000000003,
 'lipid': 140.8,
 'carbohydrate': 172.10000000000002}

Get glucose macro dictionary

In [7]:
glucose_macros = {macro:macro_mass_in_mg_from_biomass_comp_df('wt_glucose', macro) for macro in macros}
glucose_macros

{'protein': 240.39999999999998,
 'lipid': 402.59999999999997,
 'carbohydrate': 146.4}

### 5. Calculate each precursor's coefficient in phenol and glucose biomass reactions

Define a helper function to get amino acid coefficients for R. opacus biomass reactions

In [8]:
# returns the coefficient of amino acid in a given condition
def get_r_opacus_amino_acid_coefficient(condition, amino_acid_id):
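    if condition == 'phenol':
        mg_protein = phenol_macros['protein']
    elif condition == 'glucose':
        mg_protein = glucose_macros['protein']
    else:
        # fail early instead of hitting an undefined mg_protein below
        raise ValueError(f"Unknown condition: {condition!r}")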
        
    # get the amino acid molar percentage
    amino_acid_mol_percent = amino_acid_df[amino_acid_df.metabolite_id == amino_acid_id][condition]
    
    # calculate the total protein mass if coefficients were at mol percent levels
    total_protein_mass = sum([row[condition] * row.molecular_weight for _, row in amino_acid_df.iterrows()])
    
    # get the scale factor
    scale_factor = mg_protein / total_protein_mass
    
    # calculate the coefficient with the molar percentage and the scale factor
    # (negative because amino acids are consumed by the biomass reaction)
    amino_acid_coefficient = -1 * float(amino_acid_mol_percent.iloc[0]) * scale_factor
    
    return amino_acid_coefficient
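The scaling step can be illustrated on a toy protein made of two amino acids. In this sketch the molar percentages and the 300 mg protein target are made-up numbers, and `amino_acid_coefficients` is a hypothetical name; only the molecular weights come from the amino acid table above. The point is that the scaled coefficients keep the measured molar ratios while their total mass equals the measured protein mass.

```python
# Hypothetical molar percentages for a two-amino-acid protein
mol_percent = {"ala__L_c": 60.0, "gly_c": 40.0}
# Molecular weights (g/mol) as listed in the amino acid table
molecular_weight = {"ala__L_c": 89.094, "gly_c": 75.067}
# Hypothetical protein content in mg per g of biomass
mg_protein = 300.0

def amino_acid_coefficients(mol_percent, molecular_weight, mg_protein):
    """Scale molar percentages so the amino acids sum to mg_protein."""
    # mass the protein would have if coefficients equalled the molar percentages
    total_mass = sum(mol_percent[aa] * molecular_weight[aa] for aa in mol_percent)
    scale_factor = mg_protein / total_mass
    # reactants get negative coefficients
    return {aa: -mol_percent[aa] * scale_factor for aa in mol_percent}

coefficients = amino_acid_coefficients(mol_percent, molecular_weight, mg_protein)
# converting the coefficients back to mass should recover mg_protein
recovered_mg = sum(-coefficients[aa] * molecular_weight[aa] for aa in coefficients)
print(coefficients)
print(recovered_mg)
```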

Define a helper function to get the coefficient for any metabolite in an R. opacus biomass reaction

In [9]:
# get a dictionary mapping metabolite objects to coefficient values in the biomass reaction
# this function uses the B. subtilis biomass dataframe and the macros from b_subtilis_macros
def get_r_opacus_metabolite_coefficients(b_subtilis_biomass_df, b_subtilis_macros, r_opacus_macros, condition):
    macros = ['protein', 'lipid', 'carbohydrate']
    
    # get the multiplier needed to scale each macro
    macro_multipliers = {macro:(r_opacus_macros[macro]/b_subtilis_macros[macro]) for macro in macros}
    
    # make metabolite dictionary by looping over B. subtilis biomass dataframe
    metabolite_dictionary = {}
    for _, row in b_subtilis_biomass_df.iterrows():
        
        # convert the metabolite id into a metabolite object
        metabolite = model.metabolites.get_by_id(row.metabolite_id)
        
        # if the metabolite is a lipid or carbohydrate then scale its coefficient proportionally
        if row.category in ['lipid', 'carbohydrate']:
            metabolite_dictionary[metabolite] = macro_multipliers[row.category]*row.coefficient
            
        # if the metabolite is an amino acid, use the helper above to get its coefficient
        elif row.category == 'protein':
            metabolite_dictionary[metabolite] = get_r_opacus_amino_acid_coefficient(condition, row.metabolite_id)
            
        # otherwise keep the original B. subtilis coefficient
        else:
            metabolite_dictionary[metabolite] = row.coefficient
            
    return metabolite_dictionary
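To make the lipid and carbohydrate scaling concrete, the sketch below computes the multipliers from the B. subtilis totals and the phenol totals printed earlier in this notebook. The B. subtilis coefficient `-0.001` is a hypothetical value used only to show how a single coefficient is rescaled.

```python
# Totals in mg per g of biomass, copied from the notebook outputs above
b_subtilis_macros = {"protein": 655.9755072570001,
                     "lipid": 175.241005523,
                     "carbohydrate": 9.621337439000001}
phenol_macros = {"protein": 388.40000000000003,
                 "lipid": 140.8,
                 "carbohydrate": 172.10000000000002}

# ratio of the measured R. opacus mass to the B. subtilis mass for each class
macro_multipliers = {macro: phenol_macros[macro] / b_subtilis_macros[macro]
                     for macro in ["lipid", "carbohydrate"]}

# hypothetical B. subtilis lipid coefficient, rescaled for R. opacus on phenol
example_b_subtilis_coefficient = -0.001
example_r_opacus_coefficient = macro_multipliers["lipid"] * example_b_subtilis_coefficient

print(macro_multipliers)
print(example_r_opacus_coefficient)
```

Because every metabolite in a class is multiplied by the same factor, the relative composition within the lipid and carbohydrate classes stays that of B. subtilis; only the class totals change.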

Get coefficients for R. opacus in phenol biomass reaction

In [10]:
phenol_metabolite_coefficients = get_r_opacus_metabolite_coefficients(b_subtilis_biomass_df, b_subtilis_macros, phenol_macros, 'phenol')

Get coefficients for R. opacus in glucose biomass reaction

In [11]:
glucose_metabolite_coefficients = get_r_opacus_metabolite_coefficients(b_subtilis_biomass_df, b_subtilis_macros, glucose_macros, 'glucose')

### 6. Make biomass reactions and add them to the model

Add phenol biomass reaction to the model

In [12]:
phenol_growth_reaction = cobra.Reaction()

# add metadata
phenol_growth_reaction.name = 'Phenol biomass reaction'
phenol_growth_reaction.id = 'Growth_Phenol'

# add metabolites to the reaction
phenol_growth_reaction.add_metabolites(phenol_metabolite_coefficients)

# add reaction to the model
model.add_reactions([phenol_growth_reaction])

# print the reaction
model.reactions.get_by_id('Growth_Phenol').reaction

'0.00021600000000000002 10fthf_c + 0.4524504100898161 ala__L_c + 0.00021600000000000002 amet_c + 0.13585891246882986 arg__L_c + 0.12378944188503861 asn__L_c + 0.12378944188503861 asp__L_c + 52.547151 atp_c + 0.005053 ca2_c + 0.005053 cl_c + 0.000559 coa_c + 9.7e-05 cobalt2_c + 0.129616 ctp_c + 0.000688 cu2_c + 0.02506736198172032 cys__L_c + 0.025403 datp_c + 0.026229000000000002 dctp_c + 0.026229000000000002 dgtp_c + 0.025403 dttp_c + 0.00021600000000000002 fad_c + 0.006519 fe2_c + 0.00758 fe3_c + 0.25067361981720315 gln__L_c + 0.25067361981720315 glu__L_c + 0.25469677667846696 gly_c + 0.0007801644346423031 gtca1_45_BS_c + 0.0007801644346423031 gtca2_45_BS_c + 0.0007801644346423031 gtca3_45_BS_c + 0.208826 gtp_c + 47.184845 h2o_c + 0.04673051431160208 his__L_c + 0.12595575711802678 ile__L_c + 0.189503 k_c + 0.24572204214180166 leu__L_c + 3.93697809448742e-05 lipo1_24_BS_c + 3.93697809448742e-05 lipo2_24_BS_c + 3.93697809448742e-05 lipo3_24_BS_c + 3.93697809448742e-05 lipo4_24_BS_c + 0.

Add glucose biomass reaction to the model

In [13]:
glucose_growth_reaction = cobra.Reaction()

# add metadata
glucose_growth_reaction.name = 'Glucose biomass reaction'
glucose_growth_reaction.id = 'Growth_Glucose'

# add metabolites to the reaction
glucose_growth_reaction.add_metabolites(glucose_metabolite_coefficients)

# add reaction to the model
model.add_reactions([glucose_growth_reaction])

# print the reaction
model.reactions.get_by_id('Growth_Glucose').reaction

'0.00021600000000000002 10fthf_c + 0.2753095839435447 ala__L_c + 0.00021600000000000002 amet_c + 0.0739098883050648 arg__L_c + 0.06516990151320624 asn__L_c + 0.06516990151320624 asp__L_c + 52.547151 atp_c + 0.005053 ca2_c + 0.005053 cl_c + 0.000559 coa_c + 9.7e-05 cobalt2_c + 0.129616 ctp_c + 0.000688 cu2_c + 0.01519997702931924 cys__L_c + 0.025403 datp_c + 0.026229000000000002 dctp_c + 0.026229000000000002 dgtp_c + 0.025403 dttp_c + 0.00021600000000000002 fad_c + 0.006519 fe2_c + 0.00758 fe3_c + 0.20291969334141183 gln__L_c + 0.20291969334141183 glu__L_c + 0.13356979814514283 gly_c + 0.002230782680305335 gtca1_45_BS_c + 0.002230782680305335 gtca2_45_BS_c + 0.002230782680305335 gtca3_45_BS_c + 0.208826 gtp_c + 47.184845 h2o_c + 0.024889962385510254 his__L_c + 0.07352988887933182 ile__L_c + 0.189503 k_c + 0.1519997702931924 leu__L_c + 0.00011257296738924964 lipo1_24_BS_c + 0.00011257296738924964 lipo2_24_BS_c + 0.00011257296738924964 lipo3_24_BS_c + 0.00011257296738924964 lipo4_24_BS_c 

### 7. Check that phenol and glucose biomass reactions match expectations

Define a helper function to get the total mass of a given macromolecule class from a biomass reaction

In [14]:
def reaction_to_single_macro_mass(reaction, biomass_df, macro):
    # isolate just reactants
    biomass_df = biomass_df[biomass_df.coefficient < 0]
    
    # isolate just the metabolites in that macro group
    macro_df = biomass_df[biomass_df.category == macro]
    
    # multiply the metabolite coefficient in new reaction with the molecular weight from dataframe
    # -1 is needed since all of these coefficients are negative
    masses = [-1 * row.molecular_weight * reaction.get_coefficient(row.metabolite_id) for _, row in macro_df.iterrows()]
    
    return sum(masses)

Test new biomass reactions

In [15]:
# get reactions as variables to plug into function
phenol_reaction = model.reactions.get_by_id('Growth_Phenol')
glucose_reaction = model.reactions.get_by_id('Growth_Glucose')

# get reaction macros
phenol_reaction_macros = {macro:reaction_to_single_macro_mass(phenol_reaction, b_subtilis_biomass_df, macro) for macro in ['protein', 'lipid', 'carbohydrate']}
glucose_reaction_macros = {macro:reaction_to_single_macro_mass(glucose_reaction, b_subtilis_biomass_df, macro) for macro in ['protein', 'lipid', 'carbohydrate']}

# compare to expected macros
print('expected phenol macros', phenol_macros)
print('phenol reaction macros', phenol_reaction_macros)
print('\n')
print('expected glucose macros', glucose_macros)
print('glucose reaction macros', glucose_reaction_macros)

expected phenol macros {'protein': 388.40000000000003, 'lipid': 140.8, 'carbohydrate': 172.10000000000002}
phenol reaction macros {'protein': 388.4123216915718, 'lipid': 140.8, 'carbohydrate': 172.10000000000002}


expected glucose macros {'protein': 240.39999999999998, 'lipid': 402.59999999999997, 'carbohydrate': 146.4}
glucose reaction macros {'protein': 240.40884790662875, 'lipid': 402.59999999999997, 'carbohydrate': 146.4}
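The comparison can also be automated with a tolerance rather than read by eye. The sketch below uses the values printed above; the helper name `macros_match` and the relative tolerance of `1e-3` are assumptions of this example. The protein totals differ slightly from the targets, which may be because the amino acid table and the B. subtilis table list different molecular weights for some amino acids (for example arginine, 174.203 versus 175.212).

```python
import math

# Values copied from the printed output above
expected_phenol = {"protein": 388.40000000000003, "lipid": 140.8,
                   "carbohydrate": 172.10000000000002}
observed_phenol = {"protein": 388.4123216915718, "lipid": 140.8,
                   "carbohydrate": 172.10000000000002}
expected_glucose = {"protein": 240.39999999999998, "lipid": 402.59999999999997,
                    "carbohydrate": 146.4}
observed_glucose = {"protein": 240.40884790662875, "lipid": 402.59999999999997,
                    "carbohydrate": 146.4}

def macros_match(expected, observed, rel_tol=1e-3):
    """Report, per macromolecule class, whether observed is within rel_tol of expected."""
    return {macro: math.isclose(expected[macro], observed[macro], rel_tol=rel_tol)
            for macro in expected}

phenol_check = macros_match(expected_phenol, observed_phenol)
glucose_check = macros_match(expected_glucose, observed_glucose)
print(phenol_check)
print(glucose_check)
```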


### 8. Save model with new biomass reactions

In [16]:
model.id = 'r_opacus_model_B'
model.name = 'R. opacus model B'
cobra.io.write_sbml_model(model, "../models/r_opacus_model_B.xml")
model

0,1
Name,r_opacus_model_B
Memory address,0x07f4d78882810
Number of metabolites,1952
Number of reactions,3021
Number of groups,0
Objective expression,1.0*Growth - 1.0*Growth_reverse_699ae
Compartments,"cytosol, periplasm, extracellular space"


### 9. Save biomass composition data as a csv

In [17]:
# keep only the metabolite name, category, and B. subtilis coefficient columns
summary_df = b_subtilis_biomass_df.copy()[['name', 'category', 'coefficient']]

# rename the B. subtilis coefficient column and make its reactant coefficients positive
summary_df['B. subtilis coefficient'] = -1*summary_df['coefficient']
summary_df.drop('coefficient', axis=1, inplace=True)

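# add phenol and glucose coefficient columns
# (the dictionaries were built by iterating over b_subtilis_biomass_df, so their values are in row order)
summary_df['R.opacus glucose coefficient'] = list(glucose_metabolite_coefficients.values())
summary_df['R.opacus phenol coefficient'] = list(phenol_metabolite_coefficients.values())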

# make all reactant coefficients positive
summary_df['R.opacus glucose coefficient'] = -1*summary_df['R.opacus glucose coefficient']
summary_df['R.opacus phenol coefficient'] = -1*summary_df['R.opacus phenol coefficient']

summary_df.head()

Unnamed: 0,name,category,B. subtilis coefficient,R.opacus glucose coefficient,R.opacus phenol coefficient
0,10-Formyltetrahydrofolate,carbon carrier,0.000216,0.000216,0.000216
1,L-Alanine,protein,0.498716,0.27531,0.45245
2,S-Adenosyl-L-methionine,carbon carrier,0.000216,0.000216,0.000216
3,L-Arginine,protein,0.28717,0.07391,0.135859
4,L-Asparagine,protein,0.234029,0.06517,0.123789
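Assigning dictionary values as a column depends on the dictionary and the dataframe sharing the same row order and length. A more defensive alternative, sketched here on a three-row toy table, is to look each coefficient up by metabolite id. The toy dictionary is keyed by id strings; in the notebook the keys are metabolite objects, so one would first build an id-keyed dictionary, for example from each key's `id` attribute (an assumption of this sketch).

```python
import pandas as pd

# Three rows with values taken from the tables shown in this notebook
toy_biomass_df = pd.DataFrame({
    "metabolite_id": ["ala__L_c", "arg__L_c", "atp_c"],
    "name": ["L-Alanine", "L-Arginine", "ATP"],
    "coefficient": [-0.498716, -0.28717, -52.547151],
})

# id -> coefficient mapping, deliberately listed in a different order
toy_glucose_coefficients = {"atp_c": -52.547151,
                            "ala__L_c": -0.27531,
                            "arg__L_c": -0.07391}

toy_summary_df = toy_biomass_df[["name"]].copy()
toy_summary_df["B. subtilis coefficient"] = -toy_biomass_df["coefficient"]
# map() aligns by metabolite id, so dictionary order no longer matters
toy_summary_df["R.opacus glucose coefficient"] = (
    -toy_biomass_df["metabolite_id"].map(toy_glucose_coefficients)
)
print(toy_summary_df)
```

A metabolite id missing from the dictionary would show up as `NaN` in the column, which is easier to spot than a silent misalignment.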


Save summary dataframe as .csv file

In [18]:
summary_df.to_csv('../data/biomass_composition/r_opacus_biomass_data.csv', index=False)