# Time course growth and substrate data ➟ Growth parameters

By Christina Schenk and Garrett Roell

Tested on biodesign_3.7 kernel on jprime



This notebook calculates the growth rate, yield coefficient, and substrate uptake rate of R. opacus cultures growing on phenol and glucose. These experiments were published in [Yoneda (2016)](https://academic.oup.com/nar/article/44/5/2240/2465306) and [Henson (2018)](https://www.sciencedirect.com/science/article/pii/S1096717618300910).

#### Yoneda data: 
* WT 1.0 g/L Glucose, 0.05 g/L ammonium sulfate (**WT-LN-G**) (3 trials)
* Evol33 1.0 g/L Glucose, 0.05 g/L ammonium sulfate (**EVOL33-LN-G**) (3 trials)
* Evol40 1.0 g/L Glucose, 0.05 g/L ammonium sulfate (**EVOL40-LN-G**) (3 trials)

#### Rhiannon 2018 data:
* Metabolomics and OD data for WT Glucose (**WT-G** but mapped to **WT-LN-G** for following notebooks)

#### Henson data:
* WT 0.5 g/L Phenol (3 trials) (**WT-P**)
* PVHG6 0.5 g/L Phenol (3 trials) (**PVHG-P**)

### Method: 
<ol>
<li>Calculate growth rate by finding slope of log(biomass) vs. time</li>
<li>Simulate biomass growth at the measured time points using $X = X_0 e^{\mu t}$</li>
<li>Calculate yield coefficient by finding the slope of the biomass generated vs substrate consumed</li>
<li>Calculate substrate uptake rate by dividing the growth rate by the yield coefficient</li>
</ol>
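
The four steps above can be sketched on synthetic data. This is a minimal illustration rather than the notebook's implementation: the time points, initial biomass `x0`, growth rate `mu_true`, yield `yield_true`, and initial substrate `s0` are made-up values, chosen noise-free so that the fits recover them.

```python
import numpy as np
from scipy.stats import linregress

# Synthetic, noise-free data (all values are illustrative assumptions)
t = np.array([0.0, 2.0, 4.0, 6.0, 8.0])      # hours
x0, mu_true, yield_true = 0.05, 0.25, 0.08   # g/L, 1/hr, g biomass per mmol substrate
s0 = 5.0                                     # mmol/L substrate at t = 0

biomass = x0 * np.exp(mu_true * t)             # X = X0 * exp(mu * t)
substrate = s0 - (biomass - x0) / yield_true   # S = S0 - (X - X0) / Y

# Step 1: growth rate = slope of ln(biomass) vs. time
mu = linregress(t, np.log(biomass)).slope

# Step 2: simulated biomass at the sampling times
biomass_sim = biomass[0] * np.exp(mu * t)

# Step 3: yield = slope of biomass generated vs. substrate consumed
consumed = substrate[0] - substrate
yield_coeff = linregress(consumed, biomass_sim - biomass[0]).slope

# Step 4: specific uptake rate = growth rate / yield
uptake_rate = mu / yield_coeff

print(f"mu = {mu:.3f} 1/hr, Y = {yield_coeff:.3f} g/mmol, q = {uptake_rate:.3f} mmol/(g*hr)")
```

With real measurements the fits will not be exact, so it is worth inspecting the plots produced for each trial before trusting the numbers.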

### Notes:
The glucose concentrations measured in the Yoneda data sets seem unrealistically low, so the growth parameters for the glucose condition are calculated from the Rhiannon 2018 data instead. The resulting values are used in the following prediction notebooks.


### Setup imports

In [1]:
#Python packages:
import numpy as np
import sys
import os
import pandas as pd
import cobra
import math
from scipy.stats import linregress
import matplotlib.pyplot as plt

#### **Load Yoneda and Henson Data**

In [2]:
from edd_utils import login, export_study, export_metadata
# Study to Download
study_slug = 'biodesign_yoneda_set3_reprocessed'#'biodesign_yoneda_set2'#multiomics-data-for-wt-strain-c157'
study_slug2 = 'biodesign_henson2018_reprocessed'#'biodesign_henson2108'
# EDD server
edd_server = 'public-edd.jbei.org'#agilebiofoundry.org'#'edd.jbei.org'
user       = 'schenkch'

In [3]:
session = login(edd_server=edd_server, user=user)



In [None]:
#Export data from EDD as 2 dataframes:
df: pd.DataFrame
df2: pd.DataFrame

filename = None  # TBD: path to a local CSV copy of the Yoneda study

try:
    df = export_study(session, study_slug, edd_server=edd_server)
    df2 = export_study(session, study_slug2, edd_server=edd_server)
except (NameError, AttributeError, KeyError):
    # Fallback: load the Yoneda data from disk (df2 has no disk fallback here)
    try:
        df = pd.read_csv(filename)
    except Exception:
        print(f'ERROR! Alternative loading of data from disk at {filename} failed!')
    else:
        print(f'OK! Alternative loading of data from disk at {filename} was successful.')



##### **Set OD-to-biomass conversion factor**

In [None]:
GRAMS_BIOMASS_PER_LITER_PER_OD = 0.368 # 1 OD = 0.368 g/L of biomass

##### **Yoneda data**

In [None]:
od_df = df[df['Protocol'].str.contains('OD600')].copy()  # copy so the new column is not set on a view
od_df['Biomass Conc'] = GRAMS_BIOMASS_PER_LITER_PER_OD*od_df['Value']

sub_df = df[df['Protocol'].str.contains('HPLC')]

print(f'substrate data has {len(sub_df)} lines')
print(f'OD data has {len(od_df)} lines')
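
The filter-and-convert step can be tried on a toy frame. The rows below are invented; only the column names (`Protocol`, `Hours`, `Value`) and the 0.368 g/L per OD factor come from this notebook. Taking a `.copy()` of the filtered frame is one way to avoid pandas' `SettingWithCopyWarning` when the new column is assigned.

```python
import pandas as pd

GRAMS_BIOMASS_PER_LITER_PER_OD = 0.368  # 1 OD = 0.368 g/L of biomass

# Toy stand-in for an EDD export (values are made up)
toy = pd.DataFrame({
    "Protocol": ["OD600", "HPLC", "OD600"],
    "Hours": [0.0, 0.0, 4.0],
    "Value": [0.1, 1.0, 0.5],
})

# Keep only the OD rows, then add the biomass concentration in g/L
od = toy[toy["Protocol"].str.contains("OD600")].copy()
od["Biomass Conc"] = GRAMS_BIOMASS_PER_LITER_PER_OD * od["Value"]

print(od)
```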

##### **Henson data**

In [None]:
od_Hen_df = df2[df2['Protocol'].str.contains('OD600')].copy()  # copy so the new column is not set on a view
od_Hen_df['Biomass Conc'] = GRAMS_BIOMASS_PER_LITER_PER_OD*od_Hen_df['Value']
sub_Hen_df = df2[df2['Protocol'].str.contains('HPLC')]

print(f'substrate data has {len(sub_Hen_df)} lines')
print(f'OD data has {len(od_Hen_df)} lines')

### Define functions to calculate growth rate, yield coefficient, and substrate uptake rate

`stats_for_trial`: Calculates the parameters for a single trial and plots the data with the fitted lines.
<br>
`stats_for_condtion`: Takes three trial names, calls `stats_for_trial` for each one, and reports the mean and standard deviation of each parameter.

In [None]:
def stats_for_trial(growth_data, substrate_data, molar_mass, display=False, max_time=0, substrate=''):
    
    biomass_values = growth_data['Biomass Conc']
    biomass_times = growth_data['Hours']
    biomass_init = list(biomass_values)[0]

    substrate_values = substrate_data['Value']*1000/molar_mass
    substrate_times = substrate_data['Hours']
    substrate_init = list(substrate_values)[0]
    
    # growth is the slope of log(biomass) vs. time
    growth_rate, _, _, _, _ = linregress(biomass_times, [math.log(val) for val in biomass_values])
    
    # biomass X = X0*e^(μ*t)
    # This is different from above to ensure that there is a biomass value for every substrate measurement
    biomass_sim = [biomass_init*math.exp(growth_rate*time) for time in substrate_times]
    
    # actual consumption = S0 - S
    sub_consumed = [substrate_init - sub_value for sub_value in substrate_values]
    
    # new biomass = X - X0
    biomass_sim_growth = [sim_value - biomass_init for sim_value in biomass_sim ]
    
    # yield is the amount of biomass that can be made from a mmol of substrate
    yield_coeff, _, _, _, _ = linregress(sub_consumed, biomass_sim_growth)

    # S = S0 - (1/yield)*X
    substrate_sim = [substrate_init - 1/yield_coeff*val for val in biomass_sim_growth]
    
    substrate_consumption_rate = (1/yield_coeff) * growth_rate

    fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(8, 5))
    axes[0].plot(biomass_times, biomass_values, 'o', color='black')
    axes[0].plot(substrate_times, biomass_sim, '-', color='black')
    axes[1].plot(substrate_times, substrate_values, 'o', color='blue')
    axes[1].plot(substrate_times, substrate_sim, '-', color='blue')
    axes[0].set_title('Biomass growth')
    axes[1].set_title(f'{substrate} consumption')
    axes[0].set_xlabel('Time (hr)')
    axes[1].set_xlabel('Time (hr)')
    axes[0].set_ylabel('Biomass (g/L)')
    axes[1].set_ylabel(f'{substrate} (mmol/L)')
    fig.tight_layout()
    
    if display:
        print(f'growth_rate = {growth_rate:.3f} hr-1')
        print(f'yield coefficient = {yield_coeff:.3f} g biomass / mmol substrate')
        print(f'substrate consumption rate = {substrate_consumption_rate:.3f} mmol substrate / (g biomass * hr)')

    return growth_rate, yield_coeff, substrate_consumption_rate
    
    
    
def stats_for_condtion(od_df, sub_df, trial_1, trial_2, trial_3, molar_mass, substrate='', max_time=0):
    
    if max_time != 0:
        od_df = od_df[od_df['Hours'] < max_time]
        sub_df = sub_df[sub_df['Hours'] < max_time]
        
    od_1 = od_df[od_df['Line Name'] == trial_1]
    sub_1 = sub_df[sub_df['Line Name'] == trial_1]

    od_2 = od_df[od_df['Line Name'] == trial_2]
    sub_2 = sub_df[sub_df['Line Name'] == trial_2]

    od_3 = od_df[od_df['Line Name'] == trial_3]
    sub_3 = sub_df[sub_df['Line Name'] == trial_3]

    gr_1, yc_1, scr_1 = stats_for_trial(od_1, sub_1, molar_mass, substrate=substrate)
    gr_2, yc_2, scr_2 = stats_for_trial(od_2, sub_2, molar_mass, substrate=substrate)
    gr_3, yc_3, scr_3 = stats_for_trial(od_3, sub_3, molar_mass, substrate=substrate)
    
    growth_rate = np.average([gr_1, gr_2, gr_3])
    yield_coeff = np.average([yc_1, yc_2, yc_3])
    substrate_consumption_rate = np.average([scr_1, scr_2, scr_3])
    growth_rate_std = np.std([gr_1, gr_2, gr_3])
    yield_coeff_std = np.std([yc_1, yc_2, yc_3])
    substrate_consumption_rate_std = np.std([scr_1, scr_2, scr_3])
    
    print(f'growth_rate = {growth_rate:.3f} ± {growth_rate_std:.3f} hr-1')
    print(f'yield coefficient = {yield_coeff:.3f} ± {yield_coeff_std:.3f} g biomass / mmol substrate')
    print(f'substrate consumption rate = {substrate_consumption_rate:.3f} ± {substrate_consumption_rate_std:.3f} mmol substrate / (g biomass * hr)')
    return growth_rate, yield_coeff, substrate_consumption_rate, growth_rate_std, yield_coeff_std, substrate_consumption_rate_std 
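
With only three replicates, it matters which standard deviation is reported. `np.std` defaults to the population form (`ddof=0`), which is what the function above uses; the sample form (`ddof=1`) is larger by a factor of sqrt(3/2) for n = 3. The sketch below uses made-up growth rates purely to show the difference.

```python
import numpy as np

rates = [0.20, 0.22, 0.24]  # hypothetical replicate growth rates (1/hr)

pop_std = np.std(rates)             # ddof=0: divides by n
sample_std = np.std(rates, ddof=1)  # ddof=1: divides by n - 1

print(f"population std = {pop_std:.4f}, sample std = {sample_std:.4f}")
```

Either choice can be defensible; the point is to state which one the reported ± values refer to.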

#### **Create dataframe and add results**

In [None]:
#create dataframe for all results:
indlist = ['growth rate', 'yield coefficient',  'substrate consumption rate', 'growth rate std dev', 'yield coefficient std dev', 'substrate consumption rate std dev']
strainslist = ['WT-LN-G', 'EVOL33-LN-G', 'EVOL40-LN-G', 'WT-P', 'PVHG-P']

In [None]:
growthandsubstrdata = pd.DataFrame(index=strainslist, columns=indlist, dtype=float)

### Yoneda: WT 1.0 g/L Glucose, 0.05 g/L ammonium sulfate - WT-LN-G

In [None]:
WTLNGlist = ['WT-LN-G-R1', 'WT-LN-G-R2', 'WT-LN-G-R3']
od_1 = od_df[od_df['Line Name'].str.contains('WT-LN-G')]
sub_1 = sub_df[sub_df['Line Name'].str.contains('WT-LN-G')]
growth_rate_WTLNG, yield_coeff_WTLNG, substrate_consumption_rate_WTLNG = stats_for_trial(od_1, sub_1, 180.16, display=True, substrate='Glucose')

##### **Add results to dataframe**

In [None]:
growthandsubstrdata.loc['WT-LN-G',:]=[growth_rate_WTLNG, yield_coeff_WTLNG, substrate_consumption_rate_WTLNG, 0, 0, 0]

### Yoneda: Evol33 1.0 g/L Glucose, 0.05 g/L ammonium sulfate - EVOL33-LN-G

In [None]:
od_1 = od_df[od_df['Line Name'].str.contains('EVOL33-LN-G')]
sub_1 = sub_df[sub_df['Line Name'].str.contains('EVOL33-LN-G')]
growth_rate_EVOL33LNG, yield_coeff_EVOL33LNG, substrate_consumption_rate_EVOL33LNG = stats_for_trial(od_1, sub_1, 180.16, display=True, substrate='Glucose')

##### **Add results to dataframe**

In [None]:
growthandsubstrdata.loc['EVOL33-LN-G',:]=[growth_rate_EVOL33LNG, yield_coeff_EVOL33LNG, substrate_consumption_rate_EVOL33LNG, 0, 0, 0]

### Yoneda: Evol40 1.0 g/L Glucose, 0.05 g/L ammonium sulfate - EVOL40-LN-G

In [None]:
od_1 = od_df[od_df['Line Name'].str.contains('EVOL40-LN-G')]
sub_1 = sub_df[sub_df['Line Name'].str.contains('EVOL40-LN-G')]
growth_rate_EVOL40LNG, yield_coeff_EVOL40LNG, substrate_consumption_rate_EVOL40LNG = stats_for_trial(od_1, sub_1, 180.16, display=True, substrate='Glucose')

##### **Add results to dataframe**

In [None]:
growthandsubstrdata.loc['EVOL40-LN-G',:]=[growth_rate_EVOL40LNG, yield_coeff_EVOL40LNG, substrate_consumption_rate_EVOL40LNG, 0, 0, 0]

### Henson: WT 0.5 g/L Phenol - WT-P

In [None]:
od_Hen_1 = od_Hen_df[od_Hen_df['Line Name'].str.contains('WT-P')]
sub_Hen_1 = sub_Hen_df[sub_Hen_df['Line Name'].str.contains('WT-P')]

In [None]:
growth_rate_WTP, yield_coeff_WTP, substrate_consumption_rate_WTP, growth_rate_std_WTP, yield_coeff_std_WTP, substrate_consumption_rate_std_WTP  = stats_for_condtion(od_Hen_1, sub_Hen_1,'WT-P-R1', 'WT-P-R2', 'WT-P-R3', 94.11, substrate='Phenol' ,max_time = 40)

##### **Add results to dataframe**

In [None]:
growthandsubstrdata.loc['WT-P',:]=[growth_rate_WTP, yield_coeff_WTP, substrate_consumption_rate_WTP, growth_rate_std_WTP, yield_coeff_std_WTP, substrate_consumption_rate_std_WTP]

### Henson: PVHG6 0.5 g/L Phenol - PVHG-P

In [None]:
od_Hen_2 = od_Hen_df[od_Hen_df['Line Name'].str.contains('PVHG-P')]
sub_Hen_2 = sub_Hen_df[sub_Hen_df['Line Name'].str.contains('PVHG-P')]

In [None]:
growth_rate_PVHGP, yield_coeff_PVHGP, substrate_consumption_rate_PVHGP, growth_rate_std_PVHGP, yield_coeff_std_PVHGP, substrate_consumption_rate_std_PVHGP  = stats_for_condtion(od_Hen_2, sub_Hen_2, 'PVHG-P-R1', 'PVHG-P-R2', 'PVHG-P-R3', 94.11, substrate='Phenol', max_time=40)

##### **Add results to dataframe**

In [None]:
growthandsubstrdata.loc['PVHG-P',:]=[growth_rate_PVHGP, yield_coeff_PVHGP, substrate_consumption_rate_PVHGP, growth_rate_std_PVHGP, yield_coeff_std_PVHGP, substrate_consumption_rate_std_PVHGP]

### Glucose 2018 data from Rhiannon

#### Load data from EDD

In [None]:
study_slug3 = 'rhodococcus-opacus-pd630-rhiannon-2018'

In [None]:
#Export data from EDD as dataframe:
df3 = export_study(session, study_slug3, edd_server=edd_server)

In [None]:
od_glu_df = df3[df3['Protocol'].str.contains('OD600')].copy()  # copy so the new column is not set on a view
od_glu_df['Biomass Conc'] = GRAMS_BIOMASS_PER_LITER_PER_OD*od_glu_df['Value']

sub_glu_df = df3[df3['Protocol'].str.contains('Other')]#why not HPLC?

print(f'substrate data has {len(sub_glu_df)} lines')
print(f'OD data has {len(od_glu_df)} lines')

#### Calculate growth parameters

In [None]:
growth_rate_WTLNG2, yield_coeff_WTLNG2, substrate_consumption_rate_WTLNG2, growth_rate_std_WTLNG2, yield_coeff_std_WTLNG2, substrate_consumption_rate_std_WTLNG2  = stats_for_condtion(od_glu_df, sub_glu_df, 'WT-R1', 'WT-R2', 'WT-R3', 180.16, substrate='Glucose', max_time=12)

##### **Add results to dataframe**

In [None]:
growthandsubstrdata.loc['WT-G',:]=[growth_rate_WTLNG2, yield_coeff_WTLNG2, substrate_consumption_rate_WTLNG2, growth_rate_std_WTLNG2, yield_coeff_std_WTLNG2, substrate_consumption_rate_std_WTLNG2]

##### **Drop growth parameters that will not be used in following notebooks and map WT-G to WT-LN-G**

In [None]:
growthandsubstrdata = growthandsubstrdata.drop(['WT-LN-G', 'EVOL33-LN-G', 'EVOL40-LN-G'])
growthandsubstrdata = growthandsubstrdata.rename(index={'WT-G':'WT-LN-G'})

In [None]:
growthandsubstrdata

##### **Write data frame to file**

In [None]:
growthandsubstrdata.to_csv('../consumption_and_growth_data/consumption_and_growth_data_new.csv')
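
Downstream notebooks can read the table back with the condition names as the row index. A minimal round-trip sketch, assuming a stand-in table with placeholder values and a temporary directory rather than the real output path:

```python
import os
import tempfile

import pandas as pd

# Stand-in for growthandsubstrdata (values are placeholders, not results)
table = pd.DataFrame(
    {"growth rate": [0.1, 0.2], "yield coefficient": [0.05, 0.04]},
    index=["WT-P", "PVHG-P"],
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "consumption_and_growth_data_new.csv")
    table.to_csv(path)
    # index_col=0 restores the condition names as the row index
    restored = pd.read_csv(path, index_col=0)

print(restored.loc["WT-P", "growth rate"])
```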