# Complete Analysis of Carbon Source Variation

© 2019 Griffin Chure. This work is licensed under a [Creative Commons Attribution License CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/). All code contained herein is licensed under an [MIT license](https://opensource.org/licenses/MIT).

--- 

In [45]:
import sys
sys.path.insert(0, '../../')
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import mwc.bayes
import mwc.stats
import mwc.model
import mwc.viz
import bokeh.io
import bokeh.plotting
# `imp` is deprecated (removed in Python 3.12); `importlib` provides the same reload.
import importlib
importlib.reload(mwc.stats)
bokeh.io.output_notebook()
colors = mwc.viz.personal_style()
constants = mwc.model.load_constants()

In this notebook, we complete the analysis of the LacI repressor titration under varying carbon sources. We begin by inferring the fluorescence calibration factor from all of the data, which we model hierarchically. With repressor counts in hand, we then estimate the DNA binding energy for each carbon source and compare it to the value reported in Garcia & Phillips, 2011.

To begin, we load the lineage information needed to compute the calibration factor.

## Computing a fluorescence calibration factor 

In [6]:
# Load the lineage and fold-change data
lineages = pd.read_csv('../../data/compiled_fluctuations.csv')
fc_data = pd.read_csv('../../data/compiled_fold_change.csv')

# Isolate the autofluorescence data. 
auto_data = fc_data[fc_data['strain']=='auto'].copy()

# Load the stan model for hierarchical calibration factor inference. 
model = mwc.bayes.StanModel('../stan/hierarchical_calibration_factor.stan')

Found precompiled model. Loading...
finished!
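
For intuition about what a calibration factor represents, here is a minimal sketch. It is not the hierarchical Stan model loaded above. It assumes, and this is not stated in the notebook, that fluorescent proteins partition binomially between the two daughter cells at division, so that the squared difference in daughter intensities scales linearly with their sum, with a slope equal to the calibration factor. All names and numbers below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth: fluorescence (arbitrary units) per protein.
alpha_true = 150.0

# Simulate divisions: each mother cell holds n_tot proteins,
# which are split binomially (p = 1/2) between two daughters.
n_tot = rng.integers(50, 500, size=20000)
n_1 = rng.binomial(n_tot, 0.5)
n_2 = n_tot - n_1

# Noise-free daughter intensities (autofluorescence assumed already subtracted).
I_1 = alpha_true * n_1
I_2 = alpha_true * n_2

# Under binomial partitioning, <(I_1 - I_2)^2> = alpha * <I_1 + I_2>,
# so the ratio of the two sample means is a simple moment estimate of alpha.
alpha_est = np.mean((I_1 - I_2) ** 2) / np.mean(I_1 + I_2)
print(f"true alpha = {alpha_true:.1f}, estimated alpha = {alpha_est:.1f}")
```

A Bayesian treatment, such as the one used in this notebook, would additionally propagate uncertainty and share information across replicates; this moment estimate only illustrates the scaling under the stated assumption.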


With the data and model loaded, we iterate through each carbon source, assign replicate identifiers, and perform the inference.
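
As a small illustration of how replicate identifiers can be assigned, the sketch below applies `groupby(...).ngroup()` to a toy table. The column names mirror those in the lineage data, but the values are made up for illustration.

```python
import pandas as pd

# Toy lineage table with hypothetical values.
toy = pd.DataFrame({
    'date':   [20190101, 20190101, 20190101, 20190102, 20190102],
    'run_no': [1, 1, 2, 1, 1],
    'I_1':    [10.0, 12.0, 9.0, 11.0, 13.0],
})

# ngroup() numbers each (date, run_no) group from 0 in sorted key order;
# adding 1 yields 1-based identifiers, matching Stan's 1-based indexing.
toy['idx'] = toy.groupby(['date', 'run_no']).ngroup() + 1
print(toy)
```

Here the two rows from run 1 on the first date share identifier 1, the run 2 row gets 2, and both rows from the second date get 3, so `toy['idx'].max()` equals the number of replicates.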

In [46]:
stat_dfs = []

# Iterate through each carbon source
# Group on a scalar key so that `g` is a string (a list key yields tuple keys in pandas >= 2.0).
for g, d in lineages.groupby('carbon'):
    # Assign identifiers on date and run number. 
    d = d.copy()
    d['idx'] = d.groupby(['date', 'run_no']).ngroup() + 1
    
    # Extract the mean autofluorescence value for each replicate
    _auto = auto_data[(auto_data['carbon']==g)].copy()
    _auto['idx'] = _auto.groupby(['date', 'run_number']).ngroup() + 1
    for _g, _d in d.groupby(['date', 'run_no', 'idx']):
        d.loc[d['idx']==_g[2], 'mean_auto'] = _auto[(_auto['date']==_g[0]) & 
                                                 (_auto['run_number']==_g[1])]['mean_yfp'].values.mean()
        
    # Compute the total fluorescence.
    d['I1_tot'] = d['area_1'].values * (d['I_1'].values - d['mean_auto'].values)
    d['I2_tot'] = d['area_2'].values * (d['I_2'].values - d['mean_auto'].values)
    d.dropna(inplace=True)
    # Assemble the data dictionary. 
    data_dict = {'J_exp':d['idx'].max(),
                'N_fluct':len(d),
                'index_1':d['idx'],
                'I_1':d['I1_tot'],
                'I_2':d['I2_tot']}
    
    fit, samples = model.sample(data_dict, iter=3000)
    stats = mwc.stats.compute_statistics(samples, logprob_name='lp__')
    stats['carbon'] = g 
    
    stat_dfs.append(stats)    

Beginning sampling...
finished sampling!
Beginning sampling...
will be corrected to return the positional maximum in the future.
Use 'series.values.argmax' to get the position of the maximum now.
  return getattr(obj, method)(*args, **kwds)
finished sampling!
Beginning sampling...
  ret = ret.dtype.type(ret / rcount)
finished sampling!


KeyError: 0
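
The background subtraction used in the loop above can be checked in isolation. The sketch below uses made-up numbers and the same column names to show that the total fluorescence is the cell area times the autofluorescence-subtracted mean intensity, and that rows lacking a matching autofluorescence value become `NaN` and are removed by `dropna`.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements for two daughter cells per division.
toy = pd.DataFrame({
    'area_1': [2.0, 3.0, 2.5],
    'area_2': [2.0, 2.0, 2.5],
    'I_1':    [300.0, 280.0, 310.0],      # mean pixel intensity, daughter 1
    'I_2':    [290.0, 300.0, 305.0],      # mean pixel intensity, daughter 2
    'mean_auto': [160.0, 160.0, np.nan],  # NaN: no matching autofluorescence replicate
})

# Integrated signal = area * (mean intensity - mean autofluorescence).
toy['I1_tot'] = toy['area_1'] * (toy['I_1'] - toy['mean_auto'])
toy['I2_tot'] = toy['area_2'] * (toy['I_2'] - toy['mean_auto'])

# Rows without an autofluorescence value propagate NaN and are dropped.
clean = toy.dropna()
print(clean[['I1_tot', 'I2_tot']])
```

Because `dropna` silently discards such rows, comparing `len(toy)` with `len(clean)` is a quick way to see how many divisions were excluded for a given carbon source.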

In [48]:
stat_dfs[0]

| | parameter | mode | mean | median | hpd_min | hpd_max | carbon |
|---|---|---|---|---|---|---|---|
| 0 | chain | 2.0 | 2.5 | 2.5 | 1.0 | 4.0 | acetate |
| 1 | chain_idx | 343.0 | 750.5 | 750.5 | 1.0 | 1426.0 | acetate |
| 2 | warmup | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | acetate |
| 3 | divergent__ | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | acetate |
| 4 | energy__ | 20527.126599 | 20532.291535 | 20531.937614 | 20527.5 | 20537.7 | acetate |
| 5 | treedepth__ | 2.0 | 3.037167 | 3.0 | 2.0 | 4.0 | acetate |
| 6 | accept_stat__ | 0.990509 | 0.881293 | 0.955719 | 0.429058 | 1.0 | acetate |
| 7 | stepsize__ | 0.470824 | 0.447244 | 0.468566 | 0.359754 | 0.49209 | acetate |
| 8 | n_leapfrog__ | 7.0 | 8.009333 | 7.0 | 7.0 | 15.0 | acetate |
| 9 | tau_alpha | 2.836981 | 2.68526 | 2.538858 | 0.000628843 | 5.69045 | acetate |


In [26]:
_auto[_auto['idx']==idx]['mean_yfp'].values.mean()

164.29079835768732

In [37]:
_auto[(_auto['date']==_g[0]) & (_auto['run_number']==_g[1])]['mean_yfp'].values.mean()

164.29079835768732