# Flux sampling vs oxygen concentration 
To better see how flux shifts within metabolism, we can use two different approaches. First, we can use flux variability analysis (FVA) to determine the range of flux for each reaction. While this shows how much each reaction is able to change, it does not show in which direction. To be more accurate, and to be able to apply statistics to show change, we can use flux sampling. We will follow the work done by [Herrmann et al.](https://www.nature.com/articles/s41540-019-0109-0) and ask some system-wide questions that we cannot answer with regular FBA.

1) How is flux distributed between Rnf and Fix under high and low oxygen concentrations?
2) How does carbon metabolism reorganize to increase energy supply to the ETS and nitrogenase?
3) What is the metabolic cost of increased oxygen on nitrogen fixation?

The three main comparisons between the samples are shifts in the ETS, shifts in the ED and TCA pathways, and shifts in nitrogenase flux.

From the previous work done in ATPM_compare.ipynb, we can show that one pathway, NII_BD_R, limits the excess ATPM in all conditions, but we are still not accurate at the higher oxygen concentrations of 148 and 192 uM. A difference between Fix and Rnf is seen in the ATPM study, but the effect is small and can be studied further by sampling. To be accurate, we will therefore use 108 uM O2 as our high-oxygen condition and 12 uM O2 as our low-oxygen condition, with only NDH II and cyt BD as the ETS path. We use 108 uM not only because it is more accurate, but also because it has growth yields similar to the higher O2 concentrations while still being distinct from the 12 uM condition.

In [1]:
import math
import os
import time
from os.path import join

import cobra
import matplotlib.backends.backend_pdf
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from cobra import Model, Reaction, Metabolite
from cobra.io import save_json_model
from cobra.sampling import OptGPSampler, sample
from scipy.stats import kruskal, mannwhitneyu, wilcoxon

In [2]:
#import model twice to make a high and low model

model_n2_highO2 = cobra.io.load_json_model("../Data/Models/iAA1300_C.json")
model_n2_lowO2 = cobra.io.load_json_model("../Data/Models/iAA1300_C.json")


In [12]:
#change bounds of each model to match a growth rate of ~0.2 at the experimental sucrose uptake and predicted ATPM, while removing NDH I and cytochrome c
#data 12 uM  NII_BD_R: 3.79 suc, 16.25 ATPM
#data 108 uM NII_BD_R: 9.32 suc, 110.8 ATPM     #values used below are rounded


model_n2_highO2.reactions.get_by_id("EX_glc__D_e").lower_bound = 0
model_n2_highO2.reactions.get_by_id("EX_glc__D_e").upper_bound = 0
model_n2_lowO2.reactions.get_by_id("EX_glc__D_e").lower_bound = 0
model_n2_lowO2.reactions.get_by_id("EX_glc__D_e").upper_bound = 0


model_n2_highO2.reactions.get_by_id("EX_sucr_e").lower_bound = -9
model_n2_highO2.reactions.get_by_id("EX_sucr_e").upper_bound = 0
model_n2_lowO2.reactions.get_by_id("EX_sucr_e").lower_bound = -4
model_n2_lowO2.reactions.get_by_id("EX_sucr_e").upper_bound = 0

model_n2_highO2.reactions.get_by_id("PHBS_syn").upper_bound = 1000
model_n2_highO2.reactions.get_by_id("PHBS_syn").lower_bound = 0
model_n2_lowO2.reactions.get_by_id("PHBS_syn").upper_bound = 1000
model_n2_lowO2.reactions.get_by_id("PHBS_syn").lower_bound = 0

model_n2_highO2.reactions.get_by_id("ATPM").lower_bound = 110
model_n2_highO2.reactions.get_by_id("ATPM").upper_bound = 110
model_n2_lowO2.reactions.get_by_id("ATPM").lower_bound = 16
model_n2_lowO2.reactions.get_by_id("ATPM").upper_bound = 16


model_n2_highO2.reactions.get_by_id("O2tpp").lower_bound = 0
model_n2_highO2.reactions.get_by_id("O2tpp").upper_bound = 1000
model_n2_lowO2.reactions.get_by_id("O2tpp").lower_bound = 0
model_n2_lowO2.reactions.get_by_id("O2tpp").upper_bound = 1000

model_n2_highO2.reactions.get_by_id("NADH6").lower_bound = 0
model_n2_highO2.reactions.get_by_id("NADH6").upper_bound = 0
model_n2_lowO2.reactions.get_by_id("NADH6").lower_bound = 0
model_n2_lowO2.reactions.get_by_id("NADH6").upper_bound = 0

model_n2_highO2.reactions.get_by_id("CYOO2pp").lower_bound = 0
model_n2_highO2.reactions.get_by_id("CYOO2pp").upper_bound = 0
model_n2_lowO2.reactions.get_by_id("CYOO2pp").lower_bound = 0
model_n2_lowO2.reactions.get_by_id("CYOO2pp").upper_bound = 0
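
The cell above sets every bound twice, once per model. A minimal sketch of a more compact alternative follows, assuming cobrapy's `Reaction.bounds` setter. The names `SHARED`, `PER_CONDITION`, and `apply_bounds` are hypothetical and introduced here only for illustration; the values are copied from the cell above.

```python
# Bounds shared by both conditions: (lower_bound, upper_bound)
SHARED = {
    "EX_glc__D_e": (0, 0),    # no glucose uptake
    "PHBS_syn": (0, 1000),
    "O2tpp": (0, 1000),
    "NADH6": (0, 0),          # blocked, as in the cell above
    "CYOO2pp": (0, 0),        # blocked, as in the cell above
}

# Bounds that differ between the high and low oxygen models
PER_CONDITION = {
    "highO2": {"EX_sucr_e": (-9, 0), "ATPM": (110, 110)},
    "lowO2": {"EX_sucr_e": (-4, 0), "ATPM": (16, 16)},
}


def apply_bounds(model, bounds):
    """Set (lower, upper) bounds on each listed reaction of a cobra model."""
    for rxn_id, (lower, upper) in bounds.items():
        # Assigning both bounds at once avoids a transient lower > upper state
        model.reactions.get_by_id(rxn_id).bounds = (lower, upper)


# Usage with the two models loaded above:
# apply_bounds(model_n2_highO2, {**SHARED, **PER_CONDITION["highO2"]})
# apply_bounds(model_n2_lowO2, {**SHARED, **PER_CONDITION["lowO2"]})
```

Keeping the numbers in one table makes it easier to check that the two models differ only in sucrose uptake and ATPM.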

Let's run FBA under these conditions.


In [13]:
solution_model_n2_highO2 = model_n2_highO2.optimize()

u = solution_model_n2_highO2.objective_value
DT = 0.69314718056/u

#write flux to csv using pandas
df= pd.DataFrame.from_dict([solution_model_n2_highO2.fluxes]).T
df.to_csv('../Data/Sampling_O2/FBA_results/model_n2_highO2.csv')

#Output growth rate
print('High O2 sucrose growth rate: ' ,'%.3f' %u, ' 1/h' ' or a doubling time of ' ,'%.3f' %DT, 'hrs')
model_n2_highO2.summary()

High O2 sucrose growth rate:  0.201  1/h or a doubling time of  3.450 hrs


Metabolite,Reaction,Flux,C-Number,C-Flux
ca2_e,EX_ca2_e,0.001046,0,0.00%
cl_e,EX_cl_e,0.001046,0,0.00%
cobalt2_e,EX_cobalt2_e,5.023e-06,0,0.00%
cu2_e,EX_cu2_e,0.0001425,0,0.00%
fe2_e,EX_fe2_e,0.001658,0,0.00%
fe3_e,EX_fe3_e,0.001569,0,0.00%
h_e,EX_h_e,0.2507,0,0.00%
k_e,EX_k_e,0.03922,0,0.00%
mg2_e,EX_mg2_e,0.001743,0,0.00%
mn2_e,EX_mn2_e,0.0001388,0,0.00%

Metabolite,Reaction,Flux,C-Number,C-Flux
4crsol_c,DM_4crsol_c,-4.481e-05,7,0.00%
5drib_c,DM_5drib_c,-4.521e-05,5,0.00%
amob_c,DM_amob_c,-4.019e-07,15,0.00%
ade_e,EX_ade_e,-4.521e-05,5,0.00%
co2_e,EX_co2_e,-99.98,1,100.00%
dxylnt_e,EX_dxylnt_e,-0.0001344,5,0.00%
fald_e,EX_fald_e,-4.019e-07,1,0.00%
h2o_e,EX_h2o_e,-93.07,0,0.00%
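
The constant `0.69314718056` in the cell above is ln(2), so the doubling time is ln(2) divided by the growth rate. A small sketch of the same calculation follows; `doubling_time` is a hypothetical helper name, not part of the original notebook.

```python
import math


def doubling_time(mu):
    """Return the doubling time in hours for a specific growth rate mu (1/h)."""
    if mu <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / mu


# Using the rounded growth rate printed above; the notebook uses the
# unrounded objective value, so the last digits can differ slightly.
print(f"{doubling_time(0.201):.2f} h")
```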


In [14]:
solution_model_n2_lowO2 = model_n2_lowO2.optimize()

u = solution_model_n2_lowO2.objective_value
DT = 0.69314718056/u

#write flux to csv using pandas
df= pd.DataFrame.from_dict([solution_model_n2_lowO2.fluxes]).T
df.to_csv('../Data/Sampling_O2/FBA_results/model_n2_lowO2.csv')

#Output growth rate
print('Low O2 sucrose growth rate: ' ,'%.3f' %u, ' 1/h' ' or a doubling time of ' ,'%.3f' %DT, 'hrs')
model_n2_lowO2.summary()

Low O2 sucrose growth rate:  0.219  1/h or a doubling time of  3.158 hrs


Metabolite,Reaction,Flux,C-Number,C-Flux
ca2_e,EX_ca2_e,0.001142,0,0.00%
cl_e,EX_cl_e,0.001142,0,0.00%
cobalt2_e,EX_cobalt2_e,5.487e-06,0,0.00%
cu2_e,EX_cu2_e,0.0001556,0,0.00%
fe2_e,EX_fe2_e,0.001811,0,0.00%
fe3_e,EX_fe3_e,0.001714,0,0.00%
h_e,EX_h_e,0.2738,0,0.00%
k_e,EX_k_e,0.04284,0,0.00%
mg2_e,EX_mg2_e,0.001904,0,0.00%
mn2_e,EX_mn2_e,0.0001517,0,0.00%

Metabolite,Reaction,Flux,C-Number,C-Flux
4crsol_c,DM_4crsol_c,-4.894e-05,7,0.00%
5drib_c,DM_5drib_c,-4.938e-05,5,0.00%
amob_c,DM_amob_c,-4.389e-07,15,0.00%
ade_e,EX_ade_e,-4.938e-05,5,0.00%
co2_e,EX_co2_e,-39.24,1,100.00%
dxylnt_e,EX_dxylnt_e,-0.0001468,5,0.00%
fald_e,EX_fald_e,-4.389e-07,1,0.00%
h2o_e,EX_h2o_e,-37.52,0,0.00%


From cobra.sampling we will use the sample function to draw samples for each condition. As this is a very computationally expensive process, the cells below are just a demo at 10,000 samples. Larger runs require a computer cluster and can be found in the sampling.py script.

In [16]:
S_model_n2_highO2 = sample(model_n2_highO2, n=10000, thinning=1000,  processes=8)

In [18]:
S_model_n2_highO2.to_csv("../Data/Sampling_O2/Sampling_results/model_n2_highO2_samples.csv")

In [19]:
S_model_n2_lowO2 = sample(model_n2_lowO2, n=10000, thinning=1000,  processes=8)

In [20]:
S_model_n2_lowO2.to_csv("../Data/Sampling_O2/Sampling_results/model_n2_lowO2_samples.csv")
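
To make the `n` and `thinning` arguments used above concrete: as we understand the cobrapy sampler, `thinning=1000` means only every 1000th point of the underlying chain is returned, which reduces correlation between consecutive samples. The sketch below is a toy coordinate-wise random walk inside a box, not the OptGP algorithm that `sample` uses by default, and `toy_sampler` is a hypothetical name. It shows only how thinning turns `n * thinning` iterations into `n` kept points.

```python
import random


def toy_sampler(lower, upper, n, thinning, seed=0):
    """Random walk inside a box that keeps every `thinning`-th point."""
    rng = random.Random(seed)
    # Start from the centre of the box
    point = [(lo + hi) / 2 for lo, hi in zip(lower, upper)]
    kept = []
    for step in range(1, n * thinning + 1):
        j = rng.randrange(len(point))                # pick one coordinate
        point[j] = rng.uniform(lower[j], upper[j])   # move along it within bounds
        if step % thinning == 0:                     # thinning: keep 1 of every k points
            kept.append(tuple(point))
    return kept


# Two hypothetical "reactions" with bounds [0, 10] and [-1, 1]
samples = toy_sampler(lower=[0.0, -1.0], upper=[10.0, 1.0], n=5, thinning=100)
print(len(samples))
```

With `n=10000` and `thinning=1000`, as in the cells above, the chain length is what makes the run slow, not the number of rows written to the CSV.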

To visualize some of the data, we will use the histogram script from Herrmann et al. as a template.

In [28]:
# Plotting function, modified slightly from
# https://github.com/HAHerrmann/FluxSamplingComparison/blob/master/ArabidopsisStudy/FluxSamplingAnalysisArabidopsis.ipynb
def make_svg(i,samples1,samples2,x1,x2,Condition):
    #print(i)
    x = samples1[i]
    y = samples2[i]
    bns = 100
    
    weights_x = np.ones_like(x)/float(len(x))
    weights_y = np.ones_like(y)/float(len(y))
    
    xn, xbins, xpatches = plt.hist(x, bins=bns, alpha=0.5,  color = "#b05abd", lw=0)
    yn, ybins, ypatches = plt.hist(y, bins=bns, alpha=0.5,  color = "#c7783b", lw=0)
    
    #plot FBA results as vertical line
    plt.axvline(x = x1,linewidth=2, linestyle='--', color="#b05abd")
    plt.axvline(x = x2,linewidth=2, linestyle='--', color="#c7783b")
    
    #determine if distributions are different
    #note: wilcoxon is the paired signed-rank test, so it pairs sample k of x with sample k of y
    stat, p = wilcoxon(x, y)
    print("Reaction {}".format(i), p)
    alpha = 0.05
    if p > alpha:
        print('Same distribution')
    else:
        print('Different distribution')
        
    plt.xticks([round(min(min(x),min(y)),3),round(max(max(x),max(y)),3)],fontsize=30)
    plt.yticks([],fontsize=30)
    plt.title("{}".format(i),y=0.99)
    plt.tight_layout()
    plt.savefig("../Outputs/Sampling_O2/{0:}/{0:}_{1:}_Sample_FBA.tiff".format(Condition,i), dpi= 60, format="tiff")
    plt.close()
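
With 10,000 samples per condition, almost any shift gives a very small p-value, so an effect size can be more informative than the p-value alone. The sketch below computes the Mann-Whitney U statistic for two independent samples, and the fraction of pairs in which a draw from `x` exceeds a draw from `y`. It is an illustration in plain Python, not the test used in the function above, and `mann_whitney_u` is a hypothetical name. For p-values, `scipy.stats.mannwhitneyu` is already imported in the first cell.

```python
from bisect import bisect_left, bisect_right


def mann_whitney_u(x, y):
    """Return (U, effect) for two independent samples x and y.

    U counts the pairs (xi, yj) with xi > yj; ties count as 0.5.
    effect = U / (len(x) * len(y)) estimates the probability that a
    random draw from x exceeds one from y (0.5 means no shift).
    """
    ys = sorted(y)
    u = 0.0
    for xi in x:
        below = bisect_left(ys, xi)           # y values strictly smaller than xi
        ties = bisect_right(ys, xi) - below   # y values equal to xi
        u += below + 0.5 * ties
    return u, u / (len(x) * len(y))


# Made-up flux values for illustration only
low = [1.0, 1.2, 1.1]
high = [2.0, 2.1, 1.9]
print(mann_whitney_u(high, low))
```

An effect near 0 or 1 indicates that the two flux distributions barely overlap, while a value near 0.5 indicates a shift that is small even if the p-value is tiny.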

In [29]:
for i in [ "NADH5", "RNF", "FIX", "NIT1b", "CYTBDpp", "ATPS4rpp",
         "PDH", "GLNS", "ICL", "ICDHyr", "NAD_H2", "HYD1pp", "GLCDpp", "EDD", 
         "HEX1", "GND", "EDD", "CS", "ICDHyr", "AKGDH", "GLUSy", "PPCK", "MDH2", "ME1"]: 
    x1 = solution_model_n2_lowO2.fluxes[i]
    x2 = solution_model_n2_highO2.fluxes[i]
    make_svg(i,S_model_n2_lowO2,S_model_n2_highO2,x1,x2,"High_Low_O2")

Reaction NADH5 0.0
Different distribution
Reaction RNF 1.2197089319534001e-23
Different distribution
Reaction FIX 0.0
Different distribution
Reaction NIT1b 0.0
Different distribution
Reaction CYTBDpp 0.0
Different distribution
Reaction ATPS4rpp 0.0
Different distribution
Reaction PDH 6.9171476245858966e-18
Different distribution
Reaction GLNS 5.975236120659042e-230
Different distribution
Reaction ICL 0.0
Different distribution
Reaction ICDHyr 0.0
Different distribution
Reaction NAD_H2 2.7284521293583587e-24
Different distribution
Reaction HYD1pp 7.528618136315805e-57
Different distribution
Reaction GLCDpp 6.642924481450255e-24
Different distribution
Reaction EDD 0.0
Different distribution
Reaction HEX1 1.654437065820147e-06
Different distribution
Reaction GND 0.0
Different distribution
Reaction EDD 0.0
Different distribution
Reaction CS 0.0
Different distribution
Reaction ICDHyr 0.0
Different distribution
Reaction AKGDH 0.0
Different distribution
Reaction GLUSy 3.0265043814480384e-195
