# Calculate low and high stock solution concentrations given a standard media recipe

This notebook calculates stock concentrations for a media optimization project. Given a standard media recipe and the concentration ranges to be explored, it generates sets of low and high stock concentrations such that the media can be prepared without dilutions, which reduces the number of operations (and hence time) and the number of pipette tips needed.


Tested using **ART 3.9.4** kernel on jprime.lbl.gov

## Inputs and outputs

#### Required files to run this notebook:
   - `../data/standard_recipe_concentrations.csv`
   
   A file with the standard media recipe. This file also contains a column with **solubility limits** for each component. 
   
An example of the file content:

| Component | Concentration | Solubility |
|------|------|------|
| MOPS[mM] | 40 | 2389.37 |
| H3BO3[mM] | 0.004 | 700 |
| K2SO4[mM] | 0.29 | 636.98 |

   - `../data/Putida_media_bounds.csv`
   
   A file containing upper and lower bounds for media components to be explored.


#### Files generated by running this notebook:

   - `stock_concentrations.csv`
    
stored in the user-defined output directory (`output_file_path`).

## Setup

Importing needed libraries:

In [5]:
import sys
sys.path.append('../../media_compiler')


import string
import pandas as pd
import numpy as np
import scipy

from pyDOE import lhs

import core


### User parameters

In [6]:
user_params = {
    'standard_media_file': '../flaviolin data/standard_recipe_concentrations.csv',  
    'output_file_path': '../flaviolin data/', # Folder for output files
    'stock_conc_filename': 'stock_concentrations.csv', # Name of the file containing stock concentrations
    'bounds_file': '../flaviolin data/Putida_media_bounds_all_components.csv', # name of the file with bounds needed for ART
    'well_volume': 1500,            # Total volume of the media content+culture in the well
    'min_volume_transfer': 5,       # Minimal transfer volume of the liquid handler
    'culture_factor': 100,          # Dilution factor for culture, e.g. 100x, 1000x
    } 

In [7]:
# user_params = {
#     'standard_media_file': '../flaviolin data/standard_recipe_concentrations_extended.csv',  
#     'output_file_path': '../flaviolin data/', # Folder for output files
#     'stock_conc_filename': 'stock_concentrations_extended.csv', # Name of the file containing stock concentrations
#     'bounds_file': '../flaviolin data/Putida_media_bounds_extended_6p1.csv', # name of the file with bounds needed for ART
#     'well_volume': 1500,            # Total volume of the media content+culture in the well
#     'min_volume_transfer': 5,       # Minimal transfer volume of the liquid handler
#     'culture_factor': 100,          # Dilution factor for culture, e.g. 100x, 1000x
#     } 

In [8]:
culture_volume = user_params['well_volume'] / user_params['culture_factor']


Read the standard media recipe concentrations

In [9]:
df_stand = pd.read_csv(user_params['standard_media_file'])
df_stand = df_stand.set_index("Component")
df_stand

Component,Concentration,Solubility
MOPS[mM],40.0,1700.0
Tricine[mM],4.0,500.08
H3BO3[mM],0.004,700.0
Glucose[mM],20.0,5045.63
K2SO4[mM],0.29,636.98
K2HPO4[mM],1.32,8564.84
FeSO4[mM],0.01,1645.73
NH4Cl[mM],9.52,6543.28
MgCl2[mM],0.52,569.27
NaCl[mM],50.0,6160.16


In [11]:
if 'bounds_file' in user_params:
    df_bounds = pd.read_csv(user_params['bounds_file'])
    df_bounds = df_bounds.set_index("Variable")
    display(df_bounds)
else:
    print("Please provide the correct path to bounds file.")

Variable,Min,Max
MOPS[mM],40.0,40.0
Tricine[mM],4.0,4.0
H3BO3[mM],0.0004,0.08
Glucose[mM],20.0,20.0
K2SO4[mM],0.01,1.0
K2HPO4[mM],0.264,13.2
FeSO4[mM],0.001,0.1
NH4Cl[mM],6.4,47.6
MgCl2[mM],0.026,2.6
NaCl[mM],5.0,500.0


## Find a set of low level stock concentrations that can achieve the lowest levels of target concentrations

$$c_s=\frac{c_{t_{\min}} \cdot V_\text{well}}{V_{\min}}$$

In [12]:
min_tip_volume = user_params['min_volume_transfer']
df_low = pd.DataFrame(
    index=df_stand.index,
    columns=["Stock Concentration", "Target Concentration"])
df_low["Target Concentration"] = df_bounds['Min']
df_low["Stock Concentration"] = df_low["Target Concentration"]*user_params['well_volume']/min_tip_volume
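
As a quick check of the formula above, the following sketch reproduces a few of the low stock concentrations by hand. It uses plain Python instead of the notebook's DataFrames; the helper `low_stock_concentration` and the dictionary `target_min` are illustrative names, and the values are copied from the `Min` column of the bounds table.

```python
# Minimal sketch of c_s = c_t_min * V_well / V_min (names are illustrative).
WELL_VOLUME_UL = 1500.0   # 'well_volume' in user_params
MIN_TRANSFER_UL = 5.0     # 'min_volume_transfer' in user_params


def low_stock_concentration(target_min, well_volume=WELL_VOLUME_UL,
                            min_transfer=MIN_TRANSFER_UL):
    """Stock concentration that delivers target_min with one minimal transfer."""
    return target_min * well_volume / min_transfer


# Lower bounds copied from the bounds table (three components only)
target_min = {"H3BO3[mM]": 0.0004, "K2SO4[mM]": 0.01, "NaCl[mM]": 5.0}
stock_low = {name: low_stock_concentration(c) for name, c in target_min.items()}

for name, conc in stock_low.items():
    print(f"{name}: {conc:.6g}")
```

The three results (0.12, 3 and 1500 mM) agree with the corresponding `Stock Concentration` entries of `df_low`.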


### Check solubility 

Increase the transfer volume, in increments of 5 uL, for the components whose stock concentrations exceed the solubility limit (there is no need for every transfer to be the minimal volume). After $k$ increments for component $i$:

$$c^i_{s}=\frac{c^i_{t_{\min}} \cdot V_\text{well}}{V_{\min}+5k}$$

In [13]:
if 'Solubility' in df_stand.columns:
    
    nonsol_comp_low = core.check_solubility(df_low, solubility=df_stand['Solubility'])
    volume_transfer = min_tip_volume

    i = 0
    while len(nonsol_comp_low) > 0:    
        print(f'  Iteration {i}\n')
        volume_transfer += min_tip_volume

        for comp in nonsol_comp_low:
            df_low.at[comp,"Stock Concentration"] = df_low.at[
                comp,"Target Concentration"
            ]*user_params['well_volume']/volume_transfer

        nonsol_comp_low = core.check_solubility(df_low, solubility=df_stand['Solubility'])
        i += 1
    
else:
    print('Solubility values are not provided and it is assumed the limits are not reached.')
    

Components for which those concentrations are not soluble:
	MOPS[mM]
	Tricine[mM]
	Glucose[mM]
  Iteration 0

Components for which those concentrations are not soluble:
	MOPS[mM]
	Tricine[mM]
  Iteration 1

Components for which those concentrations are not soluble:
	MOPS[mM]
  Iteration 2

Components for which those concentrations are not soluble:
	MOPS[mM]
  Iteration 3

Components for which those concentrations are not soluble:
	MOPS[mM]
  Iteration 4

Components for which those concentrations are not soluble:
	MOPS[mM]
  Iteration 5

Components for which those concentrations are not soluble:
	MOPS[mM]
  Iteration 6
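
The iterations above can be traced per component with a small standalone sketch. It does not call `core.check_solubility`; instead it assumes that a stock is "not soluble" when its concentration is strictly greater than the solubility limit, which may differ from the library's exact rule. The function name `lowest_soluble_stock` is hypothetical.

```python
def lowest_soluble_stock(target, solubility, well_volume=1500.0, step=5.0):
    """Return (stock concentration, transfer volume) after raising the
    transfer volume in `step` increments until the stock is soluble.

    Assumption: 'soluble' means stock <= solubility.
    """
    volume = step
    stock = target * well_volume / volume
    while stock > solubility:
        volume += step
        if volume > well_volume:
            # Guard added for the sketch: the whole well would be one component
            raise ValueError("target cannot be reached within the well volume")
        stock = target * well_volume / volume
    return stock, volume


# (target, solubility) pairs copied from the tables above
for name, target, solubility in [
    ("MOPS[mM]", 40.0, 1700.0),
    ("Tricine[mM]", 4.0, 500.08),
    ("Glucose[mM]", 20.0, 5045.63),
]:
    stock, volume = lowest_soluble_stock(target, solubility)
    print(f"{name}: stock {stock:g} mM, transfer {volume:g} uL")
```

Under that assumption MOPS settles at 1500 mM with a 40 uL transfer, Tricine at 400 mM with 15 uL and Glucose at 3000 mM with 10 uL, consistent with the number of iterations printed above.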



Check if all volumes are larger than the minimal transfer volume (5 uL)

In [60]:
df_low

Component,Stock Concentration,Target Concentration
MOPS[mM],1500.0,40.0
Tricine[mM],400.0,4.0
H3BO3[mM],0.12,0.0004
Glucose[mM],3000.0,20.0
K2SO4[mM],3.0,0.01
K2HPO4[mM],79.2,0.264
FeSO4[mM],0.3,0.001
NH4Cl[mM],1920.0,6.4
MgCl2[mM],7.8,0.026
NaCl[mM],1500.0,5.0


In [61]:
# Correct the MOPS stock concentration to the actual concentration of the prepared stock
df_low.loc['MOPS[mM]', 'Stock Concentration'] = 1700

In [62]:
EPS = 0.000001
volumes, df = core.find_volumes(
    user_params['well_volume'], 
    components=df_low.index[:-1],
    stock_conc_val=df_low['Stock Concentration'].values[:-1], 
    target_conc_val=df_low['Target Concentration'].values[:-1],
    culture_ratio=user_params['culture_factor']
)
assert (df['Volumes[uL]'].values >= min_tip_volume - EPS).all(), f"Not all volumes are >={min_tip_volume}uL!"

In [63]:
df

Component,Stock Concentration,Target Concentration,Volumes[uL]
MOPS[mM],1700.0,40.0,35.294118
Tricine[mM],400.0,4.0,15.0
H3BO3[mM],0.12,0.0004,5.0
Glucose[mM],3000.0,20.0,10.0
K2SO4[mM],3.0,0.01,5.0
K2HPO4[mM],79.2,0.264,5.0
FeSO4[mM],0.3,0.001,5.0
NH4Cl[mM],1920.0,6.4,5.0
MgCl2[mM],7.8,0.026,5.0
NaCl[mM],1500.0,5.0,5.0
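
The internals of `core.find_volumes` are not shown here, but the volumes in the table are consistent with the simple relation $V_i = c_t \cdot V_\text{well} / c_s$. The sketch below, with the hypothetical helper `transfer_volume`, recomputes three rows under that assumption.

```python
def transfer_volume(stock, target, well_volume=1500.0):
    """Volume (uL) of stock needed to reach `target` in a well of `well_volume` uL."""
    return target * well_volume / stock


# (stock, target) pairs copied from the table above
rows = {
    "MOPS[mM]": (1700.0, 40.0),
    "Tricine[mM]": (400.0, 4.0),
    "Glucose[mM]": (3000.0, 20.0),
}
volumes_ul = {name: transfer_volume(s, t) for name, (s, t) in rows.items()}

for name, vol in volumes_ul.items():
    print(f"{name}: {vol:.6f} uL")
```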


Round to 6 digits after decimal point

In [64]:
num_digits = 6
conc = np.array([round(num, num_digits) for num in list(df_low['Stock Concentration'].values)])
df_low['Stock Concentration'] = conc


## Find a set of high level stock concentrations that can achieve the highest levels of target concentrations

Find stock concentrations for the upper limit in the range to explore.

In [100]:
df_high_real = pd.read_csv('../flaviolin data/24-well_stock_plate_high.csv')
df_high_real = df_high_real.drop(['Well'], axis=1)
df_high_real = df_high_real.set_index('Component')
# rename() relabels the index by default, so the column mapping must be passed via columns=
df_high_real = df_high_real.rename(columns={'Concentration': 'Stock Concentration'})
df_high_real["Target Concentration"] = df_bounds['Max']
df_high_real["Solubility"] = df_stand['Solubility']
df_high_real.loc['CaCl2[mM]', 'Target Concentration'] = 0.005
df_high_real.loc['CaCl2[mM]', 'Solubility'] = 999.86
df_high_real.loc['Kan[g/l]', 'Target Concentration'] = 0.05
df_high_real.loc['Kan[g/l]', 'Solubility'] = 50
df_high_real

Component,Stock Concentration,Target Concentration,Solubility
MOPS[mM],1700.0,40.0,1700.0
Tricine[mM],400.0,4.0,500.08
H3BO3[mM],2.4,0.08,700.0
Glucose[mM],3000.0,20.0,5045.63
K2SO4[mM],43.5,1.0,636.98
K2HPO4[mM],396.0,13.2,8564.84
FeSO4[mM],6.0,0.1,1645.73
NH4Cl[mM],1900.0,47.6,6543.28
MgCl2[mM],15.6,2.6,569.27
NaCl[mM],1500.0,500.0,6160.16


In [97]:
df_high = df_low.copy()
df_high["Target Concentration"] = df_bounds['Max']
df_high["Solubility"] = df_stand['Solubility']
df_high.iloc[-1,:] = [15, 0.05, 30] # add these for kan

In [101]:
df_high = df_high_real.copy()


In [102]:
# df_high.iloc[-1,:] = [15, 0.05, 30] # add these for kan

In [103]:
df_high

Component,Stock Concentration,Target Concentration,Solubility
MOPS[mM],1700.0,40.0,1700.0
Tricine[mM],400.0,4.0,500.08
H3BO3[mM],2.4,0.08,700.0
Glucose[mM],3000.0,20.0,5045.63
K2SO4[mM],43.5,1.0,636.98
K2HPO4[mM],396.0,13.2,8564.84
FeSO4[mM],6.0,0.1,1645.73
NH4Cl[mM],1900.0,47.6,6543.28
MgCl2[mM],15.6,2.6,569.27
NaCl[mM],1500.0,500.0,6160.16


Check whether feasible volumes exist for the stock concentrations above at the upper target concentrations:

In [104]:
try:
    volumes, df = core.find_volumes(
        user_params['well_volume'],
        components=df_high.index[:-1],
        stock_conc_val=df_high['Stock Concentration'].values[:-1], 
        target_conc_val=df_high['Target Concentration'].values[:-1],
        culture_ratio=user_params['culture_factor']
    )
    feasible_volumes = True
    assert (df['Volumes[uL]'].values >= min_tip_volume - EPS).all(), f"Not all volumes are >={min_tip_volume}uL!"
except AssertionError:
    feasible_volumes = False
    print(core.NoFeasibleVolumesWarn())
    

No feasible volumes are found!


In [105]:
df_high

Component,Stock Concentration,Target Concentration,Solubility
MOPS[mM],1700.0,40.0,1700.0
Tricine[mM],400.0,4.0,500.08
H3BO3[mM],2.4,0.08,700.0
Glucose[mM],3000.0,20.0,5045.63
K2SO4[mM],43.5,1.0,636.98
K2HPO4[mM],396.0,13.2,8564.84
FeSO4[mM],6.0,0.1,1645.73
NH4Cl[mM],1900.0,47.6,6543.28
MgCl2[mM],15.6,2.6,569.27
NaCl[mM],1500.0,500.0,6160.16


### Find feasible volumes

Increase the current stock concentrations in 5-fold increments, each time choosing the component that is furthest from its solubility limit:

In [106]:
if not feasible_volumes:
    print("No feasible volumes")
    
    MULTIPL_FACTOR = 5

    success = False
    df = df_high.copy()

    i = 0
    while success is False:
        i += 1
        comp = None

        # Find ratios of solubility over current stock concentrations
        df['Ratio'] = df['Solubility'].values / df['Stock Concentration'].values
        
        # Find which component is the furthest away from the solubility limit
        while comp is None:
            if any(df['Ratio'] > MULTIPL_FACTOR):
                comp = df['Ratio'].idxmax()
            else:
                MULTIPL_FACTOR /= 2
        
        # Increase the current stock concentration by a factor
        df.at[comp, 'Stock Concentration'] *= MULTIPL_FACTOR

        # Find if there are feasible volumes for such stock and target concentrations
        try:
            volumes, df_high = core.find_volumes(
                user_params['well_volume'], 
                components=df.index,
                stock_conc_val=df['Stock Concentration'].values, 
                target_conc_val=df['Target Concentration'].values,
                culture_ratio=user_params['culture_factor']
            )
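            success = True
            print(f'Iteration {i}:')
            print('Success!')
        except Exception:
            # Assumes find_volumes raises when no feasible volumes exist; keep iterating.
            # Catching Exception (not a bare except) keeps the loop interruptible.
            pass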
else:
    df_high = df.copy()
    
df_high["Solubility"] = df_stand['Solubility']


No feasible volumes
Iteration 1:
Success!
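
The selection step inside the loop above can be isolated as a small function. This is a sketch in plain Python, not the notebook's code: `raise_furthest_stock` is a hypothetical name, and the guard against a factor of 1 or less is an addition that the original loop does not have. The example uses three rows of the table above purely as an illustration.

```python
def raise_furthest_stock(stock, solubility, factor=5.0):
    """Multiply the stock of the component with the largest
    solubility/stock ratio by `factor`.

    `factor` is halved until at least one ratio exceeds it, mirroring
    the loop above. Returns (component, factor used, updated stocks).
    """
    ratios = {name: solubility[name] / stock[name] for name in stock}
    while not any(r > factor for r in ratios.values()):
        factor /= 2
        if factor <= 1:
            # Added guard: a factor <= 1 would no longer increase the stock
            raise ValueError("no component can be concentrated further")
    name = max(ratios, key=ratios.get)
    updated = dict(stock)
    updated[name] = stock[name] * factor
    return name, factor, updated


stock = {"K2SO4[mM]": 43.5, "MgCl2[mM]": 15.6, "NaCl[mM]": 1500.0}
solubility = {"K2SO4[mM]": 636.98, "MgCl2[mM]": 569.27, "NaCl[mM]": 6160.16}

name, used_factor, updated = raise_furthest_stock(stock, solubility)
print(name, used_factor, updated[name])
```

Among these three components MgCl2 has the largest ratio (about 36), so it is the one whose stock concentration is multiplied by 5.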


Inspect the calculated volumes:

In [107]:
df_high

Component,Stock Concentration,Target Concentration,Volumes[uL],Solubility
MOPS[mM],1700.0,40.0,35.294118,1700.0
Tricine[mM],400.0,4.0,15.0,500.08
H3BO3[mM],2.4,0.08,50.0,700.0
Glucose[mM],3000.0,20.0,10.0,5045.63
K2SO4[mM],43.5,1.0,34.482759,636.98
K2HPO4[mM],396.0,13.2,50.0,8564.84
FeSO4[mM],6.0,0.1,25.0,1645.73
NH4Cl[mM],1900.0,47.6,37.578947,6543.28
MgCl2[mM],15.6,2.6,250.0,569.27
NaCl[mM],1500.0,500.0,500.0,6160.16


### Correct for minimal transfer volumes

If there are volumes that are smaller than the minimum transfer volume, change stock concentrations for those components (decrease the concentrations so that the volume increases).

In [108]:
# Find components with volume transfers smaller than the minimal
comp_small_vol = df_high[
    df_high['Volumes[uL]'] < min_tip_volume - EPS
].index
print(f"{len(comp_small_vol)} component(s) found with volume transfers smaller than the minimal")

# Define new volume transfer to be higher than the minimal, so there is some flexibility
NEW_VOLUME_TRANSFER = 5.0*min_tip_volume

for comp in comp_small_vol:
    factor_diff =  NEW_VOLUME_TRANSFER / (df_high.at[comp, 'Volumes[uL]'])
    print(f'Decreasing the concentration of {comp} by {factor_diff} times')
    df_high.at[comp, 'Stock Concentration'] /= factor_diff
    

0 component(s) found with volume transfers smaller than the minimal
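
No component needed correction in this run, so the following sketch illustrates the rule with made-up numbers: a 100 mM stock that would require a 2 uL transfer. The function name `dilute_for_min_transfer` and the example values are hypothetical; the arithmetic follows the cell above (new transfer volume of 5 times the minimum, stock divided by the ratio of new to old volume).

```python
def dilute_for_min_transfer(stock, volume, min_transfer=5.0, multiple=5.0):
    """Lower a stock concentration so its transfer volume becomes
    `multiple * min_transfer` while the delivered amount stays the same.

    Returns (new stock concentration, new volume, dilution applied).
    """
    new_volume = multiple * min_transfer
    factor = new_volume / volume
    return stock / factor, new_volume, factor


# Hypothetical component: 100 mM stock, 2 uL transfer (below the 5 uL minimum)
new_stock, new_volume, factor = dilute_for_min_transfer(stock=100.0, volume=2.0)
print(f"dilute {factor:g}x to {new_stock:g} mM, transfer {new_volume:g} uL")

# The amount delivered to the well (concentration * volume) is unchanged
assert abs(100.0 * 2.0 - new_stock * new_volume) < 1e-9
```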


In [109]:
df_high

Component,Stock Concentration,Target Concentration,Volumes[uL],Solubility
MOPS[mM],1700.0,40.0,35.294118,1700.0
Tricine[mM],400.0,4.0,15.0,500.08
H3BO3[mM],2.4,0.08,50.0,700.0
Glucose[mM],3000.0,20.0,10.0,5045.63
K2SO4[mM],43.5,1.0,34.482759,636.98
K2HPO4[mM],396.0,13.2,50.0,8564.84
FeSO4[mM],6.0,0.1,25.0,1645.73
NH4Cl[mM],1900.0,47.6,37.578947,6543.28
MgCl2[mM],15.6,2.6,250.0,569.27
NaCl[mM],1500.0,500.0,500.0,6160.16


In [110]:
np.sum(df_high['Target Concentration'].values / df_high['Stock Concentration'].values)

0.9815705490907797
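
Since each component occupies $V_i / V_\text{well} = c_t / c_s$ of the well, the sum above is the fraction of the well volume taken up by stock solutions. A minimal sketch of the resulting volume budget follows; it assumes the culture takes `1 / culture_factor` of the well and that the remainder is filled with water, and the helper names are hypothetical. Only three components are used, as an illustration.

```python
def stock_fraction(stock, target):
    """Fraction of the well volume occupied by all stock solutions."""
    return sum(target[name] / stock[name] for name in stock)


def water_volume(stock, target, well_volume=1500.0, culture_factor=100):
    """Volume (uL) left for water; a negative value means the recipe does not fit."""
    culture_volume = well_volume / culture_factor
    return well_volume * (1.0 - stock_fraction(stock, target)) - culture_volume


# Three rows copied from the high-level table above
stock = {"MgCl2[mM]": 15.6, "NaCl[mM]": 1500.0, "K2HPO4[mM]": 396.0}
target = {"MgCl2[mM]": 2.6, "NaCl[mM]": 500.0, "K2HPO4[mM]": 13.2}

fraction = stock_fraction(stock, target)
water = water_volume(stock, target)
print(f"stock fraction: {fraction:.4f}, water: {water:.1f} uL, feasible: {water >= 0}")
```

With a culture factor of 100 the stock fraction must stay at or below 0.99 for the recipe to fit in the well.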

In [111]:
# if np.sum(df_high['Target Concentration'].values / df_high['Stock Concentration'].values):
    #in this case the stock concentrations are too low and adding all these together will lead to high dilutions
    
    #to resolve, find which concentrations are a lot lower than the solubility limit and the volumes
    #are higher than 5ul to select which stock conc to increase:
    

Recalculate volumes for corrected stock concentrations:

In [112]:
volumes, df_high_new = core.find_volumes(
    user_params['well_volume'], 
    components=df_high.index,
    stock_conc_val=df_high['Stock Concentration'].values, 
    target_conc_val=df_high['Target Concentration'].values,
    culture_ratio=user_params['culture_factor']
)
df_high_new

Component,Stock Concentration,Target Concentration,Volumes[uL]
MOPS[mM],1700.0,40.0,35.294118
Tricine[mM],400.0,4.0,15.0
H3BO3[mM],2.4,0.08,50.0
Glucose[mM],3000.0,20.0,10.0
K2SO4[mM],43.5,1.0,34.482759
K2HPO4[mM],396.0,13.2,50.0
FeSO4[mM],6.0,0.1,25.0
NH4Cl[mM],1900.0,47.6,37.578947
MgCl2[mM],15.6,2.6,250.0
NaCl[mM],1500.0,500.0,500.0


Round to 5 digits after decimal point

In [113]:
df_high = df_high_new.copy()
num_digits = 5
conc = np.array([round(num, num_digits) for num in list(df_high['Stock Concentration'].values)])
df_high['Stock Concentration'] = conc


Create the final dataframe with the low and high concentrations and the dilution factor for their preparation:

In [114]:
df_stock = df_low.copy()
df_stock.rename(columns={'Stock Concentration': 'Low Concentration'}, inplace=True)
df_stock = df_stock.drop(['Target Concentration'], axis='columns')
df_stock['High Concentration'] = df_high['Stock Concentration']
df_stock['Dilution Factor'] = df_stock['High Concentration']/df_stock['Low Concentration']
df_stock

Component,Low Concentration,High Concentration,Dilution Factor
MOPS[mM],1700.0,1700.0,1.0
Tricine[mM],400.0,400.0,1.0
H3BO3[mM],0.12,2.4,20.0
Glucose[mM],3000.0,3000.0,1.0
K2SO4[mM],3.0,43.5,14.5
K2HPO4[mM],79.2,396.0,5.0
FeSO4[mM],0.3,6.0,20.0
NH4Cl[mM],1920.0,1900.0,0.989583
MgCl2[mM],7.8,15.6,2.0
NaCl[mM],1500.0,1500.0,1.0
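
One way to read the `Dilution Factor` column is as the factor by which the high stock is diluted to prepare the low stock. The sketch below turns a factor into volumes of high stock and water; the function name `dilution_recipe` and the 10 mL final volume are assumptions for illustration. A factor below 1 (as for NH4Cl above, where the low stock is slightly more concentrated than the high stock) cannot be prepared by dilution, so the sketch rejects it.

```python
def dilution_recipe(dilution_factor, final_volume_ml=10.0):
    """Return (mL of high stock, mL of water) that give `final_volume_ml`
    of low stock when the high stock is diluted `dilution_factor`-fold."""
    if dilution_factor < 1:
        raise ValueError("low stock is more concentrated than high stock")
    high_ml = final_volume_ml / dilution_factor
    return high_ml, final_volume_ml - high_ml


# H3BO3 has a dilution factor of 20 in the table above
high_ml, water_ml = dilution_recipe(20.0)
print(f"{high_ml:g} mL high stock + {water_ml:g} mL water")
```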


### Test found stock concentrations for different, randomly chosen, target concentrations

Create random target concentrations, sampled using Latin Hypercube, given lower/upper bounds:

In [115]:
n_samples = 1000

latin_hc = lhs(
    len(df_stock), samples=n_samples, criterion="maximin"
)

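# Series.ravel() is deprecated in recent pandas versions; to_numpy() gives the same 1-D array
lb = df_bounds['Min'].to_numpy()
ub = df_bounds['Max'].to_numpy()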

target_conc_val = lb + latin_hc * (ub - lb)

df_target_conc = pd.DataFrame(
    data=target_conc_val, 
    columns=df_stock.index
)
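
To make the sampling step concrete, here is a dependency-free sketch of Latin Hypercube sampling followed by the same scaling, `lb + u * (ub - lb)`. It implements plain LHS with one point per stratum in every dimension, not the `maximin` criterion used by `pyDOE.lhs` above, and the function names are hypothetical. Note that components with identical lower and upper bounds (such as MOPS) simply stay constant.

```python
import random


def latin_hypercube(n_samples, n_dims, rng):
    """Plain LHS on [0, 1): each dimension has exactly one point per stratum."""
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        columns.append([(k + rng.random()) / n_samples for k in strata])
    # Transpose from per-dimension columns to per-sample rows
    return [list(point) for point in zip(*columns)]


def scale_to_bounds(unit_samples, lower, upper):
    """Map unit-cube samples to [lower, upper] component-wise."""
    return [
        [lo + u * (hi - lo) for u, lo, hi in zip(point, lower, upper)]
        for point in unit_samples
    ]


rng = random.Random(0)            # fixed seed for reproducibility
lower = [40.0, 0.0004, 5.0]       # MOPS, H3BO3, NaCl lower bounds
upper = [40.0, 0.08, 500.0]       # MOPS, H3BO3, NaCl upper bounds

unit = latin_hypercube(8, 3, rng)
samples = scale_to_bounds(unit, lower, upper)
print(samples[0])
```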

Check the volumes obtained for randomly chosen target concentrations within the given ranges:

In [116]:
%%time

df_volumes = core.find_volumes_bulk(
    df_stock, 
    df_target_conc=df_target_conc, 
    well_volume=user_params['well_volume'],
    min_tip_volume=min_tip_volume,
    culture_ratio=user_params['culture_factor'],
    verbose=0
)

Sucess rate: 99.5%
Sucess rate (water): 99.5%
CPU times: user 1.36 s, sys: 0 ns, total: 1.36 s
Wall time: 1.36 s


### Save the file with stock concentrations

In [117]:
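import os

# os.path.join avoids a doubled separator when output_file_path already ends with '/'
stock_conc_file = os.path.join(user_params['output_file_path'], user_params['stock_conc_filename'])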

In [118]:
df_stock.to_csv(stock_conc_file)