# Calculate low and high stock solution concentrations given a standard media recipe

This notebook calculates stock concentrations for a media optimization project. Given a standard media recipe and the ranges to be explored for each component, the notebook generates sets of low and high stock concentrations such that media can be prepared without dilutions. This reduces the number of operations (and hence the time) and the number of pipette tips needed.


## Inputs and outputs

#### Required file to run this notebook:
   - `../data/flaviolin/standard_recipe_concentrations.csv`
   
   A file with the standard media recipe, with all component concentrations given in the same units (mM). This file also contains a column with the **solubility limit** of each component.
   
   Note that in this study, the target concentration for kanamycin is given as a dilution factor (e.g. 1x) rather than in mM.
   
An example of the file content:

| Component | Concentration[mM] | Solubility[mM] |
|------|------|------|
| MOPS | 40 | 2389.37 |
| H3BO3 | 0.004 | 700 |
| K2SO4 | 0.29 | 636.98 |


#### Files generated by running this notebook:

   - `stock_concentrations_500uL.csv`
   
   - `bounds_file.csv` (optional; the file with bounds for ART, created only if `bounds_file` is set in the user parameters)
 
   The files are stored in the user-defined directory.

## Setup

Importing needed libraries:

In [1]:
import sys
sys.path.append('../')

import string
import pandas as pd
import numpy as np
import scipy

from core import find_volumes, check_solubility, test_volumes

## User parameters

In [2]:
user_params = {
    'standard_media_file': '../data/flaviolin/standard_recipe_concentrations.csv',  
    'output_file_path': '../data/flaviolin/', # Folder for output files
    'factor_range': 10,             # Default exploration factor: targets range from
    # standard/factor to standard*factor. To use different factors for
    # individual components, override them below (see cell 6)
#     'bounds_file': '../data/flaviolin/Putida_media_bounds.csv', # Optional: name of the file with bounds needed for ART
    'well_volume': 500,             # Total volume of media + culture in the well (uL)
    'min_volume_transfer': 5,       # Minimal transfer volume of the liquid handler (uL)
    'culture_factor': 100,          # Dilution factor for culture, e.g. 100x, 1000x
    } 

In [3]:
culture_volume = user_params['well_volume'] / user_params['culture_factor']


Read the standard media recipe concentrations

In [4]:
df_stand = pd.read_csv(user_params['standard_media_file'])
df_stand = df_stand.set_index("Component")
df_stand

| Component | Concentration[mM] | Solubility[mM] |
|------|------|------|
| MOPS | 40.0 | 2389.37 |
| Tricine | 4.0 | 500.08 |
| H3BO3 | 0.004 | 700.0 |
| Glucose | 20.0 | 5045.63 |
| K2SO4 | 0.29 | 636.98 |
| K2HPO4 | 1.32 | 8564.84 |
| FeSO4 | 0.01 | 1645.73 |
| NH4Cl | 9.52 | 6543.28 |
| MgCl2 | 0.52 | 569.27 |
| NaCl | 50.0 | 6160.16 |


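Assign exploration ranges for each component. Each target concentration is explored from the standard value divided by the factor up to the standard value multiplied by the factor. For example, a factor of 1.5 spans roughly 33% below to 50% above the standard recipe value, and a factor of 1 keeps the component fixed at its standard value.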

First, the value from `user_params['factor_range']` is assigned to all components. Individual values then can be modified if needed.
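
As a quick illustration of how a factor translates into bounds, the sketch below applies the same divide/multiply rule used in the cells that follow to three rows of the standard recipe. The helper name `exploration_bounds` is hypothetical and is not part of `core`.

```python
import pandas as pd

def exploration_bounds(concentration, factor):
    """Return (low, high) bounds for a standard concentration and a factor >= 1."""
    return concentration / factor, concentration * factor

# Standard concentrations from the recipe above, with the factors used in this notebook
recipe = pd.DataFrame(
    {"Concentration[mM]": [40.0, 1.32, 9.52], "Factor": [1.0, 5.0, 1.5]},
    index=["MOPS", "K2HPO4", "NH4Cl"],
)
low, high = exploration_bounds(recipe["Concentration[mM]"], recipe["Factor"])
bounds = pd.DataFrame({"Min": low, "Max": high})
print(bounds.round(6))
```

With a factor of 1.0 the two bounds coincide, so MOPS stays at its standard value.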

In [5]:
num_components = len(df_stand)
df_stand['Factor'] = user_params['factor_range']* np.ones(num_components)

In [6]:
df_stand.at['MOPS', 'Factor'] = 1.0
df_stand.at['Tricine', 'Factor'] = 1.0
df_stand.at['Glucose', 'Factor'] = 1.0
df_stand.at['K2HPO4', 'Factor'] = 5.0
df_stand.at['NH4Cl', 'Factor'] = 1.5
df_stand.at['Kan', 'Factor'] = 1.

In [7]:
df_stand

| Component | Concentration[mM] | Solubility[mM] | Factor |
|------|------|------|------|
| MOPS | 40.0 | 2389.37 | 1.0 |
| Tricine | 4.0 | 500.08 | 1.0 |
| H3BO3 | 0.004 | 700.0 | 10.0 |
| Glucose | 20.0 | 5045.63 | 1.0 |
| K2SO4 | 0.29 | 636.98 | 10.0 |
| K2HPO4 | 1.32 | 8564.84 | 5.0 |
| FeSO4 | 0.01 | 1645.73 | 10.0 |
| NH4Cl | 9.52 | 6543.28 | 1.5 |
| MgCl2 | 0.52 | 569.27 | 10.0 |
| NaCl | 50.0 | 6160.16 | 10.0 |


Define target low and high concentration levels:

In [8]:
target_conc_low = df_stand['Concentration[mM]'] / df_stand['Factor']
conc_low_round = np.array([round(conc,6) for conc in list(target_conc_low)])
target_conc_low = conc_low_round
target_conc_high = df_stand['Concentration[mM]'] * df_stand['Factor']

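If `bounds_file` is defined in `user_params`, save the low and high concentration levels to the bounds file needed for ART. Components with a factor of 1 are fixed and are left out of this file.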

In [11]:
if 'bounds_file' in user_params:
    df_bounds = pd.DataFrame(columns=['Variable', 'Min', 'Max'])
    df_bounds['Variable'] = df_stand.index
    df_bounds['Min'] = target_conc_low
    df_bounds['Max'] = target_conc_high.values
    df_bounds = df_bounds.set_index('Variable')
    df_bounds = df_bounds[df_stand['Factor'] > 1.]
    df_bounds.to_csv(path_or_buf=user_params['bounds_file'])
    display(df_bounds)

## Find a set of low level stock concentrations that can achieve the lowest levels of target concentrations

$$c_s=\frac{c_{t_{\min}} \cdot V_\text{well}}{V_{\min}}$$
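
A minimal worked example of this formula, assuming the 500 uL well volume and 5 uL minimal transfer volume set in `user_params`. The function name `low_stock_concentration` is only illustrative.

```python
def low_stock_concentration(target_low_mM, well_volume_uL, min_transfer_uL):
    """Stock concentration that gives target_low_mM when min_transfer_uL is added to the well."""
    return target_low_mM * well_volume_uL / min_transfer_uL

# H3BO3: lowest target 0.0004 mM gives a stock of about 0.04 mM
h3bo3_stock = low_stock_concentration(0.0004, well_volume_uL=500, min_transfer_uL=5)

# MOPS: target 40 mM gives a stock of about 4000 mM,
# which is above its solubility limit of 2389.37 mM
mops_stock = low_stock_concentration(40.0, well_volume_uL=500, min_transfer_uL=5)

print(h3bo3_stock, mops_stock)
```

The MOPS case is why the solubility check further down is needed.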

In [12]:
min_tip_volume = user_params['min_volume_transfer']
df_low = pd.DataFrame(
    index=df_stand.index,
    columns=["Stock Concentration[mM]", "Target Concentration[mM]"])
df_low["Target Concentration[mM]"] = target_conc_low
df_low["Stock Concentration[mM]"] = df_low["Target Concentration[mM]"]*user_params['well_volume']/min_tip_volume
df_low

| Component | Stock Concentration[mM] | Target Concentration[mM] |
|------|------|------|
| MOPS | 4000.0 | 40.0 |
| Tricine | 400.0 | 4.0 |
| H3BO3 | 0.04 | 0.0004 |
| Glucose | 2000.0 | 20.0 |
| K2SO4 | 2.9 | 0.029 |
| K2HPO4 | 26.4 | 0.264 |
| FeSO4 | 0.1 | 0.001 |
| NH4Cl | 634.6667 | 6.346667 |
| MgCl2 | 5.2 | 0.052 |
| NaCl | 500.0 | 5.0 |


### Check solubility 

Some of these stock concentrations may exceed the solubility limit. For those components, increase the transfer volume in increments of 5 uL until the stock concentration becomes soluble (there is no need to use the minimal transfer volume for them). After $k$ increments:

$$c^i_{s}=\frac{c^i_{t_{\min}} \cdot V_\text{well}}{V_{\min}+5k}$$
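
The adjustment can be sketched for a single component as follows. This is a simplified stand-in, not the `core.check_solubility` implementation (which is not shown here): it assumes a stock is "not soluble" when its concentration is strictly above the solubility limit, and it adjusts one component at a time, whereas the cell below uses one shared transfer volume for all non-soluble components. The name `soluble_stock` is hypothetical.

```python
def soluble_stock(target_mM, solubility_mM, well_volume_uL=500, min_transfer_uL=5):
    """Increase the transfer volume in steps of min_transfer_uL until the stock is soluble.

    Returns (stock_concentration_mM, transfer_volume_uL).
    """
    volume = min_transfer_uL
    stock = target_mM * well_volume_uL / volume
    while stock > solubility_mM:
        volume += min_transfer_uL
        if volume > well_volume_uL:
            # Even an undiluted well would exceed the solubility limit
            raise ValueError("Target concentration exceeds the solubility limit")
        stock = target_mM * well_volume_uL / volume
    return stock, volume

# MOPS: 4000 mM at 5 uL is above 2389.37 mM, so the transfer volume is doubled
stock, volume = soluble_stock(target_mM=40.0, solubility_mM=2389.37)
print(stock, volume)  # 2000.0 10
```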

In [13]:
if 'Solubility[mM]' in df_stand.columns:
    
    nonsol_comp_low = check_solubility(df_low, solubility=df_stand['Solubility[mM]'])
    volume_transfer = min_tip_volume

    i = 0
    while len(nonsol_comp_low) > 0:    
        print(f'Iteration {i}\n')
        volume_transfer += min_tip_volume

        for comp in nonsol_comp_low:
            df_low.at[comp,"Stock Concentration[mM]"] = df_low.at[
                comp,"Target Concentration[mM]"
            ]*user_params['well_volume']/volume_transfer

        nonsol_comp_low = check_solubility(df_low, solubility=df_stand['Solubility[mM]'])
        i += 1

    df_low
    
else:
    print('Solubility values are not provided and it is assumed the limits are not reached.')

Components for which those concentrations are not soluble:
	MOPS
Iteration 0



Check that all volumes are at least the minimal transfer volume (5 uL):

In [14]:
EPS = 0.000001
volumes, df = find_volumes(
    user_params['well_volume'], 
    components=df_low.index,
    stock_conc_val=df_low['Stock Concentration[mM]'].values, 
    target_conc_val=df_low['Target Concentration[mM]'].values,
    culture_ratio=user_params['culture_factor']
)
assert (df['Volumes[uL]'].values >= min_tip_volume - EPS).all(), f"Not all volumes are >={min_tip_volume}uL!"
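
`find_volumes` is defined in `core` and its implementation is not shown in this notebook. The volumes in the table below are consistent with the plain dilution relation $V_i = c_{t,i} \cdot V_\text{well} / c_{s,i}$. The sketch assumes that relation, and additionally assumes that water fills whatever remains after the culture and all components are added. `dilution_volumes` is a hypothetical helper, not the `core` API.

```python
import numpy as np

def dilution_volumes(stock_mM, target_mM, well_volume_uL, culture_factor):
    """Transfer volume per component, plus the water volume left to fill the well."""
    stock = np.asarray(stock_mM, dtype=float)
    target = np.asarray(target_mM, dtype=float)
    volumes = target * well_volume_uL / stock
    culture_volume = well_volume_uL / culture_factor
    water_volume = well_volume_uL - culture_volume - volumes.sum()
    return volumes, water_volume

# MOPS, Tricine and H3BO3 at their low target concentrations
volumes, water = dilution_volumes(
    stock_mM=[2000.0, 400.0, 0.04],
    target_mM=[40.0, 4.0, 0.0004],
    well_volume_uL=500,
    culture_factor=100,
)
print(volumes, water)
```

Under these assumptions, a negative water volume would mean the recipe does not fit into the well.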

In [15]:
df

| Component | Stock Concentration[mM] | Target Concentration[mM] | Volumes[uL] |
|------|------|------|------|
| MOPS | 2000.0 | 40.0 | 10.0 |
| Tricine | 400.0 | 4.0 | 5.0 |
| H3BO3 | 0.04 | 0.0004 | 5.0 |
| Glucose | 2000.0 | 20.0 | 5.0 |
| K2SO4 | 2.9 | 0.029 | 5.0 |
| K2HPO4 | 26.4 | 0.264 | 5.0 |
| FeSO4 | 0.1 | 0.001 | 5.0 |
| NH4Cl | 634.6667 | 6.346667 | 5.0 |
| MgCl2 | 5.2 | 0.052 | 5.0 |
| NaCl | 500.0 | 5.0 | 5.0 |


Round to 6 digits after the decimal point:

In [16]:
num_digits = 6
conc = np.array([round(num, num_digits) for num in list(df_low['Stock Concentration[mM]'].values)])
df_low['Stock Concentration[mM]'] = conc


## Find a set of high level stock concentrations that can achieve the highest levels of target concentrations

Find stock concentrations for the upper limit in the range to explore.

In [17]:
df_high = df_low.copy()
df_high["Target Concentration[mM]"] = target_conc_high
df_high["Solubility[mM]"] = df_stand['Solubility[mM]']

Check if there are feasible volumes for these concentrations:

In [18]:
try:
    volumes, df = find_volumes(
        user_params['well_volume'],
        components=df_high.index,
        stock_conc_val=df_high['Stock Concentration[mM]'].values, 
        target_conc_val=df_high['Target Concentration[mM]'].values,
        culture_ratio=user_params['culture_factor']
    )
    feasible_volumes = True
    assert (df['Volumes[uL]'].values >= min_tip_volume - EPS).all(), f"Not all volumes are >={min_tip_volume}uL!"
except AssertionError:
    feasible_volumes = False
    print("No feasible volumes are found!")
        

No feasible volumes are found!


### Find feasible volumes

If there are no feasible volumes, repeatedly increase the stock concentration of the component that is furthest from its solubility limit by a factor of 3 (`MULTIPL_FACTOR`) until feasible volumes are found.
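
One iteration of this selection step can be illustrated on three components, using stock concentrations and solubility limits from the tables above and `MULTIPL_FACTOR = 3` as in the cell below. This is only a sketch of the selection logic; the feasibility check itself is left to `find_volumes`.

```python
import pandas as pd

MULTIPL_FACTOR = 3

stocks = pd.DataFrame(
    {
        "Stock Concentration[mM]": [2000.0, 0.04, 500.0],
        "Solubility[mM]": [2389.37, 700.0, 6160.16],
    },
    index=["MOPS", "H3BO3", "NaCl"],
)

# How many times each stock could be concentrated before reaching solubility
ratio = stocks["Solubility[mM]"] / stocks["Stock Concentration[mM]"]

# Only components that stay below solubility after the increase are candidates
candidates = ratio[ratio > MULTIPL_FACTOR]

# Pick the candidate furthest from its solubility limit and concentrate its stock
chosen = candidates.idxmax()
stocks.at[chosen, "Stock Concentration[mM]"] *= MULTIPL_FACTOR
print(chosen, stocks.at[chosen, "Stock Concentration[mM]"])
```

Here MOPS is excluded because tripling its stock would exceed the solubility limit, and H3BO3 is chosen because it has the largest ratio.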

In [19]:
if not feasible_volumes:
    print("No feasible volumes")
    
    MULTIPL_FACTOR = 3

    success = False
    df = df_high.copy()

    i = 0
    while success is False:
        i += 1

        # Find ratios of solubility over current stock concentrations
        df['Ratio'] = df['Solubility[mM]'].values / df['Stock Concentration[mM]'].values

        # Find which component is the furthest away from the solubility limit 
        comp = df[df['Ratio'] > MULTIPL_FACTOR]['Ratio'].idxmax()

        # Increase the current stock concentration by a factor
        df.at[comp, 'Stock Concentration[mM]'] *= MULTIPL_FACTOR

        # Find if there are feasible volumes for such stock and target concentrations
        try:
            volumes, df_high = find_volumes(
                user_params['well_volume'], 
                components=df.index,
                stock_conc_val=df['Stock Concentration[mM]'].values, 
                target_conc_val=df['Target Concentration[mM]'].values,
                culture_ratio=user_params['culture_factor']
            )
            success = True
            print(f'Iteration {i}:')
            print('Success!')
        except Exception:
            # Still infeasible with these stocks: keep increasing concentrations
            pass
        
else:
    df_high = df.copy()
    
df_high["Solubility[mM]"] = df_stand['Solubility[mM]']

No feasible volumes
Iteration 76:
Success!


Inspect the calculated volumes:

In [20]:
df_high

| Component | Stock Concentration[mM] | Target Concentration[mM] | Volumes[uL] | Solubility[mM] |
|------|------|------|------|------|
| MOPS | 2000.0 | 40.0 | 10.0 | 2389.37 |
| Tricine | 400.0 | 4.0 | 5.0 | 500.08 |
| H3BO3 | 87.48 | 0.04 | 0.228624 | 700.0 |
| Glucose | 2000.0 | 20.0 | 5.0 | 5045.63 |
| K2SO4 | 78.3 | 2.9 | 18.518519 | 636.98 |
| K2HPO4 | 712.8 | 6.6 | 4.62963 | 8564.84 |
| FeSO4 | 218.7 | 0.1 | 0.228624 | 1645.73 |
| NH4Cl | 634.6667 | 14.28 | 11.249999 | 6543.28 |
| MgCl2 | 46.8 | 5.2 | 55.555556 | 569.27 |
| NaCl | 1500.0 | 500.0 | 166.666667 | 6160.16 |


### Correct for minimal transfer volumes

If there are volumes that are smaller than the minimum transfer volume, change stock concentrations for those components (decrease the concentrations so that the volume increases).
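
As a worked example of this correction, take K2HPO4 from the table above: its 712.8 mM stock needs only 4.62963 uL, which is below the 5 uL minimum. The sketch rescales the stock so that the same target is reached with a 25 uL transfer (5 times the minimal volume, as in the cell below). The function name `corrected_stock` is illustrative only.

```python
def corrected_stock(stock_mM, volume_uL, new_volume_uL):
    """Lower the stock concentration so the same target needs new_volume_uL instead of volume_uL.

    Returns (new_stock_mM, factor), where factor is how many times the stock was diluted.
    """
    factor = new_volume_uL / volume_uL
    return stock_mM / factor, factor

new_stock, factor = corrected_stock(712.8, 4.62963, new_volume_uL=25.0)
print(round(new_stock, 2), round(factor, 2))
```

The result is a stock of about 132 mM, roughly 5.4 times more dilute than before.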

In [21]:
# Find components with volume transfers smaller than the minimal
comp_small_vol = df_high[
    df_high['Volumes[uL]'] < min_tip_volume - EPS
].index
print(f"{len(comp_small_vol)} component(s) found with volume transfers smaller than the minimal")

# Define new volume transfer to be higher than the minimal, so there is some flexibility
NEW_VOLUME_TRANSFER = 5.0*min_tip_volume

for comp in comp_small_vol:
    factor_diff =  NEW_VOLUME_TRANSFER / (df_high.at[comp, 'Volumes[uL]'])
    print(f'Decreasing the concentration of {comp} by {factor_diff} times')
    df_high.at[comp, 'Stock Concentration[mM]'] /= factor_diff
    

8 component(s) found with volume transfers smaller than the minimal
Decreasing the concentration of H3BO3 by 109.35000000000001 times
Decreasing the concentration of K2HPO4 by 5.399999999999999 times
Decreasing the concentration of FeSO4 by 109.35 times
Decreasing the concentration of (NH4)6Mo7O24 by 8857.35 times
Decreasing the concentration of CoCl2 by 8857.350000000002 times
Decreasing the concentration of CuSO4 by 8857.350000000002 times
Decreasing the concentration of MnSO4 by 328.05 times
Decreasing the concentration of ZnSO4 by 26572.050000000003 times


Recalculate volumes for corrected stock concentrations:

In [22]:
volumes, df_high_new = find_volumes(
    user_params['well_volume'], 
    components=df_high.index,
    stock_conc_val=df_high['Stock Concentration[mM]'].values, 
    target_conc_val=df_high['Target Concentration[mM]'].values,
    culture_ratio=user_params['culture_factor']
)
df_high_new

| Component | Stock Concentration[mM] | Target Concentration[mM] | Volumes[uL] |
|------|------|------|------|
| MOPS | 2000.0 | 40.0 | 10.0 |
| Tricine | 400.0 | 4.0 | 5.0 |
| H3BO3 | 0.8 | 0.04 | 25.0 |
| Glucose | 2000.0 | 20.0 | 5.0 |
| K2SO4 | 78.3 | 2.9 | 18.518519 |
| K2HPO4 | 132.0 | 6.6 | 25.0 |
| FeSO4 | 2.0 | 0.1 | 25.0 |
| NH4Cl | 634.6667 | 14.28 | 11.249999 |
| MgCl2 | 46.8 | 5.2 | 55.555556 |
| NaCl | 1500.0 | 500.0 | 166.666667 |


Round to 5 digits after decimal point

In [23]:
df_high = df_high_new.copy()
num_digits = 5
conc = np.array([round(num, num_digits) for num in list(df_high['Stock Concentration[mM]'].values)])
df_high['Stock Concentration[mM]'] = conc


Create the final dataframe with low and high concentrations and dilution factor for their preparation

In [24]:
df_stock = df_low.copy()
df_stock.rename(columns={'Stock Concentration[mM]': 'Low Concentration[mM]'}, inplace=True)
df_stock = df_stock.drop(['Target Concentration[mM]'], axis='columns')
df_stock['High Concentration[mM]'] = df_high['Stock Concentration[mM]']
df_stock['Dilution Factor'] = df_stock['High Concentration[mM]']/df_stock['Low Concentration[mM]']
df_stock

| Component | Low Concentration[mM] | High Concentration[mM] | Dilution Factor |
|------|------|------|------|
| MOPS | 2000.0 | 2000.0 | 1.0 |
| Tricine | 400.0 | 400.0 | 1.0 |
| H3BO3 | 0.04 | 0.8 | 20.0 |
| Glucose | 2000.0 | 2000.0 | 1.0 |
| K2SO4 | 2.9 | 78.3 | 27.0 |
| K2HPO4 | 26.4 | 132.0 | 5.0 |
| FeSO4 | 0.1 | 2.0 | 20.0 |
| NH4Cl | 634.6667 | 634.6667 | 1.0 |
| MgCl2 | 5.2 | 46.8 | 9.0 |
| NaCl | 500.0 | 1500.0 | 3.0 |


### Test found stock concentrations for different, randomly chosen, target concentrations

Check the volumes for random choices of target concentrations within the given ranges:
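
`test_volumes` lives in `core` and its implementation is not shown here. The sketch below is a simplified stand-in under stated assumptions, not the actual test: target concentrations are sampled uniformly between the low and high bounds, each component may be drawn from either its low or its high stock, and a recipe counts as a success when every transfer is at least the minimal volume and the total fits into the well after the culture is added. It ignores the separate water check that the real output reports. The names `min_total_volume` and `success_rate` are hypothetical.

```python
import numpy as np

def min_total_volume(targets, low_stock, high_stock, well_volume, min_volume):
    """Smallest total transfer volume for one recipe, or None if a component cannot be pipetted.

    For each component, both stocks are tried and the smallest volume that is
    still >= min_volume is kept.
    """
    total = 0.0
    for target, low, high in zip(targets, low_stock, high_stock):
        options = [target * well_volume / stock for stock in (low, high)]
        allowed = [v for v in options if v >= min_volume - 1e-6]
        if not allowed:
            return None
        total += min(allowed)
    return total

def success_rate(low_target, high_target, low_stock, high_stock,
                 well_volume=500, min_volume=5, culture_factor=100,
                 n_samples=1000, seed=0):
    """Fraction of uniformly sampled recipes that fit into the well."""
    rng = np.random.default_rng(seed)
    available = well_volume - well_volume / culture_factor
    samples = rng.uniform(low_target, high_target, size=(n_samples, len(low_target)))
    fits = 0
    for targets in samples:
        total = min_total_volume(targets, low_stock, high_stock, well_volume, min_volume)
        if total is not None and total <= available:
            fits += 1
    return fits / n_samples

# Three components from the final table: H3BO3, K2SO4, MgCl2
rate = success_rate(
    low_target=[0.0004, 0.029, 0.052],
    high_target=[0.04, 2.9, 5.2],
    low_stock=[0.04, 2.9, 5.2],
    high_stock=[0.8, 78.3, 46.8],
)
print(f"Success rate: {rate:.1%}")
```

Taking the smallest admissible volume per component minimizes the total, so under this simplified model a recipe is feasible exactly when that minimal total fits into the available volume.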

In [25]:
%%time
test_volumes(df_stock, 
            target_conc_low, 
            target_conc_high, 
            n_samples=1000,
            well_volume=user_params['well_volume'],
            min_tip_volume=min_tip_volume,
            culture_ratio=user_params['culture_factor'],
            verbose=0
            )

Sucess rate: 95.8%
Sucess rate (water): 94.1%
CPU times: user 23.4 s, sys: 25.4 s, total: 48.8 s
Wall time: 13.7 s


### Save the file with stock concentrations

In [26]:
num_digits = 2
dil_fact = np.array([round(num, num_digits) for num in list(df_stock['Dilution Factor'].values)])
df_stock['Dilution Factor'] = dil_fact

Emphasize that kanamycin stock is given in terms of dilution factor:

In [27]:
kan_stock_low = df_stock.at['Kan', 'Low Concentration[mM]']
kan_stock_high = df_stock.at['Kan', 'High Concentration[mM]']
df_stock.at['Kan'] = [f'{kan_stock_low:.0f}x', f'{kan_stock_high:.0f}x', 1.]
df_stock

| Component | Low Concentration[mM] | High Concentration[mM] | Dilution Factor |
|------|------|------|------|
| MOPS | 2000.0 | 2000.0 | 1.0 |
| Tricine | 400.0 | 400.0 | 1.0 |
| H3BO3 | 0.04 | 0.8 | 20.0 |
| Glucose | 2000.0 | 2000.0 | 1.0 |
| K2SO4 | 2.9 | 78.3 | 27.0 |
| K2HPO4 | 26.4 | 132.0 | 5.0 |
| FeSO4 | 0.1 | 2.0 | 20.0 |
| NH4Cl | 634.6667 | 634.6667 | 1.0 |
| MgCl2 | 5.2 | 46.8 | 9.0 |
| NaCl | 500.0 | 1500.0 | 3.0 |


In [28]:
stock_conc_file = f'{user_params["output_file_path"]}stock_concentrations_500uL.csv'


In [29]:
df_stock.to_csv(stock_conc_file)