# 20200416 Enrichment modeling

### Goals of notebook
* build a model that maps between REU values and final enrichments
* focus on the off-target ACE1A2mdm2 interaction from Andrew

### Parameters of interest
* dilution amount
* relative REU values (mapped to relative growth rates)

### Things to figure out
* relationship between observed relative growth rates and the growth rate $r$ in the logistic function
* mapping between REU and relative growth rates

# Logistic model and assumptions

Beginning with a general competitive Lotka–Volterra model:

$$ \dot{x_i} = r_i x_i \bigg(1 - \frac{\sum_{j=1}^{N} \alpha_{ij} x_j}{ K_i} \bigg), $$

where $x_i$ is the abundance of species $i$, $r_i$ is its growth rate, $\alpha_{ij}$ is the competition coefficient for the effect of species $j$ on species $i$ (interspecies for $i \neq j$, intraspecies for $i = j$), the sum runs over all $N$ species, and $K_i$ is the carrying capacity for species $i$.

### Simplifying assumptions
#### $\alpha_{ij} = 1$
Assumes that all species compete with each other for common resources with equal strength.

#### $K_i = K$
Assumes that all species have identical carrying capacity (maximum OD).

#### $K = 1$
Normalize the carrying capacity $K$ to 1, so that all abundances and dilutions are expressed relative to the carrying capacity.

### Simplified model:

$$ \dot{x_i} = r_i x_i \bigg(1 - \sum_{j=1}^{N} x_j \bigg) $$
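The simplified model can be integrated numerically. The following is a minimal sketch using `scipy.integrate.solve_ivp`; it is not the implementation in the `enrichments` module used later, and the function name `simplified_lv`, the rates, the initial values, and the time span are all assumptions chosen for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

def simplified_lv(t, x, r):
    """Right-hand side of dx_i/dt = r_i * x_i * (1 - sum_j x_j), with K = 1."""
    return r * x * (1.0 - np.sum(x))

# Hypothetical two-species example (values are illustrative assumptions)
r = np.array([1.0, 0.5])        # growth rates r_i
x0 = np.array([0.001, 0.009])   # initial abundances as fractions of K

# Integrate long enough for the culture to saturate
sol = solve_ivp(simplified_lv, (0.0, 50.0), x0, args=(r,), rtol=1e-8, atol=1e-12)

x_end = sol.y[:, -1]            # abundances at the final time point
total_end = x_end.sum()         # approaches the carrying capacity K = 1
```

The faster-growing species ends up with a larger share of the saturated culture than it started with, which is the enrichment effect this notebook sets out to quantify.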

# Relationship between growth rate definitions

Taken from the Wikipedia page on exponential growth: https://www.wikiwand.com/en/Exponential_growth

### Standard exponential function

$$ \dot{x} = kx$$

Solves to:

$$ x(t) = x_0 e^{kt}$$

### Relationship between different exponential growth bases:
$$ x(t) = x_0 e^{kt} = x_0 e^{t / \tau} = x_0 2^{t/T} $$

### Relationship between growth rates

$$ k = \frac{1}{\tau} = \frac{\ln 2}{T}$$

* growth constant $k$ is the frequency (number of times per unit time) of growing by a factor of $e$
* $e$-folding time $\tau$ is the time it takes to grow by a factor of $e$
* $T$ is the doubling time
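As a quick numerical check of these relationships, the sketch below converts a doubling time into $k$ and $\tau$ and confirms that the three forms of the exponential agree. The doubling time, initial value, and time point are assumed values for illustration only.

```python
import numpy as np

T = 30.0              # hypothetical doubling time (e.g. minutes)
k = np.log(2) / T     # growth constant
tau = 1.0 / k         # e-folding time

t = 90.0              # three doubling times
x0 = 0.01

x_exp = x0 * np.exp(k * t)     # x0 * e^(kt)
x_tau = x0 * np.exp(t / tau)   # x0 * e^(t/tau)
x_dbl = x0 * 2 ** (t / T)      # x0 * 2^(t/T), three doublings of 0.01
```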

### Relationship to derived growth rates from plate-reader experiments

Plate-reader growth rates are derived from a linear fit of the natural log of OD against time. The slope of that fit is the growth constant $k$, which feeds directly into the logistic growth function as $r_i$.
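A minimal sketch of such a fit is below. It uses synthetic data, not real plate-reader output: the true rate, noise level, and time grid are all assumptions. The log-linear relationship only holds while the culture is in exponential phase (OD well below the carrying capacity), so in practice the fit would be restricted to that window.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic exponential-phase OD trace (illustrative assumptions)
k_true = 0.02                      # growth constant per unit time
t = np.linspace(0, 120, 25)        # measurement times
noise = rng.normal(0, 0.02, t.size)
od = 0.01 * np.exp(k_true * t) * np.exp(noise)

# Linear fit of ln(OD) vs time: the slope estimates k, the intercept ln(OD_0)
k_fit, ln_od0 = np.polyfit(t, np.log(od), 1)
```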

### Relationship to normalized (relative) growth rates

* must show that two species with normalized growth rates function the same as two species with absolute growth rates
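
One way to argue this without a closed-form solution is a time-rescaling argument, sketched here for the simplified model ($\alpha_{ij} = 1$, $K = 1$). The symbols $c$ and $s$ are introduced only for this sketch. Write each absolute rate as $r_i = c\,\bar{r_i}$, where $c = \max_j r_j$ and $\bar{r_i}$ is the normalized rate, and define a rescaled time $s = ct$. Then

$$ \frac{dx_i}{ds} = \frac{1}{c}\frac{dx_i}{dt} = \bar{r_i} x_i \bigg(1 - \sum_{j=1}^{N} x_j \bigg) $$

which is the same model with normalized rates. The absolute-rate solution at time $t$ therefore equals the normalized-rate solution at time $ct$: both systems trace the same path through abundance space and reach the same saturated state, only at different times. Dividing the model by $x_i$ also gives

$$ \frac{d \ln x_i}{dt} = r_i \bigg(1 - \sum_{j=1}^{N} x_j \bigg) \quad \Rightarrow \quad \frac{\ln\big(x_i(t)/x_i(0)\big)}{r_i} = \int_0^t \bigg(1 - \sum_{j=1}^{N} x_j \bigg) dt' $$

so the right-hand side is the same for every species, and the final abundances depend only on the ratios of the growth rates and on the initial values.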

# Initial value calculations ($x_i(t=0)$)

Maximum OD of all species together is the carrying capacity $K$, which we have normalized to 1. Therefore, all species concentrations, $x_i$, are defined as fractions of $K$.

The initial OD of a given species:

$$ x_i(t=0) = D \cdot f_{i,0}$$

where $f_{i,0}$ is the fraction of species $i$ in the original culture and $D$ is the dilution factor into fresh media, expressed as a part-to-whole fraction.

### Example
For two species, $x_1$ and $x_2$, with $x_1$ making up 1:10 and $x_2$ making up 9:10 of the library, and a 1:100 dilution into fresh media (all part-to-whole ratios), we have $D = 0.01$, $f_{1,0} = 0.1$, and $f_{2,0} = 0.9$:

$$x_1(t=0) = D \cdot f_{1,0} = 0.001$$
$$x_2(t=0) = D \cdot f_{2,0} = 0.009$$
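
The same arithmetic in NumPy, taking $x_1$ as the 1:10 member and $x_2$ as the 9:10 member of the library (variable names here are illustrative):

```python
import numpy as np

D = 1 / 100                  # 1:100 part-to-whole dilution into fresh media
f0 = np.array([0.1, 0.9])    # initial library fractions of x_1 and x_2

x0 = D * f0                  # initial abundances as fractions of K
```

The initial abundances sum to $D$, since the fractions $f_{i,0}$ sum to 1.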

### Aside: Part-to-part vs part-to-whole ratios

Note that all ratios discussed here are part-to-whole ratios, not part-to-part ratios. For example, 1:10 means 1 µL of the first into 9 µL of the second, NOT 1 µL of the first into 10 µL of the second.

## Enrichment calculation
* calculated as the fraction of the population at the end divided by the initial fraction:

$$E_i = \frac{f_{i}(t=\text{end})}{f_{i}(t = 0)}$$

Initial fractions of the positive strain to consider: $f_{+}(t=0) \in \{0.5,10^{-1},10^{-2},10^{-3}\}$
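
A minimal sketch of this calculation from initial and final abundances is below. The helper name `enrichment` and the example numbers are assumptions for illustration, and this is not necessarily how the `enrichments` module computes it.

```python
import numpy as np

def enrichment(x0, x_end):
    """Final population fraction divided by initial fraction, per species."""
    f0 = x0 / x0.sum()            # initial fractions
    f_end = x_end / x_end.sum()   # final fractions
    return f_end / f0

# Hypothetical abundances: species 1 goes from 10% to 75% of the culture
x0 = np.array([0.001, 0.009])
x_end = np.array([0.75, 0.25])

E = enrichment(x0, x_end)         # E[0] = 0.75 / 0.1 = 7.5
```

Values above 1 mean a species gained population share, and values below 1 mean it was depleted.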

# Validating relative growth rates and plotting growth curves
Need to show that simulating the system with relative growth rates is the same as simulating the system with absolute growth rates. Hard to do this analytically because I can't find a general solution to the competitive Lotka–Volterra equations...

Test case: Two species, $x_1$ and $x_2$, with absolute growth rates, $r_1 = 0.2, r_2 = 0.1$, and relative growth rates of $\bar{r_1} = 1, \bar{r_2} = 0.5$.

How does the system evolve over time for each?

In [None]:
import importlib
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

import matplotlib
matplotlib.rc('figure', dpi = 150)
sns.set_palette('muted')

import sys
sys.path.append('./modules')
import enrichments

In [None]:
importlib.reload(enrichments)

In [None]:
# Define initial values and growth rates
x_i0 = np.array([0.1, 0.9])
# Define absolute rates and relative rates
r_i = np.array([0.2,0.1])
r_i_norm = r_i / np.max(r_i)

dil = 100

sys_1 = enrichments.Growth_tube(x_i0, r_i, dil)
sys_2 = enrichments.Growth_tube(x_i0, r_i_norm, dil)

In [None]:
sys_1.sim_growth()
sys_2.sim_growth()

In [None]:
fig, ax = plt.subplots(nrows = 1, ncols = 2, sharey = True, figsize = (6,3))

#for x in sys_1.x_t[:,:]:
#    ax[0].plot(sys_1.t, x)

sys_1.plot_x_t(ax[0])
sys_2.plot_x_t(ax[1])

ax[0].set_title('Absolute growth rates')
ax[1].set_title('Relative growth rates')
# Label the x-axis on both panels; the shared y-axis label goes on the left panel
for a in ax:
    a.set_xlabel('Time')
ax[0].set_ylabel('Normalized abundance')

plt.show()

In [None]:
print('Absolute growth enrichment vals:', sys_1.enrichs)
print('Relative growth enrichment vals:', sys_2.enrichs)

Interesting: the ODs end up at the same final values, just at different times. That makes sense, since scaling every growth rate by the same constant only rescales time. It should be fine to use relative growth rates for everything.
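
As an independent cross-check that does not rely on the ODE solver, the saturated abundances can be computed directly. For the simplified model, $d \ln x_i / dt = r_i (1 - \sum_j x_j)$, so $x_i(t) = x_i(0)\, e^{r_i \phi(t)}$ with the same $\phi(t)$ for every species. At saturation the abundances sum to 1, which fixes $\phi$. The sketch below solves for it with `scipy.optimize.brentq`. The helper name `final_abundances` is hypothetical, and the result assumes the culture is grown all the way to saturation, which may differ from what `Growth_tube.sim_growth` does.

```python
import numpy as np
from scipy.optimize import brentq

def final_abundances(x0, r):
    """Saturated abundances for dx_i/dt = r_i x_i (1 - sum_j x_j), K = 1."""
    def excess(phi):
        return np.sum(x0 * np.exp(r * phi)) - 1.0
    # Upper bracket: the phi at which the fastest species alone would reach K
    i = np.argmax(r)
    phi_hi = np.log(1.0 / x0[i]) / r[i]
    phi = brentq(excess, 0.0, phi_hi)
    return x0 * np.exp(r * phi)

x0 = np.array([0.1, 0.9]) / 100     # fractions 0.1 and 0.9 after a 1:100 dilution
r_abs = np.array([0.2, 0.1])        # absolute growth rates
r_rel = r_abs / r_abs.max()         # relative growth rates

x_abs = final_abundances(x0, r_abs)
x_rel = final_abundances(x0, r_rel)
```

Both rate sets give the same final abundances, because only the products $r_i \phi$ enter the saturation condition.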

### Do a random one

In [None]:
np.random.seed(69)
lib_sz = 100

x_i0 = np.random.uniform(0,1,lib_sz)
x_i0_norm = x_i0/np.sum(x_i0)
r_i = np.random.uniform(0,1,lib_sz)

dil = 100

sys_rand = enrichments.Growth_tube(x_i0_norm, r_i, dil)
sys_rand.sim_growth()

In [None]:
sns.set_palette('muted')
fig, ax = plt.subplots(figsize = (3,3))

sys_rand.plot_x_t(ax)

ax.set_xlabel('Time')
ax.set_ylabel('Normalized abundance')
ax.set_ylim((0,np.max(sys_rand.x_t)))

plt.show()

In [None]:
fig, ax = plt.subplots(figsize = (3,3))

y = np.arange(len(sys_rand.enrichs))
ax.bar(y,np.sort(sys_rand.enrichs),width =2)
ax.set_xlabel('Member index')
ax.set_ylabel('Enrichment')

# Calculating enrichments

Enrichments are automatically calculated for each species after running the simulations. They're accessible through the `enrichs` attribute of the `Growth_tube` object.

In [None]:
sys_1.enrichs

# Simulate Andrew's enrichments

REU values:

| Strain | REU |
|-|-|
| s953 | 1.9 |
| s950 | 0.02 |
| s951 | 0.13 |
| s952 | 0.29 |

Cm values used (µM):
400, 300, 200, 100, 50, 25, 12, 0

Initial positive fractions used: 0.5,0.1,0.01,0.001

### Method
Eyeballed the relative growth values given REUs and Cm concentrations

In [None]:
df_growth = pd.read_excel('./andrews_enrich_growths.xlsx')
df_growth

In [None]:
# Function to run an enrichment simulation against s953 at the given [Cm] and with a given starting fraction
def enrich_s953(row, frac_pos, dil = 100):
    pos_strain = 's953'
    
    cm = row['cm']
    neg_growth = row['rel_growth']
    pos_growth = df_growth[(df_growth['strain'] == pos_strain) & (df_growth['cm'] == cm)]['rel_growth'].values[0]
    # Create test tube and simulate (positive first in the arrays)
    x_i0 = np.array([frac_pos, (1-frac_pos)])
    r_i = np.array([pos_growth, neg_growth])
    
    tube = enrichments.Growth_tube(x_i0, r_i, dil)
    tube.sim_growth()
    # Get enrichment of the positive strain    
    row['enrich'] = tube.enrichs[0]
    row['frac'] = frac_pos
    
    return row

In [None]:
pos_fracs = np.array([0.5,0.1,0.01,0.001])

# DataFrame.append was removed in pandas 2.0, so build one result frame
# per starting fraction and concatenate them once
df_enrich = pd.concat(
    [df_growth.apply(enrich_s953, axis = 1, frac_pos = frac) for frac in pos_fracs],
    ignore_index = True)

In [None]:
df_enrich.head()

## Recreate Andrew's plots

In [None]:
strains = ['s950', 's951', 's952']
strain_names = {'s950': 'PMI-RBD', 's951': 'ACE2a1-Mdm2', 's952': 'ACE2a1a2-Mdm2'}

fig, ax = plt.subplots(nrows = 3, ncols = 1, figsize = (5,14))
plt.subplots_adjust(hspace = 0.3)
sns.set_palette(sns.color_palette("husl"))

for i, strain in enumerate(strains):
    # First subset dataframe
    df_subset = df_enrich[df_enrich['strain'] == strain]

    sns.barplot(x = 'cm', y = 'enrich', hue = 'frac', 
                hue_order = df_subset['frac'].unique(), 
                data = df_subset, ax = ax[i])
    
    ax[i].set_ylim([0.1,1100])
    ax[i].set_yscale('log')
    ax[i].set_title(strain_names[strain])
    ax[i].set_ylabel('Enrichment')
    ax[i].set_xlabel('Chloramphenicol')
    ax[i].legend(title='Initial fraction')

plt.show()


# Large library enrichments

Try to see how much enrichment we get when the initial fraction of positive cells is something like $10^{-8}$ (a more realistic library scenario). Try different amounts of Cm, different library fractions, and different dilutions.

In [None]:
# Use enrichment function from above for all the control strains

lib_pos_fracs = np.array([1e-2,1e-3,1e-5,1e-7,1e-9])
dils = [100,1000,10000]

# Collect one result frame per (fraction, dilution) pair and concatenate once;
# DataFrame.append was removed in pandas 2.0
frames = []
for frac in lib_pos_fracs:
    for dil in dils: 
        new_df = df_growth.apply(enrich_s953, axis = 1,frac_pos = frac, dil = dil)
        new_df['dil'] = dil
        frames.append(new_df)
df_lib_enrich = pd.concat(frames, ignore_index = True)
df_lib_enrich

In [None]:
# Plot it

strains = ['s950', 's951', 's952']
strain_names = {'s950': 'PMI-RBD', 's951': 'ACE2a1-Mdm2', 's952': 'ACE2a1a2-Mdm2'}

fig, ax = plt.subplots(nrows = 3, ncols = 3, figsize = (10,14), sharey = True)
plt.subplots_adjust(hspace = 0.3)
sns.set_palette(sns.color_palette("husl"))

# Rows (i) index the strain, columns (j) index the dilution
for j, dil in enumerate(dils):
    for i, strain in enumerate(strains):
        
        # First subset dataframe
        df_subset = df_lib_enrich[(df_lib_enrich['strain'] == strain) & (df_lib_enrich['dil'] == dil)]

        sns.barplot(x = 'cm', y = 'enrich', hue = 'frac',
                    hue_order = df_subset['frac'].unique(), 
                    data = df_subset, ax = ax[i,j])

        ax[i,j].set_ylim([1e-1,1e10])
        ax[i,j].set_yscale('log')
        ax[i,j].set_title(strain_names[strain] + ' ' + str(dil) + '-fold')
        ax[i,j].set_ylabel('Enrichment')
        ax[i,j].set_xlabel('Chloramphenicol')
        ax[i,j].legend(title='Initial fraction')

plt.show()


# Todo