# Homework 2: Nonlinear Programming and Evolutionary Algorithms

### In 2010, Prof. Quinn's husband, Alex Looi, conducted an experiment to model predator-prey dynamics in a chemostat (bioreactor) under different salinities. Each day, he measured the concentration of chlorella (a type of algae) and counted the number of rotifers (a type of plankton that eats chlorella) in the chemostat. He modeled the chlorella and rotifer populations with the Fussmann model of predator-prey dynamics (Fussmann et al., 2000), represented by the following system of differential equations:
\begin{align}
    \frac{dN}{dt} = \delta (N_0 - N) - \frac{B_C N}{K_C + N} C\\
    \frac{dC}{dt} = \frac{B_C N}{K_C + N} C - \frac{B_R C}{K_R + C} \Big(\frac{R}{e}\Big) - \delta C\\
    \frac{dR}{dt} = \frac{B_R C}{K_R + C}R - (\delta + m)R
\end{align}
### where $C$, $R$, and $N$ are the quantities of chlorella, rotifers, and nitrogen; $N_0$ is the nitrogen concentration of the inflow; $K_C$ and $K_R$ are the half-saturation constants of chlorella and rotifers; $B_C$ and $B_R$ are the growth rates of chlorella and rotifers; $e$ is the assimilation efficiency of the predator; $m$ is the mortality rate of the predator; and $\delta$ is the flow rate in the chemostat. He hypothesized that the half-saturation constant of the rotifers in this system, $K_R$, would be affected by salinity. Let's test his hypothesis! Attached is the data he collected, after being smoothed by exponential smoothing.

### a. Write a function called Fussmann that simulates the set of three differential equations in equations 1-3 in continuous time. The function should take 3 arguments: a length 6 vector of parameters, a length 4 vector of independent variables ($N_0$, $C_0$, $R_0$ and $\delta$), and a length $n$ vector of time steps. The function should return two vectors of length $n$ with simulated C and R values. Simulate $n=38$ time steps of this model assuming the parameters and initial values provided in Table 1. Plot the Chlorella and Rotifer populations in solid black and red, respectively, on one plot with a shared ("twin") x axis but different y axes. (20 pts)

Table 1: Example Inputs  

| Parameter | $K_C$ | $K_R$ | $B_C$ | $B_R$ | $e$  | $m$ | $\delta$ | $N_0$ | $C_0$ | $R_0$ |
| --------- | ----- | ---- | ----- | ----- | ---- | --- | -------- | ----- | ----- | ----- |
| Value     | 4.3   | 15.0 | 3.3   | 1.0   | 0.25 | 0.3 | 0.4      | 80    | 2.5     | 0.7  |

### Fussmann, G. F., Ellner, S. P., Shertzer, K. W., \& Hairston Jr, N. G. (2000). Crossing the Hopf bifurcation in a live predator-prey system. *Science*, *290*(5495), 1358-1360.

$\color{red}{\text{The code below is complete. It simulates a system of ordinary differential equations representing the Fussmann model.}}$

In [None]:
import scipy.integrate as integrate
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as ss
from scipy.optimize import differential_evolution, minimize

N0 = 80 # nitrogen concentration of the inflow, also used as the initial nitrogen level
C0 = 2.5 # initial chlorella population
R0 = 0.7 # initial rotifer population
delta = 0.4 # dilution rate

y0 = [N0, C0, R0] # initial state variables
time = np.arange(0, 38) # time points 0, 1, ..., 37 (38 time steps)

def Fussmann(y0, t, params):
    """
    Right-hand side of the Fussmann predator-prey ODE system.

    Parameters:
    _______________________________________________________
    y0:     current state [N, C, R]
    t:      current time (unused in the equations, kept for ODE solvers
            that pass it, e.g. scipy.integrate.odeint)
    params: length 6 sequence, in the order of Table 1
        K_C: params[0]  # Half saturation constant of Chlorella
        K_R: params[1]  # Half saturation constant of Rotifers
        B_C: params[2]  # Growth rate of Chlorella
        B_R: params[3]  # Growth rate of Rotifers
        e:   params[4]  # Assimilation efficiency of Rotifers
        m:   params[5]  # Death rate of Rotifers

    Returns:
    _______________________________________________________
    dN: Rate of change of Nitrogen
    dC: Rate of change of Chlorella
    dR: Rate of change of Rotifers
    """
    ## Parameters
    # Whole system parameters: delta and N0 are read from the global scope

    # Model parameters (same order as Table 1)
    K_C = params[0] # Half Saturation constant for Chlorella
    K_R = params[1] # Half saturation constant for Rotifers
    B_C = params[2] # Growth rate of Chlorella
    B_R = params[3] # Growth rate of Rotifers
    e = params[4] # Assimilation rate of Rotifers
    m = params[5] # Death rate of Rotifers

    ## current state variables
    N = y0[0]
    C = y0[1]
    R = y0[2]

    ##Differential equations
    dN = delta*(N0 - N) - (B_C*N)/(K_C + N)*C # rate of change in nitrogen
    dC = (B_C*N)/(K_C + N)*C - ((B_R*C) / (K_R + C)*R/e) - delta*C # rate of change in chlorella population
    dR = (B_R*C)/(K_R + C)*R - (delta + m) * R # rate of change of rotifer population

    return [dN, dC, dR]

$\color{red}{\text{Run simulation with default parameters and plot resulting time series below.}}$
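One possible way to run the simulation, sketched under stated assumptions: the block restates equations 1-3 in a stand-alone function (`chemostat_rhs`, a hypothetical name chosen so the example runs on its own without the globals that `Fussmann` relies on) and integrates it with `scipy.integrate.odeint` using the Table 1 values. It is an illustration of the solver call, not the required solution.

```python
import numpy as np
from scipy.integrate import odeint

def chemostat_rhs(y, t, params, delta, N_in):
    """Equations 1-3; hypothetical stand-alone variant of the model."""
    K_C, K_R, B_C, B_R, e, m = params
    N, C, R = y
    uptake = B_C * N / (K_C + N) * C      # nitrogen uptake by chlorella
    grazing = B_R * C / (K_R + C) * R     # rotifer growth from grazing
    dN = delta * (N_in - N) - uptake
    dC = uptake - grazing / e - delta * C
    dR = grazing - (delta + m) * R
    return [dN, dC, dR]

# Table 1 values
params = [4.3, 15.0, 3.3, 1.0, 0.25, 0.3]
delta, N0, C0, R0 = 0.4, 80.0, 2.5, 0.7
t = np.arange(0, 38)

# odeint returns one row per time point and one column per state variable
sol = odeint(chemostat_rhs, [N0, C0, R0], t, args=(params, delta, N0))
C_sim, R_sim = sol[:, 1], sol[:, 2]
```

Passing `delta` and the inflow concentration through `args` keeps the function free of global state, which makes it easier to reuse inside an objective function later.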

In [None]:
params = [4.3, 15.0, 3.3, 1.0, 0.25, 0.3]

# run simulation and make plot here


### b. Now let's use this model to fit Alex's data, which is a time series of 38 time steps. The values of $N_0$, $C_0$, $R_0$ and $\delta$ were measured in the lab and are the same as those listed in Table 1. The other 6 parameters need to be estimated from the data. Generate a Latin hypercube of 10 samples of these 6 parameters over their bounds, provided in Table 2 (set the seed to make it reproducible). Use these to initialize the population of a search with Differential Evolution for the set of parameters that maximizes the average NSE of simulated and observed chlorella and rotifer populations under each salinity (again, set the seed to make it reproducible). Report the average NSE and parameter estimates of each fit. Note that the Algae columns correspond to Chlorella populations. Hint: You will need to update your Fussmann function to take observed C and R values as an argument and return the negative of the average NSE when those arguments are passed. (20 pts)

Table 2: Parameter Ranges  

| Parameter   | Lower Bound | Upper Bound |
| ----------- | ----------- | ----------- |
| $ K_C $     | 1.0         | 25.0        |
| $ K_R $     | 1.0         | 50.0        |
| $ B_C $     | 1.0         | 10.0        |
| $ B_R $     | 0.1         | 10.0        |
| $ e $       | 0.01        | 0.50        |
| $ m $       | 0.0         | 0.9         |

$\color{red}{\text{The code below is complete. It reads in the data and then organizes it into a dictionary, dic_dfs,}}$  
$\color{red}{\text{where each salinity has a data frame of rotifer and chlorella populations associated with it.}}$

In [None]:
# read in experimental data
from google.colab import drive
import pandas as pd

drive.mount('/content/drive')
data = pd.read_csv("drive/MyDrive/Colab Notebooks/CE4110_6250/ExpData.csv")

# create dictionaries with the key being the salinity and the value being the associated observed dataset
salinities = []

for colname in data.columns:
    salinity = colname.split(".")[-1]
    if salinity not in salinities:
        salinities.append(salinity)

dic_dfs = {}
for salinity in salinities:
    for colname in data.columns:
        if colname.split(".")[-1] == salinity:
            if colname.split(".")[0].split(" ")[-1] == "Rotifers":
                S_R = data[colname]
                S_R.name = "R"
            if colname.split(".")[0].split(" ")[-1] == "Algae":
                S_C = data[colname]
                S_C.name = "C"
    dic_dfs[salinity] = pd.concat([S_R,S_C], axis = 1)
dic_dfs

$\color{red}{\text{Complete the function below to calculate the objective function: average NSE of chlorella and rotifer populations. Negate to minimize.}}$
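For reference, the Nash-Sutcliffe efficiency (NSE) compares the squared simulation error with the spread of the observations around their mean: $\text{NSE} = 1 - \sum_t (o_t - s_t)^2 / \sum_t (o_t - \bar{o})^2$. A value of 1 is a perfect fit, and 0 means the model does no better than the observed mean. A minimal sketch of that formula follows; the helper name `nse` and the toy arrays are illustrative assumptions, not part of the assignment scaffold.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency of a simulated series against observations."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Toy data for illustration only
obs = np.array([1.0, 2.0, 3.0, 4.0])
perfect = nse(obs, obs)                               # identical series
mean_only = nse(obs, np.full_like(obs, obs.mean()))   # predicting the observed mean
shifted = nse(obs, obs + 0.5)                         # constant bias of 0.5
```

An objective for a minimizer would average the chlorella and rotifer NSE values and return the negative of that average.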

In [None]:
def ObjFunc(params, y0, t, data):
    """
    The Fitness Function: the average of Rotifer NSE and Chlorella NSE.
    """
    # compute objective function

    return ObjFunc

$\color{red}{\text{Complete the code below to calibrate with DE}}$
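The sampling and initialization steps can be sketched as follows, assuming SciPy's `scipy.stats.qmc` module is available. The Table 2 bounds are real; `toy_objective` and `target` are stand-ins invented for this example so that it runs without the experimental data, and they should be swapped for the actual fitness function.

```python
import numpy as np
from scipy.stats import qmc
from scipy.optimize import differential_evolution

# Table 2 bounds, ordered K_C, K_R, B_C, B_R, e, m
bounds = [(1.0, 25.0), (1.0, 50.0), (1.0, 10.0), (0.1, 10.0), (0.01, 0.50), (0.0, 0.9)]
lower = [b[0] for b in bounds]
upper = [b[1] for b in bounds]

# 10 Latin hypercube points in the unit cube, rescaled to the parameter bounds
sampler = qmc.LatinHypercube(d=len(bounds), seed=1)
x0 = qmc.scale(sampler.random(n=10), lower, upper)

# Stand-in objective (assumption): squared distance to an arbitrary target,
# normalized by each parameter's range. Replace with the real fitness function.
target = np.array([5.0, 20.0, 4.0, 2.0, 0.2, 0.3])
widths = np.array(upper) - np.array(lower)

def toy_objective(x, target, widths):
    return np.sum(((x - target) / widths) ** 2)

result = differential_evolution(
    toy_objective,
    bounds,
    args=(target, widths),
    init=x0,      # use the Latin hypercube sample as the initial population
    seed=1,       # makes the random mutation and crossover steps reproducible
)
```

`result.x` holds the best parameter vector and `result.fun` the objective value at that point. When the objective is a negated NSE, the sign has to be flipped back before reporting.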

In [None]:
# specify parameter bounds
bounds =

# generate initial population of 10 from Latin hypercube sample with seed=1
x0

# initialize data frame of 0s that will later store results
DE_results = pd.DataFrame(columns=["K_C","K_R","B_C","B_R","e","m","NSE"],index=salinities,data=0.0)

# loop through salinities and optimize parameters with DE
for salinity in dic_dfs.keys():
    df_obs = dic_dfs[salinity] # data of chlorella and rotifer populations at this salinity
    # use DE to optimize ObjFunc
    result =

    # fill data frame with parameter values and objective value (finish code below)
    DE_results.loc[salinity, DE_results.columns[0:len(bounds)]] =
    DE_results.loc[salinity, "NSE"] =

DE_results

### c. Using each of the Latin hypercube samples from part b as a starting point, use scipy.optimize.minimize with the default solver BFGS to find a set of parameters that maximizes the average NSE of simulated and observed chlorella and rotifer populations under each salinity. Report the average NSE of the best fit at each salinity. Which algorithm achieves better NSE values, and what does that suggest? (20 pts)

$\color{red}{\text{Complete the code below to calibrate with BFGS}}$
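The multi-start pattern described in part c can be sketched on a toy problem. The function `two_basin` is an assumption made up for this example: it has one shallow and one deep local minimum, so the result of a single BFGS run depends on where it starts.

```python
import numpy as np
from scipy.stats import qmc
from scipy.optimize import minimize

def two_basin(x):
    """Toy objective (assumption) with a shallow and a deep local minimum."""
    return (x[0] ** 2 - 1.0) ** 2 + x[1] ** 2 + 0.3 * x[0]

# 10 Latin hypercube starting points in [-2, 2] x [-2, 2]
sampler = qmc.LatinHypercube(d=2, seed=1)
starts = qmc.scale(sampler.random(n=10), [-2.0, -2.0], [2.0, 2.0])

best = None
for start in starts:
    res = minimize(two_basin, start, method="BFGS")
    # keep the run with the lowest objective value seen so far
    if best is None or res.fun < best.fun:
        best = res
```

Because BFGS takes no bounds, a run started inside the Table 2 ranges can still end outside them, which is worth checking when comparing its estimates with those from DE.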

In [None]:
# generate Latin hypercube sample of 10 initial starting points using seed=1
x0

# initialize data frame of 0s that will later store results
BFGS_results = pd.DataFrame(columns=["K_C","K_R","B_C","B_R","e","m","NSE"],index=salinities,data=0.0)

# loop through salinities and optimize parameters with BFGS
for salinity in dic_dfs.keys():
    df_obs = dic_dfs[salinity] # data of chlorella and rotifer populations at this salinity
    bestObj = np.inf # initialize the best objective value so far for this salinity at infinity
    # loop through initial starting points and update bestObj value and corresponding result if it improved


    # fill data frame with parameter values and objective value (finish code below)
    BFGS_results.loc[salinity, BFGS_results.columns[0:len(bounds)]] = # parameter values
    BFGS_results.loc[salinity, "NSE"] = # objective value

BFGS_results



### d. Make a 4x2 plot of the time series of observed and modeled chlorella and rotifer populations. Each row will represent a different salinity, the left column will show the differential evolution results, and the right column will show the gradient-based method (BFGS) results. Use a solid black line to represent the observed chlorella population and a dashed black line to represent the modeled chlorella population (hint: pass linestyle="dashed" to ax.plot). On the opposite y axis, but with a shared ("twin") x axis, use a solid red line to represent the observed rotifer population and a dashed red line to represent the modeled rotifer population. Which algorithm's estimated parameters produce dynamics closer to the observations? (20 pts)

$\color{red}{\text{Fill in the code below to make your plot.}}$
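The layout mechanics (twin y axes in each panel, dashed versus solid lines, and a single figure-level legend) can be sketched with synthetic series. All data below are made up for illustration, and the same four series are repeated in every panel; the real plot would use the observed data and the simulations from each calibration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in series (assumption): replace with observed and simulated data
t = np.arange(38)
obs_C, sim_C = 50 + 10 * np.sin(t / 5), 50 + 9 * np.sin(t / 5 + 0.2)
obs_R, sim_R = 5 + np.cos(t / 5), 5 + 0.9 * np.cos(t / 5 + 0.2)

n_rows, n_cols = 4, 2
fig, ax = plt.subplots(n_rows, n_cols, sharex=True)
fig.set_size_inches([10, 8])

for i in range(n_rows):
    for j in range(n_cols):
        ax_R = ax[i, j].twinx()   # rotifer axis shares x with the chlorella axis
        lines = [
            ax[i, j].plot(t, obs_C, color="black", label="Observed chlorella")[0],
            ax[i, j].plot(t, sim_C, color="black", linestyle="dashed", label="Modeled chlorella")[0],
            ax_R.plot(t, obs_R, color="red", label="Observed rotifers")[0],
            ax_R.plot(t, sim_R, color="red", linestyle="dashed", label="Modeled rotifers")[0],
        ]

# one legend for the whole figure, built from the last panel's four lines
fig.legend(handles=lines, loc="lower center", ncol=4)
fig.tight_layout(rect=[0, 0.05, 1, 1])   # leave room at the bottom for the legend
```

Collecting the line handles from both axes matters because a legend requested from one axis does not see lines drawn on its twin.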

In [None]:
fig, ax = plt.subplots(4,2)
fig.set_size_inches([10,8])

for i, salinity in enumerate(dic_dfs.keys()): # i is the subplot row for this salinity
    df_obs = dic_dfs[salinity] # observed population data for this salinity
    for j in range(2):
        if j == 0:
            # get simulated chlorella and rotifer populations from best DE solution for this salinity

        else:
            # get simulated chlorella and rotifer populations from best BFGS solution for this salinity


        # plot simulated and observed populations

        if i == 0:
            if j == 0:
                ax[i,j].set_title("DE Calibration",fontsize=16)
            else:
                ax[i,j].set_title("BFGS Calibration", fontsize=16)

# add a legend to the figure
