# Derivative of fitness with respect to traits

Here I test my analytical solution for
$\frac{ \partial F }{ \partial \mathbf{V_i} }$
(given in the Equations section below)
by computing the same derivative with the `theano` package
and comparing the two sets of results.




## Importing packages and setting options

In [2]:
%env OMP_NUM_THREADS=4
%env THEANO_FLAGS='openmp=True'
import sympy
import theano
theano.config.cxx = ""
import theano.tensor as T
import numpy as np
import pandas as pd
from tqdm import tqdm
import math
pd.options.display.max_columns = 10

env: OMP_NUM_THREADS=4
env: THEANO_FLAGS='openmp=True'


## Equations

__Notes:__

- ${}^\text{T}$ represents transpose.
- Symbols in __bold__ are matrices; $\mathbf{V}_i$ is a row vector of traits.
- Multiplication between matrices is always matrix multiplication, not
  element-wise


The equations for the fitness of species $i$ ($F_i$)
and the partial derivative of $F_i$ with respect
to the traits of species $i$
($\frac{ \partial F }{ \partial \mathbf{V_i} }$)
are as follows:

\begin{align}
F_{i} &= \exp \left\{
        r_0 - f ~ \mathbf{V}_i ~ \mathbf{C} ~ \mathbf{V}_{i}^{\text{T}} -
        \alpha_0 ~ \text{e}^{- \mathbf{V}_i \mathbf{V}_i^{\text{T}} } ~
        \mathbf{\Omega}_i
    \right\} \\
    \mathbf{\Omega}_i &\equiv N_{i} +
        \sum_{j \ne i}^{n}{ N_{j} ~ \text{e}^{
                -d \mathbf{V}_{j} \mathbf{V}_{j}^{\text{T}}} } \\[2ex]
    \frac{ \partial F }{ \partial \mathbf{V_i} } &= 
        \exp \{ \ldots \} ~
        \left[
            2 ~ \alpha_0 ~ \mathbf{\Omega}_i ~ \text{e}^{- \mathbf{V}_i \mathbf{V}_i^{\text{T}}} ~ \mathbf{V}_i
            - 2 ~ f ~ \mathbf{V}_i \mathbf{C}
        \right]
\end{align}
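
As a sketch of where the derivative comes from, the steps below apply the chain rule to the exponent. The symbol $E_i$ is introduced here only as shorthand for the exponent. The steps assume that $\mathbf{C}$ is symmetric, which holds in this notebook because $\mathbf{C}$ has ones on the diagonal and a single value `eta` everywhere else. They also use the fact that $\mathbf{\Omega}_i$ does not depend on $\mathbf{V}_i$, since its sum runs over $j \ne i$.

\begin{align}
E_i &\equiv r_0 - f ~ \mathbf{V}_i \mathbf{C} \mathbf{V}_i^{\text{T}} -
        \alpha_0 ~ \mathbf{\Omega}_i ~ \text{e}^{- \mathbf{V}_i \mathbf{V}_i^{\text{T}}},
        \qquad F_i = \exp \{ E_i \} \\
    \frac{ \partial }{ \partial \mathbf{V}_i }
        \left( \mathbf{V}_i \mathbf{C} \mathbf{V}_i^{\text{T}} \right) &=
        \mathbf{V}_i \left( \mathbf{C} + \mathbf{C}^{\text{T}} \right) =
        2 ~ \mathbf{V}_i \mathbf{C} \\
    \frac{ \partial }{ \partial \mathbf{V}_i }
        \text{e}^{- \mathbf{V}_i \mathbf{V}_i^{\text{T}}} &=
        -2 ~ \text{e}^{- \mathbf{V}_i \mathbf{V}_i^{\text{T}}} ~ \mathbf{V}_i \\
    \frac{ \partial F_i }{ \partial \mathbf{V}_i } &=
        F_i ~ \frac{ \partial E_i }{ \partial \mathbf{V}_i } =
        F_i \left[
            2 ~ \alpha_0 ~ \mathbf{\Omega}_i ~
            \text{e}^{- \mathbf{V}_i \mathbf{V}_i^{\text{T}}} ~ \mathbf{V}_i
            - 2 ~ f ~ \mathbf{V}_i \mathbf{C}
        \right]
\end{align}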


## Read CSV of simulated datasets

In [3]:
sims = pd.read_csv("simulated_data.csv")
sims.head()

Unnamed: 0,V1,V2,V3,V4,V5,...,f,a0,eta,r0,d
0,5.329784,-0.593159,0.003065,1.414273,-6.458124,...,0.137235,0.104261,0.063997,0.343463,-0.118705
1,-1.514917,-1.024847,5.413096,-4.548136,1.542865,...,0.600063,0.197839,0.103529,0.279827,-0.158496
2,-9.969353,0.930724,2.855755,8.144096,3.640262,...,0.537799,0.202685,-0.088763,0.303346,-0.159742
3,3.821274,-3.732219,-2.680385,-1.586652,-9.75577,...,0.123312,0.117315,-0.08224,0.136664,0.103837
4,3.291826,0.708288,-5.28158,6.224788,-0.271641,...,0.560044,0.054967,0.046302,0.254523,-0.125201
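
Each row of `sims` stores one dataset in flat form. The cells further down unpack the `V` columns with a column-major reshape (`order = 'F'`) and build `C` from `eta`. The sketch below illustrates those two steps on a toy example; the sizes and the value of `eta` are made up for illustration and are not taken from `simulated_data.csv`.

```python
import numpy as np

# Hypothetical toy sizes, not taken from simulated_data.csv
n, q = 4, 3          # n species, q traits
eta = 0.1            # made-up off-diagonal value for C

# Flat trait values labelled 0..11 so their positions are easy to follow
flat_V = np.arange(n * q, dtype=float)

# Column-major ('F') reshape: the first n values fill column 0,
# the next n values fill column 1, and so on
V = flat_V.reshape((n, q), order='F')
print(V)

# C has ones on the diagonal and eta everywhere else, so it is symmetric
C = np.zeros((q, q)) + eta
np.fill_diagonal(C, 1.0)
print(C)
```

With this layout, row `V[i, :]` holds all traits of species `i`, which is the vector that the gradient functions operate on.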


In [7]:
def automatic(i, V, O, C, f, a0, r0):
    """Automatic differentiation using theano pkg"""
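    # Symbolic placeholder for the trait vector of species i
    Vi = T.dvector('Vi')
    # Fitness as a symbolic expression of Vi (O is treated as a constant)
    F = T.exp(
        r0 - f * T.dot(T.dot(Vi, C), Vi.T) -
        a0 * O * T.exp(-1 * T.dot(Vi, Vi.T))
    )
    # Gradient of the scalar F with respect to Vi
    J = T.grad(F, Vi)
    # Build a callable from the graph and evaluate it at the observed traits
    num_fun = theano.function([Vi], J)
    out_array = num_fun(V[i,:]).T
    return out_array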

In [8]:
def symbolic(i, V, O, C, f, a0, r0):
    """Hand-derived (symbolic) derivative, evaluated with numpy"""
    Vi = V[i,:]
    F = np.exp(
        r0 - f * np.dot(np.dot(Vi, C), Vi.T) -
        a0 * O * np.exp(-1 * np.dot(Vi, Vi.T))
    )
    dF = F * (
        ( 2 * a0 * O * np.exp(-1 * Vi @ Vi.T) * Vi ) -
        (2 * f * Vi @ C)
    )
    return dF
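
A third, independent check that needs only NumPy is a central finite difference. The sketch below is not part of the original comparison. The function names and the input values are hypothetical, chosen only to illustrate the idea, and the hand-derived gradient again assumes a symmetric `C`.

```python
import numpy as np

def fitness(Vi, O, C, f, a0, r0):
    """Fitness F for one species, given its 1-D trait vector Vi."""
    return np.exp(r0 - f * Vi @ C @ Vi - a0 * O * np.exp(-Vi @ Vi))

def analytic_gradient(Vi, O, C, f, a0, r0):
    """Hand-derived dF/dVi (assumes C is symmetric)."""
    F = fitness(Vi, O, C, f, a0, r0)
    return F * (2 * a0 * O * np.exp(-Vi @ Vi) * Vi - 2 * f * Vi @ C)

def finite_difference_gradient(Vi, O, C, f, a0, r0, h=1e-6):
    """Central-difference approximation to dF/dVi."""
    grad = np.empty(Vi.size)
    for k in range(Vi.size):
        step = np.zeros(Vi.size)
        step[k] = h  # perturb one trait at a time
        upper = fitness(Vi + step, O, C, f, a0, r0)
        lower = fitness(Vi - step, O, C, f, a0, r0)
        grad[k] = (upper - lower) / (2 * h)
    return grad

# Made-up inputs for illustration only
Vi = np.array([0.5, -0.3, 0.8])
C = np.full((3, 3), 0.1)
np.fill_diagonal(C, 1.0)
O, f, a0, r0 = 12.0, 0.2, 0.1, 0.3

analytic = analytic_gradient(Vi, O, C, f, a0, r0)
numeric = finite_difference_gradient(Vi, O, C, f, a0, r0)
print(analytic)
print(numeric)
```

A finite difference is far less precise than automatic differentiation, so agreement to around six or more significant digits is about the best that can be expected from it.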

In [11]:
def compare_methods(sim_i, abs = False):
    """Compare answers from symbolic and automatic methods"""

    # Fill info from data frame:
    N = sims.loc[sim_i, [x.startswith("N") for x in sims.columns]].values
    V = sims.loc[sim_i, [x.startswith("V") for x in sims.columns]].values
    n, q = (N.size, int(V.size / N.size))
    V = V.reshape((n, q), order = 'F')
    f = sims.loc[sim_i,"f"]
    a0 = sims.loc[sim_i,"a0"]
    eta = sims.loc[sim_i,"eta"]
    r0 = sims.loc[sim_i,"r0"]
    d = sims.loc[sim_i,"d"]
    C = np.zeros((q, q)) + eta
    np.fill_diagonal(C,1.0)

    # Create output array:
    diffs = np.empty((n, 4))
    diffs[:,0] = sim_i

    # Fill output array:
    for i in range(0, n):
        O = N[i] + np.sum([np.exp(-d * np.dot(V[j,:], V[j,:].T)) * N[j] 
            for j in range(0, N.size) if j != i])
        auto = automatic(i, V, O, C, f, a0, r0)
        sym = symbolic(i, V, O, C, f, a0, r0)
        if abs or np.any(sym == 0):
            diff = auto - sym
        else:
            diff = (auto - sym) / sym
        diffs[i, 1] = i
        diffs[i, 2] = diff.min()
        diffs[i, 3] = diff.max()

    return diffs
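
The loop in `compare_methods` recomputes the sum inside `O` for every species. As a sketch of an alternative, the same values can be obtained for all species at once by computing each species' weighted term a single time. The helper names `omega_loop` and `omega_all` and the random inputs are hypothetical.

```python
import numpy as np

def omega_loop(i, N, V, d):
    """Omega for species i, written like the loop in compare_methods."""
    return N[i] + np.sum([np.exp(-d * np.dot(V[j, :], V[j, :])) * N[j]
                          for j in range(N.size) if j != i])

def omega_all(N, V, d):
    """Omega for every species at once (hypothetical helper)."""
    # Weighted term N_j * exp(-d * V_j V_j^T) for each species j
    w = N * np.exp(-d * np.sum(V ** 2, axis=1))
    # Sum over all j, drop the j == i term, then add N_i
    return N + w.sum() - w

# Made-up inputs for illustration only
rng = np.random.default_rng(0)
N = rng.uniform(1, 10, size=4)
V = rng.normal(size=(4, 3))
d = -0.1

loop = np.array([omega_loop(i, N, V, d) for i in range(N.size)])
vectorized = omega_all(N, V, d)
print(np.allclose(loop, vectorized))
```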


### Example of using `compare_methods`:

In [12]:
diffs = compare_methods(0)
# Worst case examples:
print(diffs[:,2].min())
print(diffs[:,3].max())

-3.4924628219779773e-16
1.2246603754998577e-16


## Comparing methods

This takes between 1 and 2 minutes (about 80 seconds in the run shown here).

In [13]:
n_per_rep = 4
diffs = np.empty((int(n_per_rep * 100), 4))

In [14]:
for rep in tqdm(range(100)):
    diffs_r = compare_methods(rep)
    diffs[(rep * n_per_rep):((rep+1) * n_per_rep),:] = diffs_r

100%|██████████| 100/100 [01:21<00:00,  1.27it/s]


## The results
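Across the 100 datasets, the two methods agree closely: the worst-case differences printed below are on the order of $10^{-13}$ or smaller (relative differences, or absolute ones where a symbolic value is exactly zero). That is close enough that I feel comfortable with my symbolic version.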

In [15]:
print(diffs[:,2].min())
print(diffs[:,3].max())

-2.5537767027145596e-14
4.050697280564236e-13
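
To put these numbers in context, it can help to express them as multiples of double-precision machine epsilon. The sketch below does that for the two values printed above. Reading the result as accumulated rounding error is my interpretation, not something this comparison proves.

```python
import numpy as np

# Spacing between 1.0 and the next representable float64
eps = np.finfo(np.float64).eps

# Worst-case differences copied from the output above
worst = np.array([-2.5537767027145596e-14, 4.050697280564236e-13])

# Express each difference as a multiple of machine epsilon
in_eps = np.abs(worst) / eps
print(in_eps)
```

Both values come out at a few hundred to a couple of thousand machine epsilons, which is plausible for a calculation that chains dot products and exponentials.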


## Write output to file

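So that the R version can be checked against it, I write the output of the symbolic version for all 100 datasets to a CSV file. Each row holds one dataset, and the derivative for species `i` occupies columns `i*q` to `(i+1)*q - 1`.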

In [16]:
n = np.sum([x.startswith("N") for x in sims.columns])
q = int(np.sum([x.startswith("V") for x in sims.columns]) / n)
# Output array
results = np.zeros((100, n * q))

for sim_i in range(100):
    
    # Fill info from data frame:
    N = sims.loc[sim_i, [x.startswith("N") for x in sims.columns]].values
    V = sims.loc[sim_i, [x.startswith("V") for x in sims.columns]].values
    V = V.reshape((n, q), order = 'F')
    f = sims.loc[sim_i,"f"]
    a0 = sims.loc[sim_i,"a0"]
    eta = sims.loc[sim_i,"eta"]
    r0 = sims.loc[sim_i,"r0"]
    d = sims.loc[sim_i,"d"]
    C = np.zeros((q, q)) + eta
    np.fill_diagonal(C,1.0)

    # Fill output array:
    for i in range(0, n):
        O = N[i] + np.sum([np.exp(-d * np.dot(V[j,:], V[j,:].T)) * N[j] 
            for j in range(0, N.size) if j != i])
        sym = symbolic(i, V, O, C, f, a0, r0)
        results[sim_i, (i*q):((i+1)*q)] = sym.flatten()

# Make sure first and last aren't zeros:
results[[0, 99], :]

array([[-1.08777785e-04,  1.27757832e-04, -5.12452120e-05,
         2.01106132e-03,  1.13821655e-02,  3.54123378e-03,
         7.75063810e-15, -5.27256584e-12,  5.78240634e-13,
        -3.58300160e-02, -8.15820315e-02, -4.17550951e-02],
       [ 8.48828081e-14, -4.98771984e-14, -7.55144202e-14,
        -5.02909842e-17,  2.04833125e-17, -6.79434569e-17,
        -1.26465961e-05,  1.38184874e-05, -2.49250565e-05,
        -1.83711053e-06, -1.38219800e-06,  4.03381302e-06]])

In [17]:
np.savetxt('results/dF_dVi.csv', results, delimiter=',')
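
Before comparing against R, it may be worth confirming that the CSV round-trips without losing precision. The sketch below does this on a small stand-in array written to a temporary directory, so it leaves `results/dF_dVi.csv` alone. The array `demo` reuses a few values from the output above purely for illustration. Note that `np.savetxt` writes no header row and does not create missing directories, so the `results` folder has to exist beforehand.

```python
import os
import tempfile
import numpy as np

# Small stand-in for `results`, for illustration only
demo = np.array([[-1.08777785e-04, 7.75063810e-15],
                 [8.48828081e-14, -5.02909842e-17]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'dF_dVi.csv')
    np.savetxt(path, demo, delimiter=',')
    # ndmin=2 keeps the result 2-D even if the file has a single row
    back = np.loadtxt(path, delimiter=',', ndmin=2)

# Compare with a tight relative tolerance and no absolute tolerance,
# because the values themselves are very small
print(np.allclose(demo, back, rtol=1e-15, atol=0.0))
```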