# Derivative of fitness with respect to traits

Here I test my analytical solution for
$\frac{ \partial F }{ \partial \mathbf{V_i} }$
(given in the Equations section below)
by computing the same derivatives with automatic differentiation
in the `theano` package and comparing the two sets of results.




## Importing packages and setting options

In [1]:
%env OMP_NUM_THREADS=4
%env THEANO_FLAGS='openmp=True'
import sympy
import theano
theano.config.cxx = ""
import theano.tensor as T
import numpy as np
import pandas as pd
from tqdm import tqdm
import math
pd.options.display.max_columns = 10

env: OMP_NUM_THREADS=4
env: THEANO_FLAGS='openmp=True'


## Equations

__Notes:__

- ${}^\text{T}$ represents the transpose.
- Elements in __bold__ are matrices (including row vectors such as $\mathbf{V}_i$).
- Multiplication between matrices is always matrix multiplication, not
  element-wise.


The equations for the fitness of species $i$ ($F_i$)
and the partial derivative of $F_i$ with respect
to the traits of species $i$
($\frac{ \partial F_i }{ \partial \mathbf{V}_i }$)
are as follows:

\begin{align}
F_{i} &= \exp \left\{
        r_0 - f ~ \mathbf{V}_i ~ \mathbf{C} ~ \mathbf{V}_{i}^{\text{T}} -
        \alpha_0 ~ \text{e}^{- \mathbf{V}_i \mathbf{V}_i^{\text{T}} } ~
        \mathbf{\Omega}_i
    \right\} \\
    \mathbf{\Omega}_i &\equiv N_{i} +
        \sum_{j \ne i}^{n}{ N_{j} ~ \text{e}^{
                - \mathbf{V}_{j} \mathbf{D} \mathbf{V}_{j}^{\text{T}}} } \\[2ex]
    \frac{ \partial F_i }{ \partial \mathbf{V}_i } &= 
        F_i ~
        \left[
            2 ~ \alpha_0 ~ \mathbf{\Omega}_i ~ \text{e}^{- \mathbf{V}_i \mathbf{V}_i^{\text{T}}} ~ \mathbf{V}_i
            - 2 ~ f ~ \mathbf{V}_i \mathbf{C}
        \right]
\end{align}
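
The intermediate steps behind the derivative can be sketched as follows. This is a reconstruction rather than part of the original derivation: $g$ is shorthand introduced here for the exponent, and the second line assumes that $\mathbf{C}$ is symmetric (which holds when $\mathbf{C}$ has ones on the diagonal and a single constant everywhere else). It also uses the fact that $\mathbf{\Omega}_i$ does not depend on $\mathbf{V}_i$.

\begin{align}
g(\mathbf{V}_i) &\equiv r_0 - f ~ \mathbf{V}_i \mathbf{C} \mathbf{V}_i^{\text{T}} -
    \alpha_0 ~ \mathbf{\Omega}_i ~ \text{e}^{- \mathbf{V}_i \mathbf{V}_i^{\text{T}}},
    \qquad F_i = \text{e}^{g(\mathbf{V}_i)} \\
\frac{ \partial }{ \partial \mathbf{V}_i }
    \left( \mathbf{V}_i \mathbf{C} \mathbf{V}_i^{\text{T}} \right) &=
    \mathbf{V}_i \left( \mathbf{C} + \mathbf{C}^{\text{T}} \right) =
    2 ~ \mathbf{V}_i \mathbf{C} \\
\frac{ \partial }{ \partial \mathbf{V}_i }
    \text{e}^{- \mathbf{V}_i \mathbf{V}_i^{\text{T}}} &=
    - 2 ~ \text{e}^{- \mathbf{V}_i \mathbf{V}_i^{\text{T}}} ~ \mathbf{V}_i \\
\frac{ \partial F_i }{ \partial \mathbf{V}_i } &=
    F_i ~ \frac{ \partial g }{ \partial \mathbf{V}_i } =
    F_i ~ \left[
        2 ~ \alpha_0 ~ \mathbf{\Omega}_i ~ \text{e}^{- \mathbf{V}_i \mathbf{V}_i^{\text{T}}} ~ \mathbf{V}_i
        - 2 ~ f ~ \mathbf{V}_i \mathbf{C}
    \right]
\end{align}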


## Read CSV of simulated datasets

In [2]:
sims = pd.read_csv("simulated_data.csv")
sims.head()

| Unnamed: 0 | V1 | V2 | V3 | V4 | V5 | ... | f | a0 | eta | r0 | d |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4.94511 | 2.869199 | 6.747126 | 6.142522 | 5.629532 | ... | 0.06889 | 0.112113 | -0.33115 | 1.422746 | -0.091228 |
| 1 | 0.718846 | 1.220364 | 0.815571 | 0.868633 | 0.838021 | ... | 0.309021 | 0.057579 | 0.094811 | 1.237047 | 0.003429 |
| 2 | 3.369285 | 1.912974 | 3.131174 | 0.046303 | 1.416252 | ... | 0.118318 | 0.40141 | -0.036977 | 1.746024 | 0.01216 |
| 3 | 0.373669 | 0.283873 | 0.237735 | 0.053632 | 0.062281 | ... | 0.497286 | 0.49973 | 0.117188 | 0.669199 | 0.081612 |
| 4 | 3.562637 | 1.635016 | 5.724176 | 4.953962 | 1.060083 | ... | 0.042638 | 0.307171 | -0.467453 | 0.952351 | 0.051834 |


In [3]:
def automatic(i, V, O, C, f, a0, r0):
    """Automatic differentiation using theano pkg"""
    Vi = T.dvector('Vi')
    F = T.exp(
        r0 - f * T.dot(T.dot(Vi, C), Vi.T) -
        a0 * O * T.exp(-1 * T.dot(Vi, Vi.T))
    )
    J = T.grad(F, Vi)
    num_fun = theano.function([Vi], J)
    out_array = num_fun(V[i,:]).T
    return out_array

In [4]:
def symbolic(i, V, O, C, f, a0, r0):
    """Hand-derived (symbolic) derivative, evaluated with numpy"""
    Vi = V[i,:]
    F = np.exp(
        r0 - f * np.dot(np.dot(Vi, C), Vi.T) -
        a0 * O * np.exp(-1 * np.dot(Vi, Vi.T))
    )
    dF = F * (
        ( 2 * a0 * O * np.exp(-1 * Vi @ Vi.T) * Vi ) -
        (2 * f * Vi @ C)
    )
    return dF
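
A third, independent check that needs only NumPy is a central finite-difference approximation. The sketch below is not part of the original comparison: the function names (`fitness`, `analytic_grad`, `finite_diff_grad`), the step size `h`, and all parameter values are illustrative assumptions, and the analytic formula again assumes a symmetric `C`.

```python
import numpy as np

def fitness(Vi, O, C, f, a0, r0):
    """Fitness F_i for one species with trait vector Vi."""
    return np.exp(r0 - f * (Vi @ C @ Vi) - a0 * O * np.exp(-(Vi @ Vi)))

def analytic_grad(Vi, O, C, f, a0, r0):
    """Hand-derived gradient of F_i w.r.t. Vi (assumes C is symmetric)."""
    F = fitness(Vi, O, C, f, a0, r0)
    return F * (2 * a0 * O * np.exp(-(Vi @ Vi)) * Vi - 2 * f * (Vi @ C))

def finite_diff_grad(Vi, O, C, f, a0, r0, h=1e-6):
    """Central finite-difference approximation, one trait at a time."""
    grad = np.empty_like(Vi)
    for k in range(Vi.size):
        step = np.zeros_like(Vi)
        step[k] = h
        grad[k] = (fitness(Vi + step, O, C, f, a0, r0) -
                   fitness(Vi - step, O, C, f, a0, r0)) / (2 * h)
    return grad

# Illustrative values only (not taken from simulated_data.csv):
rng = np.random.default_rng(0)
q = 3
Vi = rng.uniform(0.1, 2.0, size=q)
C = np.full((q, q), 0.1)
np.fill_diagonal(C, 1.0)
O, f, a0, r0 = 5.0, 0.1, 0.3, 1.0

fd = finite_diff_grad(Vi, O, C, f, a0, r0)
an = analytic_grad(Vi, O, C, f, a0, r0)
max_abs_err = np.max(np.abs(fd - an))
print(max_abs_err)
```

Finite differences are much less precise than automatic differentiation, so agreement here should only be expected to several decimal places, not to machine precision.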

In [5]:
def compare_methods(sim_i, abs = False):
    """Compare answers from symbolic and automatic methods"""

    # Fill info from data frame:
    N = sims.loc[sim_i, [x.startswith("N") for x in sims.columns]].values
    V = sims.loc[sim_i, [x.startswith("V") for x in sims.columns]].values
    n, q = (N.size, int(V.size / N.size))
    V = V.reshape((n, q), order = 'F')
    f = sims.loc[sim_i,"f"]
    a0 = sims.loc[sim_i,"a0"]
    eta = sims.loc[sim_i,"eta"]
    r0 = sims.loc[sim_i,"r0"]
    d = sims.loc[sim_i,"d"]
    D = np.zeros((q, q))
    np.fill_diagonal(D, d)
    C = np.zeros((q, q)) + eta
    np.fill_diagonal(C,1.0)

    # Create output array:
    diffs = np.empty((n, 4))
    diffs[:,0] = sim_i

    # Fill output array:
    for i in range(0, n):
        O = N[i] + np.sum([np.exp(-1 * np.dot(np.dot(V[j,:], D), V[j,:].T)) * N[j] 
            for j in range(0, N.size) if j != i])
        auto = automatic(i, V, O, C, f, a0, r0)
        sym = symbolic(i, V, O, C, f, a0, r0)
        if abs or np.any(sym == 0):
            diff = auto - sym
        else:
            diff = (auto - sym) / sym
        diffs[i, 1] = i
        diffs[i, 2] = diff.min()
        diffs[i, 3] = diff.max()

    return diffs
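
The `reshape(..., order = 'F')` call in `compare_methods` fills the trait matrix column by column. A small sketch with stand-in integers shows the resulting layout, under the assumption (implied by that call, but not verified against the CSV here) that the `V` columns are ordered with species varying fastest:

```python
import numpy as np

n, q = 4, 3                        # 4 species, 3 traits, as in the datasets
flat = np.arange(1, n * q + 1)     # stand-ins for columns V1 ... V12
V = flat.reshape((n, q), order='F')
print(V)
# [[ 1  5  9]
#  [ 2  6 10]
#  [ 3  7 11]
#  [ 4  8 12]]
```

Row `i` of `V` then holds the traits of species `i`: the first species takes its values from `V1`, `V5`, and `V9`, and so on.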


### Example of using `compare_methods`:

In [6]:
diffs = compare_methods(0)
# Worst case examples:
print(diffs[:,2].min())
print(diffs[:,3].max())

-1.1621602222782118e-15
1.1496172980662302e-15


## Comparing methods

This takes ~2 minutes.

In [7]:
n_per_rep = 4
diffs = np.empty((int(n_per_rep * 100), 4))

In [8]:
for rep in tqdm(range(100)):
    diffs_r = compare_methods(rep)
    diffs[(rep * n_per_rep):((rep+1) * n_per_rep),:] = diffs_r

100%|██████████| 100/100 [01:46<00:00,  1.07s/it]


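## The results

The two methods agree closely. Across all 100 datasets, the relative differences (absolute differences wherever a symbolic value is exactly zero) range from about $-1.1 \times 10^{-12}$ to $4.0 \times 10^{-14}$, as shown in the output below. That is close enough that I feel comfortable with my symbolic version.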

In [9]:
print(diffs[:,2].min())
print(diffs[:,3].max())

-1.0538853658144804e-12
4.0497273168472205e-14


## Write output to file

To check that the R version gives the same answers, I write the output of the symbolic version for all 100 datasets to a CSV file.

In [10]:
n = np.sum([x.startswith("N") for x in sims.columns])
q = int(np.sum([x.startswith("V") for x in sims.columns]) / n)
# Output array
results = np.zeros((100, n * q))

for sim_i in range(100):
    
    # Fill info from data frame:
    N = sims.loc[sim_i, [x.startswith("N") for x in sims.columns]].values
    V = sims.loc[sim_i, [x.startswith("V") for x in sims.columns]].values
    V = V.reshape((n, q), order = 'F')
    f = sims.loc[sim_i,"f"]
    a0 = sims.loc[sim_i,"a0"]
    eta = sims.loc[sim_i,"eta"]
    r0 = sims.loc[sim_i,"r0"]
    d = sims.loc[sim_i,"d"]
    C = np.zeros((q, q)) + eta
    np.fill_diagonal(C,1.0)

    # Fill output array:
    for i in range(0, n):
        O = N[i] + np.sum([np.exp(-d * np.dot(V[j,:], V[j,:].T)) * N[j] 
            for j in range(0, N.size) if j != i])
        sym = symbolic(i, V, O, C, f, a0, r0)
        results[sim_i, (i*q):((i+1)*q)] = sym.flatten()

# Make sure first and last aren't zeros:
results[[0, 99], :]

array([[-0.15883992, -0.24553555, -0.07242995, -0.62185721, -0.09552216,
        -0.12070101, -0.30783602, -0.03582412,  0.04874743, -0.4177771 ,
        -0.0255063 ,  0.06148898],
       [-0.19063035, -0.46799459,  0.10642638,  0.08687698,  0.11886768,
         0.02535973,  0.09943704, -0.11836676, -0.46893461, -0.33039351,
         0.06542317, -0.15330258]])

In [11]:
np.savetxt('results/dF_dVi.csv', results, delimiter=',')
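
Before comparing against another implementation, it can help to confirm that the CSV round-trips without loss of precision. The sketch below uses a temporary file and random stand-in values (the names `results_demo` and `dF_dVi_demo.csv` are hypothetical) and relies on the default `np.savetxt` format of `'%.18e'`:

```python
import os
import tempfile
import numpy as np

# Stand-in for the real `results` array (2 datasets x 12 derivatives):
results_demo = np.random.default_rng(1).normal(size=(2, 12))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dF_dVi_demo.csv")
    np.savetxt(path, results_demo, delimiter=',')
    reloaded = np.loadtxt(path, delimiter=',')

# The default format keeps enough digits to recover each double exactly.
round_trip_exact = np.array_equal(reloaded, results_demo)
print(reloaded.shape, round_trip_exact)
```

Note that `np.savetxt` does not create missing directories, so a `results/` folder has to exist before the cell above is run.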