# Derivative of traits_i with respect to $\eta$

Here I test my closed-form solution for
$\frac{ \partial \mathbf{\hat{V}_i} }{ \partial \eta }$
(derived below)
by computing the same Jacobian with automatic differentiation in the `theano` package
and comparing the two sets of results.




## Importing packages and setting options

In [1]:
%env OMP_NUM_THREADS=4
%env THEANO_FLAGS='openmp=True'
import sympy
import theano
theano.config.cxx = ""
import theano.tensor as T
import numpy as np
import pandas as pd
from tqdm import tqdm
import math
pd.options.display.max_columns = 10

env: OMP_NUM_THREADS=4
env: THEANO_FLAGS='openmp=True'


## Equations


> THIS SECTION IS NO LONGER ACCURATE BECAUSE MORE THAN ONE $\eta$ VALUE IS NOW ALLOWED.



__Notes:__

- ${}^\text{T}$ represents transpose.
- Elements in __bold__ are matrices.
- Multiplication between matrices is always matrix multiplication, not
  element-wise.
- $\mathbf{I}$ is an identity matrix with the same number of
  rows and columns as the number of traits in $\mathbf{V_i}$.


__Most important notes:__

- $\mathbf{C}$ is a $q \times q$ matrix with $\eta$ on the off-diagonal elements
  and 1 on the diagonals.
- $\mathbf{C} = \mathbf{C}^{\text{T}}$, so $\mathbf{C} + \mathbf{C}^\text{T} = 2 ~ \mathbf{C}$
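
As a quick illustration of these two notes, the sketch below builds $\mathbf{C}$ in NumPy using the same expression that appears later in `automatic`, and checks the symmetry property. The helper name `make_C` and the values `q = 3`, `eta = 0.1` are assumptions made only for this example.

```python
import numpy as np

def make_C(q, eta):
    """Return a q x q matrix with 1 on the diagonal and eta elsewhere."""
    # eye * (1 - eta) puts (1 - eta) on the diagonal; adding eta everywhere
    # brings the diagonal back to 1 and sets every off-diagonal entry to eta.
    return np.eye(q) * (1 - eta) + np.ones((q, q)) * eta

C = make_C(3, 0.1)        # illustrative values, not taken from the data
print(C)
print(np.allclose(C, C.T))          # C is symmetric
print(np.allclose(C + C.T, 2 * C))  # so C + C^T = 2 C
```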
  

The equations for (1) the traits of species $i$ at time $t+1$
($\mathbf{\hat{V}_i} \equiv \mathbf{V}_{i,t+1}$)
and (2) the partial derivative of those traits with respect to $\mathbf{C}$,
the matrix that contains $\eta$, are as follows:

\begin{align}
    \mathbf{V}_{i,t+1} &= \mathbf{V}_{i,t} + 2 ~ \sigma_i^2
    \left[
        \alpha_0 ~ \mathbf{\Omega}_{i,t} ~
            \textrm{e}^{-\mathbf{V}_{i,t} \mathbf{V}_{i,t}^\textrm{T}} ~ \mathbf{V}_{i,t}
        - f ~ \mathbf{V}_{i,t} \mathbf{C}
    \right] \\
    \mathbf{\Omega}_{i,t} &\equiv N_{i,t} +
            \sum_{j \ne i}^{n}{ N_{j,t} \textrm{e}^{ -d \mathbf{V}_{j,t} \mathbf{V}_{j,t}^{\textrm{T}} } }
\end{align}


\begin{align}
    \frac{ \partial \mathbf{V}_{i,t+1} }{ \partial \mathbf{C} } &= 0 + 2 ~ \sigma_i^2 ~
        \left[
            0 -
            f ~ \frac{ \partial \mathbf{V}_{i,t} \mathbf{C} }{ \partial \mathbf{C} }
        \right] \\
        &= - 2 f \sigma_i^2 ~ \frac{ \partial \mathbf{V}_{i,t} \mathbf{C} }{ \partial \mathbf{C} }
\end{align}

Because $\mathbf{C}$ is a matrix, this derivative is not straightforward to calculate,
so I switch to element-wise notation. Let $\mathbf{W}$ be the row vector
$\mathbf{V}_i \mathbf{C}$, with $W_j$ as its $j$th element:

\begin{align}
    \mathbf{W} &= \mathbf{V}_i \mathbf{C} \\
    W_j &= {V}_{ij} + \eta ~ \sum_{k \ne j}^{q}{ {V}_{ik} }
\end{align}
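
The element-wise form can be checked numerically. The sketch below compares the matrix product $\mathbf{V}_i \mathbf{C}$ with the per-element expression for $W_j$; the trait values are random and `q` and `eta` are arbitrary choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
q, eta = 4, 0.05                      # arbitrary example values
Vi = rng.normal(size=q)               # one species' trait vector (row vector)

# C has 1 on the diagonal and eta on the off-diagonals
C = np.eye(q) * (1 - eta) + np.ones((q, q)) * eta

# Matrix form: W = V_i C
W_matrix = Vi @ C

# Element-wise form: W_j = V_ij + eta * sum_{k != j} V_ik
W_elem = np.array([
    Vi[j] + eta * sum(Vi[k] for k in range(q) if k != j)
    for j in range(q)
])

print(np.allclose(W_matrix, W_elem))
```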

Now we can plug that into the partial derivative of $V_{ij,t+1}$ with respect to $\eta$.
The sum runs over the $q$ traits, and only the $\eta$ term depends on $\eta$:

\begin{align}
    \frac{ \partial {V}_{ij,t+1} }{ \partial \eta } &= - 2 f \sigma_i^2 ~
        \frac{ \partial \left(
            {V}_{ij,t} + \eta ~ \sum_{k \ne j}^{q}{ {V}_{ik,t} }
        \right)}{ \partial \eta } \\
    &= - 2 f \sigma_i^2  \sum_{k \ne j}^{q}{ {V}_{ik,t} }
\end{align}
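
Independently of `theano`, the closed form can also be sanity-checked with a central finite difference. This is a minimal sketch, not the comparison used in the rest of the notebook: the function names `v_next`, `d_eta_fd`, and `d_eta_closed`, the step size `h`, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def v_next(Vi, O, f, a0, eta, s2):
    """Traits at time t+1 for one species, following the update equation."""
    q = Vi.size
    C = np.eye(q) * (1 - eta) + np.ones((q, q)) * eta
    return Vi + 2 * s2 * (a0 * O * np.exp(-(Vi @ Vi)) * Vi - f * (Vi @ C))

def d_eta_fd(Vi, O, f, a0, eta, s2, h=1e-4):
    """Central finite-difference approximation of d V_{t+1} / d eta."""
    up = v_next(Vi, O, f, a0, eta + h, s2)
    down = v_next(Vi, O, f, a0, eta - h, s2)
    return (up - down) / (2 * h)

def d_eta_closed(Vi, f, s2):
    """Closed form: -2 f s2 * sum_{k != j} V_ik for each trait j."""
    q = Vi.size
    return np.array([
        -2 * f * s2 * sum(Vi[k] for k in range(q) if k != j)
        for j in range(q)
    ])

# Made-up example values (not from simulated_data.csv)
Vi = np.array([1.0, -0.5, 2.0])
O, f, a0, eta, s2 = 10.0, 0.5, 0.1, 0.05, 0.01

fd = d_eta_fd(Vi, O, f, a0, eta, s2)
closed = d_eta_closed(Vi, f, s2)
print(np.allclose(fd, closed, rtol=1e-6, atol=1e-9))
```

Because $\mathbf{V}_{i,t+1}$ is linear in $\eta$, the central difference has no truncation error here, so any mismatch beyond rounding would point to a mistake in the derivation.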


## Read CSV of simulated datasets

In [2]:
sims = pd.read_csv("simulated_data.csv")
sims.head()

| Unnamed: 0 | V1 | V2 | V3 | V4 | V5 | ... | f | a0 | eta | r0 | d |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5.329784 | -0.593159 | 0.003065 | 1.414273 | -6.458124 | ... | 0.137235 | 0.104261 | 0.063997 | 0.343463 | -0.118705 |
| 1 | -1.514917 | -1.024847 | 5.413096 | -4.548136 | 1.542865 | ... | 0.600063 | 0.197839 | 0.103529 | 0.279827 | -0.158496 |
| 2 | -9.969353 | 0.930724 | 2.855755 | 8.144096 | 3.640262 | ... | 0.537799 | 0.202685 | -0.088763 | 0.303346 | -0.159742 |
| 3 | 3.821274 | -3.732219 | -2.680385 | -1.586652 | -9.75577 | ... | 0.123312 | 0.117315 | -0.08224 | 0.136664 | 0.103837 |
| 4 | 3.291826 | 0.708288 | -5.28158 | 6.224788 | -0.271641 | ... | 0.560044 | 0.054967 | 0.046302 | 0.254523 | -0.125201 |


## Functions to compare methods

In [3]:
def automatic(i, V, O, f, a0, eta, s2):
    """Automatic differentiation using theano pkg"""
    q = V.shape[1]
    Vi = V[i,:]
    eta_ = T.dscalar('eta_')
    Vhat = Vi + 2 * s2 * (
        ( a0 * O * T.exp(-1 * T.dot(Vi, Vi.T)) * Vi) - 
        ( f * T.dot(Vi, (np.eye(q) * (1 - eta_) + np.ones((q,q)) * eta_)) )
    )
    J, updates = theano.scan(lambda i, Vhat, eta_ : T.grad(Vhat[i], eta_), 
                         sequences=T.arange(Vhat.shape[0]), non_sequences=[Vhat, eta_])
    num_fun = theano.function([eta_], J, updates=updates)
    out_array = num_fun(eta).T
    return out_array

In [4]:
def symbolic(i, V, f, s2):
    """Symbolic differentiation using my brain"""
    q = V.shape[1]
    Vi = V[i,:]
    mult = -2 * f * s2
    dVhat = np.zeros(q)
    for j in range(0, q):
        dVhat[j] = mult * np.sum([Vi[k] for k in range(0, q) if k != j])
    return dVhat
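
The double loop in `symbolic` can probably be avoided, since the sum over all traits except $j$ equals the total sum minus element $j$. The sketch below shows that vectorized variant next to a copy of the loop version and checks that they agree; `symbolic_vec` is a hypothetical name, and the matrix `V` and parameter values are made up for the example.

```python
import numpy as np

def symbolic_loop(i, V, f, s2):
    """Loop version, mirroring `symbolic` above."""
    q = V.shape[1]
    Vi = V[i, :]
    mult = -2 * f * s2
    dVhat = np.zeros(q)
    for j in range(q):
        dVhat[j] = mult * np.sum([Vi[k] for k in range(q) if k != j])
    return dVhat

def symbolic_vec(i, V, f, s2):
    """Vectorized variant: sum_{k != j} V_ik == sum(V_i) - V_ij."""
    Vi = V[i, :]
    return -2 * f * s2 * (Vi.sum() - Vi)

# Made-up example: 4 species (rows) by 3 traits (columns)
rng = np.random.default_rng(1)
V = rng.normal(size=(4, 3))
f, s2 = 0.3, 0.01

agree = all(
    np.allclose(symbolic_loop(i, V, f, s2), symbolic_vec(i, V, f, s2))
    for i in range(V.shape[0])
)
print(agree)
```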

In [5]:
def compare_methods(sim_i, s2 = 0.01, abs = False):
    """Compare answers from symbolic and automatic methods"""
    
    # Fill info from data frame:

    N = sims.loc[sim_i, [x.startswith("N") for x in sims.columns]].values
    V = sims.loc[sim_i, [x.startswith("V") for x in sims.columns]].values
    n, q = (N.size, int(V.size / N.size))
    V = V.reshape((n, q), order = 'F')
    f = sims.loc[sim_i,"f"]
    a0 = sims.loc[sim_i,"a0"]
    eta = sims.loc[sim_i,"eta"]
    d = sims.loc[sim_i,"d"]

    # Create output array:
    diffs = np.empty((n, 4))
    diffs[:,0] = sim_i
    
    # Fill output array:
    for i in range(0, n):
        O = N[i] + np.sum([np.exp(-d * np.dot(V[j,:], V[j,:].T)) * N[j] 
            for j in range(0, N.size) if j != i])
        auto = automatic(i, V, O, f, a0, eta, s2)
        sym = symbolic(i, V, f, s2)
        if abs:
            diff = auto - sym
        else:
            diff = (auto - sym) / sym
        diff = diff.flatten()
        diffs[i, 1] = i
        diffs[i, 2] = diff.min()
        diffs[i, 3] = diff.max()
    
    return diffs

### Example of using `compare_methods`:

In [6]:
diffs = compare_methods(0)
# Worst case examples:
print(diffs[:,2].min())
print(diffs[:,3].max())

-4.623459313836677e-16
2.800703029825576e-16


## Comparing methods

This takes ~2 minutes.

In [7]:
n_per_rep = 4
diffs = np.empty((int(n_per_rep * 100), 4))

In [8]:
for rep in tqdm(range(100)):
    diffs_r = compare_methods(rep)
    diffs[(rep * n_per_rep):((rep+1) * n_per_rep),:] = diffs_r

100%|██████████| 100/100 [01:57<00:00,  1.16s/it]


## The results
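The largest relative differences between the automatic and symbolic results are on the order of $10^{-14}$, which is consistent with floating-point rounding error, so I feel comfortable with my symbolic version.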

In [9]:
print(diffs[:,2].min())
print(diffs[:,3].max())

-2.4028966975239524e-14
2.6194119461406564e-14


## Write output to file

To check that the R version gives the same answers, I write the output of the symbolic version for the 100 datasets to a CSV file.

In [10]:
n = np.sum([x.startswith("N") for x in sims.columns])
q = int(np.sum([x.startswith("V") for x in sims.columns]) / n)
s2 = 0.01
# Output array
results = np.zeros((100, n * q))

for sim_i in range(100):
    
    # Fill info from data frame:
    N = sims.loc[sim_i, [x.startswith("N") for x in sims.columns]].values
    V = sims.loc[sim_i, [x.startswith("V") for x in sims.columns]].values
    n, q = (N.size, int(V.size / N.size))
    V = V.reshape((n, q), order = 'F')
    f = sims.loc[sim_i,"f"]
    a0 = sims.loc[sim_i,"a0"]
    eta = sims.loc[sim_i,"eta"]
    d = sims.loc[sim_i,"d"]

    # Fill output array:
    for i in range(0, n):
        sym = symbolic(i, V, f, s2)
        results[sim_i, (i*q):((i+1)*q)] = sym.flatten()


# Make sure first and last aren't zeros:
results[[0, 99], :]

array([[ 0.01096645, -0.02138771,  0.00309694,  0.02015421,  0.00562801,
         0.01778228,  0.00618146, -0.00076761,  0.00693224, -0.01524951,
        -0.0086218 , -0.01439119],
       [-0.06104388,  0.00137983,  0.01325554, -0.04413719, -0.09381652,
        -0.0317462 , -0.01647427, -0.04582762, -0.00285579,  0.02285623,
         0.01911621, -0.02541097]])

In [11]:
np.savetxt('results/dVi_deta.csv', results, delimiter=',')