## Adjoint Algorithmic Differentiation

Adjoint algorithmic differentiation is sometimes called reverse-mode differentiation.

As before, let us suppose that we have a known function $f: \mathbb{R}^{p_z} \to \mathbb{R}$. That is, for $z = (z_0, z_1, \dots, z_{p_z-1})^\top$, $f(z) = y \in \mathbb{R}$.

In this case the goal is to calculate the gradient $\nabla f$, that is, $\frac{\partial f}{\partial z_i}$ for all $i = 0, \dots, p_z-1$.

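We change the approach slightly: what is fixed now is the output, so we always compute the derivative of the output $y$ with respect to each variable. We therefore operate in reverse order, starting at index $p_z+p_b-1$. Since $\partial y / \partial y = 1$, we initialize the last entry of the adjoint vector to one and all other entries to zero:

$\displaystyle \bar{b}[p_z+p_b-1] = 1, \qquad \bar{b}[k] = 0 \text{ for } k < p_z+p_b-1$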


In [None]:
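# explicit imports; sqrt is used in a later cell
from math import exp, sin, cos, sqrt

def f(z):
  # intermediate variables b4, b5, b6 and the output b7 = y
  b4 = z[0] + exp(z[1])
  b5 = sin(z[2]) + cos(z[3])
  b6 = pow(z[1],1.5) + z[3]
  b7 = cos(b4)*b5 + b6
  return [b4,b5,b6,b7]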

Now set up the vector of adjoints $\bar{b}$; this vector has dimension $(p_z+p_b) \times 1$.

In [None]:
import numpy as np
z = np.array([1,0.2,0,0.5])
# b stacks the inputs z and the intermediate values returned by f
b = np.concatenate((z,f(z)), axis=0)
pz = 4
pb = 4
# adjoint vector, one entry per variable, initialized to zero
bbar = np.zeros(pz+pb)
# seed the last variable: dy/dy = 1
bbar[pz+pb-1] = 1.0
print(bbar)

[0. 0. 0. 0. 0. 0. 0. 1.]


We again use the chain rule to calculate the next steps (backwards),

$\displaystyle \bar{b}[p_z+p_b-2] = \sum_{k=p_z+p_b-1}^{p_z+p_b-1} \frac{\partial y}{\partial b_k} \frac{\partial b_k}{\partial b_{p_z+p_b-2}} = \sum_{k=p_z+p_b-1}^{p_z+p_b-1} \bar{b}[k] \frac{\partial b_k}{\partial b_{p_z+p_b-2}} $

Of course, in this notation $b_k = z_k$ if $k < p_z$.
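
The step above is one instance of a general recursion. As a sketch in the same notation, with $\bar{b}[k] = \partial y / \partial b_k$ and assuming that each $b_k$ depends only on variables with a smaller index, the whole backward sweep can be written as

$\displaystyle \bar{b}[p_z+p_b-1] = 1, \qquad \bar{b}[j] = \sum_{k=j+1}^{p_z+p_b-1} \bar{b}[k] \frac{\partial b_k}{\partial b_j}, \quad j = p_z+p_b-2, \dots, 0$

Once the sweep reaches $j = 0$, the first $p_z$ entries of $\bar{b}$ hold the gradient $\nabla f$.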

In this particular case we have $b_7 = \cos(b_4)b_5 + b_6$, so the adjoint $\bar{b}[6]$, the derivative of $y$ with respect to $b_6$, is

$\displaystyle \bar{b}[p_z+p_b-2] = \bar{b}[p_z+p_b-1] \frac{\partial b_{p_z+p_b-1}}{\partial b_{p_z+p_b-2}} = \bar{b}[7] \frac{\partial b_7}{\partial b_6}= 1\bar{b}[7]$

For $\bar{b}[5]$,

$\displaystyle \bar{b}[5] = \sum_{k=6}^{7} \bar{b}[k] \frac{\partial b_k}{\partial b_{5}} = \bar{b}[6] \frac{\partial b_{6}}{\partial b_{5}} + \bar{b}[7] \frac{\partial b_7}{\partial b_5} = 0 \bar{b}[6] + \cos(b_4) \bar{b}[7]$

and for $\bar{b}[4]$,

$\displaystyle \bar{b}[4] = \sum_{k=5}^{7} \bar{b}[k] \frac{\partial b_k}{\partial b_{4}} = \bar{b}[5] \frac{\partial b_{5}}{\partial b_{4}} +\bar{b}[6] \frac{\partial b_{6}}{\partial b_{4}} + \bar{b}[7] \frac{\partial b_7}{\partial b_4} $

  $= 0 \bar{b}[5] + 0 \bar{b}[6] - \sin(b_4)b_5 \bar{b}[7]$


In [None]:
bbar[6] = 1*bbar[7]
bbar[5] = cos(b[4])*bbar[7]
bbar[4] = -(sin(b[4])*b[5])*bbar[7]
print(bbar)

[ 0.          0.          0.          0.         -0.69830705 -0.60566906
  1.          1.        ]
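
One way to gain confidence in these three adjoints is to compare them with central finite differences of the last operation, $b_7 = \cos(b_4)b_5 + b_6$. The sketch below recomputes the intermediate values so that it runs on its own; the helper name `last_step` and the step size `h` are assumptions made for illustration and are not part of the notebook.

```python
from math import cos, sin, exp

def last_step(b4, b5, b6):
    # Final operation of the example: b7 = cos(b4)*b5 + b6
    return cos(b4) * b5 + b6

# Intermediate values at z = (1, 0.2, 0, 0.5), recomputed here
b4 = 1.0 + exp(0.2)
b5 = sin(0.0) + cos(0.5)
b6 = 0.2 ** 1.5 + 0.5

# Analytic adjoints bbar[4], bbar[5], bbar[6], seeded with bbar[7] = 1
adj = [-sin(b4) * b5, cos(b4), 1.0]

# Central finite differences of b7 with respect to b4, b5, b6
h = 1e-6  # hypothetical step size
base = [b4, b5, b6]
fd = []
for i in range(3):
    up = list(base)
    dn = list(base)
    up[i] += h
    dn[i] -= h
    fd.append((last_step(*up) - last_step(*dn)) / (2 * h))

for name, a, n in zip(("bbar[4]", "bbar[5]", "bbar[6]"), adj, fd):
    print(f"{name}: analytic {a:.8f}, finite difference {n:.8f}")
```

The two columns should agree to roughly the accuracy of the finite difference, which is a quick sanity check rather than a proof of correctness.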


Recall that the remaining intermediate variables are defined by

$b_4 = b_0 + e^{b_1}$

$b_5 = \sin(b_2) + \cos(b_3)$

$b_6 = b_1^{3/2} + b_3$

Applying the same idea to the input variables $b_i = z_i$, $i < p_z$, we have

$\displaystyle \bar{b}[3] = \bar{b}[4] 0 + \bar{b}[5] (-\sin(b_3)) + \bar{b}[6] 1 + \bar{b}[7] 0 $

$\displaystyle \bar{b}[2] = \bar{b}[3] 0 + \bar{b}[4] 0 + \bar{b}[5] \cos(b_2) + \bar{b}[6] 0 + \bar{b}[7] 0$

$\displaystyle \bar{b}[1] = \bar{b}[2] 0 +\bar{b}[3] 0 + \bar{b}[4] e^{b_1} + \bar{b}[5] 0 + \bar{b}[6] \frac{3}{2} \sqrt{b_1} + \bar{b}[7] 0$

$\displaystyle \bar{b}[0] = \bar{b}[1] 0 +\bar{b}[2] 0 +\bar{b}[3] 0 + \bar{b}[4] 1 + \bar{b}[5] 0 + \bar{b}[6] 0 + \bar{b}[7] 0$


In [None]:
bbar[3] = -sin(b[3])*bbar[5] + 1*bbar[6]
bbar[2] = cos(b[2])*bbar[5]
bbar[1] = exp(b[1])*bbar[4]+ 1.5*sqrt(b[1])*bbar[6]
bbar[0] = bbar[4]

print(bbar[:4])

[-0.69830705 -0.18209377 -0.60566906  1.29037322]
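
The hand-written steps above can also be organized as a single loop. The following is a minimal sketch, not the notebook's own implementation: it stores only the nonzero local partials $\partial b_k / \partial b_j$ and lets each $b_k$, visited from last to first, add its contribution to the adjoints of the variables it depends on. The names `forward`, `local_partials`, `gradient` and the step size `h` are hypothetical. The result is then compared with central finite differences of the output.

```python
from math import cos, sin, exp, sqrt

def forward(z):
    # Forward sweep: b = (z_0..z_3, b_4..b_7) for the example function
    b = list(z) + [0.0] * 4
    b[4] = b[0] + exp(b[1])
    b[5] = sin(b[2]) + cos(b[3])
    b[6] = b[1] ** 1.5 + b[3]
    b[7] = cos(b[4]) * b[5] + b[6]
    return b

def local_partials(b):
    # partials[k][j] = d b_k / d b_j, nonzero entries only
    return {
        7: {4: -sin(b[4]) * b[5], 5: cos(b[4]), 6: 1.0},
        6: {1: 1.5 * sqrt(b[1]), 3: 1.0},
        5: {2: cos(b[2]), 3: -sin(b[3])},
        4: {0: 1.0, 1: exp(b[1])},
    }

def gradient(z):
    b = forward(z)
    partials = local_partials(b)
    bbar = [0.0] * len(b)
    bbar[-1] = 1.0  # seed: dy/dy = 1
    # Backward sweep: by the time b_k is visited, bbar[k] is complete,
    # so it can be pushed to every b_j that b_k depends on
    for k in sorted(partials, reverse=True):
        for j, d in partials[k].items():
            bbar[j] += bbar[k] * d
    return bbar[:len(z)]

z = [1.0, 0.2, 0.0, 0.5]
grad = gradient(z)

# Central finite differences of the output b_7 with respect to each input
h = 1e-6  # hypothetical step size
fd = []
for i in range(len(z)):
    up = list(z)
    dn = list(z)
    up[i] += h
    dn[i] -= h
    fd.append((forward(up)[-1] - forward(dn)[-1]) / (2 * h))

print("adjoint gradient:  ", grad)
print("finite differences:", fd)
```

Note that the finite-difference check needs two function evaluations per input, while the backward sweep produces all $p_z$ partial derivatives after a single forward evaluation.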
