# Auxiliary Tests

This notebook contains a few tests that probe the behaviour of the operations used.

---

In [14]:
import os
# Must be set before TensorFlow is imported to silence its C++ log output.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

import numpy as np
import tensorflow as tf

np.random.seed(42)
tf.random.set_seed(42)

## Jacobian-Vector-Product / Gradient Trick

For example, consider $\mathbf x = (1, 2, 3) \in \mathbb R^3$ and
$$
    f: \mathbb R^3 \to \mathbb R, \ x\mapsto x_0^3+x_1^3+x_2^3 + 2x_0x_1x_2\, .
$$
The gradient, the Hessian, and the Hessian-gradient product can then be computed as shown below:

In [15]:
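## Testing Tape Behaviour
x = tf.Variable([1., 2., 3.])
with tf.GradientTape(persistent=True) as outer_tape:
    with tf.GradientTape() as tape:
        y = tf.reduce_sum(x ** 3.) + 2*tf.reduce_prod(x)
    # Computed inside `outer_tape` so that it can be differentiated again.
    grad  = tape.gradient(y, x)
    # stop_gradient treats the second factor as a constant, so the
    # gradient of int_H w.r.t. x is H @ grad (not 2 * H @ grad).
    int_H = tf.reduce_sum(grad * tf.stop_gradient(grad))
hess = outer_tape.jacobian(grad, x)
# The gradient of a vector target sums over its components: this gives the
# column sums of H, which equal the row sums because H is symmetric.
hess_rowsums = outer_tape.gradient(grad, x)
hess_x_grad = outer_tape.gradient(int_H, x)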

print("Gradient:", grad.numpy())
print("Hessian:\n", hess.numpy())
print("Rowsums of Hessian:", hess_rowsums.numpy())
print("Hessian times Gradient Explicitly:\n",
    tf.linalg.matvec(hess, grad).numpy())
print("Hessian times Gradient Implicitly:\n",
    hess_x_grad.numpy())

Gradient: [15. 18. 31.]
Hessian:
 [[ 6.  6.  4.]
 [ 6. 12.  2.]
 [ 4.  2. 18.]]
Rowsums of Hessian: [16. 20. 24.]
Hessian times Gradient Explicitly:
 [322. 368. 654.]
Hessian times Gradient Implicitly:
 [322. 368. 654.]
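
The agreement between the explicit and the implicit result can be checked by hand. The following is a short derivation, not part of the original tests; the symbols $\mathbf g = \nabla f(\mathbf x)$, $H = \nabla^2 f(\mathbf x)$ and the constant $\mathbf c$ (the factor wrapped in `tf.stop_gradient`) are introduced here only for brevity.
$$
    \nabla_{\mathbf x}\big(\mathbf g(\mathbf x)^\top \mathbf c\big) = H^\top \mathbf c = H\,\mathbf g \quad \text{for } \mathbf c = \mathbf g\, ,
$$
since $H$ is symmetric. Without `tf.stop_gradient`, the product rule would yield $2H\mathbf g$ instead. In the same way, calling `gradient` on the vector-valued `grad` computes $H^\top \mathbf 1$, i.e. the column sums of $H$, which coincide with the row sums. For the example above,
$$
    \mathbf g = \begin{pmatrix} 3x_0^2 + 2x_1x_2 \\ 3x_1^2 + 2x_0x_2 \\ 3x_2^2 + 2x_0x_1 \end{pmatrix} = \begin{pmatrix} 15 \\ 18 \\ 31 \end{pmatrix}, \qquad
    H = \begin{pmatrix} 6x_0 & 2x_2 & 2x_1 \\ 2x_2 & 6x_1 & 2x_0 \\ 2x_1 & 2x_0 & 6x_2 \end{pmatrix} = \begin{pmatrix} 6 & 6 & 4 \\ 6 & 12 & 2 \\ 4 & 2 & 18 \end{pmatrix},
$$
so that, for instance, the first entry of $H\mathbf g$ is $6\cdot 15 + 6\cdot 18 + 4\cdot 31 = 322$.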


Instead of a regular `GradientTape`, one can also use a `ForwardAccumulator`, which computes Jacobian-vector products directly and therefore also Hessian-gradient products (see the [docs](https://www.tensorflow.org/api_docs/python/tf/autodiff/ForwardAccumulator)). To do so, however, the gradient must already be available so that it can be passed as `tangents` to the `ForwardAccumulator`. This is less efficient than the gradient trick above when there are many weights, since the gradient has to be computed up front in a separate pass.

In [18]:
with tf.autodiff.ForwardAccumulator(x, grad) as acc:
    with tf.GradientTape() as tape:
        y = tf.reduce_sum(x ** 3.) + 2*tf.reduce_prod(x)
    grad2 = tape.gradient(y, x)
print("Hessian-x-Gradient via `ForwardAccumulator`:\n",
    acc.jvp(grad2).numpy())  # forward-over-backward Hessian-vector product

Hessian-x-Gradient via `ForwardAccumulator`:
 [322. 368. 654.]
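
As an independent cross-check that does not rely on automatic differentiation, the Jacobian-vector product of the gradient can also be approximated by a central finite difference of the gradient along the direction $\mathbf g$. The sketch below is an illustration only: the helper names `grad_f` and `hvp_fd` and the step size `eps` are assumptions made here, and the gradient is hard-coded from the formula for $f$ above.

```python
import numpy as np

def grad_f(x):
    """Analytic gradient of f(x) = x0^3 + x1^3 + x2^3 + 2*x0*x1*x2."""
    x0, x1, x2 = x
    return np.array([
        3 * x0**2 + 2 * x1 * x2,
        3 * x1**2 + 2 * x0 * x2,
        3 * x2**2 + 2 * x0 * x1,
    ])

def hvp_fd(x, v, eps=1e-4):
    """Central-difference approximation of the Hessian-vector product H(x) @ v."""
    return (grad_f(x + eps * v) - grad_f(x - eps * v)) / (2 * eps)

x = np.array([1.0, 2.0, 3.0])
g = grad_f(x)       # direction: the gradient itself
hvp = hvp_fd(x, g)  # approximates H @ g without forming H

print("Gradient:", g)
print("Finite-difference Hessian times Gradient:\n", np.round(hvp, 3))
```

Because the gradient of this particular $f$ is quadratic in $\mathbf x$, the central difference is exact up to floating-point rounding, so the result should match the values printed above closely.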


---