Autodiff guide: https://www.tensorflow.org/guide/autodiff

In this notebook I take the second example from the TensorFlow autodiff guide and compute its gradients both by hand and by automatic differentiation, using simple values for the variables involved.

Definitions and concepts:

Let $x$ be a row vector that represents the data, set to $[1,2,3]$. Let $W$ be a $3 \times 2$ weight matrix and $b$ a bias row vector of length 2, both filled with ones for simplicity.

With $x$, $W$ and $b$, I construct the new row vector

$$y = x \cdot W + b.$$

The target function chosen by the TensorFlow autodiff guide is the mean of the squared components of $y$:

$$\text{loss}(x,W,b) = \frac{\big(x_1 W_{11} + x_2 W_{21} + x_3 W_{31} + b_1 \big)^2 + \big(x_1 W_{12} + x_2 W_{22} + x_3 W_{32} + b_2 \big)^2} {2}.$$

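As a check, I compute the partial derivative $\frac{\partial\text{loss}}{\partial{b_1}}$ symbolically. Only the first squared term depends on $b_1$, and the factor 2 from differentiating the square cancels the 2 in the denominator, which gives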

$$\frac{\partial\text{loss}(x,W,b)}{\partial{b_1}} = x_1 W_{11} + x_2 W_{21} + x_3 W_{31} + b_1. $$
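
Plugging in the values used in this notebook ($x = [1,2,3]$, every entry of $W$ and $b$ equal to 1) gives the number to compare with the automatic result. The derivative with respect to a weight is not derived above; the second line is my own application of the same chain rule, so treat it as a sketch:

$$\frac{\partial\text{loss}}{\partial{b_1}} = 1 \cdot 1 + 2 \cdot 1 + 3 \cdot 1 + 1 = 7,$$

$$\frac{\partial\text{loss}}{\partial{W_{ij}}} = x_i \, y_j, \quad \text{for example} \quad \frac{\partial\text{loss}}{\partial{W_{21}}} = x_2 \, y_1 = 2 \cdot 7 = 14.$$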

PS: For an explanation of how tf.GradientTape works, see the guide cited above.


In [1]:
import tensorflow as tf

# Variable definitions
x = tf.Variable([1,2,3], dtype = tf.float32, name = 'x')
W = tf.Variable(tf.ones((3, 2)), name = 'W')
b = tf.Variable(tf.ones(2, dtype = tf.float32), name = 'b')

# Recording of calculations
with tf.GradientTape(persistent = True) as tape:
  y = tf.tensordot(x, W, axes = [[0],[0]]) + b
  loss = tf.reduce_mean(y**2)

# Gradient computation
[dloss_dW, dloss_db] = tape.gradient(loss, [W, b])

# Partial derivatives with respect to each variable.
# Each gradient has the same shape as the variable it is taken with respect to.
print('derivatives with respect to the weights:')
print(dloss_dW,'\n')
print('derivatives with respect to the biases:')
print(dloss_db)

derivatives with respect to the weights:
tf.Tensor(
[[ 7.  7.]
 [14. 14.]
 [21. 21.]], shape=(3, 2), dtype=float32) 

derivatives with respect to the biases:
tf.Tensor([7. 7.], shape=(2,), dtype=float32)
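
To cross-check the printed gradients without TensorFlow, the hand calculation can be repeated with plain Python lists. This is only a sketch under the assumptions of this notebook ($x = [1,2,3]$, $W$ and $b$ filled with ones); the names `y`, `dloss_db` and `dloss_dW` mirror the cell above, and the formulas $\partial\text{loss}/\partial b_j = y_j$ and $\partial\text{loss}/\partial W_{ij} = x_i y_j$ are my own chain-rule step, not something taken from the guide.

```python
# Hand computation of the gradients with plain Python lists (no TensorFlow).
x = [1.0, 2.0, 3.0]
W = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]  # shape (3, 2), all ones
b = [1.0, 1.0]

# Forward pass: y_j = sum_i x_i * W_ij + b_j
y = [sum(x[i] * W[i][j] for i in range(3)) + b[j] for j in range(2)]

# loss = (y_1**2 + y_2**2) / 2, so d loss / d y_j = y_j
loss = sum(v ** 2 for v in y) / len(y)

# Chain rule: d loss / d b_j = y_j and d loss / d W_ij = x_i * y_j
dloss_db = list(y)
dloss_dW = [[x[i] * y[j] for j in range(2)] for i in range(3)]

print(dloss_dW)  # [[7.0, 7.0], [14.0, 14.0], [21.0, 21.0]]
print(dloss_db)  # [7.0, 7.0]
```

The values agree with the tensors returned by tape.gradient above.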
