### Reference: Deep Learning with Python by François Chollet

Training a neural network revolves around the following concepts, which we divide into two parts:

1. Low-level tensor manipulation. This is handled by the TensorFlow API and is related to:

    a) Tensors, including the variables that store the network's state.

    b) Tensor operations such as addition, relu, and matmul.

    c) Backpropagation using GradientTape.

2. High-level learning concepts. This is handled by the Keras API and is related to:

    a) Layers, which are combined into a model.

    b) Loss functions.

    c) Optimizers.

    d) Metrics for evaluation.

    e) Loops that implement mini-batch stochastic gradient descent.
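
To make the low-level part concrete, here is a minimal sketch of a training loop written only with tensors, tensor operations, and GradientTape. The toy data and the names `inputs`, `targets`, `W`, `b`, `learning_rate`, and `training_step` are hypothetical and chosen for illustration. For brevity the loop uses the full dataset on every step rather than mini-batches; this manual bookkeeping is the kind of work the Keras API would normally handle.

```python
import tensorflow as tf

# Hypothetical toy data following y = 2x + 1 (an assumption for illustration)
inputs = tf.constant([[0.0], [1.0], [2.0], [3.0]])
targets = 2.0 * inputs + 1.0

# Variables hold the model's state
W = tf.Variable(tf.zeros((1, 1)))
b = tf.Variable(tf.zeros((1,)))
learning_rate = 0.1

def training_step(x, y):
    # Forward pass is recorded by the tape
    with tf.GradientTape() as tape:
        predictions = tf.matmul(x, W) + b
        loss = tf.reduce_mean(tf.square(y - predictions))
    # Backpropagation: gradients of the loss with respect to the variables
    grad_W, grad_b = tape.gradient(loss, [W, b])
    # Plain gradient descent update
    W.assign_sub(learning_rate * grad_W)
    b.assign_sub(learning_rate * grad_b)
    return loss

first_loss = training_step(inputs, targets)
for _ in range(200):
    last_loss = training_step(inputs, targets)

print(float(first_loss), float(last_loss))
print(W.numpy(), b.numpy())
```

After enough steps, `W` and `b` should approach the slope and intercept used to generate the toy data.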

### GradientTape Object

The most important tool that we will use from TensorFlow to build our neural networks is the GradientTape object.

The GradientTape class provided by TensorFlow can be used to retrieve the gradient of any differentiable expression with respect to any of its inputs.
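
As a small sketch of "any of its inputs", the snippet below takes the gradient of one expression with respect to two inputs at once. The names `x`, `w`, and `y` are hypothetical. Variables are tracked by the tape automatically, whereas a constant tensor has to be registered explicitly with `tape.watch`.

```python
import tensorflow as tf

x = tf.constant(2.0)   # constant tensor: not tracked by default
w = tf.Variable(3.0)   # variable: tracked automatically

with tf.GradientTape() as tape:
    tape.watch(x)                  # ask the tape to track the constant
    y = w * x + tf.square(x)       # y = w*x + x^2

# Passing a list of sources returns a list of gradients in the same order
dy_dx, dy_dw = tape.gradient(y, [x, w])
print(dy_dx)   # dy/dx = w + 2x
print(dy_dw)   # dy/dw = x
```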

In [1]:
import tensorflow as tf

#### Example 1

In [5]:
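# Scalar variable x = 3.0; variables are tracked by the tape automatically
input_var = tf.Variable(initial_value=3.0)
with tf.GradientTape() as tape:
    # Operations executed inside this scope are recorded
    result = tf.square(input_var)
# d(result)/d(input_var), evaluated at input_var = 3.0
gradient = tape.gradient(result, input_var)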

The code performs the following steps.

The GradientTape object records the following function, evaluated at the given input value:

$$
f(x) = x^{2} \hspace{0.5cm}, \hspace{0.1cm} x = 3.0
$$

It computes the derivative,

$$
\frac{df}{dx} = 2x
$$

It then evaluates the derivative at the input value, $2 \times 3.0 = 6.0$, and returns the result as a new Tensor object. Note that the value of __input_var__ itself is not changed by this operation.

In [7]:
print(gradient)
print(input_var)

tf.Tensor(6.0, shape=(), dtype=float32)
<tf.Variable 'Variable:0' shape=() dtype=float32, numpy=3.0>
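
One way to sanity-check a result like this is to compare it with a numerical estimate. The sketch below is an illustrative check, not part of the original example: it approximates the derivative of $f(x) = x^{2}$ at $x = 3.0$ with a central finite difference in plain Python. The helper `f` and the step size `h` are hypothetical choices.

```python
import tensorflow as tf

def f(x):
    # Same function as in Example 1; works for floats and tensors alike
    return x * x

# Gradient from automatic differentiation
input_var = tf.Variable(initial_value=3.0)
with tf.GradientTape() as tape:
    result = f(input_var)
tape_gradient = float(tape.gradient(result, input_var))

# Central finite difference in plain Python floats (h is an arbitrary small step)
h = 1e-3
numeric_gradient = (f(3.0 + h) - f(3.0 - h)) / (2 * h)

print(tape_gradient, numeric_gradient)
```

The two values should agree up to floating-point error.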


#### Example 2

Let us assume that we have the time-dependent position function,

$$
x(t) = 4.9 t^{2} 
$$

The acceleration of an object whose position follows this function is given by the second derivative,

$$
\begin{align*}
x(t) = 4.9 t^{2} \\
\implies v(t) = \frac{dx}{dt} = 9.8t \\
\implies a(t) = \frac{d^{2}x}{dt^{2}} = 9.8
\end{align*}
$$

Although this is a simple example, the code below shows how GradientTape objects can be nested to compute higher-order derivatives.

In [8]:
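time = tf.Variable(0.0)
with tf.GradientTape() as outer_tape:
    with tf.GradientTape() as inner_tape:
        position = 4.9 * time ** 2
    # First derivative, computed inside the outer tape so that it is recorded too
    speed = inner_tape.gradient(position, time)
# Second derivative: gradient of the speed with respect to time
acceleration = outer_tape.gradient(speed, time)
print(acceleration)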

tf.Tensor(9.8, shape=(), dtype=float32)
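
As a further sketch, the same nested tapes can be evaluated at a nonzero time to show that the first derivative depends on $t$ while the second does not. The choice of $t = 2.0$ is an arbitrary assumption for illustration.

```python
import tensorflow as tf

time = tf.Variable(2.0)  # arbitrary nonzero time chosen for illustration
with tf.GradientTape() as outer_tape:
    with tf.GradientTape() as inner_tape:
        position = 4.9 * time ** 2
    # v(t) = 9.8 t, recorded by the outer tape
    speed = inner_tape.gradient(position, time)
# a(t) = 9.8, independent of t
acceleration = outer_tape.gradient(speed, time)

print(speed)
print(acceleration)
```

Here `speed` should match $9.8t$ at the chosen time, and `acceleration` should match the constant $9.8$ from the derivation.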
