<a href="https://colab.research.google.com/github/lukmanr/codenext/blob/master/Training_Loop_in_TensorFlow.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Neural Network Training Loop in TensorFlow

This notebook shows you how to write a basic neural network training loop in TensorFlow. It trains a simple two-layer neural network, with 784 inputs connected directly to 10 outputs, on the MNIST dataset of handwritten digits.

First we import TensorFlow as usual:

In [0]:
%tensorflow_version 2.x
import tensorflow as tf

Next we define the weights and the biases. The weights are a two-dimensional array connecting the 784 input pixels to the 10 neurons in the output layer. There is one bias per neuron. Both are initialized to zero.

In [0]:
# weights W[784, 10]   784=28*28
W = tf.Variable(tf.zeros([784, 10]))

# biases b[10]
b = tf.Variable(tf.zeros([10]))
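
To see why these shapes fit together, here is a minimal sketch in plain NumPy rather than TensorFlow. The names `weights`, `biases`, and `pixels` are illustrative stand-ins, not part of the notebook. Multiplying one flattened image of 1 x 784 pixels by the 784 x 10 weights and adding the 10 biases gives one value per output neuron.

```python
import numpy as np

weights = np.zeros((784, 10))   # one column of 784 weights per output neuron
biases = np.zeros(10)           # one bias per output neuron
pixels = np.ones((1, 784))      # a hypothetical flattened image

# [1, 784] x [784, 10] gives [1, 10]; the 10 biases are added element by element
neuron_inputs = pixels @ weights + biases
print(neuron_inputs.shape)      # (1, 10)
```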

We use TensorFlow's keras.datasets module to load MNIST. `x_train` and `y_train` are NumPy arrays containing the training images and the training labels, respectively. `x_test` and `y_test` are NumPy arrays containing the test images and the test labels. Each label is an integer from 0 to 9.

In [0]:
# load the MNIST data set. The training set and test set are split
# automatically. 
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

Next we do a little preprocessing of the training and test data.  The images are composed of 1 byte per pixel, each byte being a number from 0 to 255.  We divide the arrays by 255 to make each pixel go from 0 to 1.  We also have to "flatten" the images, to change their shape from 28 x 28 to 1 x 784.  

In [0]:
# convert the integer pixel values to floats
x_train = x_train / 255.0
x_test = x_test / 255.0

# reshape the images to be 2-D tensors of 1 x 784 pixels
x_train = x_train.reshape([-1, 1, 784])
x_test = x_test.reshape([-1, 1, 784])
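
The two preprocessing steps can be checked on a small made-up array before running them on the full data set. This is only a sketch: `fake_images` is a hypothetical stand-in for `x_train`, filled with random bytes instead of real digits.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 3 random "images" of 28 x 28 bytes
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)

# scale the pixel values from 0..255 down to 0..1 (the result is floating point)
scaled = fake_images / 255.0

# flatten each 28 x 28 image to 1 x 784; the -1 lets NumPy work out
# the number of images by itself
flat = scaled.reshape([-1, 1, 784])

print(flat.shape)                            # (3, 1, 784)
print(flat.min() >= 0.0, flat.max() <= 1.0)  # True True
```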

Now we define the neural network itself. The network is a simple function that takes the inputs X as its argument. It computes the input to each neuron by multiplying the inputs by the weights and adding the biases. Then it applies the "softmax" activation function to those values, to compute the output of each of the 10 neurons. The softmax function forces the outputs to sum to 1, and it makes the high outputs higher and the low outputs lower, which helps the network "make a choice" between the 10 different digits. The function returns the outputs, a tensor of 10 elements.

![two layer MNIST network](https://drive.google.com/file/d/1q9h9e0jhSCLrfr4Z4HZVnQbhIZcdWhTL/view?usp=sharing)

In [0]:
# The neural network
def neural_network(X):
  Inputs = tf.matmul(X, W) + b
  Y = tf.nn.softmax(Inputs)
  return Y
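
To make the effect of softmax concrete, here is a minimal sketch written in plain Python instead of `tf.nn.softmax`. The helper name `softmax_sketch` and the sample values are illustrative only. Each value is exponentiated and then divided by the total, so the results are positive and sum to 1.

```python
import math

def softmax_sketch(values):
    # subtracting the largest value first avoids overflow in exp()
    # and does not change the result
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

outputs = softmax_sketch([1.0, 2.0, 3.0])
print(outputs)       # roughly [0.09, 0.24, 0.67]
print(sum(outputs))  # 1.0, up to floating point rounding
```

The inputs 1, 2, 3 are in the proportions 17%, 33%, 50%, but the outputs are roughly 9%, 24%, 67%: the largest input takes a bigger share and the smallest a smaller one.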

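The loss function is the same sum-of-squares function we have seen before. It compares the 10 outputs Y with the label Y_l, which must be a "one-hot" vector of 10 elements: 1 at the position of the correct digit and 0 everywhere else.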

In [0]:
# The loss function
def loss(Y, Y_l):
  return tf.reduce_sum(tf.square(Y - Y_l))
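
Here is a small worked example of the sum of squares, in plain Python, assuming the label has been one-hot encoded. The helpers `one_hot_sketch` and `sum_of_squares_sketch` are illustrative names, not part of the notebook. With all weights and biases at zero, softmax gives every digit the same output of 0.1, so the loss for any label is 9 x 0.1² + 0.9² = 0.9.

```python
def one_hot_sketch(digit, depth=10):
    # 1.0 at the position of the digit, 0.0 everywhere else
    return [1.0 if i == digit else 0.0 for i in range(depth)]

def sum_of_squares_sketch(outputs, target):
    return sum((y - t) ** 2 for y, t in zip(outputs, target))

target = one_hot_sketch(3)
uniform = [0.1] * 10   # the outputs of an untrained, all-zero network

untrained_loss = sum_of_squares_sketch(uniform, target)
perfect_loss = sum_of_squares_sketch(target, target)
print(untrained_loss)  # about 0.9
print(perfect_loss)    # 0.0
```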

Here is the training loop.

In [0]:
num_epochs = 1
learning_rate = 0.01

# the outer training loop:  repeat for num_epochs
for e in range(num_epochs):

    # the inner training loop: train on one image and label from the data set
    for image, label in zip(x_train, y_train):

        # convert the image to a float tensor, and the integer label to a
        # one-hot vector of 10 elements so it can be compared with the outputs
        X = tf.constant(image, dtype=tf.float32)
        Y_l = tf.one_hot(int(label), depth=10)

        # we wrap this 'with' statement around the next two lines, to tell 
        # TensorFlow to auto-compute the gradients
        with tf.GradientTape() as tape:
            # now get the output of the neural net
            Y = neural_network(X)

            # compute the loss function 
            current_loss = loss(Y, Y_l)

        # compute the gradients of the loss function with respect to the
        # weights and biases
        dW, db = tape.gradient(current_loss, [W, b])

        # update the weights and biases: multiply the gradients by the
        # learning rate and subtract, so that the loss goes down
        W.assign_sub(learning_rate * dW)
        b.assign_sub(learning_rate * db)

        print(current_loss.numpy())
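
Gradient descent lowers the loss by moving each parameter against its gradient, that is, by subtracting the gradient times the learning rate. The sketch below shows this on a made-up one-parameter loss, (w - 3)², with the gradient written by hand instead of computed by `tf.GradientTape`. All the names here are illustrative.

```python
def toy_loss(w):
    return (w - 3.0) ** 2

def toy_gradient(w):
    # derivative of (w - 3)^2 with respect to w
    return 2.0 * (w - 3.0)

toy_learning_rate = 0.1
w = 0.0
history = [toy_loss(w)]

for step in range(50):
    # step against the gradient, scaled by the learning rate
    w = w - toy_learning_rate * toy_gradient(w)
    history.append(toy_loss(w))

print(w)                        # close to 3.0, where the loss is lowest
print(history[0], history[-1])  # the loss starts at 9.0 and ends near 0
```

Adding the gradient instead of subtracting it would move `w` away from 3 and make the loss grow at every step.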