# Deep Learning for Computer Vision with TensorFlow 2.0

Table of contents:
[Lab 0](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/00_test_install.ipynb) | 
[Lab 1](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/01_linear_regression.ipynb) | 
[Lab 2](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/02_tensorflow_logistic_regression.ipynb) | 
[Lab 3a](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/03a_tensorflow_deep_network.ipynb) | 
[Lab 3b](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/03b_deep_mnist_visualize.ipynb) | 
[Lab 4](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/04_mnist_cnn.ipynb) | 
[Lab 5](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/05_data_prep.ipynb) | 
[Lab 6](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/06_transfer_learning.ipynb) | 

# Lab 3: Deep MNIST

If you ever want to start over, go to the menu at the top and select Runtime -> Restart runtime. (Please do not use Runtime -> Reset all runtimes.)

In this exercise you will again apply what you've learned in the lecture. This is another MNIST classifier, but this model uses a deep network instead of the shallow one from the last example.

Run each of the cells to create the classifier model, train the model, and test the model to see how well the model performs. 

You can modify parameters in certain cells to change the way that the training is performed and see what happens. Experiment to see if you can get better results than the defaults. 

This cell imports TensorFlow as well as Keras, which will be used to describe the layers.

In [0]:
# Cell 3a.1
%tensorflow_version 2.x
import tensorflow as tf
if tf.__version__ != "2.0.0":
    !pip install tensorflow-gpu==2.0.0
    print("Please go to Runtime -> restart runtime and then, once that finishes, rerun this cell.")

from tensorflow import keras
from tensorflow.keras import layers

import numpy as np

%pylab inline

print ('cell finished')

Notice that the dataset loading and modification code is exactly the same as in the shallow model.


In [0]:
# Cell 3a.2

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
# Numpy defaults to dtype=float64; TF defaults to float32. Stick with float32.
x_train, x_test = x_train / np.float32(255), x_test / np.float32(255)
y_train, y_test = y_train.astype(np.int64), y_test.astype(np.int64)


print ('cell finished')
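
As a small illustration of the dtype comment in Cell 3a.2, the sketch below uses a made-up 2 x 2 `uint8` array (a stand-in, not real MNIST data) to show that dividing by `np.float32(255)` keeps the result in float32, while dividing by a float64 scalar promotes it to float64. The names `pixels`, `scaled32`, and `scaled64` are chosen only for this example.

```python
import numpy as np

# Hypothetical 2 x 2 "image" stored as uint8 values in the range 0..255.
pixels = np.array([[0, 51], [102, 255]], dtype=np.uint8)

scaled32 = pixels / np.float32(255)   # float32 scalar keeps the result in float32
scaled64 = pixels / np.float64(255)   # float64 scalar promotes the result to float64

print(scaled32.dtype, scaled64.dtype)
print(scaled32.min(), scaled32.max())
```

Either way the values end up in the range 0 to 1; only the storage type differs.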

The dataset conversion code is also exactly the same.

Variables x_train, y_train, etc. are NumPy arrays that can be converted into a Dataset with the statements shown below. The `repeat()` call makes the training dataset loop over the examples indefinitely.

In [0]:
# Cell 3a.3

train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).repeat()
test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test))


print ('cell finished')

The next cell contains a number of constants used in the model and training. Putting these variables in one location allows for easy modification throughout the model. Notice that a new variable HIDDEN_SIZE has been added. This will control the size of the hidden layer that is added for the deep network.

In [0]:
# Cell 3a.4

BATCH_SIZE=100

HIDDEN_SIZE = 300

NUM_CLASSES = 10
IMG_HT = 28
IMG_W = 28

print ('cell finished')

The next two statements are used to specify the batch sizes for the training and testing datasets. Additionally the training dataset is shuffled so that successive runs will use a different training image order. 

In [0]:
# Cell 3a.5

train_ds = train_ds.shuffle(60000).batch(BATCH_SIZE)
test_ds = test_ds.batch(BATCH_SIZE)

print ('cell finished')
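
To see what the `repeat`, `shuffle`, and `batch` calls do, here is a minimal sketch on a toy dataset of four 2 x 2 "images". The arrays `toy_x` and `toy_y` are hypothetical stand-ins for `x_train` and `y_train`. Because the dataset repeats forever, `take(3)` is used here to stop after three batches.

```python
import numpy as np
import tensorflow as tf

# Toy stand-ins for x_train / y_train (hypothetical values, not MNIST).
toy_x = np.arange(4 * 2 * 2, dtype=np.float32).reshape(4, 2, 2)  # 4 "images" of 2 x 2
toy_y = np.array([0, 1, 2, 3], dtype=np.int64)

toy_ds = tf.data.Dataset.from_tensor_slices((toy_x, toy_y)).repeat()
toy_ds = toy_ds.shuffle(4).batch(2)

# The repeated dataset never ends, so take() limits how many batches are drawn.
batch_shapes = []
for images, labels in toy_ds.take(3):
    batch_shapes.append((images.numpy().shape, labels.numpy().shape))

print(batch_shapes)
```

Each batch pairs a stack of images with the matching stack of labels. Since the dataset itself never runs out, training on it needs an explicit number of steps to define an epoch.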

This cell is again the same as in the last example. 

This cell is meant to help understand the details of the MNIST dataset. You will notice a number of print statements that will print out different aspects of MNIST. Uncomment these and run the cell to see the results. 

The first print statement (2 lines) will print out the sizes of the train and test sets. 

The second shows how the labels are formatted. You can change the index used to pick a particular image to see different labels. 

The next print statements show the shapes of the image and label arrays in the training set.

The statements after that show the length of one row of image data and the contents of one row through the middle of the digit.

The final two statements plot the image from the index. Again feel free to modify the indices to see different images.

In [0]:
# Cell 3a.6

# Run this cell to understand the format of the dataset. Uncomment the print commands one by one to understand what
# the data set looks like. As you uncomment a line use shift-enter to run the cell once more. In the first print you
# actually have to uncomment two lines or you will get an error. 

img_index = 7

# 1. There are 60k examples in the training set and 10k in the test set.
print ('Train, test: %d, %d' % 
      (len(x_train), len(x_test)))

# 2. The label is an integer
print ('label = {}'.format(y_train[img_index]))

# 3. The shapes of the two different data types in the Dataset is shown here
print ('x shape = {}'.format(np.shape(x_train)))
print ('y_shape = {}'.format(np.shape(y_train)))

# 4. An image is a 28 by 28 array of pixels.
print ('length of first row = {}'.format(len(x_train[img_index][0])))
print (x_train[img_index][10]) # This prints the 11th row of 28. The nonzero values represent the pixel values through
                       # the middle of the digit

# 5. To display an image
pylab.imshow(x_train[img_index], cmap=pylab.cm.gray_r)   
pylab.title('Label: %d' % y_train[img_index])  # use the same index as the displayed image

The next cell contains a function that describes the classification network. A Keras Sequential model is used to stack the layers. The layers are dense layers, so each neuron receives all of the pixels of an image at once. Since each MNIST image is a 28 x 28 array, a Flatten layer is needed to convert it into a flat array of 784 pixels. Notice that the Flatten layer uses the IMG_HT and IMG_W constants to determine the input shape.

This model now contains an extra hidden layer with a ReLU activation, which adds the non-linearity needed for more complex model behavior. The size of the hidden layer is specified by the new constant HIDDEN_SIZE that was added earlier.

The hidden dense layer takes the 784 pixels as input and produces HIDDEN_SIZE (300) outputs. The final dense layer takes those outputs as input and produces NUM_CLASSES (10) outputs, one class for each of our 10 digit types.

The activation of the final layer is specified as softmax so that its outputs are probabilities that sum to 1.

In [0]:
# Cell 3a.7

def model():
    # Flatten -> hidden Dense (ReLU) -> output Dense (softmax)
    net = keras.Sequential([
        layers.Flatten(input_shape=[IMG_HT, IMG_W]),
        layers.Dense(HIDDEN_SIZE, activation='relu'),
        layers.Dense(NUM_CLASSES, activation='softmax')
    ])
    return net

print ('cell finished')
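
To make the layer shapes concrete, here is a minimal NumPy sketch of the same forward pass: flatten, dense plus ReLU, then dense plus softmax. The weights are random placeholders and the helper name `forward` is hypothetical. Keras performs the equivalent computation internally, so this is only an illustration, not how the lab model is built.

```python
import numpy as np

IMG_HT, IMG_W, HIDDEN_SIZE, NUM_CLASSES = 28, 28, 300, 10

rng = np.random.default_rng(0)
# Random placeholder weights with the shapes the two Dense layers would have.
w1 = rng.normal(scale=0.05, size=(IMG_HT * IMG_W, HIDDEN_SIZE)).astype(np.float32)
b1 = np.zeros(HIDDEN_SIZE, dtype=np.float32)
w2 = rng.normal(scale=0.05, size=(HIDDEN_SIZE, NUM_CLASSES)).astype(np.float32)
b2 = np.zeros(NUM_CLASSES, dtype=np.float32)

def forward(images):
    """images: array of shape (batch, IMG_HT, IMG_W); returns (batch, NUM_CLASSES)."""
    flat = images.reshape(len(images), -1)                # Flatten: (batch, 784)
    hidden = np.maximum(flat @ w1 + b1, 0.0)              # Dense + ReLU: (batch, 300)
    logits = hidden @ w2 + b2                             # Dense: (batch, 10)
    logits = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)           # softmax: rows sum to 1

# Five random placeholder "images" with pixel values in [0, 1).
probs = forward(rng.random((5, IMG_HT, IMG_W), dtype=np.float32))
print(probs.shape, probs.sum(axis=1))
```

Each row of `probs` holds ten non-negative values that sum to 1, one per digit class.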

Just as in the previous notebook the model function will be called to create the Keras model for the classification. 
 

In [0]:
# Cell 3a.8

mnist_model = model()

mnist_model.summary()

print ('cell finished')
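
The parameter counts reported by `summary()` can be checked by hand: a dense layer has one weight per input-output pair plus one bias per output, and the Flatten layer has no parameters. Assuming the default HIDDEN_SIZE of 300, the arithmetic is sketched below. The helper `dense_params` is hypothetical and exists only for this example.

```python
def dense_params(n_inputs, n_outputs):
    """Weights (n_inputs * n_outputs) plus one bias per output."""
    return n_inputs * n_outputs + n_outputs

IMG_HT, IMG_W, HIDDEN_SIZE, NUM_CLASSES = 28, 28, 300, 10

hidden_params = dense_params(IMG_HT * IMG_W, HIDDEN_SIZE)   # 784 * 300 + 300
output_params = dense_params(HIDDEN_SIZE, NUM_CLASSES)      # 300 * 10 + 10

print(hidden_params, output_params, hidden_params + output_params)
# prints: 235500 3010 238510
```

If you change HIDDEN_SIZE in Cell 3a.4, rerunning this arithmetic with the new value should match the new summary.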

Also as previously discussed, the model needs to be compiled for use.

Notice that this code does not change for a deep network. It is exactly the same, except that we decided to use the adam optimizer.

The metric that will be displayed is the classification accuracy, though feel free to add others.

In [0]:
# Cell 3a.9

mnist_model.compile(optimizer='adam', 
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])


print ('cell finished')
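
The `sparse_categorical_crossentropy` loss works directly with integer labels such as those in `y_train`: for each example it takes the negative log of the probability the model assigned to the true class. A minimal NumPy sketch with made-up probabilities (three classes instead of ten, to keep it short) might look like this:

```python
import numpy as np

# Made-up softmax outputs for two examples over 3 classes (not from the model).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
labels = np.array([0, 1])   # integer labels, in the same style as y_train

# Pick the probability assigned to the true class of each example.
true_class_probs = probs[np.arange(len(labels)), labels]
per_example_loss = -np.log(true_class_probs)

print(per_example_loss)          # small when the true class got high probability
print(per_example_loss.mean())   # average over the examples
```

The first example is classified confidently and correctly, so its loss is small. The second puts most of its probability on the wrong class, so its loss is much larger.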

Let's try the model out before it is trained to see how well it does. Since the weights and biases are random values, there is only about a 1 in 10 chance that classification returns the correct value. This code will allow you to try an example to see if it works.

The `image_index` variable is used to grab an example from the training dataset. The expected label is captured in variable `exp_label`. The pixel data is captured in `img`. These values are fed to the network and the predict function is called to generate the output in variable `label`, which holds one probability per class. `np.argmax` then picks the most likely digit.

The results are printed out and the image plotted so you can see which one it is. 

Experiment with the `image_index` variable to see if you can find images where the untrained model correctly predicts the label. You will need to change the index and rerun this cell. 

In [0]:
# Cell 3a.10

image_index = 109
exp_label = y_train[image_index]

img = (np.expand_dims(x_train[image_index],0))

label = mnist_model.predict(img)

print ("calculated label = {} expected label = {}".format(np.argmax(label), exp_label))
pylab.imshow(x_train[image_index], cmap=pylab.cm.gray_r)   
pylab.title('Label: %d' % y_train[image_index]) 
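
Two NumPy calls in Cell 3a.10 are worth a closer look. `np.expand_dims` adds a leading batch axis, because the model expects a batch of images rather than a single image, and `np.argmax` turns the row of class probabilities into a digit. The sketch below uses a blank placeholder image and a made-up prediction vector rather than real model output.

```python
import numpy as np

image = np.zeros((28, 28), dtype=np.float32)   # placeholder for one MNIST image
batch = np.expand_dims(image, 0)               # add a leading batch axis
print(image.shape, batch.shape)                # (28, 28) (1, 28, 28)

# Made-up prediction for that batch: one row of 10 class probabilities.
prediction = np.array([[0.1, 0.05, 0.05, 0.4, 0.05, 0.1, 0.1, 0.05, 0.05, 0.05]])
predicted_digit = int(np.argmax(prediction))
print(predicted_digit)                         # 3
```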

With a deep network you will notice that the training takes more time. This is especially noticeable on a CPU vs a GPU. 

The training dataset train_ds is input to the fit method and the training is run using the number of epochs specified. The steps_per_epoch is usually specified as the length of the dataset divided by the batch size so that there are enough steps in each epoch to run all of the examples.

The training process will use the batch size that was applied to the dataset earlier. 

This time you might also notice some lines with perfect accuracy. That figure is the training accuracy accumulated over the batches of that epoch; it is not the accuracy on the test set.

The `verbose` keyword specifies how much data is output for each epoch. A setting of 2 will output a single line per epoch. 

In [0]:
# Cell 3a.11

history = mnist_model.fit(
  train_ds,
  epochs=50, steps_per_epoch=600,verbose=2)


print ('cell finished')
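
The value `steps_per_epoch=600` follows from the rule described above. As a quick check, assuming the 60,000 training images of the Keras MNIST dataset and the default BATCH_SIZE of 100:

```python
import math

NUM_TRAIN_EXAMPLES = 60000   # size of the Keras MNIST training set
BATCH_SIZE = 100

# Round up so a final partial batch would still be counted.
steps_per_epoch = math.ceil(NUM_TRAIN_EXAMPLES / BATCH_SIZE)
print(steps_per_epoch)       # 600
```

If you change BATCH_SIZE in Cell 3a.4, you may want to adjust `steps_per_epoch` the same way so that each epoch still covers roughly the whole training set.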

During the training the accuracy and loss are displayed for each epoch. When the accuracy plateaus there is no need to run further epochs. 

This is the training accuracy. It is not the same as the testing accuracy. As mentioned in the lecture the data has been split into a training set and a testing set. The test data was never seen by the model during training but is used to test how well the model generalizes. 



The previous cell showed the training accuracy because it used the training images. To get the testing accuracy, let's evaluate the model on the test images.

In [0]:
# Cell 3a.12 

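# test_ds is finite (10,000 images in batches of BATCH_SIZE), so leaving out
# `steps` evaluates the whole test set rather than only the first 32 batches.
mnist_model.evaluate(test_ds)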
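
The accuracy metric itself is simple: it is the fraction of examples whose most probable class matches the label. A minimal NumPy sketch with made-up probabilities (four examples, three classes) is shown below; it is only an illustration of the idea, not the Keras implementation.

```python
import numpy as np

# Made-up class probabilities for 4 examples and 3 classes (not model output).
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.3, 0.4, 0.3],
                  [0.1, 0.2, 0.7]])
labels = np.array([0, 1, 2, 2])

predicted = np.argmax(probs, axis=1)          # most probable class per row
accuracy = float(np.mean(predicted == labels))
print(predicted, accuracy)
```

Three of the four predictions match their labels here, so the accuracy is 0.75.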


The testing accuracy should be much improved over the shallow model because the deep model is better able to handle the complexities in the data. 

The testing accuracy is slightly different from the training accuracy. Remember that the model was not trained on the test images and has never seen them before, so a similar accuracy is a good sign.

The first line of output is the progress display, which shows the loss and accuracy accumulated over the evaluated batches. The final loss and accuracy are returned as a list and displayed on the last line.

You might think that over 90% accuracy is good, but the accuracy of this model has great room for improvement. Going forward more accurate models will be developed.

Go back to Cell 3a.10 and change the image_index once more. The model now uses the trained weight values. As the accuracy indicates, it should pick the correct label at least nine times out of ten.