# Deep Learning for Computer Vision with TensorFlow 2.0

Table of contents:
[Lab 0](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/00_test_install.ipynb) | 
[Lab 1](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/01_linear_regression.ipynb) | 
[Lab 2](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/02_tensorflow_logistic_regression.ipynb) | 
[Lab 3a](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/03a_tensorflow_deep_network.ipynb) | 
[Lab 3b](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/03b_deep_mnist_visualize.ipynb) | 
[Lab 4](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/04_mnist_cnn.ipynb) | 
[Lab 5](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/05_data_prep.ipynb) | 
[Lab 6](https://colab.research.google.com/github/embedded-vision/dlcvtf2/blob/master/06_transfer_learning.ipynb) | 

# Lab 4: MNIST CNN

If you ever want to start over, just go to the menu at the top and select Runtime -> Restart runtime. (Please do not use Runtime -> Reset all runtimes.)

As we saw in the lecture, fully connected models are impractical for larger images because the number of weights grows with the number of pixels. We need a better way to classify larger images, and CNN models can help significantly.
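To make the scaling problem concrete, here is a rough back-of-the-envelope sketch. The helper names and the 224x224 RGB image size are illustrative assumptions, not values from the lab; the 1024 hidden units and the 5x5, 32-filter convolution match the settings used later in this lab.

```python
def dense_params(height, width, channels, units):
    """Weights plus biases of a dense layer fed by a flattened image."""
    return height * width * channels * units + units


def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus biases of a 2D convolution; independent of image size."""
    return kernel_h * kernel_w * in_channels * filters + filters


mnist_dense = dense_params(28, 28, 1, 1024)    # 28x28 grayscale digit
large_dense = dense_params(224, 224, 3, 1024)  # hypothetical 224x224 RGB image
mnist_conv = conv_params(5, 5, 1, 32)
large_conv = conv_params(5, 5, 3, 32)

print("dense layer on 28x28x1   :", mnist_dense)
print("dense layer on 224x224x3 :", large_dense)
print("5x5 conv on 1 channel    :", mnist_conv)
print("5x5 conv on 3 channels   :", large_conv)
```

The dense layer's size depends on every pixel of the input, while the convolution's size depends only on the kernel, the input channels, and the number of filters.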

TensorFlow with Keras makes describing even complex CNN models easy. The other beauty of this framework is that the only part that changes is the description of the model layers. Virtually everything else remains the same as in the previous labs.

This model takes much more computation than the earlier ones, so running it on the CPU is slow and this is where the GPU becomes necessary. Notice that there is also a TPU option, but this model is not set up to use the TPU optimally, so don't use that right now.


In [0]:
# Cell 4.1
%tensorflow_version 2.x
import tensorflow as tf
if tf.__version__ != "2.0.0":
    !pip install tensorflow-gpu==2.0.0
    print("Please go to Runtime -> restart runtime and then, once that finishes, rerun this cell.")

from tensorflow import keras
from tensorflow.keras import layers

import numpy as np
%pylab inline

print ('cell finished')


The next cell defines the constants used throughout the lab: the mini-batch size, the size of the hidden dense layer, the number of classes, and the image dimensions.

In [0]:
# Cell 4.2

BATCH_SIZE=100

HIDDEN_SIZE = 1024

NUM_CLASSES = 10
IMG_HT = 28
IMG_W = 28

print ('cell finished')


Now we load the MNIST data set, scale the pixel values to the range 0 to 1, and build the training and test datasets that supply the digit images and their expected labels. This looks very familiar, as we have used the same steps in previous models.


In [0]:
# Cell 4.3

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
# Numpy defaults to dtype=float64; TF defaults to float32. Stick with float32.
x_train, x_test = x_train / np.float32(255), x_test / np.float32(255)
y_train, y_test = y_train.astype(np.int64), y_test.astype(np.int64)

train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).repeat()
test_ds = tf.data.Dataset.from_tensor_slices((x_test, y_test))

# DON'T CHANGE THESE VARIABLES OR YOU WILL BREAK THINGS
NUM_PIXELS = 28 * 28
NUM_CLASSES = 10

print ('cell finished')
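The dtype comment in Cell 4.3 can be checked on a tiny stand-in array. This is only a sketch: the three pixel values are made up for illustration, and no MNIST data is loaded.

```python
import numpy as np

# Hypothetical stand-in for a row of MNIST pixels (stored as uint8, 0..255).
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

# Dividing by a float32 scalar keeps the result in float32, as in Cell 4.3.
scaled = pixels / np.float32(255)

# Dividing by a plain Python int falls back to NumPy's default, float64.
default = pixels / 255

print(scaled.dtype, scaled.min(), scaled.max())
print(default.dtype)
```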


In [0]:
# Cell 4.4

train_ds = train_ds.shuffle(60000).batch(BATCH_SIZE)
test_ds = test_ds.batch(BATCH_SIZE)

print ('cell finished')


This cell shows how to describe the CNN model using TensorFlow and Keras. The model function builds a sequential model that describes all of the layers of our network.

The model accepts 28x28 grayscale images, and each layer passes its output tensor to the next layer. The Reshape layer adds the single channel dimension that the convolutional layers expect.

The first convolutional layer takes the reshaped image and generates an output tensor, which is the input to the next layer. That layer in turn generates a new output, which is applied to the layer after it. This continues until all of the layers have been applied.

Notice that each layer has a number of parameters that specify the activation type, number of filters, kernel size, and so on. Many more parameters are left at their default values but can be specified if needed to get better results.

The Dropout layer is active while the model is being fit to the data, but it is not applied during evaluation or prediction. This behavior is built into Keras.
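The train-versus-inference behavior of dropout can be illustrated with a small NumPy sketch. This is not the Keras implementation; the function name `dropout` and its `training` flag are assumptions made for illustration. It follows the "inverted dropout" scheme, in which the kept activations are scaled by 1 / (1 - rate) during training so that nothing needs to change at evaluation time.

```python
import numpy as np


def dropout(x, rate, training, rng):
    """Illustrative inverted dropout: random zeroing only while training."""
    if not training:
        return x  # evaluation / prediction: pass the input through unchanged
    keep_mask = rng.random(x.shape) >= rate  # keep each unit with prob. 1 - rate
    return x * keep_mask / (1.0 - rate)      # rescale so the expected value is kept


rng = np.random.default_rng(0)
activations = np.ones((1, 10000), dtype=np.float32)

train_out = dropout(activations, rate=0.4, training=True, rng=rng)
eval_out = dropout(activations, rate=0.4, training=False, rng=rng)

print("fraction zeroed while training:", (train_out == 0).mean())
print("mean activation while training:", train_out.mean())
print("unchanged at evaluation       :", np.array_equal(eval_out, activations))
```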

In [0]:
# Cell 4.5

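def model():
    # Build the CNN as a stack of layers; each layer feeds the next one.
    cnn = keras.Sequential([
        # Add the channel dimension that Conv2D expects: (28, 28) -> (28, 28, 1).
        layers.Reshape(
            target_shape=[IMG_HT, IMG_W, 1],
            input_shape=(IMG_HT, IMG_W)),

        # Two convolution + pooling stages; each pooling halves height and width.
        layers.Conv2D(filters=32, kernel_size=(5, 5), padding='same', activation='relu'),
        layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),
        layers.Conv2D(filters=64, kernel_size=(5, 5), padding='same', activation='relu'),
        layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'),

        # Classifier head: flatten, one hidden dense layer, dropout, softmax output.
        layers.Flatten(),
        layers.Dense(HIDDEN_SIZE, activation='relu'),
        layers.Dropout(rate=0.4),
        layers.Dense(NUM_CLASSES, activation='softmax')
    ])
    return cnn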

print ('cell finished')
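To see how the tensor shape changes from layer to layer, the following sketch traces the shapes by hand for the layer settings in Cell 4.5. It assumes the standard rule that with padding='same' the spatial output size is ceil(input size / stride); the helper name `same_padding_output` is made up for this example.

```python
import math


def same_padding_output(size, stride):
    """Spatial output size for padding='same': ceil(size / stride)."""
    return math.ceil(size / stride)


shape = (28, 28, 1)  # after the Reshape layer: height, width, channels
trace = [("Reshape", shape)]

# (layer name, stride, number of filters or None to keep the channel count)
spatial_layers = [
    ("Conv2D 32 filters", 1, 32),
    ("MaxPooling2D", 2, None),
    ("Conv2D 64 filters", 1, 64),
    ("MaxPooling2D", 2, None),
]
for name, stride, filters in spatial_layers:
    h, w, c = shape
    shape = (same_padding_output(h, stride),
             same_padding_output(w, stride),
             c if filters is None else filters)
    trace.append((name, shape))

flat = shape[0] * shape[1] * shape[2]
trace.append(("Flatten", (flat,)))
trace.append(("Dense hidden", (1024,)))
trace.append(("Dense output", (10,)))

for name, layer_shape in trace:
    print(f"{name:20s} -> {layer_shape}")
```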


Just as in previous networks, the model function creates the model. Notice that the summary shows a much bigger model than any used previously.

In [0]:
# Cell 4.6

conv_model = model()  # calls the model function defined in Cell 4.5

conv_model.summary()

print ('cell finished')
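The parameter counts reported by the summary can be cross-checked by hand. The sketch below assumes the layer settings from Cell 4.5 (5x5 kernels, 32 and 64 filters, a 1024-unit hidden layer, 10 classes) and a 7x7x64 tensor entering the Flatten layer; the variable names are illustrative.

```python
# Each layer has (inputs per output * number of outputs) weights plus one bias
# per output. Reshape, MaxPooling2D, Flatten, and Dropout have no parameters.
conv1 = 5 * 5 * 1 * 32 + 32            # 5x5 kernel, 1 input channel, 32 filters
conv2 = 5 * 5 * 32 * 64 + 64           # 5x5 kernel, 32 input channels, 64 filters
dense_hidden = 7 * 7 * 64 * 1024 + 1024  # flattened 7x7x64 tensor -> 1024 units
dense_output = 1024 * 10 + 10            # 1024 units -> 10 classes

total = conv1 + conv2 + dense_hidden + dense_output

for name, count in [("conv1", conv1), ("conv2", conv2),
                    ("dense_hidden", dense_hidden),
                    ("dense_output", dense_output), ("total", total)]:
    print(f"{name:13s} {count:>10,d}")
```

Almost all of the parameters sit in the first dense layer, not in the convolutional layers.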

Notice that the compilation parameters are exactly the same as before, even though the model is much larger.

In [0]:
# Cell 4.7

conv_model.compile(optimizer='adam', 
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

print ('cell finished')
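As a reminder of what the 'sparse_categorical_crossentropy' loss measures, here is a minimal NumPy sketch. It is not the Keras implementation: the function name, the clipping constant, and the example probabilities are assumptions for illustration. "Sparse" means the labels are integer class indices rather than one-hot vectors.

```python
import numpy as np


def sparse_categorical_crossentropy(labels, probs, eps=1e-7):
    """Per-example loss: -log of the probability assigned to the true class."""
    probs = np.clip(probs, eps, 1.0 - eps)           # avoid log(0)
    picked = probs[np.arange(len(labels)), labels]   # probability of the true class
    return -np.log(picked)


# Hypothetical softmax outputs for two examples and three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
labels = np.array([0, 2])  # integer class indices, not one-hot vectors

losses = sparse_categorical_crossentropy(labels, probs)
print("per-example loss:", losses)
print("mean loss       :", losses.mean())
```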


The next cell trains (fits) the model. It uses the training dataset train_ds and runs for 50 epochs of 600 mini-batches each. The product steps_per_epoch * BATCH_SIZE should be greater than or equal to the size of the training set, so that each epoch draws at least as many samples as the training set contains. Here 600 * 100 = 60,000, which is the number of MNIST training images.
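The step arithmetic can be written out as a quick check. The constant names below mirror the lab's values (mini-batch size 100, 600 training steps, 100 evaluation steps); the sizes 60,000 and 10,000 are the standard MNIST training and test set sizes.

```python
import math

TRAIN_SIZE = 60000   # MNIST training images
TEST_SIZE = 10000    # MNIST test images
BATCH_SIZE = 100     # as set in Cell 4.2
STEPS_PER_EPOCH = 600
EVAL_STEPS = 100

samples_per_epoch = STEPS_PER_EPOCH * BATCH_SIZE
min_train_steps = math.ceil(TRAIN_SIZE / BATCH_SIZE)
min_eval_steps = math.ceil(TEST_SIZE / BATCH_SIZE)

print("samples drawn per epoch:", samples_per_epoch)
print("minimum training steps :", min_train_steps)
print("minimum evaluation steps:", min_eval_steps)
```

If you change BATCH_SIZE, the number of steps should be adjusted in the same way. Note also that Cell 4.3 calls repeat() before Cell 4.4 shuffles, so the 600 steps draw 60,000 samples but are not guaranteed to visit each training image exactly once per epoch.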

Since this is a much larger model than previously used, it will take much longer to train on a CPU. A GPU is needed for reasonable performance in class.

In [0]:
# Cell 4.8

history = conv_model.fit(
  train_ds,
  epochs=50,  
  steps_per_epoch=600,verbose=2)


print ('cell finished')


As in previous models, the accuracy of the model is plotted. The accuracy should be better than before, even though this is a tiny dataset.

In [0]:
# Cell 4.9

pylab.plot(history.history['accuracy'],'b')
pylab.title('Model accuracy')
pylab.ylabel('Accuracy')
pylab.xlabel('Epoch')
pylab.legend(['Train'], loc='upper left')
pylab.show()

# Plot training loss values. No validation data was passed to fit(),
# so the history contains no 'val_loss' entry to plot.
pylab.plot(history.history['loss'], 'r')
pylab.title('Model loss')
pylab.ylabel('Loss')
pylab.xlabel('Epoch')
pylab.legend(['Train'], loc='upper left')
pylab.show()

print ('cell finished')


Just like before, the next cell checks the model on the test data.

In [0]:
# Cell 4.10

results=conv_model.evaluate(
    test_ds, steps=100)

print("loss = {}".format(results[0]))
print("accuracy = {}".format(results[1]))

print ('cell finished')


When working with images, even a difference of less than 1 percent in accuracy can make a big difference in the utility of the classifier. Notice that the accuracy achieved with the convolutional model is significantly better than even that of the deep network from the last exercise.
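One way to see why a small accuracy gain matters is to count misclassified images instead. The two accuracy values below are hypothetical placeholders, not results from this lab; substitute the numbers you measured in Lab 3 and in Cell 4.10.

```python
TEST_SIZE = 10000  # number of MNIST test images


def misclassified(accuracy, n=TEST_SIZE):
    """Number of test images the classifier gets wrong at a given accuracy."""
    return round((1.0 - accuracy) * n)


# Hypothetical accuracies, for illustration only.
deep_network_accuracy = 0.98
cnn_accuracy = 0.99

deep_errors = misclassified(deep_network_accuracy)
cnn_errors = misclassified(cnn_accuracy)
reduction = 1.0 - cnn_errors / deep_errors

print("errors at 98% accuracy:", deep_errors)
print("errors at 99% accuracy:", cnn_errors)
print("relative error reduction: {:.0%}".format(reduction))
```

With these placeholder values, a one-point gain in accuracy halves the number of mistakes.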