## Getting Ready

In this recipe, we reimplement the MNIST model from *The Introductory CNN Model* recipe in Chapter 8.

## How to do it...

1. First, load the required libraries:

In [1]:
import tensorflow as tf
import numpy as np
import datetime

2. Reimplement the MNIST model:

In [7]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)

# Pad the images by 2 pixels on each side (28x28 -> 32x32), since the paper used 32x32 input images
x_train = np.pad(x_train, ((0,0),(2,2),(2,2),(0,0)), 'constant')
x_test = np.pad(x_test, ((0,0),(2,2),(2,2),(0,0)), 'constant')

# Normalize pixel values to the [0, 1] range
x_train = x_train / 255
x_test = x_test / 255

# Set model parameters
image_height = x_train[0].shape[0]  # axis 0 of each image is rows (height)
image_width = x_train[0].shape[1]   # axis 1 of each image is columns (width)
num_channels = 1 # grayscale = 1 channel

# Training and Test data variables
batch_size = 100
evaluation_size = 500
generations = 300
eval_every = 5

# Set for reproducible results
seed = 98
np.random.seed(seed)
tf.random.set_seed(seed)

# Declare the model
input_data = tf.keras.Input(dtype=tf.float32, shape=(image_height, image_width, num_channels), name="INPUT")

# First Conv-ReLU-MaxPool Layer
conv1 = tf.keras.layers.Conv2D(filters=6,
                               kernel_size=5,
                               padding='VALID',
                               activation="relu",
                               name="C1")(input_data)

max_pool1 = tf.keras.layers.MaxPool2D(pool_size=2,
                                      strides=2, 
                                      padding='SAME',
                                      name="S1")(conv1)

# Second Conv-ReLU-MaxPool Layer
conv2 = tf.keras.layers.Conv2D(filters=16,
                               kernel_size=5,
                               padding='VALID',
                               strides=1,
                               activation="relu",
                               name="C3")(max_pool1)

max_pool2 = tf.keras.layers.MaxPool2D(pool_size=2,
                                      strides=2, 
                                      padding='SAME',
                                      name="S4")(conv2)

# Flatten Layer
flatten = tf.keras.layers.Flatten(name="FLATTEN")(max_pool2)


# First Fully Connected Layer
fully_connected1 = tf.keras.layers.Dense(units=120,
                                         activation="relu",
                                         name="F5")(flatten)

# Second Fully Connected Layer
fully_connected2 = tf.keras.layers.Dense(units=84,
                                         activation="relu",
                                         name="F6")(fully_connected1)

# Final Fully Connected Layer
final_model_output = tf.keras.layers.Dense(units=10,
                                           activation="softmax",
                                           name="OUTPUT"
                                           )(fully_connected2)
    

model = tf.keras.Model(inputs=input_data, outputs=final_model_output)
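
The layer shapes this model produces can be worked out by hand before anything is trained. The following is a minimal sketch in plain Python, not part of the original recipe. The helper names `conv_valid` and `pool_same` are hypothetical, and the formulas are the usual output-size rules for 'VALID' and 'SAME' padding.

```python
import math

def conv_valid(size, kernel, stride=1):
    """Output length along one axis for a convolution with 'VALID' padding."""
    return (size - kernel) // stride + 1

def pool_same(size, stride):
    """Output length along one axis for a pooling layer with 'SAME' padding."""
    return math.ceil(size / stride)

# A 28x28 MNIST digit padded by 2 pixels on each side
size = 28 + 2 + 2
shapes = {"INPUT": (size, size, 1)}

size = conv_valid(size, kernel=5)      # C1: 5x5 kernel, 6 filters
shapes["C1"] = (size, size, 6)
size = pool_same(size, stride=2)       # S1: 2x2 max pooling, stride 2
shapes["S1"] = (size, size, 6)
size = conv_valid(size, kernel=5)      # C3: 5x5 kernel, 16 filters
shapes["C3"] = (size, size, 16)
size = pool_same(size, stride=2)       # S4: 2x2 max pooling, stride 2
shapes["S4"] = (size, size, 16)
shapes["FLATTEN"] = (size * size * 16,)

for name, shape in shapes.items():
    print(f"{name:8s}{shape}")
```

The flattened size of 400 is the input width of the first fully connected layer, F5.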

3. We now compile the model with sparse categorical cross-entropy loss and the Adam optimizer:

In [8]:
model.compile(
    optimizer="adam", 
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"]
)

model.summary()

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
INPUT (InputLayer)           [(None, 32, 32, 1)]       0         
_________________________________________________________________
C1 (Conv2D)                  (None, 28, 28, 6)         156       
_________________________________________________________________
S1 (MaxPooling2D)            (None, 14, 14, 6)         0         
_________________________________________________________________
C3 (Conv2D)                  (None, 10, 10, 16)        2416      
_________________________________________________________________
S4 (MaxPooling2D)            (None, 5, 5, 16)          0         
_________________________________________________________________
FLATTEN (Flatten)            (None, 400)               0         
_________________________________________________________________
F5 (Dense)                   (None, 120)               48120     
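
The summary listing above is cut off, but the parameter counts can be cross-checked by hand. This is a minimal sketch, assuming every layer keeps its default bias term (the model code does not disable it). The helper names `conv2d_params` and `dense_params` are hypothetical.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights of a square-kernel Conv2D layer plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(in_units, units):
    """Weights of a Dense layer plus one bias per output unit."""
    return in_units * units + units

params = {
    "C1": conv2d_params(kernel=5, in_channels=1, filters=6),
    "C3": conv2d_params(kernel=5, in_channels=6, filters=16),
    "F5": dense_params(in_units=400, units=120),
    "F6": dense_params(in_units=120, units=84),
    "OUTPUT": dense_params(in_units=84, units=10),
}
total = sum(params.values())

for name, count in params.items():
    print(f"{name:8s}{count}")
print(f"{'TOTAL':8s}{total}")
```

The C1 and C3 values reproduce the 156 and 2416 reported in the summary. The pooling and flatten layers have no trainable parameters.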

4. We can now create a timestamped subdirectory for each run. The summary writer will write the TensorBoard logs to this folder:

In [9]:
log_dir = "logs/experiment-" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
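
To see why the timestamp keeps runs apart, the naming scheme can be wrapped in a small helper. This is an illustrative sketch only. `make_log_dir` and its `now` parameter are hypothetical names, introduced so that the timestamp can be fixed for demonstration.

```python
import datetime

def make_log_dir(root="logs", prefix="experiment-", now=None):
    """Build a per-run log directory name using the same format as the recipe."""
    now = now or datetime.datetime.now()
    return f"{root}/{prefix}{now.strftime('%Y%m%d-%H%M%S')}"

# Two runs started 55 seconds apart get distinct directories
first = make_log_dir(now=datetime.datetime(2021, 1, 2, 3, 4, 5))
second = make_log_dir(now=datetime.datetime(2021, 1, 2, 3, 5, 0))
print(first)
print(second)
```

Because the format runs from year down to second, sorting the directory names alphabetically also sorts the runs chronologically.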

5. We instantiate the TensorBoard callback and pass it to the fit method. All the logs during the training phase will be stored in this directory and can be viewed instantly in TensorBoard.

In [15]:
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, write_images=True, histogram_freq=1)

model.fit(x=x_train,
          y=y_train,
          epochs=5,
          validation_data=(x_test, y_test),
          callbacks=[tensorboard_callback])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7fec53fc5cf8>
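
After training, it can be useful to confirm that the callback actually wrote something to disk. The sketch below assumes that TensorFlow event files follow the usual `events.out.tfevents.*` naming and sit in subdirectories of the log folder. The helper `find_event_files` is a hypothetical name, not part of any library.

```python
from pathlib import Path

def find_event_files(log_root):
    """Return event file names under log_root, grouped by parent directory."""
    root = Path(log_root)
    grouped = {}
    if not root.is_dir():
        return grouped  # nothing has been logged yet
    for path in sorted(root.rglob("events.out.tfevents.*")):
        grouped.setdefault(str(path.parent), []).append(path.name)
    return grouped

# List whatever has been written under the recipe's top-level log folder
for directory, names in find_event_files("logs").items():
    print(directory, names)
```

An empty result after `model.fit` has finished would suggest that the callback was not passed to `fit` or that `log_dir` points somewhere else.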

6. Now we start the TensorBoard application by running the following commands:

In [14]:
%load_ext tensorboard
%tensorboard --logdir='logs'

While TensorBoard seems like a useful tool, this in-notebook version is a bit clunky and unwieldy. It would be interesting to take a deeper dive into the applications of this tool.
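
If the in-notebook view feels awkward, one option is to run TensorBoard as a separate process and open it in a browser tab. This is a minimal sketch, assuming the `tensorboard` executable is on the PATH. The helper names are hypothetical, and the default `dry_run=True` only prints the command rather than starting a server.

```python
import subprocess

def tensorboard_command(logdir="logs", port=6006):
    """Build the command line for a standalone TensorBoard server."""
    return ["tensorboard", "--logdir", logdir, "--port", str(port)]

def launch_tensorboard(logdir="logs", port=6006, dry_run=True):
    """Print the command, or start the server when dry_run is False."""
    cmd = tensorboard_command(logdir, port)
    if dry_run:
        print(" ".join(cmd))
        return None
    # Runs in the background; call .terminate() on the result to stop it
    return subprocess.Popen(cmd)

launch_tensorboard()
```

With `dry_run=False`, the server should be reachable at `http://localhost:6006` for the port used here.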