# Deep Learning with Python 
# Example 5.1 - `mnist` with a Convnet

## Preparing Workspace

In [2]:
from tensorflow.keras.datasets import mnist

## Instantiating a small ConvNet

In [None]:
from tensorflow.keras import models, layers

# Instantiate Model
model = models.Sequential()

# Add successive pairs of 2D convolutional and max pooling layers
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))

# Second convolutional and pooling layer
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))

# Third convolutional layer - no pooling 
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

A convnet takes as input tensors of shape `(image_height, image_width, image_channels)`, not including the batch dimension. Because the batch dimension is left unspecified, we can feed the ConvNet any number of image tensors per batch. The batch size is chosen later, when calling `fit` (here, `batch_size=64`), not when compiling the model.

`(28, 28, 1)` means each element of the input tensors to the ConvNet will be a grayscale image (one color channel) with a width and height of 28 pixels.
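
The role of the unspecified batch dimension can be illustrated without Keras. The sketch below uses a hypothetical helper, `shape_matches`, that treats `None` as a wildcard in the same spirit as the `(None, 28, 28, 1)` shapes Keras reports. It is an illustration only, not how Keras validates shapes internally.

```python
def shape_matches(spec, shape):
    """Return True if `shape` fits `spec`, where None in `spec` matches any size.

    Hypothetical helper for illustration; not part of Keras.
    """
    if len(spec) != len(shape):
        return False
    return all(s is None or s == d for s, d in zip(spec, shape))


# Input specification of the ConvNet: any batch size, 28 x 28 pixels, 1 channel.
input_spec = (None, 28, 28, 1)

print(shape_matches(input_spec, (64, 28, 28, 1)))     # a batch of 64 images
print(shape_matches(input_spec, (60000, 28, 28, 1)))  # the full training set
print(shape_matches(input_spec, (64, 28, 28)))        # channel axis missing
```

The last call fails the check because the channel axis is missing, which is why the raw MNIST arrays need reshaping before training.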

In [5]:
# Displaying Architecture Summary - layers names, types, I/Os, params
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_2 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 3, 3, 64)          36928     
Total params: 55,744
Trainable params: 55,744
Non-trainable params: 0
_________________________________________________________________
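
The output shapes and parameter counts in the summary can be reproduced by hand. The sketch below assumes the settings used above (3 x 3 kernels with no padding and stride 1, and non-overlapping 2 x 2 max pooling). The helper names are made up for illustration.

```python
def conv_output_size(size, kernel=3):
    """Spatial size after an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1


def pool_output_size(size, pool=2):
    """Spatial size after non-overlapping max pooling (any remainder is dropped)."""
    return size // pool


def conv_params(in_channels, filters, kernel=3):
    """kernel * kernel * in_channels weights per filter, plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


size = 28
size = conv_output_size(size)   # 26
size = pool_output_size(size)   # 13
size = conv_output_size(size)   # 11
size = pool_output_size(size)   # 5
size = conv_output_size(size)   # 3

params = [conv_params(1, 32), conv_params(32, 64), conv_params(64, 64)]
print(size, params, sum(params))  # 3 [320, 18496, 36928] 55744
```

Each convolution trims two pixels from the width and height, and each pooling layer halves them, which is how a 28 x 28 input shrinks to 3 x 3 by the third convolution.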


## Feeding the ConvNet Output to a Densely Connected Classifier

In [6]:
# Flatten 3D outputs to 1D
model.add(layers.Flatten())

model.add(layers.Dense(64, activation='relu'))        # hidden layer
model.add(layers.Dense(10, activation='softmax'))     # output layer

In [7]:
# Calling summary() again now includes the newly added layers
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_2 (Conv2D)            (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 3, 3, 64)          36928     
_________________________________________________________________
flatten (Flatten)            (None, 576)               0         
_________________________________________________________________
dense (Dense)                (None, 64)                36928     
__________
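
The `Flatten` and `Dense` rows follow from simple arithmetic as well. This is a minimal sketch with a made-up helper name, assuming each dense unit has one weight per input plus one bias.

```python
def dense_params(inputs, units):
    """One weight per (input, unit) pair plus one bias per unit."""
    return inputs * units + units


flattened = 3 * 3 * 64                 # 576 values per image after Flatten
hidden = dense_params(flattened, 64)   # 36928, matching the first Dense row
output = dense_params(64, 10)          # 650 for the 10-way softmax layer
print(flattened, hidden, output)       # 576 36928 650
```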

## Training the Model

In [9]:
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

In [10]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [11]:
# Reshape the training images into a batch of 60k 28 x 28 px images with one color channel
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255         # also scale pixel values to [0, 1]
 
test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255
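
To see what the reshape and the division by 255 do to a single image, here is a pure-Python sketch on a made-up 2 x 2 image. It imitates the effect of the NumPy calls above rather than reproducing them.

```python
# A tiny 2 x 2 "image" of 8-bit pixel intensities (made-up values).
image = [[0, 51],
         [204, 255]]

# Wrap each pixel in a one-element list (the channel axis) and scale it to [0, 1].
with_channel = [[[pixel / 255] for pixel in row] for row in image]

height = len(with_channel)
width = len(with_channel[0])
channels = len(with_channel[0][0])
print((height, width, channels))  # (2, 2, 1)
print(with_channel)
```

The pixel values keep their relative order, with 0 staying at 0.0 and 255 becoming 1.0.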

In [14]:
# One-hot encode the training and test labels for use with categorical crossentropy
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
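
The encoding that `to_categorical` produces can be sketched in plain Python. The `one_hot` helper below is hypothetical and only mirrors the result for integer labels; it is not the Keras implementation.

```python
def one_hot(labels, num_classes):
    """Encode integer class labels as one-hot rows (hypothetical helper)."""
    return [[1.0 if index == label else 0.0 for index in range(num_classes)]
            for label in labels]


# Three example digit labels, each encoded as a row of length 10.
encoded = one_hot([5, 0, 4], num_classes=10)
for row in encoded:
    print(row)
```

Each row contains a single 1.0 at the position of the digit's class, which is the target format that `categorical_crossentropy` expects.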

## Compile and Train

In [12]:
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
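
For intuition about the loss chosen here, the sketch below computes categorical crossentropy for one sample by hand. The probabilities are made up, and the function is a simplified stand-in: Keras also averages over the batch and guards against `log(0)`.

```python
import math


def categorical_crossentropy(y_true, y_pred):
    """Crossentropy between a one-hot target and predicted class probabilities."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


y_true = [0.0, 0.0, 1.0]      # the true class is index 2
confident = [0.1, 0.2, 0.7]   # made-up softmax outputs favouring the true class
unsure = [0.4, 0.4, 0.2]      # made-up outputs favouring the wrong classes

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

The loss is lower when more probability is assigned to the correct class, which is the quantity the optimizer reduces during training.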

In [15]:
model.fit(train_images, train_labels, epochs=5, batch_size=64)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x19932673ba8>
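
The `epochs` and `batch_size` arguments determine how many weight updates the run performs. The arithmetic below is a sketch that assumes the final, smaller batch of each epoch is still used rather than dropped.

```python
import math

num_samples = 60000   # training images
batch_size = 64
epochs = 5

steps_per_epoch = math.ceil(num_samples / batch_size)
last_batch = num_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, last_batch, total_updates)  # 938 32 4690
```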

## Evaluation

In [17]:
convNetResults = model.evaluate(test_images, test_labels)



With the convnet, test accuracy on MNIST rose from ~98% (the densely connected baseline) to ~99.1%. The gain looks small, but in fields such as OCR and computer vision it can have significant practical implications.

So while convnets take longer to train, they tend to outperform densely connected networks, at least on image recognition tasks.
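
To put the accuracy figures above in concrete terms, the sketch below converts them into misclassified test images. It uses the approximate accuracies quoted in the text, so the counts are rough estimates rather than measured results.

```python
test_images_count = 10000

dense_accuracy = 0.98      # approximate figure quoted above
convnet_accuracy = 0.991   # approximate figure quoted above

dense_errors = round(test_images_count * (1 - dense_accuracy))
convnet_errors = round(test_images_count * (1 - convnet_accuracy))
reduction = (dense_errors - convnet_errors) / dense_errors
print(dense_errors, convnet_errors, reduction)
```

Under these figures the convnet misclassifies roughly half as many test images as the dense network.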