## The Initial Setup

In [1]:
import numpy as np
import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.optimizers import SGD

## Importing the data

We use the `mnist` package, assumed to be already installed, to load the MNIST handwritten-digit dataset that the CNN will be trained and evaluated on.

In [2]:
train_images = mnist.train_images()
train_labels = mnist.train_labels()
test_images = mnist.test_images()
test_labels = mnist.test_labels()


## Normalizing the data

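We first scale the pixel values from the range [0, 255] to [-0.5, 0.5]. We then use [`expand_dims`](https://numpy.org/doc/stable/reference/generated/numpy.expand_dims.html) to add a trailing channel axis to each numpy array, because `Conv2D` expects every image to have the shape `(height, width, channels)`.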

In [3]:
train_images = (train_images / 255) - 0.5
test_images = (test_images / 255) - 0.5

train_images = np.expand_dims(train_images, axis=3)
test_images = np.expand_dims(test_images, axis=3)
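
As a quick sanity check of the two steps above, here is a minimal sketch on synthetic data. The array `fake_images` is a hypothetical stand-in for the MNIST images (random `uint8` pixels), so the snippet runs without downloading anything.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 4 random 28x28 images with uint8 pixels.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Scale pixels from [0, 255] to [-0.5, 0.5]; the division yields floats.
scaled = (fake_images / 255) - 0.5

# Add a trailing channel axis: (4, 28, 28) -> (4, 28, 28, 1).
with_channel = np.expand_dims(scaled, axis=3)

print(scaled.min(), scaled.max())
print(with_channel.shape)
```

The values themselves are unchanged by `expand_dims`; only the shape gains a final axis of length 1.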

## Building the Model

We now build the model that we will train.
<br>
The model takes a 28x28x1 input image and passes it through the following layers during forward propagation:
 - A convolutional layer (`Conv2D`), which applies 8 filters of size 3x3 to the input
 - A max-pooling layer (`MaxPooling2D`), which downsamples each feature map by keeping the maximum of every 2x2 window
 - A `Flatten` layer, which reshapes the pooled output into a 1D vector
 - A `Dense` layer with 10 output nodes (one per digit) and a softmax activation function at the end.

In [4]:
model = Sequential([
  Conv2D(8, 3, input_shape=(28, 28, 1), use_bias=False),
  MaxPooling2D(pool_size=2),
  Flatten(),
  Dense(10, activation='softmax'),
])
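
To see how the shapes change through the layers, the arithmetic can be sketched in plain Python. This assumes the Keras defaults that the cell above relies on: `'valid'` padding and stride 1 for `Conv2D`, a pooling stride equal to `pool_size`, and a bias term in `Dense`. The helper names are hypothetical.

```python
def conv_output_size(size, kernel):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_output_size(size, pool):
    """Spatial size after non-overlapping pooling (stride equal to pool size)."""
    return size // pool


height = 28       # MNIST images are 28x28
channels_in = 1   # single grayscale channel
filters = 8
kernel = 3
pool = 2
classes = 10

conv_hw = conv_output_size(height, kernel)   # 28 -> 26
pool_hw = pool_output_size(conv_hw, pool)    # 26 -> 13
flat = pool_hw * pool_hw * filters           # 13 * 13 * 8 = 1352

# Conv2D has no bias here (use_bias=False); Dense adds one bias per class.
conv_params = kernel * kernel * channels_in * filters   # 72
dense_params = flat * classes + classes                 # 13530
total_params = conv_params + dense_params               # 13602

print(conv_hw, pool_hw, flat, total_params)
```

These numbers should match what `model.summary()` reports for the model.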


## Compiling the Model

We compile the model with the following settings.
<br>
The optimiser is stochastic gradient descent (SGD) with a learning rate of 0.005, although many alternatives exist, such as RMSprop, Adam, SGD with momentum and Adamax.
<br>
The loss is `categorical_crossentropy` because the output must be classified into 10 classes. With only 2 classes, `binary_crossentropy` could be used instead.
<br>
Accuracy is the only metric we track, since this is a classification problem.

In [6]:
model.compile(SGD(learning_rate=.005), loss='categorical_crossentropy', metrics=['accuracy'])
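
To make the loss and the optimiser more concrete, here is a minimal numpy sketch of both. It illustrates the underlying formulas and is not the Keras implementation; the function names and the clipping constant `eps` are assumptions chosen for the example.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between one-hot labels and predicted probabilities."""
    # Clip to avoid log(0); eps is an illustrative choice.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.sum(y_true * np.log(y_pred), axis=-1)


def sgd_step(weights, grads, learning_rate=0.005):
    """One plain SGD update: move the weights against the gradient."""
    return weights - learning_rate * grads


# One sample, three classes, true class is index 2.
y_true = np.array([[0.0, 0.0, 1.0]])
y_pred = np.array([[0.1, 0.2, 0.7]])
loss = categorical_crossentropy(y_true, y_pred)
print(loss)  # equals -log(0.7) for this sample

updated = sgd_step(np.array([1.0]), np.array([2.0]))
print(updated)
```

With a one-hot label, only the predicted probability of the true class contributes to the loss, so a confident correct prediction gives a loss close to 0.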


## Training the model

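Here we pass the training images and labels, the batch size and the number of epochs to train for. The test set is passed as `validation_data`, so the loss and accuracy on it are reported after each epoch.
<br>
The `categorical_crossentropy` loss expects each label as a one-hot vector of length 10, which is the same as the number of output classes, rather than as a single integer. Hence we use the `to_categorical` function to convert the labels.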
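
The one-hot encoding can be illustrated without Keras. The sketch below uses a hypothetical helper, `one_hot`, that mimics the conversion for integer labels by indexing into an identity matrix; it is a stand-in for illustration, not the `to_categorical` source.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Return one row per label, with a 1 at the label's index and 0 elsewhere."""
    return np.eye(num_classes)[labels]


labels = np.array([5, 0, 4])   # three example digit labels
encoded = one_hot(labels)

print(encoded.shape)
print(encoded[0])
```

Each row sums to 1 and has its single non-zero entry at the position of the digit it encodes.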

In [7]:
model.fit(
  train_images,
  to_categorical(train_labels),
  batch_size=1,
  epochs=3,
  validation_data=(test_images, to_categorical(test_labels)),
)

Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x7f497cd1d910>