### MNIST digit classifier

The first thing we do is import the relevant packages for the problem.

In [1]:
import numpy as np  # needed later for np.argmax
import tensorflow
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Flatten

Now define some parameters. For the MNIST image set, the images are 28 pixels square.

In [2]:
img_width, img_height = 28, 28
input_shape = (img_width, img_height, 1)  # Conv2D expects a channel axis; 1 because the images are greyscale
batch_size = 1000
no_epochs = 5
no_classes = 10  # dataset has images of the digits 0-9
validation_split = 0.2
verbosity = 1

Setting verbosity to 1 makes Keras print a progress bar for each epoch during training; setting it to 0 would silence that output.

Now we need to load the training dataset. Keras provides several built-in datasets, including the MNIST image set, so we can load it as follows.

In [3]:
def load_data():
    return tensorflow.keras.datasets.mnist.load_data(path='mnist.npz')

OK, now let's actually build the model.

In [4]:
def model1():
    model = Sequential()
    model.add(Conv2D(4, kernel_size = (3,3), activation = 'relu', input_shape = input_shape))
    model.add(Conv2D(8, kernel_size = (3,3), activation = 'relu'))
    model.add(Conv2D(12, kernel_size = (3,3), activation = 'relu'))
    model.add(Flatten())
    model.add(Dense(256, activation = 'relu'))
    model.add(Dense(no_classes, activation = 'softmax'))
    return model
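
To see why the Flatten layer hands so many values to the Dense layer, here is a rough sketch of how the image size shrinks through the three Conv2D layers. It assumes the Keras defaults that the model above does not override (no padding, stride 1); the helper name `conv2d_valid_output` is made up for this illustration.

```python
def conv2d_valid_output(size, kernel, stride=1):
    """Output length along one axis for a convolution with no padding."""
    return (size - kernel) // stride + 1

height, width = 28, 28
shapes = []
# Filter counts taken from model1(): 4, 8 and 12, each with a 3x3 kernel
for filters in (4, 8, 12):
    height = conv2d_valid_output(height, 3)
    width = conv2d_valid_output(width, 3)
    shapes.append((height, width, filters))

# Flatten turns the last feature map into a single vector
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]

print(shapes)     # shape after each Conv2D layer
print(flattened)  # number of inputs to the first Dense layer
```

Each 3x3 convolution trims one pixel from every edge, so the maps go 28 to 26 to 24 to 22, and the Dense(256) layer ends up with 22 x 22 x 12 inputs. Calling `model.summary()` on the built model should report the same shapes.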

This defines the architecture of the model, but we still have to compile it before it can be trained. To do that we need to specify both an optimisation algorithm and a loss function.

In [5]:
def compile_model1(model):
    model.compile(loss=tensorflow.keras.losses.sparse_categorical_crossentropy,
                 optimizer = tensorflow.keras.optimizers.Adam(),
                 metrics = ['accuracy'])
    return model
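
The loss used here is the "sparse" version of categorical cross-entropy, which fits because the MNIST labels are plain integers 0-9 rather than one-hot vectors. As a minimal sketch of the idea (not the Keras implementation, which also guards against taking the log of zero), the loss is the average negative log of the probability the model gave to the correct class. The toy labels and probabilities below are invented for illustration.

```python
import math

def sparse_categorical_crossentropy(labels, probabilities):
    """Mean negative log-probability assigned to the true class of each sample."""
    losses = [-math.log(probs[label]) for label, probs in zip(labels, probabilities)]
    return sum(losses) / len(losses)

# Two toy samples with three classes each (made-up numbers)
labels = [2, 0]
probabilities = [
    [0.1, 0.1, 0.8],    # confident and correct: small loss
    [0.5, 0.25, 0.25],  # less confident: larger loss
]

loss = sparse_categorical_crossentropy(labels, probabilities)
print(loss)
```

The more probability the model puts on the right digit, the closer the loss gets to zero.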

Now we train the model. The Keras loader already gives us separate training and test sets; validation_split holds back a further 20% of the training data for validation, so we can check how well the model generalises while it trains.

In [6]:
def train_model(model, X_train, Y_train):
    model.fit(X_train, Y_train, 
             batch_size = batch_size, 
             epochs = no_epochs, 
             verbose = verbosity,
             shuffle = True,
             validation_split = validation_split
             )
    return model

#now we test the model
def test_model(model, X_test, Y_test):
    score = model.evaluate(X_test, Y_test, verbose=0)
    print(f'Test loss: {score[0]} / Test accuracy: {score[1]}')
    return model
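
To get a feel for what these settings mean in practice, here is a small back-of-the-envelope sketch. It assumes the standard MNIST training set of 60000 images (a number not stated above), and uses the batch size and validation split defined earlier. Keras takes the validation samples from the end of the arrays, before any shuffling.

```python
import math

n_samples = 60000        # assumed size of the MNIST training set
validation_split = 0.2   # from the parameters cell
batch_size = 1000        # from the parameters cell

n_val = round(n_samples * validation_split)      # held back for validation
n_train = n_samples - n_val                      # actually used for weight updates
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val, steps_per_epoch)
```

So under these assumptions each epoch makes only a few dozen weight updates, which is worth keeping in mind when choosing such a large batch size.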
    

In [7]:
#now we load in the data
(X_train, Y_train), (X_test, Y_test) = load_data()

In [8]:
#now we have to normalise the data
(X_train, X_test) = (X_train/255.0, X_test/255.0)

# 255 is the maximum value of an 8-bit greyscale pixel, so this scales the data to the range [0, 1]
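
As a quick sanity check on the normalisation, here is a tiny sketch using a made-up row of pixel values instead of the real images. It assumes the pixels are stored as unsigned 8-bit integers, which is how the Keras MNIST loader returns them.

```python
import numpy as np

# Made-up pixel values covering the darkest, a mid and the brightest grey level
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

# Dividing by a float converts the integers to floating point
scaled = pixels / 255.0

print(scaled.dtype, scaled.min(), scaled.max())
```

Running the same min/max check on `X_train` after normalising is a cheap way to confirm the scaling did what we expected.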

Now we reshape the data to match the input shape that the first Conv2D layer of our model expects, as defined earlier.

In [9]:
X_train=X_train.reshape(X_train.shape[0],X_train.shape[1],X_train.shape[2],1)

In [10]:
X_test = X_test.reshape(X_test.shape[0],X_test.shape[1],X_test.shape[2],1)
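
The reshape only adds a trailing channel axis of length 1; the pixel values themselves are untouched. A small sketch on a dummy array (five blank 28x28 "images", invented for this example) shows this, along with `np.expand_dims` as an alternative way of writing the same thing.

```python
import numpy as np

# Dummy stand-in for the image array: 5 blank 28x28 images
images = np.zeros((5, 28, 28))

# Same pattern as the cells above
reshaped = images.reshape(images.shape[0], images.shape[1], images.shape[2], 1)

# Alternative: add the channel axis without spelling out every dimension
expanded = np.expand_dims(images, axis=-1)

print(reshaped.shape, expanded.shape)
```

Either form should work for X_train and X_test; the second is just a little less typing.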

In [11]:
model = model1()
model = compile_model1(model)
model = train_model(model, X_train, Y_train)
model = test_model(model, X_test, Y_test)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Test loss: 0.05624527111649513 / Test accuracy: 0.9814000129699707


For each image, the model outputs a length-10 array in which entry i is the predicted probability that the image shows the digit i. Taking the index of the largest probability therefore gives the predicted digit. We can check this on the first 10 test images:

In [31]:
print('First 10 answers to test are ', Y_test[:10])

x = model.predict(X_test)
list1 = [np.argmax(x[i]) for i in range(10)]
print('First 10 predicted answers are ', list1)

First 10 answers to test are  [7 2 1 0 4 1 4 9 5 9]
First 10 predicted answers are  [7, 2, 1, 0, 4, 1, 4, 9, 5, 9]


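This works very well: the first ten predictions all match the labels, which is consistent with the test accuracy of about 98.1% reported above.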
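
The argmax step can also be done for many rows at once by passing `axis=1` to `np.argmax`, rather than looping over the rows. Here is a minimal sketch on invented probabilities for a three-class problem (not output from the model above), which also shows how an accuracy figure falls out of comparing predictions to labels.

```python
import numpy as np

# Invented class probabilities for two samples and three classes
probabilities = np.array([
    [0.05, 0.05, 0.90],
    [0.70, 0.20, 0.10],
])
labels = np.array([2, 1])  # invented true classes

# Index of the largest probability in each row
predicted = np.argmax(probabilities, axis=1)

# Fraction of rows where the prediction matches the label
accuracy = np.mean(predicted == labels)

print(predicted, accuracy)
```

Applied to the full model output, the same two lines would give the predicted digit for every test image and the overall test accuracy.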