In this notebook, we will learn how to use the Keras library to build convolutional neural networks. We will train them on the popular MNIST dataset and compare the results with those of a conventional neural network.

## Objectives for this Notebook
* How to use the Keras library to build convolutional neural networks
* Convolutional neural network with one set of convolutional and pooling layers
* Convolutional neural network with two sets of convolutional and pooling layers



## Install the Necessary Libraries


Let's start by installing the libraries and packages that we need to build a neural network.


In [1]:
%pip install numpy==2.0.2
%pip install pandas==2.2.2
%pip install tensorflow==2.18.0
%pip install matplotlib==3.9.2

Note: you may need to restart the kernel to use updated packages.


## Import Keras and Packages


In [2]:
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Input
from keras.utils import to_categorical

In [3]:
from keras.layers import Conv2D # to add convolutional layers
from keras.layers import MaxPooling2D # to add pooling layers
from keras.layers import Flatten # to flatten data for fully connected layers

## Convolutional Neural Network with One Set of Convolutional and Pooling Layers


In [4]:
# import data
from keras.datasets import mnist

# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# reshape to be [samples][height][width][channels]
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1).astype('float32')
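The reshape above adds a trailing channel axis of size 1, which is the channels-last layout that `Conv2D` expects for grayscale images. A minimal sketch on a small stand-in array (the name `fake_images` and its contents are assumptions for illustration, not part of the notebook):

```python
import numpy as np

# Hypothetical stand-in for MNIST: 3 grayscale images of 28x28 pixels.
fake_images = np.zeros((3, 28, 28), dtype=np.uint8)

# Add a channel axis of size 1 and convert to float32, as in the cell above.
reshaped = fake_images.reshape(fake_images.shape[0], 28, 28, 1).astype('float32')

print(reshaped.shape)  # (samples, height, width, channels)
print(reshaped.dtype)
```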

Let's normalize the pixel values to be between 0 and 1.


In [5]:
X_train = X_train / 255 # normalize training data
X_test = X_test / 255 # normalize test data
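MNIST pixels are stored as integers from 0 to 255, so dividing by 255 maps them onto the interval from 0 to 1. A small sketch with made-up pixel values (the array `pixels` is an assumption for illustration):

```python
import numpy as np

# Hypothetical pixel intensities covering the full 8-bit range.
pixels = np.array([0, 64, 128, 255], dtype=np.uint8)

# Convert to float first, then scale into [0, 1].
scaled = pixels.astype('float32') / 255

print(scaled.min(), scaled.max())
```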

Next, let's convert the target variable into one-hot (binary) class vectors.


In [6]:
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

num_classes = y_test.shape[1] # number of categories
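To make the encoding concrete, here is a rough NumPy equivalent of what `to_categorical` produces for integer labels. The sample labels are an assumption for illustration. Note that `to_categorical` infers the number of classes from the largest label when it is not given, whereas this sketch sets it explicitly.

```python
import numpy as np

# Hypothetical sample of digit labels.
labels = np.array([5, 0, 4, 1, 9])
n_classes = 10

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(n_classes, dtype='float32')[labels]

print(one_hot.shape)
print(one_hot[0])  # single 1.0 at index 5
```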

Next, let's define a function that creates our model. Let's start with one set of convolutional and pooling layers.


In [7]:
def convolutional_model():
    
    # create model
    model = Sequential()
    model.add(Input(shape=(28, 28, 1)))
    model.add(Conv2D(16, (5, 5), strides=(1, 1), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    
    model.add(Flatten())
    model.add(Dense(100, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    
    # compile model
    model.compile(optimizer='adam', loss='categorical_crossentropy',  metrics=['accuracy'])
    return model
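It can help to work out by hand what shapes and parameter counts this architecture implies. The sketch below assumes the Keras default of `'valid'` (no) padding for both `Conv2D` and `MaxPooling2D`; the helper `conv_output_size` is a hypothetical name introduced here, and `model.summary()` can be used to check the figures against the real model.

```python
def conv_output_size(size, kernel, stride=1):
    """Output length of an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1

# Conv2D(16, (5, 5), strides=(1, 1)) on a 28x28x1 input
conv_side = conv_output_size(28, kernel=5, stride=1)
conv_params = (5 * 5 * 1) * 16 + 16  # kernel weights + one bias per filter

# MaxPooling2D(pool_size=(2, 2), strides=(2, 2)) has no trainable parameters
pool_side = conv_output_size(conv_side, kernel=2, stride=2)

# Flatten, then Dense(100) and Dense(10)
flat_features = pool_side * pool_side * 16
dense_params = flat_features * 100 + 100
output_params = 100 * 10 + 10

total_params = conv_params + dense_params + output_params
print(conv_side, pool_side, flat_features, total_params)
```

Almost all of the parameters sit in the first dense layer, because the flattened feature map is large.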

Finally, let's call the function to create the model, and then let's train it and evaluate it.


In [8]:
# build the model
model = convolutional_model()

# fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200, verbose=2)

# evaluate the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: {:.2f} % \n Error: {:.2f} %".format(scores[1]*100, 100-scores[1]*100))

Epoch 1/10
300/300 - 2s - 8ms/step - accuracy: 0.9168 - loss: 0.2980 - val_accuracy: 0.9666 - val_loss: 0.1139
Epoch 2/10
300/300 - 2s - 5ms/step - accuracy: 0.9732 - loss: 0.0929 - val_accuracy: 0.9816 - val_loss: 0.0638
Epoch 3/10
300/300 - 2s - 5ms/step - accuracy: 0.9814 - loss: 0.0635 - val_accuracy: 0.9825 - val_loss: 0.0552
Epoch 4/10
300/300 - 2s - 5ms/step - accuracy: 0.9852 - loss: 0.0499 - val_accuracy: 0.9842 - val_loss: 0.0489
Epoch 5/10
300/300 - 2s - 5ms/step - accuracy: 0.9875 - loss: 0.0402 - val_accuracy: 0.9858 - val_loss: 0.0435
Epoch 6/10
300/300 - 2s - 5ms/step - accuracy: 0.9899 - loss: 0.0332 - val_accuracy: 0.9855 - val_loss: 0.0440
Epoch 7/10
300/300 - 2s - 5ms/step - accuracy: 0.9917 - loss: 0.0284 - val_accuracy: 0.9867 - val_loss: 0.0414
Epoch 8/10
300/300 - 2s - 5ms/step - accuracy: 0.9926 - loss: 0.0236 - val_accuracy: 0.9867 - val_loss: 0.0393
Epoch 9/10
300/300 - 2s - 5ms/step - accuracy: 0.9939 - loss: 0.0199 - val_accuracy: 0.9871 - val_loss: 0.0383
E

## Convolutional Neural Network with Two Sets of Convolutional and Pooling Layers


Let's redefine our convolutional model so that it has two convolutional and pooling layers instead of just one layer of each.


In [9]:
def convolutional_model():
    
    # create model
    model = Sequential()
    model.add(Input(shape=(28, 28, 1)))
    model.add(Conv2D(16, (5, 5), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    
    model.add(Conv2D(8, (2, 2), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    
    model.add(Flatten())
    model.add(Dense(100, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    
    # Compile model
    model.compile(optimizer='adam', loss='categorical_crossentropy',  metrics=['accuracy'])
    return model
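The same hand calculation for the two-set model shows how the extra convolution and pooling stage shrinks the flattened feature map. As before, this is a sketch that assumes the default `'valid'` padding, and `conv_output_size` is a hypothetical helper.

```python
def conv_output_size(size, kernel, stride=1):
    """Output length of an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1

# First set: Conv2D(16, (5, 5)) then 2x2 max pooling with stride 2
side = conv_output_size(28, kernel=5)
side = conv_output_size(side, kernel=2, stride=2)
conv1_params = (5 * 5 * 1) * 16 + 16

# Second set: Conv2D(8, (2, 2)) on 16 input channels, then 2x2 max pooling
side = conv_output_size(side, kernel=2)
side = conv_output_size(side, kernel=2, stride=2)
conv2_params = (2 * 2 * 16) * 8 + 8

# Flatten, then Dense(100) and Dense(10)
flat_features = side * side * 8
dense_params = flat_features * 100 + 100
output_params = 100 * 10 + 10

total_params = conv1_params + conv2_params + dense_params + output_params
print(side, flat_features, total_params)
```

Under these assumptions the flattened vector has 200 features, so this model has far fewer trainable parameters than the one-set version.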

In [10]:
# build the model
model = convolutional_model()

# fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200, verbose=2)

# evaluate the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: {:.2f} % \n Error: {:.2f} %".format(scores[1]*100, 100-scores[1]*100))

Epoch 1/10
300/300 - 2s - 8ms/step - accuracy: 0.8713 - loss: 0.4678 - val_accuracy: 0.9621 - val_loss: 0.1327
Epoch 2/10
300/300 - 2s - 5ms/step - accuracy: 0.9676 - loss: 0.1088 - val_accuracy: 0.9775 - val_loss: 0.0768
Epoch 3/10
300/300 - 2s - 5ms/step - accuracy: 0.9757 - loss: 0.0789 - val_accuracy: 0.9818 - val_loss: 0.0628
Epoch 4/10
300/300 - 2s - 5ms/step - accuracy: 0.9807 - loss: 0.0633 - val_accuracy: 0.9835 - val_loss: 0.0514
Epoch 5/10
300/300 - 2s - 5ms/step - accuracy: 0.9839 - loss: 0.0537 - val_accuracy: 0.9835 - val_loss: 0.0510
Epoch 6/10
300/300 - 2s - 5ms/step - accuracy: 0.9853 - loss: 0.0475 - val_accuracy: 0.9853 - val_loss: 0.0459
Epoch 7/10
300/300 - 2s - 6ms/step - accuracy: 0.9873 - loss: 0.0421 - val_accuracy: 0.9875 - val_loss: 0.0388
Epoch 8/10
300/300 - 2s - 5ms/step - accuracy: 0.9883 - loss: 0.0375 - val_accuracy: 0.9868 - val_loss: 0.0418
Epoch 9/10
300/300 - 2s - 5ms/step - accuracy: 0.9894 - loss: 0.0342 - val_accuracy: 0.9873 - val_loss: 0.0382
E

Let's see how the batch size affects the training time and accuracy of the model.
To do this, change batch_size to 1024 and check its effect on accuracy.
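The batch size determines how many weight updates happen in each epoch, which is the first number on each line of the training log. A quick sketch of the arithmetic, assuming the 60,000 MNIST training images are all used for training:

```python
import math

n_train = 60000  # MNIST training images

for batch_size in (200, 1024):
    # The last batch may be smaller, so round up.
    steps_per_epoch = math.ceil(n_train / batch_size)
    print(batch_size, steps_per_epoch)
```

With a batch size of 1024, each epoch makes fewer but larger updates, so each step takes longer while the epoch as a whole can finish sooner, and more epochs may be needed to reach the same accuracy.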


In [11]:
# Write your answer here
model = convolutional_model()
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=1024, verbose=2)
scores = model.evaluate(X_test, y_test, verbose=0)
print('Accuracy: {:.2f} % \n Error: {:.2f} %'.format(scores[1]*100, 100 - scores[1]*100))

Epoch 1/10
59/59 - 2s - 38ms/step - accuracy: 0.6530 - loss: 1.3176 - val_accuracy: 0.8882 - val_loss: 0.4036
Epoch 2/10
59/59 - 1s - 22ms/step - accuracy: 0.9151 - loss: 0.2893 - val_accuracy: 0.9455 - val_loss: 0.1912
Epoch 3/10
59/59 - 1s - 23ms/step - accuracy: 0.9492 - loss: 0.1737 - val_accuracy: 0.9616 - val_loss: 0.1303
Epoch 4/10
59/59 - 1s - 22ms/step - accuracy: 0.9617 - loss: 0.1309 - val_accuracy: 0.9684 - val_loss: 0.1057
Epoch 5/10
59/59 - 1s - 22ms/step - accuracy: 0.9678 - loss: 0.1093 - val_accuracy: 0.9731 - val_loss: 0.0890
Epoch 6/10
59/59 - 1s - 22ms/step - accuracy: 0.9720 - loss: 0.0935 - val_accuracy: 0.9746 - val_loss: 0.0811
Epoch 7/10
59/59 - 1s - 22ms/step - accuracy: 0.9748 - loss: 0.0828 - val_accuracy: 0.9785 - val_loss: 0.0707
Epoch 8/10
59/59 - 1s - 21ms/step - accuracy: 0.9773 - loss: 0.0748 - val_accuracy: 0.9799 - val_loss: 0.0642
Epoch 9/10
59/59 - 1s - 21ms/step - accuracy: 0.9791 - loss: 0.0681 - val_accuracy: 0.9813 - val_loss: 0.0597
Epoch 10/1

Now, let's see how the number of epochs affects the training time and accuracy of the model.
To do this, keep batch_size=1024, set epochs=25, and check the effect on accuracy.
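Note that the next cell passes `validation_split=0.2` instead of `validation_data`, so Keras holds out the last 20% of the training samples for validation and trains on the rest. The sketch below works out what that implies for the number of steps; the variable names are assumptions for illustration.

```python
import math

n_samples = 60000   # MNIST training images
val_percent = 20    # validation_split=0.2, expressed as a whole percentage
batch_size = 1024
epochs = 25

n_val = n_samples * val_percent // 100
n_train = n_samples - n_val

steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(n_train, n_val, steps_per_epoch, total_updates)
```

Because the model trains on fewer images here than in the previous cell, the two runs are not a strictly like-for-like comparison of epoch count alone.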


In [12]:
# Write your answer here
model = convolutional_model()
model.fit(X_train, y_train, validation_split=0.2, epochs=25, batch_size=1024, verbose=2)
scores = model.evaluate(X_test, y_test, verbose=0)
print('Accuracy: {:.2f} % \n Error: {:.2f} %'.format(scores[1]*100, 100 - scores[1]*100))

Epoch 1/25
47/47 - 2s - 40ms/step - accuracy: 0.6080 - loss: 1.4443 - val_accuracy: 0.8531 - val_loss: 0.4901
Epoch 2/25
47/47 - 1s - 21ms/step - accuracy: 0.8913 - loss: 0.3702 - val_accuracy: 0.9255 - val_loss: 0.2569
Epoch 3/25
47/47 - 1s - 22ms/step - accuracy: 0.9310 - loss: 0.2369 - val_accuracy: 0.9457 - val_loss: 0.1912
Epoch 4/25
47/47 - 1s - 23ms/step - accuracy: 0.9488 - loss: 0.1784 - val_accuracy: 0.9544 - val_loss: 0.1564
Epoch 5/25
47/47 - 1s - 24ms/step - accuracy: 0.9570 - loss: 0.1480 - val_accuracy: 0.9616 - val_loss: 0.1330
Epoch 6/25
47/47 - 1s - 23ms/step - accuracy: 0.9625 - loss: 0.1277 - val_accuracy: 0.9662 - val_loss: 0.1206
Epoch 7/25
47/47 - 1s - 23ms/step - accuracy: 0.9652 - loss: 0.1155 - val_accuracy: 0.9680 - val_loss: 0.1100
Epoch 8/25
47/47 - 1s - 22ms/step - accuracy: 0.9700 - loss: 0.1043 - val_accuracy: 0.9714 - val_loss: 0.0989
Epoch 9/25
47/47 - 1s - 22ms/step - accuracy: 0.9728 - loss: 0.0937 - val_accuracy: 0.9737 - val_loss: 0.0929
Epoch 10/2