<a href="https://colab.research.google.com/github/sagnikbiswas/My_Keras_Notebooks/blob/main/keras_convolutional.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Building a Convolutional Neural Network with Keras

In this notebook we train a convolutional neural network (CNN) on the MNIST dataset, a database of 28x28 greyscale images of handwritten digits (60,000 for training and 10,000 for testing).

**Use GPU for training if you are using Colab. It is ~25 times faster than CPU.**

In [1]:
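# Import everything through tensorflow.keras so all objects come from the same Keras installation.
# Conv2D and MaxPooling2D are imported from the layers package directly, because the
# keras.layers.convolutional submodule is not available in recent Keras releases.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.datasets import mnist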

In [2]:
# load MNIST data.
(X_train, y_train), (X_test, y_test) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [3]:
# Why reshape with a fourth dimension 1? Because Keras Conv2D expects 4D input, i.e. sample_size, height, width, channel_size
# We have greyscale images in MNIST so the channel_size is 1
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1).astype("float32")
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1).astype("float32")

# Normalize pixel values from the range [0, 255] to [0, 1]
X_train /= 255
X_test /= 255

In [4]:
# Encode y values into one hot vectors
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# number of outputs in the model
num_classes = y_test.shape[1]
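
To make the one-hot encoding step concrete, here is a minimal pure-Python sketch of what `to_categorical` produces for integer labels. The helper `one_hot` is a hypothetical name used only for illustration; it is not part of Keras, and the real function returns a NumPy array rather than nested lists.

```python
def one_hot(labels, num_classes=None):
    """Return one list of 0.0/1.0 values per integer label (illustrative helper)."""
    if num_classes is None:
        # Same default rule as to_categorical: largest label + 1
        num_classes = max(labels) + 1
    return [[1.0 if i == label else 0.0 for i in range(num_classes)]
            for label in labels]

# Three example digit labels encoded over the 10 MNIST classes
encoded = one_hot([5, 0, 4], num_classes=10)
print(encoded[0])       # the single 1.0 sits at index 5
print(len(encoded[0]))  # 10 entries, one per class
```

The length of each encoded vector is what the notebook reads back as `num_classes` via `y_test.shape[1]`.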

###Building the model

####Description

A CNN typically has:
    an input layer,
    one or more convolution + ReLU layers (Keras Conv2D),
    one or more corresponding max pooling layers (Keras MaxPooling2D),
    a flatten layer that flattens the output into a 1D vector,
    one or more dense layers,
    and a softmax output layer.

In `keras_classification` we trained on the same MNIST data using a conventional (fully connected) NN, which needed 784 input neurons, one for every feature (pixel × channel) of the image. For higher-resolution images, feeding millions of inputs into dense layers becomes impractical, which is where CNNs come in. In a CNN, the convolution filters extract local features (edges etc.), the pooling layers reduce dimensionality and with it the risk of overfitting, and the dense layers at the end are the same as in a normal NN.
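
As a rough illustration of "filters extract features, pooling reduces dimensionality", the sketch below applies a hand-written 3x3 edge filter and 2x2 max pooling to a tiny 6x6 image in plain Python. The function names, the toy image, and the kernel are assumptions chosen for the example; in a real CNN the kernel values are learned during training. Like Keras Conv2D, the sketch slides the kernel without flipping it.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(fmap):
    """Replace negative responses with zero."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2, stride=2):
    """Keep the largest value in each size x size window."""
    out_h = (len(fmap) - size) // stride + 1
    out_w = (len(fmap[0]) - size) // stride + 1
    return [[max(fmap[r * stride + i][c * stride + j]
                 for i in range(size) for j in range(size))
             for c in range(out_w)]
            for r in range(out_h)]

# Toy 6x6 image: dark left half (0), bright right half (1)
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-picked kernel that responds to a dark-to-bright vertical edge
kernel = [[-1, 0, 1] for _ in range(3)]

features = relu(conv2d_valid(image, kernel))  # 4x4 map, each row is [0, 3, 3, 0]
pooled = max_pool(features)                   # 2x2 map, [[3, 3], [3, 3]]
print(features)
print(pooled)
```

The feature map is non-zero only where the window straddles the edge, and pooling shrinks the 4x4 map to 2x2 while keeping the strongest response.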

The following model has one Conv2D layer and one MaxPooling2D layer. Conv2D uses filters=16, kernel_size=(5,5), and the default strides=(1,1). MaxPooling2D uses the default pool_size=(2,2) and strides=(2,2).
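
The layer sizes implied by these settings can be checked by hand. The sketch below assumes the Keras default of `padding='valid'` (no padding) for both layers and uses the usual output-size formula `(size - window) // stride + 1`; the helper name `valid_out` is hypothetical. The dense layer sizes (100 hidden units, 10 outputs) are taken from the model defined in this notebook.

```python
def valid_out(size, window, stride=1):
    """Output length along one axis for an unpadded convolution or pooling window."""
    return (size - window) // stride + 1

side = valid_out(28, window=5, stride=1)    # Conv2D 5x5, stride 1: 28 -> 24
conv_side = side
side = valid_out(side, window=2, stride=2)  # MaxPooling2D 2x2, stride 2: 24 -> 12
flat = side * side * 16                     # Flatten over 16 filters: 2304 values

conv_params = (5 * 5 * 1 + 1) * 16          # 25 weights + 1 bias per filter
dense_params = (flat + 1) * 100             # Dense(100): weights + biases
output_params = (100 + 1) * 10              # Dense(10) softmax layer
total_params = conv_params + dense_params + output_params

print(conv_side, side, flat)                # 24 12 2304
print(conv_params, dense_params, output_params, total_params)
```

Almost all of the trainable parameters sit in the first dense layer, so how much the pooling layer shrinks the feature maps largely determines the model size.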

####Model

In [5]:
def my_convolutional_model_1():
    my_model = Sequential()
    my_model.add(Conv2D(16, (5,5), activation='relu', input_shape=(28, 28, 1)))
    my_model.add(MaxPooling2D(strides=(2, 2)))

    my_model.add(Flatten())
    my_model.add(Dense(100, activation='relu'))
    my_model.add(Dense(num_classes, activation='softmax'))

    my_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return my_model

###Train

In [6]:
my_model_1 = my_convolutional_model_1()

my_model_1.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200, verbose=2)

Epoch 1/10
300/300 - 44s - loss: 0.2829 - accuracy: 0.9216 - val_loss: 0.0909 - val_accuracy: 0.9733
Epoch 2/10
300/300 - 1s - loss: 0.0828 - accuracy: 0.9759 - val_loss: 0.0655 - val_accuracy: 0.9794
Epoch 3/10
300/300 - 1s - loss: 0.0574 - accuracy: 0.9825 - val_loss: 0.0507 - val_accuracy: 0.9831
Epoch 4/10
300/300 - 1s - loss: 0.0459 - accuracy: 0.9861 - val_loss: 0.0468 - val_accuracy: 0.9856
Epoch 5/10
300/300 - 1s - loss: 0.0374 - accuracy: 0.9888 - val_loss: 0.0410 - val_accuracy: 0.9856
Epoch 6/10
300/300 - 1s - loss: 0.0307 - accuracy: 0.9907 - val_loss: 0.0481 - val_accuracy: 0.9853
Epoch 7/10
300/300 - 1s - loss: 0.0272 - accuracy: 0.9917 - val_loss: 0.0354 - val_accuracy: 0.9886
Epoch 8/10
300/300 - 1s - loss: 0.0211 - accuracy: 0.9934 - val_loss: 0.0379 - val_accuracy: 0.9875
Epoch 9/10
300/300 - 1s - loss: 0.0174 - accuracy: 0.9947 - val_loss: 0.0375 - val_accuracy: 0.9876
Epoch 10/10
300/300 - 1s - loss: 0.0143 - accuracy: 0.9958 - val_loss: 0.0363 - val_accuracy: 0.9894

<keras.callbacks.History at 0x7fce3a5b4750>

###Test

In [7]:
scores = my_model_1.evaluate(X_test, y_test, verbose=0)
print(f"Accuracy: {scores[1]*100}%")

Accuracy: 98.94000291824341%


###Building another model

This model has two Conv2D and two MaxPooling2D layers. Generally we use fewer filters (16) in the earlier layers, where the spatial dimensions are larger, and more filters (32) in the later layers, where the dimensions are reduced. In this example, using more than two conv and pool layers did not improve accuracy. Using a (1,1) stride for pooling, as in the model below, resulted in more oscillatory accuracy.
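
To see what the pooling stride does to the size of this two-block model, the sketch below traces the spatial dimensions through both Conv2D/MaxPooling2D pairs for a pooling stride of 1 and of 2. It assumes unpadded (`padding='valid'`) layers and the default 2x2 pool window; the helper names are hypothetical.

```python
def valid_out(size, window, stride):
    """Output length along one axis for an unpadded convolution or pooling window."""
    return (size - window) // stride + 1

def flattened_size(pool_stride):
    """Number of values reaching Flatten for the two-block model on 28x28 input."""
    side = valid_out(28, 5, 1)              # Conv2D(16, (5,5))
    side = valid_out(side, 2, pool_stride)  # MaxPooling2D, 2x2 window
    side = valid_out(side, 3, 1)            # Conv2D(32, (3,3))
    side = valid_out(side, 2, pool_stride)  # MaxPooling2D, 2x2 window
    return side * side * 32

for stride in (1, 2):
    print(stride, flattened_size(stride))
# stride 1 -> 12800 values, stride 2 -> 800 values
```

With a (1,1) stride the pooling layers barely shrink the feature maps, so the first dense layer receives 16 times more inputs than it would with a (2,2) stride.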

In [8]:
def my_convolutional_model_2():
    my_model = Sequential()
    my_model.add(Conv2D(16, (5,5), activation='relu', input_shape=(28, 28, 1)))
    my_model.add(MaxPooling2D(strides=(1, 1)))

    my_model.add(Conv2D(32, (3,3), activation='relu'))
    my_model.add(MaxPooling2D(strides=(1,1)))

    my_model.add(Flatten())
    my_model.add(Dense(100, activation='relu'))
    my_model.add(Dense(num_classes, activation='softmax'))

    my_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return my_model

In [11]:
my_model_2 = my_convolutional_model_2()

my_model_2.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=64, verbose=2)

Epoch 1/10
938/938 - 4s - loss: 0.1297 - accuracy: 0.9609 - val_loss: 0.0387 - val_accuracy: 0.9871
Epoch 2/10
938/938 - 3s - loss: 0.0416 - accuracy: 0.9875 - val_loss: 0.0316 - val_accuracy: 0.9882
Epoch 3/10
938/938 - 3s - loss: 0.0285 - accuracy: 0.9909 - val_loss: 0.0235 - val_accuracy: 0.9923
Epoch 4/10
938/938 - 3s - loss: 0.0221 - accuracy: 0.9929 - val_loss: 0.0335 - val_accuracy: 0.9891
Epoch 5/10
938/938 - 3s - loss: 0.0177 - accuracy: 0.9940 - val_loss: 0.0359 - val_accuracy: 0.9887
Epoch 6/10
938/938 - 3s - loss: 0.0131 - accuracy: 0.9959 - val_loss: 0.0479 - val_accuracy: 0.9866
Epoch 7/10
938/938 - 3s - loss: 0.0108 - accuracy: 0.9964 - val_loss: 0.0348 - val_accuracy: 0.9910
Epoch 8/10
938/938 - 3s - loss: 0.0097 - accuracy: 0.9967 - val_loss: 0.0331 - val_accuracy: 0.9911
Epoch 9/10
938/938 - 3s - loss: 0.0087 - accuracy: 0.9972 - val_loss: 0.0287 - val_accuracy: 0.9919
Epoch 10/10
938/938 - 3s - loss: 0.0063 - accuracy: 0.9980 - val_loss: 0.0320 - val_accuracy: 0.9918

<keras.callbacks.History at 0x7fce2964c890>
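
The step counts in the two training logs (300/300 and 938/938) follow from the batch size: each epoch takes one step per batch, and the last batch may be partial. A small check, assuming the 60,000 MNIST training images (`steps_per_epoch` here is an illustrative helper, not the Keras argument of the same name):

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

print(steps_per_epoch(60000, 200))  # 300
print(steps_per_epoch(60000, 64))   # 938 (the last batch holds only 32 images)
```

The smaller batch size means roughly three times as many weight updates per epoch.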

In [12]:
scores2 = my_model_2.evaluate(X_test, y_test, verbose=0)
print(f"Accuracy: {scores2[1]*100}%")

Accuracy: 99.18000102043152%


The accuracies of the two models are close (98.94% vs. 99.18%). The lower batch size (64 instead of 200) and the smaller pooling stride ((1,1) instead of (2,2)) appear to improve accuracy. Any other slight improvement in the second model seems unreliable at best.
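
One rough way to judge how close the two results are is to convert the accuracies into error counts on the 10,000 test images and compare the gap with its sampling uncertainty. The sketch below treats each test run as an independent binomial sample, which is a simplifying assumption: both models are scored on the same images and each was trained only once, so this is an order-of-magnitude check rather than a proper significance test.

```python
import math

n_test = 10000                 # size of the MNIST test set
acc_1 = 0.9894000291824341     # accuracy printed for my_model_1
acc_2 = 0.9918000102043152     # accuracy printed for my_model_2

def error_count(acc, n):
    """Number of misclassified images implied by an accuracy."""
    return round((1 - acc) * n)

def std_err(acc, n):
    """Binomial standard error of an accuracy measured on n samples."""
    return math.sqrt(acc * (1 - acc) / n)

errors_1 = error_count(acc_1, n_test)   # 106 misclassified images
errors_2 = error_count(acc_2, n_test)   # 82 misclassified images

diff = acc_2 - acc_1
se_diff = math.sqrt(std_err(acc_1, n_test) ** 2 + std_err(acc_2, n_test) ** 2)
ratio = diff / se_diff

print(errors_1, errors_2)
print(f"difference = {diff:.4f}, standard error = {se_diff:.4f}, ratio = {ratio:.2f}")
```

The gap amounts to 24 test images and comes out at less than two standard errors under these assumptions, which fits the cautious reading above.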