# Convolutional Neural Networks

In this notebook, we will take a closer look at convolutional neural networks (CNNs) for image classification. We will show how a CNN can improve on our previous model for classifying handwritten digits in the MNIST dataset. This is based on the following code example from the Keras website: [https://keras.io/examples/vision/mnist_convnet/](https://keras.io/examples/vision/mnist_convnet/).

First, however, we will redo the model from chapter 2 of the book. 

In [1]:
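import numpy as np
import matplotlib.pyplot as plt

# Use the Keras API bundled with TensorFlow; a separate `import keras`
# would be shadowed by this import anyway.
from tensorflow import keras
from tensorflow.keras import layers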

In [2]:
from tensorflow.keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [3]:
model = keras.Sequential([
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax")
])

In [4]:
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

In [5]:
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype("float32") / 255
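
To make the effect of this preprocessing concrete, the following sketch applies the same two steps (flattening and scaling) to a tiny synthetic batch. The random array is only a stand-in for real MNIST images, and the variable names are chosen for this illustration.

```python
import numpy as np

# Synthetic stand-in for a batch of two 28x28 grayscale images (not real MNIST data)
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8)

# Flatten each image into a vector of 784 values and scale pixels to [0, 1]
flat = images.reshape((2, 28 * 28)).astype("float32") / 255

print(images.shape, images.dtype)
print(flat.shape, flat.dtype)
print(flat.min() >= 0.0, flat.max() <= 1.0)
```

Reshaping is row-major, so the pixel at row 3, column 5 of an image ends up at position `3 * 28 + 5` of the flattened vector.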

In [6]:
model.fit(train_images, train_labels, epochs=5, batch_size=128)

Epoch 1/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.8715 - loss: 0.4466
Epoch 2/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.9654 - loss: 0.1141
Epoch 3/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.9782 - loss: 0.0732
Epoch 4/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.9850 - loss: 0.0497
Epoch 5/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.9892 - loss: 0.0364


<keras.src.callbacks.history.History at 0x2e38e205420>

In [7]:
model.summary()
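
The printed summary is not reproduced here. As a rough cross-check, the parameter count of a fully connected layer can be worked out by hand: each `Dense` layer has one weight per input-output pair plus one bias per output unit. The helper `dense_params` below is a hypothetical name used only for this illustration.

```python
def dense_params(n_inputs: int, n_outputs: int) -> int:
    """Weights (n_inputs * n_outputs) plus one bias per output unit."""
    return n_inputs * n_outputs + n_outputs

# Layer sizes taken from the model above: 28*28 inputs -> 512 -> 10
hidden = dense_params(28 * 28, 512)
output = dense_params(512, 10)
total = hidden + output

print("hidden layer:", hidden)   # 401920
print("output layer:", output)   # 5130
print("total:", total)           # 407050
```

Almost all of the parameters sit in the first layer, because every one of the 784 pixels is connected to every one of the 512 hidden units.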

In [8]:
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"test_acc: {test_acc}")

313/313 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9757 - loss: 0.0799
test_acc: 0.9800000190734863


## Building a CNN model for the MNIST dataset

Now, let us build a CNN model for the same data.

In [9]:
num_classes = 10
input_shape = (28, 28, 1)

model = keras.Sequential(
    [
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ]
)
model.summary()
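
The shapes and parameter counts that the summary reports can be traced by hand. The sketch below assumes the Keras defaults used in the model above (stride 1 and `"valid"` padding for `Conv2D`, non-overlapping windows for `MaxPooling2D`); the helper names `conv2d_valid` and `max_pool` are hypothetical and exist only for this illustration.

```python
def conv2d_valid(h, w, c_in, filters, k=3):
    """Output shape and parameter count of a stride-1 Conv2D with 'valid' padding."""
    params = k * k * c_in * filters + filters  # kernel weights + one bias per filter
    return (h - k + 1, w - k + 1, filters), params

def max_pool(h, w, c, p=2):
    """Output shape of non-overlapping p x p max pooling (leftover rows/columns are dropped)."""
    return (h // p, w // p, c)

shape = (28, 28, 1)
shape, p1 = conv2d_valid(*shape, filters=32)   # (26, 26, 32)
shape = max_pool(*shape)                       # (13, 13, 32)
shape, p2 = conv2d_valid(*shape, filters=64)   # (11, 11, 64)
shape = max_pool(*shape)                       # (5, 5, 64)

flat = shape[0] * shape[1] * shape[2]          # 1600 features after Flatten
p3 = flat * 10 + 10                            # final Dense layer with 10 outputs
total = p1 + p2 + p3                           # Flatten and Dropout add no parameters

print(shape, flat)
print(p1, p2, p3, total)
```

Under these assumptions the CNN has 34,826 trainable parameters, far fewer than the dense model, because each convolution kernel is shared across all positions in the image.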

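Before we can train it, we need the data in the right format. A CNN works directly on the 2D image grid, so we do not need to flatten the images into vectors. The only preprocessing needed is to scale the pixel values, add a channel dimension, and one-hot encode the labels, as done in the cell below.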

In [10]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Scale images to the [0, 1] range
x_train = x_train.astype("float32") / 255
x_test = x_test.astype("float32") / 255
# Make sure images have shape (28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)

# Convert class vectors to binary class matrices (one-hot encoding)
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
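
The dense model earlier used integer labels with `sparse_categorical_crossentropy`, whereas this model uses one-hot labels with `categorical_crossentropy`. For a given sample, the two losses compute the same value; only the label format differs. The following pure-Python sketch illustrates this. The probabilities are made up, and the helper functions are simplified stand-ins, not the Keras implementations.

```python
import math

def one_hot(label, num_classes):
    """Binary vector with a 1.0 at position `label`."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy for a one-hot target vector."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))

def sparse_categorical_crossentropy(label, y_pred):
    """Cross-entropy for an integer target label."""
    return -math.log(y_pred[label])

probs = [0.1, 0.7, 0.2]  # made-up softmax output for a 3-class problem
label = 1

loss_dense = categorical_crossentropy(one_hot(label, 3), probs)
loss_sparse = sparse_categorical_crossentropy(label, probs)
print(one_hot(label, 3))
print(loss_dense, loss_sparse)
```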

Now, let us train the model.

In [11]:
batch_size = 128
epochs = 15

model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)

Epoch 1/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 11s 23ms/step - accuracy: 0.7662 - loss: 0.7710 - val_accuracy: 0.9792 - val_loss: 0.0809
Epoch 2/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 9s 22ms/step - accuracy: 0.9611 - loss: 0.1256 - val_accuracy: 0.9847 - val_loss: 0.0572
Epoch 3/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 11s 25ms/step - accuracy: 0.9723 - loss: 0.0874 - val_accuracy: 0.9862 - val_loss: 0.0484
Epoch 4/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 11s 25ms/step - accuracy: 0.9770 - loss: 0.0739 - val_accuracy: 0.9870 - val_loss: 0.0444
Epoch 5/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 10s 23ms/step - accuracy: 0.9803 - loss: 0.0645 - val_accuracy: 0.9885 - val_loss: 0.0402
Epoch 6/15
422/422 ━━━━━━━━━━━━━━━━━━━━ 9s 21ms/step - accuracy: 0.9830 - loss: 0.0546 - val_accuracy: 0.9880 - val_loss: 0.0374
Epoch 7/15
422

<keras.src.callbacks.history.History at 0x2e39086e7a0>
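
The step counts in the progress bars follow from the dataset size and the batch size. With `validation_split=0.1`, Keras holds out the last 10% of the samples for validation, so only 54,000 of the 60,000 training images are used for the weight updates. The sketch below reproduces the arithmetic; the helper names are hypothetical, and the rounding is a simplification rather than the exact Keras internals.

```python
import math

def split_sizes(n_samples, validation_split):
    """Number of training and validation samples for a given split fraction."""
    n_val = round(n_samples * validation_split)
    return n_samples - n_val, n_val

def steps_per_epoch(n_samples, batch_size):
    """Number of batches per epoch; the last batch may be smaller."""
    return math.ceil(n_samples / batch_size)

n_train, n_val = split_sizes(60000, 0.1)
print(n_train, n_val)                  # 54000 6000
print(steps_per_epoch(n_train, 128))   # 422, as in the CNN training log
print(steps_per_epoch(60000, 128))     # 469, as in the dense model's log
```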

Finally, we can evaluate the model.

In [12]:
score = model.evaluate(x_test, y_test, verbose=0)
print("Test loss:", score[0])
print("Test accuracy:", score[1])

Test loss: 0.02726016938686371
Test accuracy: 0.9904999732971191


Our original model was already fairly good, with a test accuracy of about 98.0%, but the CNN improves on it noticeably: its test accuracy of about 99.0% roughly halves the error rate (from about 2.0% to about 1.0%). However, the CNN took considerably longer to train (roughly 10 s per epoch, compared with 2 to 3 s for the dense model), which also has to be taken into account. On other, more complicated image classification tasks, CNNs have turned out to perform very well.

## Cat vs dog CNN

In this section, we will build a CNN to classify images of cats and dogs. It is based on a famous Kaggle dataset available here: [https://www.kaggle.com/c/dogs-vs-cats/data](https://www.kaggle.com/c/dogs-vs-cats/data). We will, however, only look at a small subset of the dataset (1200 training images, 200 validation images, and 600 test images), which is also available on the course Moodle page.

We first need to load the datasets and preprocess them properly. Luckily, Keras has utility functions that make this easy.

In [13]:
import os
from tensorflow.keras.utils import image_dataset_from_directory

# Root folder of the small cats-vs-dogs subset; adjust to where the data is stored locally
path = "../Notebooks and data-15/dogs_vs_cats_tiny/dogs_vs_cats_tiny"

# Each call reads one split, resizes the images to 180x180, and groups them in batches of 32
train_dataset = image_dataset_from_directory(
    os.path.join(path, "train"),
    image_size=(180, 180),
    batch_size=32)
validation_dataset = image_dataset_from_directory(
    os.path.join(path, "validation"),
    image_size=(180, 180),
    batch_size=32)
test_dataset = image_dataset_from_directory(
    os.path.join(path, "test"),
    image_size=(180, 180),
    batch_size=32)

NotFoundError: Could not find directory dogs_vs_cats_tiny/train
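
The error above indicates that the directory passed to `image_dataset_from_directory` could not be found from the notebook's working directory. A small pre-flight check with `pathlib` can make such problems easier to diagnose. This is only a sketch: the helper `missing_split_dirs` is hypothetical, and the split names and the root path are taken from the cell above.

```python
from pathlib import Path

def missing_split_dirs(root, splits=("train", "validation", "test")):
    """Return the split directories under `root` that do not exist."""
    root = Path(root)
    return [str(root / s) for s in splits if not (root / s).is_dir()]

missing = missing_split_dirs("../Notebooks and data-15/dogs_vs_cats_tiny/dogs_vs_cats_tiny")
if missing:
    # Relative paths are resolved against the current working directory
    print("Missing directories:", missing)
    print("Current working directory:", Path.cwd())
```

If any directory is reported as missing, the root path needs to be adjusted to the location of the downloaded data before the datasets can be loaded.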

Let us now build a CNN model to classify cats vs dogs.

In [None]:
c_and_d_model = keras.Sequential(
    [
        keras.Input(shape=(180, 180, 3)),
        layers.Rescaling(1./255),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(128, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(256, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dense(1, activation="sigmoid"),
    ]
)
c_and_d_model.summary()

And let us compile it.

In [None]:
c_and_d_model.compile(loss="binary_crossentropy",
                      optimizer="rmsprop",
                      metrics=["accuracy"])
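
With a single sigmoid output unit, the model produces one number between 0 and 1 per image, which is read as the probability of the class labelled 1. The `binary_crossentropy` loss penalises confident wrong answers much more heavily than hesitant ones. The following pure-Python sketch illustrates both ideas. The input values are made up, the functions are simplified stand-ins for the Keras versions, and the 0.5 threshold is the usual convention, assumed here.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, p):
    """Loss for one sample with true label y_true (0 or 1) and predicted probability p."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# Made-up pre-activation scores, for illustration only
for z in (-2.0, 0.0, 2.0):
    p = sigmoid(z)
    predicted_label = int(p >= 0.5)  # threshold at 0.5 (assumed convention)
    print(f"z={z:+.1f}  p={p:.3f}  predicted={predicted_label}  "
          f"loss if true label is 1: {binary_crossentropy(1, p):.3f}")
```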

And let us finally fit it.

In [None]:
history = c_and_d_model.fit(
    train_dataset,
    epochs=8,  # I have already checked when to stop...
    validation_data=validation_dataset)

In [None]:
accuracy = history.history["accuracy"]
val_accuracy = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(accuracy) + 1)
plt.plot(epochs, accuracy, "bo", label="Training accuracy")
plt.plot(epochs, val_accuracy, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.legend()
plt.figure()
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()
plt.show()
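
Besides reading the curves by eye, the epoch with the lowest validation loss can be picked out programmatically, which is one way to decide how many epochs to train for. The sketch below uses a hypothetical helper `best_epoch` and made-up loss values; they are not results from this notebook.

```python
def best_epoch(val_losses):
    """1-based index of the epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1

# Made-up validation losses, for illustration only
example_val_loss = [0.69, 0.66, 0.62, 0.60, 0.61, 0.65]
print(best_epoch(example_val_loss))
```

In the notebook, the same function could be applied to `history.history["val_loss"]`. A validation loss that starts rising again while the training loss keeps falling is the typical sign of overfitting.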

In [None]:
test_loss, test_acc = c_and_d_model.evaluate(test_dataset)
print(f"Test accuracy: {test_acc:.3f}")