
## Setup

Import the necessary libraries.

In [None]:
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
import tensorflow as tf

from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

Download the dataset.

In [None]:
import pathlib
dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file(fname='flower_photos', origin=dataset_url, untar=True)
data_dir = pathlib.Path(data_dir)

## Data exploration

The dataset should contain around 3,700 photos of flowers, split into five sub-directories, one for each class (daisy, dandelion, roses, sunflowers, tulips).

Check the total number of images in the dataset.

In [None]:
image_count = len(list(data_dir.glob('*/*.jpg')))
print(image_count)

Check each sub-directory.

In [None]:
# Use the path returned by get_file instead of a hard-coded Colab path
for name in os.listdir(data_dir):
  if name != 'LICENSE.txt':
    print(name)

Check the number of images in each sub-directory.

In [None]:
print(len(os.listdir(data_dir / 'dandelion')))

In [None]:
print(len(os.listdir(data_dir / 'daisy')))

In [None]:
print(len(os.listdir(data_dir / 'sunflowers')))

In [None]:
print(len(os.listdir(data_dir / 'roses')))

In [None]:
print(len(os.listdir(data_dir / 'tulips')))

Let us visualize this as a bar chart.

In [None]:
fig = plt.figure()
ax = fig.add_axes([0,0,1,1])
flowers = ['dandelion', 'daisy', 'sunflowers', 'roses', 'tulips']
n = [898, 633, 699, 641, 799]
plt.bar(flowers, n, color=['yellow','gold','darkorange','red','orchid'])
plt.xlabel('Class', fontsize=12)
plt.ylabel('# of images', fontsize=12)
plt.show()

Check the first few images in the daisy sub-directory.

In [None]:
daisy = list(data_dir.glob('daisy/*'))
PIL.Image.open(str(daisy[0]))

In [None]:
PIL.Image.open(str(daisy[1]))

## Data preparation

Now we want to go from a directory of images on disk to a dataset. We will do this using the `tf.keras.utils.image_dataset_from_directory` utility.

### Create a dataset

First, define parameters for the loader.

In [None]:
batch_size = 32
img_height = 180
img_width = 180

We will use an 80/20 validation split: 80% of the images will be used for training and 20% for validation.

Define the training dataset.

In [None]:
train_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)

Define the validation dataset.

In [None]:
val_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size)
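
The arithmetic behind the 80/20 split can be sketched in plain Python. The per-class counts are the ones listed for the bar chart earlier in this notebook; treating the validation share as the truncated 20% of the total is an assumption about how the loader rounds, not something taken from the Keras source.

```python
import math

# Per-class image counts, as listed for the bar chart earlier in this notebook.
class_counts = {"dandelion": 898, "daisy": 633, "sunflowers": 699,
                "roses": 641, "tulips": 799}
validation_split = 0.2
batch_size = 32

total = sum(class_counts.values())
# Assumption: the validation share is truncated to a whole number of images.
num_val = int(validation_split * total)
num_train = total - num_val

# Number of batches per epoch (the last batch may be smaller than batch_size).
train_batches = math.ceil(num_train / batch_size)
val_batches = math.ceil(num_val / batch_size)

print(f"total={total}, training={num_train}, validation={num_val}")
print(f"training batches={train_batches}, validation batches={val_batches}")
```

These numbers can be compared with the "Using ... files for training/validation" messages printed by the two loader calls above.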

Find the class names using the `class_names` attribute.

In [None]:
class_names = train_ds.class_names
print(class_names)

Check the dimensions of the `image_batch` and `labels_batch` tensors.

In [None]:
for image_batch, labels_batch in train_ds:
  print(image_batch.shape)
  print(labels_batch.shape)
  break

### Configure dataset for performance

In [None]:
AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
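
According to the `tf.data` documentation, `shuffle(1000)` does not shuffle the whole dataset in one go: it samples from a buffer of 1000 elements that is refilled from the input. The sketch below imitates that idea in plain Python. `buffered_shuffle` is a hypothetical helper written for illustration and is not TensorFlow's implementation.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in randomized order using a fixed-size buffer.

    Hypothetical helper that imitates the idea behind Dataset.shuffle;
    it is not TensorFlow's implementation.
    """
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)  # fill the buffer first
            continue
        # Emit a random buffered element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer

shuffled = list(buffered_shuffle(range(10), buffer_size=4, seed=0))
print(shuffled)
```

Because `train_ds` is already batched, the elements being shuffled in this notebook are batches rather than individual images, so a buffer of 1000 should cover the whole training set.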

Suppress warnings: the Keras preprocessing layers can be slow, which triggers warning messages during training.

In [None]:
tf.get_logger().setLevel('ERROR')

### Data standardization

The RGB channel values are in the `[0, 255]` range, which is not ideal for a neural network. So we will standardize the input values to be in the `[0, 1]` range by defining a normalization layer and including it in the model definition.

In [None]:
normalization_layer = layers.Rescaling(1./255)
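
As a minimal sketch of what this rescaling does to the pixel values, the plain-Python function below multiplies every channel value by `1/255`. The `rescale` helper and the three sample pixels are made up for illustration; the Keras layer applies the same multiplication (with an offset of 0) to whole image tensors.

```python
def rescale(pixels, scale=1.0 / 255):
    """Multiply every channel value by `scale` (offset 0)."""
    return [[value * scale for value in pixel] for pixel in pixels]

# Three hypothetical RGB pixels: black, mid-grey, white.
pixels = [[0, 0, 0], [128, 128, 128], [255, 255, 255]]
scaled = rescale(pixels)

for before, after in zip(pixels, scaled):
    print(before, "->", [round(v, 3) for v in after])
```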

## Data visualization

Visualize the first nine images from the training set along with their true labels.

In [None]:
image_batch, label_batch = next(iter(train_ds))

plt.figure(figsize=(10, 10))
for i in range(9):
  ax = plt.subplot(3, 3, i + 1)
  plt.imshow(image_batch[i].numpy().astype("uint8"))
  label = label_batch[i]
  plt.title(class_names[label],fontsize=18)
  plt.axis("off")

## Overfitting

To reduce overfitting during training, we will use data augmentation and dropout.

### Data augmentation

Data augmentation is the process of generating additional training data from the existing examples by augmenting the images using random transformations.

We will use Keras preprocessing layers to implement data augmentation.

In [None]:
data_augmentation = keras.Sequential(
  [
    layers.RandomFlip("horizontal", input_shape=(img_height, img_width, 3)),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
  ]
)
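
To make the idea of a random transformation concrete, here is a plain-Python sketch of a random horizontal flip on a tiny "image" stored as nested lists. The `random_horizontal_flip` helper and the flip probability of 0.5 are assumptions made for illustration; the Keras layers above work on image tensors and also apply rotation and zoom.

```python
import random

def random_horizontal_flip(image, rng, probability=0.5):
    """Mirror the image left-to-right with the given probability."""
    if rng.random() < probability:
        return [list(reversed(row)) for row in image]
    return [list(row) for row in image]

# A hypothetical 2x3 single-channel image.
image = [[1, 2, 3],
         [4, 5, 6]]

rng = random.Random(42)
# Each call may return a different variant of the same image;
# the class label of the image is unchanged by the flip.
variants = [random_horizontal_flip(image, rng) for _ in range(4)]
for variant in variants:
    print(variant)
```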

Visualize a few augmented examples by applying data augmentation to the same image several times.

In [None]:
plt.figure(figsize=(10, 10))
for images, _ in train_ds.take(1):
  for i in range(9):
    augmented_images = data_augmentation(images)
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(augmented_images[4].numpy().astype("uint8"))
    plt.grid()

### Dropout

Another technique that can be used to reduce overfitting is dropout regularization. When we apply dropout to a layer, it randomly drops out (by setting the activation to zero) a fraction of the layer's output units during training.

In [None]:
dropout_layer = layers.Dropout(0.2)
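
The behaviour described above can be sketched in plain Python. The `dropout` function below is an illustrative helper, not the Keras implementation. It follows the documented behaviour of `layers.Dropout`: units that are kept are scaled by `1 / (1 - rate)` during training, and the layer passes its input through unchanged at inference time.

```python
import random

def dropout(activations, rate, training, rng):
    """Inverted dropout on a list of activations (illustrative helper)."""
    if not training:
        return list(activations)  # inference: pass through unchanged
    keep_scale = 1.0 / (1.0 - rate)
    # Zero each unit with probability `rate`; scale the survivors.
    return [0.0 if rng.random() < rate else a * keep_scale
            for a in activations]

rng = random.Random(0)
activations = [1.0] * 10
train_out = dropout(activations, rate=0.2, training=True, rng=rng)
infer_out = dropout(activations, rate=0.2, training=False, rng=rng)

print("training: ", train_out)
print("inference:", infer_out)
```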

## Model definition

We will define a basic CNN model below with four convolution layers (each using a ReLU activation and followed by a max-pooling layer) and two fully connected layers.

In [None]:
num_classes = len(class_names)

model = Sequential([
  data_augmentation,
  layers.Rescaling(1./255),

  layers.Conv2D(filters=32, kernel_size=(5,5), padding='same', activation='relu'),
  layers.MaxPooling2D(pool_size=(2,2)),

  layers.Conv2D(filters=64, kernel_size=(3,3), padding='same', activation='relu'),
  layers.MaxPooling2D(pool_size=(2,2), strides=(2,2)),

  layers.Conv2D(filters=96, kernel_size=(3,3), padding='same', activation='relu'),
  layers.MaxPooling2D(pool_size=(2,2), strides=(2,2)),

  layers.Conv2D(filters=96, kernel_size=(3,3), padding='same', activation='relu'),
  layers.MaxPooling2D(pool_size=(2,2), strides=(2,2)),
  
  layers.Dropout(0.2),
  layers.Flatten(),
  layers.Dense(512, activation='relu'),
  layers.Dense(num_classes, name="outputs")
])
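
One way to sanity-check the architecture is to trace the feature-map size through the four convolution/pooling stages in the model code above. The sketch below does that arithmetic in plain Python, assuming the 180×180 RGB input defined earlier and the default `valid` padding of `MaxPooling2D`. `pooled_size` is a helper introduced here for illustration.

```python
def pooled_size(size, pool=2, stride=2):
    """Output length of a max-pooling layer without padding ('valid')."""
    return (size - pool) // stride + 1

height = width = 180
filters_per_stage = [32, 64, 96, 96]  # filters of the four Conv2D layers

shapes = []
for filters in filters_per_stage:
    # padding='same' convolutions keep height and width;
    # only the pooling layers shrink them.
    height, width = pooled_size(height), pooled_size(width)
    shapes.append((height, width, filters))

flattened = height * width * filters_per_stage[-1]
dense_params = flattened * 512 + 512  # weights plus biases of Dense(512)

for shape in shapes:
    print(shape)
print("flattened features:", flattened)
print("Dense(512) parameters:", dense_params)
```

The printed shapes can be compared with the output of `model.summary()`.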

## Model compilation and training

We will use the `tf.keras.optimizers.Adam` optimizer and `tf.keras.losses.SparseCategoricalCrossentropy` loss function.

In [None]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
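
Because the final `Dense` layer has no activation, the model outputs raw scores (logits), which is why `from_logits=True` is passed to the loss. For a single example, the loss is the negative log of the softmax probability assigned to the true class. The function below is a plain-Python sketch of that formula with made-up logits, not the Keras implementation.

```python
import math

def sparse_categorical_crossentropy_from_logits(logits, label):
    """Cross-entropy of one example: -log(softmax(logits)[label])."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(
        sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[label]

# Hypothetical raw outputs for the five flower classes.
logits = [2.0, 0.5, -1.0, 0.0, 0.1]

loss_if_class_0 = sparse_categorical_crossentropy_from_logits(logits, label=0)
loss_if_class_2 = sparse_categorical_crossentropy_from_logits(logits, label=2)

# The loss is small when the true class has the highest logit
# and large when it has a low one.
print(round(loss_if_class_0, 3), round(loss_if_class_2, 3))
```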

Check the model summary.

In [None]:
model.summary()

We are now ready to train the model for 50 epochs (full passes over the training set).

In [None]:
epochs = 50
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)

## Evaluating model performance

Plot training/validation set accuracy and loss.

In [None]:
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

In [None]:
epochs_range = range(1, epochs + 1)  # number epochs from 1 so the curves match the x-axis limits

f = plt.figure(figsize=(16, 8))
matplotlib.rcParams['font.family'] = "sans-serif"

plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='training set', color='tab:purple')
plt.plot(epochs_range, val_acc, label='validation set', color='tab:blue')
plt.locator_params(axis="x", integer=True, tight=True)
plt.legend(loc='lower right', prop={'size': 18})
plt.xlim([1, 50])
plt.xlabel('Epoch #',fontsize=18)
plt.ylabel('Accuracy',fontsize=18)
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='training set', color='tab:purple')
plt.plot(epochs_range, val_loss, label='validation set', color='tab:blue')
plt.locator_params(axis="x", integer=True, tight=True)
plt.xlim([1, 50])

plt.xlabel('Epoch #',fontsize=18)
plt.ylabel('Loss',fontsize=18)
plt.title('Training and Validation Loss')

plt.show()

## Visualizing predictions

Let us have a look at the first nine images in the validation set and compare their actual labels to the label predicted by the model.

In [None]:
plt.figure(figsize=(10, 10))

for images, labels in val_ds.take(1):

  predictions = model.predict(images)

  for i in range(9):
    label_true = class_names[labels[i]]

    score = tf.nn.softmax(predictions[i])
    label_pred = class_names[np.argmax(score)]

    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(images[i].numpy().astype("uint8"))
    plt.title("Actual: {} \n Predicted: {}".format(label_true, label_pred))
    plt.axis("off")

### Correctly classified images

Now let us have a look at a few images which were correctly classified.

In [None]:
plt.figure()
count = 0

for images, labels in val_ds.take(1):

  predictions = model.predict(images)

  for i in range(len(predictions)):

    label_true = class_names[labels[i]]
    score = tf.nn.softmax(predictions[i])

    label_pred = class_names[np.argmax(score)]

    if(label_pred==label_true):
      ax = plt.subplot(1, 3, count + 1)
      plt.imshow(images[i].numpy().astype("uint8"))
      plt.title("Actual: {} \n Predict: {}".format(label_true, label_pred))
      plt.axis("off")
      plt.grid()
      count += 1

    if(count==3):
      break

### Misclassified images

Finally, let us have a look at a few misclassified images.

In [None]:
plt.figure()
count = 0

for images, labels in val_ds.take(1):

  predictions = model.predict(images)

  for i in range(len(predictions)):

    label_true = class_names[labels[i]]
    score = tf.nn.softmax(predictions[i])

    label_pred = class_names[np.argmax(score)]

    if(label_pred!=label_true):
      ax = plt.subplot(1, 3, count + 1)
      plt.imshow(images[i].numpy().astype("uint8"))
      plt.title("Actual: {} \n Predict: {}".format(label_true, label_pred))
      plt.grid()
      plt.axis("off")
      count += 1

    if(count==3):
      break

## Confusion matrix

Confusion matrices are a useful tool for visualizing errors in classification problems. They fully specify the misclassifications: for each pair (true class, predicted class), the matrix records how many items of the true class were assigned to the predicted class, so every off-diagonal entry counts one kind of error.

In [None]:
test_labels = []
pred_labels = []

for images, labels in val_ds:

  predictions = model.predict(images)

  for i in range(len(predictions)):

    label_true = class_names[labels[i]]
    score = tf.nn.softmax(predictions[i])

    label_pred = class_names[np.argmax(score)]

    test_labels.append(label_true)
    pred_labels.append(label_pred)

In [None]:
# Pass the label order explicitly so rows and columns match display_labels
cm = confusion_matrix(test_labels, pred_labels, labels=class_names)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=class_names)
disp.plot(cmap=plt.cm.Blues)
plt.show()
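
The counting behind a confusion matrix can be sketched in a few lines of plain Python. The labels below are invented for illustration and `build_confusion_matrix` is a hypothetical helper. It follows the same convention as scikit-learn's `confusion_matrix`: rows are true classes and columns are predicted classes.

```python
def build_confusion_matrix(true_labels, predicted_labels, class_names):
    """Count (true class, predicted class) pairs; rows = true, columns = predicted."""
    index = {name: i for i, name in enumerate(class_names)}
    matrix = [[0] * len(class_names) for _ in class_names]
    for true, pred in zip(true_labels, predicted_labels):
        matrix[index[true]][index[pred]] += 1
    return matrix

# Invented labels for a three-class toy problem.
names = ["daisy", "roses", "tulips"]
y_true = ["daisy", "daisy", "roses", "tulips", "tulips", "roses"]
y_pred = ["daisy", "roses", "roses", "tulips", "roses", "roses"]

matrix = build_confusion_matrix(y_true, y_pred, names)
# Correct predictions sit on the diagonal.
accuracy = sum(matrix[i][i] for i in range(len(names))) / len(y_true)

for name, row in zip(names, matrix):
    print(f"{name:>7}: {row}")
print("accuracy:", round(accuracy, 3))
```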