<a href="https://colab.research.google.com/github/schmuecker/transfer-learning/blob/main/computer_vision/classification_from_scratch/cnn_human_action.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<h2 style='color:blue' align='center'>Data Augmentation To Address Overfitting In Flower Classification CNN</h2>

**In this notebook we will build a CNN to classify flower images. We will also see how the model overfits and how overfitting can be addressed using data augmentation. Data augmentation is the process of generating new training samples from the existing training dataset using transformations such as zoom, rotation, and changes in contrast.**

Credits: I used the official TensorFlow tutorial (https://www.tensorflow.org/tutorials/images/classification) as a reference and made a number of changes to make it simpler.

In the image below, four new training samples are generated from the original sample using different transformations.

<img src="https://github.com/schmuecker/transfer-learning/blob/main/computer_vision/classification_from_scratch/daisy2.JPG?raw=1" />
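
To make the idea concrete, here is a minimal NumPy-only sketch of generating four new samples from one image. It is an illustration, not the augmentation pipeline used later in this notebook: the helper name `augment`, the chosen transformations, and the random stand-in image are all assumptions made for the example.

```python
import numpy as np

def augment(image, rng):
    """Return four transformed copies of a square H x W x 3 uint8 image (illustrative only)."""
    flipped = image[:, ::-1, :]                   # horizontal flip
    rotated = np.rot90(image, k=1, axes=(0, 1))   # 90-degree rotation

    # Contrast change: stretch pixel values around the mean, then clip to [0, 255]
    mean = image.mean()
    factor = rng.uniform(0.5, 1.5)
    contrast = np.clip((image - mean) * factor + mean, 0, 255).astype(np.uint8)

    # Crude 2x zoom: crop the central region and repeat pixels to restore the size
    h, w = image.shape[:2]
    crop = image[h // 4: h - h // 4, w // 4: w - w // 4]
    zoomed = crop.repeat(2, axis=0).repeat(2, axis=1)[:h, :w]

    return [flipped, rotated, contrast, zoomed]

rng = np.random.default_rng(0)
# Random pixels standing in for a real photo (hypothetical input)
original = rng.integers(0, 256, size=(160, 160, 3), dtype=np.uint8)
samples = augment(original, rng)
print([s.shape for s in samples])
```

Each returned array keeps the original label, which is how augmentation enlarges the training set without collecting new photos.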

In [1]:
%pip install datasets

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [2]:
import matplotlib.pyplot as plt
import numpy as np
import cv2
import os
import PIL
import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from datasets import load_dataset
from PIL import Image

We will download the flowers dataset from Google's servers and store it locally. The `get_file` call below downloads the archive (.tgz) into `cache_dir`, which is set to `.`, meaning the current folder.

<h3 style='color:purple'>Load flowers dataset</h3>

In [3]:
dataset = load_dataset("Bingsu/Human_Action_Recognition")
dataset




DatasetDict({
    train: Dataset({
        features: ['image', 'labels'],
        num_rows: 12600
    })
    test: Dataset({
        features: ['image', 'labels'],
        num_rows: 5400
    })
})

In [4]:
dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url,  cache_dir='.', untar=True)
# cache_dir is where the download is stored; '.' means the current directory
# untar=True extracts the archive after downloading

In [5]:
#data_dir

In [6]:
import pathlib
data_dir = pathlib.Path(data_dir)
data_dir

PosixPath('datasets/flower_photos')

In [7]:
#list(data_dir.glob('*/*.jpg'))[:5]

In [8]:
#image_count = len(list(data_dir.glob('*/*.jpg')))
#print(image_count)

In [9]:
#roses = list(data_dir.glob('roses/*'))
#roses[:5]

In [10]:
#PIL.Image.open(str(roses[1]))

In [11]:
#tulips = list(data_dir.glob('tulips/*'))
#PIL.Image.open(str(tulips[0]))

<h3 style='color:purple'>Read flowers images from disk into numpy array using opencv</h3>

In [12]:
flowers_images_dict = {
    'roses': list(data_dir.glob('roses/*')),
    'daisy': list(data_dir.glob('daisy/*')),
    'dandelion': list(data_dir.glob('dandelion/*')),
    'sunflowers': list(data_dir.glob('sunflowers/*')),
    'tulips': list(data_dir.glob('tulips/*')),
}

In [13]:
#flowers_labels_dict = {
#    'roses': 0,
#    'daisy': 1,
#    'dandelion': 2,
#    'sunflowers': 3,
#    'tulips': 4,
#}

In [14]:
#flowers_images_dict['roses'][:5]

In [15]:
#str(flowers_images_dict['roses'][0])

In [16]:
img = cv2.imread(str(flowers_images_dict['roses'][0]))

In [17]:
img[0].shape

(320, 3)

In [18]:
type(img[0][0])

numpy.ndarray

In [19]:
img[0][0]

array([ 66, 103,  87], dtype=uint8)

In [20]:
#cv2.resize(img,(180,180)).shape

In [21]:
#X, y = [], []
#
#for flower_name, images in flowers_images_dict.items():
#    for image in images:
#        img = cv2.imread(str(image))
#        resized_img = cv2.resize(img,(180,180))
#        X.append(resized_img)
#        y.append(flowers_labels_dict[flower_name])

In [22]:
X_train, y_train = [], []

for a in dataset['train']:
  image, labels = a['image'], a['labels']
  img = np.asarray(image.resize((160,160)), dtype=np.float32)
  X_train.append(img)
  y_train.append(labels)

In [23]:
X_train = np.array(X_train)
y_train = np.array(y_train)

X_train.shape, y_train.shape

((12600, 160, 160, 3), (12600,))

In [24]:
X_train[0].shape

(160, 160, 3)

In [25]:
X_train[0][0][0]

array([234., 161.,  89.], dtype=float32)
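
The values above are raw intensities in the range 0 to 255. A common preprocessing step, and presumably what the later `X_train_scaled` name refers to, is dividing by 255 so that inputs lie in [0, 1]. The sketch below assumes that definition; `scale_pixels` is a hypothetical helper, not something defined elsewhere in this notebook.

```python
import numpy as np

def scale_pixels(images):
    """Map raw pixel intensities in [0, 255] to float32 values in [0, 1]."""
    return np.asarray(images, dtype=np.float32) / 255.0

# Tiny stand-in batch laid out like X_train: (samples, height, width, channels)
batch = np.array([[[[234., 161., 89.]]]], dtype=np.float32)
scaled = scale_pixels(batch)
print(scaled.shape, scaled.min(), scaled.max())
```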

<h3 style='color:purple'>Test data</h3>

In [38]:
X_test, y_test = [], []

for a in dataset['test']:
  image, labels = a['image'], a['labels']
  img = np.asarray(image.resize((160,160)), dtype=np.float32)
  X_test.append(img)
  y_test.append(labels)

X_test = np.array(X_test)
y_test = np.array(y_test)

<h3 style='color:purple'>Build convolutional neural network and train it</h3>

In [30]:
class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    '''
    Halts the training once loss drops below 0.3 and accuracy exceeds 80 percent

    Args:
      epoch (integer) - index of epoch (required but unused in the function definition below)
      logs (dict) - metric results from the training epoch
    '''
    logs = logs or {}

    # Check loss and accuracy; skip the check if either metric is missing
    loss, accuracy = logs.get('loss'), logs.get('accuracy')
    if loss is not None and accuracy is not None and loss < 0.3 and accuracy > 0.8:

      # Stop if both thresholds are met
      print("\nLoss is lower than 0.4 so cancelling training!")
      self.model.stop_training = True

# Instantiate class
callbacks = myCallback()
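
The stopping rule in the callback can be isolated as a plain function, which makes the thresholds easy to check without training a model. This is a sketch under stated assumptions: `should_stop` and its parameter names are hypothetical, and the defaults mirror the 0.3 loss and 0.8 accuracy thresholds used in the callback's condition.

```python
def should_stop(logs, max_loss=0.3, min_accuracy=0.8):
    """Return True when both the loss and accuracy thresholds are met."""
    logs = logs or {}
    loss = logs.get("loss")
    accuracy = logs.get("accuracy")
    # A missing metric means the rule cannot be evaluated, so keep training
    if loss is None or accuracy is None:
        return False
    return loss < max_loss and accuracy > min_accuracy

print(should_stop({"loss": 0.25, "accuracy": 0.91}))  # True
print(should_stop({"loss": 0.25, "accuracy": 0.75}))  # False
print(should_stop({"loss": 0.25}))                    # False
```

Note that both metrics are measured on the training data, so this rule says nothing about how well the model generalizes.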

Model architecture copied from: https://www.kaggle.com/code/debanjan2002/human-action-recognition-classification

In [35]:
num_classes = 15

model = Sequential([
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(pool_size=2,strides=2),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(pool_size=2,strides=2),
  layers.Flatten(),
  layers.Dense(512, activation='relu'),
  layers.Dense(num_classes, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy'])
              
model.fit(X_train, y_train, epochs = 100, callbacks=[callbacks])              

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Loss is lower than 0.4 so cancelling training!


<keras.callbacks.History at 0x7fa8393af410>

In [39]:
model.evaluate(X_test,y_test)



[21.869091033935547, 0.0640740767121315]

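**The evaluation above reports a test accuracy of only about 6.4% (loss 21.87), which is close to the 1/15 ≈ 6.7% expected from random guessing over 15 classes, even though training ran until accuracy exceeded the callback's 80% threshold. A gap this large is consistent with severe overfitting. Let's make some predictions before we use data augmentation to address it.**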
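
As a quick sanity check, the test accuracy from the evaluation output above can be compared with the accuracy of uniform random guessing. This small sketch assumes the 15 classes are roughly balanced, which is not verified in this notebook.

```python
num_classes = 15
chance_accuracy = 1 / num_classes      # expected accuracy of uniform random guessing
test_accuracy = 0.0640740767121315     # second value returned by model.evaluate above

print(f"chance: {chance_accuracy:.4f}, test: {test_accuracy:.4f}")
print("at or below chance" if test_accuracy <= chance_accuracy else "above chance")
```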

In [40]:
predictions = model.predict(X_test)
predictions



array([[1.25805836e-05, 1.18464285e-07, 9.99999821e-01, ...,
        1.61897391e-04, 3.17071946e-09, 9.99377429e-01],
       [2.14241984e-23, 2.49052420e-14, 1.15959365e-05, ...,
        2.44274739e-23, 1.00000000e+00, 6.46187804e-17],
       [9.99999344e-01, 1.03614957e-14, 4.44925184e-13, ...,
        1.63464529e-35, 2.53704154e-32, 0.00000000e+00],
       ...,
       [3.89393717e-01, 1.73525512e-02, 1.30133168e-03, ...,
        8.87449714e-04, 4.02354747e-01, 6.25553960e-03],
       [9.94157314e-01, 9.99826252e-01, 9.54675302e-03, ...,
        6.56399934e-10, 9.99969661e-01, 7.39748776e-02],
       [4.48311172e-21, 1.47763203e-04, 4.39406902e-01, ...,
        1.45447309e-04, 1.00000000e+00, 7.88521464e-33]], dtype=float32)

In [41]:
score = tf.nn.softmax(predictions[0])

In [42]:
np.argmax(score)

2

In [43]:
y_test[0]

0
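
The model's last layer uses a sigmoid activation, so each row of `predictions` holds independent scores that need not sum to 1. Applying softmax on top rescales them but cannot change which class ranks highest, because softmax is monotonic. A minimal NumPy sketch, with made-up scores in `outputs` standing in for real predictions:

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical sigmoid-style scores for two samples and three classes
outputs = np.array([[0.99, 0.20, 0.999],
                    [0.10, 0.90, 0.30]])

print(outputs.sum(axis=1))            # rows do not sum to 1
print(softmax(outputs).sum(axis=1))   # rows sum to 1 after softmax
print(np.argmax(outputs, axis=1), np.argmax(softmax(outputs), axis=1))
```

So `np.argmax(predictions[0])` would give the same class index as `np.argmax(score)` here; the softmax step only matters if the scores are to be read as a probability distribution.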

<h3 style='color:purple'>Improve Test Accuracy Using Data Augmentation</h3>

In [None]:
data_augmentation = keras.Sequential(
  [
    layers.RandomRotation(0.1),  # rotate by up to +/- 10% of a full turn
    layers.RandomZoom(0.1),      # zoom in or out by up to 10%
  ]
)

**Original Image**

In [None]:
plt.axis('off')
plt.imshow(X_train[0].astype("uint8"))  # X_train holds float32 pixels in [0, 255]

**Newly generated training sample using data augmentation**

In [None]:
#plt.axis('off')
#plt.imshow(data_augmentation(X_train[:1])[0].numpy().astype("uint8"))

<h3 style='color:purple'>Train the model using data augmentation and a drop out layer</h3>

In [None]:
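num_classes = 15  # same 15 action classes as the first model

# Assumed definition: scale pixels from [0, 255] to [0, 1]
# (X_train_scaled and X_test_scaled are not defined elsewhere in this notebook)
X_train_scaled = X_train / 255.0
X_test_scaled = X_test / 255.0

model = Sequential([
  data_augmentation,
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Dropout(0.2),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)  # raw logits; the loss below uses from_logits=True
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(X_train_scaled, y_train, epochs=30)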

In [None]:
model.evaluate(X_test_scaled,y_test)

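**With data augmentation and a dropout layer, the accuracy of test set predictions is reported to increase to 73.74%. The cells above have no recorded output in this notebook, so this figure is not confirmed here for the human action dataset.**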