
## Image Classification (CNN)

![animals.jpg](https://storage.googleapis.com/kaggle-datasets-images/1554380/2561346/c14cd64fb06842ad190298f9f4efaa49/dataset-cover.png?t=2021-08-26-19-14-08)

## Check if GPU is enabled

In [None]:
# check your Colab device
import tensorflow as tf  # Import tensorflow library
import pprint            # Import pprint library for better print format
device_name = tf.config.list_physical_devices()  # A list of device names, which could contain CPU and GPU
pprint.pprint(device_name)                       # Print the device_name

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
 PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]


***Note:*** If you use the GPU too often, your runtime durations will become shorter and disconnections more frequent. The cooldown period before you can connect to another GPU can grow from hours to days to weeks.

## **Lab Task Procedure**
0. Data preparation
1. Data preprocessing
2. Data generator **(Task 1)**
3. Build the model **(Task 2)**
4. Compile the model
5. Train the model
6. Evaluate the model
7. Save the model

## **Data Preparation**


1. Download the [Animal Species Classification Dataset](https://www.kaggle.com/datasets/utkarshsaxenadn/animal-image-classification-dataset) from [here](https://course.cse.ust.hk/comp2211/labs/lab8/animal-species-cls-v3.zip).
2. Upload the zip file to your Google Drive, into the folder `lab8` (the code cell below changes into `MyDrive/lab8`).
3. Run the following code cell to mount Google Drive and unzip the data.

Note: If unzipping takes more than three minutes, try deleting the previously unzipped folder on Google Drive and run the cell again.



In [None]:
from google.colab import drive
drive.mount("/content/drive")
%cd "drive/MyDrive/lab8"
# Uncomment the next line on the first run to unzip the data
#!unzip -q -o animal_species_v3.zip -d .

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
/content/drive/MyDrive/lab8


In [None]:
import os
data_dir = './animal_species_v3/train'
category_list = sorted(os.listdir(data_dir))
print(category_list)
print('Total categories:', len(category_list))

['Beetle', 'Butterfly', 'Cat', 'Cow', 'Dog', 'Elephant', 'Gorilla', 'Hippo', 'Lizard', 'Monkey', 'Mouse', 'Panda', 'Spider', 'Tiger', 'Zebra']
Total categories: 15


## **Animal Recognition**
---
About the data:
- Number of images: **7,500**.
- Number of classes: **15**.
- Image size: **(64, 64, 3)**.

Before data preprocessing, we visualize some of the images to get familiar with the data.

In [None]:
import os, cv2, random
import matplotlib.pyplot as plt

plt.figure(figsize=(24,8))
for i, cate in enumerate(category_list):
  img_names = random.sample(os.listdir(data_dir+'/'+cate), k=5)
  for j, img_name in enumerate(img_names): # we only show 5 images of each category
    img = plt.imread(data_dir+'/'+cate+'/'+img_name) # read the image
    plt.subplot(5, 15, 15*j+i+1) # plot the same category at the same column
    plt.imshow(img)
    plt.axis('off')
    if j == 0: # only show category name at the first row
      plt.title(cate)
plt.show()
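
The expression `15*j+i+1` in the plotting loop converts grid row `j` and column `i` into the 1-based, row-major index that `plt.subplot` expects. A minimal sketch of that arithmetic, where `subplot_index` is a hypothetical helper introduced only for illustration:

```python
def subplot_index(row: int, col: int, n_cols: int) -> int:
    """Return the 1-based, row-major index that plt.subplot expects."""
    return n_cols * row + col + 1

n_rows, n_cols = 5, 15  # 5 sample images (rows) for each of 15 categories (columns)
first_row = [subplot_index(0, col, n_cols) for col in range(n_cols)]
last_cell = subplot_index(n_rows - 1, n_cols - 1, n_cols)
print(first_row[0], first_row[-1], last_cell)
```

Because the column index is the category, every category ends up in its own column, with its five samples stacked vertically.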



In [None]:
# Import necessary libraries
import numpy as np
from sklearn.model_selection import train_test_split
import keras
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, AveragePooling2D
from keras.layers import Dense, Dropout, Flatten
from keras.callbacks import ModelCheckpoint

## 1. Data preprocessing

We first construct a mapping from string-type category names to integer-type class indices, for later use.

In [None]:
# Create a dict mapping the category name to the class index
# There should be 15 labels (0 to 14)
cate2Idx = {cate:idx for idx, cate in enumerate(category_list)}
print(cate2Idx)

{'Beetle': 0, 'Butterfly': 1, 'Cat': 2, 'Cow': 3, 'Dog': 4, 'Elephant': 5, 'Gorilla': 6, 'Hippo': 7, 'Lizard': 8, 'Monkey': 9, 'Mouse': 10, 'Panda': 11, 'Spider': 12, 'Tiger': 13, 'Zebra': 14}
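
When inspecting predictions later, it is often handy to go the other way, from a class index back to its category name. A small sketch under that assumption; `idx2Cate` is a hypothetical name that does not appear in the lab code:

```python
category_list = ['Beetle', 'Butterfly', 'Cat', 'Cow', 'Dog', 'Elephant', 'Gorilla', 'Hippo',
                 'Lizard', 'Monkey', 'Mouse', 'Panda', 'Spider', 'Tiger', 'Zebra']

# Forward mapping, as built in the cell above
cate2Idx = {cate: idx for idx, cate in enumerate(category_list)}

# Hypothetical inverse mapping: class index -> category name
idx2Cate = {idx: cate for cate, idx in cate2Idx.items()}

print(idx2Cate[cate2Idx['Panda']])
```

Since `category_list` is sorted and each name is unique, the inverse mapping is well defined, and `category_list[idx]` would give the same result.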


In [None]:
from tqdm import tqdm
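x, y = [], []
for cate in tqdm(category_list):
  img_names = os.listdir(os.path.join(data_dir, cate))
  for img_name in img_names:
    # cv2.imread returns the image as a NumPy array in BGR channel order
    img = cv2.imread(os.path.join(data_dir, cate, img_name))
    x.append(img)
    y.append(cate2Idx[cate])  # integer class index of this image
x, y = np.asarray(x), np.asarray(y)  # stack into one image array and one label array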

100%|██████████| 15/15 [03:33<00:00, 14.23s/it]


In [None]:
# Check if the shapes are correct
print(x.shape)
print(y.shape)

(7500, 64, 64, 3)
(7500,)
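
One detail worth knowing about the loading loop: `cv2.imread` returns channels in BGR order, whereas `plt.imread` (used for the visualization earlier) returns RGB. The model trains fine either way as long as the same order is used at inference time, but plotting a BGR array with `plt.imshow` shows swapped colours. The sketch below uses a synthetic array rather than the lab data to show how reversing the last axis converts between the two orders; `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` is the OpenCV equivalent for a single image.

```python
import numpy as np

# Synthetic 2x2 "image" in BGR order: every pixel is pure blue (B=255, G=0, R=0)
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255

# Reversing the last axis swaps BGR <-> RGB without moving any pixel
rgb = bgr[..., ::-1]

print(bgr[0, 0], rgb[0, 0])
```

The same slice works on a whole batch of shape `(N, 64, 64, 3)`, because only the channel axis is reversed.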


We then split the data into training and test sets with a 4:1 ratio and convert the labels from integers to one-hot encoding in the following code cell.

In [None]:
# Split the dataset into train and test parts with ratio 4:1
# x_train is a NumPy array of image data (BGR order, as loaded by cv2.imread) with shape (6000, 64, 64, 3)
# y_train is a NumPy array of labels (in range 0-14) with shape (6000,), or (6000, 15) after one-hot encoding
# x_test is a NumPy array of image data (BGR order, as loaded by cv2.imread) with shape (1500, 64, 64, 3)
# y_test is a NumPy array of labels (in range 0-14) with shape (1500,), or (1500, 15) after one-hot encoding
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

# There are 15 classes, represented as unique integers (0 to 14).
# Transform the integer into a 15-element binary vector (i.e., one-hot encoding).
y_train = to_categorical(y_train, len(category_list))
y_test = to_categorical(y_test, len(category_list))

In [None]:
# Check if the shapes are correct
print(x_train.shape)
print(y_train.shape)
print(x_test.shape)
print(y_test.shape)

(6000, 64, 64, 3)
(6000, 15)
(1500, 64, 64, 3)
(1500, 15)
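
To make the one-hot step concrete, here is a NumPy-only sketch of what `to_categorical` does in effect for integer labels, and how `argmax` recovers the integers. It is an illustration on three made-up labels, not the Keras implementation.

```python
import numpy as np

num_classes = 15
labels = np.array([0, 4, 14])  # example integer labels: Beetle, Dog, Zebra

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype=np.float32)[labels]

# argmax along the class axis recovers the original integer labels
recovered = one_hot.argmax(axis=1)

print(one_hot.shape, recovered)
```

Each row contains a single 1 at the position of its class, which is why the label arrays above gain a second dimension of size 15.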


## 2. Data generator

### **Task 1**

You need to add appropriate data augmentations to the data generator to reduce overfitting. By default, the data generator applies no augmentation but is still runnable (you may try the default generator first to see how it performs).

You may find this [webpage](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator) useful for adding more augmentations.

In [None]:
from keras.preprocessing.image import ImageDataGenerator
def get_datagen() -> ImageDataGenerator:
  datagen = None
  ###############################################################################
  # TODO: your code starts here

  datagen = ImageDataGenerator(rotation_range=30,
            width_shift_range=0.1,
            height_shift_range=0.1,
            shear_range=0.2,
            zoom_range=0.2,
            horizontal_flip=True,
            fill_mode='nearest')

  # TODO: your code ends here
  ###############################################################################
  return datagen

Run the following code cell to get a data generator `train_generator`, which will be used to produce augmented data during training.

In [None]:
datagen = get_datagen()   # Instantiate a data generator
datagen.fit(x_train)      # Compute data statistics; only required if featurewise normalization or ZCA whitening is enabled
train_generator = datagen.flow(x_train, y_train, batch_size=128) #  The generator will be used during training
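
For intuition about what the augmentation options do to pixel data, the following sketch applies two deterministic transforms to a tiny synthetic array: a horizontal flip, and a one-pixel shift to the right with the gap filled from the nearest edge. This roughly corresponds to `horizontal_flip=True` and `width_shift_range` with `fill_mode='nearest'`. It is a hand-written illustration, not the code `ImageDataGenerator` runs, which samples its transforms randomly for each image.

```python
import numpy as np

# Synthetic 1x3 single-channel "image" with shape (height, width, channels)
img = np.arange(3).reshape(1, 3, 1)

# Horizontal flip: reverse the width axis
flipped = img[:, ::-1, :]

# Shift right by one pixel, filling the vacated column with the nearest edge value
shifted = np.concatenate([img[:, :1, :], img[:, :-1, :]], axis=1)

print(img.ravel(), flipped.ravel(), shifted.ravel())
```

Neither transform changes the label, which is what makes them safe ways to enlarge the effective training set for this task.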

## 3. Build the model

### **Task 2**

You need to build a CNN model for animal recognition. There is no restriction on the number of layers. You can use the following layers:

* Convolution (`Conv2D`)
* Pooling (`MaxPooling2D`, `AveragePooling2D`)
* Fully-connected (`Dense`)
* Dropout (`Dropout`)
* Flatten (`Flatten`)

Please keep the total number of parameters in your model **below 10 million.**

For reference, our solution uses around 6.4 million parameters.

In [None]:
# Hint: The model from the review notebook could be a good starting point.
def custom_model():
  model = None
  ###############################################################################
  # TODO: your code starts here
  model = Sequential(  # Partly copied from the lab8 review notebook
    [Conv2D(filters=32, kernel_size=(3, 3), activation='relu', input_shape=(64, 64, 3)),  # Add a convolutional layer with 32 kernels, each of size 3x3
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(filters=64, kernel_size=(3, 3), activation='relu'),                            # Add another convolutional layer with 64 kernels, each of size 3x3
    MaxPooling2D(pool_size=(2, 2)),                                                       # Add a max pooling layer of size 2x2
    Dropout(0.2),                                                                         # Add a dropout layer to prevent a model from overfitting
    Conv2D(filters=128, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(filters=256, kernel_size=(3, 3), activation='relu'),
    Dropout(0.3),
    Flatten(),                                                                            # Add a flatten layer to convert the pooled data to a single column
    Dense(units=256, activation='relu'),                                                  # Add a dense layer (fully-connected layer) and use ReLU activation function
    Dropout(0.3),
    Dense(units=15, activation='softmax')]
  )

  # TODO: your code ends here
  ###############################################################################
  return model

# Create the model (DO NOT include this in the submission file)
model = custom_model()
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_4 (Conv2D)           (None, 62, 62, 32)        896       
                                                                 
 max_pooling2d_3 (MaxPoolin  (None, 31, 31, 32)        0         
 g2D)                                                            
                                                                 
 conv2d_5 (Conv2D)           (None, 29, 29, 64)        18496     
                                                                 
 max_pooling2d_4 (MaxPoolin  (None, 14, 14, 64)        0         
 g2D)                                                            
                                                                 
 dropout_3 (Dropout)         (None, 14, 14, 64)        0         
                                                                 
 conv2d_6 (Conv2D)           (None, 12, 12, 128)      
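
The parameter budget can be checked by hand without building the model. The sketch below assumes the architecture from the cell above (3x3 convolutions with Keras' default `'valid'` padding, 2x2 max pooling after the first three convolutions, then two dense layers); `conv_params` and `dense_params` are hypothetical helper names. Pooling, dropout, and flatten layers contribute no parameters.

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights plus biases of a k x k convolution."""
    return (k * k * c_in + 1) * c_out

def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully-connected layer."""
    return (n_in + 1) * n_out

size, channels, total = 64, 3, 0
per_layer = []
# (filters, pool_after) for each Conv2D layer in the model above
for filters, pool_after in [(32, True), (64, True), (128, True), (256, False)]:
    n = conv_params(3, channels, filters)
    per_layer.append(n)
    total += n
    size -= 2              # a 3x3 'valid' convolution trims one pixel per side
    if pool_after:
        size //= 2         # 2x2 max pooling halves the size, rounding down
    channels = filters

flat = size * size * channels  # number of features after Flatten
for n_in, n_out in [(flat, 256), (256, 15)]:
    n = dense_params(n_in, n_out)
    per_layer.append(n)
    total += n

print(per_layer, total)
```

The first two counts, 896 and 18,496, agree with the summary above, and the total comes to 1,441,103 parameters, comfortably under the 10 million limit. Most of them sit in the first dense layer, so the size of the feature map entering `Flatten` is the main lever on model size.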

## 4. Compile the Model

In [None]:
# Compile the model
# Use the categorical cross-entropy loss since this is a multi-class problem with one-hot labels
# Use the Adam optimizer (an adaptive stochastic gradient descent method)
# Use accuracy as the metric
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.save('model_lab8.init.keras')  # Save the freshly compiled model so training can be restarted from it
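
For a one-hot label, categorical cross-entropy reduces to the negative log of the probability the model assigns to the true class. The following plain-Python sketch illustrates this for a single sample; the clipping constant `eps` is an assumption added to avoid `log(0)` and is not taken from the lab code.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and a predicted distribution."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

num_classes = 15
y_true = [0.0] * num_classes
y_true[2] = 1.0  # one-hot label for class index 2 ('Cat')

uniform = [1.0 / num_classes] * num_classes           # a "no idea" prediction
confident = [0.01 / (num_classes - 1)] * num_classes  # spread 1% over the wrong classes
confident[2] = 0.99                                   # and put 99% on the true class

print(categorical_crossentropy(y_true, uniform))    # equals ln(15)
print(categorical_crossentropy(y_true, confident))  # equals -ln(0.99)
```

The uniform case gives ln(15), about 2.71, which is a useful reference point: a loss near that value means the model is doing no better than guessing among the 15 classes.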

## 5. Train the model

Run the following code cell to start training.

In [None]:
model = keras.models.load_model('model_lab8.init.keras')  # Reset the model to last compilation

checkpoint_callback = ModelCheckpoint(
    filepath='model_lab8.temp.keras',
    monitor='val_accuracy',
    mode='max',
    save_best_only=True)  # Save the model with the best validation accuracy seen so far at each epoch

model.fit(train_generator,
         validation_data=(x_test, y_test),
         steps_per_epoch=len(train_generator), epochs=60, # One pass over all batches per epoch (an integer); by default the model is trained for 60 epochs
         callbacks=[checkpoint_callback])               # You don't have to change the number of epochs, but you may do so if necessary



Epoch 1/60
...
Epoch 60/60


<keras.src.callbacks.History at 0x7abcb0f207f0>
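
One epoch is meant to cover the 6,000 training images once, in batches of 128. Since 6,000 is not a multiple of 128, the number of steps per epoch has to be rounded up to a whole number. A short sketch of that arithmetic:

```python
import math

n_train, batch_size = 6000, 128

exact = n_train / batch_size                 # 46.875, not a whole number of batches
steps_per_epoch = math.ceil(exact)           # round up so every image is seen once
last_batch = n_train - (steps_per_epoch - 1) * batch_size  # size of the final, partial batch

print(exact, steps_per_epoch, last_batch)
```

This gives 47 steps, the last of which contains only 112 images. For a generator created with `datagen.flow`, `len(train_generator)` returns the same rounded-up batch count.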

## 6. Evaluate the model

In [None]:
model = keras.models.load_model('model_lab8.temp.keras')          # Restore the best model
val_loss, val_acc = model.evaluate(x_test, y_test, verbose=0)  # 'verbose=0' means no progress bar
print('Validation loss: {}'.format(val_loss))
print('Validation accuracy: {}'.format(val_acc))

Validation loss: 1.629609227180481
Validation accuracy: 0.5400000214576721
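
The accuracy metric compares the index of the largest predicted probability with the index of the 1 in the one-hot label. A plain-Python sketch on a toy 3-class example; the values are hypothetical and do not come from the lab's model:

```python
def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=values.__getitem__)

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the one-hot label."""
    hits = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

# Toy one-hot labels and predicted probabilities for four samples
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.5, 0.3, 0.2], [0.2, 0.5, 0.3]]

print(accuracy(y_true, y_pred))
```

Read this way, an accuracy of 0.54 on 1,500 images corresponds to 810 correct predictions. If the classes are roughly balanced (an assumption not verified here), random guessing among 15 classes would score about 1/15, or roughly 0.067.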


## 7. Save the model

Run the following code cell to save your model.

In [None]:
# Save the model to a keras file
model_name = 'model_lab8.keras'              # Define model name
model.save(model_name)  # Save the model