**Supervised Learning - Classification of Birds by Species with a Convolutional Neural Network**

Learning paradigm: (strong) supervised learning

<u>This notebook documents the third experiment, conducted on 10 April 2022.</u>

The dataset I work with is https://www.kaggle.com/datasets/gpiosenka/100-bird-species/. Before making changes to this notebook, read up on convolutional neural networks at https://d2l.ai/chapter_convolutional-neural-networks/index.html.

In [None]:
# import utilities
import os
import matplotlib.pyplot as plt # to evaluate model performance
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix, classification_report
from tensorflow.keras.callbacks import EarlyStopping, LearningRateScheduler, ReduceLROnPlateau

import tensorflow as tf # import tensorflow
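# work with the Keras API bundled with TensorFlow (EarlyStopping is already imported above;
# importing from both `keras` and `tensorflow.keras` can mix incompatible versions)
from tensorflow.keras.preprocessing.image import load_img, img_to_array, ImageDataGenerator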
from tensorflow.keras.layers import Input, Dense, Flatten, Activation, Dropout # input, output, hidden layers, activation...
from tensorflow.keras.layers import Conv2D, MaxPooling2D # convolutional layer and max pooling
from tensorflow.keras.layers import BatchNormalization
import cv2

The training set consists of 58 388 RGB images (3 channels), each 224 x 224 px. The validation set and the test set consist of 2000 images each. There are 356 unique bird species in the training dataset, and 400 unique bird species in each of the validation and test datasets.
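
Counts like these can be checked directly against the extracted folders. The following is a minimal sketch, assuming each split directory contains one sub-folder per species. The helper names `count_split` and `missing_classes` are hypothetical and not part of the original notebook.

```python
import os

def _class_folders(split_dir):
    """Return the set of sub-folder names (one per species) in a split directory."""
    return {d for d in os.listdir(split_dir)
            if os.path.isdir(os.path.join(split_dir, d))}

def count_split(split_dir):
    """Return (number of class folders, number of files inside them) for one split."""
    classes = _class_folders(split_dir)
    # Counts every file in each class folder, assuming they are all images.
    n_files = sum(len(os.listdir(os.path.join(split_dir, c))) for c in classes)
    return len(classes), n_files

def missing_classes(reference_dir, other_dir):
    """Return class folders present in other_dir but absent from reference_dir."""
    return sorted(_class_folders(other_dir) - _class_folders(reference_dir))
```

Calling `missing_classes` with the training folder as the reference and the validation folder as the other split would list species that the model never sees during training, which matters because the output layer is sized from the training folders.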

In [None]:
BASE_DIR = os.getcwd() # extract the Kaggle dataset into the same folder as this notebook

TRAIN_DIR = os.path.join(BASE_DIR, 'train')
VALIDATION_DIR = os.path.join(BASE_DIR, 'valid')
TEST_DIR = os.path.join(BASE_DIR, 'testing')

In [None]:
TRAIN_CATEGORIES = os.listdir(TRAIN_DIR)
Train_Category_count = len(TRAIN_CATEGORIES) # gets you number of classes in training dataset

VAL_CATEGORIES = os.listdir(VALIDATION_DIR)
Val_Category_count = len(VAL_CATEGORIES)

TEST_CATEGORIES = os.listdir(TEST_DIR)
Test_Category_count = len(TEST_CATEGORIES)

Here I apply a standard rescale factor by which all pixel values are multiplied. The images use the RGB color model, where pixel values range from 0 to 255. Inputs on that scale tend to make gradient-based training slow or unstable, which is why I rescale them to the interval 0-1.
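
As a small illustration of what the rescale factor does, here is a sketch on a handful of made-up 8-bit channel values. The function name `rescale_pixels` is hypothetical; the generator applies the same multiplication internally.

```python
RESCALE = 1.0 / 255  # same factor as passed to ImageDataGenerator(rescale=...)

def rescale_pixels(pixels, factor=RESCALE):
    """Multiply every raw pixel value by the rescale factor."""
    return [p * factor for p in pixels]

raw = [0, 64, 128, 255]        # illustrative 8-bit channel values
scaled = rescale_pixels(raw)   # every value now lies between 0 and 1
print(scaled)
```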

In [None]:
data_iterator = ImageDataGenerator(rescale=1./255,)

In [None]:
train_data = data_iterator.flow_from_directory(
    directory = TRAIN_DIR, 
    batch_size = 32, 
    shuffle=True,
    class_mode="categorical",
    target_size=(224,224))

validation_data = data_iterator.flow_from_directory(
    directory = VALIDATION_DIR, 
    batch_size = 32,
    shuffle = True,
    class_mode="categorical",
    target_size=(224, 224))

test_data = data_iterator.flow_from_directory(
    directory = TEST_DIR,
    batch_size=32,
    shuffle=False, # fixed order, so predictions can be matched against test_data.classes
    class_mode="categorical",
    target_size=(224, 224))

**Architecture of this Convolutional Neural Network**

1. Convolutional layer - extracts patterns from the images, building more abstract features out of low-level ones.
2. Activation layer - activation functions with the same purpose as in multilayer perceptrons (introducing non-linearity).
3. Pooling layer - max pooling, which downsamples the feature maps.
4. Normalization layer - batch normalization as an optimization technique, applied after the activation and before the next convolution.
5. Dense layer - fully connected layer, as in an MLP.

The input to this convolutional network is RGB images of 224 x 224 px with 3 channels. The first convolutional layer works with these 3 channels, but this does not mean that all convolutional layers have to work with the same 3 channels (they usually produce activation maps with more channels).
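
To make the change in channels and spatial size concrete, here is a rough shape trace. It assumes a stack like the one built later in this notebook: five 3x3 convolutions with 64 filters, stride 1 and `padding='same'`, with 2x2 max pooling after the second and fourth. The helper functions are hypothetical and only do the arithmetic; they do not use Keras.

```python
def conv2d_same(shape, filters):
    """Stride-1 convolution with padding='same': height and width stay, channels = filters."""
    height, width, _ = shape
    return (height, width, filters)

def max_pool(shape, pool=2):
    """Non-overlapping max pooling divides height and width by the pool size."""
    height, width, channels = shape
    return (height // pool, width // pool, channels)

def conv2d_params(in_channels, filters, kernel=3):
    """Trainable parameters of a conv layer: kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

shape = (224, 224, 3)            # RGB input
shape = conv2d_same(shape, 64)   # (224, 224, 64)
shape = conv2d_same(shape, 64)
shape = max_pool(shape)          # (112, 112, 64)
shape = conv2d_same(shape, 64)
shape = conv2d_same(shape, 64)
shape = max_pool(shape)          # (56, 56, 64)
shape = conv2d_same(shape, 64)

flattened = shape[0] * shape[1] * shape[2]   # size of the vector fed to the dense part
first_conv_params = conv2d_params(3, 64)
print(shape, flattened, first_conv_params)   # (56, 56, 64) 200704 1792
```

The flattened vector is large, so under these assumptions most of the model's weights sit in the first dense layer rather than in the convolutions.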

A common practice is to use the same activation function in all hidden layers (ReLU here); the output layer uses softmax to produce class probabilities.
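
For reference, the two activation functions used in this network can be written out in a few lines. This is a plain-Python sketch for illustration, not how Keras implements them.

```python
import math

def relu(x):
    """ReLU: pass positive values through, clamp negatives to zero."""
    return max(0.0, x)

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)                            # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

hidden = [relu(v) for v in [-2.0, 0.0, 3.5]]   # illustrative pre-activations
probs = softmax([2.0, 1.0, 0.1])               # illustrative output scores
print(hidden, probs)
```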

**Strategy of my Learning Process**

1. Set up, build and run my convolutional neural network.
2. Check my model's performance.
3. Conduct more experiments, probably with a different set of hyperparameters.

In [None]:
tf.keras.backend.clear_session()

# build the path with os.path.join so it also works outside Windows
IMAGE = load_img(os.path.join(TEST_DIR, "ABBOTTS BABBLER", "1.jpg"))
IMAGEDATA = img_to_array(IMAGE)
SHAPE = IMAGEDATA.shape

model = tf.keras.models.Sequential()

# lowest convolutional layer for identification of the edges of birds
# input shape is provided, so no deferred initialization https://d2l.ai/chapter_deep-learning-computation/deferred-init.html
model.add(Conv2D(64, (3, 3), padding='same', input_shape=SHAPE))
model.add(Activation('relu'))
model.add(BatchNormalization())

# convolutional layer to learn and store mid-level features of the bird species
model.add(Conv2D(64, (3, 3), padding="same"))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.35))

model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(BatchNormalization()) 

# highest convolutional layer to store complex information about the look of birds
model.add(Conv2D(64, (3, 3), padding='same')) 
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2))) 
model.add(BatchNormalization())
model.add(Dropout(0.35))

model.add(Conv2D(64, (3, 3), padding='same')) # feature maps are 56x56 here (224 halved by two 2x2 poolings)
model.add(Activation('relu'))
model.add(BatchNormalization())

# and finally mlp
model.add(Flatten()) 
model.add(Dropout(0.5)) 
model.add(Dense(512)) 
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(Dense(Train_Category_count)) 
model.add(Activation('softmax'))

**Model Learning**

To summarize, the task in this experiment was to increase the accuracy of the model from experiments one and two. To do that, we:

1. increased the number of filters in the first convolutional layer from 32 (used in the first two experiments) to 64, in order to increase the model's capacity. <br>
2. added two new convolutional layers (with the same number of convolutional filters), for the same reason. <br>
3. used batch normalization to speed up and stabilize learning. It normalizes the activations passed from one layer to the next over each mini-batch, which also helped the model generalize. <br>
4. used one regularization technique, dropout, which randomly switches off a fraction of neurons at each training step, so that the network cannot rely too heavily on individual neurons (for example, ones with disproportionately large weights). <br>
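
The last two points can be sketched in plain Python. This is a simplified illustration under stated assumptions: the batch normalization below omits the learned scale and shift and the moving averages that the Keras layer keeps, and the dropout is the training-time "inverted" variant that rescales the surviving values. The function names and the toy batch are made up for the example.

```python
import random

def batch_norm(values, eps=1e-3):
    """Normalize one feature over a mini-batch to roughly zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / (var + eps) ** 0.5 for v in values]

def dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]

batch = [1.0, 2.0, 3.0, 4.0]                 # one feature across a toy mini-batch
normalized = batch_norm(batch)
dropped = dropout(normalized, rate=0.35, rng=random.Random(0))
print(normalized, dropped)
```

At inference time dropout is switched off, which is why the survivors are scaled up during training: the expected magnitude of the activations stays the same in both modes.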

In [None]:
model.compile(optimizer = tf.keras.optimizers.SGD(learning_rate=0.001, decay=1e-6, momentum=0.9, nesterov=True),
               loss = 'categorical_crossentropy',
               metrics = ['accuracy'])

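EPOCHS = 4 # note: with only 4 epochs, EarlyStopping with patience = 5 below can never trigger
BATCH_SIZE = 32 # for reference only: the batch size actually used is set in flow_from_directory above; batch normalization computes its statistics over each mini-batch of this size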

history = model.fit(train_data, epochs=EPOCHS, validation_data = validation_data, 
                    steps_per_epoch=len(train_data), validation_steps = len(validation_data), 
                    callbacks=[EarlyStopping(monitor='val_accuracy', patience = 5, restore_best_weights = True),
                    ReduceLROnPlateau(monitor = 'val_loss', factor = 0.7, 
                                 patience = 2, verbose = 1)])

model.summary()
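
How the `ReduceLROnPlateau` callback in the training call above changes the learning rate can be sketched with a small simulation. This is a simplified approximation of the callback's behaviour (no cooldown, no lower bound on the learning rate), and the validation losses are made up for illustration; they are not results from this experiment.

```python
def simulate_reduce_lr(val_losses, lr=0.001, factor=0.7, patience=2, min_delta=1e-4):
    """Return the learning rate in effect after each epoch."""
    best = float("inf")
    wait = 0
    rates = []
    for loss in val_losses:
        if loss < best - min_delta:   # improvement: remember it and reset the counter
            best = loss
            wait = 0
        else:                         # no improvement this epoch
            wait += 1
            if wait >= patience:      # plateau long enough: shrink the learning rate
                lr *= factor
                wait = 0
        rates.append(lr)
    return rates

# hypothetical validation losses over six epochs
rates = simulate_reduce_lr([2.0, 1.5, 1.6, 1.55, 1.2, 1.3])
print(rates)
```

With `patience = 2`, the rate only drops after two consecutive epochs without improvement, so a run of just four epochs leaves little room for more than one reduction.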

**Model Evaluation**

This network was one of the best across all the experiments I have conducted. It achieved more than 80% accuracy.
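
Accuracy here means the fraction of images whose highest-probability class matches the true label. A minimal sketch of that computation, using made-up probabilities for a toy 3-class problem (the function name `top1_accuracy` is hypothetical):

```python
def top1_accuracy(probabilities, labels):
    """Fraction of samples whose highest-probability class equals the true label."""
    correct = 0
    for probs, label in zip(probabilities, labels):
        predicted = max(range(len(probs)), key=probs.__getitem__)  # index of the largest score
        if predicted == label:
            correct += 1
    return correct / len(labels)

# illustrative softmax outputs and integer labels, not results from this experiment
toy_probs = [[0.7, 0.2, 0.1],
             [0.1, 0.8, 0.1],
             [0.3, 0.3, 0.4],
             [0.6, 0.3, 0.1]]
toy_labels = [0, 1, 2, 1]
acc = top1_accuracy(toy_probs, toy_labels)
print(acc)  # 0.75
```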

In [None]:
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.show()

scores = model.evaluate(test_data, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

**Evaluating Model Performance**

See attached document.