## Overview

This classification assignment focuses on developing a deep learning model to predict bird species using the Kaggle 190 Bird Species dataset. I want to take a comparative approach, using a basic CNN versus a pre-trained model. The basic CNN will consist of about a dozen hidden layers. For comparison, I will use the Xception model to train on and predict bird images.

The dataset includes over 25K training images, 950 test images, and 950 validation images. The images are in color and measure 224x224 pixels. The curator of the dataset has provided great detail about the images included; the images are cropped to focus on the birds themselves rather than including a significant amount of extraneous detail unrelated to the birds. This should help with model prediction and limit the need for significant transformations with data augmentation.
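
The image counts above can be checked directly against the files on disk. The following is a minimal sketch, assuming each split directory contains one subfolder per bird species (the layout that `flow_from_directory` expects); the helper names `count_images` and `summarize` are hypothetical and not part of the original notebook.

```python
import os

IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')

def count_images(split_dir):
    """Return {class_name: image_count} for a directory with one subfolder per class."""
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_path = os.path.join(split_dir, class_name)
        if not os.path.isdir(class_path):
            continue  # skip stray files at the top level
        counts[class_name] = sum(
            1 for name in os.listdir(class_path)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts

def summarize(split_dir):
    """Return (number_of_classes, total_images) for one split."""
    counts = count_images(split_dir)
    return len(counts), sum(counts.values())

# Only runs where the Kaggle input directory is actually mounted.
base_dir = '/kaggle/input/100-bird-species/'
for split in ('train', 'valid', 'test'):
    split_path = os.path.join(base_dir, split)
    if os.path.isdir(split_path):
        n_classes, n_images = summarize(split_path)
        print(f'{split}: {n_images} images across {n_classes} classes')
```

Comparing the per-class counts across splits is also a quick way to spot class imbalance before training.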



In [None]:
# Libraries
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import os
import cv2
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import backend, models, layers, optimizers, regularizers
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.applications import Xception
from sklearn.metrics import confusion_matrix

Below are a few sample images from the dataset, by type of bird. The curator indicated that the images were cropped to focus on the bird. The sample images will help me check whether this is true; if so, I won't need to apply significant transformations during data augmentation for the model to generalize well.

In [None]:
parent_dir = '/kaggle/input/100-bird-species/consolidated'
cats = os.listdir(path=parent_dir)

subcats = cats[0:15]
plt.figure(figsize = [16,12])
for i, category in enumerate(subcats):
    path = os.path.join(parent_dir, category)
    # Pick an image from this category's own folder
    img = os.listdir(path=path)[1]
    # OpenCV loads images as BGR; convert to RGB so matplotlib shows true colors
    img_array = cv2.cvtColor(cv2.imread(os.path.join(path,img)), cv2.COLOR_BGR2RGB)
    plt.subplot(3,5,i+1, title = category)
    plt.imshow(img_array)
plt.show()

The images do appear well cropped. This will be helpful in determining my data augmentation strategy below.

In [None]:
base_dir = '/kaggle/input/100-bird-species/'
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'valid')
test_dir = os.path.join(base_dir,'test')

In [None]:
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

epoch = 100

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(224,224),
    batch_size=20,
    class_mode='categorical')

validation_generator = test_datagen.flow_from_directory(
    validation_dir,
    target_size=(224,224),
    batch_size=20,
    class_mode='categorical')

test_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size=(224,224),
    batch_size=20,
    class_mode='categorical')
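
With a batch size of 20, the number of batches each generator yields per pass follows from the image counts in the overview. The sketch below works through that arithmetic; the figure of 25,000 training images is a placeholder, since the overview only says "over 25K", and `steps_per_epoch` is a hypothetical helper name.

```python
import math

def steps_per_epoch(n_images, batch_size):
    """Batches needed to see every image once (the last batch may be partial)."""
    return math.ceil(n_images / batch_size)

batch_size = 20
print(steps_per_epoch(950, batch_size))     # 48 (validation and test splits)
print(steps_per_epoch(25_000, batch_size))  # 1250 (placeholder training count)
```

The value 48 for the 950-image test split matches the `steps = 48` argument used when the Xception model is evaluated later in the notebook.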

The first model is a basic CNN. It is made up of three sequential groups of hidden layers, each consisting of a 3x3 Conv2D layer, 2x2 max pooling, and batch normalization. After these convolutional blocks, the output is flattened and connected to a 512-node dense layer. This is then passed through a dropout layer for regularization prior to the classifier.
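
To get a feel for the size of this network, the spatial dimensions can be traced by hand. This is a rough sketch, assuming the Keras defaults used in the model code (stride-1 convolutions with 'valid' padding, non-overlapping 2x2 pooling, 32 filters per convolution); `conv_out` and `pool_out` are hypothetical helper names.

```python
def conv_out(size, kernel=3):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling with 'valid' padding."""
    return size // pool

size = 224
filters = 32
for block in range(3):
    # Conv2D shrinks the map by 2 pixels, MaxPool2D halves it;
    # BatchNormalization leaves the shape unchanged.
    size = pool_out(conv_out(size))

flat_features = size * size * filters    # inputs to the dense layer
dense_params = flat_features * 512 + 512  # weights plus biases
print(size, flat_features, dense_params)  # 26 21632 11076096
```

Under these assumptions, roughly 11 million of the model's parameters sit in the single 512-node dense layer, which may be one reason a network like this can memorize the training set quickly.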

In [None]:
backend.clear_session()
model = models.Sequential()

model.add(layers.Conv2D(32, (3,3), activation = 'relu', input_shape=(224,224,3)))
model.add(layers.MaxPool2D((2,2)))
model.add(BatchNormalization())
model.add(layers.Conv2D(32, (3,3), activation = 'relu'))
model.add(layers.MaxPool2D((2,2)))
model.add(BatchNormalization())
model.add(layers.Conv2D(32, (3,3), activation = 'relu'))
model.add(layers.MaxPool2D((2,2)))
model.add(BatchNormalization())

model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dropout(0.2))

model.add(layers.Dense(len(cats), activation='softmax'))

model.compile(optimizer = 'adam',
             loss = 'categorical_crossentropy',
             metrics = ['accuracy'])

# fit_generator/evaluate_generator are deprecated; fit/evaluate accept generators directly
history = model.fit(train_generator,
                    epochs = epoch,
                    validation_data = validation_generator,
                    verbose = 1,
                    callbacks = [EarlyStopping(monitor='val_accuracy',
                                               patience = 5,
                                               restore_best_weights=True)])

test_loss, test_acc = model.evaluate(test_generator)
print('base_model_test_acc:', test_acc)

In [None]:
history_dict = history.history
loss_values = history_dict['loss']
val_loss_values = history_dict['val_loss']
acc_values = history_dict['accuracy']
val_acc_values = history_dict['val_accuracy']
epochs = range(1, len(history_dict['accuracy']) + 1)

plt.plot(epochs, loss_values, 'bo', label = 'Training loss')
plt.plot(epochs, val_loss_values, 'b', label = 'Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

plt.plot(epochs, acc_values, 'bo', label = 'Training accuracy')
plt.plot(epochs, val_acc_values, 'b', label = 'Validation accuracy')
plt.title('Training and validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

This model achieved a poor test accuracy of 47%. It began overfitting the training data after the second epoch: in epoch 3, training accuracy rose by more than 20% while validation accuracy essentially leveled off.
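
Because validation accuracy plateaued so early, the `EarlyStopping` callback with `patience = 5` ends training a few epochs after the best one. The following is a simplified simulation of that stopping rule, not the Keras implementation itself; the function name is hypothetical and the accuracy values are made up for illustration, not taken from this run.

```python
def simulate_early_stopping(val_accuracy, patience=5):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    stop_epoch is None if training would run through every epoch supplied.
    """
    best = float('-inf')
    best_epoch = None
    wait = 0
    for epoch, value in enumerate(val_accuracy, start=1):
        if value > best:
            # Strict improvement resets the patience counter
            best, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

# Made-up validation accuracies that level off after epoch 3
curve = [0.30, 0.42, 0.45, 0.44, 0.45, 0.43, 0.44, 0.42, 0.41]
print(simulate_early_stopping(curve))  # (8, 3)
```

With `restore_best_weights=True`, the weights from the best epoch (epoch 3 in this made-up curve) are the ones kept for evaluation.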

However, rather than spend more time tuning a custom deep learning model, I will use a pre-trained neural network. I've had success with the Xception model in the past, so I will try it here. I will also use data augmentation to increase the variability in my training data. I will keep the height/width shifts, rotation, and zoom small, since the data curator has already done a fair amount of work isolating the birds in the images. I will, however, apply horizontal flips.

In [None]:
epoch = 50
train_datagen2 = ImageDataGenerator(rescale=1./255,
                                    rotation_range=10,
                                    width_shift_range=0.05,
                                    height_shift_range=0.05,
                                    zoom_range=0.05,
                                    horizontal_flip = True,
                                    fill_mode='nearest')
test_datagen2 = ImageDataGenerator(rescale=1./255)
train_generator2 = train_datagen2.flow_from_directory(train_dir,
                                                     target_size=(224,224),
                                                     batch_size=20,
                                                     class_mode='categorical')
# Validation images should only be rescaled, not augmented
validation_generator2 = test_datagen2.flow_from_directory(validation_dir,
                                                         target_size=(224,224),
                                                         batch_size=20,
                                                         class_mode='categorical')
test_generator2 = test_datagen2.flow_from_directory(test_dir,
                                                   target_size=(224,224),
                                                   batch_size=20,
                                                   class_mode='categorical')
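
To see how mild these augmentation settings are, the fractional ranges can be converted into concrete bounds for a 224x224 image. This sketch assumes the documented `ImageDataGenerator` semantics: shift ranges below 1 are fractions of the image size, a float `zoom_range` z means a zoom factor in [1 - z, 1 + z], and `rotation_range` is in degrees. The helper name `augmentation_bounds` is hypothetical.

```python
def augmentation_bounds(image_size=224, rotation_range=10,
                        width_shift_range=0.05, height_shift_range=0.05,
                        zoom_range=0.05):
    """Translate fractional ImageDataGenerator settings into concrete bounds."""
    return {
        'rotation_degrees': (-rotation_range, rotation_range),
        'max_width_shift_pixels': round(width_shift_range * image_size, 1),
        'max_height_shift_pixels': round(height_shift_range * image_size, 1),
        'zoom_factor': (round(1 - zoom_range, 2), round(1 + zoom_range, 2)),
    }

for name, value in augmentation_bounds().items():
    print(name, value)
```

A shift of about 11 pixels in either direction is small relative to the image, which is consistent with the decision to rely mainly on horizontal flips.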

The training model itself will leverage the Xception pre-trained model.
![](https://miro.medium.com/max/1400/1*hOcAEj9QzqgBXcwUzmEvSg.png)

The above image shows the Xception model architecture. I froze all of the layers except the final 6 so that their weights are not adjusted during training; leaving the final 6 layers trainable allows Xception to be fine-tuned to the bird image data.

The Xception model then connects to a flattening layer and a dense layer of 512 nodes prior to the final classification output layer (using a softmax).  
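
The size of this classification head can be estimated by hand. The sketch below assumes the Xception base, with `include_top = False` and a 224x224x3 input, produces a 7x7x2048 feature map; this is worth confirming with `conv_base.output_shape` once the model is built. The helper `dense_params` is a hypothetical name.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters in a fully connected layer: weights plus biases."""
    return n_inputs * n_units + n_units

# Assumed output shape of the Xception base for a 224x224x3 input
height, width, channels = 7, 7, 2048

flat_features = height * width * channels
head_params = dense_params(flat_features, 512)
print(flat_features, head_params)  # 100352 51380736
```

Under that assumption, the 512-node dense layer alone contributes about 51 million trainable parameters on top of the unfrozen Xception layers, before counting the final softmax layer.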

In [None]:
backend.clear_session()
conv_base = Xception(weights = 'imagenet',
                     include_top = False,
                     input_shape = (224,224,3))
for layer in conv_base.layers[:-6]:
    layer.trainable = False

modelx = models.Sequential()
modelx.add(conv_base)
modelx.add(layers.Flatten())
modelx.add(layers.Dense(512, activation = 'relu'))
modelx.add(layers.Dense(len(cats), activation = 'softmax'))

modelx.compile(optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001),
              loss = 'categorical_crossentropy',
              metrics = ['accuracy'])

# fit_generator/evaluate_generator are deprecated; fit/evaluate accept generators directly
history = modelx.fit(train_generator2,
                     epochs = epoch,
                     validation_data = validation_generator2,
                     verbose = 1,
                     callbacks = [EarlyStopping(monitor='val_accuracy',
                                                patience = 5,
                                                restore_best_weights = True)])

test_loss, test_acc = modelx.evaluate(test_generator2, steps = 48)

In [None]:
history_dict = history.history
loss_values = history_dict['loss']
val_loss_values = history_dict['val_loss']
acc_values = history_dict['accuracy']
val_acc_values = history_dict['val_accuracy']
epochs = range(1, len(history_dict['accuracy']) + 1)

plt.plot(epochs, loss_values, 'bo', label = 'Training loss')
plt.plot(epochs, val_loss_values, 'b', label = 'Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

plt.plot(epochs, acc_values, 'bo', label = 'Training accuracy')
plt.plot(epochs, val_acc_values, 'b', label = 'Validation accuracy')
plt.title('Training and validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

print('Xception_test_acc:', test_acc)

The Xception-based model achieved an overall test accuracy of 91%. Validation accuracy gradually increased from 85% in the first epoch to approximately 90% by the 12th epoch. After the 12th epoch, overfitting caused validation accuracy to decline, which triggered early stopping.
