# Project Overview

This notebook builds a deep learning model with TensorFlow to classify bird species. It covers extracting the data, displaying the images along with their labels, augmenting the images to improve training, building and fine-tuning the model, and finally evaluating model performance.

In [None]:
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras import backend, models, layers, optimizers, regularizers
from tensorflow.keras.utils import to_categorical
import numpy as np
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Dropout
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical
from IPython.display import display # Library to help view images
from PIL import Image # Library to help view images
from tensorflow.keras.preprocessing.image import ImageDataGenerator # Library for data augmentation
import os, shutil # Library for navigating files
import matplotlib.pyplot as plt
np.random.seed(42)
# Import from tensorflow.keras to stay consistent with the imports above
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.preprocessing.image import array_to_img

In [None]:
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.applications import VGG16

# Data Overview

The dataset contains 34,325 images, split into 32,025 training images, 1,150 validation images, and 1,150 test images. There are 230 classes of birds in the dataset.

Each image is a 224x224 pixel color image.
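The split sizes above can be checked directly against the files on disk. The following is a minimal sketch, assuming the `split/class_name/image` folder layout that `flow_from_directory` expects; the helper name `count_images_per_class` and the list of file extensions are illustrative assumptions, not part of the dataset or of Keras.

```python
import os
from collections import Counter

# Assumed image extensions; adjust if the dataset uses others.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")

def count_images_per_class(split_dir):
    """Return a Counter mapping each class folder name to its image count."""
    counts = Counter()
    for class_name in sorted(os.listdir(split_dir)):
        class_path = os.path.join(split_dir, class_name)
        if not os.path.isdir(class_path):
            continue  # skip stray files next to the class folders
        counts[class_name] = sum(
            1 for f in os.listdir(class_path)
            if f.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts

# Example usage (path taken from the notebook, only valid on Kaggle):
# counts = count_images_per_class('../input/100-bird-species/train')
# print(len(counts), 'classes,', sum(counts.values()), 'images')
```

Running this once per split gives the number of classes and the total image count, which should match the figures quoted above if the dataset version is the same.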


In [None]:
base_dir = '../input/100-bird-species'

In [None]:
# Specify the training, validation, and test directories.
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'valid')
test_dir = os.path.join(base_dir, 'test')


# Normalize the pixel values in the images to the [0, 1] range.
train_datagen = ImageDataGenerator(rescale=1./255)
val_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)


In [None]:
#set Epochs
epoch = 50

In [None]:
train_generator = train_datagen.flow_from_directory(
    train_dir, 
    target_size=(224, 224), 
    batch_size=20, 
    class_mode='categorical') 

validataion_generator = val_datagen.flow_from_directory(
    validation_dir,
    target_size=(224, 224),
    batch_size=20,
    class_mode='categorical')

test_generator = test_datagen.flow_from_directory(
    test_dir,
    target_size=(224, 224),
    batch_size=20,
    class_mode='categorical')

In [None]:
#List of all of the bird classes:
train_generator.class_indices

In [None]:
#Print images and their labels

def getKeybyValue(LabelDict, value):
    listItems = LabelDict.items()
    for item in listItems:
        if item[1] == value:
            return item[0]
    
    return None

def pltFourImages(directory):
    datagen2 = ImageDataGenerator()
    it2 = datagen2.flow_from_directory(
            directory,
            target_size=(224, 224),
            batch_size=20,
            class_mode='sparse')  # one integer class label per image

    labDict = it2.class_indices
    batchX, batchy = next(it2)  # next(iterator) works across Keras versions
    num_img = batchX.shape[0]
    imgs = [array_to_img(batchX[i]) for i in range(num_img)]
    indx = [int(batchy[i]) for i in range(len(batchy))]
    labs = [getKeybyValue(labDict, i) for i in indx]

    # settings
    nrows, ncols = 2, 2
    figsize = [18, 12]

    # create figure (fig), and array of axes (ax)
    fig, ax = plt.subplots(nrows=nrows, ncols=ncols, figsize=figsize, dpi = 80)

    # plot image on each sub-plot
    for i, axi in enumerate(ax.flat):
        # i runs from 0 to (nrows*ncols-1)
        # axi is equivalent with ax[rowid][colid]
        axi.imshow(imgs[i], aspect = 'auto')

        # write Label as title
        axi.set_title(labs[i])

    plt.subplots_adjust(top = 0.99, bottom=0.01, hspace=0.2, wspace=0.2)
    plt.show()
    return

In [None]:


# Print four images from each of train, test and valid
for d in ['/train', '/test', '/valid']:
    print('\n\nImages from ', d)
    pltFourImages(base_dir + d)

# Source: https://www.kaggle.com/jimreed/analysis-of-bird-species-dataset

# Summary of Models

The base model used in this analysis is the pretrained VGG16 neural network, which was trained on the ImageNet dataset of over 14 million images. Conveniently, the input size used by VGG16 is the same as the image size used in this analysis: 224x224. VGG16 consists of 5 blocks, each made of 2D convolutional layers followed by a 2D max pooling layer. I unfroze the final block, which consists of 3 convolutional layers and 1 max pooling layer. Here is a visual summary of the VGG16 model:

![](https://neurohive.io/wp-content/uploads/2018/11/vgg16.png)

More reading here: https://neurohive.io/en/popular-networks/vgg16/
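To make the unfreezing concrete, here is a small sketch that applies the same "freeze everything except the last 4 layers" slice to a plain list of layer names, without loading any weights. The block and layer names follow the naming Keras uses for VGG16 with `include_top=False`; the name of the input layer (`"input"`) is a placeholder, since Keras generates that name automatically.

```python
# Layer names of the VGG16 convolutional base (include_top=False).
# "input" is a placeholder for the auto-named input layer.
VGG16_LAYERS = [
    "input",
    "block1_conv1", "block1_conv2", "block1_pool",
    "block2_conv1", "block2_conv2", "block2_pool",
    "block3_conv1", "block3_conv2", "block3_conv3", "block3_pool",
    "block4_conv1", "block4_conv2", "block4_conv3", "block4_pool",
    "block5_conv1", "block5_conv2", "block5_conv3", "block5_pool",
]

# Start with every layer trainable, then freeze all but the last four.
trainable = {name: True for name in VGG16_LAYERS}
for name in VGG16_LAYERS[:-4]:
    trainable[name] = False

unfrozen = [name for name in VGG16_LAYERS if trainable[name]]
print(unfrozen)
```

The slice `[:-4]` leaves exactly the three convolutional layers and the pooling layer of block 5 trainable, which matches the description above.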

On top of the base model, I added a Flatten layer and 2 Dense layers, using ReLU and softmax as the activations. While the results shift with each run, the initial VGG16 test score was 94.1%. The model ran for all 50 epochs without EarlyStopping kicking in. Normally I would increase the number of epochs, but this run took over an hour, and it was only the first step in the model tuning process.
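For readers unfamiliar with the patience mechanism, the following is a simplified sketch of the idea behind `EarlyStopping(monitor='val_accuracy', patience=5)`. It is not the Keras implementation; the function name `early_stopping_epoch` and the sample accuracy values are made up for illustration.

```python
def early_stopping_epoch(val_acc, patience):
    """Return (stopped_epoch, best_epoch) for a list of per-epoch accuracies.

    stopped_epoch is None if training would run through every epoch.
    Epochs are counted from 0.
    """
    best = float("-inf")
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, acc in enumerate(val_acc):
        if acc > best:
            best, best_epoch, wait = acc, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

# Hypothetical validation accuracies: the best value is at epoch 2,
# followed by five epochs without improvement.
history = [0.5, 0.6, 0.7, 0.69, 0.68, 0.70, 0.69, 0.65, 0.9]
print(early_stopping_epoch(history, patience=5))
```

With `restore_best_weights=True`, Keras would additionally roll the model back to the weights from the best epoch. A run that keeps improving at least once every 5 epochs never triggers the stop, which is consistent with the model using all 50 epochs here.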

Next, I added 2 Dropout layers of 0.2, which improved the model accuracy to 96.1%. I also changed the kernel_initializer to he_normal and glorot_normal for the final Dense layers. I tested adding additional Batch Normalization and 2D Convolution layers, but they had adverse effects on performance. 

I then continued training this model; it ran for 12 additional epochs and reached an accuracy of 97.8%.

Finally, I took this model and used it as the base for another model. 

In [None]:
backend.clear_session()
vgg_base = VGG16(weights = 'imagenet', include_top = False, input_shape = (224, 224, 3))

In [None]:
# summary() prints the table itself and returns None, so call it on its own line
print('VGG model base summary:')
vgg_base.summary()

In [None]:
# Here we freeze all the layers except the last 4.
for layer in vgg_base.layers[:-4]:
  layer.trainable = False
for layer in vgg_base.layers:
  print(layer, layer.trainable)

In [None]:
modelvgg_train = models.Sequential()
modelvgg_train.add(vgg_base)
modelvgg_train.add(layers.Flatten())
modelvgg_train.add(layers.Dense(2048, activation = 'relu'))
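# Size the output layer from the data so it always matches the number of classes found on disk
modelvgg_train.add(layers.Dense(train_generator.num_classes, activation = 'softmax'))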

In [None]:
# summary() prints the table itself and returns None, so call it on its own line
print('VGG model train summary:')
modelvgg_train.summary()

In [None]:
# This first model uses the rescale-only generators defined above (no augmentation yet)

modelvgg_train.compile(optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001),
    loss = 'categorical_crossentropy',
    metrics = ['accuracy'])

In [None]:
# Model.fit accepts generators directly; fit_generator is deprecated
history = modelvgg_train.fit(
    train_generator,
    steps_per_epoch=200,
    epochs=epoch,
    validation_data=validataion_generator,
    verbose = 2,
    callbacks=[EarlyStopping(monitor = 'val_accuracy', patience = 5, restore_best_weights = True)])

In [None]:
# Model.evaluate accepts generators directly; evaluate_generator is deprecated
test_loss, test_acc = modelvgg_train.evaluate(test_generator, steps = 50)


print('VGG16_train_test_acc:', test_acc)

The initial VGG16 test score was 84.6%. This run used a smaller batch size (20), which was increased to 64 for the later models.
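The effect of batch size and `steps_per_epoch` on how much of the training set is seen per epoch can be worked out with simple arithmetic. The sketch below uses the training set size and the settings from this notebook; the helper names are illustrative.

```python
import math

TRAIN_IMAGES = 32025  # training set size quoted in the Data Overview

def images_per_epoch(batch_size, steps_per_epoch):
    """Number of images drawn from the generator in one epoch."""
    return batch_size * steps_per_epoch

def steps_for_full_pass(num_images, batch_size):
    """Steps needed to see every image at least once per epoch."""
    return math.ceil(num_images / batch_size)

# First model: batch_size=20, steps_per_epoch=200
print(images_per_epoch(20, 200))              # 4000
print(steps_for_full_pass(TRAIN_IMAGES, 20))  # 1602

# Later models: batch_size=64, steps_per_epoch=500
print(images_per_epoch(64, 500))              # 32000
print(steps_for_full_pass(TRAIN_IMAGES, 64))  # 501
```

So the first model saw roughly an eighth of the training images in each epoch, while the later models saw nearly the full training set per epoch.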

# Summary of Methods

The methods used below come from the Keras library. The primary model is built on the VGG16 network with pretrained weights. In addition, the training data is augmented in several ways, such as rotation, zoom, and horizontal flipping. This gives the model more varied images to train on, which can improve accuracy without needing brand new sources of data.
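To illustrate what a few of these transforms do to an image array, here is a minimal NumPy sketch of rescaling, horizontal flipping, and a width shift that fills the gap with the nearest edge pixels. It only illustrates the idea and is not how Keras implements `ImageDataGenerator`; the function names and the tiny test image are made up for the example.

```python
import numpy as np

def rescale(img):
    """Map 0..255 pixel values to the 0..1 range (like rescale=1./255)."""
    return img.astype("float32") / 255.0

def horizontal_flip(img):
    """Mirror an (height, width, channels) image left to right."""
    return img[:, ::-1, :]

def shift_right_nearest(img, fraction):
    """Shift the image right by a fraction of its width (0 <= fraction < 1).

    The exposed columns on the left are filled by repeating the original
    leftmost column, similar in spirit to fill_mode='nearest'.
    """
    width = img.shape[1]
    shift = int(round(fraction * width))
    if shift == 0:
        return img.copy()
    shifted = np.empty_like(img)
    shifted[:, shift:, :] = img[:, :width - shift, :]
    shifted[:, :shift, :] = img[:, :1, :]
    return shifted

# A tiny 2x4 RGB "image" with distinct values in every position.
image = np.arange(2 * 4 * 3, dtype=np.uint8).reshape(2, 4, 3)
flipped = horizontal_flip(image)
shifted = shift_right_nearest(image, 0.25)  # shift by one column
scaled = rescale(image)
```

Each transformed copy is a plausible new view of the same bird, which is why augmentation can help without collecting more photos.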

Evaluation compares test accuracy with the validation accuracy and loss per epoch. Also, a confusion matrix is used to better interpret the results.
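As a rough sketch of how a confusion matrix can be built from true and predicted class indices, the NumPy example below counts each (true, predicted) pair. The labels are made-up toy values, and the helper name `confusion_matrix` is defined here rather than imported from a library.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly when the same cell appears more than once.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

# Toy example with 3 classes and 6 images.
y_true = [0, 0, 1, 2, 2, 2]
y_pred = [0, 1, 1, 2, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
accuracy = np.trace(cm) / cm.sum()  # correct predictions lie on the diagonal
print(cm)
print(accuracy)
```

To apply this to a Keras directory generator, the true labels would come from the generator's `classes` attribute, and the generator would need to be created with `shuffle=False` so that predictions line up with those labels; the generators in this notebook use the default shuffling.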

In [None]:
# Data augmentation with the Keras ImageDataGenerator
train_datagen2 = ImageDataGenerator(
    rescale=1./255, # Normalize pixel values to the [0, 1] range
    rotation_range=40, # Rotate the images randomly by up to 40 degrees
    width_shift_range=0.2, # Shift the image horizontally by up to 20%
    height_shift_range=0.2, # Shift the image vertically by up to 20%
    zoom_range=0.2, # Zoom in or out by up to 20%
    horizontal_flip=True, # Randomly flip images horizontally
    fill_mode='nearest') # How to fill missing pixels after an augmentation operation


test_datagen2 = ImageDataGenerator(rescale=1./255) 

train_generator2 = train_datagen2.flow_from_directory(
    train_dir,
    target_size=(224, 224),
    batch_size=64,
    class_mode='categorical')

# Validation images are only rescaled, not augmented, so validation metrics reflect the real data
validataion_generator2 = test_datagen2.flow_from_directory(
    validation_dir,
    target_size=(224, 224),
    batch_size=64,
    class_mode='categorical')

test_generator2 = test_datagen2.flow_from_directory( # Resize test data
    test_dir,
    target_size=(224, 224),
    batch_size=64,
    class_mode='categorical')

In [None]:
#Same technique as original with added dropout and kernel initializers; Using an augmented dataset; Updated steps_per_epoch to 500 to match number of samples / batch size

backend.clear_session()

modelvgg_dropout = models.Sequential()
modelvgg_dropout.add(vgg_base)

#Additional Layers
# modelvgg_train.add(layers.Conv2D(128, (3,3),  strides=(1, 1), activation = 'relu',kernel_initializer = 'he_uniform',padding='same'))
# modelvgg_train.add(layers.BatchNormalization())
# modelvgg_train.add(layers.Conv2D(128, (3,3),  strides=(1, 1), activation = 'relu',kernel_initializer = 'he_uniform',padding='same'))
# modelvgg_train.add(layers.BatchNormalization())
# modelvgg_train.add(layers.Conv2D(128, (3,3),  strides=(1, 1), activation = 'relu',kernel_initializer = 'he_uniform',padding='same'))
#modelvgg_train.add(layers.BatchNormalization())
# modelvgg_train.add(layers.MaxPool2D((2,2),  strides=(2, 2)))

modelvgg_dropout.add(layers.Flatten())
# modelvgg_dropout.add(Dropout(0.2))
modelvgg_dropout.add(layers.Dense(2048, activation = 'relu', kernel_initializer='he_normal'))
modelvgg_dropout.add(Dropout(0.3))
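# Size the output layer from the data so it always matches the number of classes found on disk
modelvgg_dropout.add(layers.Dense(train_generator2.num_classes, activation = 'softmax', kernel_initializer='glorot_normal'))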

# summary() prints the table itself and returns None, so call it on its own line
print('VGG model dropout summary:')
modelvgg_dropout.summary()

# We will still use the same data augmentation from above

modelvgg_dropout.compile(optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001),
    loss = 'categorical_crossentropy',
    metrics = ['accuracy'])

# Model.fit accepts generators directly; fit_generator is deprecated
history = modelvgg_dropout.fit(
    train_generator2,
    steps_per_epoch=500,
    epochs=epoch,
    validation_data=validataion_generator2,
    verbose = 2,
    callbacks=[EarlyStopping(monitor = 'val_accuracy', patience = 5, restore_best_weights = True)])



In [None]:
# Evaluate on the full test set; Model.evaluate accepts generators directly
test_loss, test_acc = modelvgg_dropout.evaluate(test_generator2)


print('VGG16_dropout_acc:', test_acc)

After adding dropout and the new kernel initializers, the score went up to 93.3%.

In [None]:
# Recompile the same model with a lower learning rate and continue training.

#backend.clear_session()


modelvgg_dropout.compile(optimizer = tf.keras.optimizers.Adam(learning_rate=0.00001),
    loss = 'categorical_crossentropy',
    metrics = ['accuracy'])

# Model.fit accepts generators directly; fit_generator is deprecated
history = modelvgg_dropout.fit(
    train_generator2,
    steps_per_epoch=500,
    epochs=epoch, 
    validation_data=validataion_generator2,
    verbose = 2,
    callbacks=[EarlyStopping(monitor = 'val_accuracy', patience = 5, restore_best_weights = True)])

# With batch_size=64 the test set has fewer than 50 batches, so evaluate one full pass instead of steps=50
test_loss, test_acc = modelvgg_dropout.evaluate(test_generator2)


print('VGG16_more_epochs_acc:', test_acc)

Running the model for 8 more epochs brought the accuracy up to 97.8%.

In [None]:
history_dict = history.history
loss_values = history_dict['loss']
val_loss_values = history_dict['val_loss']
acc_values = history_dict['accuracy']
val_acc_values = history_dict['val_accuracy']
epochs = range(1, len(history_dict['accuracy']) + 1)

plt.plot(epochs, loss_values, 'bo', label = 'Training loss')
plt.plot(epochs, val_loss_values, 'b', label = 'Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

plt.plot(epochs, acc_values, 'bo', label = 'Training accuracy')
plt.plot(epochs, val_acc_values, 'b', label = 'Validation accuracy')
plt.title('Training and validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

### Show Predicted Images with Labels
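Before the prediction cells, here is a small sketch of how a softmax output row can be turned into a class name by taking the argmax and inverting a `class_indices`-style mapping. The three-class mapping and the probability values are hypothetical; the real mapping comes from the generator and depends on the dataset folders.

```python
import numpy as np

# Hypothetical subset of a class_indices mapping (class name -> index).
class_indices = {"ALBATROSS": 0, "EURASIAN MAGPIE": 1, "KOOKABURRA": 2}
# Invert it so a predicted index can be looked up by number.
index_to_class = {idx: name for name, idx in class_indices.items()}

def decode_prediction(probs):
    """Return (class name, probability) for a batch containing one image."""
    probs = np.asarray(probs)
    idx = int(np.argmax(probs, axis=-1)[0])
    return index_to_class[idx], float(probs[0, idx])

# Made-up softmax output for a single image.
label, confidence = decode_prediction([[0.1, 0.7, 0.2]])
print(label, confidence)
```

Taking the argmax of the `predict` output is the usual replacement for the older `predict_classes` helper, which is no longer available in recent Keras releases.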

In [None]:
from tensorflow.keras.preprocessing.image import load_img, img_to_array

dic=train_generator2.class_indices
# Invert the mapping: class index -> class name
icd={k:v for v,k in dic.items()}
def output(location):
    # load_img expects target_size as (height, width)
    img=load_img(location,target_size=(224,224))
    img=img_to_array(img)
    img=img/255
    img=np.expand_dims(img,axis=0)  # add the batch dimension
    # predict_classes and predict_proba are not available in recent Keras releases;
    # take the argmax of the softmax output instead.
    probs=modelvgg_dropout.predict(img)
    answer=np.argmax(probs,axis=-1)
    probability=round(float(np.max(probs))*100,2)
    #print ('Bird Is',icd[answer[0]], 'With probability',probability)
    print('The model predicts that this bird is:', icd[answer[0]])
    #print (probability, ' % chances are there that the Bird Is',icd[answer[0]])

#Source: https://www.kaggle.com/anuragmishra2311/birds-classification-using-resnet-101


In [None]:


img='../input/100-bird-species/test/EURASIAN MAGPIE/2.jpg' 
pic=load_img(img,target_size=(224,224))
plt.imshow(pic)
output(img)



In [None]:
img='../input/100-bird-species/test/ALBATROSS/5.jpg' 
pic=load_img(img,target_size=(224,224))
plt.imshow(pic)
output(img)

In [None]:
img='../input/100-bird-species/test/NORTHERN FLICKER/3.jpg' 
pic=load_img(img,target_size=(224,224))
plt.imshow(pic)
output(img)

In [None]:
img='../input/100-bird-species/test/KOOKABURRA/5.jpg' 
pic=load_img(img,target_size=(224,224))
plt.imshow(pic)
output(img)

# Analysis of Results

The original VGG16 model performed quite well on the dataset with no tuning. After some tweaks, its accuracy continued to improve. However, there were clear tradeoffs between accuracy and speed. I concluded that I could probably keep improving the model by a fraction of a percentage point with each new iteration, but each run took 1 to 2 hours. Normally that would be fine, but waiting that long for results made it hard to be nimble in making adjustments.

With a final test accuracy of 97.8% (0.9779999852180481), I am quite pleased with this model. Some cells in this notebook appear to show errors, but they are due to the session restarting on Kaggle.