# Project Description 

The goal of this project is to continue exploring convolutional neural nets and transfer learning.

# Load in Images

I am loading the "Birds 525 Species - Image Classification" dataset, which I sourced from Kaggle at this URL: https://www.kaggle.com/datasets/gpiosenka/100-bird-species

The script Load_Image_Data.py loads these images using Keras's ImageDataGenerator, which rescales the RGB intensities to [0,1] and sets the batch size to 32.

There are 525 unique classes in this dataset. Images have shape (224,224,3).

In [1]:
from Load_Image_Data import Load_ImageData

path = "\\Bird Species Image Classification\\Bird Images"

train_data, test_data, val_data = Load_ImageData.Load_ImgData(path)

Found 84635 images belonging to 525 classes.  
Found 2625 images belonging to 525 classes.  
Found 2625 images belonging to 525 classes.

I'm going to replace the train_data I loaded with training data that applies data augmentation using ImageDataGenerator. The test and val data will remain the same (without any augmentation).

In [3]:
from keras.preprocessing.image import ImageDataGenerator

# re-specify the path so it points specifically at the train data
path = "\\Bird Species Image Classification\\Bird Images\\train"

aug_train = ImageDataGenerator(
    width_shift_range=0.1,      
    height_shift_range=0.1,     
    rotation_range=20,          
    brightness_range=(0.5, 1.5), 
    zoom_range=0.2,             
    horizontal_flip=True,       
    rescale=1./255              
)

train_data = aug_train.flow_from_directory(
    path,
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical',
    shuffle=True
)

Found 84635 images belonging to 525 classes.
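
To make the augmentation settings above more concrete, here is a rough sketch of the per-image random parameters they imply. It follows how the Keras documentation describes these arguments (fractional shifts are relative to the image size, `rotation_range` is in degrees, and a scalar `zoom_range` of z means a factor drawn from [1 - z, 1 + z]). The helper `sample_augmentation` is hypothetical: it only draws the parameters and does not transform any image.

```python
import random

IMG_SIZE = 224  # image height and width, from the dataset description


def sample_augmentation(rng):
    """Draw one set of random augmentation parameters (illustrative only)."""
    return {
        # width/height_shift_range=0.1 -> shift of up to 10% of the image size
        "shift_x_px": rng.uniform(-0.1, 0.1) * IMG_SIZE,
        "shift_y_px": rng.uniform(-0.1, 0.1) * IMG_SIZE,
        # rotation_range=20 -> angle in degrees within [-20, 20]
        "rotation_deg": rng.uniform(-20, 20),
        # brightness_range=(0.5, 1.5) -> brightness factor within that range
        "brightness": rng.uniform(0.5, 1.5),
        # zoom_range=0.2 -> zoom factor within [0.8, 1.2]
        "zoom": rng.uniform(1 - 0.2, 1 + 0.2),
        # horizontal_flip=True -> flip roughly half of the images
        "flip": rng.random() < 0.5,
    }


rng = random.Random(0)  # fixed seed so the sketch is repeatable
params = sample_augmentation(rng)
for name, value in params.items():
    print(name, value)
```

So a shift of 0.1 on a 224-pixel image means the bird can move by up to about 22 pixels in either direction.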


train_data, test_data, and val_data have been loaded in; now let's train our model.

# Model

I will start by importing the pretrained Xception model from Keras for transfer learning. The Xception architecture contains separable convolutional layers. Traditional convolutional layers tackle spatial patterns and cross-channel patterns (RGB at the input) in one fell swoop, whereas separable conv layers split them into separate tasks.
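
As a rough illustration of why that split matters, the sketch below compares the weight count of a standard convolution with a depthwise separable one (a per-channel spatial filter followed by a 1x1 channel-mixing filter). The layer sizes are made-up example values, not taken from Xception, and biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    # One kernel x kernel x c_in filter for every output channel.
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    # Depthwise step: one kernel x kernel spatial filter per input channel.
    depthwise = kernel * kernel * c_in
    # Pointwise step: a 1x1 convolution that mixes channels.
    pointwise = c_in * c_out
    return depthwise + pointwise


# Illustrative sizes only (assumed, not read from the Xception model).
kernel, c_in, c_out = 3, 128, 256

standard = standard_conv_params(kernel, c_in, c_out)
separable = separable_conv_params(kernel, c_in, c_out)

print("standard: ", standard)
print("separable:", separable)
print("ratio:    ", round(standard / separable, 1))
```

For these example sizes the separable layer needs roughly one ninth of the weights.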

By loading in a pretrained model (trained on ImageNet), training time should be reduced significantly, since the transferred model can already identify common low-level patterns such as basic surface textures and edges.

In [3]:
import tensorflow
from tensorflow import keras
from tensorflow.keras.applications import Xception

# Load the pre-trained Xception model without the top layers
Xception_base = Xception(weights='imagenet', include_top=False, input_shape=[224, 224, 3])

model = keras.models.Sequential([
    Xception_base, 
    
    #output layers
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(1024, activation='relu'),
    keras.layers.Dense(525, activation='softmax') 
    
])

model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 xception (Functional)       (None, 7, 7, 2048)        20861480  
                                                                 
 global_average_pooling2d (G  (None, 2048)             0         
 lobalAveragePooling2D)                                          
                                                                 
 dense (Dense)               (None, 1024)              2098176   
                                                                 
 dense_1 (Dense)             (None, 525)               538125    
                                                                 
Total params: 23,497,781
Trainable params: 23,443,253
Non-trainable params: 54,528
_________________________________________________________________
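
As a quick sanity check on the summary above, the dense layer parameter counts can be reproduced by hand: a Dense layer has one weight per input/output pair plus one bias per output. The Xception base and non-trainable counts are copied from the summary; `dense_params` is just a helper name for this sketch.

```python
def dense_params(n_in, n_out):
    # weights (n_in * n_out) plus one bias per output unit
    return n_in * n_out + n_out


xception_base = 20_861_480  # copied from the summary
non_trainable = 54_528      # copied from the summary

hidden = dense_params(2048, 1024)  # pooled features -> Dense(1024)
output = dense_params(1024, 525)   # Dense(1024) -> 525 classes

total = xception_base + hidden + output
trainable = total - non_trainable

print("dense:    ", hidden)
print("dense_1:  ", output)
print("total:    ", total)
print("trainable:", trainable)
```

Nearly all of the parameters are trainable, which suggests the Xception base was left unfrozen and is being fine-tuned along with the new output layers.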


In [4]:
model.compile(loss="categorical_crossentropy",
             optimizer= "adam",
             metrics=['accuracy'])

In [5]:
history = model.fit(
    train_data,
    steps_per_epoch=train_data.samples // train_data.batch_size,
    epochs=2,
    validation_data=val_data,
    validation_steps=val_data.samples // val_data.batch_size)

Epoch 1/2
Epoch 2/2


In [2]:
model.save('\\Bird Species Image Classification\\model_2_epochs')

Two passes through the training set took a very long time (almost 13 hours). I will be training the model in separate chunks of time, saving after each chunk.
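
For a sense of scale, here is a back-of-the-envelope sketch of what the `steps_per_epoch` and `validation_steps` expressions work out to, and roughly how long each training step took. The 13 hours is the approximate figure quoted above, so the per-step time is only an estimate (and it includes the validation passes).

```python
train_samples = 84_635  # from the "Found ... images" output
val_samples = 2_625
batch_size = 32
epochs = 2
approx_hours = 13       # approximate total training time

# Integer division drops the final partial batch.
steps_per_epoch = train_samples // batch_size
skipped_train = train_samples - steps_per_epoch * batch_size

validation_steps = val_samples // batch_size
skipped_val = val_samples - validation_steps * batch_size

seconds_per_step = approx_hours * 3600 / (epochs * steps_per_epoch)

print("steps per epoch:  ", steps_per_epoch)
print("train images left out of each epoch:", skipped_train)
print("validation steps: ", validation_steps)
print("val images left out:", skipped_val)
print("approx seconds per step:", round(seconds_per_step, 2))
```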

In [6]:
import tensorflow as tf

saved_model_path = '\\Bird Species Image Classification\\model_2_epochs'
loaded_model = tf.keras.models.load_model(saved_model_path)

In [7]:
loaded_model.evaluate(test_data)



[1.0075669288635254, 0.7234285473823547]

72% accuracy

Training Chunk 2:

In [8]:
history = loaded_model.fit(
    train_data,
    steps_per_epoch=train_data.samples // train_data.batch_size,
    epochs=2,
    validation_data=val_data,
    validation_steps=val_data.samples // val_data.batch_size)

Epoch 1/2
Epoch 2/2


In [3]:
save_model_path = '\\Bird Species Image Classification\\model_4_epochs'
loaded_model.save(save_model_path)

In [9]:
loaded_model.evaluate(test_data)



[0.37826618552207947, 0.8914285898208618]

72% -> 89% accuracy

In [4]:
import tensorflow as tf

saved_model_path = '\\Bird Species Image Classification\\model_4_epochs'
loaded_model = tf.keras.models.load_model(saved_model_path)

In [14]:
loaded_model.evaluate(test_data)



[0.37826618552207947, 0.8914285898208618]

Training Chunk 3:

In [5]:
history = loaded_model.fit(
    train_data,
    steps_per_epoch=train_data.samples // train_data.batch_size,
    epochs=2,
    validation_data=val_data,
    validation_steps=val_data.samples // val_data.batch_size)

Epoch 1/2
Epoch 2/2


In [6]:
loaded_model.evaluate(test_data)



[0.33620306849479675, 0.9020952582359314]

Nice, 90% accuracy
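
To put the three `evaluate` outputs side by side, the sketch below converts each accuracy into a count of correctly classified test images (out of the 2625 reported when loading the data) and compares it against random guessing over 525 classes. The `[loss, accuracy]` pairs are copied from the outputs above.

```python
test_images = 2625
num_classes = 525

# epochs trained -> [loss, accuracy] as returned by evaluate()
results = {
    2: [1.0075669288635254, 0.7234285473823547],
    4: [0.37826618552207947, 0.8914285898208618],
    6: [0.33620306849479675, 0.9020952582359314],
}

chance_accuracy = 1 / num_classes  # accuracy of uniform random guessing

correct_counts = []
for epochs, (loss, accuracy) in results.items():
    correct = round(accuracy * test_images)
    correct_counts.append(correct)
    print(f"{epochs} epochs: loss={loss:.3f}, "
          f"accuracy={accuracy:.1%}, correct={correct}/{test_images}")

print(f"chance accuracy: {chance_accuracy:.2%}")
```

The jump from epoch 2 to epoch 4 is much larger than the one from epoch 4 to epoch 6, which fits the decision to stop here given the training cost.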

In [4]:
save_model_path = '\\Bird Species Image Classification\\model_6_epochs'
loaded_model.save(save_model_path)

Due to how computationally expensive this task is currently, I am going to retire the project at 90% accuracy.