### Model Testing
Chase Kent-Dotson  
April 4, 2022

### Introduction
This workbook collects tests and training outputs from early in the project. I initially thought that saving a Keras model would also save its training history, but it does not, so there are no evaluation graphs for these attempts, only the text output. I also overwrote some of my early code, so not all models are represented here, but every moderately successful attempt from this notebook is scored at the top of the Final Model Training notebook, where my final model is built and trained in full. All models attempted are also included as complete files in the models directory, even those not represented in the code below.
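
Since a saved model does not carry its training history, one way to keep it is to write the history dictionary to disk next to the model. The sketch below is a minimal illustration, not what was done in this project: the helper names and the JSON layout are my own assumptions, and the only Keras detail it relies on is that the object returned by `fit` exposes a `history` dictionary mapping each metric name to a list with one value per epoch.

```python
import json
from pathlib import Path


def save_history(history_dict, path):
    """Write a Keras-style history dict ({metric: [value per epoch]}) to JSON."""
    # Cast to plain floats so NumPy scalars do not break json.dumps.
    serializable = {name: [float(v) for v in values]
                    for name, values in history_dict.items()}
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(serializable, indent=2))


def load_history(path):
    """Read a history dict back from JSON."""
    return json.loads(Path(path).read_text())


def extend_history(earlier, later):
    """Concatenate the per-epoch lists of two consecutive training runs."""
    return {name: earlier.get(name, []) + later.get(name, [])
            for name in set(earlier) | set(later)}
```

In a notebook like this one, a call such as `save_history(history.history, 'models/model_2_history.json')` after each `fit` would keep the curves available for plotting later (the file name here is hypothetical). `extend_history` is meant for the pattern used below, where a model is reloaded and trained for additional epochs.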
  
This notebook likely will not run from top to bottom without some modification, and the saved output will be lost if it is modified and rerun, but I wanted to include it to show the process I went through as I learned to use the data pipeline and build models. It contains my most successful CNN trained without pretrained weights, an explanation of the optimization process I went through to improve it, and then my early attempts at transfer learning.

### Table of Contents:
* [CNN Modeling from Scratch](#2)
* [Early Transfer Learning Models](#3)
* [Conclusion](#4)

### Imports <a class="anchor" id="1"></a>

In [None]:
# data handling
import numpy as np
import pandas as pd

# file directory interface
import os.path, sys

# neural network modeling
import tensorflow as tf
from tensorflow import keras
# import everything through tensorflow.keras so layer classes match efficientnet.tfkeras
from tensorflow.keras import Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (BatchNormalization, Conv2D, Dense, Dropout,
                                     Flatten, GlobalAveragePooling2D, MaxPooling2D)
from tensorflow.keras.optimizers import Adam

# transfer learning modeling
import efficientnet.tfkeras as efn

# data generation for modeling
from tensorflow.keras.preprocessing.image import ImageDataGenerator

### CNN Modeling from Scratch <a class="anchor" id="2"></a>
Below are my early attempts at modeling using CNNs from scratch without any pre-trained weights. 

In [2]:
# set variables for ease of optimization later
batch_size = 32
num_classes = 196
dropout = 0.2
img_size= (224,224)

In [3]:
# Data augmentation

# training generator: 1/255 rescale, random shear (intensity 0.2), random zoom of +/-20%,
# random horizontal flip, per-sample centering and std normalization, 20% validation split
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   samplewise_std_normalization=True,
                                   samplewise_center=True,
                                   validation_split=0.2)

# no augmentation for the test data, only the same rescaling and per-sample normalization
test_datagen = ImageDataGenerator(rescale=1./255,
                                  samplewise_std_normalization=True,
                                  samplewise_center=True)

# generate the data for the train set including automatic labels
train_data = train_datagen.flow_from_directory('data/car_data/car_data/train',
                                              target_size=img_size,
                                              shuffle=True,
                                              batch_size=batch_size,
                                              class_mode='categorical',
                                              subset='training')

# generate the validation data from the 20% set aside by validation_split
# (it comes from train_datagen, so the same random augmentation is applied to it)
validation_data = train_datagen.flow_from_directory('data/car_data/car_data/train',
                                                    target_size=img_size,
                                                    batch_size=batch_size,
                                                    class_mode='categorical',
                                                    subset='validation') 

# generate data for the test set
test_data = test_datagen.flow_from_directory('data/car_data/car_data/test',
                                              target_size=img_size,
                                              batch_size=batch_size,
                                              class_mode='categorical')

Found 6598 images belonging to 196 classes.
Found 1546 images belonging to 196 classes.
Found 8041 images belonging to 196 classes.
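
The generators above combine `rescale=1./255` with `samplewise_center` and `samplewise_std_normalization`. As far as I understand `ImageDataGenerator`, the rescale is applied first and each image is then centered and divided by its own standard deviation (plus a small epsilon), which would make the rescale close to a no-op in this configuration. The sketch below checks that arithmetic on a toy flattened "image"; the function name and the pixel values are illustrative assumptions, not part of the notebook.

```python
from statistics import fmean, pstdev


def samplewise_standardize(pixels, rescale=1.0, eps=1e-6):
    """Rescale, then subtract the image's own mean and divide by its own std."""
    scaled = [p * rescale for p in pixels]
    mean = fmean(scaled)
    centered = [p - mean for p in scaled]
    std = pstdev(centered)  # population std, the NumPy default
    return [c / (std + eps) for c in centered]


raw_pixels = [0, 64, 128, 255]  # toy flattened image, values are made up

with_rescale = samplewise_standardize(raw_pixels, rescale=1 / 255)
without_rescale = samplewise_standardize(raw_pixels)

# The two results differ only through the epsilon term.
max_difference = max(abs(a - b) for a, b in zip(with_rescale, without_rescale))
print(max_difference)
```

Either way, each image reaches the network with roughly zero mean and unit variance, so the choice mainly affects readability of the pipeline rather than what the model sees.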


In [70]:
CNN_model = Sequential()

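# Create simple CNN model architecture with Pooling for dimensionality reduction 
# and Dropout to reduce overfitting
# NOTE: input_shape must match the generators' target_size; the summary recorded below
# comes from a run with 128x128 inputs, while img_size above is (224, 224)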
CNN_model.add(Conv2D(32, kernel_size=(3, 3), activation = 'relu', input_shape = (128, 128, 3)))
CNN_model.add(BatchNormalization())
CNN_model.add(MaxPooling2D(pool_size=(2, 2)))
CNN_model.add(Dropout(dropout))

# 2nd Convolution and Pooling Layer
CNN_model.add(Conv2D(64,kernel_size=(3,3),activation='relu'))
CNN_model.add(BatchNormalization())
CNN_model.add(MaxPooling2D(pool_size=(2, 2)))
CNN_model.add(Dropout(dropout))

# 3rd Convolution and Pooling Layer
CNN_model.add(Conv2D(128,kernel_size=(3,3),activation='relu'))
CNN_model.add(BatchNormalization())
CNN_model.add(MaxPooling2D(pool_size=(2, 2)))
CNN_model.add(Dropout(dropout))

# 4th Convolution and Pooling Layer
CNN_model.add(Conv2D(256,kernel_size=(3,3),activation='relu'))
CNN_model.add(BatchNormalization())
CNN_model.add(MaxPooling2D(pool_size=(2, 2)))
CNN_model.add(Dropout(dropout))

# 5th Convolution and Pooling Layer
CNN_model.add(Conv2D(512,kernel_size=(3,3),activation='relu'))
CNN_model.add(BatchNormalization())
CNN_model.add(MaxPooling2D(pool_size=(2, 2)))
CNN_model.add(Dropout(dropout))

# Flatten the output of our convolutional layers
CNN_model.add(Flatten())

# Add dense layers
CNN_model.add(Dense(units=256,activation='relu'))
CNN_model.add(Dropout(dropout))
CNN_model.add(Dense(units=256,activation='relu'))
CNN_model.add(Dropout(dropout))
CNN_model.add(Dense(num_classes, activation='softmax'))

# Print out a summary of the network
CNN_model.summary()

Model: "sequential_15"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_51 (Conv2D)           (None, 126, 126, 32)      896       
_________________________________________________________________
batch_normalization_24 (Batc (None, 126, 126, 32)      128       
_________________________________________________________________
max_pooling2d_49 (MaxPooling (None, 63, 63, 32)        0         
_________________________________________________________________
dropout_63 (Dropout)         (None, 63, 63, 32)        0         
_________________________________________________________________
conv2d_52 (Conv2D)           (None, 61, 61, 64)        18496     
_________________________________________________________________
batch_normalization_25 (Batc (None, 61, 61, 64)        256       
_________________________________________________________________
max_pooling2d_50 (MaxPooling (None, 30, 30, 64)      
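
The output shapes and parameter counts in the summary above follow from simple arithmetic: a 3x3 convolution with the default "valid" padding trims 2 pixels from each spatial dimension, and a 2x2 max pool halves it, rounding down. The sketch below reproduces those numbers; the helper names are my own and it assumes square inputs, 3x3 kernels, and 2x2 pooling as in the model above.

```python
def conv_valid(size, kernel=3):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool(size, window=2):
    """Spatial size after max pooling with stride equal to the window."""
    return size // window


def feature_map_sizes(input_size, n_blocks=5):
    """(size after conv, size after pool) for each conv/pool block."""
    sizes = []
    size = input_size
    for _ in range(n_blocks):
        after_conv = conv_valid(size)
        size = pool(after_conv)
        sizes.append((after_conv, size))
    return sizes


def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def batchnorm_params(channels):
    """Gamma, beta, moving mean, and moving variance per channel."""
    return 4 * channels


print(feature_map_sizes(128))   # starts (126, 63), (61, 30), matching the summary
print(conv_params(3, 3, 32))    # 896, matching conv2d_51
print(conv_params(3, 32, 64))   # 18496, matching conv2d_52
```

With 128x128 inputs the last block ends at 2x2x512, so `Flatten` produces 2048 features; with 224x224 inputs the same stack would end at 5x5x512.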

In [71]:
optimizer = keras.optimizers.Adam(learning_rate=0.001)

# Compile the model with the desired loss function, optimizer, and metric(s) to track
CNN_model.compile(loss = 'categorical_crossentropy',
                  optimizer = optimizer,
                  metrics = ['accuracy'])

In [72]:
history = CNN_model.fit(train_data,
                        steps_per_epoch=train_data.samples // batch_size,
                        validation_data=validation_data,
                        epochs=10,
                        validation_steps=validation_data.samples // batch_size)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
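
A note on `steps_per_epoch`: the cell above uses `samples // batch_size`, which skips the final partial batch each epoch, while later cells in this notebook use `samples // batch_size + 1`. The sketch below works through the arithmetic with the image counts reported by the generators (6598 training, 1546 validation) and `batch_size = 32`. The helper names are my own, and rounding up is offered as one reasonable option rather than the method used here.

```python
import math


def steps_floor(samples, batch_size):
    """Number of full batches; any remainder is left out of the epoch."""
    return samples // batch_size


def steps_ceil(samples, batch_size):
    """Number of batches needed to cover every sample once."""
    return math.ceil(samples / batch_size)


batch_size = 32
for name, samples in [("train", 6598), ("validation", 1546)]:
    full = steps_floor(samples, batch_size)
    leftover = samples - full * batch_size
    print(name, full, steps_ceil(samples, batch_size), leftover)
```

For these counts, `// batch_size + 1` and the rounded-up value agree. They differ only when the sample count is an exact multiple of the batch size, in which case `+ 1` asks for one step more than a single pass needs.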


In [73]:
# Evaluate the model's performance on the test data
score = CNN_model.evaluate(test_data, verbose=1)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 4.167438983917236
Test accuracy: 0.09314762055873871


In [74]:
CNN_model.save('models/model_2')

INFO:tensorflow:Assets written to: models/model_2\assets


In [75]:
reconstructed_model = keras.models.load_model('models/model_2')

In [76]:
r_history = reconstructed_model.fit(train_data,
                                    steps_per_epoch=train_data.samples // batch_size,
                                    validation_data=validation_data,
                                    epochs=30,
                                    validation_steps=validation_data.samples // batch_size)

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


In [77]:
# Evaluate the model's performance on the test data
score = reconstructed_model.evaluate(test_data, verbose=1)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 2.8250043392181396
Test accuracy: 0.33291879296302795


In [78]:
reconstructed_model.save('models/model_2')

INFO:tensorflow:Assets written to: models/model_2\assets


### CNN Modeling from Scratch: Findings
Above is the construction and training output for my most successful CNN trained without pre-trained weights. The first CNN I built did not use batch normalization and did not include the final 512-filter convolutional layer. Its maximum accuracy was __7%__. The model above, model_2, surpassed that accuracy within the first 10 epochs and reached a final test accuracy of __33%__ after 40 epochs of training, at which point validation loss and accuracy had begun to plateau. Adding batch normalization helped prevent overtraining: in the first model, which lacked it, validation progress slowed much sooner, whereas with it the validation scores stayed closer to the training scores for longer.  
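
The stopping point described above, where validation loss stops improving, can be expressed as a patience rule. The sketch below is a plain-Python illustration of that rule under my own assumptions: the function name, the default patience, and the example losses are hypothetical and are not values from this project. Keras provides the same idea as the `EarlyStopping` callback, which was not used in this notebook.

```python
def should_stop(val_losses, patience=5, min_delta=0.0):
    """Return True once validation loss has gone `patience` epochs in a row
    without improving on the best value seen so far by more than `min_delta`."""
    best = float("inf")
    epochs_without_improvement = 0
    for loss in val_losses:
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return True
    return False


# Hypothetical validation losses that improve and then level off.
example_losses = [4.0, 3.5, 3.2, 3.1, 3.15, 3.12, 3.11]
print(should_stop(example_losses, patience=3))
```

A rule like this replaces judging the plateau by eye from the epoch log, which matters most when the log is the only record of the run.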
  
Given that this is a 196-class classification problem, I gathered from my previous tests that to reach an accuracy I'd be satisfied with from a CNN built from scratch, I would need to increase model complexity, then greatly strengthen the measures against overtraining (such as lowering the learning rate), and invest a huge amount of training time. I decided the best path forward was to see whether a base model with pre-trained weights could help me reach a respectable accuracy in the time I had available, so I moved on to that next.

### Early Transfer Learning Models <a class="anchor" id="3"></a>
Below are some of my first attempts at applying transfer learning models to the problem.

In [27]:
# set up the transfer learning model structure
base_model = efn.EfficientNetB0(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
predictions = Dense(len(train_data.class_indices), activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)

# batch normalization layers stay trainable along with the new top layer;
# all other base model layers are frozen
for layer in base_model.layers:
    if isinstance(layer, BatchNormalization):
        layer.trainable = True
    else:
        layer.trainable = False

# compile the model
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.0001), 
              loss='categorical_crossentropy', 
              metrics=['acc'])

model.summary()

Model: "model_5"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_6 (InputLayer)            [(None, None, None,  0                                            
__________________________________________________________________________________________________
stem_conv (Conv2D)              (None, None, None, 3 864         input_6[0][0]                    
__________________________________________________________________________________________________
stem_bn (BatchNormalization)    (None, None, None, 3 128         stem_conv[0][0]                  
__________________________________________________________________________________________________
stem_activation (Activation)    (None, None, None, 3 0           stem_bn[0][0]                    
____________________________________________________________________________________________

In [28]:
history = model.fit_generator(generator=train_data,
                              steps_per_epoch=train_data.samples // batch_size + 1 ,
                              validation_data=validation_data,
                              validation_steps=validation_data.samples // batch_size + 1,
                              epochs=10,                           
                              workers=8,             
                              max_queue_size=32,             
                              verbose=1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [29]:
# Evaluate the model's performance on the test data
score = model.evaluate(test_data, verbose=1)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 3.620945930480957
Test accuracy: 0.30618083477020264


In [30]:
model.save('models/transfer_model_B0_lr0001')

INFO:tensorflow:Assets written to: models/transfer_model_B0_lr0001\assets




In [9]:
reconstructed_model = keras.models.load_model('models/transfer_model_B0_lr0001')

In [10]:
r_history = reconstructed_model.fit_generator(generator=train_data,
                                              steps_per_epoch=train_data.samples // batch_size + 1 ,
                                              validation_data=validation_data,
                                              validation_steps=validation_data.samples // batch_size + 1,
                                              initial_epoch=10,
                                              epochs=20,                           
                                              workers=8,             
                                              max_queue_size=32,             
                                              verbose=1)

Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [11]:
# Evaluate the model's performance on the test data
score = reconstructed_model.evaluate(test_data, verbose=1)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 2.4186792373657227
Test accuracy: 0.49570947885513306


In [13]:
reconstructed_model.save('models/transfer_model_B0_lr0001_20epoch')

INFO:tensorflow:Assets written to: models/transfer_model_B0_lr0001_20epoch\assets




In [14]:
reconstructed_model = keras.models.load_model('models/transfer_model_B0_lr0001_20epoch')

In [15]:
r_history = reconstructed_model.fit_generator(generator=train_data,
                                              steps_per_epoch=train_data.samples // batch_size + 1 ,
                                              validation_data=validation_data,
                                              validation_steps=validation_data.samples // batch_size + 1,
                                              initial_epoch=20,
                                              epochs=30,                           
                                              workers=8,             
                                              max_queue_size=32,             
                                              verbose=1)



Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


In [16]:
# Evaluate the model's performance on the test data
score = reconstructed_model.evaluate(test_data, verbose=1)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 1.7826638221740723
Test accuracy: 0.5951995849609375


In [17]:
reconstructed_model.save('models/transfer_model_B0_lr0001_30epoch')

INFO:tensorflow:Assets written to: models/transfer_model_B0_lr0001_30epoch\assets




In [18]:
reconstructed_model = keras.models.load_model('models/transfer_model_B0_lr0001_30epoch')

In [19]:
r_history = reconstructed_model.fit_generator(generator=train_data,
                                              steps_per_epoch=train_data.samples // batch_size + 1 ,
                                              validation_data=validation_data,
                                              validation_steps=validation_data.samples // batch_size + 1,
                                              initial_epoch=30,
                                              epochs=40,                           
                                              workers=8,             
                                              max_queue_size=32,             
                                              verbose=1)



Epoch 31/40
Epoch 32/40
Epoch 33/40
Epoch 34/40
Epoch 35/40
Epoch 36/40
Epoch 37/40
Epoch 38/40
Epoch 39/40
Epoch 40/40


In [20]:
# Evaluate the model's performance on the test data
score = reconstructed_model.evaluate(test_data, verbose=1)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 1.4460686445236206
Test accuracy: 0.6522820591926575


In [21]:
reconstructed_model.save('models/transfer_model_B0_lr0001_40epoch')

INFO:tensorflow:Assets written to: models/transfer_model_B0_lr0001_40epoch\assets




In [4]:
reconstructed_model = keras.models.load_model('models/transfer_model_B0_lr0001_40epoch')

In [5]:
r_history = reconstructed_model.fit_generator(generator=train_data,
                                              steps_per_epoch=train_data.samples // batch_size + 1 ,
                                              validation_data=validation_data,
                                              validation_steps=validation_data.samples // batch_size + 1,
                                              initial_epoch=40,
                                              epochs=50,                           
                                              workers=8,             
                                              max_queue_size=32,             
                                              verbose=1)



Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


In [6]:
# Evaluate the model's performance on the test data
score = reconstructed_model.evaluate(test_data, verbose=1)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 1.2615721225738525
Test accuracy: 0.6862330436706543


In [7]:
reconstructed_model.save('models/transfer_model_B0_lr0001_50epoch')

INFO:tensorflow:Assets written to: models/transfer_model_B0_lr0001_50epoch\assets




In [8]:
reconstructed_model = keras.models.load_model('models/transfer_model_B0_lr0001_50epoch')

In [9]:
r_history = reconstructed_model.fit_generator(generator=train_data,
                                              steps_per_epoch=train_data.samples // batch_size + 1,
                                              validation_data=validation_data,
                                              validation_steps=validation_data.samples // batch_size + 1,
                                              initial_epoch=50,
                                              epochs=60,                           
                                              workers=8,             
                                              max_queue_size=32,             
                                              verbose=1)



Epoch 51/60
Epoch 52/60
Epoch 53/60
Epoch 54/60
Epoch 55/60
Epoch 56/60
Epoch 57/60
Epoch 58/60
Epoch 59/60
Epoch 60/60


In [10]:
# Evaluate the model's performance on the test data
score = reconstructed_model.evaluate(test_data, verbose=1)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 1.1425036191940308
Test accuracy: 0.7072503566741943


In [11]:
reconstructed_model.save('models/transfer_model_B0_lr0001_60epoch')

INFO:tensorflow:Assets written to: models/transfer_model_B0_lr0001_60epoch\assets




In [5]:
reconstructed_model = keras.models.load_model('models/transfer_model_B0_lr001')

In [6]:
# Evaluate the model's performance on the test data
score = reconstructed_model.evaluate(test_data, verbose=1)

print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 1.0135035514831543
Test accuracy: 0.7346101403236389


### Early Transfer Learning Models: Findings
My early transfer learning models were immediately more predictive than the previous CNN attempts. I tried several learning rates, and the results were consistent with what I had read: lowering the learning rate helps prevent overfitting. I also tried different variants of the EfficientNet base model. While these models were better than the original CNNs, tuning them required more research and thought, which I address in my Final Model Training notebook, where I decide on and train my final model in preparation for evaluation.
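
Because the training history was not saved, the test scores printed in the cells above are the only record of how the EfficientNetB0 run at learning rate 0.0001 progressed. The sketch below collects those printed accuracies (rounded to four decimals) and computes the gain from each additional block of 10 epochs. The variable and function names are my own.

```python
# Test accuracy printed above, keyed by total epochs trained.
recorded_test_accuracy = {
    10: 0.3062,
    20: 0.4957,
    30: 0.5952,
    40: 0.6523,
    50: 0.6862,
    60: 0.7073,
}


def gains_per_block(accuracy_by_epoch):
    """Accuracy gained between consecutive checkpoints, in epoch order."""
    epochs = sorted(accuracy_by_epoch)
    return [round(accuracy_by_epoch[later] - accuracy_by_epoch[earlier], 4)
            for earlier, later in zip(epochs, epochs[1:])]


gains = gains_per_block(recorded_test_accuracy)
for epoch, gain in zip(sorted(recorded_test_accuracy)[1:], gains):
    print(f"epochs {epoch - 10}-{epoch}: +{gain:.4f}")
```

Each block of 10 epochs adds less than the one before it, which fits the point above that further improvement would have to come from better tuning rather than simply training longer.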

### Conclusion <a class="anchor" id="4"></a>
Implementing a CNN model from scratch proved fruitful, considering that a naive model guessing the majority class every time would achieve only __0.8%__ accuracy. However, more predictive power is required to make the model worthwhile from a business perspective, since marketing data needs to be reliable in order to provide value. Transfer learning has been very helpful so far in working toward this goal, but it will require better tuning to get the most out of the model. I will tackle final model decisions, building, training, and tuning in my next notebook.
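
The naive baseline mentioned above can be computed directly from the class labels. The sketch below is a minimal illustration with hypothetical helper names: it returns the accuracy of always predicting the most common class, and, for comparison, the accuracy of guessing uniformly at random among the classes.

```python
from collections import Counter


def majority_class_accuracy(labels):
    """Accuracy of always predicting the most frequent label."""
    if len(labels) == 0:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def uniform_guess_accuracy(num_classes):
    """Expected accuracy of guessing each class with equal probability."""
    return 1 / num_classes


print(majority_class_accuracy([0, 0, 1, 2]))  # toy labels: 2 of 4 are class 0
print(round(uniform_guess_accuracy(196), 4))
```

In this notebook the labels could be taken from the generator, for example `majority_class_accuracy(test_data.classes)`, assuming the directory iterator exposes the integer labels through its `classes` attribute.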