# Conclusions

1. For **hyperparameter tuning**, a moderate number of layers (between 3 and 5), a large number of units to capture the intricacies of the images, and a low learning rate are the optimal hyperparameters. This helps limit overfitting and keeps training stable given the small dataset.

2. **Transfer learning**:

- VGG16 with frozen layers yields good results when our dense model trains on top of the convolutional weights transferred from VGG16.

- VGG16 with unfrozen blocks, whether Model 2 or Model 3, yielded extremely poor results (accuracy = 24.5%), as the large number of convolutional layers and their parameters did not work well with this very small dataset. The model was attempting to fit its convolutions to a limited set of roughly 3,670 images, to the point where it perhaps wasn't able to recognize the images.

3. The **cross validation** confirmed these suspicions, as it did not yield better results given that near-optimal hyperparameters were used.


In [1]:
import os
import PIL
import PIL.Image
import tensorflow as tf
import numpy as np
from numpy import asarray
import pandas as pd
from keras.models import Sequential, Model
from keras.layers import Input, Dropout, Flatten, Dense, GlobalAveragePooling2D
from keras import optimizers
from tensorflow.keras.applications.vgg16 import VGG16
from sklearn import preprocessing
from keras.preprocessing.image import load_img
from sklearn.model_selection import train_test_split
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.preprocessing import StandardScaler

In [2]:
batch_size = 256
img_height = 150
img_width = 150

**This method of importing images is rather long, but it is the only method I know that lets me divide the dataset into train and test sets in order to conduct cross validation at the end of this notebook.**
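For reference, the path and label collection can be written more compactly with `pathlib`. This is a minimal sketch, assuming the one-folder-per-class layout of `flower_photos`; the helper name `list_images` and the extension filter are illustrative and not part of the original notebook.

```python
from pathlib import Path


def list_images(base_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return (paths, labels) for a directory with one sub-folder per class.

    Hypothetical helper: the folder name is used as the class label.
    """
    paths, labels = [], []
    for class_dir in sorted(Path(base_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files that sit next to the class folders
        for img_path in sorted(class_dir.iterdir()):
            if img_path.suffix.lower() in extensions:
                paths.append(str(img_path))
                labels.append(class_dir.name)
    return paths, labels
```

The returned lists play the same role as `X` and `y` in the import cell before images are loaded and labels are encoded.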

In [4]:
with tf.device('/device:CPU:0'): 
    
    X = []
    y = []
    base_dir = "C:/Users/Moham/.keras/datasets/flower_photos/"
    
    # THE FOLLOWING SHOULD BE USED FOR GOOGLE COLAB
    # import pathlib
    # dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
    # data_dir = tf.keras.utils.get_file(origin=dataset_url,
    #                                    fname='flower_photos',
    #                                    untar=True)
    # base_dir = pathlib.Path(data_dir)
    
    
    for f in sorted(os.listdir(base_dir)):
        if os.path.isdir(base_dir+f):
            print(f"{f} is a target class")
            for i in sorted(os.listdir(base_dir+f)):
                print(f"{i} is an input image path")
                X.append(base_dir+f+'/'+i)
                y.append(f)

    imgs = [load_img(X[i]) for i in range(len(X))]
    img_array = [asarray(img) for img in imgs]
    imgs = [tf.image.resize(img, [150,150]) for img in img_array] # resize to 150 x 150 x 3
    X = np.stack(imgs, axis=0)

    # encode labels
    le = preprocessing.LabelEncoder()
    le.fit(y)
    y = le.transform(y)
    
    # train-test split
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=123, stratify=y)
    
    ## Check class balance ##
    # share of each class within its split (class count / number of samples in the split)
    val_ratios = np.unique(y_test, return_counts=True)[1] / len(y_test)
    train_ratios = np.unique(y_train, return_counts=True)[1] / len(y_train)

    display(val_ratios / train_ratios)
    
    # scaling inputs and reshape labels
    X_train, X_test = X_train/255, X_test/255    
    y_train, y_test = y_train.reshape(-1), y_test.reshape(-1)
    


daisy is a target class
100080576_f52e8ee070_n.jpg is an input image path
10140303196_b88d3d6cec.jpg is an input image path
10172379554_b296050f82_n.jpg is an input image path
10172567486_2748826a8b.jpg is an input image path
10172636503_21bededa75_n.jpg is an input image path
102841525_bd6628ae3c.jpg is an input image path
1031799732_e7f4008c03.jpg is an input image path
10391248763_1d16681106_n.jpg is an input image path
10437754174_22ec990b77_m.jpg is an input image path
10437770546_8bb6f7bdd3_m.jpg is an input image path
10437929963_bc13eebe0c.jpg is an input image path
10466290366_cc72e33532.jpg is an input image path
10466558316_a7198b87e2.jpg is an input image path
10555749515_13a12a026e.jpg is an input image path
10555815624_dc211569b0.jpg is an input image path
10555826524_423eb8bf71_n.jpg is an input image path
10559679065_50d2b16f6d.jpg is an input image path
105806915_a9c13e2106_n.jpg is an input image path
10712722853_5632165b04.jpg is an input image path
107592979_aaa9cdf

array([0.99664959, 1.00172029, 0.99667582, 1.00065824, 1.00041959])

The ratios are all within a 5% error range, so we can assume the labels are fairly balanced between the train and validation sets.
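The balance check can also be expressed directly as class shares, that is, each class count divided by the number of samples in its split. A minimal sketch, assuming integer-encoded labels as produced by `LabelEncoder`; the function name `class_shares` and the toy labels are illustrative and are not the real data.

```python
import numpy as np


def class_shares(labels, n_classes):
    """Fraction of samples belonging to each class (0 .. n_classes - 1)."""
    counts = np.bincount(np.asarray(labels), minlength=n_classes)
    return counts / counts.sum()


# Toy labels standing in for y_train and y_test (not the real data)
y_train_toy = np.array([0, 0, 1, 1, 2, 2, 0, 1, 2])
y_test_toy = np.array([0, 1, 2])

# A ratio near 1.0 for every class means the split preserved the class mix
ratio = class_shares(y_test_toy, 3) / class_shares(y_train_toy, 3)
print(ratio)
```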

# Generic Model Function

This is a basic function that builds and compiles a model given the pretrained base model, the number of units per layer, the number of dense layers, and the learning rate.

In [5]:
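def generic_model(base_model, n_neurons=(16,), dense_layers=1, lr=0.005):
    ''' 
    base_model: pretrained convolutional base (e.g. VGG16 without its top)
    n_neurons: sequence of unit counts, in order of dense layers
    dense_layers: int, # of hidden layers
    lr: learning rate for the Adam optimizer
    returns: compiled model (sparse categorical cross-entropy on logits, Adam optimizer, accuracy metric)
    '''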
    
    model = Sequential()
    
    model.add(base_model)
    model.add(Flatten(name="Flatten"))
    
    # Dense layers

    for l, n in zip(range(dense_layers), n_neurons):
        model.add(Dense(n, kernel_initializer='HeNormal', activation='relu'))
    
    # Using logits for categorical cross entropy, hence no need for an output activation
    model.add(Dense(5)) 

    # compile network 
    model.compile(
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
        metrics = ['accuracy']
      )
    return model
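
Because the output layer has no activation and the loss is built with `from_logits=True`, `model.predict` returns raw logits rather than probabilities. The sketch below shows the softmax step that turns one row of logits into class probabilities; the logit values are made up for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


example_logits = [2.0, 0.5, -1.0, 0.0, 1.0]  # hypothetical output for 5 classes
probs = softmax(example_logits)
predicted_class = probs.index(max(probs))
print(probs, predicted_class)
```

In TensorFlow the same step is available as `tf.nn.softmax`.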

## Model 1: VGG 16 with no trainable layers

**Conclusion based on the below tuning:**
1. learning rate: lower is better as the model learns more steadily, hence trying lr = 0.001 and lr = 0.0005 in the later models
2. layers: a moderate number of layers and a steady learning rate (0.001) seem to work better and attain higher accuracy. Too many or too few layers may overfit or underfit, especially given our relatively small dataset.
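
The tuning grid behind these conclusions pairs every learning rate with every layer count, and takes the first `j` entries of `units` for a `j`-layer head. A small sketch of how the nine configurations arise, using the values from the Model 1 build cell; the dictionary keys are illustrative.

```python
from itertools import product

learning_rates = [0.01, 0.005, 0.001]
layers = [1, 3, 5]
units = [1024, 512, 256, 128, 64]

# One configuration per (learning rate, layer count) pair, in training order
configs = [
    {"lr": lr, "dense_layers": n, "n_neurons": units[:n]}
    for lr, n in product(learning_rates, layers)
]

for idx, cfg in enumerate(configs, start=1):
    print(idx, cfg)
```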

Build

In [6]:
with tf.device('/device:CPU:0'):
    base_model = VGG16(weights="imagenet",  include_top=False, input_shape=(img_width, img_height, 3))
    for layer in base_model.layers:
        layer.trainable = False
    
    # hyper parameter testing
    
    learning_rates, layers, units = [0.01, 0.005, 0.001], [1, 3, 5], [1024, 512, 256, 128, 64]    
    models = []
    for i in learning_rates:
        for j in layers:
            models.append(generic_model(base_model=base_model, n_neurons=units[:j], dense_layers=j, lr=i))

Train

In [6]:
history = []
with tf.device('/device:GPU:0'):  # training on local machine
    for i, model in enumerate(models):
        print("Fitting {}/{} models...".format(i+1, len(models)))
        model.fit(x=X_train, y=y_train, validation_data=(X_test, y_test), epochs=50, verbose=1, batch_size=64)
        history.append(model)


Fitting 1/9 models...
Epoch 1/50
...
Epoch 50/50
...
Fitting 9/9 models...
Epoch 1/50
...
Epoch 50/50


In [7]:
l = []
for i in learning_rates:
    for j in layers:
        l.append("LR {} |Layers: {} | Units {}, for layers 1 till {} respectively.".format(i, j, units[:j], j))

for i, result in enumerate(history):
    print("Model {} | {}:\nFinal training and validation accuracies \n\tTraining accuracy: {}\n\tValidation accuracy: {}\n".format(i+1, l[i], result.history.history['accuracy'][-1], result.history.history['val_accuracy'][-1]))

Model 1 | LR 0.01 |Layers: 1 | Units [1024], for layers 1 till 1 respectively.:
Final training and validation accuracies 
	Training accuracy: 1.0
	Validation accuracy: 0.8071895241737366

Model 2 | LR 0.01 |Layers: 3 | Units [1024, 512, 256], for layers 1 till 3 respectively.:
Final training and validation accuracies 
	Training accuracy: 0.9629360437393188
	Validation accuracy: 0.7734204530715942

Model 3 | LR 0.01 |Layers: 5 | Units [1024, 512, 256, 128, 64], for layers 1 till 5 respectively.:
Final training and validation accuracies 
	Training accuracy: 1.0
	Validation accuracy: 0.7908496856689453

Model 4 | LR 0.005 |Layers: 1 | Units [1024], for layers 1 till 1 respectively.:
Final training and validation accuracies 
	Training accuracy: 1.0
	Validation accuracy: 0.8082788586616516

Model 5 | LR 0.005 |Layers: 3 | Units [1024, 512, 256], for layers 1 till 3 respectively.:
Final training and validation accuracies 
	Training accuracy: 1.0
	Validation accuracy: 0.7941176295280457

[output truncated]

## Model 2: VGG with trainable block 5

Build

In [7]:
# unfreezing last 4 layers (block 5)
with tf.device('/device:CPU:0'):
    base_model2 = VGG16(weights="imagenet",  include_top=False, input_shape=(img_width, img_height, 3))
    for layer in base_model2.layers[:-4]:
        layer.trainable = False
    

# setting up models for training
    learning_rates, layers, units = [0.001, 0.0005], [1, 2], [1024, 256] 
    models2 = []
    for i in learning_rates:
        for j in layers:
            models2.append(generic_model(base_model=base_model2, n_neurons=units[:j], dense_layers=j, lr=i))

Train

In [7]:
history2 = []
with tf.device('/device:GPU:0'):  # training on local machine
    for i, model in enumerate(models2):
        print("Fitting {}/{} models...".format(i+1, len(models2)))
        model.fit(x=X_train, y=y_train, validation_data=(X_test, y_test), epochs=50, verbose=1, batch_size=64)
        history2.append(model)


Fitting 1/4 models...
Epoch 1/50
...
Epoch 50/50
...
Fitting 4/4 models...
Epoch 1/50
...
Epoch 47/50
[output truncated]

In [9]:
l2 = []
for i in learning_rates:
    for j in layers:
        l2.append("LR {} |Layers: {} | Units {}, for layers 1 till {} respectively.".format(i, j, units[:j], j))

# report the final training and validation accuracy of each model
for i, result in enumerate(history2):
    print("Model {} | {}:\nFinal training and validation accuracies \n\tTraining accuracy: {}\n\tValidation accuracy: {}\n".format(i+1, l2[i], result.history.history['accuracy'][-1], result.history.history['val_accuracy'][-1]))

Model 1 | LR 0.001 |Layers: 1 | Units [1024], for layers 1 till 1 respectively.:
Final training and validation accuracies 
	Training accuracy: 0.24454942345619202
	Validation accuracy: 0.2450980395078659

Model 2 | LR 0.001 |Layers: 2 | Units [1024, 256], for layers 1 till 2 respectively.:
Final training and validation accuracies 
	Training accuracy: 0.24454942345619202
	Validation accuracy: 0.2450980395078659

Model 3 | LR 0.0005 |Layers: 1 | Units [1024], for layers 1 till 1 respectively.:
Final training and validation accuracies 
	Training accuracy: 0.24454942345619202
	Validation accuracy: 0.2450980395078659

Model 4 | LR 0.0005 |Layers: 2 | Units [1024, 256], for layers 1 till 2 respectively.:
Final training and validation accuracies 
	Training accuracy: 0.24454942345619202
	Validation accuracy: 0.2450980395078659



As expected, the small dataset seems to be insufficient for block 5 of VGG16 to generalize, which is why the accuracy is so poor.

## Model 3: VGG 16 with all layers trainable

Build

In [8]:
# unfreezing all layers
with tf.device('/device:CPU:0'):

    base_model3 = VGG16(weights="imagenet",  include_top=False, input_shape=(img_width, img_height, 3))
    for layer in base_model3.layers:
        layer.trainable = True
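    # setting up models for training
    # note: units has only two entries, so units[:3] is [1024, 512] and the j = 3 case builds two dense layers
    learning_rates, layers, units = [0.001], [1, 3], [1024, 512]  # taking optimal # of layers and learning rates   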
    models3 = []
    for i in learning_rates:
        for j in layers:
            models3.append(generic_model(base_model=base_model3, n_neurons=units[:j], dense_layers=j, lr=i))

Train

In [7]:
history3 = []
with tf.device('/device:CPU:0'): # training on local machine
    for i, model in enumerate(models3):
        print("Fitting {}/{} models...".format(i+1, len(models3)))
        with tf.device('/device:GPU:0'):
            model.fit(x=X_train, y=y_train, validation_data=(X_test, y_test), epochs=50, verbose=1, batch_size=32)
        history3.append(model)

Fitting 1/2 models...
Epoch 1/50
...
Epoch 50/50
Fitting 2/2 models...
Epoch 1/50
...
Epoch 50/50


In [8]:
l3 = []
for i in learning_rates:
    for j in layers:
        l3.append("LR {} |Layers: {} | Units {}, for layers 1 till {} respectively.".format(i, j, units[:j], j))
        
for i, result in enumerate(history3):
    print("Model {} | {}:\nFinal training and validation accuracies \n\tTraining accuracy: {}\n\tValidation accuracy: {}\n".format(i+1, l3[i], result.history.history['accuracy'][-1], result.history.history['val_accuracy'][-1]))

Model 1 | LR 0.001 |Layers: 1 | Units [1024], for layers 1 till 1 respectively.:
Final training and validation accuracies 
	Training accuracy: 0.24454942345619202
	Validation accuracy: 0.2450980395078659

Model 2 | LR 0.001 |Layers: 3 | Units [1024, 512], for layers 1 till 3 respectively.:
Final training and validation accuracies 
	Training accuracy: 0.24454942345619202
	Validation accuracy: 0.2450980395078659



What was said about Model 2 applies even more here, given there are far more layers and weights, and thus more parameters to train on a limited-size dataset.
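
To put numbers on this, the parameter counts can be worked out by hand. The sketch below counts weights and biases for the VGG16 convolutional blocks and for a dense head on top of the flattened 4 x 4 x 512 feature map that a 150 x 150 input produces. The `[1024, 512]` head is the one used for cross validation in this notebook; the helper names are illustrative.

```python
# (in_channels, out_channels) for each 3x3 convolution in VGG16, by block
VGG16_BLOCKS = {
    "block1": [(3, 64), (64, 64)],
    "block2": [(64, 128), (128, 128)],
    "block3": [(128, 256), (256, 256), (256, 256)],
    "block4": [(256, 512), (512, 512), (512, 512)],
    "block5": [(512, 512), (512, 512), (512, 512)],
}


def conv_params(c_in, c_out, k=3):
    """Weights plus biases of a k x k convolution."""
    return k * k * c_in * c_out + c_out


def dense_head_params(n_inputs, hidden_units, n_classes=5):
    """Weights plus biases of a Dense stack ending in n_classes logits."""
    total, prev = 0, n_inputs
    for n in list(hidden_units) + [n_classes]:
        total += prev * n + n
        prev = n
    return total


block_params = {
    name: sum(conv_params(i, o) for i, o in convs)
    for name, convs in VGG16_BLOCKS.items()
}
base_total = sum(block_params.values())

# Five 2x2 max-pools shrink 150 -> 75 -> 37 -> 18 -> 9 -> 4
side = 150
for _ in range(5):
    side //= 2
head = dense_head_params(side * side * 512, [1024, 512])

print("Model 1 trainable (head only):    ", head)
print("Model 2 trainable (head + block5):", head + block_params["block5"])
print("Model 3 trainable (head + all):   ", head + base_total)
```

Under these assumptions the dense head alone accounts for roughly 8.9 million trainable parameters, unfreezing block 5 adds about 7.1 million, and unfreezing all of VGG16 adds about 14.7 million.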

## Cross validating models

**Conclusion drawn from CV results**:
1. The number of splits was set to 5 due to GPU memory limitations when using TensorFlow's fit method, which stores model parameters indefinitely
2. Model 1, being VGG16 with all layers frozen plus the dense layer network, yielded a CV score comparable to the scores obtained during training while tuning
3. Models 2 and 3, being VGG16 with only the 5th block unfrozen and with all layers unfrozen respectively, yielded results similar to those from the hyperparameter tuning and training.
4. The poor results of Models 2 and 3 versus Model 1 appear to be due to a very small dataset, which did not scale well with the excessive number of parameters of VGG16

In [9]:
# This function cross validates a given model based on nsplits (i.e. nsplits = kfolds), 
# hence if nsplits = 5, 5 CV "fits" will occur rendering a validation 
# size of 0.2 and a train size of 0.8

from sklearn.model_selection import KFold

def cv(model, model_name, nsplits, X_tr, X_te, y_tr, y_te, verbose=True):
    inputs = np.concatenate((X_tr, X_te), axis=0)
    targets = np.concatenate((y_tr, y_te), axis=0)

    # Define the K-fold Cross Validator
    kfold = KFold(n_splits=nsplits, shuffle=True)

    # K-fold Cross Validation model evaluation

    fold_no = 1
    acc_per_fold = []
    loss_per_fold = []
    init_weights = model.get_weights()

    for train, test in kfold.split(inputs, targets):
         
        print('------------------------------------------------------------------------')
        print(f'Training for fold {fold_no} ...')

        # Fit data to model
        history = model.fit(inputs[train], targets[train],
                  batch_size=32,
                  epochs=20,
                  verbose=verbose)

        # Generate generalization metrics
        scores = model.evaluate(inputs[test], targets[test], verbose=verbose)
        print(f'Score for fold {fold_no}: {model.metrics_names[0]} of {scores[0]}; {model.metrics_names[1]} of {scores[1]*100}%')
        acc_per_fold.append(scores[1])
        loss_per_fold.append(scores[0])

        # Increase fold number
        fold_no = fold_no + 1
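        # reset to the initial weights so every fold starts from the same state
        # (reusing one model instead of rebuilding it also avoids the OOM problem)
        model.set_weights(init_weights)

    mean_acc = np.mean(acc_per_fold)
    print("Average CV accuracy for model {} is: {}".format(model_name, mean_acc))
    return mean_acc, acc_per_fold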
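
The relation between `nsplits` and the fold sizes can be sketched without any library. The helper below, `kfold_indices`, is a hypothetical stand-in for what `KFold(shuffle=True)` does conceptually, not scikit-learn's implementation; the sample count of 3,670 is an assumption matching the public `flower_photos` dataset.

```python
import random


def kfold_indices(n_samples, n_splits, seed=0):
    """Yield (train_idx, val_idx) pairs; every sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread any remainder over the first folds so sizes differ by at most one
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


folds = list(kfold_indices(n_samples=3670, n_splits=5))
for fold_no, (train_idx, val_idx) in enumerate(folds, start=1):
    print(f"fold {fold_no}: train={len(train_idx)}, validation={len(val_idx)}")
```

With five splits each fold validates on one fifth of the samples and trains on the remaining four fifths, which is the 0.2 / 0.8 division described in the comment at the top of the cell.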

In [10]:
# the tuning above suggests an optimal # of layers between 1 and 3, and a large number of units
# a lower learning rate is also better and should be between 1e-3 and 5e-4; the former was chosen for CV purposes

with tf.device('/device:CPU:0'):
    model1 = generic_model(base_model=base_model, n_neurons= [1024, 512], dense_layers= 2, lr= 0.001)
    model2 = generic_model(base_model=base_model2, n_neurons= [1024, 512], dense_layers= 2, lr= 0.001)
    model3 = generic_model(base_model=base_model3, n_neurons= [1024, 512], dense_layers= 2, lr= 0.001)

In [12]:
cv1 = cv(model=model1, model_name= "Model 1", nsplits= 5, X_tr=X_train, X_te=X_test, y_tr=y_train, y_te=y_test, verbose=False)
cv1[0]

------------------------------------------------------------------------
Training for fold 1 ...
Score for fold 1: loss of 1.151639699935913; accuracy of 81.74387216567993%
------------------------------------------------------------------------
Training for fold 2 ...
Score for fold 2: loss of 1.2443650960922241; accuracy of 80.6539535522461%
------------------------------------------------------------------------
Training for fold 3 ...
Score for fold 3: loss of 1.334916353225708; accuracy of 78.88283133506775%
------------------------------------------------------------------------
Training for fold 4 ...
Score for fold 4: loss of 1.2733772993087769; accuracy of 80.6539535522461%
------------------------------------------------------------------------
Training for fold 5 ...
Score for fold 5: loss of 1.0185725688934326; accuracy of 83.37874412536621%
Average CV accuracy for model Model 1 is: 0.8106267094612122


In [14]:
# CV for model 2
cv2 = cv(model=model2, model_name= "Model 2", nsplits= 5, X_tr=X_train, X_te=X_test, y_tr=y_train, y_te=y_test, verbose=False)
cv2[0]

------------------------------------------------------------------------
Training for fold 1 ...
Score for fold 1: loss of 1.595876932144165; accuracy of 24.931880831718445%
------------------------------------------------------------------------
Training for fold 2 ...
Score for fold 2: loss of 1.6069499254226685; accuracy of 23.433242738246918%
------------------------------------------------------------------------
Training for fold 3 ...
Score for fold 3: loss of 1.5993417501449585; accuracy of 25.340598821640015%
------------------------------------------------------------------------
Training for fold 4 ...
Score for fold 4: loss of 1.6019450426101685; accuracy of 23.841962218284607%
------------------------------------------------------------------------
Training for fold 5 ...
Score for fold 5: loss of 1.6002917289733887; accuracy of 24.795641005039215%
Average CV accuracy for model Model 2 is: 0.2446866512298584


In [11]:
cv3 = cv(model=model3, model_name= "Model 3", nsplits= 5, X_tr=X_train, X_te=X_test, y_tr=y_train, y_te=y_test, verbose=False)
cv3[0]

------------------------------------------------------------------------
Training for fold 1 ...
Score for fold 1: loss of 1.5983572006225586; accuracy of 24.250681698322296%
------------------------------------------------------------------------
Training for fold 2 ...
Score for fold 2: loss of 1.6040875911712646; accuracy of 23.705722391605377%
------------------------------------------------------------------------
Training for fold 3 ...
Score for fold 3: loss of 1.601602554321289; accuracy of 25.476840138435364%
------------------------------------------------------------------------
Training for fold 4 ...
Score for fold 4: loss of 1.6005195379257202; accuracy of 23.433242738246918%
------------------------------------------------------------------------
Training for fold 5 ...
Score for fold 5: loss of 1.6002269983291626; accuracy of 25.476840138435364%
Average CV accuracy for model Model 3 is: 0.24468665421009064
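
One way to read the Model 2 and Model 3 scores is to compare them with a trivial baseline for a five-class problem: a model that assigns uniform probabilities to every class. The sketch below computes the cross-entropy of such a prediction and compares it with the fold losses reported above; the tolerance of 0.02 is an arbitrary choice for illustration.

```python
import math

n_classes = 5

# Cross-entropy of a model that assigns probability 1/5 to every class
uniform_loss = -math.log(1.0 / n_classes)

# Fold losses copied from the Model 2 and Model 3 cross validation output
model2_losses = [1.595876932144165, 1.6069499254226685, 1.5993417501449585,
                 1.6019450426101685, 1.6002917289733887]
model3_losses = [1.5983572006225586, 1.6040875911712646, 1.601602554321289,
                 1.6005195379257202, 1.6002269983291626]

results = {}
for name, losses in (("Model 2", model2_losses), ("Model 3", model3_losses)):
    mean_loss = sum(losses) / len(losses)
    near_chance = abs(mean_loss - uniform_loss) < 0.02
    results[name] = (mean_loss, near_chance)
    print(f"{name}: mean loss {mean_loss:.4f}, "
          f"uniform baseline {uniform_loss:.4f}, near chance: {near_chance}")
```

Losses this close to ln 5, together with accuracies of about 24.5% that are identical across every configuration, would be consistent with the network emitting an almost constant prediction. This is offered as a diagnostic reading rather than a confirmed cause.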
