# Transfer Learning Model

## Setting Up

Because we are using transfer learning, a few steps were needed to get the pretrained model to work with our dataset.

- First
    - We load the model by calling hub.KerasLayer with a link to it in the TensorFlow Hub model zoo
- Second
    - Our images do not match the model's expected input size, so we resized them with padding to avoid distorting them
- Third
    - The model requires 3-channel images, so we converted ours with the grayscale_to_rgb function available in TensorFlow (tf.image.grayscale_to_rgb)
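The second and third steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual preprocessing code: the helper name `prepare_image` is hypothetical, and the 224x224 target size is an assumption that should be checked against the model's TensorFlow Hub page.

```python
import tensorflow as tf

# Assumed input size for the pretrained model; verify on the model page.
TARGET_SIZE = 224


def prepare_image(image):
    """Pad-resize a grayscale image and expand it to 3 channels.

    `image` is expected to be a tensor of shape (height, width, 1).
    """
    # Resize while keeping the aspect ratio; the remainder is zero-padded,
    # so the scan is not stretched or squashed.
    image = tf.image.resize_with_pad(image, TARGET_SIZE, TARGET_SIZE)
    # Repeat the single channel three times to get an RGB-shaped tensor.
    image = tf.image.grayscale_to_rgb(image)
    return image


# Example: a non-square dummy scan becomes a 224x224x3 tensor.
dummy_scan = tf.zeros((100, 150, 1))
prepared = prepare_image(dummy_scan)
print(prepared.shape)
```

Padding instead of plain resizing is what keeps the proportions of the scan intact, at the cost of some blank border pixels.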

## Initial Test

We used efficientnetv2-b0 (https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet1k_b0/feature_vector/2) as the base model for transfer learning and initially froze its pretrained layers, training only the new output layer.

### Original Setup

<img src="model_arc.png" >

### Model Summary

Initial Parameters:
- batch size: 120=8
- epochs: 10
- tf.keras.regularizers.l2(0.0001)
- tf.keras.losses.CategoricalCrossentropy(from_logits=True, label_smoothing=0.1)
- tf.keras.optimizers.SGD(learning_rate=0.005, momentum=0.9)

In [None]:
# Model: "sequential"
# _________________________________________________________________
# Layer (type)                 Output Shape              Param #   
# =================================================================
# keras_layer (KerasLayer)     (None, 1280)              5919312   
# _________________________________________________________________
# dropout (Dropout)            (None, 1280)              0         
# _________________________________________________________________
# dense (Dense)                (None, 4)                 5124      
# =================================================================
# Total params: 5,924,436
# Trainable params: 5,124
# Non-trainable params: 5,919,312
# _________________________________________________________________
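The summary above (feature extractor, dropout, 4-unit dense layer) together with the listed initial parameters could be assembled along these lines. This is a sketch under assumptions: the function name `build_classifier` is hypothetical, the dropout rate is not stated in the notes so 0.2 is a placeholder, and attaching the L2 regularizer to the dense layer is a guess at where it was applied.

```python
import tensorflow as tf


def build_classifier(feature_extractor, num_classes=4, dropout_rate=0.2):
    """Stack a new classification head on top of a feature extractor."""
    model = tf.keras.Sequential([
        feature_extractor,
        # Dropout rate is an assumed placeholder value.
        tf.keras.layers.Dropout(dropout_rate),
        # No activation: the loss below is configured with from_logits=True.
        tf.keras.layers.Dense(
            num_classes,
            kernel_regularizer=tf.keras.regularizers.l2(0.0001),
        ),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.005, momentum=0.9),
        loss=tf.keras.losses.CategoricalCrossentropy(
            from_logits=True, label_smoothing=0.1
        ),
        metrics=["accuracy"],
    )
    return model


# With TensorFlow Hub installed, the frozen extractor would be created
# roughly like this (not executed here because it downloads the model):
#   import tensorflow_hub as hub
#   extractor = hub.KerasLayer(MODEL_URL, trainable=False)
#   model = build_classifier(extractor)
```

A dense layer mapping 1280 features to 4 classes has 1280 * 4 + 4 = 5,124 weights, which matches the trainable parameter count in the summary when the extractor is frozen.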

In [None]:
# 144/144 [==============================] - 6s 35ms/step - loss: 0.9730 - accuracy: 0.6191
# Train loss: 0.9729554057121277
# Train accuracy: 0.619140625
# 17/17 [==============================] - 2s 47ms/step - loss: 1.0249 - accuracy: 0.5614
# validation loss: 1.02485990524292
# validation accuracy: 0.5614035129547119
# 40/40 [==============================] - 2s 45ms/step - loss: 1.0307 - accuracy: 0.5637
# Test loss: 1.0306953191757202
# Test accuracy: 0.5637216567993164

### Tests

Increasing the number of epochs did little for model accuracy: the model stopped learning after the 15th epoch. Lowering the batch size did not improve accuracy much either, so we next tried a different optimizer with the same setup.

Switching to the Adam optimizer increased test accuracy by about 3 percentage points, from 56.4% to 59.3%!

In [None]:
# 144/144 [==============================] - 6s 36ms/step - loss: 0.8980 - accuracy: 0.6793
# Train loss: 0.8980469107627869
# Train accuracy: 0.6792534589767456
# 17/17 [==============================] - 1s 34ms/step - loss: 0.9113 - accuracy: 0.6784
# validation loss: 0.9113450646400452
# validation accuracy: 0.6783625483512878
# 40/40 [==============================] - 1s 36ms/step - loss: 1.0105 - accuracy: 0.5934
# Test loss: 1.0104659795761108
# Test accuracy: 0.5934323668479919
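The optimizer swap described above amounts to recompiling the same model with Adam in place of SGD. The sketch below assumes the loss stays as listed in the initial parameters; the function name `compile_with_adam` is hypothetical, and the learning rate is not given in the notes, so the Keras default of 0.001 is used as a placeholder.

```python
import tensorflow as tf


def compile_with_adam(model, learning_rate=1e-3):
    """Recompile an existing model with Adam instead of SGD."""
    model.compile(
        # Learning rate is an assumed value (the Keras default).
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss=tf.keras.losses.CategoricalCrossentropy(
            from_logits=True, label_smoothing=0.1
        ),
        metrics=["accuracy"],
    )
    return model
```

Recompiling changes only the training configuration; the architecture and any weights learned so far are left untouched.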

The network setup is the same as before, but the batch size and epochs are now 20 and 64. We then increased the epochs to an extreme 100. Test accuracy rose by about 3 percentage points to 62%, but we do not think 100 epochs are necessary: through testing, we found that 60 epochs are enough to reach this accuracy.

We decided to add a data generator to extend the current dataset by randomly augmenting the images within each batch. Adding the data generator initially lowered accuracy, so we decided to tweak it.

Before tweaking the data generator, we unfroze the model, since until then every layer except the final output layer had been frozen. This increased test accuracy to 64.7%, but with much more overfitting. We then turned to the data generator to reduce it.

In [None]:
# 144/144 [==============================] - 7s 35ms/step - loss: 0.5382 - accuracy: 0.9010
# Train loss: 0.538211464881897
# Train accuracy: 0.9010416865348816
# 17/17 [==============================] - 3s 48ms/step - loss: 0.6544 - accuracy: 0.8460
# validation loss: 0.6543669104576111
# validation accuracy: 0.8460038900375366
# 40/40 [==============================] - 2s 45ms/step - loss: 1.0864 - accuracy: 0.6474
# Test loss: 1.0863642692565918
# Test accuracy: 0.6473807692527771
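Unfreezing, as described above, can be sketched as flipping the `trainable` flag on the pretrained layer and recompiling. This assumes the pretrained extractor is the first layer of the model, as in the summary; the function name `unfreeze_and_recompile` and the learning rate are illustrative assumptions, not values from these notes.

```python
import tensorflow as tf


def unfreeze_and_recompile(model, learning_rate=1e-4):
    """Make the pretrained first layer trainable and recompile the model."""
    # Assumption: model.layers[0] is the pretrained feature extractor.
    model.layers[0].trainable = True
    # Changes to `trainable` only take effect for training after compile().
    model.compile(
        # Assumed learning rate; small values are common for fine-tuning.
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss=tf.keras.losses.CategoricalCrossentropy(
            from_logits=True, label_smoothing=0.1
        ),
        metrics=["accuracy"],
    )
    return model
```

With every pretrained weight now trainable, the model has far more capacity to memorize a small training set, which is consistent with the wider gap between train and test accuracy shown above.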

We added the following to the data generator:

In [31]:
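import tensorflow as tf

# Augmentation applied on the fly to each training batch
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    # Randomly flip the images horizontally and vertically
    horizontal_flip=True,
    vertical_flip=True,
    # Fill pixels exposed by a transform with the nearest edge value
    fill_mode='nearest',
    # Randomly translate the images horizontally and vertically
    width_shift_range=0.1,
    height_shift_range=0.1,
    # Randomly apply shearing transformations
    shear_range=0.1,
    # Randomly zoom in and out on the images
    zoom_range=0.1,
)
# fit() only computes statistics for featurewise options, none of which are enabled here
train_datagen.fit(X_train)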

- Previously the generator had only horizontal_flip, vertical_flip, and fill_mode; we added everything after those

This produced a model with 67% accuracy. There is still a fair amount of overfitting, but that is to be expected with a dataset as small as the brain scans.

## Conclusion

After extensive testing, we determined the best model to be the original architecture from TensorFlow Hub, with the final output layer swapped for one matching our number of classes. We also found that Adam was a much better optimizer than the original SGD, and that using a data generator helped reduce overfitting and pushed the model to learn actual features. We ended with a model of 67% accuracy.