## Using transfer learning on a pre-trained classifier to build an Alpaca Classifier

This notebook uses MobileNetV2, which has been pre-trained on ImageNet. The full ImageNet dataset contains over 14 million images; the pre-trained Keras weights come from its 1000-class subset.

## 1. Packages

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import os
import tensorflow as tf
import tensorflow.keras.layers as tfl

from tensorflow.keras.preprocessing import image_dataset_from_directory
from tensorflow.keras.layers.experimental.preprocessing import RandomFlip, RandomRotation

In [2]:
#Generate dataset from images in the dataset folder

BATCH_SIZE = 32
IMG_SIZE = (160, 160)
directory = "dataset/"
# Create training and validation set. Use the same seed to avoid image overlap
train_dataset = image_dataset_from_directory(directory,
                                             shuffle=True,
                                             batch_size=BATCH_SIZE,
                                             image_size=IMG_SIZE,
                                             validation_split=0.2,
                                             subset='training',
                                             seed=42)
validation_dataset = image_dataset_from_directory(directory,
                                             shuffle=True,
                                             batch_size=BATCH_SIZE,
                                             image_size=IMG_SIZE,
                                             validation_split=0.2,
                                             subset='validation',
                                             seed=42)

Found 327 files belonging to 2 classes.
Using 262 files for training.
Found 327 files belonging to 2 classes.
Using 65 files for validation.
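
The split sizes printed above follow from simple arithmetic on validation_split. The sketch below reproduces them, assuming the validation count is the product of validation_split and the file count truncated to an integer (an assumption about the library's rounding, not something the notebook states):

```python
import math

# Reproduce the split sizes printed above.
# Assumption: validation count = int(validation_split * number of files).
total_files = 327
validation_split = 0.2
batch_size = 32

num_validation = int(validation_split * total_files)
num_training = total_files - num_validation

# Number of training batches per epoch; the last batch is only partly full.
num_training_batches = math.ceil(num_training / batch_size)

print(num_training, num_validation, num_training_batches)
```

With a dataset this small, each epoch is only a handful of batches, which is one reason augmentation and a frozen pre-trained base are used later on.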


In [3]:
#view class names
class_names = train_dataset.class_names
class_names

['alpaca', 'not alpaca']

## Pre-fetch data and Pre-process


Using prefetch() helps avoid an input bottleneck that can occur when reading from disk: while the model works on the current batch, the input pipeline prepares the next ones in the background and keeps them ready for when they're needed. It fits the usual tf.data pattern of creating a source dataset from the input data, applying transformations to preprocess it, then iterating over the dataset one element at a time.

In [4]:
AUTOTUNE = tf.data.experimental.AUTOTUNE
train_dataset = train_dataset.prefetch(buffer_size=AUTOTUNE) #choose number of elements to pre-fetch automatically
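
As a small illustration of the point above, the sketch below applies prefetch() to a toy dataset of integers (a stand-in for the image dataset, not part of the original notebook) and checks that prefetching changes when batches are prepared, not what they contain:

```python
import tensorflow as tf

AUTOTUNE = tf.data.experimental.AUTOTUNE

# Toy stand-in for the image dataset: the integers 0..9 in batches of 4.
toy_dataset = tf.data.Dataset.range(10).batch(4)
prefetched = toy_dataset.prefetch(buffer_size=AUTOTUNE)

# Iterate over both versions and collect the batches as plain lists.
plain_batches = [batch.numpy().tolist() for batch in toy_dataset]
prefetched_batches = [batch.numpy().tolist() for batch in prefetched]

print(prefetched_batches)
```

Because the contents are identical, prefetch() can be added at the end of a pipeline without changing the rest of the code.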

In [5]:
# augment the data on the fly to add variety to a small dataset

def data_augmenter():
    '''
    Create a Sequential model composed of 2 layers.
    layer 1 - random horizontal (left-right) flip
    layer 2 - random rotation
    Returns:
        tf.keras.Sequential
    '''

    data_augmentation = tf.keras.models.Sequential()
    data_augmentation.add(RandomFlip('horizontal'))
    data_augmentation.add(RandomRotation(0.2))

    return data_augmentation
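
To make the two augmentations concrete, here is a NumPy-only sketch of what they do to an image. It is an illustration, not the Keras implementation: the tiny `image` array is made up, and the rotation range relies on the Keras convention that the factor is a fraction of a full turn.

```python
import numpy as np

# A tiny 2x3 single-channel "image" with distinct pixel values.
image = np.arange(6).reshape(2, 3, 1)

# A horizontal flip reverses the width axis (left becomes right).
flipped = image[:, ::-1, :]

# RandomRotation(0.2): the factor is a fraction of a full turn,
# so angles are drawn from [-0.2 * 360, +0.2 * 360] degrees.
max_rotation_degrees = 0.2 * 360

print(flipped[..., 0])
print(max_rotation_degrees)
```

In the notebook the layers apply these transformations randomly, so the same image can look different each time it passes through the augmenter.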

When using a pre-trained model, it's best to preprocess the inputs the same way they were preprocessed when the model was originally trained. MobileNetV2 is already part of Keras and we can access it. Since it was trained on pixel values normalized to [-1, 1], it's best practice to reuse that standard with tf.keras.applications.mobilenet_v2.preprocess_input.

In [6]:
preprocess_input = tf.keras.applications.mobilenet_v2.preprocess_input
# The preprocess_input function adapts the images to the format the model requires.
# Some models expect pixel values ranging from 0 to 1, others from -1 to +1.
# We use the scaling that was used for the original MobileNetV2,
# which we can get from applications.mobilenet_v2.preprocess_input
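
For intuition, the scaling to [-1, 1] can be sketched in plain NumPy. The helper name `scale_to_minus_one_one` is hypothetical and only mirrors the x / 127.5 - 1 arithmetic; in the notebook itself the Keras function should be used.

```python
import numpy as np

def scale_to_minus_one_one(pixels):
    """Hypothetical helper: map pixel values from [0, 255] to [-1, 1]."""
    return np.asarray(pixels, dtype=np.float32) / 127.5 - 1.0

# Darkest, mid-grey and brightest pixel values.
scaled = scale_to_minus_one_one([0, 127.5, 255])
print(scaled)
```

Feeding the model raw 0 to 255 values instead would put the inputs far outside the range the pre-trained weights expect.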

## Base Model to Alpaca model

In the MobileNetV2 base model, the top layers perform the ImageNet classification. In order to convert it to an alpaca classifier, we exclude these top layers from our base model and then build our own classifier on top of what remains.

Training will be done as follows:
- Delete the top layer (the classification layer) by setting include_top to False when creating base_model
- Freeze the base model by setting base_model.trainable = False, so that its weights are not changed and only the new layer is trained
- Set training to False when calling base_model to avoid keeping track of statistics in the batch norm layers
- Add a new classifier layer; a single neuron is enough to solve a binary classification problem
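
The effect of freezing can be checked on a toy model. In this sketch a small stack of Dense layers stands in for the pre-trained base (a hypothetical stand-in chosen so nothing needs to be downloaded); the same pattern applies to MobileNetV2.

```python
import tensorflow as tf

# Stand-in for a pre-trained base model (hypothetical, tiny on purpose).
base = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(4, activation="relu"),
])
base.trainable = False  # freeze every weight in the base

inputs = tf.keras.Input(shape=(16,))
x = base(inputs, training=False)        # run the base in inference mode
outputs = tf.keras.layers.Dense(1)(x)   # new single-neuron classifier
model = tf.keras.Model(inputs, outputs)

# Only the new layer's kernel and bias remain trainable.
n_trainable = len(model.trainable_weights)
n_frozen = len(model.non_trainable_weights)
print(n_trainable, n_frozen)
```

Counting the trainable weights like this is a quick way to confirm that only the new classifier layer will be updated during training.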

In [None]:
def alpaca_model(image_shape=IMG_SIZE, data_augmentation=data_augmenter()):
    ''' Define a tf.keras model for binary classification out of the MobileNetV2 model
    Arguments:
        image_shape -- Image width and height
        data_augmentation -- data augmentation function
    Returns:
        tf.keras.Model
    '''

    input_shape = image_shape + (3,) # append the colour-channel dimension: (160, 160) becomes (160, 160, 3)
    
   
    # This is the MobileNetV2 base model. Notice how the top layer is excluded
    base_model = tf.keras.applications.MobileNetV2(input_shape=input_shape,
                                                   include_top=False, # <== Important!!!!
                                                   weights='imagenet') # From imageNet
    
    # freeze the base model by making it non trainable
    base_model.trainable = False 

    # create the input layer (same shape as the input given to MobileNetV2)
    inputs = tf.keras.Input(shape=input_shape) 
    
    # apply data augmentation to the inputs
    x = data_augmentation(inputs)
    
    # preprocess the data with the same scaling the model was trained on
    x = preprocess_input(x)
    
    # set training to False to avoid keeping track of statistics in the batch norm layer
    x = base_model(x, training=False) 
    
    # ----------------- add the new binary classification layers -----------------
    # Use global average pooling to summarize the info in each channel.
    # Global average pooling was designed to replace the fully connected layers of classical CNNs:
    # instead of adding fully connected layers on top of the feature maps,
    # we take the average of each feature map and feed the resulting vector
    # directly into the prediction layer (here a single-neuron Dense layer rather than a softmax).
    
    x = tf.keras.layers.GlobalAveragePooling2D()(x) 
    
    # include dropout with probability of 0.2 to avoid overfitting
    x = tf.keras.layers.Dropout(0.2)(x)
        
    # use a prediction layer with one neuron (as a binary classifier only needs one);
    # with no activation, it outputs a raw logit rather than a probability
    outputs = tf.keras.layers.Dense(1)(x)
    
    
    model = tf.keras.Model(inputs, outputs)
    
    return model
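
The new classification head can be traced by hand on stand-in data. The NumPy sketch below assumes the base model returns 5x5 feature maps with 1280 channels for 160x160 inputs, uses random numbers in place of real features and weights, and leaves out dropout because it is inactive at inference time. It also assumes labels follow the order of class_names, so a sigmoid output above 0.5 means 'not alpaca'.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the base model output on a batch of 2 images.
features = rng.normal(size=(2, 5, 5, 1280))

# Global average pooling: one average per channel.
pooled = features.mean(axis=(1, 2))              # shape (2, 1280)

# Single-neuron dense layer with made-up weights (hypothetical values).
kernel = rng.normal(scale=0.01, size=(1280, 1))
bias = np.zeros(1)
logits = pooled @ kernel + bias                  # shape (2, 1)

# The model outputs logits; a sigmoid turns them into probabilities.
probabilities = 1.0 / (1.0 + np.exp(-logits))

class_names = ['alpaca', 'not alpaca']
predicted = [class_names[int(p > 0.5)] for p in probabilities[:, 0]]
print(pooled.shape, logits.shape, predicted)
```

Since the model ends in a logit, a loss used to train it would need to be told so, for example with from_logits=True on a binary cross-entropy loss.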