# Project: 'Fruit Image Classification using Transfer Learning' 
## (Kusum Panchal and Arun Kumar Ramakrishnan)

We will train a model that recognizes fresh and rotten fruit, using a combination of transfer learning, data augmentation, and fine-tuning.

## The Dataset

The dataset comes from [Kaggle](https://www.kaggle.com/sriramr/fruits-fresh-and-rotten-for-classification), a great place to go if you're interested in starting a project after this class. The data is stored in the `data/fruits` folder. There are 6 categories of fruit: fresh apples, fresh oranges, fresh bananas, rotten apples, rotten oranges, and rotten bananas. Our model therefore needs an output layer of 6 neurons, one per category. We'll compile the model with `categorical_crossentropy`, as we have more than two categories.
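
To make the six-way setup concrete, here is a minimal sketch of what one-hot labels look like for this problem. The class names below are hypothetical placeholders, not the actual folder names in the dataset; the point is only that each label is a length-6 vector with a single 1, which is the label format `categorical_crossentropy` works with.

```python
# Hypothetical class names; the real names come from the dataset folders.
CLASS_NAMES = [
    "fresh_apples", "fresh_bananas", "fresh_oranges",
    "rotten_apples", "rotten_bananas", "rotten_oranges",
]

def one_hot(class_name, class_names=CLASS_NAMES):
    """Return a one-hot list with 1.0 at the index of class_name."""
    vector = [0.0] * len(class_names)
    vector[class_names.index(class_name)] = 1.0
    return vector

label = one_hot("rotten_apples")
print(label)  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
```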

## Load ImageNet Base Model



We will use transfer learning here. We start with a VGG16 model pretrained on ImageNet: we load it with the ImageNet weights, set an input shape, and drop the top (classification) layers with `include_top=False`. Each image has three dimensions: height, width, and number of channels. Because these pictures are in color, there are three channels for red, green, and blue, so the input shape is `(224, 224, 3)`.

In [None]:
from tensorflow import keras

base_model = keras.applications.VGG16(
    weights='imagenet',   
    input_shape=(224, 224, 3),
    include_top=False)

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
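
As a rough check on what the headless base model produces, the sketch below works out its output shape by hand. It relies on the standard VGG16 layout (size-preserving 3x3 convolutions, five 2x2 max-pooling layers, 512 filters in the last block); the helper name `vgg16_feature_map_shape` is made up for this illustration.

```python
def vgg16_feature_map_shape(height, width):
    """Output shape of VGG16 without its top for a given input size.

    The 3x3 convolutions use 'same' padding, so only the five 2x2
    max-pooling layers change the spatial size; each one halves it
    (rounding down). The last convolutional block has 512 filters.
    """
    for _ in range(5):
        height //= 2
        width //= 2
    return (height, width, 512)

print(vgg16_feature_map_shape(224, 224))  # (7, 7, 512)
```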


## Freeze Base Model

We will freeze the base model so that what it learned from the ImageNet dataset is not destroyed during the initial training.

In [None]:
# Freeze base model
base_model.trainable = False

## Add Layers to Model

Now it's time to add layers to the pretrained model. We need to pay close attention to the last dense layer and make sure it has the correct number of neurons to classify the different types of fruit.

In [None]:
# Create inputs with correct shape
inputs = keras.Input(shape=(224, 224, 3))

x = base_model(inputs, training=False)

# Add pooling layer or flatten layer
x = keras.layers.GlobalAveragePooling2D()(x)

# Add final dense layer
outputs = keras.layers.Dense(6, activation = 'softmax')(x)  # one neuron per fruit category

# Combine inputs and outputs to create model
model = keras.Model(inputs,outputs)
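
To illustrate what the pooling layer does to the base model's output, here is a small NumPy sketch. The random array is only a stand-in for real VGG16 features, and the `(7, 7, 512)` shape assumes the 224x224 input used above.

```python
import numpy as np

# Stand-in for the base model output: a batch of 2 feature maps.
rng = np.random.default_rng(0)
features = rng.normal(size=(2, 7, 7, 512))

# Global average pooling: average over the two spatial axes,
# keeping the batch and channel axes.
pooled = features.mean(axis=(1, 2))

print(features.shape)  # (2, 7, 7, 512)
print(pooled.shape)    # (2, 512)
```

Each image is reduced to a single 512-value vector, which is what the final dense layer receives.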

In [None]:
model.summary()
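
The parameter counts reported by the summary can be checked by hand. The sketch below assumes the standard VGG16 convolutional layout and a six-unit dense head on top of the 512 pooled features; with the base model frozen, only the head's parameters should be trainable.

```python
# (input_channels, output_channels) for the 13 convolutional layers of VGG16.
VGG16_CONVS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

def conv_params(c_in, c_out, kernel=3):
    """Weights plus biases of one kernel x kernel convolution layer."""
    return kernel * kernel * c_in * c_out + c_out

base_params = sum(conv_params(c_in, c_out) for c_in, c_out in VGG16_CONVS)
head_params = 512 * 6 + 6  # dense weights plus biases for 6 outputs

print(base_params)                # 14714688 (frozen, non-trainable)
print(head_params)                # 3078 (trainable)
print(base_params + head_params)  # 14717766
```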

## Compile Model

Now we will compile the model with loss and metrics options. Because we are classifying into several categories rather than solving a binary problem, we use categorical cross-entropy as the loss.

In [None]:
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate = .00001),  # Very low learning rate
              loss=keras.losses.CategoricalCrossentropy(),
              metrics=['accuracy'])
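
For intuition about the loss, the following plain-Python sketch computes softmax and categorical cross-entropy for a single example. The logits are made-up numbers, and the functions are simplified textbook versions, not the Keras implementation.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]      # one-hot label for class 0
logits = [2.0, 0.5, 0.1, -1.0, 0.0, 0.3]     # hypothetical model scores
probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)

# A model that guesses uniformly over 6 classes scores ln(6), about 1.79.
uniform_loss = categorical_crossentropy(y_true, [1 / 6] * 6)
print(loss, uniform_loss)
```

A loss well below ln(6) on the validation set is a sign that the model is doing better than guessing.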

## Augment the Data

Next, we augment the data so that the model sees slightly varied versions of each image during training. Reference: [Keras ImageDataGenerator class](https://keras.io/api/preprocessing/image/#imagedatagenerator-class).

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
        samplewise_center=True,  # set each sample mean to 0
        rotation_range=10,  # randomly rotate images in the range (degrees, 0 to 180)
        zoom_range = 0.1, # Randomly zoom image 
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images
        vertical_flip=False) # do not flip images vertically
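
To show what some of these options amount to, here is a NumPy sketch applied to a random stand-in image. It imitates `samplewise_center` and `horizontal_flip` by hand and converts `width_shift_range` to pixels; it is an illustration of the ideas, not the generator's actual code path.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.uniform(0, 255, size=(224, 224, 3))  # stand-in RGB image

# samplewise_center: subtract this image's own mean so its values average to zero.
centered = image - image.mean()

# horizontal_flip: reverse the width axis (axis 1 in height, width, channels layout).
flipped = centered[:, ::-1, :]

# width_shift_range=0.1 allows shifts of up to 10% of the image width.
max_shift_pixels = 0.1 * image.shape[1]

print(abs(centered.mean()) < 1e-9)  # True
print(round(max_shift_pixels, 1))   # 22.4
```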

## Load Dataset

Now we will load the train and validation datasets. It's important to pick the right folders, as well as the right `target_size` of the images (it needs to match the height and width input of the model we've created). 

In [None]:
# load and iterate training dataset
train_it = datagen.flow_from_directory('data/fruits/train/', 
                                       target_size=(224, 224), 
                                       color_mode='rgb', 
                                       class_mode='categorical', 
                                       batch_size=8)
# load and iterate validation dataset
valid_it = datagen.flow_from_directory('data/fruits/valid/', 
                                      target_size=(224, 224), 
                                      color_mode='rgb', 
                                      class_mode='categorical', 
                                      batch_size=8)
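
`flow_from_directory` treats each subdirectory as one class and assigns label indices in alphanumeric order of the folder names. The sketch below mimics that ordering with hypothetical folder names created in a temporary directory; the real mapping can be read from `train_it.class_indices`.

```python
import os
import tempfile

# Hypothetical folder names, deliberately listed out of order.
folders = ["rotten_oranges", "fresh_apples", "rotten_apples",
           "fresh_bananas", "rotten_bananas", "fresh_oranges"]

with tempfile.TemporaryDirectory() as root:
    for name in folders:
        os.makedirs(os.path.join(root, name))
    # One class per subdirectory, sorted by name.
    class_names = sorted(
        entry for entry in os.listdir(root)
        if os.path.isdir(os.path.join(root, entry))
    )

class_indices = {name: index for index, name in enumerate(class_names)}
print(class_indices)
```

Knowing this mapping is what lets you translate a predicted index back into a fruit category.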

## Train the Model

Now we will pass the `train_it` and `valid_it` iterators into the `fit` function and set the desired number of epochs.

In [None]:
model.fit(train_it,
          validation_data=valid_it,
          steps_per_epoch=len(train_it),   # number of batches per epoch (a whole number)
          validation_steps=len(valid_it),  # number of validation batches
          epochs=10)
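
The number of steps per epoch is the number of batches needed to see every sample once. A small sketch of that arithmetic follows; the sample counts are hypothetical, since the real ones come from `train_it.samples`.

```python
import math

def steps_for(num_samples, batch_size):
    """Whole number of batches needed to cover num_samples once."""
    return math.ceil(num_samples / batch_size)

# Hypothetical sample counts with the batch size of 8 used above.
print(steps_for(1000, 8))  # 125
print(steps_for(1003, 8))  # 126 (the last batch holds only 3 images)
```

Rounding up ensures the final, partially filled batch is not skipped.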

## Unfreeze Model for Fine Tuning

We can now fine tune the model further with a very low learning rate. For that, we first need to unfreeze the model.

In [None]:
# Unfreeze the base model
base_model.trainable = True

# Compile the model with a low learning rate
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate = .00001),
              loss=keras.losses.CategoricalCrossentropy(),
              metrics=['accuracy'])

In [None]:
model.fit(train_it,
          validation_data=valid_it,
          steps_per_epoch=len(train_it),   # number of batches per epoch (a whole number)
          validation_steps=len(valid_it),  # number of validation batches
          epochs=1)
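
To see why the learning rate matters so much during fine-tuning, here is a simplified, single-weight version of an RMSprop update. It is a textbook sketch with assumed defaults (`rho=0.9`, `eps=1e-7`) and a made-up gradient, not the Keras optimizer itself.

```python
import math

def rmsprop_step(grad, lr, avg_sq=0.0, rho=0.9, eps=1e-7):
    """One simplified RMSprop update for a single weight.

    Returns (weight_change, new running average of squared gradients).
    """
    avg_sq = rho * avg_sq + (1.0 - rho) * grad * grad
    change = -lr * grad / (math.sqrt(avg_sq) + eps)
    return change, avg_sq

grad = 0.5  # hypothetical gradient value
small, _ = rmsprop_step(grad, lr=1e-5)
large, _ = rmsprop_step(grad, lr=1e-3)

print(small)  # a change on the order of 1e-5
print(large)  # 100 times larger
```

With the smaller rate, each update nudges the pretrained weights only slightly, which is the point of using it once the base model is unfrozen.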

## Evaluate the Model

The `evaluate` function returns a list in which the first value is the loss and the second value is the accuracy.

In [None]:
model.evaluate(valid_it, steps=len(valid_it))  # returns [loss, accuracy]
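
For reference, the accuracy value counts how often the highest-probability class matches the true class. The sketch below computes it by hand on made-up labels and predictions; it illustrates the idea and is not the Keras metric implementation.

```python
def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted argmax matches the label argmax."""
    def argmax(values):
        return max(range(len(values)), key=values.__getitem__)
    hits = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

# Hypothetical one-hot labels and softmax outputs for four images.
y_true = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
y_pred = [
    [0.70, 0.10, 0.05, 0.05, 0.05, 0.05],  # correct
    [0.10, 0.10, 0.10, 0.50, 0.10, 0.10],  # correct
    [0.40, 0.30, 0.10, 0.10, 0.05, 0.05],  # wrong: predicts class 0
    [0.05, 0.05, 0.10, 0.10, 0.10, 0.60],  # correct
]

print(categorical_accuracy(y_true, y_pred))  # 0.75
```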