

# AAI612: Deep Learning & its Applications

*Notebook 4.3: Graded Assignment: Mini Project I*

<a href="https://colab.research.google.com/github/jgeitani/AAI612_Geitani/blob/main/Week4/JadGeitani_Notebook4.4.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Assessment

In this assessment, you will train a new model that is able to recognize fresh and rotten fruit. You will need to get the model to a validation accuracy of `92%` in order to pass the assessment, though we challenge you to do even better if you can. You will have to use the skills that you learned in the previous exercises. Specifically, we suggest using some combination of transfer learning, data augmentation, and fine-tuning.

## The Dataset

In this exercise, you will train a model to recognize fresh and rotten fruits. Download the dataset from [Kaggle](https://www.kaggle.com/sriramr/fruits-fresh-and-rotten-for-classification). The dataset structure is in the `data/fruits` folder. There are 6 categories of fruits: fresh apples, fresh oranges, fresh bananas, rotten apples, rotten oranges, and rotten bananas. This will mean that your model will require an output layer of 6 neurons to do the categorization successfully. You'll also need to compile the model with `categorical_crossentropy`, as we have more than two categories.
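To make the link between the six categories and `categorical_crossentropy` concrete, here is a minimal pure-Python sketch of one-hot labels and the loss for a single image. The class names and their order are hypothetical, and the two prediction vectors are made up for illustration.

```python
import math

# Hypothetical class names and order, for illustration only.
CLASSES = ["fresh_apples", "fresh_bananas", "fresh_oranges",
           "rotten_apples", "rotten_bananas", "rotten_oranges"]


def one_hot(index, num_classes):
    """Return a list with 1.0 at `index` and 0.0 elsewhere."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


y_true = one_hot(CLASSES.index("rotten_apples"), len(CLASSES))

# Two made-up softmax outputs: one confident and correct, one uniform.
confident = [0.02, 0.02, 0.02, 0.90, 0.02, 0.02]
uncertain = [1.0 / 6.0] * 6

loss_confident = categorical_crossentropy(y_true, confident)
loss_uncertain = categorical_crossentropy(y_true, uncertain)

print("one-hot label:", y_true)
print("loss (confident):", loss_confident)
print("loss (uniform):  ", loss_uncertain)
```

Only the probability assigned to the true class contributes to the loss, so a uniform guess over six classes costs `log(6)` while a confident correct prediction costs much less.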

![image.png](attachment:4c8c02c9-0cbe-4048-8d01-cdd5e3cf3fe6.png)<img src="./images/fruits.png" style="width: 600px;">

In [14]:
import kagglehub

# Download latest version
path = kagglehub.dataset_download("sriramr/fruits-fresh-and-rotten-for-classification")

print("Path to dataset files:", path)

Downloading from https://www.kaggle.com/api/v1/datasets/download/sriramr/fruits-fresh-and-rotten-for-classification?dataset_version_number=1...


100%|██████████| 3.58G/3.58G [00:35<00:00, 108MB/s]

Extracting files...





Path to dataset files: /root/.cache/kagglehub/datasets/sriramr/fruits-fresh-and-rotten-for-classification/versions/1


In [15]:
!ls /root/.cache/kagglehub/datasets/sriramr/fruits-fresh-and-rotten-for-classification/versions/1

dataset


In [16]:
import tensorflow as tf

# Check GPU availability
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
print(tf.config.list_physical_devices('GPU'))

# Check GPU details
!nvidia-smi


Num GPUs Available:  1
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
Fri Feb 14 10:41:56 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   75C    P0             31W /   70W |     230MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+-------

## Load ImageNet Base Model

Start with a model pretrained on `ImageNet`. Load the model with the correct weights, set an input shape, and choose to remove the last layers of the model. Remember that images have three dimensions: a height, a width, and a number of channels. Because these pictures are in color, there will be three channels for red, green, and blue. We've filled in the input shape for you. This cannot be changed or the assessment will fail. If you need a reference for setting up the pretrained model, please take a look at [Notebook 4.2](https://github.com/harmanani/AAI612/blob/main/Week4/Notebook%204.2.ipynb) where we implemented transfer learning.

In [17]:
import ssl
from tensorflow import keras

ssl._create_default_https_context = ssl._create_unverified_context


base_model = keras.applications.VGG16(
    weights='imagenet',
    input_shape=(224, 224, 3),
    include_top=False)

## Freeze Base Model

Next, we suggest freezing the base model. This is done so that all the learning from the ImageNet dataset does not get destroyed in the initial training.

In [18]:
# Freeze base model
base_model.trainable = False
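
To get a sense of how much is being frozen, the sketch below counts the weights in the convolutional part of VGG16 by hand. It assumes the standard VGG16 layout (13 convolutional layers with 3x3 kernels and biases); the helper name `conv_params` is made up for this example.

```python
# (input channels, output channels) for the 13 conv layers of standard VGG16.
VGG16_CONV_LAYERS = [
    (3, 64), (64, 64),                       # block 1
    (64, 128), (128, 128),                   # block 2
    (128, 256), (256, 256), (256, 256),      # block 3
    (256, 512), (512, 512), (512, 512),      # block 4
    (512, 512), (512, 512), (512, 512),      # block 5
]


def conv_params(in_channels, out_channels, kernel_size=3):
    """Weights plus biases of one 2D convolution layer."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


base_params = sum(conv_params(c_in, c_out) for c_in, c_out in VGG16_CONV_LAYERS)
print("Parameters frozen in the VGG16 base:", base_params)
```

With `base_model.trainable = False`, none of these weights are updated in the first training phase; only the layers added on top of the base are learned.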

## Add Layers to Model

Now it's time to add layers to the pretrained model. Pay close attention to the last dense layer and make sure it has the correct number of neurons to classify the different types of fruit.  You may add more layers than specified below.

In [19]:
# Create inputs with correct shape
inputs = keras.Input(shape=(224, 224, 3))

x = base_model(inputs, training=False)

# Add pooling layer or flatten layer
x = keras.layers.GlobalAveragePooling2D()(x)

# Add final dense layer
outputs = keras.layers.Dense(6, activation = 'softmax')(x)

# Combine inputs and outputs to create model
model = keras.Model(inputs, outputs)

In [20]:
model.summary()
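
As a rough cross-check of what `model.summary()` reports for the new layers, the following sketch works out the shapes and the size of the classification head by hand. It assumes the standard VGG16 base (five 2x2 max-pooling stages, 512 output channels); the variable names are illustrative.

```python
def pooled_size(size, num_pools=5):
    """Spatial size after `num_pools` stride-2 poolings (each halves the size)."""
    for _ in range(num_pools):
        size //= 2
    return size


height = width = pooled_size(224)        # 224 -> 112 -> 56 -> 28 -> 14 -> 7
channels = 512                           # output channels of the VGG16 base
feature_map_shape = (height, width, channels)

num_classes = 6

# GlobalAveragePooling2D keeps one value per channel.
pooled_features = channels
head_params = pooled_features * num_classes + num_classes

# Flatten would keep every spatial position instead.
flatten_features = height * width * channels
flatten_head_params = flatten_features * num_classes + num_classes

print("Base output shape:", feature_map_shape)
print("Dense(6) after GlobalAveragePooling2D:", head_params, "parameters")
print("Dense(6) after Flatten:", flatten_head_params, "parameters")
```

Pooling gives a much smaller head than flattening, which is one reason it is a common choice when the training set is modest in size.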

## Compile Model

Now it's time to compile the model with loss and metrics options. Remember that we're training on a number of different categories, rather than a binary classification problem.

In [21]:
model.compile(
    loss=keras.losses.CategoricalCrossentropy(from_logits=False),
    metrics=[keras.metrics.CategoricalAccuracy()],
)

## Augment the Data

If you'd like, try to augment the data to improve the dataset. There is also documentation for the [Keras ImageDataGenerator class](https://keras.io/api/preprocessing/image/#imagedatagenerator-class). This step is optional, but it may be helpful to get to 92% accuracy.

In [23]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(samplewise_center=True,
        rotation_range=10,
        zoom_range = 0.1,
        width_shift_range=0.1,
        height_shift_range=0.1,
        horizontal_flip=True,
        vertical_flip=False)
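
To illustrate two of the options used above, here is a small pure-Python sketch of what `samplewise_center` and a horizontal flip do to a single image. It uses a tiny made-up grayscale image stored as nested lists; the real generator works on arrays and applies the flip at random, so this is a simplification rather than the Keras implementation.

```python
def samplewise_center(image):
    """Subtract the image's own mean so its pixel values average to zero."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return [[p - mean for p in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


# Made-up 2x3 grayscale image.
image = [[0.0, 50.0, 100.0],
         [150.0, 200.0, 250.0]]

centered = samplewise_center(image)
flipped = horizontal_flip(image)

print("centered:", centered)
print("flipped: ", flipped)
```

Centering is a deterministic normalization, while flips, shifts, zooms, and rotations are random. In this notebook the same `datagen` also feeds the validation iterator, so validation images receive the random transformations too; a separate generator with only `samplewise_center=True` is a common alternative.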

## Load Dataset

Now it's time to load the train and validation datasets. Pick the right folders, as well as the right `target_size` of the images (it needs to match the height and width input of the model you've created).

In [24]:
# load and iterate training dataset
train_it = datagen.flow_from_directory('/root/.cache/kagglehub/datasets/sriramr/fruits-fresh-and-rotten-for-classification/versions/1/dataset/train',
                                       target_size=(224, 224),
                                       color_mode='rgb',
                                       class_mode="categorical")
# load and iterate validation dataset
valid_it = datagen.flow_from_directory('/root/.cache/kagglehub/datasets/sriramr/fruits-fresh-and-rotten-for-classification/versions/1/dataset/test',
                                      target_size=(224, 224),
                                      color_mode='rgb',
                                      class_mode="categorical")

Found 10901 images belonging to 6 classes.
Found 2698 images belonging to 6 classes.


## Train the Model

Time to train the model! Pass the `train` and `valid` iterators into the `fit` function, and set your desired number of epochs.
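
The step counts passed to `fit` in this section follow directly from the image counts reported when the dataset was loaded. The sketch below reproduces that arithmetic, assuming the iterators use the default batch size of 32 (no `batch_size` was passed to `flow_from_directory`).

```python
import math

train_samples = 10901   # "Found 10901 images" (training folder)
valid_samples = 2698    # "Found 2698 images" (validation folder)
batch_size = 32         # assumed default batch size of the iterators

# Same expressions as in the fit call: int() truncates the partial batch.
steps_per_epoch = int(train_samples / batch_size)
validation_steps = int(valid_samples / batch_size)

# Images that do not fit into a full batch are skipped in each pass.
leftover_train = train_samples - steps_per_epoch * batch_size
leftover_valid = valid_samples - validation_steps * batch_size

# Rounding up instead would include the final, smaller batch.
full_train_steps = math.ceil(train_samples / batch_size)

print("steps_per_epoch:", steps_per_epoch)
print("validation_steps:", validation_steps)
print("leftover training images:", leftover_train)
print("steps needed to cover every training image:", full_train_steps)
```

The value of 340 matches the `340/340` shown in the training progress bars.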

In [25]:
model.fit(train_it,
          validation_data=valid_it,
          steps_per_epoch=int(train_it.samples/train_it.batch_size),
          validation_steps=int(valid_it.samples/valid_it.batch_size),
          epochs=20)

Epoch 1/20
340/340 ━━━━━━━━━━━━━━━━━━━━ 239s 660ms/step - categorical_accuracy: 0.6274 - loss: 1.8455 - val_categorical_accuracy: 0.9275 - val_loss: 0.2023
Epoch 2/20
340/340 ━━━━━━━━━━━━━━━━━━━━ 40s 119ms/step - categorical_accuracy: 0.9375 - loss: 0.1698 - val_categorical_accuracy: 0.9457 - val_loss: 0.1509
Epoch 3/20
340/340 ━━━━━━━━━━━━━━━━━━━━ 262s 770ms/step - categorical_accuracy: 0.9478 - loss: 0.1547 - val_categorical_accuracy: 0.9632 - val_loss: 0.0945
Epoch 4/20
340/340 ━━━━━━━━━━━━━━━━━━━━ 40s 118ms/step - categorical_accuracy: 0.9375 - loss: 0.1219 - val_categorical_accuracy: 0.9632 - val_loss: 0.0982
Epoch 5/20
340/340 ━━━━━━━━━━━━━━━━━━━━ 206s 604ms/step - categorical_accuracy: 0.9634 - loss: 0.1008 - val_categorical_accuracy: 0.9743 - val_loss: 0.0737
Epoch 6/20
340/340 ━━━━━━━━━━━━━━━━━━━━ 41s 121ms/step - categorical_accuracy: 0.9688 - loss: 0.2370 - val_categorical_accuracy:

<keras.src.callbacks.history.History at 0x7e8b1a5f9650>

## Unfreeze Model for Fine Tuning

If you have already reached 92% validation accuracy, this next step is optional. If not, we suggest fine-tuning the model with a very low learning rate.

In [28]:
# Unfreeze the base model
base_model.trainable = True

# Compile the model with a low learning rate
model.compile(
    optimizer=keras.optimizers.RMSprop(learning_rate=1e-5),
    loss=keras.losses.CategoricalCrossentropy(from_logits=False),
    metrics=[keras.metrics.CategoricalAccuracy()],
)
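
To show why a very low learning rate protects the pretrained weights, here is a simplified RMSprop-style update for a single weight in pure Python. The function `rmsprop_step` and the gradient value are made up for illustration, and the exact Keras implementation may differ in details such as where epsilon is added. The comparison value of `1e-3` is the Keras default learning rate for RMSprop.

```python
import math


def rmsprop_step(grad, avg_sq, learning_rate, rho=0.9, eps=1e-7):
    """One simplified RMSprop update for a single weight.

    Returns the amount subtracted from the weight and the new running
    average of squared gradients.
    """
    avg_sq = rho * avg_sq + (1.0 - rho) * grad * grad
    update = learning_rate * grad / (math.sqrt(avg_sq) + eps)
    return update, avg_sq


grad = 0.5  # made-up gradient value

default_update, _ = rmsprop_step(grad, avg_sq=0.0, learning_rate=1e-3)
fine_tune_update, _ = rmsprop_step(grad, avg_sq=0.0, learning_rate=1e-5)

ratio = default_update / fine_tune_update
print("update with learning_rate=1e-3:", default_update)
print("update with learning_rate=1e-5:", fine_tune_update)
print("ratio:", ratio)
```

For the same gradient, each step moves the weight about 100 times less, so the unfrozen VGG16 layers are nudged rather than overwritten.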

In [29]:
model.fit(train_it,
          validation_data=valid_it,
          steps_per_epoch=int(train_it.samples/train_it.batch_size),
          validation_steps=int(valid_it.samples/valid_it.batch_size),
          epochs=10)

Epoch 1/10
340/340 ━━━━━━━━━━━━━━━━━━━━ 275s 795ms/step - categorical_accuracy: 0.9752 - loss: 0.0828 - val_categorical_accuracy: 0.9754 - val_loss: 0.0699
Epoch 2/10
340/340 ━━━━━━━━━━━━━━━━━━━━ 42s 122ms/step - categorical_accuracy: 1.0000 - loss: 0.0249 - val_categorical_accuracy: 0.9903 - val_loss: 0.0263
Epoch 3/10
340/340 ━━━━━━━━━━━━━━━━━━━━ 261s 768ms/step - categorical_accuracy: 0.9959 - loss: 0.0123 - val_categorical_accuracy: 0.9948 - val_loss: 0.0146
Epoch 4/10
340/340 ━━━━━━━━━━━━━━━━━━━━ 40s 118ms/step - categorical_accuracy: 1.0000 - loss: 0.0082 - val_categorical_accuracy: 0.9978 - val_loss: 0.0053
Epoch 5/10
340/340 ━━━━━━━━━━━━━━━━━━━━ 322s 886ms/step - categorical_accuracy: 0.9972 - loss: 0.0105 - val_categorical_accuracy: 0.9981 - val_loss: 0.0039
Epoch 6/10
340/340 ━━━━━━━━━━━━━━━━━━━━ 40s 118ms/step - categorical_accuracy: 1.0000 - loss: 1.1856e-04 - val_categorical_accur

<keras.src.callbacks.history.History at 0x7e8b1a125d10>

## Evaluate the Model

Hopefully, you now have a model that has a validation accuracy of 92% or higher. If not, you may want to go back and either run more epochs of training, or adjust your data augmentation.

## Discussion

This is a multi-class classification problem, so we used `CategoricalCrossentropy` with `from_logits=False`, because the output layer already applies a softmax activation and therefore emits probabilities rather than raw logits.

Performance is measured with `CategoricalAccuracy`, that is, by how often the predicted class matches the actual class.
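
As a minimal sketch of what that metric computes, the following pure-Python version compares the index of the largest predicted probability with the index of the one-hot label. The labels and predictions are made up (three classes for brevity), and the helper names are illustrative rather than Keras functions.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)


def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the labelled class."""
    matches = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Made-up one-hot labels and softmax outputs for four samples.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_pred = [[0.8, 0.1, 0.1],   # correct
          [0.2, 0.7, 0.1],   # correct
          [0.3, 0.4, 0.3],   # wrong: predicts class 1, label is class 2
          [0.1, 0.6, 0.3]]   # correct

accuracy = categorical_accuracy(y_true, y_pred)
print("categorical accuracy:", accuracy)
```

Accuracy ignores how confident each prediction is, which is why it is useful to read it together with the loss.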

The base model (VGG16 in this case) was frozen at first and unfrozen later. While it was frozen, we trained only the custom top layers; we then fine-tuned the entire model by unfreezing the base and using a lower learning rate.

The results show that the model performed exceptionally well in both the frozen and the unfrozen phase. In the frozen phase there were early signs of overfitting: the training accuracy reached 100% by epoch 8, while the validation accuracy improved more slowly and stabilized around 98-99%. However, the validation loss continued to decrease, which suggests that the model was still generalizing well.

In the unfrozen phase, the validation accuracy reached 99.96% at epoch 9, and the validation loss kept decreasing until it reached 0.0026 at epoch 10. There is no sign of overfitting in this phase, since the difference between training accuracy and validation accuracy is almost negligible.

From the above results, we can see that this model has a high validation accuracy and an extremely low validation loss, which indicates that it generalizes well to unseen data.

To achieve these results, we used a pre-trained model (VGG16), fine-tuned it by unfreezing the base and setting a low learning rate, and chose the appropriate loss and metric, which are `CategoricalCrossentropy` and `CategoricalAccuracy` in this case. Reaching a validation accuracy of 99% and a validation loss of 0.0026 is a strong result and shows that the model performs well on new data.