# Practical Lab 10
## Student Name: `Simardeep Singh`
## Student Roll Number: `8976948`

### INTRODUCTION
### Lab 10: Vanilla CNN and Fine-Tuning VGG16 for Dogs vs Cats Classification

Welcome to Lab 10! In this session, we'll look at a common practice among Deep Learning Engineers: fine-tuning an existing model to suit a specific task, here classifying images as either dogs or cats. This involves taking a model pre-trained on a related task and adapting it to perform better on our specific dataset.

#### Objective:
To understand and implement the process of fine-tuning a pre-trained model (VGG16) and compare its performance with a custom-built Vanilla CNN on the task of classifying images into two categories: dogs and cats.



#### Obtain the Data:
Download the Dogs vs Cats dataset as outlined in the CSCN8010 class notebook. Ensure you have the correct folder structure and files before proceeding.

#### Exploratory Data Analysis (EDA):
Perform an exploratory analysis of the dataset. Include relevant graphs, statistics, and insights to understand the composition and characteristics of the data. Points to consider:
- Distribution of classes (dogs vs cats)
- Sample images from each class
- Image dimensions and channels

#### Model Training:
In this section, we will train two different neural networks and evaluate their performance.

#### 1. Vanilla CNN:
- **Objective**: Define and train a custom Convolutional Neural Network (CNN) from scratch.
- **Details**: Construct the architecture, specifying the layers, activation functions, and optimizer. Train the model on the Dogs vs Cats dataset.
- **Evaluation**: Utilize callbacks to save the best version of the model and use validation data to check for overfitting.

#### 2. Fine-Tuning VGG16:
- **Objective**: Fine-tune the pre-trained VGG16 model (originally trained on ImageNet) for our specific task.
- **Details**: Modify the top layers of the VGG16 model to suit our binary classification task. Ensure the initial layers remain frozen, and only the new layers are trainable.
- **Evaluation**: Train the model on the dataset, using validation splits to monitor for overfitting. Plot relevant performance graphs.

#### Performance Evaluation:
After training both models, evaluate their performance based on the following metrics:
- Accuracy
- Confusion Matrix
- Precision, Recall, F1-score
- Precision-Recall Curve

#### Case Studies:
Explore specific instances where the models failed to predict correctly. Analyze and discuss possible reasons for these misclassifications.



In [None]:
#LIBRARIES
import os
import shutil
import pathlib
from tensorflow.keras.utils import image_dataset_from_directory
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG16
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow import keras


##### **Q1 Obtain the Data: Get the Dogs vs Cats dataset**

In [None]:

# Define original and new directory paths
original_data_dir = pathlib.Path("C:\\Foundations_of_Machine_Learning_Frameworks_lab\\Labs\\lab1\\CSCN8010-labs-simardeep-singh\\data\\dogVsCatData\\train")
subset_data_dir = pathlib.Path("C:\\Foundations_of_Machine_Learning_Frameworks_lab\\Labs\\lab1\\CSCN8010-labs-simardeep-singh\\data\\dogVsCatData\\subset")

def create_data_subset(subset_name, start_index, end_index):
    """Create subsets of data for cats and dogs."""
    for category in ("cat", "dog"):
        # Create new directory path for the subset
        subset_dir = subset_data_dir / subset_name / category
        # Create the directory if it does not exist
        os.makedirs(subset_dir, exist_ok=True)
        # Copy images from the original directory to the new subset directory
        file_names = [f"{category}.{i}.jpg" for i in range(start_index, end_index)]
        for file_name in file_names:
            shutil.copyfile(src=original_data_dir / file_name,
                            dst=subset_dir / file_name)

# Create subsets for training, validation, and testing
create_data_subset("train", start_index=0, end_index=1000)
create_data_subset("validation", start_index=1000, end_index=1500)
create_data_subset("test", start_index=1500, end_index=2500)
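
The index ranges passed to `create_data_subset` determine how many images land in each split. As a quick sanity check, the sketch below recomputes those counts and the number of batches per split at the batch size of 32 used when the datasets are loaded. The helper name `split_summary` is made up for this illustration.

```python
import math

# Index ranges copied from the create_data_subset calls above.
SPLITS = {"train": (0, 1000), "validation": (1000, 1500), "test": (1500, 2500)}
CATEGORIES = ("cat", "dog")
BATCH_SIZE = 32  # same value as the batch_size given to image_dataset_from_directory


def split_summary(splits, categories, batch_size):
    """Return {split: (images_per_class, total_images, batches)} (hypothetical helper)."""
    summary = {}
    for name, (start, end) in splits.items():
        per_class = end - start                 # range(start, end) yields end - start files
        total = per_class * len(categories)     # one copy of the range per category
        summary[name] = (per_class, total, math.ceil(total / batch_size))
    return summary


for name, (per_class, total, batches) in split_summary(SPLITS, CATEGORIES, BATCH_SIZE).items():
    print(f"{name}: {per_class} per class, {total} images, {batches} batches")
```

The test split works out to 2000 images in 63 batches, which agrees with the 63 evaluation steps reported in the performance section.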


In [None]:

data_folder = pathlib.Path(subset_data_dir)

train_dataset = image_dataset_from_directory(
    data_folder / "train",
    image_size=(180, 180),
    batch_size=32)
validation_dataset = image_dataset_from_directory(
    data_folder / "validation",
    image_size=(180, 180),
    batch_size=32)
test_dataset = image_dataset_from_directory(
    data_folder / "test",
    image_size=(180, 180),
    batch_size=32)



##### **Q2 EDA: Explore the data with relevant graphs, statistics and insights**

#### `1. Visualizing Sample Images`

In [None]:
plt.figure(figsize=(10, 10))
class_names = train_dataset.class_names
for images, labels in train_dataset.take(1):
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")


#### `2. Checking Class Distribution`

In [None]:
# Counting number of dogs and cats in the training dataset
label_count = {'dog': 0, 'cat': 0}
for _, label_id in train_dataset.unbatch():
    label = class_names[label_id.numpy()]
    label_count[label] += 1

plt.bar(label_count.keys(), label_count.values())
plt.title('Class Distribution in Training Set')
plt.show()


#### `3. Image Sizes and Colors`

In [None]:
def plot_color_distribution(images, title):
    plt.figure(figsize=(10, 5))
    colors = ['Red', 'Green', 'Blue']
    for i, color in enumerate(colors):
        avg_color_intensity = np.mean(images[:,:,:,i], axis=(1, 2))
        sns.histplot(avg_color_intensity, kde=True, color=color.lower(), label=f'{color} channel')
    plt.title(f'Color distribution in {title}')
    plt.legend()
    plt.xlabel('Intensity')
    plt.ylabel('Frequency')
    plt.show()

for images, labels in train_dataset.take(1): 
    plot_color_distribution(images, "Training Images")


#### `4. Image Pixel Intensity Distribution`

In [None]:
def plot_pixel_distribution(images, title):
    pixel_values = images.numpy().flatten()
    plt.figure(figsize=(10, 5))
    sns.histplot(pixel_values, bins=50, kde=True)
    plt.title(f'Pixel Intensity Distribution in {title}')
    plt.xlabel('Pixel Intensity')
    plt.ylabel('Frequency')
    plt.show()

for images, _ in train_dataset.take(1):  
    plot_pixel_distribution(images, "Training Images")


#### `5. Explore Image Augmentation`

In [None]:
data_augmentation = tf.keras.Sequential([
  tf.keras.layers.RandomFlip('horizontal'),
  tf.keras.layers.RandomRotation(0.1),
  tf.keras.layers.RandomZoom(0.1),
])

plt.figure(figsize=(10, 10))
for images, _ in train_dataset.take(1):
    for i in range(9):
        augmented_images = data_augmentation(images)
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(augmented_images[0].numpy().astype("uint8"))
        plt.axis("off")


#### `6. data batch shape & labels batch shape`

In [None]:
for data_batch, labels_batch in train_dataset:
    print("data batch shape:", data_batch.shape)
    print("labels batch shape:", labels_batch.shape)
    break


##### **Q3 Train two networks (make sure to use callbacks to save the best model version as done in lab 9):**
#### 3.1 Define a Neural Network of your choice 

In [None]:

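# Define the model: three Conv2D/MaxPooling2D blocks followed by a dense classifier.
# Note: the inputs are raw pixel values in [0, 255]; no rescaling layer is applied.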
model_vanilla = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(180, 180, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])
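
To make the architecture concrete, the output shapes and parameter counts that `model_vanilla.summary()` reports can be worked out by hand. The sketch below does this in plain Python, assuming the Keras defaults used in the model definition (3x3 "valid" convolutions with stride 1, and 2x2 max pooling). The helper names are illustrative only.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after max pooling whose stride equals the pool size."""
    return size // pool


def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return in_units * out_units + out_units


size, channels, total_params = 180, 3, 0
for filters in (32, 64, 128):
    total_params += conv_params(channels, filters)
    size = pool_out(conv_out(size))
    channels = filters
    print(f"after Conv2D({filters}) + MaxPooling2D: {size}x{size}x{channels}")

flat = size * size * channels
total_params += dense_params(flat, 512) + dense_params(512, 1)
print(f"flattened features: {flat}")
print(f"total parameters: {total_params:,}")
```

Nearly all of the parameters sit in the first `Dense` layer (51,200 inputs times 512 units), which is worth keeping in mind when the model is trained on only 2000 images.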


In [None]:
model_vanilla.summary()

In [None]:
model_vanilla.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])



In [None]:
callbacks = [
    tf.keras.callbacks.ModelCheckpoint(
        filepath="C:\\Foundations_of_Machine_Learning_Frameworks_lab\\Labs\\lab1\\CSCN8010-labs-simardeep-singh\\convnet_from_scratch.keras",
        save_best_only=True,
        monitor="val_loss")
]
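
Conceptually, `save_best_only=True` with `monitor="val_loss"` means a checkpoint is written only when the validation loss improves on the best value seen so far. The sketch below mimics that rule on a made-up sequence of validation losses. It illustrates the idea and is not Keras' actual implementation; `epochs_that_save` is a hypothetical helper.

```python
def epochs_that_save(val_losses):
    """Return the 1-based epochs at which a 'best so far' checkpoint would be written."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:          # strictly lower validation loss counts as an improvement
            best = loss
            saved.append(epoch)
    return saved


# Made-up validation losses: improvement, then overfitting from epoch 5 onwards.
val_losses = [0.69, 0.66, 0.67, 0.64, 0.70, 0.72]
saved_epochs = epochs_that_save(val_losses)
print(saved_epochs)  # [1, 2, 4]
```

In this example the file left on disk after training corresponds to epoch 4, the epoch with the lowest validation loss, even though training continued for two more epochs.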

In [None]:
history = model_vanilla.fit(
    train_dataset,
    epochs=30,
    validation_data=validation_dataset,
    callbacks=callbacks)

#### 3.2 Fine-Tune VGG16 (pre-trained on imagenet). Make sure to use validation to test for over-fitting. Plot the appropriate graph 

In [None]:
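# Load the VGG16 convolutional base with ImageNet weights, without its classifier head
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(180, 180, 3))
# Freeze the base so that only the layers added on top are trained
base_model.trainable = False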


In [None]:
model = models.Sequential([
    base_model,
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.Dropout(0.5), 
    layers.Dense(1, activation='sigmoid') 
])
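
With `include_top=False` and a frozen base, VGG16 acts as a fixed feature extractor and only the new head is trained. The sketch below works out, in plain Python, the feature-map size for a 180x180 input and how the parameters divide between the frozen base and the trainable head. It assumes the standard VGG16 layout (five blocks of 3x3 "same" convolutions, each block followed by 2x2 max pooling); the constant and variable names are illustrative.

```python
# (number of conv layers, filters) for each of the five VGG16 blocks
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

size, channels, base_params = 180, 3, 0
for n_convs, filters in VGG16_BLOCKS:
    for _ in range(n_convs):
        # 3x3 kernel weights plus one bias per filter; 'same' padding keeps the size
        base_params += 3 * 3 * channels * filters + filters
        channels = filters
    size //= 2  # 2x2 max pooling halves the spatial size (rounding down)

flat = size * size * channels
# Head from the model above: Flatten -> Dense(256) -> Dropout -> Dense(1)
head_params = (flat * 256 + 256) + (256 * 1 + 1)  # Dropout has no parameters

print(f"base output: {size}x{size}x{channels} -> {flat} features")
print(f"frozen base parameters: {base_params:,}")
print(f"trainable head parameters: {head_params:,}")
```

Under these assumptions, the optimizer updates only the head's parameters, while the much larger convolutional base keeps its ImageNet weights.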


In [None]:
model.compile(optimizer=optimizers.Adam(learning_rate=0.0001),
              loss='binary_crossentropy',
              metrics=['accuracy'])


In [None]:
callbacks = [
    tf.keras.callbacks.ModelCheckpoint(
        filepath="C:\\Foundations_of_Machine_Learning_Frameworks_lab\\Labs\\lab1\\CSCN8010-labs-simardeep-singh\\vgg16_fine_tuned.keras",
        save_best_only=True,
        monitor="val_loss")
]


In [None]:

history = model.fit(
    train_dataset,
    epochs=30,
    validation_data=validation_dataset,
    callbacks=callbacks)

In [None]:
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(1, len(acc) + 1)

plt.figure(figsize=(15, 5))
plt.subplot(1, 2, 1)
plt.plot(epochs, acc, label='Training Accuracy')
plt.plot(epochs, val_acc, label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(epochs, loss, label='Training Loss')
plt.plot(epochs, val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()

plt.show()


#### **Q4 Explore the relative performance of the models (make sure to load the best version of each model)**

##### `Loading best models`

In [None]:
vanilla_model_path = 'C:\\Foundations_of_Machine_Learning_Frameworks_lab\\Labs\\lab1\\CSCN8010-labs-simardeep-singh\\convnet_from_scratch.keras'
vanilla_model = tf.keras.models.load_model(vanilla_model_path)

# NOTE: the saved VGG16 model could not be loaded, so the evaluation below covers only the vanilla CNN.

#### `Accuracy`

In [None]:
test_loss, test_accuracy = vanilla_model.evaluate(test_dataset)
print(f"Test Accuracy: {test_accuracy:.3f}")


##### **The vanilla model achieved a test accuracy of approximately 59.5%, with a loss of 0.6590. In other words, it assigns the correct class (cat or dog) to about 59.5% of the unseen test images, close to the roughly 60.11% accuracy it reached during training.**
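
For context on the reported loss of 0.6590: a classifier that outputs 0.5 for every image has a binary cross-entropy of ln 2, about 0.693, whatever the labels are. The short sketch below computes that baseline. `binary_cross_entropy` is a plain-Python helper written for this illustration, not a Keras function.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# A balanced set of labels and a "know-nothing" classifier that always says 0.5.
labels = [0, 1] * 1000
baseline = binary_cross_entropy(labels, [0.5] * len(labels))
print(f"chance-level loss: {baseline:.4f}")  # chance-level loss: 0.6931
```

The reported loss of 0.6590 is only slightly below this baseline, which fits a model that is barely better than guessing.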

#### `Confusion Matrix`

In [None]:
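import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# image_dataset_from_directory shuffles by default, so the batch order changes
# between passes. Collect labels and predictions in the same pass to keep them aligned.
y_true, y_prob = [], []
for images, labels in test_dataset:
    y_true.append(labels.numpy())
    y_prob.append(np.asarray(vanilla_model.predict_on_batch(images)).ravel())
y_true = np.concatenate(y_true)
y_pred = np.round(np.concatenate(y_prob)).astype(int)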

# Compute metrics
print("Confusion Matrix:")
print(confusion_matrix(y_true, y_pred))
print("\nClassification Report:")
print(classification_report(y_true, y_pred, target_names=['Class 0', 'Class 1']))


##### **The model was evaluated over 63 batches in approximately 7 seconds (about 112 milliseconds per step). Out of 2000 test samples (1000 per class), the confusion matrix shows 449 correct predictions for Class 0 and 536 for Class 1, which is 985 correct in total and an overall accuracy of 49.25%. That is at chance level for a balanced binary task, and it is well below the 59.5% reported by `evaluate` on the same data. A gap like this typically appears when labels and predictions are gathered in separate passes over a dataset that reshuffles on each iteration, so these figures should be read with caution.**
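
The 49.25% above is lower than the 59.5% that `evaluate` reported on the same test set. One way such a gap can arise is when labels and predictions are collected in separate passes over a dataset that reshuffles between passes, so the two lists no longer line up. The simulation below illustrates the effect with synthetic data only: the 60% accuracy, the seed, and all names are assumptions made for the demonstration.

```python
import random

rng = random.Random(0)
n = 2000
y_true = [0] * (n // 2) + [1] * (n // 2)   # balanced labels, as in the test set

# Simulated classifier that is wrong on exactly 40% of the samples.
wrong = set(rng.sample(range(n), k=int(0.4 * n)))
y_pred = [1 - y if i in wrong else y for i, y in enumerate(y_true)]


def accuracy(labels, predictions):
    """Fraction of positions where label and prediction agree."""
    return sum(a == b for a, b in zip(labels, predictions)) / len(labels)


aligned = accuracy(y_true, y_pred)

# Same predictions, but returned in a different order than the labels.
shuffled_pred = y_pred[:]
rng.shuffle(shuffled_pred)
misaligned = accuracy(y_true, shuffled_pred)

print(f"aligned accuracy:    {aligned:.3f}")
print(f"misaligned accuracy: {misaligned:.3f}")
```

Once the pairing between labels and predictions is lost, the measured accuracy falls back to roughly 0.5 on balanced data, regardless of how good the classifier is.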

#### `Precision, Recall, F1-Score`

In [None]:
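import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

# Gather labels and predicted probabilities in a single pass so they stay
# aligned even though test_dataset is shuffled on each iteration.
y_true, y_pred_probs = [], []
for images, labels in test_dataset:
    y_true.append(labels.numpy())
    y_pred_probs.append(np.asarray(vanilla_model.predict_on_batch(images)).ravel())
y_true = np.concatenate(y_true)
y_pred_probs = np.concatenate(y_pred_probs)

# Convert probabilities to class labels at the 0.5 decision threshold
threshold = 0.5
y_pred = (y_pred_probs > threshold).astype(int)

precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)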

print(f'Precision: {precision:.3f}')
print(f'Recall: {recall:.3f}')
print(f'F1-Score: {f1:.3f}')


- **The model was evaluated over 63 batches at an average of 120 milliseconds per step, approximately 8 seconds in total.**

- **Precision: 0.481. When the model predicts the positive class, it is correct about 48.1% of the time. This measures the quality of the positive-class predictions.**

- **Recall: 0.523. The model correctly identifies 52.3% of all actual positive instances. This measures its ability to detect positive instances.**

- **F1-Score: 0.501. Precision and recall are balanced, but at a low level: on a balanced binary task, an F1-score near 0.5 is about what random guessing achieves.**
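
Precision, recall, and F1 all follow from four confusion-matrix counts. As a worked example, the sketch below uses the counts from the confusion matrix reported earlier (449 of 1000 Class 0 samples and 536 of 1000 Class 1 samples correct), treating Class 1 as the positive class. Those counts come from a different prediction pass than the scores printed above, so the results are close but not identical. `metrics_from_counts` is a hypothetical helper.

```python
def metrics_from_counts(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)          # correct among predicted positives
    recall = tp / (tp + fn)             # found among actual positives
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1


tn, tp = 449, 536          # correct predictions per class
fp = 1000 - tn             # Class 0 samples predicted as Class 1
fn = 1000 - tp             # Class 1 samples predicted as Class 0

acc, prec, rec, f1_counts = metrics_from_counts(tp, fp, fn, tn)
print(f"accuracy={acc:.4f} precision={prec:.3f} recall={rec:.3f} f1={f1_counts:.3f}")

# The reported F1 can also be checked directly as the harmonic mean of the
# reported precision (0.481) and recall (0.523).
f1_reported = 2 * 0.481 * 0.523 / (0.481 + 0.523)
print(f"F1 from reported precision and recall: {f1_reported:.3f}")
```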

#### `Precision-Recall Curve`

In [None]:
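import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import precision_recall_curve

# Gather labels and scores in a single pass so they stay aligned even though
# test_dataset is shuffled on each iteration.
y_true, y_scores = [], []
for images, labels in test_dataset:
    y_true.append(labels.numpy())
    y_scores.append(np.asarray(vanilla_model.predict_on_batch(images)).ravel())
y_true = np.concatenate(y_true)
y_scores = np.concatenate(y_scores)

precision, recall, thresholds = precision_recall_curve(y_true, y_scores)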

plt.figure(figsize=(8, 6))
plt.plot(recall, precision, marker='.')
plt.title('Precision-Recall curve')
plt.xlabel('Recall')
plt.ylabel('Precision')
plt.grid(True)
plt.show()


**At the lowest threshold, precision is 0.5 and recall is 1. Here almost every sample is classified as positive, so all actual positives are captured (recall is the fraction of true positives among all actual positives) and, because the classes are balanced, half of the predicted positives are correct (precision is the fraction of true positives among all predicted positives). As the threshold increases, both precision and recall vary, which is expected as the criterion for predicting the positive class becomes stricter.**
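
The left end of the curve can be checked by hand: if the threshold is so low that every sample is predicted positive, recall is 1 and precision equals the fraction of positives in the data. The sketch below computes precision and recall at a few thresholds on a small, made-up set of scores. `precision_recall_at` is an illustrative helper, not scikit-learn's `precision_recall_curve`.

```python
def precision_recall_at(y_true, scores, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    # Convention used here: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn)
    return precision, recall


# Made-up balanced example: four negatives followed by four positives.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.2, 0.4, 0.55, 0.7, 0.3, 0.6, 0.8, 0.9]

for threshold in (0.0, 0.5, 0.75):
    p, r = precision_recall_at(y_true, scores, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

At threshold 0.0 the example gives precision 0.50 and recall 1.00, the same starting point as described for the curve above.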

### `CONCLUSION`

1. Accuracy: The measured accuracy was in the range of roughly 50-60%. At best, the vanilla CNN performs slightly better than random guessing, and it is not reliable for making accurate predictions.

2. Confusion Matrix: The confusion matrix shows a relatively even distribution of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). The model struggles about equally with both classes, as indicated by the high number of false positives and false negatives. It does not strongly favor one class over the other, but it also fails to distinguish between them in many cases.

3. Precision and Recall: The model's precision and recall were low (around 0.48 and 0.52, respectively). When the model predicts an instance as positive, it is correct less than half the time, and it identifies just over half of all actual positive instances. These values indicate weak predictive power.

4. F1-Score: The F1-score, the harmonic mean of precision and recall, was also low (around 0.50). It confirms that the model's performance is balanced but mediocre: the balance comes from a similarly low level on both metrics, not from high performance.

5. Precision-Recall Curve: The precision-recall curve provides insight into the trade-off between precision and recall at different thresholds. Here, the curve starts at a moderate level and deteriorates as the threshold increases.

**Overall, the vanilla CNN exhibits mediocre performance and struggles to differentiate between the classes with meaningful precision or recall. Its predictive power is marginally better than random chance and falls short of a reliable model. The similarly low precision and recall indicate that there is no significant bias towards one class, but rather a general difficulty in classification.**
