##### <span style="color: #FF0000;">Baseline Model: Implementation and Rationale</span>

##### <span style="color: #1E90FF;">1. Goal of the Baseline</span>

The baseline is a small convolutional network trained from scratch. Its first job was to confirm that dataset loading, tensor shapes, and the minimal data augmentation pipeline all work end to end. This small "two-block CNN" also gives us an early reference point, in both accuracy and training behavior, before we apply more sophisticated or pretrained approaches.

##### <span style="color: #1E90FF;">2. Architectural Decisions</span>

- **Input Shape**: (180,180,3) to align with image_dataset_from_directory(..., image_size=(180,180))
- **Two Convolution Blocks**:
  - The first block stacks two Conv2D(32,3) layers and the second stacks two Conv2D(64,3) layers; each block ends with a MaxPooling2D
  - This follows the familiar VGG-style pattern, at a smaller scale
- **Flatten → Dense**:
  - We flatten the pooled feature maps, then apply a Dense(128, relu) followed by a Dropout(0.5) for regularization
  - A final Dense(num_classes, softmax) layer outputs the class probabilities for multi-class classification
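
To make the layer stack concrete, the short calculation below traces the feature-map size and parameter count through the architecture described above. It is a back-of-the-envelope sketch in plain Python, assuming 'same' padding, the default 2×2 pooling, and three output classes; the helper name `conv_params` is ours, for illustration only.

```python
# Rough shape and parameter count for the two-block CNN.
# Assumes 'same' padding (convs keep height/width), 2x2 max pooling,
# and num_classes = 3. conv_params is an illustrative helper, not a Keras API.

def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases for one Conv2D layer."""
    return kernel * kernel * in_channels * out_channels + out_channels

side, channels = 180, 3
total = 0

for filters in (32, 64):                      # one iteration per conv block
    total += conv_params(channels, filters)   # first conv of the block
    total += conv_params(filters, filters)    # second conv of the block
    channels = filters
    side //= 2                                # MaxPooling2D halves height and width

flat_features = side * side * channels        # size of the Flatten output
dense_params = flat_features * 128 + 128      # Dense(128)
output_params = 128 * 3 + 3                   # Dense(num_classes=3)
total += dense_params + output_params

print(f"feature map before Flatten: {side}x{side}x{channels}")
print(f"flattened features: {flat_features}")
print(f"Dense(128) parameters: {dense_params}")
print(f"total trainable parameters: {total}")
```

Under these assumptions, nearly all of the parameters sit in the first dense layer, so the Dropout(0.5) that follows it acts on the part of the network with the most capacity.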

##### <span style="color: #1E90FF;">3. Data Augmentation</span>

Before the first convolution block, we incorporate Keras's built-in augmentation layers:
- RandomFlip("horizontal")
- RandomRotation(0.1), where the factor is a fraction of a full turn
- RandomZoom(0.1), i.e. zooming in or out by up to 10%

These transformations help the model learn invariance to horizontal flips, small rotations, and scale changes, which should help given our small dataset (~100–172 images per class).
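
The augmentation factors are easier to judge once converted to concrete ranges. The sketch below does that arithmetic, relying on the Keras convention that a single float passed to RandomRotation is a fraction of a full turn and that a single float for either layer is applied in both directions; the constant names are ours.

```python
import math

# Convert the augmentation factors into concrete ranges.
# RandomRotation takes its factor as a fraction of a full turn (2*pi);
# RandomZoom takes a fractional change in size. A single float means
# the range is +/- that factor.
ROTATION_FACTOR = 0.1
ZOOM_FACTOR = 0.1

max_rotation_rad = ROTATION_FACTOR * 2 * math.pi
max_rotation_deg = math.degrees(max_rotation_rad)

print(f"rotation range: +/-{max_rotation_deg:.0f} degrees")
print(f"zoom range: +/-{ZOOM_FACTOR:.0%}")
```

A factor of 0.1 therefore allows rotations of up to 36 degrees in either direction, which is a fairly strong transformation; a smaller factor may be worth trying if the rotated images look unrealistic.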

##### <span style="color: #1E90FF;">4. Key Observations from Training</span>

- **Fluctuating Loss**: The training loss dips and then spikes in some epochs. This pattern often points to a learning rate that is somewhat too high, or to a model that begins overfitting within a few epochs
- **Reasonable Test Accuracy**: Test accuracy reaches roughly 77–80% by the final epoch, showing that even a modest CNN can separate the classes with moderate reliability
- **Overfitting Tendency**: Training accuracy occasionally exceeded validation accuracy, but data augmentation combined with dropout helps keep the gap in check
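
One way to quantify the overfitting tendency is to compute the gap between training and validation accuracy per epoch and to locate the epoch with the lowest validation loss. The helper below is a minimal sketch that works on a dict shaped like Keras's `History.history`; the function name `summarize_history` is hypothetical, and the numbers in the example are made up for illustration, not taken from our runs.

```python
def summarize_history(history):
    """Return the per-epoch accuracy gap and the 1-based epoch with the lowest val_loss.

    `history` is a dict shaped like Keras's History.history.
    """
    gaps = [t - v for t, v in zip(history["accuracy"], history["val_accuracy"])]
    val_loss = history["val_loss"]
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
    return gaps, best_epoch

# Made-up numbers for illustration only; these are NOT results from our runs.
example_history = {
    "accuracy":     [0.50, 0.65, 0.75, 0.85],
    "val_accuracy": [0.50, 0.60, 0.70, 0.70],
    "val_loss":     [1.00, 0.80, 0.60, 0.70],
}

gaps, best_epoch = summarize_history(example_history)
print("train-minus-val accuracy gap per epoch:", [round(g, 2) for g in gaps])
print("epoch with lowest validation loss:", best_epoch)
```

A gap that keeps widening after the best-validation epoch is the usual sign that further training is fitting noise in the training set.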

##### <span style="color: #1E90FF;">5. Future Enhancements</span>

- **Hyperparameter Tuning**: Investigate a smaller or variable learning rate, or experiment with an additional conv block for deeper feature extraction
- **Transfer Learning**: As recommended in our roadmap, consider using a pretrained network (e.g., VGG16 or MobileNet). This often boosts accuracy for small or moderate datasets
- **Offline Augmentation**: Expand the dataset with the separate offline augmentation script, generating more samples to reduce overfitting
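
As one possible starting point for the learning-rate experiments, the sketch below defines a simple step-decay schedule. The function uses the `(epoch, lr)` signature that `keras.callbacks.LearningRateScheduler` accepts, but the drop factor and interval are illustrative guesses rather than tuned values, and the name `step_decay` is ours.

```python
def step_decay(epoch, lr, drop=0.5, every=3):
    """Halve the learning rate every `every` epochs.

    The (epoch, lr) signature matches what keras.callbacks.LearningRateScheduler
    passes in; epoch is 0-based. drop and every are illustrative, not tuned.
    """
    if epoch > 0 and epoch % every == 0:
        return lr * drop
    return lr

# Simulate 10 epochs starting from RMSprop's default learning rate of 1e-3.
lr = 1e-3
schedule = []
for epoch in range(10):
    lr = step_decay(epoch, lr)
    schedule.append(lr)

print(schedule)

# Hooking it into training would look roughly like:
#   callbacks_list.append(keras.callbacks.LearningRateScheduler(step_decay))
```

Simulating the schedule first makes it easy to check that the rate does not collapse too quickly over a short 10-epoch run.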

##### <span style="color: #1E90FF;">6. Conclusion and Next Steps</span>

The baseline model shows that a simple two-block CNN with minimal on-the-fly augmentation yields a workable solution. It is a sound foundation for iterative refinement, either by adjusting hyperparameters or by moving to a transfer learning pipeline, as laid out in our project roadmap.

In [None]:
%pip install tensorflow matplotlib numpy scikit-learn


In [None]:
#!/usr/bin/env python
# baseline_implementation_no_dip.py

import os
import numpy as np
import matplotlib.pyplot as plt

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

###########################################################################
# Data Augmentation Pipeline
###########################################################################
myDataAug = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    # You can add layers.RandomContrast(0.1) if desired.
], name="MyDataAug")

###########################################################################
# Baseline CNN Model: repeated 3×3 conv + maxpool
###########################################################################
def build_baseline_cnn(num_classes=3, input_shape=(180,180,3)):
    """
    Basic CNN: repeated 3x3 conv -> maxpool -> flatten -> dense.
    """
    # The input shape is (180,180,3) to match your dataset resizing
    inputs = keras.Input(shape=input_shape, name="input_image")

    # (Optional) Data augmentation first
    x = myDataAug(inputs)

    # 1st conv block
    x = layers.Conv2D(32, kernel_size=3, padding='same', activation='relu')(x)
    x = layers.Conv2D(32, kernel_size=3, padding='same', activation='relu')(x)
    x = layers.MaxPooling2D()(x)

    # 2nd conv block
    x = layers.Conv2D(64, kernel_size=3, padding='same', activation='relu')(x)
    x = layers.Conv2D(64, kernel_size=3, padding='same', activation='relu')(x)
    x = layers.MaxPooling2D()(x)

    # (Optional) 3rd conv block
    # x = layers.Conv2D(128, kernel_size=3, padding='same', activation='relu')(x)
    # x = layers.Conv2D(128, kernel_size=3, padding='same', activation='relu')(x)
    # x = layers.MaxPooling2D()(x)

    # Flatten and dense
    x = layers.Flatten()(x)
    x = layers.Dense(128, activation='relu')(x)
    x = layers.Dropout(0.5)(x)

    outputs = layers.Dense(num_classes, activation='softmax')(x)
    model = keras.Model(inputs, outputs, name="BaselineCNN")
    return model

###########################################################################
# Main
###########################################################################
def main():
    # 1) Paths
    data_root = "/Users/ryangichuru/Documents/SSD-K/Uni/2nd year/Intro to AI/CNN/assignment-2-ryantigi254-main/data/Output Data"
    train_dir = os.path.join(data_root, "train")
    val_dir   = os.path.join(data_root, "val")
    test_dir  = os.path.join(data_root, "test")

    # 2) Load Datasets
    batch_size = 32
    img_size = (180, 180)

    train_ds = tf.keras.preprocessing.image_dataset_from_directory(
        train_dir,
        image_size=img_size,
        batch_size=batch_size,
        label_mode='categorical'
    )
    val_ds = tf.keras.preprocessing.image_dataset_from_directory(
        val_dir,
        image_size=img_size,
        batch_size=batch_size,
        label_mode='categorical'
    )
    test_ds = tf.keras.preprocessing.image_dataset_from_directory(
        test_dir,
        image_size=img_size,
        batch_size=batch_size,
        label_mode='categorical'
    )

    # 3) Build model
    num_classes = 3  # e.g. 3 people or classes
    baseline_model = build_baseline_cnn(num_classes=num_classes,
                                        input_shape=(180,180,3))
    baseline_model.summary()

    # 4) Compile
    baseline_model.compile(optimizer='rmsprop',
                           loss='categorical_crossentropy',
                           metrics=['accuracy'])

    # 5) Train
    callbacks_list = [
        keras.callbacks.ModelCheckpoint("baseline_cnn_best.h5",
                                        save_best_only=True,
                                        monitor="val_loss")
    ]
    epochs = 10
    history = baseline_model.fit(
        train_ds,
        validation_data=val_ds,
        epochs=epochs,
        callbacks=callbacks_list
    )

    # 6) Evaluate on test set
    print("\nEvaluating on test set ...")
    test_loss, test_acc = baseline_model.evaluate(test_ds)
    print(f"Test loss: {test_loss:.4f}")
    print(f"Test accuracy: {test_acc:.4f}")

    # 7) Plot training vs. validation
    acc = history.history['accuracy']
    val_acc = history.history['val_accuracy']
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    epochs_range = range(1, len(acc)+1)

    plt.figure(figsize=(12,5))
    plt.subplot(1,2,1)
    plt.plot(epochs_range, acc, 'bo-', label='Training Acc')
    plt.plot(epochs_range, val_acc, 'ro-', label='Validation Acc')
    plt.title('Training & Validation Accuracy')
    plt.legend()

    plt.subplot(1,2,2)
    plt.plot(epochs_range, loss, 'bo-', label='Training Loss')
    plt.plot(epochs_range, val_loss, 'ro-', label='Validation Loss')
    plt.title('Training & Validation Loss')
    plt.legend()
    plt.show()

    # 8) Confusion Matrix
    print("\nGenerating confusion matrix ...")
    all_labels = []
    all_preds = []
    for images, labels in test_ds:
        # verbose=0 suppresses the progress bar that predict() prints for every batch
        preds = baseline_model.predict(images, verbose=0)
        all_preds.extend(tf.argmax(preds, axis=1).numpy())
        all_labels.extend(tf.argmax(labels, axis=1).numpy())

    from sklearn.metrics import confusion_matrix, classification_report
    cm = confusion_matrix(all_labels, all_preds)
    print("Confusion Matrix:\n", cm)
    print("Classification Report:\n", 
          classification_report(all_labels, all_preds))

if __name__ == "__main__":
    main()