# CSC2042S Machine Learning @ UCT
# Supervised Learning 

**Author: Buqwana Xolisile**

This notebook implements a binary classifier extended to multi-class classification using the **one-vs-rest** approach. It begins by loading the `simpsons-mnist-master` dataset using the **Loader** class and performing data preprocessing, including normalization and splitting into training, validation, and test sets.

Next, the **BinaryPerceptron** and **MultiClassPerceptron** classes are initialized. The perceptron models are trained on both grayscale and RGB images, with the **PerceptronTrainer** class handling weight updates, data shuffling, and different stopping criteria such as fixed epochs, error threshold, and early stopping.

After training, the models are evaluated on the validation and test sets. Key metrics such as accuracy, precision, recall, and F1 score are computed to assess classification performance. Confusion matrices are also visualized to identify which characters are well-classified and where misclassifications occur.

Finally, hyperparameter tuning is performed using the **HyperparameterTuner** class, exploring different learning rates, weight initialization methods, and normalization techniques. The best-performing models are selected based on validation accuracy, and their performance is compared between grayscale and RGB inputs, highlighting the effect of color information on multi-class character recognition.


**Outline:**

* [1. Data processing](#1.-data-processing)
* [2. Multi-class perceptron implementation](#section2)
* [3. Training](#section3)
* [4. Hyperparameter tuning](#section4)
* [5. Evaluation](#section5)
  

## Imports, Installations, and Downloads

This notebook will make extensive use of some standard Python libraries for scientific computing, namely:
* ``numpy`` for storing and manipulating arrays/matrices of numerical data.
* ``matplotlib`` for displaying image data.
* ``PIL (Image)`` for opening and processing images.
* ``random`` for generating random numbers and shuffling data.
* ``os`` for interacting with files and directories.
* ``struct`` for handling binary data formats.

In [None]:
%matplotlib inline
from matplotlib.pyplot import imshow
import matplotlib.pyplot as plt
from PIL import Image
import numpy as np
import random
import os
import struct

# 1. Data processing

We will now load `Simpsons-MNIST`, a small dataset of characters from The Simpsons consisting of a training set of 8,000 examples
and a test set of 2,000 examples. Each example is a 28x28 image, provided in both RGB and grayscale, with a label from one of 10 classes.
The dataset is available at: [MNIST](https://github.com/google/n-digit-mnist)

The **Loader** class defined below provides the following functions:

The **load_images_from_directory** function takes a folder containing one subfolder per character and assigns each subfolder a numeric label according to `label_map`. It then goes through all the JPEG images in each subfolder, loads them using PIL, and converts them into NumPy arrays, storing the label of the containing folder alongside each image. Finally, it returns two NumPy arrays: one containing all the images and one containing their labels.

The **normalize_images** function converts pixel values to floating-point numbers and scales them to the range 0 to 1. The **flatten_images** function flattens each image into a 1D vector by reshaping the array from (n_samples, height, width) or (n_samples, height, width, channels) to (n_samples, features) for perceptron modeling.

The **split_train_validation** function randomly shuffles the training data, sets aside a portion for validation (25% by default, via `validation_ratio=0.25`), and returns both the new training set and the validation set.

Finally, **load_rgb_data** and **load_grayscale_data** act as wrappers around these helpers: they load the images, split them into training and validation sets, normalize the pixel values, flatten the images into vectors, and return the resulting training, validation, and test sets.
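
The preprocessing steps described above can be sketched on synthetic data. This is a minimal illustration, not the notebook's loader: the random `images` and `labels` arrays are hypothetical stand-ins for the real dataset. The order of the steps differs from the `Loader` wrappers, which split first, but normalization and flattening act on each image independently, so the resulting shapes are the same.

```python
import numpy as np

# Hypothetical stand-in for loaded data: 8 random 28x28 RGB images with labels 0..9.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(8, 28, 28, 3), dtype=np.uint8)
labels = rng.integers(0, 10, size=8)

# Normalize: uint8 values in [0, 255] become float32 values in [0, 1].
normalized = images.astype(np.float32) / 255.0

# Flatten: (n_samples, 28, 28, 3) -> (n_samples, 2352).
flattened = normalized.reshape(normalized.shape[0], -1)

# Split: shuffle the indices once, then carve off the validation portion.
validation_ratio = 0.25
n_validation = int(len(flattened) * validation_ratio)
indices = rng.permutation(len(flattened))
val_idx, train_idx = indices[:n_validation], indices[n_validation:]

X_train, y_train = flattened[train_idx], labels[train_idx]
X_valid, y_valid = flattened[val_idx], labels[val_idx]

print(X_train.shape, X_valid.shape)  # (6, 2352) (2, 2352)
```

Indexing images and labels with the same index arrays is what keeps each image paired with its label after the split.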

In [None]:
class Loader:
    """
    Simpsons-MNIST JPEG Image Loader
    Loads images from 'simpsons-mnist-master/dataset/grayscale' and 'simpsons-mnist-master/dataset/rgb'
    directories containing 'test' and 'train' folders with JPEG files.
    """
    
    def __init__(self, base_dir='simpsons-mnist-master/dataset'):
        self.base_dir = base_dir
        self.grayscale_dir = os.path.join(base_dir, 'grayscale')
        self.rgb_dir = os.path.join(base_dir, 'rgb')
        
    def load_images_from_directory(self, directory_path):
        """Load all JPEG images from a directory and return as numpy array"""
        images = []
        labels = []
        label_map = {
            "bart_simpson": 0,
            "charles_montgomery_burns": 1,
            "homer_simpson": 2,
            "krusty_the_clown": 3,
            "lisa_simpson": 4,
            "marge_simpson": 5,
            "milhouse_van_houten": 6,
            "moe_szyslak": 7,
            "ned_flanders": 8, 
            "principal_skinner": 9
        }
        
        character_folders = [f for f in os.listdir(directory_path) if os.path.isdir(os.path.join(directory_path, f))]  # Get all character folders 
        for folder_name in character_folders:
            label = label_map[folder_name]
            
            folder_path = os.path.join(directory_path, folder_name)
            image_files = [f for f in os.listdir(folder_path) if f.lower().endswith(('.jpg', '.jpeg'))]  # Get all JPEG files in this character folder
    
            for filename in image_files:
                file_path = os.path.join(folder_path, filename)
                with Image.open(file_path) as img: # Load image using PIL
                    img_array = np.array(img) # Convert to numpy array
                    images.append(img_array)
                    labels.append(label)
        return np.array(images), np.array(labels)
        
    def normalize_images(self, images):
        """Normalize images to [0,1] range"""
        return images.astype(np.float32) / 255.0
        
    def flatten_images(self, images):
        """Flatten images into vectors for perceptron modeling."""
        n_samples = images.shape[0]
        return images.reshape(n_samples, -1)

    def split_train_validation(self, X_train, y_train, validation_ratio=0.25):
        """Split training data into training and validation sets"""
        n_samples = len(X_train)
        n_validation = int(n_samples * validation_ratio)
    
        indices = np.random.permutation(n_samples)  # Create random indices for splitting
        val_indices = indices[:n_validation]
        train_indices = indices[n_validation:]
        
        X_valid = X_train[val_indices]
        y_valid = y_train[val_indices]
        X_train_split = X_train[train_indices]
        y_train_split = y_train[train_indices]
        
        return (X_train_split, y_train_split), (X_valid, y_valid)

    def load_grayscale_data(self):
        """Load and process grayscale MNIST data"""
        train_dir = os.path.join(self.grayscale_dir, 'train')
        test_dir = os.path.join(self.grayscale_dir, 'test')
        
        X_train_raw, y_train = self.load_images_from_directory(train_dir) # Load training 
        X_test_raw, y_test = self.load_images_from_directory(test_dir) # Load Test data
        
        (X_train_raw, y_train), (X_val_raw, y_valid) = self.split_train_validation(X_train_raw, y_train)  # Split training data into train and validation

        X_train_raw = self.normalize_images(X_train_raw)
        X_val_raw = self.normalize_images(X_val_raw)
        X_test_raw = self.normalize_images(X_test_raw)
        
        X_train = self.flatten_images(X_train_raw)
        X_valid = self.flatten_images(X_val_raw)
        X_test = self.flatten_images(X_test_raw)
        
        return (X_train, y_train), ( X_valid, y_valid), (X_test, y_test)
    
    def load_rgb_data(self):
        """Load and process RGB MNIST data"""
        train_dir = os.path.join(self.rgb_dir, 'train')
        test_dir = os.path.join(self.rgb_dir, 'test')
        
        X_train_raw, y_train = self.load_images_from_directory(train_dir)  # Load training
        X_test_raw, y_test = self.load_images_from_directory(test_dir)  # Test data
        
        (X_train_raw, y_train), (X_val_raw, y_valid) = self.split_train_validation(X_train_raw, y_train)  # Split training data into train and validation

        X_train_raw = self.normalize_images(X_train_raw)
        X_val_raw = self.normalize_images(X_val_raw)
        X_test_raw = self.normalize_images(X_test_raw)
        
        X_train = self.flatten_images(X_train_raw)
        X_valid = self.flatten_images(X_val_raw)
        X_test = self.flatten_images(X_test_raw)
        
        return (X_train, y_train), (X_valid, y_valid), (X_test, y_test)
    
    def load_all_data(self):
        """Load both grayscale and RGB data"""
        grayscale_data = self.load_grayscale_data()
        rgb_data = self.load_rgb_data() 
        return {
            'grayscale': {
                'train': grayscale_data[0],
                'validation': grayscale_data[1],
                'test': grayscale_data[2]
            },
            'rgb': {
                'train': rgb_data[0],
                'validation': rgb_data[1],
                'test': rgb_data[2]
            }
        }
    
    def visualize_sample(self, images, labels, n_samples=10, title="Sample Images"):
        """Visualize sample images"""
        fig, axes = plt.subplots(1, n_samples, figsize=(12, 3))
        for i in range(n_samples):
            if images.shape[1] == 784:  # Grayscale (28x28)
                img = images[i].reshape(28, 28)
                axes[i].imshow(img, cmap='gray')
            elif images.shape[1] == 2352:  # RGB (28x28x3)
                img = images[i].reshape(28, 28, 3)
                axes[i].imshow(img)
            axes[i].set_title(f'Label: {labels[i]}')
            axes[i].axis('off')
            
        plt.suptitle(title)
        plt.tight_layout()
        plt.show()

In [None]:
if __name__ == "__main__":
    loader = Loader('simpsons-mnist-master/dataset') # Initialize the loader class.
    try:
        print("Data preprocessing in progress......") 
        all_data = loader.load_all_data()
        grayscale_train = all_data['grayscale']['train']
        grayscale_val = all_data['grayscale']['validation']
        grayscale_test = all_data['grayscale']['test']
        
        rgb_train = all_data['rgb']['train']
        rgb_val = all_data['rgb']['validation']
        rgb_test = all_data['rgb']['test']
        
        print(f"Grayscale - Training: {grayscale_train[0].shape}, Validation: {grayscale_val[0].shape}, Test: {grayscale_test[0].shape}")
        print(f"RGB -  Training:  {rgb_train[0].shape}, Validation: {rgb_val[0].shape}, Test: {rgb_test[0].shape}")
        print("Data loading completed successfully!")
        loader.visualize_sample(grayscale_train[0][:10], grayscale_train[1][:10], title="Grayscale Training Samples")  
        loader.visualize_sample(grayscale_val[0][:10], grayscale_val[1][:10], title="Grayscale Validation Samples")  
        loader.visualize_sample(grayscale_test[0][:10], grayscale_test[1][:10], title="Grayscale Test Samples")  
       
        loader.visualize_sample(rgb_train[0][:10], rgb_train[1][:10], title="RGB Training Samples")  
        loader.visualize_sample(rgb_val[0][:10], rgb_val[1][:10], title="RGB Validation Samples")   
        loader.visualize_sample(rgb_test[0][:10], rgb_test[1][:10], title="RGB Test Samples")    
    except FileNotFoundError as e:
        print(f"Error: Could not find directory ({e}). Please ensure the directory structure exists.")
    except Exception as e:
        print(f"Error loading data: {e}")
        

<a id='section2'></a>
# 2. Multi-class perceptron implementation 
# Binary Perceptron
We will now define the `BinaryPerceptron`, a simple classifier that predicts binary labels (0 or 1) using a linear decision boundary.  
The perceptron is initialized with the number of input features and a learning rate `alpha`; it assigns small random weights (sampled from a Gaussian) to each feature and sets the bias to zero so the model starts with a random decision boundary that is later adjusted during training.

The **predict** function computes the weighted sum of an input vector plus the bias and returns `1` when that sum is greater than or equal to zero and `0` otherwise, implementing the step activation rule used for binary decisions.  The **weighted_sum** function returns the raw linear combination `w·x + b` without applying the threshold, which is useful for inspection and debugging.

The **apply_learning_rule** function is the core of learning: it first computes the current prediction, compares it with the true label, and updates each weight and the bias according to the perceptron learning rule `w <- w + alpha * (y - y_hat) * x` and `b <- b + alpha * (y - y_hat)`, thereby nudging the decision boundary to reduce classification errors over time.

Finally, the **__repr__** function returns a readable string describing the perceptron’s current weights, bias, and learning rate so the model state can be inspected easily during experiments.
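
A single application of the learning rule can be traced by hand. The sketch below is a standalone illustration: the helper `perceptron_step` and the toy values are assumptions made for this example, not part of the class that follows.

```python
import numpy as np

def perceptron_step(weights, bias, x, y, alpha):
    """One application of the perceptron learning rule; returns updated (weights, bias)."""
    y_hat = 1 if np.dot(weights, x) + bias >= 0 else 0  # step activation
    error = y - y_hat                                   # -1, 0, or +1
    return weights + alpha * error * x, bias + alpha * error

w, b = np.zeros(2), 0.0
x, y = np.array([1.0, 2.0]), 0

# With zero weights the score is 0, so the prediction is 1 and the error is -1.
w, b = perceptron_step(w, b, x, y, alpha=0.1)
print(w, b)  # weights move against x: [-0.1 -0.2], bias becomes -0.1

# The same sample now scores -0.6, is predicted as 0, and causes no further change.
w_again, b_again = perceptron_step(w, b, x, y, alpha=0.1)
```

Because the error is zero whenever the prediction is already correct, only misclassified samples move the decision boundary.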





In [None]:
import numpy as np
class BinaryPerceptron:
    """
    Binary Perceptron classifier implementing the perceptron learning rule.
    Predicts binary labels (0 or 1) using a linear decision boundary.
    """
    
    def __init__(self, num_of_features, alpha=0.01):
        """Initialize the Binary Perceptron."""
        self.alpha = alpha
        self.weights = np.random.normal(0, 0.1, num_of_features)
        self.bias = 0.0
        
    def predict(self, x): 
        """Predict 0 or 1 depending on whether weighted sum >= 0."""  
        return np.where(np.dot(x, self.weights) + self.bias >= 0, 1, 0)
        
    def weighted_sum(self, x):
        """Returns weighted sum."""
        return np.dot(x, self.weights) + self.bias
        
    def apply_learning_rule(self, x, y):
        """
        Update weights and bias using the perceptron learning rule:
          w_i <- w_i + α (y - g(x)) x_i
          b   <- b   + α (y - g(x))
        """
        y_hat = self.predict(x)
        error = y - y_hat
        self.weights = self.weights + self.alpha * error * x
        self.bias = self.bias + self.alpha * error
 
    def __repr__(self):
        return f"BinaryPerceptron(weights={self.weights}, bias={self.bias:.3f}, alpha={self.alpha})"

<a id='section2.1'></a>
# Multi-class perceptron
While a `BinaryPerceptron` can only decide between two outcomes, the multi-class version expands this idea to ten possible labels by keeping one perceptron dedicated to each class. This is known as the **One-vs-Rest (OvR)** strategy.

We will now define the `MultiClassPerceptron`, which builds on the `BinaryPerceptron` to perform multi-class classification. For a dataset with 10 classes the model maintains 10 perceptrons, each trained to distinguish its class from all others.

The **__init__** method initializes these perceptrons, using a unique random seed for each one to ensure diverse initial weights and independent decision boundaries.  The **predict** method computes the weighted sum for each perceptron and selects the class with the highest score.  
This works for both single inputs and batches, making the model flexible for inference on multiple samples at once.

The **apply_learning_rule** method updates all perceptrons in a One-vs-Rest fashion: the perceptron for the correct class is trained with a positive label (`1`), while all others are trained with a negative label (`0`). This process reinforces the correct class while pushing down the scores of the incorrect ones, improving classification performance over time.

Finally, the **__repr__** method provides a readable summary showing the number of classes and the learning rate, making it easy to inspect the model’s configuration and training setup.
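
The prediction and label-conversion steps of One-vs-Rest can be illustrated with a small vectorized sketch. The three-class weight matrix `W` and the inputs `X` are hypothetical values chosen so the result is easy to verify by hand; the class below loops over perceptron objects instead of using a matrix product.

```python
import numpy as np

num_classes = 3

# Hypothetical weights: one row per class-specific perceptron, two features each.
W = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0, -1.0]])
b = np.zeros(num_classes)

X = np.array([[ 2.0,  0.5],
              [ 0.1,  3.0],
              [-1.0, -2.0]])

# Raw scores w_k . x + b_k for every sample (rows) and class (columns).
scores = X @ W.T + b
predictions = np.argmax(scores, axis=1)
print(predictions)  # [0 1 2]

# One-vs-Rest targets for a sample whose true class is 1:
# the class-1 perceptron sees label 1, every other perceptron sees 0.
y_true = 1
binary_targets = (np.arange(num_classes) == y_true).astype(int)
print(binary_targets)  # [0 1 0]
```

Taking the argmax of the raw scores, rather than of the thresholded 0/1 outputs, avoids ties when several perceptrons (or none) fire for the same sample.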

In [None]:
class MultiClassPerceptron:
    """
    Multi-class Perceptron using One-vs-Rest strategy.
    Maintains one BinaryPerceptron per class.
    """
    
    def __init__(self, num_of_features, alpha=0.01):
        self.num_classes = 10
        self.perceptrons = []
        
        for i in range(self.num_classes):
            np.random.seed(42 + i)  # Different seed for each perceptron
            perceptron = BinaryPerceptron(num_of_features, alpha)
            self.perceptrons.append(perceptron)
    
    def predict(self, X):
        """
        Predict class labels for input X.
        X can be a single sample (1D) or batch of samples (2D).
        Returns predictions for each sample.
        """
        if X.ndim == 1:
            scores = [p.weighted_sum(X) for p in self.perceptrons]  # Single sample
            return np.argmax(scores)
        else:
            predictions = []  # Batch of samples
            for sample in X:
                scores = [p.weighted_sum(sample) for p in self.perceptrons]
                predictions.append(np.argmax(scores))
            return np.array(predictions)
    
    def apply_learning_rule(self, x, y):
        """
        Train all perceptrons in One-vs-Rest fashion.
        x is a single sample, y is the true class label (0..num_classes-1).
        """
        for i, perceptron in enumerate(self.perceptrons):
            binary_label = 1 if i == y else 0
            perceptron.apply_learning_rule(x, binary_label)
    
    def __repr__(self):
        return f"MultiClassPerceptron(num_classes={self.num_classes}, alpha={self.perceptrons[0].alpha})"

<a id='section3'></a>
# 3. Training
Now we want to train our models and see how well they fit the input data. To do this, we define the `PerceptronTrainer`, a class that handles training with flexible stopping criteria. It works with any model exposing `predict` and `apply_learning_rule`, so the same trainer serves both the `BinaryPerceptron` and the `MultiClassPerceptron`. The trainer accepts training data and optional validation data and supports three types of stopping: a fixed number of epochs, an error threshold, and early stopping based on validation accuracy.

The **__init__** method sets up the perceptron, datasets, stopping parameters, and internal trackers for epochs and validation performance. The **shuffle_data** method randomly shuffles the training data at the start of each epoch so the perceptron sees the samples in a different order every time, which helps improve generalization and reduces bias from sample order. The **accuracy** method computes how many predictions are correct as a fraction of total samples, allowing us to measure model performance on both training and validation sets.
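
The two helpers described above rest on two small NumPy idioms, sketched here on toy arrays (all values are illustrative): indexing `X` and `y` with the same permutation keeps every sample attached to its label, and accuracy is the mean of an element-wise comparison.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.arange(10).reshape(5, 2)      # row i is [2i, 2i + 1]
y = np.arange(5)                     # label i belongs to row i

indices = rng.permutation(len(X))    # one permutation shared by X and y
X_shuffled, y_shuffled = X[indices], y[indices]

# Each shuffled row still carries its own label.
assert np.array_equal(X_shuffled[:, 0] // 2, y_shuffled)

# Accuracy is the fraction of positions where prediction and label agree.
y_pred = np.array([0, 1, 2, 0, 0])
y_true = np.array([0, 1, 2, 3, 4])
accuracy = np.mean(y_pred == y_true)
print(accuracy)  # 0.6
```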

The **stopping_criteria** method checks whether training should stop. For fixed epochs, training stops when the maximum number of epochs is reached. For error threshold stopping, it halts when the training error drops below a chosen value. For early stopping, it monitors validation accuracy and stops if no improvement occurs over a set number of epochs (patience), helping prevent overfitting.

The **train** method runs the main training loop: for each epoch it shuffles the data, applies the perceptron learning rule to each sample, computes training accuracy, prints progress, and checks whether the stopping condition is met. When training completes, it returns the trained perceptron. This trainer provides a consistent and convenient way to experiment with different training strategies and observe how the perceptron learns from data.
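
The patience-based rule can be isolated into a few lines. The sketch below is a simplified stand-in for the trainer's logic, assuming each epoch's validation accuracy is compared against the best value from earlier epochs; the function name `early_stopping_epoch` and the accuracy sequences are hypothetical.

```python
def early_stopping_epoch(val_accuracies, patience, min_delta=0.0):
    """Return the 1-based epoch at which patience-based early stopping would halt,
    or None if the sequence ends first."""
    best = 0.0
    epochs_without_improvement = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best + min_delta:
            # Improvement beyond min_delta: remember it and reset the counter.
            best = acc
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            return epoch
    return None

stalled = early_stopping_epoch([0.50, 0.60, 0.60, 0.59, 0.70], patience=2)
improving = early_stopping_epoch([0.50, 0.60, 0.65, 0.70], patience=2)
print(stalled)    # 4: epochs 3 and 4 fail to beat the best value of 0.60
print(improving)  # None: every epoch improves, so training is never cut short
```

In the first sequence training halts before the jump to 0.70 at epoch 5, which shows the trade-off behind `patience`: a small value saves time but can stop just before a late improvement.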


In [None]:
class PerceptronTrainer:
    """Trainer class for Binary Perceptron."""
    def __init__(self, perceptron, X_train, y_train, X_val=None, y_val=None,
                 stopping_criterion="fixed_epochs", total_epochs=50,
                 error_threshold=0.05, patience=5, min_delta=1e-4):
        """Initialize trainer with options for stopping criteria."""
        self.perceptron = perceptron
        self.X_train = X_train
        self.y_train = y_train
        self.X_val = X_val
        self.y_val = y_val

        self.stopping_criterion = stopping_criterion
        self.total_epochs = total_epochs  
        self.error_threshold = error_threshold
        self.patience = patience
        self.min_delta = min_delta

        self.epochs_trained = 0
        
        self.stopped_reason = None
        self.epochs_without_improvement = 0
        self.val_accuracies = []

    def shuffle_data(self, X, y):
        """Return shuffled copies of X and y."""
        indices = np.random.permutation(len(X))
        return X[indices], y[indices]

    def accuracy(self, X, y):
        """Compute accuracy for given data."""
        return np.mean(self.perceptron.predict(X) == y)
        
    def stopping_criteria(self, epoch, train_error_rate, val_accuracy=None):
        """Checks stopping criteria and returns True if training should stop."""
        if self.stopping_criterion == 'fixed_epochs':
            if epoch >= self.total_epochs:   
                self.stopped_reason = f"Reached total epochs ({self.total_epochs})"
                return True
    
        elif self.stopping_criterion == 'error_threshold':
            if train_error_rate <= self.error_threshold:
                self.stopped_reason = (
                    f"Training error ({train_error_rate:.4f}) "
                    f"below threshold ({self.error_threshold})"
                )
                return True
            if epoch >= self.total_epochs:  
                self.stopped_reason = (
                    f"Reached total epochs ({self.total_epochs}) "
                    f"without meeting error threshold"
                )
                return True
    
        elif self.stopping_criterion == 'early_stopping':
            if val_accuracy is None:
                raise ValueError("Validation data required for early stopping")
                
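            # train() appends the current epoch's accuracy before calling this method,
            # so compare against the best of the *previous* epochs only. Including the
            # current value would make the "no improvement" test succeed every epoch.
            previous = self.val_accuracies[:-1]
            best_val_acc = max(previous) if previous else 0.0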
    
            if val_accuracy <= best_val_acc + self.min_delta:
                self.epochs_without_improvement += 1
            else:
                self.epochs_without_improvement = 0
    
            if self.epochs_without_improvement >= self.patience:
                self.stopped_reason = (
                    f"No improvement for {self.patience} epochs (early stopping)"
                )
                return True
    
            if epoch >= self.total_epochs:
                self.stopped_reason = f"Reached total epochs ({self.total_epochs})"
                return True
    
        return False
    
    def train(self):
        """Train perceptron according to the chosen stopping criterion."""      
        for epoch in range(1, self.total_epochs + 1):
            X_shuffled, y_shuffled = self.shuffle_data(self.X_train, self.y_train)
            for x, y in zip(X_shuffled, y_shuffled):
                self.perceptron.apply_learning_rule(x, y)
                    
            train_acc = self.accuracy(self.X_train, self.y_train)
            train_error_rate = 1 - train_acc
    
            val_accuracy = None
            if self.X_val is not None and self.y_val is not None:
                val_accuracy = self.accuracy(self.X_val, self.y_val)
                self.val_accuracies.append(val_accuracy)
                print(f"Epoch {epoch}/{self.total_epochs} | Train Acc: {train_acc*100:.2f}% | Val Acc: {val_accuracy*100:.2f}%")
            else:
                print(f"Epoch {epoch}/{self.total_epochs} | Train Acc: {train_acc*100:.2f}%")

            self.epochs_trained = epoch
            
            if self.stopping_criteria(epoch, train_error_rate, val_accuracy):
                print(f"Stopping: {self.stopped_reason}")
                break
    
        print("Training finished.")
        return self.perceptron


In [None]:
if __name__ == "__main__":
    loader = Loader('simpsons-mnist-master/dataset')
    try:
        print("Data preprocessing in progress......") 
        all_data = loader.load_all_data()
        
        X_train, y_train = all_data['grayscale']['train']
        X_val, y_val = all_data['grayscale']['validation']
        X_test, y_test = all_data['grayscale']['test']
        print(f"Grayscale - Training: {X_train.shape}, Validation: {X_val.shape}, Test: {X_test.shape}")
        
        X_train_rgb, y_train_rgb = all_data['rgb']['train']
        X_val_rgb, y_val_rgb = all_data['rgb']['validation']
        X_test_rgb, y_test_rgb = all_data['rgb']['test']
        print(f"RGB - Training: {X_train_rgb.shape}, Validation: {X_val_rgb.shape}, Test: {X_test_rgb.shape}")
        print("Data loading completed successfully!\n")
        
        def compare_stopping_criteria(
            X_train, y_train, X_val, y_val, X_test, y_test, data_name="Data",
            fixed_epochs=100, error_threshold=0.05, patience=10, min_delta=1e-4, total_epochs=200
        ):
            """
            Compare different stopping criteria for training the perceptron.
            """
            criteria_results = {}
            
            criteria_settings = [
                ('fixed_epochs', {'total_epochs': fixed_epochs}),
                ('error_threshold', {'total_epochs': total_epochs, 'error_threshold': error_threshold}),
                ('early_stopping', {'total_epochs': total_epochs, 'patience': patience, 'min_delta': min_delta})
            ]
            
            for criterion_name, params in criteria_settings:
                perceptron = MultiClassPerceptron(
                    num_of_features=X_train.shape[1],
                    alpha=0.001  # Learning rate (small for stability)
                )
                trainer = PerceptronTrainer(
                    perceptron, X_train, y_train, X_val, y_val,
                    stopping_criterion=criterion_name, **params
                )
                trained_perceptron = trainer.train()
                test_accuracy = trainer.accuracy(X_test, y_test)
                
                criteria_results[criterion_name] = {
                    'epochs_trained': trainer.epochs_trained,
                    'final_train_acc': trainer.accuracy(X_train, y_train),
                    'final_val_acc': trainer.accuracy(X_val, y_val) if X_val is not None else None,
                    'test_accuracy': test_accuracy,
                    'stopped_reason': trainer.stopped_reason
                }
                
            print(f"\nSTOPPING CRITERIA COMPARISON ({data_name})")
            print("="*60)
            for criterion, results in criteria_results.items():
                print(f"{criterion:15s}: {results['epochs_trained']:3d} epochs")
                print(f"{'':17s} Train: {results['final_train_acc']:.4f}")
                print(f"{'':17s} Val:   {results['final_val_acc']:.4f}")
                print(f"{'':17s} Test:  {results['test_accuracy']:.4f}")
                print(f"{'':17s} Reason: {results['stopped_reason']}")
                print("-"*60)
        
        compare_stopping_criteria(
            X_train, y_train, X_val, y_val, X_test, y_test,
            data_name="Grayscale",
            fixed_epochs=400, error_threshold=0.25, patience=50, min_delta=0.001, total_epochs=400
        )
          
        compare_stopping_criteria(
            X_train_rgb, y_train_rgb, X_val_rgb, y_val_rgb, X_test_rgb, y_test_rgb,
            data_name="RGB",
            fixed_epochs=600, error_threshold=0.20, patience=25, min_delta=0.001, total_epochs=600
        )
        
    except FileNotFoundError:
        print("Error: Could not find directory. Please ensure the dataset path is correct.")
    except Exception as e:
        print(f"Error loading or training data: {e}")


<a id='section4'></a>
# 4. Hyperparameter tuning
Hyperparameter tuning means trying out different settings to find the best ones for training a machine learning model.  
These settings, called **hyperparameters**, control how the model learns but are not learned from the data itself.  
They include things like:

- **Learning rate** – how fast the model updates during training.  
- **Weight initialization** – how the model starts its learning.  
- **Input normalization** – how the input data is scaled or adjusted.  

When training two models, one using RGB images and the other using grayscale images, we try different combinations of these hyperparameters to see which settings work best for each input type.
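
Trying combinations amounts to enumerating a grid. The sketch below shows one way to build such a grid with `itertools.product`. The candidate learning rates and the normalization names (`"none"`, `"minmax"`, `"zscore"`) are hypothetical placeholders; the initialization names match the strategies used later in this section.

```python
from itertools import product

# Hypothetical candidate values; the actual search space is a design choice.
learning_rates = [0.0001, 0.001, 0.01]
init_strategies = ["zero", "constant", "uniform", "gaussian"]
normalizations = ["none", "minmax", "zscore"]

# One dictionary per configuration to train and score on the validation set.
grid = [
    {"alpha": alpha, "init_strategy": init, "normalization": norm}
    for alpha, init, norm in product(learning_rates, init_strategies, normalizations)
]

print(len(grid))  # 36 combinations: 3 * 4 * 3
print(grid[0])
```

Each configuration would be trained separately and compared on validation accuracy, keeping the test set untouched until the final evaluation.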

### What Each Hyperparameter Does

**1. Learning Rate**  
Controls how big the steps are when the model updates during training.  
- If it’s too high, the model might overshoot and miss the best solution.  
- If it’s too low, training can be very slow and take longer to converge.  

**2. Weight Initialization**  
Determines how the model’s weights are set before training begins.  
Common methods include:  
- **Zero Initialization:** All weights start at zero.  
- **Constant Initialization:** All weights start at the same chosen value.  
- **Uniform Initialization:** Weights are randomly selected from a uniform range.  
- **Gaussian Initialization:** Weights are randomly selected from a normal (bell curve) distribution.  

**3. Normalization Technique**  
Adjusts the input data before training so the model learns more effectively.  
Options include:  
- **No Normalization:** Use raw pixel values directly.  
- **Normalization to [0, 1]:** Scale pixel values so they lie between 0 and 1.  
- **Z-Score Normalization:** Adjust pixel values based on the mean and standard deviation of the dataset.  

By testing different combinations of these hyperparameters, we can find the settings that help each model learn faster and achieve better accuracy.
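
The three normalization options listed above can be sketched as a single helper. This is an illustrative version under stated assumptions: the function name `normalize` and the method names are hypothetical, the z-score variant uses per-pixel statistics (a single global mean and standard deviation would be an equally valid reading), and the statistics are taken from the training set only so that no information leaks from validation or test data.

```python
import numpy as np

def normalize(X_train, X_other, method="minmax"):
    """Scale raw pixel arrays; any statistics come from the training set only."""
    X_train = X_train.astype(np.float32)
    X_other = X_other.astype(np.float32)
    if method == "none":
        return X_train, X_other
    if method == "minmax":
        return X_train / 255.0, X_other / 255.0
    if method == "zscore":
        mean = X_train.mean(axis=0)
        std = X_train.std(axis=0) + 1e-8   # guard against constant pixels
        return (X_train - mean) / std, (X_other - mean) / std
    raise ValueError(f"Unknown normalization method: {method}")

# Synthetic raw pixel data standing in for flattened grayscale images.
rng = np.random.default_rng(0)
X_train = rng.integers(0, 256, size=(100, 784))
X_val = rng.integers(0, 256, size=(20, 784))

Z_train, Z_val = normalize(X_train, X_val, method="zscore")
M_train, M_val = normalize(X_train, X_val, method="minmax")
print(Z_train.mean(), Z_train.std())   # close to 0 and 1 respectively
print(M_train.min(), M_train.max())    # within [0, 1]
```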

We will now define the `EnhancedBinaryPerceptron` class, which extends the basic `BinaryPerceptron`. This enhanced version supports multiple strategies for initializing the weights, making it ideal for hyperparameter tuning where different starting points can impact training performance.

The **__init__** method takes the number of features, a learning rate `alpha`, and an `init_strategy` parameter that can be `'zero'`, `'constant'`, `'uniform'`, or `'gaussian'`. Based on this strategy, the weights are initialized as all zeros, a fixed constant value, random values from a uniform range, or random values from a Gaussian distribution. The bias is always initialized to `0.0`.

The **__repr__** method provides a short summary of the perceptron’s initialization strategy and learning rate. Using this enhanced class makes it possible to experiment with different initialization strategies, which can influence how quickly the model converges and the final accuracy achieved.  




In [None]:
class EnhancedBinaryPerceptron(BinaryPerceptron):
    """Enhanced Binary Perceptron with multiple initialization strategies."""

    def __init__(self, num_of_features, alpha=0.01, init_strategy='gaussian'):
        super().__init__(num_of_features, alpha)  # Call the parent constructor
        
        self.init_strategy = init_strategy

        if init_strategy == 'zero':
            self.weights = np.zeros(num_of_features)
        elif init_strategy == 'constant':
            self.weights = np.full(num_of_features, 0.1)
        elif init_strategy == 'uniform':
            self.weights = np.random.uniform(-0.1, 0.1, num_of_features)
        elif init_strategy == 'gaussian':
            self.weights = np.random.normal(0, 0.1, num_of_features)
        else:
            raise ValueError(f"Unknown initialization strategy: {init_strategy}")

        self.bias = 0.0

    def __repr__(self):
        return f"EnhancedBinaryPerceptron(init={self.init_strategy}, alpha={self.alpha})"

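As a quick sanity check of the four strategies, the following minimal sketch reproduces the same NumPy calls outside the class and prints summary statistics for each. The helper name `init_weights` and the `seed` argument are hypothetical and not part of the notebook; the feature count 784 assumes flattened 28x28 grayscale images.

```python
import numpy as np

def init_weights(num_of_features, init_strategy="gaussian", seed=0):
    """Hypothetical helper mirroring the branches of EnhancedBinaryPerceptron.__init__."""
    rng = np.random.RandomState(seed)  # local generator, leaves the global seed untouched
    if init_strategy == "zero":
        return np.zeros(num_of_features)
    if init_strategy == "constant":
        return np.full(num_of_features, 0.1)
    if init_strategy == "uniform":
        return rng.uniform(-0.1, 0.1, num_of_features)
    if init_strategy == "gaussian":
        return rng.normal(0, 0.1, num_of_features)
    raise ValueError(f"Unknown initialization strategy: {init_strategy}")

# Compare the spread of the starting weights produced by each strategy
for strategy in ["zero", "constant", "uniform", "gaussian"]:
    w = init_weights(784, strategy)
    print(f"{strategy:8s} mean={w.mean():+.4f} std={w.std():.4f} "
          f"min={w.min():+.4f} max={w.max():+.4f}")
```

The `zero` and `constant` strategies give every feature the same starting weight, while `uniform` and `gaussian` break that symmetry with small random values.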

We will now define the `EnhancedMultiClassPerceptron` class, which extends the multi-class perceptron by using `EnhancedBinaryPerceptron` instances.  
This class follows a **One-vs-Rest** approach, creating one enhanced binary perceptron for each class in the dataset.

The **__init__** method takes the number of features, a learning rate `alpha`, an `init_strategy` for weight initialization, and the number of classes.  
For reproducibility, a different random seed is used for each perceptron, and each one is initialized according to the chosen strategy.  
This design allows experimentation with different weight starting points across all classes.

The **predict** method computes the scores from all perceptrons and selects the class with the highest score for each sample.  

The **apply_learning_rule** method updates all perceptrons using a One-vs-Rest approach:  
the perceptron corresponding to the correct class is updated with a positive target (`1`), while all others are updated with a negative target (`0`).  

Finally, the **__repr__** method provides a concise summary of the model’s initialization strategy, learning rate, and number of classes.


In [None]:
class EnhancedMultiClassPerceptron:
    """Enhanced Multi-class Perceptron using EnhancedBinaryPerceptrons."""

    def __init__(self, num_of_features, alpha=0.01, init_strategy='gaussian', num_classes=10):
        self.init_strategy = init_strategy
        self.perceptrons = []

       
        for i in range(num_classes):  # Create one EnhancedBinaryPerceptron per class
            np.random.seed(42 + i)  # reproducible but varied
            perceptron = EnhancedBinaryPerceptron(num_of_features, alpha, init_strategy)
            self.perceptrons.append(perceptron)

    def predict(self, X):
        """Predict class labels for samples in X."""
        scores = np.array([p.weighted_sum(X) for p in self.perceptrons]).T
        return np.argmax(scores, axis=1)

    def apply_learning_rule(self, x, y_true):
        """Update all perceptrons (1-vs-rest scheme)."""
        for i, perceptron in enumerate(self.perceptrons):
            target = 1 if y_true == i else 0
            perceptron.apply_learning_rule(x, target)

    def __repr__(self):
        return f"EnhancedMultiClassPerceptron(init={self.init_strategy}, alpha={self.perceptrons[0].alpha})"

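The One-vs-Rest prediction and target construction can be illustrated on a toy problem. This is a minimal sketch that assumes each binary perceptron's `weighted_sum` computes a dot product plus a bias; the weight values and the helper `one_vs_rest_targets` are made up for illustration.

```python
import numpy as np

# Toy setup: 3 classes, 2 features. Each row holds one binary perceptron's weights.
W = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0, -1.0]])
b = np.zeros(3)

X = np.array([[ 2.0,  0.5],
              [ 0.1,  3.0],
              [-1.0, -2.0]])

# One score per (sample, class); the predicted class has the highest score
scores = X @ W.T + b
predictions = np.argmax(scores, axis=1)
print(predictions)  # [0 1 2]

def one_vs_rest_targets(y_true, num_classes):
    """Hypothetical helper: binary target seen by each perceptron for one sample."""
    return [1 if y_true == i else 0 for i in range(num_classes)]

print(one_vs_rest_targets(1, 3))  # [0, 1, 0]
```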

We will now define the `EnhancedPerceptronTrainer` class, which trains a perceptron model with different stopping criteria and performance monitoring.  
This trainer works with both training and validation datasets and supports three stopping conditions: a fixed number of epochs, reaching a target error threshold, or early stopping with a patience parameter based on validation accuracy.

The **shuffle_data** method randomizes the order of samples at the start of each epoch, helping the model generalize better by reducing bias from sample order.  
The **accuracy** method computes the fraction of correctly predicted labels for any dataset. The **stopping_criteria** method checks whether training should stop, monitoring validation accuracy when early stopping is selected.

The **train** method runs the full training loop: in each epoch it shuffles the data, applies the perceptron learning rule to each sample, and records training and validation accuracy.  
Validation results are stored to detect improvements over time, and training halts once the chosen stopping condition is satisfied.  

In [None]:
class EnhancedPerceptronTrainer:
    def __init__(self, perceptron, X_train, y_train, X_val=None, y_val=None,
                 stopping_criterion="fixed_epochs", total_epochs=50,
                 error_threshold=0.05, patience=5, min_delta=1e-4):
        self.perceptron = perceptron
        self.X_train, self.y_train = X_train, y_train
        self.X_val, self.y_val = X_val, y_val
        self.stopping_criterion, self.total_epochs = stopping_criterion, total_epochs
        self.error_threshold, self.patience, self.min_delta = error_threshold, patience, min_delta

        self.epochs_trained = 0
        self.stopped_reason = None
        self.epochs_without_improvement = 0
        self.val_accuracies = []

    def shuffle_data(self, X, y):
        idx = np.random.permutation(len(X))
        return X[idx], y[idx]

    def accuracy(self, X, y):
        return np.mean(self.perceptron.predict(X) == y)

    def stopping_criteria(self, epoch, train_error, val_acc=None):
        """Return True when training should stop and record why in stopped_reason."""
        reason = None

        if self.stopping_criterion == 'error_threshold':
            if train_error <= self.error_threshold:
                reason = 'error_threshold'

        elif self.stopping_criterion == 'early_stopping':
            if val_acc is None:
                raise ValueError("Validation data required for early stopping")
            # train() appends the current val_acc before calling this method,
            # so compare against the best accuracy of the earlier epochs only.
            previous = self.val_accuracies[:-1]
            best_val = max(previous) if previous else 0
            if val_acc <= best_val + self.min_delta:
                self.epochs_without_improvement += 1
            else:
                self.epochs_without_improvement = 0
            if self.epochs_without_improvement >= self.patience:
                reason = 'early_stopping'

        # Every criterion is capped at total_epochs.
        if reason is None and epoch >= self.total_epochs:
            reason = 'max_epochs'

        self.stopped_reason = reason
        return reason is not None

    def train(self, verbose=False):
        for epoch in range(1, self.total_epochs + 1):
            X_shuf, y_shuf = self.shuffle_data(self.X_train, self.y_train)

            for x, y in zip(X_shuf, y_shuf):
                self.perceptron.apply_learning_rule(x, y)

            train_acc = self.accuracy(self.X_train, self.y_train)
            train_error = 1 - train_acc
            val_acc = self.accuracy(self.X_val, self.y_val) if self.X_val is not None else None

            if val_acc is not None:
                self.val_accuracies.append(val_acc)

            self.epochs_trained = epoch

            if self.stopping_criteria(epoch, train_error, val_acc):
                if verbose:
                    print(f"Stopping at epoch {epoch}")
                break

            if verbose:
                # Test against None so a validation accuracy of exactly 0.0 is still reported
                if val_acc is not None:
                    print(f"Epoch {epoch}: Train={train_acc:.4f}, Val={val_acc:.4f}")
                else:
                    print(f"Epoch {epoch}: Train={train_acc:.4f}")

        return self.perceptron

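The patience rule used for early stopping can be isolated from the trainer and checked on a hand-written accuracy history. This is a standalone sketch; the function name `early_stopping_epoch` and the example history are hypothetical.

```python
def early_stopping_epoch(val_accuracies, patience=5, min_delta=1e-4):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = 0.0
    epochs_without_improvement = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        # An epoch counts as an improvement only if it beats the best
        # earlier accuracy by more than min_delta
        if acc > best + min_delta:
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        best = max(best, acc)
        if epochs_without_improvement >= patience:
            return epoch
    return None

history = [0.30, 0.35, 0.40, 0.40, 0.39, 0.41, 0.41, 0.40]
print(early_stopping_epoch(history, patience=2, min_delta=0.001))  # 5
```

With `patience=2`, epochs 4 and 5 both fail to beat the best accuracy of 0.40 by more than `min_delta`, so training would stop at epoch 5. A larger patience tolerates that plateau and continues to the improvement at epoch 6.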

We will now define the `DataNormalizer` class, a helper used to preprocess input data before passing it to a perceptron model.  
This class provides a static **normalize** method that supports multiple normalization strategies.

If `strategy='none'`, the data is returned unchanged.  
If `strategy='minmax'`, pixel values are scaled to the range `[0, 1]` by dividing by `255.0`.  
If `strategy='zscore'`, the data is standardized by subtracting the mean and dividing by the standard deviation for each feature, with safe handling for features with zero variance.  


In [None]:
class DataNormalizer:
    """Input normalization strategies for the perceptron models."""

    @staticmethod
    def normalize(X, strategy='minmax'):
        if strategy == 'none':
            return X
        elif strategy == 'minmax':
            # Scale 8-bit pixel values to [0, 1]
            return X.astype(np.float32) / 255.0
        elif strategy == 'zscore':
            mean = np.mean(X, axis=0, keepdims=True)
            std = np.std(X, axis=0, keepdims=True)
            std[std == 0] = 1  # avoid division by zero for constant features
            return (X - mean) / std
        else:
            raise ValueError(f"Unknown normalization strategy: {strategy}")

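Because `normalize` computes the mean and standard deviation from whatever array it receives, each split passed to it with `'zscore'` is standardized with its own statistics. A common alternative, which the class above does not implement, is to compute the statistics on the training split only and reuse them for the other splits. A minimal sketch, with the hypothetical helper names `fit_zscore` and `apply_zscore` and made-up data:

```python
import numpy as np

def fit_zscore(X_train):
    """Hypothetical helper: per-feature mean and std computed from the training split."""
    mean = X_train.mean(axis=0, keepdims=True)
    std = X_train.std(axis=0, keepdims=True)
    std[std == 0] = 1  # constant features: avoid division by zero
    return mean, std

def apply_zscore(X, mean, std):
    """Hypothetical helper: standardize any split with previously fitted statistics."""
    return (X - mean) / std

# Made-up data: the middle feature is constant across the training samples
X_train = np.array([[0., 10., 5.],
                    [2., 10., 7.],
                    [4., 10., 9.]])
X_val = np.array([[2., 10., 11.]])

mean, std = fit_zscore(X_train)
print(apply_zscore(X_train, mean, std))
print(apply_zscore(X_val, mean, std))
```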
We will now define the `HyperparameterTuner` class, which automates the search for the best training settings for `EnhancedMultiClassPerceptron` models on both RGB and grayscale datasets.

The class takes training, validation, and test datasets, along with a mode (`"rgb"` or `"grayscale"`) that controls which hyperparameter ranges to explore. The hyperparameters include the learning rate, weight initialization strategy, and input normalization method. For RGB data, early stopping with patience is applied, while grayscale data uses a fixed number of epochs.

The **run_search** method performs a grid search over all combinations of hyperparameters. For each configuration, it normalizes the data with `DataNormalizer`, initializes the perceptron with the chosen weight strategy, trains it with `EnhancedPerceptronTrainer`, and evaluates performance on the training, validation, and test sets. It records key metrics such as accuracy, number of epochs trained, training time, and stopping reason.

Once all combinations are tested, the results are sorted by test accuracy, and the top five configurations are printed.  
Finally, the best-performing setup is highlighted.


In [None]:
import time
class HyperparameterTuner:
    """Hyperparameter tuner for RGB and Grayscale MultiClassPerceptrons using DataNormalizer."""

    def __init__(self, X_train, y_train, X_val, y_val, X_test, y_test, mode="rgb"):
        self.X_train, self.y_train = X_train, y_train
        self.X_val, self.y_val = X_val, y_val
        self.X_test, self.y_test = X_test, y_test
        self.mode = mode.lower()
        
        if self.mode == "rgb":
            self.learning_rates = [0.01, 0.005, 0.001, 0.0005]
            self.init_strategies = ['gaussian', 'uniform', 'constant', 'zero']
            self.norm_strategies = ['minmax', 'zscore', 'none']
            self.total_epochs = 50
            self.stopping_criterion = "early_stopping"
            self.patience = 10
            self.min_delta = 0.001
        elif self.mode == "grayscale":
            self.learning_rates = [0.1, 0.05, 0.01, 0.005]
            self.init_strategies = ['gaussian', 'uniform', 'constant']
            self.norm_strategies = ['minmax', 'zscore']
            self.total_epochs = 30
            self.stopping_criterion = "fixed_epochs"
            self.patience = None
            self.min_delta = None
        else:
            raise ValueError("mode must be 'rgb' or 'grayscale'")

        self.all_results = []
        self.best_config = {'test_acc': 0, 'config': None, 'model': None}

    def run_search(self):
        """Run grid search over learning rates, initialization, and normalization."""
        total_combinations = len(self.learning_rates) * len(self.init_strategies) * len(self.norm_strategies)
        combo_count = 0
        print(f"HYPERPARAMETER SEARCH - {self.mode.upper()}")

        for lr in self.learning_rates:
            for init_strat in self.init_strategies:
                for norm_strat in self.norm_strategies:
                    combo_count += 1
                    print(f"\n--- Combo {combo_count}/{total_combinations}: "
                          f"lr={lr}, init={init_strat}, norm={norm_strat} ---")
                    try:
                        # Normalize using DataNormalizer
                        X_train_norm = DataNormalizer.normalize(self.X_train.copy(), norm_strat)
                        X_val_norm = DataNormalizer.normalize(self.X_val.copy(), norm_strat)
                        X_test_norm = DataNormalizer.normalize(self.X_test.copy(), norm_strat)

                        # Initialize model
                        perceptron = EnhancedMultiClassPerceptron(
                            num_of_features=X_train_norm.shape[1],
                            alpha=lr,
                            init_strategy=init_strat
                        )

                        # Setup trainer arguments
                        trainer_args = {
                            'perceptron': perceptron,
                            'X_train': X_train_norm,
                            'y_train': self.y_train,
                            'X_val': X_val_norm,
                            'y_val': self.y_val,
                            'stopping_criterion': self.stopping_criterion,
                            'total_epochs': self.total_epochs
                        }
                        if self.stopping_criterion == "early_stopping":
                            trainer_args['patience'] = self.patience
                            trainer_args['min_delta'] = self.min_delta

                        trainer = EnhancedPerceptronTrainer(**trainer_args)

                        # Train and measure time
                        start_time = time.time()
                        trained_model = trainer.train(verbose=False)
                        training_time = time.time() - start_time

                        # Evaluate
                        test_acc = trainer.accuracy(X_test_norm, self.y_test)
                        val_acc = trainer.accuracy(X_val_norm, self.y_val)
                        train_acc = trainer.accuracy(X_train_norm, self.y_train)

                        # Record results
                        result = {
                            'learning_rate': lr,
                            'init_strategy': init_strat,
                            'norm_strategy': norm_strat,
                            'epochs_trained': trainer.epochs_trained,
                            'train_acc': train_acc,
                            'val_acc': val_acc,
                            'test_acc': test_acc,
                            'training_time': training_time,
                            'stopped_reason': trainer.stopped_reason,
                            'X_val_norm': X_val_norm  # store for evaluation
                        }
                        self.all_results.append(result)

                        # Update best model
                        if test_acc > self.best_config['test_acc']:
                            self.best_config['test_acc'] = test_acc
                            self.best_config['config'] = result
                            self.best_config['model'] = trained_model

                        print(f"  → Test Acc: {test_acc:.4f}, Val Acc: {val_acc:.4f}, "
                              f"Train Acc: {train_acc:.4f}, Epochs: {trainer.epochs_trained}")

                    except Exception as e:
                        print(f"  → ERROR: {str(e)}")
                        continue
        self.all_results.sort(key=lambda x: x['test_acc'], reverse=True)
        print(f"\nTOP 5 CONFIGURATIONS:")
        for i, res in enumerate(self.all_results[:5]):
            print(f"{i+1}. Test Acc: {res['test_acc']:.4f} | LR: {res['learning_rate']} | "
                  f"Init: {res['init_strategy']:8s} | Norm: {res['norm_strategy']:6s} | "
                  f"Epochs: {res['epochs_trained']:3d}")

        best = self.best_config['config']
        print(f"\nBEST CONFIGURATION:")
        print(f"Learning Rate: {best['learning_rate']}")
        print(f"Initialization: {best['init_strategy']}")
        print(f"Normalization: {best['norm_strategy']}")
        print(f"Test Accuracy: {best['test_acc']:.4f}")
        print(f"Validation Accuracy: {best['val_acc']:.4f}")
        print(f"Training Accuracy: {best['train_acc']:.4f}")
        print(f"Epochs Trained: {best['epochs_trained']}")
        print(f"Training Time: {best['training_time']:.2f}s")

        return self.all_results, self.best_config

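The three nested loops in `run_search` enumerate the Cartesian product of the hyperparameter lists. The sketch below shows the same enumeration with `itertools.product`, using the RGB grid values from `__init__`; it only counts and lists the configurations and does not train anything.

```python
from itertools import product

# RGB search space, copied from HyperparameterTuner.__init__
learning_rates = [0.01, 0.005, 0.001, 0.0005]
init_strategies = ['gaussian', 'uniform', 'constant', 'zero']
norm_strategies = ['minmax', 'zscore', 'none']

# Same visiting order as the nested for-loops: the last list varies fastest
grid = list(product(learning_rates, init_strategies, norm_strategies))

print(len(grid))   # 48, i.e. 4 * 4 * 3
print(grid[0])     # (0.01, 'gaussian', 'minmax')
```

The grayscale grid has 4 learning rates, 3 initialization strategies, and 2 normalization strategies, giving 24 configurations.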

In [None]:
if __name__ == "__main__":
    loader = Loader('simpsons-mnist-master/dataset')  # Initialize the loader class.
    try:
        print("Data preprocessing in progress......")
        all_data = loader.load_all_data()
        
        if all_data is None:
            raise ValueError("Data loader returned None")

        X_train, y_train = all_data['grayscale']['train']
        X_val, y_val = all_data['grayscale']['validation']
        X_test, y_test = all_data['grayscale']['test']

        X_train_rgb, y_train_rgb = all_data['rgb']['train']
        X_val_rgb, y_val_rgb = all_data['rgb']['validation']
        X_test_rgb, y_test_rgb = all_data['rgb']['test']

        print(f"Grayscale - Training: {X_train.shape}, Validation: {X_val.shape}, Test: {X_test.shape}")
        print(f"RGB - Training: {X_train_rgb.shape}, Validation: {X_val_rgb.shape}, Test: {X_test_rgb.shape}")
        print("Data loading completed successfully!\n")
 
        rgb_tuner = HyperparameterTuner(X_train_rgb, y_train_rgb, X_val_rgb, y_val_rgb, X_test_rgb, y_test_rgb, mode="rgb")
        rgb_results, rgb_best_config = rgb_tuner.run_search()
        
        grayscale_tuner = HyperparameterTuner(X_train, y_train, X_val, y_val, X_test, y_test, mode="grayscale")
        grayscale_results, grayscale_best_config = grayscale_tuner.run_search()
        
        print("COMPARATIVE ANALYSIS")
        rgb_best = max(rgb_results, key=lambda x: x['test_acc'])
        grayscale_best = max(grayscale_results, key=lambda x: x['test_acc'])

        print(f"\nBEST RGB PERFORMANCE:")
        print(f"  Test Accuracy: {rgb_best['test_acc']:.4f}")
        print(f"  Configuration: LR={rgb_best['learning_rate']}, Init={rgb_best['init_strategy']}, Norm={rgb_best['norm_strategy']}")

        print(f"\nBEST GRAYSCALE PERFORMANCE:")
        print(f"  Test Accuracy: {grayscale_best['test_acc']:.4f}")
        print(f"  Configuration: LR={grayscale_best['learning_rate']}, Init={grayscale_best['init_strategy']}, Norm={grayscale_best['norm_strategy']}")

        print(f"\nPERFORMANCE GAP: {rgb_best['test_acc'] - grayscale_best['test_acc']:.4f}")
        
    except Exception as e:
        print(f"Error loading or training data: {e}")


### Imports, Installations, and Downloads

We will use standard Python libraries for evaluation:

- **numpy** – for storing and manipulating arrays and matrices of numerical data.  
- **matplotlib** – for visualizing plots, including image data and training results.  
- **sklearn.metrics** – for evaluating model performance, including accuracy, precision, recall, F1-score, and confusion matrix visualization.


In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix, ConfusionMatrixDisplay

<a id='section5'></a>
# 5. Evaluation
We will now define the `Evaluator` class, which is responsible for assessing a trained model’s performance on validation and test data.  
This class provides a single **evaluate** method that computes several key metrics and visualizes results.

The method first predicts labels for the validation set and calculates validation accuracy.  It then predicts on the test set, computing test accuracy, precision, recall, and F1-score for each class. These metrics give a detailed view of the model’s strengths and weaknesses across different classes.

A confusion matrix is generated using `sklearn.metrics.confusion_matrix`, and visualized with `ConfusionMatrixDisplay`, showing how often each class is correctly or incorrectly predicted.  The confusion matrix is displayed as a heatmap with class names on both axes, making errors easy to interpret.

Finally, the method returns a dictionary containing validation accuracy, test accuracy, and per-class precision, recall, and F1-score, which can be logged or compared across models.


In [None]:
class Evaluator:
    @staticmethod
    def evaluate(model, X_val, y_val, X_test, y_test, class_names, title="Model Evaluation"):
        y_val_pred = model.predict(X_val)
        val_acc = accuracy_score(y_val, y_val_pred)
        
        y_test_pred = model.predict(X_test)
        test_acc = accuracy_score(y_test, y_test_pred)
        precision = precision_score(y_test, y_test_pred, average=None, zero_division=0)
        recall = recall_score(y_test, y_test_pred, average=None, zero_division=0)
        f1 = f1_score(y_test, y_test_pred, average=None, zero_division=0)
        
        cm = confusion_matrix(y_test, y_test_pred, labels=range(len(class_names)))
        disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=class_names)
        disp.plot(cmap="Blues", xticks_rotation=45)
        plt.title(f"{title} - Confusion Matrix")
        plt.show()
        
        return {
            "val_accuracy": val_acc,
            "test_accuracy": test_acc,
            "precision_per_class": precision.tolist(),
            "recall_per_class": recall.tolist(),
            "f1_per_class": f1.tolist()
        }

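To make the link between the confusion matrix and the per-class metrics concrete, the sketch below derives precision, recall, and F1 by hand from a small made-up matrix. It follows the `sklearn.metrics.confusion_matrix` convention of true classes on the rows and predicted classes on the columns; the numbers are purely illustrative.

```python
import numpy as np

# Hypothetical 3-class confusion matrix: rows = true class, columns = predicted class
cm = np.array([[5, 1, 0],
               [2, 3, 1],
               [0, 0, 4]])

tp = np.diag(cm).astype(float)        # correct predictions per class
predicted_totals = cm.sum(axis=0)     # how often each class was predicted
true_totals = cm.sum(axis=1)          # how often each class actually occurs

# Classes with an empty denominator get 0, matching zero_division=0
precision = np.divide(tp, predicted_totals, out=np.zeros_like(tp), where=predicted_totals > 0)
recall = np.divide(tp, true_totals, out=np.zeros_like(tp), where=true_totals > 0)
denom = precision + recall
f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
accuracy = tp.sum() / cm.sum()

print("precision:", np.round(precision, 3))
print("recall:   ", np.round(recall, 3))
print("f1:       ", np.round(f1, 3))
print("accuracy: ", accuracy)  # 0.75
```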

In [None]:
if __name__ == "__main__":
    try:
        loader = Loader('simpsons-mnist-master/dataset')
        all_data = loader.load_all_data()
        
        X_train_g, y_train_g = all_data['grayscale']['train']
        X_val_g, y_val_g = all_data['grayscale']['validation']
        X_test_g, y_test_g = all_data['grayscale']['test']
        
        X_train_rgb, y_train_rgb = all_data['rgb']['train']
        X_val_rgb, y_val_rgb = all_data['rgb']['validation']
        X_test_rgb, y_test_rgb = all_data['rgb']['test']
        
        class_names = [str(i) for i in range(10)]
        print("\nTuning hyperparameters for Grayscale dataset...")
        grayscale_tuner = HyperparameterTuner(
            X_train_g, y_train_g, X_val_g, y_val_g, X_test_g, y_test_g, mode="grayscale"
        )
        grayscale_results, grayscale_best_config = grayscale_tuner.run_search()
        print("\nTuning hyperparameters for RGB dataset...")
        rgb_tuner = HyperparameterTuner(
            X_train_rgb, y_train_rgb, X_val_rgb, y_val_rgb, X_test_rgb, y_test_rgb, mode="rgb"
        )
        rgb_results, rgb_best_config = rgb_tuner.run_search()
        print("\nEvaluating Grayscale best model...")
        gray_metrics = Evaluator.evaluate(
            model=grayscale_best_config['model'],
            X_val=grayscale_best_config['config']['X_val_norm'],
            y_val=y_val_g,
            X_test=DataNormalizer.normalize(X_test_g.copy(), grayscale_best_config['config']['norm_strategy']),
            y_test=y_test_g,
            class_names=class_names,
            title="Grayscale Final Model"
        )
        print("\nEvaluating RGB best model...")
        rgb_metrics = Evaluator.evaluate(
            model=rgb_best_config['model'],
            X_val=rgb_best_config['config']['X_val_norm'],
            y_val=y_val_rgb,
            X_test=DataNormalizer.normalize(X_test_rgb.copy(), rgb_best_config['config']['norm_strategy']),
            y_test=y_test_rgb,
            class_names=class_names,
            title="RGB Final Model"
        )
        print("\nComparison of final models:")
        print("\nGrayscale metrics:", gray_metrics)
        print("\nRGB metrics:", rgb_metrics)

    except Exception as e:
        print("Error during loading, training or evaluation:", e)


In [None]:
import matplotlib.pyplot as plt
import numpy as np
class PredictionVisualizer:
    def __init__(self, model, class_names=None, is_rgb=False):
        """
        Initialize a visualizer for predictions.

        model: trained perceptron (binary or multi-class)
        class_names: optional list of labels for mapping numeric outputs to names
        is_rgb: True if dataset is RGB, False if grayscale
        """
        self.model = model
        self.class_names = class_names
        self.is_rgb = is_rgb

    def _predict_label(self, x):
        """Predicts the class label for a single input x."""
        output = self.model.predict(x.reshape(1, -1))  # use predict instead of forward
        pred_label = output[0] if hasattr(output, "__len__") else output
        return pred_label

    def show_correct_only(self, X, y, n_samples=9):
        """
        Display a grid of n correctly predicted samples.

        X: input data
        y: true labels
        n_samples: number of correct predictions to display
        """
        # Predict all labels
        y_pred = self.model.predict(X)

        # Find indices where predictions are correct
        correct_indices = np.where(y_pred == y)[0]

        if len(correct_indices) == 0:
            print("No correctly predicted samples found.")
            return

        # Randomly choose up to n_samples
        chosen = np.random.choice(correct_indices, size=min(n_samples, len(correct_indices)), replace=False)

        plt.figure(figsize=(10, 10))
        for i, idx in enumerate(chosen):
            x_sample = X[idx]
            true_label = y[idx]
            pred_label = y_pred[idx]

            true_name = self.class_names[true_label] if self.class_names else str(true_label)
            pred_name = self.class_names[pred_label] if self.class_names else str(pred_label)

            grid_size = int(np.ceil(np.sqrt(n_samples)))  # smallest square grid that fits n_samples
            plt.subplot(grid_size, grid_size, i + 1)
            if self.is_rgb:
                img = x_sample.reshape(28, 28, 3)
                plt.imshow(img.astype(np.uint8))
            else:
                img = x_sample.reshape(28, 28)
                plt.imshow(img, cmap="gray")

            plt.title(f"T:{true_name}\nP:{pred_name}", fontsize=8)
            plt.axis("off")

        plt.tight_layout()
        plt.show()

In [None]:
if __name__ == "__main__":
    try:
        loader = Loader('simpsons-mnist-master/dataset')
        all_data = loader.load_all_data()

        X_train_g, y_train_g = all_data['grayscale']['train']
        X_val_g, y_val_g = all_data['grayscale']['validation']
        X_test_g, y_test_g = all_data['grayscale']['test']

        X_train_rgb, y_train_rgb = all_data['rgb']['train']
        X_val_rgb, y_val_rgb = all_data['rgb']['validation']
        X_test_rgb, y_test_rgb = all_data['rgb']['test']

        class_names = [str(i) for i in range(10)]
 
        print("\nTuning hyperparameters for Grayscale dataset...")
        grayscale_tuner = HyperparameterTuner(
            X_train_g, y_train_g, X_val_g, y_val_g, X_test_g, y_test_g, mode="grayscale"
        )
        grayscale_results, grayscale_best_config = grayscale_tuner.run_search()
 
        print("\nTuning hyperparameters for RGB dataset...")
        rgb_tuner = HyperparameterTuner(
            X_train_rgb, y_train_rgb, X_val_rgb, y_val_rgb, X_test_rgb, y_test_rgb, mode="rgb"
        )
        rgb_results, rgb_best_config = rgb_tuner.run_search()
        
        print("\nEvaluating Grayscale best model...")
        gray_metrics = Evaluator.evaluate(
            model=grayscale_best_config['model'],
            X_val=grayscale_best_config['config']['X_val_norm'],
            y_val=y_val_g,
            X_test=DataNormalizer.normalize(
                X_test_g.copy(), grayscale_best_config['config']['norm_strategy']
            ),
            y_test=y_test_g,
            class_names=class_names,
            title="Grayscale Final Model"
        )

        print("\nEvaluating RGB best model...")
        rgb_metrics = Evaluator.evaluate(
            model=rgb_best_config['model'],
            X_val=rgb_best_config['config']['X_val_norm'],
            y_val=y_val_rgb,
            X_test=DataNormalizer.normalize(
                X_test_rgb.copy(), rgb_best_config['config']['norm_strategy']
            ),
            y_test=y_test_rgb,
            class_names=class_names,
            title="RGB Final Model"
        )

        print("\nComparison of final models:")
        print("\nGrayscale metrics:", gray_metrics)
        print("\nRGB metrics:", rgb_metrics)
        print("\nVisualizing 9 correctly predicted samples...")

        gray_vis = PredictionVisualizer(
            model=grayscale_best_config['model'],
            class_names=class_names,
            is_rgb=False
        )
        gray_vis.show_correct_only(X_val_g, y_val_g, n_samples=9)

        rgb_vis = PredictionVisualizer(
            model=rgb_best_config['model'],
            class_names=class_names,
            is_rgb=True
        )
        rgb_vis.show_correct_only(X_val_rgb, y_val_rgb, n_samples=9)

    except Exception as e:
        print("Error during loading, training or evaluation:", e)
