# Image Classification Pipeline Summary

## 1. Datasets and Models Used
   - **Datasets**:
     - 4 datasets in total: three individual datasets and one combined dataset.
   - **Model**:
     - Custom lightweight CNN model with:
       - **Convolutional Layers**: Three layers with Batch Normalization, ReLU activation, and MaxPooling for spatial feature extraction.
       - **Fully Connected Layers**: Two layers with Batch Normalization and ReLU in the hidden layer.
       - **Output Layer**: Final fully connected layer for classification into 10 classes.
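
As a rough check on the architecture described above, the sketch below tallies the learnable parameters by hand. The channel widths (16, 32, 64), the 3x3 kernels, the 64-unit hidden layer, the 64x64 input, and the 10 output classes are taken from the `CustomCNN` code and run output later in this notebook. The helper names are hypothetical and exist only for this illustration.

```python
def conv_params(c_in, c_out, k=3):
    # Weights (c_in * c_out * k * k) plus one bias per output channel
    return c_in * c_out * k * k + c_out

def bn_params(channels):
    # Batch normalization learns a scale and a shift per channel
    return 2 * channels

def linear_params(n_in, n_out):
    # Weight matrix plus one bias per output unit
    return n_in * n_out + n_out

def count_custom_cnn_params(num_classes=10, input_size=64):
    widths = [3, 16, 32, 64]
    total = 0
    side = input_size
    for c_in, c_out in zip(widths, widths[1:]):
        total += conv_params(c_in, c_out) + bn_params(c_out)
        side //= 2  # each 2x2 max-pool halves the spatial size
    flat = widths[-1] * side * side
    total += linear_params(flat, 64) + bn_params(64)
    total += linear_params(64, num_classes)
    return flat, total

flat_features, total_params = count_custom_cnn_params()
print(flat_features, total_params)  # 4096 286794
```

The total agrees with the parameter count printed further down in the notebook, and the flattened size explains the `64 * 8 * 8` input of the first fully connected layer.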

## 2. Experiment Setup
   - **Hyperparameter Ranges**:
     - **Batch Size**: [16, 32, 64, 128].
     - **Learning Rate**: [0.00001, 0.00005, 0.0001, 0.0005, 0.001, 0.005].
     - **Epochs**: [15, 25, 35, 50].
   - Selected optimal hyperparameters for each model based on validation performance.
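
The search space above can be enumerated as a plain grid. The following is a minimal sketch, assuming an exhaustive search over every combination; `pick_best` is a hypothetical helper that expects validation accuracies you have measured yourself, and no scores are supplied here.

```python
from itertools import product

batch_sizes = [16, 32, 64, 128]
learning_rates = [0.00001, 0.00005, 0.0001, 0.0005, 0.001, 0.005]
epoch_counts = [15, 25, 35, 50]

# Every (batch size, learning rate, epochs) combination in the ranges above
grid = list(product(batch_sizes, learning_rates, epoch_counts))
print(len(grid))  # 96

def pick_best(val_accuracy_by_config):
    """Return the configuration with the highest validation accuracy.

    `val_accuracy_by_config` maps (batch_size, lr, epochs) tuples to a
    validation accuracy measured by the caller (hypothetical input).
    """
    return max(val_accuracy_by_config, key=val_accuracy_by_config.get)
```

A full grid of 96 runs is expensive, which may be why the training cell below fixes a single combination.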

## 3. Training and Validation Process
   - **Training Loop**:
     - Trained a model for each hyperparameter combination over the configured number of epochs.
     - Logged training and validation accuracy and loss per epoch.
     - Tracked total training time for each model.
   - **Validation**:
     - Monitored performance on validation data to track generalization and detect overfitting.
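
One way to act on the monitored validation loss is a patience-based stopping rule. This is only a sketch under that assumption: the notebook itself trains for a fixed number of epochs, and `should_stop`, `patience`, and `min_delta` are hypothetical names introduced here.

```python
def should_stop(val_losses, patience=3, min_delta=0.0):
    """Return True when the last `patience` epochs did not improve on the
    best validation loss seen before them by more than `min_delta`."""
    if len(val_losses) <= patience:
        return False  # not enough history to judge yet
    best_before = min(val_losses[:-patience])
    recent_best = min(val_losses[-patience:])
    return recent_best > best_before - min_delta

# First four validation losses from the training log in this notebook
print(should_stop([1.4755, 1.1688, 0.9779, 0.7976]))  # False
```

The loss is still falling in those epochs, so the rule would let training continue.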

## 4. Preprocessing Steps
   - **Image Enhancements**:
     - Applied Median Blur, Basic Sharpening, and Contrast Stretching.
     - Used CLAHE (Contrast Limited Adaptive Histogram Equalization) for enhanced contrast.
   - **Transformations**:
     - Resized images to 64x64.
     - Applied Random Horizontal Flip for data augmentation.
     - Normalized each channel with the ImageNet statistics: mean `[0.485, 0.456, 0.406]` and standard deviation `[0.229, 0.224, 0.225]`.
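
The normalization step maps each channel value `v` (already scaled to `[0, 1]` by `ToTensor`) to `(v - mean) / std`. A minimal pure-Python sketch of that arithmetic for a single pixel follows; the function names are hypothetical.

```python
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb):
    # rgb holds three channel values in [0, 1]
    return tuple((v - m) / s for v, m, s in zip(rgb, MEAN, STD))

def denormalize_pixel(rgb_norm):
    # Inverse mapping, useful when displaying a normalized image
    return tuple(v * s + m for v, m, s in zip(rgb_norm, MEAN, STD))

print(normalize_pixel(MEAN))  # (0.0, 0.0, 0.0)
```

A pixel equal to the channel means lands at zero, and a pure white pixel `(1.0, 1.0, 1.0)` maps to roughly 2.2 to 2.6 standard deviations above it.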

## 5. Evaluation Metrics and Visualizations
   - **Test Set Evaluation**:
     - Assessed using Macro-averaged F1 score, precision, recall, and per-class F1 scores.
   - **Visualizations**:
     - Confusion Matrix for class-wise prediction analysis.
     - Classification Report Heatmap with precision, recall, and F1 scores.
     - ROC Curves for multi-class AUC (Area Under the Curve) evaluation.
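
To make the macro-averaged F1 score concrete, the sketch below derives it from a confusion matrix whose rows are true labels and whose columns are predictions (the convention scikit-learn uses). The 2x2 matrix is a toy example invented for illustration, not a result from this project.

```python
def per_class_f1(cm):
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[r][k] for r in range(n)) - tp  # predicted k, truly another class
        fn = sum(cm[k]) - tp                       # truly k, predicted another class
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores

def macro_f1(cm):
    # Unweighted mean over classes, so rare classes count as much as common ones
    scores = per_class_f1(cm)
    return sum(scores) / len(scores)

toy_cm = [[8, 2],
          [1, 9]]
print(round(macro_f1(toy_cm), 4))  # 0.8496
```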

## 6. Key Outputs for Each Dataset-Model Combination
   - **Training and Validation Curves**:
     - Generated and saved plots for training and validation accuracy and loss.
   - **Model State Saving**:
     - Saved trained model states for potential future use.
   - **Detailed Metrics Visualizations**:
     - Produced classification reports, confusion matrices, and ROC curves for comprehensive performance analysis.


In [1]:
# Prints the installed versions of Python, NumPy, and PyTorch libraries
import sys
import numpy as np
import torch
print(f"Python Version: {sys.version}")
print(f"NumPy Version: {np.__version__}")
print(f"PyTorch Version: {torch.__version__}")

# Function to check GPU availability and display memory statistics using PyTorch's CUDA interface
def check_gpu_status():
    # Check if GPU is available
    if torch.cuda.is_available():
        print("CUDA is available. PyTorch is using GPU.\n")
        # Get the number of available GPUs
        num_gpus = torch.cuda.device_count()
        print(f"Number of GPUs available: {num_gpus}")
        # Loop through each GPU and display its details
        for gpu_id in range(num_gpus):
            gpu_name = torch.cuda.get_device_name(gpu_id)
            gpu_memory_allocated = torch.cuda.memory_allocated(gpu_id) / (1024 ** 3)  # In GB
            gpu_memory_cached = torch.cuda.memory_reserved(gpu_id) / (1024 ** 3)      # In GB
            gpu_memory_total = torch.cuda.get_device_properties(gpu_id).total_memory / (1024 ** 3)  # In GB
            print(f"\nGPU {gpu_id}: {gpu_name}")
            print(f"  Total Memory: {gpu_memory_total:.2f} GB")
            print(f"  Memory Allocated: {gpu_memory_allocated:.2f} GB")
            print(f"  Memory Reserved (Cached): {gpu_memory_cached:.2f} GB")
    else:
        print("CUDA is not available. PyTorch is using the CPU.")

# Run the GPU status check
check_gpu_status()

Python Version: 3.8.9 (tags/v3.8.9:a743f81, Apr  6 2021, 14:02:34) [MSC v.1928 64 bit (AMD64)]
NumPy Version: 1.24.1
PyTorch Version: 2.4.1+cu121
CUDA is available. PyTorch is using GPU.

Number of GPUs available: 1

GPU 0: NVIDIA GeForce RTX 4070
  Total Memory: 11.99 GB
  Memory Allocated: 0.00 GB
  Memory Reserved (Cached): 0.00 GB


In [5]:
# Image classification pipeline using a custom simple CNN model and enhanced preprocessing

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
from sklearn.metrics import f1_score, classification_report, confusion_matrix, precision_recall_fscore_support, roc_curve, auc
import matplotlib.pyplot as plt
import seaborn as sns
import logging
import time
import numpy as np
import random
import cv2
from PIL import Image
from itertools import cycle
from sklearn.preprocessing import label_binarize

# Ensure reproducibility
def set_seed(seed=42):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


set_seed(42)

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(message)s')
logger = logging.getLogger(__name__)

# Paths
train_dir = 'dataset/dataset_aug/train'
val_dir = 'dataset/dataset_aug/val'
test_dir = 'dataset/dataset_aug/test'

# Hyperparameters
batch_sizes = [128]
learning_rates = [0.0001]
epoch_counts = [25]
NUM_CLASSES = len(datasets.ImageFolder(train_dir).classes)
print("Class to index mapping:", datasets.ImageFolder(train_dir).class_to_idx)
print(f"NUM_CLASSES set to: {NUM_CLASSES}")

DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Preprocessing Functions
def apply_median_blur(image: np.ndarray) -> np.ndarray:
    try:
        image = image.astype(np.uint8)
        if len(image.shape) == 2:
            image = cv2.cvtColor(image, cv2.COLOR_GRAY2RGB)
        elif image.shape[2] == 4:
            image = cv2.cvtColor(image, cv2.COLOR_RGBA2RGB)
        return cv2.medianBlur(image, 3)
    except Exception as e:
        logger.error(f"Error in median blur: {str(e)}")
        return image

def apply_basic_sharpen(image: np.ndarray) -> np.ndarray:
    try:
        kernel = np.array([[-1, -1, -1], [-1, 9, -1], [-1, -1, -1]])
        image = image.astype(np.uint8)
        return cv2.filter2D(image, -1, kernel)
    except Exception as e:
        logger.error(f"Error in basic sharpen: {str(e)}")
        return image

def apply_contrast_stretch(image: np.ndarray) -> np.ndarray:
    try:
        image_float = image.astype(float)
        for i in range(3):
            p2, p98 = np.percentile(image_float[:, :, i], (2, 98))
            if p98 <= p2:
                continue  # Flat channel: nothing to stretch, and this avoids dividing by zero
            image_float[:, :, i] = np.clip((image_float[:, :, i] - p2) / (p98 - p2) * 255, 0, 255)
        return image_float.astype(np.uint8)
    except Exception as e:
        logger.error(f"Error in contrast stretch: {str(e)}")
        return image

class CLAHE:
    def __call__(self, img):
        img = np.array(img)
        img = cv2.cvtColor(img, cv2.COLOR_RGB2LAB)
        l, a, b = cv2.split(img)
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        cl = clahe.apply(l)
        img = cv2.merge((cl, a, b))
        return transforms.functional.to_pil_image(cv2.cvtColor(img, cv2.COLOR_LAB2RGB))

# Data Transformations
data_transforms = {
    'train': transforms.Compose([
        transforms.Lambda(lambda x: Image.fromarray(apply_median_blur(np.array(x)))),
        transforms.Lambda(lambda x: Image.fromarray(apply_basic_sharpen(np.array(x)))),
        transforms.Lambda(lambda x: Image.fromarray(apply_contrast_stretch(np.array(x)))),
        CLAHE(),
        transforms.Resize((64, 64)),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Lambda(lambda x: Image.fromarray(apply_median_blur(np.array(x)))),
        transforms.Lambda(lambda x: Image.fromarray(apply_basic_sharpen(np.array(x)))),
        transforms.Lambda(lambda x: Image.fromarray(apply_contrast_stretch(np.array(x)))),
        CLAHE(),
        transforms.Resize((64, 64)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'test': transforms.Compose([
        transforms.Lambda(lambda x: Image.fromarray(apply_median_blur(np.array(x)))),
        transforms.Lambda(lambda x: Image.fromarray(apply_basic_sharpen(np.array(x)))),
        transforms.Lambda(lambda x: Image.fromarray(apply_contrast_stretch(np.array(x)))),
        CLAHE(),
        transforms.Resize((64, 64)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

# Load datasets
train_dataset = datasets.ImageFolder(train_dir, transform=data_transforms['train'])
val_dataset = datasets.ImageFolder(val_dir, transform=data_transforms['val'])
test_dataset = datasets.ImageFolder(test_dir, transform=data_transforms['test'])

logger.info(f"Training dataset size: {len(train_dataset)}")
logger.info(f"Validation dataset size: {len(val_dataset)}")
logger.info(f"Test dataset size: {len(test_dataset)}")

# Custom CNN Model Definition
class CustomCNN(nn.Module):
    def __init__(self, num_classes=NUM_CLASSES):
        super(CustomCNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(16)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(32)
        self.conv3 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.bn3 = nn.BatchNorm2d(64)
        self.fc1 = nn.Linear(64 * 8 * 8, 64)
        self.bn4 = nn.BatchNorm1d(64)
        self.fc2 = nn.Linear(64, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.bn1(self.conv1(x))))
        x = self.pool(F.relu(self.bn2(self.conv2(x))))
        x = self.pool(F.relu(self.bn3(self.conv3(x))))
        x = x.view(-1, 64 * 8 * 8)
        x = F.relu(self.bn4(self.fc1(x)))
        x = self.fc2(x)
        return x

def train_and_validate(model, train_loader, val_loader, criterion, optimizer, epochs):
    train_acc_history, val_acc_history = [], []
    train_loss_history, val_loss_history = [], []
    total_training_time = 0
    
    for epoch in range(epochs):
        epoch_start_time = time.time()
        model.train()
        train_loss = 0
        train_correct = 0

        for inputs, labels in train_loader:
            inputs, labels = inputs.to(DEVICE), labels.to(DEVICE)
            # Debugging: check label range
            if not torch.all((labels >= 0) & (labels < NUM_CLASSES)):
                print(f"Invalid label(s) found in batch: {labels}")
                print(f"Unique labels in batch: {labels.unique()}")
                raise ValueError(f"Invalid label detected. Expected labels between 0 and {NUM_CLASSES - 1}.")
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)

            loss.backward()
            optimizer.step()

            train_loss += loss.item() * inputs.size(0)
            _, preds = torch.max(outputs, 1)
            train_correct += torch.sum(preds == labels.data)

        train_loss /= len(train_loader.dataset)
        train_acc = train_correct.double() / len(train_loader.dataset)
        train_loss_history.append(train_loss)
        train_acc_history.append(train_acc.cpu())

        logger.info(f"Epoch {epoch + 1}/{epochs} - Training: Loss = {train_loss:.4f}, Accuracy = {train_acc:.4f}")

        model.eval()
        val_loss = 0
        val_correct = 0

        with torch.no_grad():
            for inputs, labels in val_loader:
                inputs, labels = inputs.to(DEVICE), labels.to(DEVICE)
                outputs = model(inputs)
                loss = criterion(outputs, labels)

                val_loss += loss.item() * inputs.size(0)
                _, preds = torch.max(outputs, 1)
                val_correct += torch.sum(preds == labels.data)

        val_loss /= len(val_loader.dataset)
        val_acc = val_correct.double() / len(val_loader.dataset)
        val_loss_history.append(val_loss)
        val_acc_history.append(val_acc.cpu())

        epoch_time = time.time() - epoch_start_time
        total_training_time += epoch_time

        logger.info(f"Epoch {epoch + 1}/{epochs} - Validation: Loss = {val_loss:.4f}, Accuracy = {val_acc:.4f}")
        logger.info(f"Time for epoch {epoch + 1}: {epoch_time:.2f}s")

    logger.info(f"Total Training Time: {total_training_time:.2f}s")

    return train_acc_history, val_acc_history, train_loss_history, val_loss_history

def test_and_evaluate(model, test_loader, class_names):
    model.eval()
    all_labels = []
    all_preds = []
    all_probs = []

    with torch.no_grad():
        for inputs, labels in test_loader:
            inputs, labels = inputs.to(DEVICE), labels.to(DEVICE)
            outputs = model(inputs)
            probabilities = F.softmax(outputs, dim=1)
            _, preds = torch.max(outputs, 1)
            
            all_labels.extend(labels.cpu().numpy())
            all_preds.extend(preds.cpu().numpy())
            all_probs.extend(probabilities.cpu().numpy())

    # Calculate and log macro F1 score
    macro_f1 = f1_score(all_labels, all_preds, average='macro')
    logger.info(f"Macro-Averaged F1 Score: {macro_f1:.4f}")

    # Create classification metrics heatmap
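    # Score every class index so the rows line up with class_names, even if a class is missing from the test labels
    precision, recall, f1, _ = precision_recall_fscore_support(all_labels, all_preds, average=None, labels=np.arange(len(class_names)), zero_division=0)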
    metrics_df = np.array([precision, recall, f1]).T
    plt.figure(figsize=(8, 6))
    sns.heatmap(metrics_df, annot=True, cmap="viridis", xticklabels=["Precision", "Recall", "F1"], yticklabels=class_names)
    plt.title("Dataset CustomCNN Classification Report")
    plt.xlabel("Metric")
    plt.ylabel("Class")
    plt.savefig('dataset_customcnn_classification_report.png', dpi=300, bbox_inches='tight', pad_inches=0.1)
    plt.close()

    # Create confusion matrix
    cm = confusion_matrix(all_labels, all_preds)
    plt.figure(figsize=(10, 8))
    sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", xticklabels=class_names, yticklabels=class_names)
    plt.xlabel("Predicted Label")
    plt.ylabel("True Label")
    plt.title("Dataset CustomCNN Confusion Matrix")
    plt.savefig('dataset_customcnn_confusion_matrix.png', dpi=300, bbox_inches='tight', pad_inches=0.1)
    plt.close()

    # Create ROC curves
    all_labels_binarized = label_binarize(all_labels, classes=np.arange(NUM_CLASSES))
    all_probs = np.array(all_probs)

    plt.figure(figsize=(10, 8))
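    # One distinct color per class; with only nine colors the tenth class would reuse the first
    colors = cycle(['aqua', 'darkorange', 'cornflowerblue', 'green', 'red', 'purple', 'brown', 'pink', 'gray', 'olive'])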
    for i, color in zip(range(NUM_CLASSES), colors):
        fpr, tpr, _ = roc_curve(all_labels_binarized[:, i], all_probs[:, i])
        roc_auc = auc(fpr, tpr)
        plt.plot(fpr, tpr, color=color, lw=2, label=f'Class {class_names[i]} (AUC = {roc_auc:.2f})')

    plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Dataset CustomCNN ROC Curves for All Classes')
    plt.legend(loc="lower right")
    plt.savefig('dataset_customcnn_ROC_All_Classes.png', dpi=300, bbox_inches='tight', pad_inches=0.1)
    plt.close()

# Training and Evaluation Pipeline
for batch_size in batch_sizes:
    for lr in learning_rates:
        for epochs in epoch_counts:
            logger.info(f"Training with Batch Size: {batch_size}, Learning Rate: {lr}, Epochs: {epochs}")

            train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
            val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
            test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

            model = CustomCNN()
            model = model.to(DEVICE)

            criterion = nn.CrossEntropyLoss()
            optimizer = optim.Adam(model.parameters(), lr=lr)

            train_acc_history, val_acc_history, train_loss_history, val_loss_history = train_and_validate(
                model, train_loader, val_loader, criterion, optimizer, epochs
            )

            # Save accuracy and loss graphs
            epochs_range = range(1, epochs + 1)
            plt.figure()
            plt.plot(epochs_range, train_acc_history, label='Training Accuracy')
            plt.plot(epochs_range, val_acc_history, label='Validation Accuracy')
            plt.xlabel('Epoch')
            plt.ylabel('Accuracy')
            plt.title(f'Dataset CustomCNN Accuracy (Batch Size {batch_size}, LR {lr}, Epochs {epochs})')
            plt.legend()
            plt.savefig(f'dataset_customcnn_accuracy_batch_{batch_size}_lr_{lr}_epochs_{epochs}.png', dpi=300, bbox_inches='tight', pad_inches=0.1)
            plt.close()

            plt.figure()
            plt.plot(epochs_range, train_loss_history, label='Training Loss')
            plt.plot(epochs_range, val_loss_history, label='Validation Loss')
            plt.xlabel('Epoch')
            plt.ylabel('Loss')
            plt.title(f'Dataset CustomCNN Loss (Batch Size {batch_size}, LR {lr}, Epochs {epochs})')
            plt.legend()
            plt.savefig(f'dataset_customcnn_loss_batch_{batch_size}_lr_{lr}_epochs_{epochs}.png', dpi=300, bbox_inches='tight', pad_inches=0.1)
            plt.close()

            # Test and evaluate model
            test_and_evaluate(model, test_loader, class_names=test_dataset.classes)

            # Save the trained model
            torch.save(model.state_dict(), 'dataset_customcnn_model_trained.pth')

2025-05-08 21:06:14,796 - Training dataset size: 19063
2025-05-08 21:06:14,797 - Validation dataset size: 4037
2025-05-08 21:06:14,797 - Test dataset size: 4202
2025-05-08 21:06:14,799 - Training with Batch Size: 128, Learning Rate: 0.0001, Epochs: 25


Class to index mapping: {'1': 0, '10': 1, '100': 2, '1000': 3, '2': 4, '20': 5, '200': 6, '5': 7, '50': 8, '500': 9}
NUM_CLASSES set to: 10


2025-05-08 21:07:30,048 - Epoch 1/25 - Training: Loss = 1.7005, Accuracy = 0.4913
2025-05-08 21:07:45,550 - Epoch 1/25 - Validation: Loss = 1.4755, Accuracy = 0.5915
2025-05-08 21:07:45,551 - Time for epoch 1: 90.75s
2025-05-08 21:09:01,472 - Epoch 2/25 - Training: Loss = 1.2240, Accuracy = 0.6998
2025-05-08 21:09:16,786 - Epoch 2/25 - Validation: Loss = 1.1688, Accuracy = 0.7023
2025-05-08 21:09:16,787 - Time for epoch 2: 91.23s
2025-05-08 21:10:31,751 - Epoch 3/25 - Training: Loss = 0.9455, Accuracy = 0.7896
2025-05-08 21:10:47,300 - Epoch 3/25 - Validation: Loss = 0.9779, Accuracy = 0.7595
2025-05-08 21:10:47,301 - Time for epoch 3: 90.51s
2025-05-08 21:12:02,186 - Epoch 4/25 - Training: Loss = 0.7509, Accuracy = 0.8429
2025-05-08 21:12:18,015 - Epoch 4/25 - Validation: Loss = 0.7976, Accuracy = 0.8006
2025-05-08 21:12:18,015 - Time for epoch 4: 90.71s
2025-05-08 21:13:32,513 - Epoch 5/25 - Training: Loss = 0.6096, Accuracy = 0.8759
2025-05-08 21:13:47,995 - Epoch 5/25 - Validation:

In [4]:
# Assuming 'model' is your defined PyTorch model
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f'Total parameters: {total_params}')
print(f'Trainable parameters: {trainable_params}')

Total parameters: 286794
Trainable parameters: 286794


In [15]:
# Inference function for a single image

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import transforms
from PIL import Image
import numpy as np
import cv2


def apply_median_blur(image: np.ndarray) -> np.ndarray:
    image = image.astype(np.uint8)
    if len(image.shape) == 2:
        image = cv2.cvtColor(image, cv2.COLOR_GRAY2RGB)
    elif image.shape[2] == 4:
        image = cv2.cvtColor(image, cv2.COLOR_RGBA2RGB)
    return cv2.medianBlur(image, 3)

def apply_basic_sharpen(image: np.ndarray) -> np.ndarray:
    kernel = np.array([[-1, -1, -1], [-1, 9, -1], [-1, -1, -1]])
    image = image.astype(np.uint8)
    return cv2.filter2D(image, -1, kernel)

def apply_contrast_stretch(image: np.ndarray) -> np.ndarray:
    image_float = image.astype(float)
    for i in range(3):
        p2, p98 = np.percentile(image_float[:, :, i], (2, 98))
        if p98 <= p2:
            continue  # Flat channel: nothing to stretch, and this avoids dividing by zero
        image_float[:, :, i] = np.clip((image_float[:, :, i] - p2) / (p98 - p2) * 255, 0, 255)
    return image_float.astype(np.uint8)

class CLAHE:
    def __call__(self, img):
        img = np.array(img)
        img = cv2.cvtColor(img, cv2.COLOR_RGB2LAB)
        l, a, b = cv2.split(img)
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        cl = clahe.apply(l)
        img = cv2.merge((cl, a, b))
        return transforms.functional.to_pil_image(cv2.cvtColor(img, cv2.COLOR_LAB2RGB))


preprocess = transforms.Compose([
    transforms.Lambda(lambda x: Image.fromarray(apply_median_blur(np.array(x)))),
    transforms.Lambda(lambda x: Image.fromarray(apply_basic_sharpen(np.array(x)))),
    transforms.Lambda(lambda x: Image.fromarray(apply_contrast_stretch(np.array(x)))),
    CLAHE(),
    transforms.Resize((64, 64)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])


class CustomCNN(nn.Module):
    def __init__(self, num_classes):
        super(CustomCNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(16)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(32)
        self.conv3 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.bn3 = nn.BatchNorm2d(64)
        self.fc1 = nn.Linear(64 * 8 * 8, 64)
        self.bn4 = nn.BatchNorm1d(64)
        self.fc2 = nn.Linear(64, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.bn1(self.conv1(x))))
        x = self.pool(F.relu(self.bn2(self.conv2(x))))
        x = self.pool(F.relu(self.bn3(self.conv3(x))))
        x = x.view(-1, 64 * 8 * 8)
        x = F.relu(self.bn4(self.fc1(x)))
        x = self.fc2(x)
        return x


def classify_note_image(image_path, model_weights_path, class_names):
    DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    num_classes = len(class_names)

    # Load the model & weights
    model = CustomCNN(num_classes=num_classes)
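    # weights_only=True restricts unpickling to tensors and avoids the torch.load FutureWarning
    model.load_state_dict(torch.load(model_weights_path, map_location=DEVICE, weights_only=True))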
    model = model.to(DEVICE)
    model.eval()

    # Load and preprocess the image
    img = Image.open(image_path).convert('RGB')
    input_tensor = preprocess(img).unsqueeze(0).to(DEVICE)

    # Run inference
    with torch.no_grad():
        outputs = model(input_tensor)
        probabilities = F.softmax(outputs, dim=1)
        _, predicted_idx = torch.max(probabilities, 1)

    predicted_label = class_names[predicted_idx.item()]
    confidence = probabilities[0][predicted_idx].item()

    print(f"Predicted Class: {predicted_label} (Confidence: {confidence:.2f})")
    return predicted_label, confidence


In [21]:

# Class labels, listed in the same order as the ImageFolder class_to_idx mapping used in training
class_names = ['1', '10', '100', '1000', '2', '20', '200', '5', '50', '500']

# Call the function with your image
image_path = 'fig_20_torn_note.png'
model_weights_path = 'dataset_customcnn_model_trained.pth'
classify_note_image(image_path, model_weights_path, class_names)


Predicted Class: 20 (Confidence: 0.97)


('20', 0.9661031365394592)

In [12]:
print(datasets.ImageFolder(train_dir).class_to_idx)


{'1': 0, '10': 1, '100': 2, '1000': 3, '2': 4, '20': 5, '200': 6, '5': 7, '50': 8, '500': 9}
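
The mapping above looks out of numeric order because `ImageFolder` sorts class folder names as strings, so `'10'` comes before `'2'`. The sketch below reproduces that ordering with the ten folder names from the output; `class_to_idx_from_names` is a hypothetical helper written for this illustration, not part of torchvision.

```python
def class_to_idx_from_names(folder_names):
    # String (lexicographic) sort, mirroring how ImageFolder orders class folders
    classes = sorted(folder_names)
    return {name: index for index, name in enumerate(classes)}

folder_names = ['1', '2', '5', '10', '20', '50', '100', '200', '500', '1000']
mapping = class_to_idx_from_names(folder_names)
print(mapping)
# {'1': 0, '10': 1, '100': 2, '1000': 3, '2': 4, '20': 5, '200': 6, '5': 7, '50': 8, '500': 9}

# Inverse lookup for turning a predicted index back into a label
idx_to_class = {index: name for name, index in mapping.items()}
print(idx_to_class[5])  # 20
```

Any `class_names` list passed to the inference function has to follow this same order, otherwise predicted indices are decoded to the wrong label.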
