# Coin Classifier Model
This model classifies images of coins by currency, denomination, and country.

# Outline
1. [Import Packages](#1---import-packages)
2. [Hyperparameters](#2---hyperparameters)
3. [Loading the Dataset](#3---loading-the-dataset)
   - [Encoding Labels](#31---encoding-labels)
   - [Train/Val split](#32---splitting-the-data-into-test-and-validation-sets)
   - [Image loading](#33---image-preprocessing-and-loading)
4. [Data Augmentation](#4---data-augmentation)
5. [Data Loader](#5---dataloader)
6. [Model](#6---load-pre-trained-efficientnet-b0-model)
7. [Pipeline](#7---setup-pipeline)
   - [Validation](#71---validation-pipeline)
   - [Training](#72---training-pipeline)
8. [Evaluation](#8---evaluation)
9. [Inference](#9---inference)

<a name="1"></a>
## 1 - Import Packages

The following packages are used:
- `torch`, `torchvision`, and `timm` for the model, image transforms, and data loading
- `sklearn` for the train/validation split and the evaluation metrics
- `os` and `pandas` for file paths and tabular data
- `PIL` for image loading
- `matplotlib` and `seaborn` for plotting
- `warnings` and `platform` for warning messages and OS detection

In [None]:
import os
import timm
import warnings
import platform
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from PIL import Image, UnidentifiedImageError

import torch
from torch import nn, optim
from torchvision import transforms
from torch.utils.data import Dataset, DataLoader

from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix

### Choose your device type
Run only the block that matches your hardware.

- Note: If neither block matches your device, run either one; both fall back to the CPU.

In [None]:
# Enable GPU (For Macbook Silicon)
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
print(f"Using device: {device}")

In [None]:
# Enable GPU (For Nvidia-based systems)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")
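The two device cells above can also be merged into a single cell that checks CUDA first, then Apple MPS, then falls back to the CPU. This is only a sketch: `pick_device` is a hypothetical helper name, not part of the original notebook, and the `getattr` guard assumes that older PyTorch builds may not expose `torch.backends.mps`.

```python
import torch


def pick_device() -> torch.device:
    """Return the best available device: CUDA, then Apple MPS, then CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # Guard the attribute lookup in case this PyTorch build has no MPS backend.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = pick_device()
print(f"Using device: {device}")
```

With this variant the same cell runs unchanged on any machine.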

<a name="2"></a>
## 2 - Hyperparameters

Hyperparameter Choices
- Batch Size: Set to 32 to allow fast training without overloading memory. This size provides a good balance between convergence stability and training speed.
- Learning Rate: Set to 1e-4 to ensure gradual learning and prevent overshooting the minimum, especially since we’re using a pretrained model.
- Epochs: 10. The weights with the best validation accuracy are checkpointed during training, which limits the impact of overfitting in later epochs.

In [None]:
TRAIN_CSV = "train.csv"
TRAIN_IMG_DIR = "./train"
TEST_CSV = "test.csv"
TEST_IMG_DIR = "./test"
OUTLINE_CSV = "sample_submission.csv"
BATCH_SIZE = 32
EPOCHS = 10
LEARNING_RATE = 1e-4

In [None]:
# Set num_workers based on OS
if platform.system() == "Darwin":  # macOS
    NUM_WORKERS = 0
else:
    NUM_WORKERS = 2  # or 4, depending on the system

<a name="3"></a>
## 3 - Loading the Dataset

<a name="3.1"></a>
### 3.1 - Encoding labels

In [None]:
df = pd.read_csv(TRAIN_CSV)
label_to_idx = {label: idx for idx, label in enumerate(sorted(df["Class"].unique()))}
idx_to_label = {v: k for k, v in label_to_idx.items()}
df["label"] = df["Class"].map(label_to_idx)
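To make the encoding concrete, here is the same dictionary-comprehension pattern applied to a toy list. The class names are made up for illustration (the real ones come from the `Class` column of `train.csv`), and the `toy_` prefix only keeps the example from overwriting the notebook's own variables.

```python
# Made-up class names standing in for the "Class" column
toy_classes = ["euro_1", "rupee_5", "euro_1", "cent_10"]

# Sorting first makes the mapping deterministic across runs
toy_label_to_idx = {label: idx for idx, label in enumerate(sorted(set(toy_classes)))}
toy_idx_to_label = {idx: label for label, idx in toy_label_to_idx.items()}

# Encode the class names as integers, then decode them back
toy_encoded = [toy_label_to_idx[name] for name in toy_classes]
toy_decoded = [toy_idx_to_label[idx] for idx in toy_encoded]

print(toy_label_to_idx)
print(toy_encoded)
```

Because the labels are sorted before being enumerated, the index assigned to a class depends only on the set of class names, not on the row order of the CSV.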

<a name="3.2"></a>
### 3.2 - Splitting the data into test and validation sets

In [None]:
train_df, val_df = train_test_split(df, test_size=0.1, stratify=df["label"], random_state=42)
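The `stratify` argument keeps the class proportions the same in both splits, which matters when some coin classes are rare. A small sketch on made-up labels (90 samples of class `a`, 10 of class `b`) shows the effect; the toy data and the `toy_` names are assumptions for illustration only.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Imbalanced toy labels: 90% class "a", 10% class "b"
toy_labels = ["a"] * 90 + ["b"] * 10
toy_ids = list(range(len(toy_labels)))

toy_train_ids, toy_val_ids, toy_train_labels, toy_val_labels = train_test_split(
    toy_ids, toy_labels, test_size=0.1, stratify=toy_labels, random_state=42
)

# Both splits keep the 9:1 ratio between the two classes
print("train:", Counter(toy_train_labels))
print("val:  ", Counter(toy_val_labels))
```

Without `stratify`, a 10-sample validation split could easily contain no samples of the rare class at all.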

<a name="3.3"></a>
### 3.3 - Image preprocessing and loading

In [None]:
class CoinDataset(Dataset):
    def __init__(self, dataframe, img_dir, transform=None):
        self.dataframe = dataframe
        self.img_dir = img_dir
        self.transform = transform
        self.supported_exts = ['.jpg', '.jpeg', '.png', '.webp']
        self.label2idx = {label: idx for idx, label in enumerate(sorted(self.dataframe['Class'].unique()))}

    def __len__(self):
        return len(self.dataframe)

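    def __getitem__(self, idx):
        img_id = str(self.dataframe.iloc[idx, 0])
        # Prefer the shared "label" column from section 3.1 so the training and
        # validation datasets use one class-to-index mapping; the per-dataset
        # mapping built in __init__ is only a fallback.
        if "label" in self.dataframe.columns:
            label = torch.tensor(int(self.dataframe["label"].iloc[idx]))
        else:
            label = torch.tensor(self.label2idx[self.dataframe.iloc[idx, 1]])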

        img_path = None
        for ext in self.supported_exts:
            possible_path = os.path.join(self.img_dir, img_id + ext)
            if os.path.exists(possible_path):
                img_path = possible_path
                break

        if img_path is None:
            warnings.warn(f"Image not found for ID {img_id}, returning dummy.")
            return torch.zeros(3, 224, 224), label

        try:
            image = Image.open(img_path).convert("RGB")
        except (UnidentifiedImageError, OSError):
            warnings.warn(f"Corrupted image: {img_path}, returning dummy.")
            return torch.zeros(3, 224, 224), label

        if self.transform:
            image = self.transform(image)

        return image, label
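The extension lookup inside `__getitem__` can be tried in isolation. The sketch below pulls it into a standalone function and runs it against a temporary directory; `find_image_path` is a hypothetical helper name, and the file `7.png` is an empty placeholder created only for the demonstration.

```python
import os
import tempfile

SUPPORTED_EXTS = [".jpg", ".jpeg", ".png", ".webp"]


def find_image_path(img_dir, img_id, exts=SUPPORTED_EXTS):
    """Return the first existing <img_dir>/<img_id><ext> path, or None."""
    for ext in exts:
        candidate = os.path.join(img_dir, str(img_id) + ext)
        if os.path.exists(candidate):
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp_dir:
    # Create an empty placeholder file named 7.png
    open(os.path.join(tmp_dir, "7.png"), "wb").close()
    found = find_image_path(tmp_dir, 7)    # matches on the ".png" extension
    missing = find_image_path(tmp_dir, 8)  # no file with any supported extension

print(found is not None, missing)
```

Note that the match is by exact extension string, so on a case-sensitive file system a file named `7.PNG` would not be found.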

<a name="4"></a>
## 4 - Data Augmentation

Without data augmentation, the EfficientNet model reached a validation accuracy of around 65–70%. With the augmentation below (a random horizontal flip on the training images), accuracy rose to approximately 85–90%.

In [None]:
train_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.5]*3, [0.5]*3)
])
val_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5]*3, [0.5]*3)
])
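`transforms.Normalize` applies `(value - mean) / std` to each channel. Since `ToTensor` produces values in `[0, 1]`, a mean and standard deviation of 0.5 map them to `[-1, 1]`. The arithmetic can be checked by hand; `normalize` is a hypothetical stand-in for the per-pixel formula, not a torchvision function.

```python
def normalize(value, mean=0.5, std=0.5):
    """Per-channel formula applied by transforms.Normalize: (value - mean) / std."""
    return (value - mean) / std


# Black, mid-gray, and white pixel values after ToTensor
for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```

One consequence is that the all-zero dummy tensors returned for missing or corrupted images correspond to a uniform mid-gray image in this normalized space. ImageNet-pretrained backbones are usually trained with the ImageNet channel statistics rather than 0.5; whether switching to those would help here is untested.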

<a name="5"></a>
## 5 - Dataloader

In [None]:
train_dataset = CoinDataset(train_df, TRAIN_IMG_DIR, transform=train_transforms)
val_dataset = CoinDataset(val_df, TRAIN_IMG_DIR, transform=val_transforms)

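# NUM_WORKERS was defined in section 2 according to the operating system
train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True, num_workers=NUM_WORKERS)
val_loader = DataLoader(val_dataset, batch_size=BATCH_SIZE, num_workers=NUM_WORKERS)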

<a name="6"></a>
## 6 - Load pre-trained EfficientNet B0 model

<a name="6.1"></a>
### 6.1 - Setup Model Architecture

Initially, I implemented a custom Convolutional Neural Network (CNN) from scratch using layers such as Conv2D, MaxPooling2D, Dropout, and BatchNormalization. However, because the dataset is large and complex, the model took approximately 5 to 6 hours to train and reached a validation accuracy of only 10–15%.

In my next approach, I used transfer learning, fine-tuning a ResNet50 model pre-trained on the ImageNet dataset. Training was somewhat faster (around 2 to 3 hours), but validation accuracy only reached 25–30%.

After further research and experimentation, I found that EfficientNetB0 offered a better trade-off between performance and efficiency. Fine-tuning a pre-trained EfficientNetB0 model on my coin classification dataset cut training time to about 30 minutes and raised validation accuracy to around 85–90%.

- Note: All timings listed here were measured on an Apple Silicon M3 Pro chip.

From my research, I found that EfficientNet is a strong choice for the backbone model because it balances accuracy and computational cost. It uses a compound scaling method that adjusts depth, width, and resolution together, which makes it well suited to fine-grained tasks like coin classification. In addition, starting from weights pre-trained on ImageNet improves performance through transfer learning, even when the dataset is relatively small.
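As a rough illustration of compound scaling, the EfficientNet paper scales depth, width, and input resolution by `alpha**phi`, `beta**phi`, and `gamma**phi`, with `alpha = 1.2`, `beta = 1.1`, `gamma = 1.15` chosen so that `alpha * beta**2 * gamma**2` is close to 2. The sketch below only reproduces that arithmetic; B0 is the `phi = 0` baseline, and the released larger variants do not necessarily follow these exact multipliers.

```python
# Base coefficients for depth, width, and resolution from the EfficientNet paper
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return the (depth, width, resolution, FLOPs) multipliers for a given phi."""
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = GAMMA ** phi
    # FLOPs grow linearly with depth and quadratically with width and resolution
    flops = (ALPHA * BETA**2 * GAMMA**2) ** phi
    return depth, width, resolution, flops


for phi in range(3):
    d, w, r, f = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}, FLOPs x{f:.2f}")
```

Each increment of `phi` therefore roughly doubles the compute budget while growing all three dimensions together.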

In [None]:
num_classes = len(label_to_idx)
model = timm.create_model("efficientnet_b0", pretrained=True, num_classes=num_classes)
model.to(device)

<a name="6.2"></a>
### 6.2 - Setup Loss and Optimizer

- Loss Function: `CrossEntropyLoss`, the standard choice for multi-class classification with categorical labels.
- Optimizer: Adam, chosen for its per-parameter adaptive learning rates and fast convergence on deep networks.

In [None]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=LEARNING_RATE)
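For intuition, the cross-entropy loss for a single sample is the negative log of the softmax probability assigned to the true class. `nn.CrossEntropyLoss` takes raw logits and, by default, averages this value over the batch. The sketch below computes the per-sample value in plain Python; the logits are made-up numbers and `cross_entropy` is a hypothetical helper, not the PyTorch implementation.

```python
import math


def cross_entropy(logits, target_idx):
    """Negative log-softmax probability of the target class for one sample."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target_idx]


toy_logits = [2.0, 0.5, -1.0]  # made-up model outputs for three classes

# Low loss when the target is the highest-scoring class, high loss otherwise
print(cross_entropy(toy_logits, 0))
print(cross_entropy(toy_logits, 2))
```

A useful reference point: a model that outputs equal logits for all `K` classes has a loss of `log(K)`, so a loss well below that value means the model has learned something beyond guessing.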

<a name="7"></a>
## 7 - Setup Pipeline

In [None]:
best_val_accuracy = 0.0
best_model_path = 'model.pth'

<a name="7.1"></a>
### 7.1 - Validation pipeline

In [None]:
def validate():
    model.eval()
    total_loss = 0
    correct = 0

    with torch.no_grad():
        for images, labels in val_loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            loss = criterion(outputs, labels)

            total_loss += loss.item()
            correct += (outputs.argmax(1) == labels).sum().item()

    val_loss = total_loss / len(val_loader)
    val_accuracy = correct / len(val_dataset)
    print(f"Validation Loss: {val_loss:.4f}, Validation Accuracy: {val_accuracy:.4f}")
    return val_loss, val_accuracy
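One detail of `validate()`: the loss is averaged over batches, while the accuracy is averaged over samples. When the last batch is smaller than the others, the mean of per-batch means weights its samples more heavily. The numbers below are hypothetical and only illustrate the size of that effect.

```python
# Hypothetical per-batch mean losses; the last batch holds only 4 samples
toy_batch_losses = [0.50, 0.50, 2.00]
toy_batch_sizes = [32, 32, 4]

# What validate() computes: every batch counts equally
mean_of_batch_means = sum(toy_batch_losses) / len(toy_batch_losses)

# Per-sample mean: every sample counts equally
per_sample_mean = sum(
    loss * size for loss, size in zip(toy_batch_losses, toy_batch_sizes)
) / sum(toy_batch_sizes)

print(f"mean of batch means: {mean_of_batch_means:.3f}")
print(f"per-sample mean:     {per_sample_mean:.3f}")
```

For tracking the trend across epochs either average works; the per-sample version could be obtained by accumulating `loss.item() * images.size(0)` and dividing by `len(val_dataset)`.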

<a name="7.2"></a>
### 7.2 - Training pipeline

In [None]:
def train_model():
    best_val_accuracy = 0.0
    best_model_path = 'model.pth'

    for epoch in range(EPOCHS):
        model.train()
        total_loss = 0
        correct = 0

        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            outputs = model(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            total_loss += loss.item()
            correct += (outputs.argmax(1) == labels).sum().item()

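        acc = correct / len(train_dataset)
        # Report the mean loss per batch so it is comparable with the validation loss
        avg_loss = total_loss / len(train_loader)
        print(f"[Epoch {epoch+1}] Loss: {avg_loss:.4f}, Training Accuracy: {acc:.4f}")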

        _, val_accuracy = validate()

        if val_accuracy > best_val_accuracy:
            best_val_accuracy = val_accuracy
            torch.save(model.state_dict(), best_model_path)
            print(f"New best model saved with validation accuracy: {val_accuracy:.4f}")

<a name="7.3"></a>
### 7.3 - Run training loop, while dealing with corrupted images

In [None]:
train_model()

<a name="7.4"></a>
### 7.4 - Save model for future use

In [None]:
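# train_model() already wrote the best-validation checkpoint to model.pth.
# Save the final-epoch weights under a separate (illustrative) file name so
# they do not overwrite that checkpoint.
torch.save(model.state_dict(), "model_last.pth")
print("Final-epoch model saved")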

- Challenges Faced During Training and How They Were Overcome:
    - Long training times when building a model from scratch
    - Low accuracy with ResNet50 despite longer training
- Fixed by:
    - Switching to EfficientNetB0 (smaller + better-performing model)
    - Adding data augmentation to improve generalization

<a name="8"></a>
## 8 - Evaluation

In [None]:
def classification_report_eval(model, dataloader, class_names, device):
    model.eval()
    y_true, y_pred = [], []

    with torch.no_grad():
        for images, labels in dataloader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            preds = outputs.argmax(1)

            y_true.extend(labels.cpu().numpy())
            y_pred.extend(preds.cpu().numpy())

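    print("Classification Report:")
    # Pass the full label set so the report still works if a class is absent from the loader
    print(classification_report(y_true, y_pred, labels=list(range(len(class_names))),
                                target_names=class_names, zero_division=0))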
    return y_true, y_pred

def plot_confusion_matrix(y_true, y_pred, class_names):
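    # Fix the label order so the matrix rows and columns line up with class_names
    cm = confusion_matrix(y_true, y_pred, labels=list(range(len(class_names))))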
    plt.figure(figsize=(12, 10))
    sns.heatmap(cm, annot=True, fmt="d", cmap="Blues",
                xticklabels=class_names, yticklabels=class_names)
    plt.xlabel("Predicted Label")
    plt.ylabel("True Label")
    plt.title("Confusion Matrix")
    plt.tight_layout()
    plt.show()

<a name="8.1"></a>
### 8.1 - Classification Report

In [None]:
model.load_state_dict(torch.load("model.pth", map_location=device))

y_true, y_pred = classification_report_eval(model, val_loader, list(idx_to_label.values()), device)

<a name="8.2"></a>
### 8.2 - Confusion Matrix

In [None]:
plot_confusion_matrix(y_true, y_pred, list(idx_to_label.values()))

This approach aims to strike a balance between accuracy, training efficiency, and practical deployment. EfficientNet, combined with well-tuned hyperparameters and error handling, provides a robust pipeline for coin classification.

What I’d Improve with More Time/Data
- Tune learning rate, batch size, and augmentation pipeline.
- Collect more class-balanced data.
- Try OCR feature extraction + CNN fusion for text-heavy coins.
- Use Test-Time Augmentation (TTA) or ensembling to further boost accuracy.
- Try larger EfficientNet variants (B2/B3).
- Apply label smoothing or learning rate schedules.
- Increase dataset size or perform semi-supervised training.
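Two of the ideas in this list, label smoothing and a learning rate schedule, need only a few extra lines in PyTorch. The sketch below uses a tiny linear layer and random data as a stand-in for the real model and loader. The smoothing value of 0.1 and the choice of cosine annealing are assumptions for illustration, not settings that were tested on this dataset.

```python
import torch
from torch import nn, optim

toy_model = nn.Linear(8, 3)  # stand-in for the EfficientNet backbone
toy_epochs = 10

# label_smoothing=0.1 is an assumed value, not one tuned for this dataset
criterion_ls = nn.CrossEntropyLoss(label_smoothing=0.1)
optimizer_ls = optim.Adam(toy_model.parameters(), lr=1e-4)
# Cosine annealing lowers the learning rate smoothly over toy_epochs
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer_ls, T_max=toy_epochs)

lrs = []
for epoch in range(toy_epochs):
    inputs = torch.randn(4, 8)           # random features in place of images
    targets = torch.randint(0, 3, (4,))  # random class indices
    optimizer_ls.zero_grad()
    loss = criterion_ls(toy_model(inputs), targets)
    loss.backward()
    optimizer_ls.step()
    lrs.append(optimizer_ls.param_groups[0]["lr"])
    scheduler.step()  # once per epoch, after the optimizer steps

print([f"{lr:.2e}" for lr in lrs])
```

In the real training loop, `scheduler.step()` would go at the end of each epoch inside `train_model()`.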

<a name="9"></a>
## 9 - Inference

<a name="9.1"></a>
### 9.1 - Test image loader

In [None]:
class TestCoinDataset(Dataset):
    def __init__(self, dataframe, img_dir, transform=None):
        self.dataframe = dataframe
        self.img_dir = img_dir
        self.transform = transform
        self.supported_exts = ['.jpg', '.jpeg', '.png', '.webp']

    def __len__(self):
        return len(self.dataframe)

    def __getitem__(self, idx):
        img_id = str(self.dataframe.iloc[idx, 0])
        for ext in self.supported_exts:
            img_path = os.path.join(self.img_dir, img_id + ext)
            if os.path.exists(img_path):
                break
        else:
            warnings.warn(f"Test image {img_id} not found. Returning dummy.")
            return torch.zeros(3, 224, 224), img_id

        try:
            image = Image.open(img_path).convert("RGB")
        except (UnidentifiedImageError, OSError):
            warnings.warn(f"Corrupted test image {img_id}. Returning dummy.")
            return torch.zeros(3, 224, 224), img_id

        if self.transform:
            image = self.transform(image)
        return image, img_id

<a name="9.2"></a>
### 9.2 - Run Inference

In [None]:
# Load trained model
model.load_state_dict(torch.load("model.pth", map_location=device))
model.eval()

# Load test set
test_df = pd.read_csv(TEST_CSV)
test_dataset = TestCoinDataset(test_df, TEST_IMG_DIR, transform=val_transforms)
test_loader = DataLoader(test_dataset, batch_size=BATCH_SIZE)

# Run inference
predictions = {}
with torch.no_grad():
    for images, ids in test_loader:
        images = images.to(device)
        outputs = model(images)
        preds = outputs.argmax(1).cpu().numpy()
        for img_id, pred in zip(ids, preds):
            label_str = idx_to_label[pred]
            predictions[int(img_id)] = label_str

<a name="9.3"></a>
### 9.3 - Save Predictions

In [None]:
# Load original sample
submission_df = pd.read_csv(OUTLINE_CSV)

# Replace placeholder with predictions
submission_df["Class"] = submission_df["Id"].map(predictions).fillna("unknown")

submission_df.to_csv("submission.csv", index=False)
print("submission.csv created!")
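The `map(...).fillna("unknown")` step can be checked on a toy frame: IDs that have a prediction receive it, and any ID missing from the dictionary falls back to `"unknown"`. The IDs and class names below are made up for illustration.

```python
import pandas as pd

# Made-up predictions: ID 3 is deliberately missing
toy_predictions = {1: "class_a", 2: "class_b"}
toy_submission = pd.DataFrame({"Id": [1, 2, 3], "Class": ["placeholder"] * 3})

# map() looks up each Id in the dict; unmatched Ids become NaN, then "unknown"
toy_submission["Class"] = toy_submission["Id"].map(toy_predictions).fillna("unknown")
print(toy_submission)
```

The lookup relies on the dictionary keys and the `Id` column having the same type, which is why the inference loop converts each ID with `int(img_id)`.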