"Balance the data. Sharpen the model. Farm insight, not just accuracy."
This repository contains a classification comparison experiment developed for COS711 Assignment 3. The quest explores how model choice and enhancement strategies affect classification performance, with a focus on understanding why improvements succeed or fail.
Two main builds are explored:
- ⚙️ A baseline classifier
- 🚀 An enhanced classifier (ResNet-based)
Rather than chasing raw scores, the quest focuses on diagnosing weaknesses, learning from failed upgrades, and identifying clear evolution paths.
- Python 3
- NumPy & Pandas – data handling
- Matplotlib / Seaborn – visual diagnostics
- Scikit-learn – evaluation metrics
- PyTorch / Torchvision – deep learning models
- Dataset loaded and split into train / validation / test sets
- Labels provided for supervised learning
Basic preprocessing was applied to enable model training and evaluation.
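A minimal sketch of the split step, assuming array-style data; `X`, `y`, the split ratios, and the random seed are illustrative, not the repository's actual values:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data for illustration: 200 samples, 5 classes.
X = np.random.rand(200, 3, 32, 32)
y = np.random.randint(0, 5, size=200)

# Hold out a test set first, then split the remainder into train /
# validation. `stratify` keeps class proportions consistent across splits.
X_tmp, X_test, y_tmp, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=42
)
X_train, X_val, y_train, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.15, stratify=y_tmp, random_state=42
)
```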
- **Baseline Model**
  - Simple classifier used as a reference point
- **Enhanced Model**
  - Deeper architecture using ResNet
  - Intended to improve feature extraction and generalization
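A hedged sketch of how the two builds might be wired up with PyTorch / Torchvision; the baseline architecture, input shape, and `NUM_CLASSES` are assumptions for illustration, not the repository's exact definitions:

```python
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # assumption: set to the dataset's actual class count

# Baseline: a small fully-connected classifier used as the reference point.
baseline = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 128),  # assumes 3x32x32 inputs
    nn.ReLU(),
    nn.Linear(128, NUM_CLASSES),
)

# Enhanced: a ResNet-18 backbone with its final layer resized to this task.
enhanced = models.resnet18(weights=None)
enhanced.fc = nn.Linear(enhanced.fc.in_features, NUM_CLASSES)
```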
Both builds use the same dataset splits and evaluation metrics to ensure a fair comparison.
Evaluation focuses on:
- Overall accuracy
- Confusion matrices
- Qualitative inspection of class-wise behaviour
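A minimal sketch of that evaluation step with scikit-learn; the label arrays below are placeholders:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Placeholder predictions for illustration.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

print(f"Accuracy: {accuracy_score(y_true, y_pred):.3f}")
print(confusion_matrix(y_true, y_pred))  # rows = true, columns = predicted
```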
The enhanced model shows only marginal improvement, highlighting that architectural upgrades alone are not always sufficient.
Enhancement attempts focused on:
- Increasing model depth
- Leveraging architectural ideas from pretrained-model families (ResNet)
While improvements were observed, gains were limited, suggesting bottlenecks outside the model architecture itself.
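The core ResNet idea is the residual (skip) connection: gradients can bypass the convolutional path, which keeps deeper stacks trainable. A generic sketch of one such block, not the repository's exact module:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal residual block: output = ReLU(conv path + skip path)."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)  # the skip connection
```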
The main quest is complete, but several side quests remain to unlock the build's full potential.
- Limited Exploratory Data Analysis (EDA)
- Class imbalance not explicitly addressed
**Why it matters:**
- Models may optimize for majority classes
- Overall accuracy can mask poor minority-class performance
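Per-class metrics make this masking effect visible. In the hypothetical case below, overall accuracy reads 80% even though the minority class is never predicted correctly:

```python
from sklearn.metrics import classification_report

# Hypothetical imbalanced outcome: class 1 is the minority class.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # model always predicts class 0

# 80% accuracy overall, yet recall for class 1 is 0.0.
print(classification_report(y_true, y_pred, zero_division=0))
```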
- Classification results plateau early
- Marginal gains from architectural enhancement
**Why it matters:**
- Indicates data-level or loss-function-level limitations
- Confusion matrices presented in text form
- Hard to visually diagnose class-wise errors
**Why it matters:**
- Visual diagnostics speed up failure analysis
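A small sketch of the heatmap fix using Seaborn; the confusion matrix here is built from placeholder labels:

```python
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 2, 2]  # placeholder labels for illustration
y_pred = [0, 0, 1, 2, 2, 2]

cm = confusion_matrix(y_true, y_pred)
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
plt.xlabel("Predicted class")
plt.ylabel("True class")
plt.title("Confusion matrix")
plt.show()
```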
Planned upgrades for the next iteration:
- Add explicit EDA (class distributions, samples per class)
- Apply class imbalance handling (see the sketch after this list):
  - Class-weighted loss
  - Oversampling / data augmentation
- Replace text confusion matrices with heatmap visualizations
- Tune hyperparameters (learning rate, batch size, epochs)
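For the imbalance-handling items above, a hedged sketch of two standard PyTorch mechanisms; the label array and weighting scheme are illustrative:

```python
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import WeightedRandomSampler

# Hypothetical label array for the training split (class 0 is the majority).
y_train = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2, 2])

# Class-weighted loss: rarer classes contribute more to the loss.
counts = np.bincount(y_train)
class_weights = torch.tensor(len(y_train) / (len(counts) * counts),
                             dtype=torch.float32)
criterion = nn.CrossEntropyLoss(weight=class_weights)

# Oversampling: draw minority-class samples more often within each epoch.
sample_weights = class_weights[torch.from_numpy(y_train)]
sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(y_train),
                                replacement=True)
# Pass `sampler=sampler` to the training DataLoader (mutually
# exclusive with shuffle=True).
```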
These upgrades target root causes, not just surface-level performance.
🧩 Main Quest: Classification Comparison
🎯 Objective: Understand performance bottlenecks
📈 Outcome: Functional comparison with clear evolution paths
**Wadalisa Oratile Molokwe**
Honours Student | Network Engineer & System Administrator
GitHub quest log – built for learning, reflection, and long-term evolution.