# Classifier v2 – Experimental Food Image Classification

Author: Zhuo Han Khoo  
Unit: NutriHelp AI Project  
Status: Experimental / Not Deployed


## 1. Introduction

This notebook documents the development and evaluation of **Classifier v2**, an experimental food image classifier built during Weeks 3–7 of the project.

The purpose of this work was to explore whether a newly trained classifier could outperform the existing production model. This notebook focuses on:
- Dataset preparation and augmentation
- Training experiments
- Evaluation and stress testing
- Final decision-making based on results


## 2. Dataset Overview

All datasets used in this notebook were collected and prepared during this semester and are stored under the `ai_evaluation/` folder.

### Dataset Versions
- `dataset_v2/`: Base cleaned dataset
- `dataset_v2_augmented/`: Augmented version for robustness
- `dataset_v2_processed/`: Standardised and filtered dataset
- `dataset_v3/`: Experimental restructuring

The datasets include a mix of Western and Asian food images. However, coverage remains uneven due to limited availability of certain food classes.
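
Uneven coverage can be checked by counting images per class. The sketch below assumes each dataset version uses one subfolder per food class, which this write-up does not confirm. The function names and the `min_images` cut-off are hypothetical.

```python
from collections import Counter
from pathlib import Path

# Assumed set of image extensions; adjust to match the real dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def count_images_per_class(dataset_root):
    """Return a Counter mapping each class-folder name to its image count.

    Assumes a layout of <dataset_root>/<class_name>/<image files>.
    """
    counts = Counter()
    root = Path(dataset_root)
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def underrepresented(counts, min_images=50):
    """List classes with fewer than min_images images (hypothetical cut-off)."""
    return sorted(name for name, n in counts.items() if n < min_images)
```

If the layout assumption holds, calling `count_images_per_class("ai_evaluation/dataset_v2")` and passing the result to `underrepresented` would list the classes most likely to need more data.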


## 3. Model Training

Multiple training iterations were conducted using different dataset versions and augmentation strategies.

Key challenges encountered:
- Limited sample size for several classes
- Class imbalance
- Absence of the original production dataset from previous trimesters

Despite retraining efforts, the model struggled to generalise reliably across all food categories.
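
One common mitigation for class imbalance is to weight the training loss by inverse class frequency. The sketch below illustrates that general technique and is not necessarily what was tried in these experiments. It uses the "balanced" heuristic `total / (num_classes * count)`, and the function name is hypothetical.

```python
def balanced_class_weights(counts):
    """Compute inverse-frequency weights: total / (num_classes * count).

    `counts` maps class name -> number of training images.
    Classes with zero images are skipped, since no weight can be derived.
    """
    nonzero = {name: n for name, n in counts.items() if n > 0}
    total = sum(nonzero.values())
    num_classes = len(nonzero)
    return {name: total / (num_classes * n) for name, n in nonzero.items()}
```

With these weights, rare classes contribute more per sample to the loss. Weighting does not replace missing data, so classes with very few images would still be expected to generalise poorly.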


## 4. Evaluation & Stress Testing

Classifier v2 was evaluated using:
- Stress test images
- A curated evaluation set grouped into:
  - Asian
  - Western
  - Mixed
  - Unclear images

Evaluation focused on:
- Prediction correctness
- Confidence behaviour
- Failure cases caused by missing classes
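
The three evaluation criteria above can be summarised per group with a small helper. This is a sketch only: the record fields (`group`, `expected`, `predicted`, `confidence`) and the function name are assumptions, not the notebook's actual data format.

```python
from collections import defaultdict


def summarise_by_group(records, known_classes):
    """Aggregate accuracy, mean confidence, and missing-class counts per group.

    Each record is a dict with hypothetical keys:
      group       e.g. "Asian", "Western", "Mixed", "Unclear"
      expected    the ground-truth label
      predicted   the model's top-1 label
      confidence  the model's top-1 confidence in [0, 1]
    `known_classes` is the set of labels the model was trained on.
    """
    stats = defaultdict(
        lambda: {"n": 0, "correct": 0, "conf_sum": 0.0, "missing_class": 0}
    )
    for r in records:
        s = stats[r["group"]]
        s["n"] += 1
        s["correct"] += int(r["predicted"] == r["expected"])
        s["conf_sum"] += r["confidence"]
        # A label outside the known set can never be predicted correctly.
        s["missing_class"] += int(r["expected"] not in known_classes)
    return {
        group: {
            "n": s["n"],
            "accuracy": s["correct"] / s["n"],
            "mean_confidence": s["conf_sum"] / s["n"],
            "missing_class": s["missing_class"],
        }
        for group, s in stats.items()
    }
```

Reporting `missing_class` next to accuracy separates errors the model could not have avoided from genuine misclassifications. High mean confidence combined with low accuracy in a group points to overconfident failures.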


## 5. Results Summary

Overall findings:
- Classifier v2 did not consistently outperform the existing production model
- Accuracy degraded significantly for classes with limited data
- Misclassifications were common when the food type was not present in `classes.json`

Given these results, deploying Classifier v2 would risk regressions relative to the existing production model.


## 6. Decision & Rationale

Based on evaluation results, **Classifier v2 was not integrated into the backend**.

Instead, the project prioritised:
- Stability
- Backward compatibility
- Safer improvements at the pipeline level

Backend enhancements were implemented to improve user experience without changing the model itself.


## 7. Production Improvements (Implemented Instead)

The following improvements were made to the existing production classifier pipeline:
- Multi-image input support
- Explicit confidence scores
- Top-K predictions
- Unclear image detection
- User guidance when confidence is low

These changes let the system communicate uncertainty to the user, rather than returning a single unqualified label, without retraining the model.
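
As an illustration of how confidence scores, Top-K predictions, and low-confidence guidance can fit together, here is a minimal sketch. It assumes the model returns raw logits and that "unclear" means the top confidence falls below a fixed threshold. The function names, the response fields, the 0.5 threshold, and the guidance text are all hypothetical and do not describe the actual backend.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_predictions(logits, class_names, k=3):
    """Return the k most likely (label, confidence) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(class_names, probs), key=lambda p: p[1], reverse=True)
    return ranked[:k]


def build_response(logits, class_names, k=3, unclear_threshold=0.5):
    """Build a response with Top-K labels, confidences, and an unclear flag.

    The threshold is an illustrative assumption; a real value would need
    tuning against evaluation data.
    """
    top = top_k_predictions(logits, class_names, k)
    best_confidence = top[0][1]
    unclear = best_confidence < unclear_threshold
    return {
        "predictions": [
            {"label": label, "confidence": round(conf, 4)} for label, conf in top
        ],
        "unclear": unclear,
        "guidance": (
            "Image unclear. Try a closer, well-lit photo of a single dish."
            if unclear
            else None
        ),
    }
```

Multi-image input could reuse the same helper by calling `build_response` once per image and returning the list of results.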


## 8. Lessons Learned

- Model accuracy alone is not sufficient for deployment
- Dataset completeness is critical for food classification
- Pipeline-level improvements can deliver real value with lower risk
- Knowing when *not* to deploy a model is an important engineering decision


## 9. Future Work

Future improvements depend on:
- Access to a complete and balanced image dataset
- Careful addition of new food classes
- Re-evaluation of Classifier v2 once data quality improves

For now, Classifier v2 remains an experimental reference.
