### Problem Definition

In this project, we build an image classifier that predicts the ripeness level of bananas from a single RGB image.

The model predicts one of three classes:
- **Unripe** - mostly green peel
- **Ripe** - mostly yellow peel, suitable for eating
- **Overripe** - dark-spotted or brown peel, very soft, at or past the ideal eating point

#### Real-world motivation

Banana ripeness is a practical concern in agriculture, in retail, and at home. Automatically estimating ripeness from images can help with:

- **Quality control** in supermarkets (sorting bananas into “ready to sell today” vs “too green” vs “too late”).
- **Food waste reduction**, by detecting overripe fruit earlier and discounting or redirecting it to other uses (e.g. baking, smoothies).
- **Assisting consumers** in choosing bananas according to their preference (some people like greener bananas, others prefer very ripe ones).

This project does not attempt to solve the full industrial problem. Instead, we build a small, reproducible prototype that shows how a deep learning model can classify banana ripeness from images using transfer learning.

#### Expected Challenges

- **Ambiguous boundaries between classes**  
  There is no sharp line between “ripe” and “overripe” – some bananas are in-between, which makes labels somewhat subjective.

- **Lighting and background conditions**  
  Images may be taken under different illumination (indoor, outdoor, shadows, warm/cold light) and on different backgrounds, which changes the perceived color.

- **Multiple bananas in one image**  
  Some images may contain more than one banana with slightly different ripeness levels. The dataset label is still a single class for the whole image.

- **Color similarity between classes**  
  Slightly yellow-green bananas can look similar to ripe ones, and dark-spotted ripe bananas can look similar to overripe ones, which may cause confusion for the model.

- **Dataset shift to the real world**  
  The training images are relatively clean and focused on bananas. On real supermarket shelves, bananas might be partially occluded, far from the camera, or mixed with other fruits, so the model may not generalize well to such settings.


### Dataset source

We used the public **Banana Classification** dataset from Kaggle, which contains four categories: *Unripe*, *Ripe*, *Overripe* and *Rotten*.  
In this project we use only the first three categories (Unripe, Ripe, Overripe) to match the three-class ripeness problem. We removed the Rotten class because it overlaps visually with Overripe and would make the classification problem less clear.
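
The class selection described here can be sketched in a few lines of Python. This is a minimal illustration, not the exact code used in the project: it assumes the extracted dataset has one sub-folder per class, and the folder names (`unripe`, `ripe`, `overripe`) and the helper `collect_samples` are hypothetical, so adjust them to the actual layout of the Kaggle archive.

```python
from pathlib import Path

# Assumed folder names, in label order; the real archive may name them differently.
KEPT_CLASSES = ("unripe", "ripe", "overripe")
# File extensions treated as images (an assumption, extend as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_samples(root, kept_classes=KEPT_CLASSES):
    """Return (image_path, class_index) pairs for the kept classes only.

    Any other sub-folder under `root` (for example a rotten class) is ignored,
    because only the folders named in `kept_classes` are visited.
    """
    root = Path(root)
    samples = []
    for class_index, class_name in enumerate(kept_classes):
        class_dir = root / class_name
        if not class_dir.is_dir():
            continue  # skip silently if a class folder is missing
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, class_index))
    return samples
```

Calling `collect_samples("path/to/dataset")` would then yield a list of labelled image paths with indices 0 (Unripe), 1 (Ripe) and 2 (Overripe), which can be passed to whichever data-loading pipeline the training code uses.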

Dataset citation:
Thakar, A. (2024). *Banana Classification* [Dataset]. Kaggle. https://www.kaggle.com/datasets/atrithakar/banana-classification