# Avocado Ripeness Classification
**Author**: Danny Goodlow  
**Date**: 2025-06-21

## Introduction
This project aims to classify avocado ripeness using numerical features such as firmness, color hue/saturation, and sound levels. Accurate ripeness prediction can help reduce food waste and support supply chain decisions.

In [None]:
import pandas as pd
df = pd.read_csv('avocado_ripeness_dataset.csv')
df.head()

In [None]:
from sklearn.preprocessing import LabelEncoder

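# Encode the categorical feature as integers; tree models only need a consistent mapping.
df['color_category'] = LabelEncoder().fit_transform(df['color_category'])

# Keep the target encoder so integer predictions can be mapped back to ripeness names
# later via ripeness_encoder.inverse_transform(...) or ripeness_encoder.classes_.
ripeness_encoder = LabelEncoder()
df['ripeness'] = ripeness_encoder.fit_transform(df['ripeness'])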

X = df.drop('ripeness', axis=1)
y = df['ripeness']

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix

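# Hold out 20% of the rows for testing; the fixed seed makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit a 100-tree random forest on the training split.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate on the held-out split: per-class precision/recall/F1, then the confusion matrix.
y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))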
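
To see which features a fitted forest leans on, one option is to rank its impurity-based importances. The sketch below is only an illustration on synthetic data: the column names (`firmness`, `hue`, `noise`), the toy labels, and the variable names `X_demo`, `y_demo`, and `forest` are all assumptions, not the real avocado dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (hypothetical, NOT the avocado dataset).
rng = np.random.default_rng(0)
n = 300
feature_names = ["firmness", "hue", "noise"]  # assumed column names
firmness = rng.uniform(0, 100, n)
hue = rng.uniform(0, 360, n)
noise = rng.normal(size=n)
X_demo = np.column_stack([firmness, hue, noise])

# In this toy set the label depends on firmness alone (three bins).
y_demo = np.digitize(firmness, bins=[33, 66])

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_demo, y_demo)

# Pair each feature with its importance and sort from most to least important.
ranked = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name:>10}: {score:.3f}")
```

In the notebook, the same ranking would come from pairing `model.feature_importances_` with `X.columns`. Impurity-based importances can favor features with many distinct values, so `sklearn.inspection.permutation_importance` on the test split is a reasonable cross-check.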

## Results
- The Random Forest model achieved **100% accuracy** on the held-out test set (20% of the data).
- The confusion matrix showed no misclassifications in any ripeness class.

## Discussion
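The perfect test score suggests that the ripeness classes are well separated in feature space, plausibly by features such as firmness and hue, although feature importance was not measured here. The result should be read with caution: it comes from a single 80/20 split, and a small or highly structured dataset can make perfect accuracy easy to reach without guaranteeing the same performance on new avocados.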
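
One way to test whether a score depends on a lucky split is stratified k-fold cross-validation, which reports accuracy on several different held-out folds. The following is a minimal sketch on synthetic, well-separated clusters; the data, the cluster centers, and the names `X_demo`, `y_demo`, and `cv_model` are assumptions for illustration, not the avocado dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in data (hypothetical, NOT the avocado dataset):
# three well-separated clusters in two features.
rng = np.random.default_rng(42)
n_per_class = 50
centers = np.array([[80.0, 100.0], [50.0, 60.0], [20.0, 20.0]])
X_demo = np.vstack(
    [center + rng.normal(scale=5.0, size=(n_per_class, 2)) for center in centers]
)
y_demo = np.repeat(np.arange(3), n_per_class)

# Five stratified folds keep the class proportions equal in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
cv_model = RandomForestClassifier(n_estimators=100, random_state=42)
scores = cross_val_score(cv_model, X_demo, y_demo, cv=cv, scoring="accuracy")

print("fold accuracies:", np.round(scores, 3))
print(f"mean = {scores.mean():.3f}, std = {scores.std():.3f}")
```

Applied to the notebook, `X_demo` and `y_demo` would be replaced by `X` and `y`. Consistently high fold scores with a small spread would support the single-split result, while a wide spread would point to sensitivity to how the data is divided.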

Future improvements could include:
- Expanding the dataset with more samples and more varied lighting conditions
- Using image data, if available, to train deeper models such as CNNs

## Summary
- Applied a supervised learning approach (Random Forest)
- Achieved 100% classification accuracy on the held-out test split
- Ready for GitHub submission; practical integration should follow validation on a larger dataset