Garbage Classification Project

This project is a full-stack application for classifying garbage images into categories using deep learning. It combines a FastAPI backend (Python) with a React frontend for an interactive user experience.

Features:

- Interactive Frontend: Responsive React UI for image upload, result visualization, and a classification history chart.
- Efficient Backend: FastAPI server serving predictions from a trained Keras CNN model.
- Model Integration: Seamless use of TensorFlow/Keras models for real-time inference.
- Classification History: Bar chart of previous classifications, persisted in localStorage.
- Recyclability Status: Visual indication (logo, color) of whether the predicted class is recyclable.
- PDF Report Generation: (Planned) Downloadable report of classification results.
```
Frontend (React)   <->   FastAPI Backend    <->   Keras Model
       |                       |                       |
 User uploads            Receives API            Model predicts
 image / views UI        requests / results      class & probabilities
```
- User uploads an image via the frontend.
- Frontend sends the image to the FastAPI backend.
- Backend preprocesses the image and runs it through the CNN model.
- Backend returns the predicted class and probabilities.
- Frontend displays the result, recyclability status, and updates the classification history chart.
```
Project/
├── backend/
│   ├── main.py
│   ├── models/
│   └── util/
├── frontend/
│   ├── src/
│   │   ├── pages/
│   │   ├── components/
│   │   ├── services/
│   │   └── styles/
│   └── public/
├── data/
│   └── raw/
├── notebooks/
│   ├── preprocessing.ipynb
│   └── CNNModel.ipynb
├── requirements.txt
├── README.md
└── ...
```
- Backend: Python, FastAPI, TensorFlow/Keras
- Frontend: React, Chart.js, react-chartjs-2, Vite
- Other: Jupyter Notebooks, localStorage
- Image Resizing: All input images are resized to 224x224 pixels to match the input requirements of MobileNetV2.
- Image Enhancement: CLAHE (Contrast Limited Adaptive Histogram Equalization) is applied to the luminance channel to improve contrast and detail.
- Color Space Conversion: Images are converted from BGR to YCrCb for enhancement, then back to RGB for model input.
- Normalization: Pixel values are scaled to the [0, 1] range for stable training.
- Data Augmentation: Random rotations, flips, and other augmentations are applied to increase dataset diversity and reduce overfitting.
- Label Encoding: Class labels are encoded using scikit-learn’s LabelEncoder for compatibility with Keras.
- Base Model: MobileNetV2 (pretrained on ImageNet, used as feature extractor)
- Input shape: (224, 224, 3)
- All layers frozen by default; optionally, last N layers can be unfrozen for fine-tuning.
- Custom Head:
- Global Average Pooling
- Dropout (0.2)
- Dense layer with softmax activation for multi-class classification
- Training Details:
- Optimizer: Adam
- Loss: Categorical Crossentropy
- Metrics: Accuracy, Precision, Recall, AUC, Top-3 Accuracy
- K-Fold Cross Validation (5 folds) for robust evaluation
- Early Stopping and ReduceLROnPlateau callbacks for efficient training
- Ensemble: Final predictions can be made using a soft-voting ensemble of the best models from each fold.
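The architecture described above might be assembled roughly as follows. This is a sketch, not the notebook's exact code: the `fine_tune_last_n` switch and the `weights` parameter are illustrative conveniences.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 12  # battery, biological, ..., white-glass


def build_model(fine_tune_last_n: int = 0, weights: str = "imagenet") -> tf.keras.Model:
    # MobileNetV2 backbone as a frozen feature extractor.
    base = tf.keras.applications.MobileNetV2(
        input_shape=(224, 224, 3), include_top=False, weights=weights)
    base.trainable = False
    # Optionally unfreeze the last N layers for fine-tuning.
    if fine_tune_last_n > 0:
        for layer in base.layers[-fine_tune_last_n:]:
            layer.trainable = True
    # Custom head: pooling -> dropout -> softmax classifier.
    model = models.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dropout(0.2),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy",
                 tf.keras.metrics.Precision(name="precision"),
                 tf.keras.metrics.Recall(name="recall"),
                 tf.keras.metrics.AUC(name="auc"),
                 tf.keras.metrics.TopKCategoricalAccuracy(k=3, name="top3")])
    return model
```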
- Python 3.11+
- Node.js & npm
- Navigate to the `backend` directory:

  ```bash
  cd backend
  ```

- Install Python dependencies:

  ```bash
  pip install -r ../requirements.txt
  ```

- Start the FastAPI server:

  ```bash
  uvicorn main:app --host localhost --port 8089 --reload
  ```
- Navigate to the `frontend` directory:

  ```bash
  cd frontend
  ```

- Install Node.js dependencies:

  ```bash
  npm install
  ```

- Start the React development server:

  ```bash
  npm run dev
  ```

- Open the frontend in your browser (usually at http://localhost:5173).
- Upload an image to classify.
- View the predicted class, recyclability status, and the chart of previous classifications.
- (Optional) Download a PDF report of the result (if implemented).
- K-Fold Cross Validation: The dataset is split into 5 folds; each fold trains a fresh model and evaluates on its validation split.
- Ensemble: A soft-voting ensemble combines the best model from each cross-validation fold into the final predictor.
- Performance Metrics: Accuracy, precision, recall, AUC, and top-3 accuracy are tracked per fold.
- Confusion Matrix: Visualized for overall and per-fold results to analyze misclassifications.
- Learning Curves: Training and validation accuracy/loss curves are plotted for each fold and averaged.
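Soft voting itself is just an average of the per-fold probability outputs followed by an argmax; a minimal sketch (the fold outputs below are hypothetical):

```python
import numpy as np


def soft_vote(fold_probs: list[np.ndarray]) -> np.ndarray:
    """Average per-fold probability matrices of shape (n_samples, n_classes),
    then pick the class with the highest mean probability per sample."""
    avg = np.mean(np.stack(fold_probs), axis=0)
    return np.argmax(avg, axis=1)


# Two folds disagree on sample 0; averaging resolves the vote:
p1 = np.array([[0.7, 0.3], [0.6, 0.4]])
p2 = np.array([[0.2, 0.8], [0.5, 0.5]])
print(soft_vote([p1, p2]))  # sample 0 -> class 1 (mean 0.55), sample 1 -> class 0
```

Averaging probabilities (rather than hard-voting on labels) lets confident folds outweigh uncertain ones.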
Overall Results:

| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| battery | 0.95 | 0.93 | 0.94 |
| biological | 0.99 | 0.98 | 0.98 |
| brown-glass | 0.93 | 0.86 | 0.90 |
| cardboard | 0.95 | 0.92 | 0.94 |
| clothes | 0.99 | 0.98 | 0.99 |
| green-glass | 0.97 | 0.91 | 0.94 |
| metal | 0.89 | 0.81 | 0.85 |
| paper | 0.92 | 0.94 | 0.93 |
| plastic | 0.80 | 0.82 | 0.81 |
| shoes | 0.97 | 0.99 | 0.98 |
| trash | 0.90 | 0.96 | 0.93 |
| white-glass | 0.77 | 0.89 | 0.83 |
| accuracy | 0.92 | | |
| macro avg | 0.92 | 0.92 | 0.92 |
| weighted avg | 0.92 | 0.92 | 0.92 |

Contributions are welcome! Please fork the repository and submit a pull request. For major changes, open an issue first to discuss what you would like to change.
This project is for educational purposes as part of the Digital Image Processing course.