Dishware Instance Segmentation

Instance segmentation of dishware in cluttered kitchen scenes using a fine-tuned Faster R-CNN with a Feature Pyramid Network (FPN). Developed for APS360 (Applied Deep Learning) at the University of Toronto, September to December 2025.

Overview

Detecting and segmenting individual dishware items in cluttered kitchen environments is a challenging computer vision task with applications in robotics, smart kitchens, and assistive technology. This project fine-tunes a Faster R-CNN model with a ResNet-50 backbone to perform instance segmentation on plates, bowls, cups, and utensils. Benchmarked against a YOLOv8n baseline, the model raises AP50 from 0.144 to 0.270, while mAP (0.50:0.95) improves more modestly, from 0.121 to 0.134.

Model Architecture

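The model builds on Faster R-CNN. The RoI Align layer and the mask branch listed below are the additions that the Mask R-CNN design makes to Faster R-CNN, and they are what enable per-instance masks. The components are: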

Backbone: ResNet-50 with a Feature Pyramid Network (FPN) for multi-scale feature extraction.
Region Proposal Network (RPN): Custom anchor sizes of 16, 32, 64, 128, 256, and 512 pixels to accommodate dishware ranging from small utensils to large plates.
RoI Align: Precise region-of-interest alignment, avoiding the quantization artifacts of RoI Pooling.
Dual Prediction Heads: One branch for bounding box regression and classification, and a second branch for pixel-level mask generation.
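
To make the anchor configuration concrete, the sketch below enumerates the anchor shapes that the six sizes listed above would produce. The aspect ratios (0.5, 1.0, 2.0) are an assumption borrowed from common RPN defaults, since this README does not say which ratios the notebook uses, and `anchor_shapes` is a hypothetical helper rather than code from the project.

```python
import math

# Anchor sizes from the README (side length of the square anchor, in pixels).
ANCHOR_SIZES = (16, 32, 64, 128, 256, 512)
# Assumption: height/width ratios. These are not stated in the README.
ASPECT_RATIOS = (0.5, 1.0, 2.0)


def anchor_shapes(sizes=ANCHOR_SIZES, ratios=ASPECT_RATIOS):
    """Return (width, height) pairs whose area stays close to size ** 2."""
    shapes = []
    for size in sizes:
        for ratio in ratios:
            # ratio = height / width, with width * height == size ** 2
            height = size * math.sqrt(ratio)
            width = size / math.sqrt(ratio)
            shapes.append((round(width, 1), round(height, 1)))
    return shapes


if __name__ == "__main__":
    for width, height in anchor_shapes():
        print(f"{width:7.1f} x {height:7.1f}")
```

Under these assumed ratios, the smallest anchors are roughly 23 x 11 pixels, which is the scale at which thin utensils would need to be matched, and the largest are roughly 362 x 724 pixels.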

The baseline comparison model is YOLOv8n (nano), a lightweight single-stage detector.

Dataset

Source: 245 images curated from COCO, LVIS, and Open Images datasets.
Split: 70/15/15 stratified train/validation/test split.
Augmentation: Random horizontal flip, random crop, and brightness jitter applied during training.
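
As an illustration of the 70/15/15 stratified split described above, here is a minimal sketch in plain Python. It assumes each image has a single stratum label (for example its dominant dishware class). That labelling scheme, the class counts, and the `stratified_split` helper are all hypothetical and are not taken from the notebook.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split item indices into train/val/test, stratifying on `labels`.

    `labels[i]` is the stratum of item i (hypothetical: the dominant
    dishware class of image i).
    """
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # the remainder goes to test
    return train, val, test


if __name__ == "__main__":
    # Hypothetical strata for 245 images; real strata come from the annotations.
    labels = ["plate"] * 90 + ["bowl"] * 70 + ["cup"] * 60 + ["utensil"] * 25
    train, val, test = stratified_split(labels)
    print(len(train), len(val), len(test))
```

Splitting within each stratum keeps rare classes such as utensils represented in all three subsets, which matters with only 245 images.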

Results

Model	AP50	mAP (0.50:0.95)
Faster R-CNN (ours)	0.270	0.134
YOLOv8n (baseline)	0.144	0.121
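
For readers unfamiliar with the two metrics: AP50 counts a detection as correct when its intersection over union (IoU) with a ground-truth object is at least 0.50, while mAP (0.50:0.95) averages AP over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows the IoU test itself on a pair of hypothetical boxes. It uses box IoU for simplicity, and whether the reported numbers are box or mask metrics is not stated here.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# COCO-style thresholds: 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, gt_box):
    """List the IoU thresholds at which the prediction counts as a match."""
    iou = box_iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if iou >= t]


if __name__ == "__main__":
    gt = (10, 10, 110, 110)    # hypothetical ground-truth box
    pred = (20, 20, 120, 120)  # hypothetical prediction, shifted by 10 px
    print(round(box_iou(pred, gt), 3))   # 0.681
    print(thresholds_passed(pred, gt))   # [0.5, 0.55, 0.6, 0.65]
```

A prediction that is only slightly misaligned counts toward AP50 but fails the stricter thresholds, which is one reason the mAP (0.50:0.95) column is much lower than the AP50 column for both models.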

Key findings:

The model performs best on plates, bowls, and cups.
Small utensils remain challenging due to scale variation and occlusion.
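On unseen data, the model produces an average of 13.17 detections per image at a confidence threshold of 0.60. This indicates that it keeps producing confident detections outside the training set, although the count alone does not measure how accurate those detections are.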
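
The detections-per-image figure above can be computed as sketched below. The prediction records are hypothetical placeholders in a simplified format (one dict per detection with a `score` field), not output from the project, and `filter_by_score` and `mean_detections_per_image` are illustrative helper names.

```python
def filter_by_score(detections, threshold=0.60):
    """Keep detections whose confidence score is at least `threshold`."""
    return [d for d in detections if d["score"] >= threshold]


def mean_detections_per_image(per_image_detections, threshold=0.60):
    """Average number of detections that survive the threshold."""
    counts = [len(filter_by_score(dets, threshold))
              for dets in per_image_detections]
    return sum(counts) / len(counts) if counts else 0.0


if __name__ == "__main__":
    # Hypothetical predictions for two images; not results from the project.
    predictions = [
        [{"label": "plate", "score": 0.91}, {"label": "cup", "score": 0.42}],
        [{"label": "bowl", "score": 0.77}, {"label": "utensil", "score": 0.60},
         {"label": "cup", "score": 0.15}],
    ]
    print(mean_detections_per_image(predictions))  # 1.5
```

Raising the threshold trades recall for precision, so the average count is only meaningful when quoted together with the threshold, as it is above.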

Getting Started

Google Colab (Recommended)

Upload dishware_segmentation.ipynb to Google Colab.
Select a GPU runtime (Runtime > Change runtime type > GPU).
Run all cells sequentially.

Local Setup

# Clone the repository
git clone https://github.com/BidoCodeHub/dishware-instance-segmentation.git
cd dishware-instance-segmentation

# Create a virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install torch torchvision matplotlib pycocotools jupyter

Then open the notebook with Jupyter:

jupyter notebook dishware_segmentation.ipynb

Requirements:

Python 3.8+
PyTorch 1.12+
torchvision 0.13+
matplotlib
pycocotools
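
To confirm that a local environment meets the minimum versions listed above, a small check along these lines may help. It is a sketch using only the standard library, and `parse_version` and `check_requirements` are hypothetical helpers that are not part of the repository.

```python
from importlib import metadata

# Minimum versions from the requirements list above.
MINIMUMS = {"torch": (1, 12), "torchvision": (0, 13)}


def parse_version(text):
    """Turn '2.1.0+cu118' into (2, 1, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_requirements(minimums=MINIMUMS):
    """Map each package name to 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = parse_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed >= minimum else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```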

Usage

The notebook is organized into the following sections:

Environment Setup - Install and import required libraries.
Dataset Preparation - Download, preprocess, and augment the dishware images.
Model Definition - Configure Faster R-CNN with custom anchors and FPN.
Training - Fine-tune the model on the training set with validation monitoring.
Evaluation - Compute AP50 and mAP metrics on the test set.
Inference and Visualization - Run predictions on new images and visualize segmentation masks.
Baseline Comparison - Train and evaluate YOLOv8n under the same conditions.
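
One detail of the dataset preparation step is worth spelling out: geometric augmentations such as the random horizontal flip must be applied to the annotations as well as the image. The sketch below shows the transform for a box and a mask in plain Python. It is an illustration under simplified data formats (a tuple for the box, a list of rows for the mask), not the notebook's implementation.

```python
def hflip_box(box, image_width):
    """Mirror an (x1, y1, x2, y2) box about the vertical centre line."""
    x1, y1, x2, y2 = box
    return (image_width - x2, y1, image_width - x1, y2)


def hflip_mask(mask):
    """Mirror a binary mask stored as a list of rows."""
    return [row[::-1] for row in mask]


if __name__ == "__main__":
    print(hflip_box((10, 20, 50, 80), image_width=200))  # (150, 20, 190, 80)
    print(hflip_mask([[1, 1, 0, 0], [1, 0, 0, 0]]))      # [[0, 0, 1, 1], [0, 0, 0, 1]]
```

Note that the flipped box swaps the roles of `x1` and `x2` so that the left edge stays smaller than the right edge.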

Acknowledgments

This project was completed as part of APS360 (Applied Deep Learning) at the University of Toronto. We thank the course instructors and teaching assistants for their guidance.

The training data is sourced from the COCO, LVIS, and Open Images datasets.

License

This project is licensed under the MIT License. See LICENSE for details.
