Image classification of recyclable trash. Passion project for Metis data science bootcamp.

LKchemposer/IsItRecyc-CNN

IsItRecyc-CNN

This is Khoa Lam's passion project from the Metis data science bootcamp in NYC. Recycling contamination is both an environmental and an economic problem: recycling companies often redirect contaminated bales of recyclables to landfills, which increases waste output and costs businesses resources. Here, I used a convolutional neural network (CNN) to predict from an image whether an object is recyclable, with the aim of helping consumers minimize recycling contamination.

Other projects and organizations share this goal (e.g., TrashNet, Multilayer Hybrid Deep-Learning Method for Waste Classification and Recycling, and ZenRobotics). This project differs in that it uses mixed image sources (i.e., digital images and photographs), whereas many other projects use only photos.

The final CNN was trained on an AWS server with a 60/20/20 train-validate-test split and achieved F0.5 = 0.90 for recyclability and an average AUC of 0.75 for material classification. The model was then deployed as a Dash web app (currently defunct) on AWS Elastic Beanstalk. A presentation of this project can be found here.

demo.mov

Dataset

The dataset (as zip files) is now available on Google Drive.

Image sources for this project include:

  1. Google Image Search, with URLs collected through the Google Custom Search API (code in the getting-urls notebook)
  2. TrashNet
  3. A subset of Caltech 256 Image Dataset
  4. A subset of Flickr Material Database (FMD)

Currently, the dataset consists of 11045 images separated into 8 categories:

  1. Recyclables: 7543 images
    1. Glass (e.g., jars, bottles): 729 images
    2. Metal (e.g., cans, aluminum foil): 1747 images
    3. Paper (e.g., cardboard, books): 3230 images
    4. Plastic (e.g., soda bottles, food containers): 1837 images
  2. Non-recyclables: 3502 images
    1. Glass (e.g., lightbulbs, mirror): 531 images
    2. Plastic (e.g., styrofoam, sports balls): 1850 images
    3. Tanglers (e.g., wire, cable): 290 images
    4. Other (e.g., battery, ceramic): 831 images
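
The category list above can be restated as a small table from which both targets (recyclability and material) are derived. This is only a sketch: the names `CATEGORIES`, `targets`, and `summarize` are hypothetical, and how the repository actually encodes its labels is not stated here.

```python
# (recyclable?, material, image count) for each of the 8 categories listed above.
# The tuple layout and lowercase material names are illustrative assumptions.
CATEGORIES = [
    (True, "glass", 729),
    (True, "metal", 1747),
    (True, "paper", 3230),
    (True, "plastic", 1837),
    (False, "glass", 531),
    (False, "plastic", 1850),
    (False, "tanglers", 290),
    (False, "other", 831),
]


def targets(category_index):
    """Return (binary recyclability label, material name) for one category."""
    recyclable, material, _count = CATEGORIES[category_index]
    return int(recyclable), material


def summarize():
    """Total image counts overall and per recyclability group."""
    total = sum(count for _, _, count in CATEGORIES)
    recyclable = sum(count for is_rec, _, count in CATEGORIES if is_rec)
    return {
        "total": total,
        "recyclable": recyclable,
        "non_recyclable": total - recyclable,
    }


if __name__ == "__main__":
    print(summarize())
    print(targets(4))  # non-recyclable glass
```

The totals confirm that the subcategory counts add up to the stated 7543, 3502, and 11045 images, and that recyclables make up roughly 68% of the dataset.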

This model has two distinct outputs: (1) recyclability (binary output), and (2) material classification (categorical output). Recyclability is trained with F0.5 as the metric because F0.5 weights precision more heavily than recall (recall counts half as much as precision). A false positive, i.e., a non-recyclable item predicted as recyclable, is what contaminates the recycling stream, so precision matters most. Material classification is trained with AUC, which measures how well each class is separated from the others.
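
To make the two metrics concrete, here is a minimal pure-Python sketch of F-beta and of a macro-averaged one-vs-rest AUC. It is not the repository's training code: the function names are hypothetical, and the one-vs-rest macro averaging is an assumption about how the "averaged AUC" was computed.

```python
def fbeta(y_true, y_pred, beta=0.5):
    """F-beta for binary labels, where 1 = recyclable (the positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def binary_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the number of samples, so this
    is for illustration only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(labels, probs):
    """Average of one-vs-rest AUCs (assumed averaging scheme).

    labels: class index per sample; probs: one row of per-class scores
    per sample.
    """
    n_classes = len(probs[0])
    aucs = []
    for k in range(n_classes):
        y_k = [1 if c == k else 0 for c in labels]
        s_k = [row[k] for row in probs]
        aucs.append(binary_auc(y_k, s_k))
    return sum(aucs) / n_classes


if __name__ == "__main__":
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    eager = [1, 1, 1, 1, 1, 1, 0, 0]     # finds every recyclable, 2 false positives
    cautious = [1, 1, 0, 0, 0, 0, 0, 0]  # misses 2 recyclables, no false positives
    for name, pred in [("eager", eager), ("cautious", cautious)]:
        print(name, fbeta(y_true, pred, beta=0.5), fbeta(y_true, pred, beta=1.0))
```

In this toy example the cautious predictor scores higher on F0.5 (about 0.83 versus 0.71) even though the eager one scores higher on F1 (0.80 versus about 0.67), which is the behavior wanted when false positives contaminate the recycling stream.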

Notes

The code presented here is slightly simplified so that it can run on a local machine. To train on the full dataset (~11,000 images), an AWS Deep Learning AMI is recommended.
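
For a local run, one way to reproduce the 60/20/20 train-validate-test split is a per-label (stratified) split, sketched below. The function name `stratified_split`, the `(path, label)` input format, and the choice to stratify are assumptions for illustration; the notebooks may split the data differently.

```python
import random
from collections import defaultdict


def stratified_split(items, percents=(60, 20, 20), seed=0):
    """Split (path, label) pairs into train/validate/test lists per label.

    percents are integer percentages that must sum to 100; any rounding
    remainder within a label goes to the test set.
    """
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")

    by_label = defaultdict(list)
    for path, label in items:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, validate, test = [], [], []
    for label in sorted(by_label):
        paths = by_label[label]
        rng.shuffle(paths)
        n_train = len(paths) * percents[0] // 100
        n_val = len(paths) * percents[1] // 100
        train += [(p, label) for p in paths[:n_train]]
        validate += [(p, label) for p in paths[n_train:n_train + n_val]]
        test += [(p, label) for p in paths[n_train + n_val:]]
    return train, validate, test


if __name__ == "__main__":
    # Hypothetical file names, purely for demonstration.
    demo = [(f"paper_{i}.jpg", "paper") for i in range(100)]
    demo += [(f"metal_{i}.jpg", "metal") for i in range(50)]
    tr, va, te = stratified_split(demo)
    print(len(tr), len(va), len(te))
```

Splitting within each label keeps small categories such as tanglers represented in all three sets.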

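Python packages required: pandas, numpy, seaborn, matplotlib, keras, tensorflow, sklearn, PIL, cv2 (sklearn, PIL, and cv2 are import names; they are installed as scikit-learn, Pillow, and opencv-python)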
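
A quick way to check the environment before running the notebooks is to test whether each import name resolves. The helper below is a hypothetical convenience, not part of the repository; it maps import names to their usual pip distribution names.

```python
from importlib.util import find_spec

# Import name -> pip distribution name for the packages listed above.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "seaborn": "seaborn",
    "matplotlib": "matplotlib",
    "keras": "keras",
    "tensorflow": "tensorflow",
    "sklearn": "scikit-learn",
    "PIL": "Pillow",
    "cv2": "opencv-python",
}


def missing_packages(required=REQUIRED):
    """Return pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for module_name, pip_name in required.items()
        if find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages; try: pip install " + " ".join(missing))
    else:
        print("All required packages are importable.")
```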
