Automated Recyclable Contamination Detection in Industrial Processing Facilities
The following is my first research project, which uses deep learning to automate the sorting of recyclables. The abstract and implementation steps are presented below; the full paper can be found here.
Waste production per capita has been consistently increasing, generating more municipal solid waste every year. Processing plants can help address this abundance by repurposing recyclables into raw materials through the use of specialized machinery and human sorters. New contamination policies, however, call for cleaner raw materials, thereby requiring additional processing time and making it more challenging for facilities to recycle profitably. Items that previously could have been recycled are now sent directly to landfills, exacerbating the global landfill crisis. Additionally, there are concerns for human sorters with respect to operating efficiency and health (pathogen exposure and hearing loss). To address these issues, a deep-learning-based system capable of real-time segmentation and contamination detection was created. To build a dataset, conveyor belt streams containing a variety of recyclables were recorded from directly overhead at sixty frames per second. Using the VGG Image Annotation tool, each item present in a given frame was segmented and classified as contamination or non-contamination. A pretrained Mask R-CNN, a convolutional neural network optimized for instance segmentation tasks, was fine-tuned on this footage for 1000 iterations. In early-stage testing on a smaller line containing only high-density polyethylene (e.g. milk and detergent containers), the model was capable of identifying and segmenting nearly all items at the same rate as humans (~1.3 seconds). Further, the model was 96.1% accurate in detecting total contamination, a rate higher than that of human sorters. These results indicate that the system has the potential to be deployed to active processing plants.
As mentioned in the abstract, the dataset was composed of frames taken from conveyor belt footage, as pictured below.
The footage was converted to a collection of frames, each of which was labeled using the VGG Image Annotation tool.
The labeled frames were then split into training and validation subdirectories:
```
.
├── train
│   ├── 000.jpg
│   ├── ...
│   └── via_region_data.json
└── val
    ├── 00.jpg
    ├── ...
    └── via_region_data.json
```
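Before training, Detectron2 needs the VIA annotations converted into its standard dataset-dict format. A minimal, stdlib-only sketch of that conversion is below; the `class` region attribute, the class ordering, and the function names are assumptions (not taken from the paper), and the per-image `height`/`width` fields Detectron2 also expects (normally read from the image file) are omitted to keep the sketch self-contained.

```python
import json
import os

# Assumed label set; index 1 = contamination, matching NUM_CLASSES = 2 below
CLASSES = ["non-contamination", "contamination"]

def via_to_dicts(via, img_dir):
    """Convert a loaded VIA annotation dict into Detectron2-style records."""
    dataset_dicts = []
    for idx, entry in enumerate(via.values()):
        regions = entry["regions"]
        if isinstance(regions, dict):  # VIA 1.x keys regions by index
            regions = list(regions.values())
        annotations = []
        for region in regions:
            xs = region["shape_attributes"]["all_points_x"]
            ys = region["shape_attributes"]["all_points_y"]
            # Flatten (x, y) pairs into the [x0, y0, x1, y1, ...] polygon format
            poly = [c for xy in zip(xs, ys) for c in xy]
            label = region["region_attributes"]["class"]  # assumed attribute name
            annotations.append({
                "bbox": [min(xs), min(ys), max(xs), max(ys)],
                "bbox_mode": 0,  # BoxMode.XYXY_ABS
                "segmentation": [poly],
                "category_id": CLASSES.index(label),
            })
        dataset_dicts.append({
            "file_name": os.path.join(img_dir, entry["filename"]),
            "image_id": idx,
            "annotations": annotations,
        })
    return dataset_dicts

def get_footage_dicts(img_dir):
    """Load via_region_data.json from a split directory (train/ or val/)."""
    with open(os.path.join(img_dir, "via_region_data.json")) as f:
        return via_to_dicts(json.load(f), img_dir)
```

Each split can then be registered under the name the config expects, e.g. `DatasetCatalog.register("footage_train", lambda: get_footage_dicts("train"))` together with `MetadataCatalog.get("footage_train").set(thing_classes=CLASSES)`.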
Model Configuration & Training
```python
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
# Start from the COCO-pretrained Mask R-CNN (ResNet-50 + FPN) baseline
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("footage_train",)  # registered from the train/ directory
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 1000  # fine-tune for 1000 iterations
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2  # contamination / non-contamination

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```
Inference & Evaluation
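The inference code itself is not shown in the original writeup; with Detectron2, the usual pattern, continuing from the `cfg` object above, is roughly the following. The 0.7 score threshold and the `frame` variable (a BGR image array from the belt footage) are illustrative assumptions, not values from the paper.

```python
import os
from detectron2.engine import DefaultPredictor

# Load the fine-tuned weights and keep only confident detections
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # hypothetical threshold
predictor = DefaultPredictor(cfg)

outputs = predictor(frame)          # frame: BGR image array (H, W, 3)
instances = outputs["instances"]    # predicted boxes, masks, classes, scores
```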
Average Precision (AP): 0.961
Average Recall (AR): 0.955
F1 Score: 0.958
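As a sanity check, the reported F1 score is consistent with the other two metrics: F1 is the harmonic mean of precision and recall, which can be verified directly.

```python
precision = 0.961  # reported AP
recall = 0.955     # reported AR

# Harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 3))  # → 0.958, matching the reported F1 score
```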