# Material Recognition using Transfer Learning on YOLOv5

The goal of this project is to train a convolutional neural network to classify color photographs of surfaces into one of ten common material categories: fabric, foliage, glass, leather, metal, paper, plastic, stone, water, and wood. Some tasks to consider:

1. Modify previously published architectures, e.g., increase the network depth or reduce the number of parameters.
2. Try data augmentation to increase the number of training images.
3. Try a larger dataset, the Materials in Context Database (MINC).

Dataset: Flickr Material Database (FMD)

This notebook focuses on transfer learning with the YOLOv5 classification models. It proceeds as follows:

1. Setting up and preparing the notebook environment
2. Download the ImageNet-pretrained models using YOLOv5 utils
3. Train-val-test split
4. Train on train dataset for all 5 models (n, s, m, l, x)
5. Validate on validation dataset for all 5 models (n, s, m, l, x)
6. Predict on test dataset for all 5 models (n, s, m, l, x)
7. Calculate test accuracy score for all 5 models (n, s, m, l, x)

# Setup

In [4]:
import os
os.chdir("/kaggle/working")  # clone into the Kaggle working directory
!git clone https://github.com/ultralytics/yolov5  # clone
%cd yolov5
%pip install -qr requirements.txt  # install

import torch
import utils
display = utils.notebook_init()  # checks

YOLOv5 🚀 v6.2-237-g55e9516 Python-3.7.12 torch-1.11.0 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)


Setup complete ✅ (2 CPUs, 15.6 GB RAM, 3964.5/4030.6 GB disk)


# Download the ImageNet-pretrained models using YOLOv5 utils

In [5]:
from utils.downloads import attempt_download

p5 = ['n', 's', 'm', 'l', 'x']  # P5 models
cls = [f'{x}-cls' for x in p5]  # classification models

for x in cls:
    attempt_download(f'weights/yolov5{x}.pt')

Downloading https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5n-cls.pt to weights/yolov5n-cls.pt...
  0%|          | 0.00/4.87M [00:00<?, ?B/s]
Downloading https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5s-cls.pt to weights/yolov5s-cls.pt...
  0%|          | 0.00/10.5M [00:00<?, ?B/s]
Downloading https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5m-cls.pt to weights/yolov5m-cls.pt...
  0%|          | 0.00/24.9M [00:00<?, ?B/s]
Downloading https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5l-cls.pt to weights/yolov5l-cls.pt...
  0%|          | 0.00/50.9M [00:00<?, ?B/s]
Downloading https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5x-cls.pt to weights/yolov5x-cls.pt...
  0%|          | 0.00/92.0M [00:00<?, ?B/s]

# Train-val-test split

In [7]:
!pip install split-folders

Collecting split-folders
  Downloading split_folders-0.5.1-py3-none-any.whl (8.4 kB)
Installing collected packages: split-folders
Successfully installed split-folders-0.5.1

In [6]:
import os
os.chdir("/kaggle/input/")

In [8]:
import splitfolders
# Split FMD into 70% train and 30% val (fixed seed for reproducibility)
splitfolders.ratio('fmddataset/image', output="/kaggle/working/yolov5/output", seed=1337, ratio=(.7, 0.3))

Copying files: 1003 files [00:04, 205.97 files/s]


In [17]:
# Copy the separate 30-image test set entirely into output/test (0% train, 0% val, 100% test)
splitfolders.ratio('../input/testdata/test/data/test', output="/kaggle/working/yolov5/output", seed=1337, ratio=(0, 0, 1))

Copying files: 30 files [00:02, 11.98 files/s]
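
As a sanity check on the split, the number of images per class in each output folder can be counted. The sketch below assumes the `output/<split>/<class>/` layout written by `splitfolders`; the helper name `count_images` and the extension list are illustrative choices, not part of the original notebook.

```python
from collections import Counter
from pathlib import Path

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def count_images(root):
    """Return {split: Counter({class_name: n_images})} for root/<split>/<class>/ files."""
    counts = {}
    for split_dir in sorted(Path(root).iterdir()):
        if not split_dir.is_dir():
            continue
        per_class = Counter()
        for class_dir in sorted(split_dir.iterdir()):
            if class_dir.is_dir():
                # Count only files whose extension looks like an image
                per_class[class_dir.name] = sum(
                    1 for f in class_dir.iterdir() if f.suffix.lower() in IMAGE_EXTS
                )
        counts[split_dir.name] = per_class
    return counts
```

Calling `count_images("/kaggle/working/yolov5/output")` after both split cells should list the `train`, `val`, and `test` folders with one entry per material class.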


# Train on train dataset for all 5 models (n, s, m, l, x)

In [9]:
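os.chdir("/kaggle/working/")
%cd yolov5
from utils.downloads import attempt_download

p5 = ['n', 's', 'm', 'l', 'x']  # P5 models
cls = [f'{x}-cls' for x in p5]  # classification models

# Make sure the pretrained weights are present (skipped if already downloaded)
for x in cls:
    attempt_download(f'weights/yolov5{x}.pt')

# Train each model for 100 epochs at 224x224; runs are saved under runs/train-cls
# NOTE: --pretrained is a boolean-like option in classify/train.py (its help text gives
# `--pretrained False`); the checkpoint itself is chosen by --model, so it is worth
# checking how a path value is interpreted here.
for x in p5:
    !python classify/train.py --model yolov5{x}-cls.pt --data output --epochs 100 --img 224 --pretrained weights/yolov5{x}-cls.pt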

/kaggle/working/yolov5
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: (30 second timeout) 
wandb: W&B disabled due to login timeout.
classify/train: model=yolov5n-cls.pt, data=output, epochs=100, batch_size=64, imgsz=224, nosave=False, cache=None, device=, workers=8, project=runs/train-cls, name=exp, exist_ok=False, pretrained=weights/yolov5n-cls.pt, optimizer=Adam, lr0=0.001, decay=5e-05, label_smoothing=0.1, cutoff=None, dropout=None, verbose=False, seed=0, local_rank=-1
github: up to date with https://github.com/ultralytics/yolov5 ✅
YOLOv5 🚀 v6.2-237-g55e9516 Python-3.7.12 torch-1.11.0 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)

TensorBoard: Start with 'tensorboard --logdir runs/train-cls', view at http://localhost:6006/
albumentations: RandomResizedCrop(p=1.0, height=224, width=224, 

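Because the loop trains the five models in sequence with the default `name=exp`, the validation and prediction cells in this notebook read the weights from `exp`, `exp2`, ..., `exp5`. A minimal sketch of that mapping follows; it assumes the `runs/train-cls` folder was empty before training, and the helper name `best_weights_path` is hypothetical.

```python
from pathlib import PurePosixPath

MODEL_SIZES = ["n", "s", "m", "l", "x"]  # same order as the training loop


def best_weights_path(run_index, project="runs/train-cls", name="exp"):
    """Path to best.pt for the run_index-th training run (0-based)."""
    # Assumption: runs are named exp, exp2, exp3, ... starting from an empty project folder.
    suffix = "" if run_index == 0 else str(run_index + 1)
    return PurePosixPath(project) / f"{name}{suffix}" / "weights" / "best.pt"


# Map each model size to the checkpoint produced by its training run
BEST_WEIGHTS = {size: best_weights_path(i) for i, size in enumerate(MODEL_SIZES)}
```

Under that assumption, `BEST_WEIGHTS["m"]` points at `runs/train-cls/exp3/weights/best.pt`, which is the checkpoint validated for the medium model.
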
# Validate on validation dataset for all 5 models (n, s, m, l, x)

In [10]:
!python classify/val.py --weights runs/train-cls/exp/weights/best.pt --data ./output/

classify/val: data=./output/, weights=['runs/train-cls/exp/weights/best.pt'], batch_size=128, imgsz=224, device=, workers=8, verbose=True, project=runs/val-cls, name=exp, exist_ok=False, half=False, dnn=False
YOLOv5 🚀 v6.2-237-g55e9516 Python-3.7.12 torch-1.11.0 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)

Fusing layers... 
Model summary: 117 layers, 1221274 parameters, 0 gradients, 2.9 GFLOPs
validating: 100%|██████████| 3/3 [00:01<00:00,  2.33it/s]                       
                   Class      Images    top1_acc    top5_acc
                     all         302       0.444       0.868
                  fabric          31       0.194       0.806
                 foliage          31       0.677       0.839
                   glass          30       0.467       0.933
                 leather          30       0.367         0.9
                   metal          30       0.333       0.867
                   paper          30       0.433       0.767
                 plastic 

In [11]:
!python classify/val.py --weights runs/train-cls/exp2/weights/best.pt --data ./output/

classify/val: data=./output/, weights=['runs/train-cls/exp2/weights/best.pt'], batch_size=128, imgsz=224, device=, workers=8, verbose=True, project=runs/val-cls, name=exp, exist_ok=False, half=False, dnn=False
YOLOv5 🚀 v6.2-237-g55e9516 Python-3.7.12 torch-1.11.0 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)

Fusing layers... 
Model summary: 117 layers, 4179498 parameters, 0 gradients, 10.4 GFLOPs
validating: 100%|██████████| 3/3 [00:01<00:00,  2.33it/s]                       
                   Class      Images    top1_acc    top5_acc
                     all         302       0.397       0.834
                  fabric          31       0.129       0.677
                 foliage          31        0.71       0.903
                   glass          30       0.267       0.967
                 leather          30         0.3       0.833
                   metal          30       0.333       0.867
                   paper          30       0.233       0.767
                 plasti

In [12]:
!python classify/val.py --weights runs/train-cls/exp3/weights/best.pt --data ./output/

classify/val: data=./output/, weights=['runs/train-cls/exp3/weights/best.pt'], batch_size=128, imgsz=224, device=, workers=8, verbose=True, project=runs/val-cls, name=exp, exist_ok=False, half=False, dnn=False
YOLOv5 🚀 v6.2-237-g55e9516 Python-3.7.12 torch-1.11.0 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)

Fusing layers... 
Model summary: 166 layers, 11679002 parameters, 0 gradients, 30.6 GFLOPs
validating: 100%|██████████| 3/3 [00:01<00:00,  2.12it/s]                       
                   Class      Images    top1_acc    top5_acc
                     all         302       0.368       0.831
                  fabric          31      0.0323       0.871
                 foliage          31       0.645       0.839
                   glass          30         0.3       0.733
                 leather          30       0.167         0.8
                   metal          30       0.367       0.867
                   paper          30       0.233       0.767
                 plast

In [15]:
!python classify/val.py --weights runs/train-cls/exp4/weights/best.pt --data ./output/

classify/val: data=./output/, weights=['runs/train-cls/exp4/weights/best.pt'], batch_size=128, imgsz=224, device=, workers=8, verbose=True, project=runs/val-cls, name=exp, exist_ok=False, half=False, dnn=False
YOLOv5 🚀 v6.2-237-g55e9516 Python-3.7.12 torch-1.11.0 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)

Fusing layers... 
Model summary: 215 layers, 25267786 parameters, 0 gradients, 68.3 GFLOPs
validating: 100%|██████████| 3/3 [00:01<00:00,  1.72it/s]                       
                   Class      Images    top1_acc    top5_acc
                     all         302       0.364       0.811
                  fabric          31      0.0968       0.935
                 foliage          31       0.677       0.742
                   glass          30         0.3         0.8
                 leather          30       0.167       0.667
                   metal          30         0.2       0.833
                   paper          30         0.1       0.933
                 plast

In [16]:
!python classify/val.py --weights runs/train-cls/exp5/weights/best.pt --data ./output/

classify/val: data=./output/, weights=['runs/train-cls/exp5/weights/best.pt'], batch_size=128, imgsz=224, device=, workers=8, verbose=True, project=runs/val-cls, name=exp, exist_ok=False, half=False, dnn=False
YOLOv5 🚀 v6.2-237-g55e9516 Python-3.7.12 torch-1.11.0 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)

Fusing layers... 
Model summary: 264 layers, 46804410 parameters, 0 gradients, 128.9 GFLOPs
validating: 100%|██████████| 3/3 [00:01<00:00,  1.51it/s]                       
                   Class      Images    top1_acc    top5_acc
                     all         302       0.311       0.778
                  fabric          31       0.161       0.548
                 foliage          31        0.71       0.806
                   glass          30         0.1       0.733
                 leather          30         0.5       0.833
                   metal          30         0.2       0.967
                   paper          30       0.133       0.733
                 plas
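
To compare the validation tables programmatically rather than by eye, the printed rows can be parsed into a dictionary. This is a sketch under the assumption that each result row has exactly four whitespace-separated fields (class, images, top-1, top-5), as in the output above; `parse_val_table` is a hypothetical helper and the sample text is copied from the first validation run.

```python
def parse_val_table(text):
    """Parse rows of the form '<class> <images> <top1_acc> <top5_acc>' into a dict."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue  # skip truncated or unrelated lines
        name, images, top1, top5 = parts
        try:
            rows[name] = {"images": int(images), "top1": float(top1), "top5": float(top5)}
        except ValueError:
            continue  # header row or other non-numeric line
    return rows


# First rows of the yolov5n-cls validation output shown above
SAMPLE = """\
                   Class      Images    top1_acc    top5_acc
                     all         302       0.444       0.868
                  fabric          31       0.194       0.806
                 foliage          31       0.677       0.839
"""

rows = parse_val_table(SAMPLE)
# Class with the lowest top-1 accuracy among the parsed per-class rows
hardest = min((n for n in rows if n != "all"), key=lambda n: rows[n]["top1"])
```

Running the same parser on each model's full output would make it straightforward to tabulate per-class top-1 accuracy across the five model sizes.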

# Predict on test dataset for all 5 models (n, s, m, l, x)

In [19]:
materials = ['fabric', 'foliage', 'glass', 'leather', 'metal', 'paper', 'plastic', 'stone', 'water', 'wood']
for x in materials:
    !python classify/predict.py --weights runs/train-cls/exp/weights/best.pt --source ./output/test/{x}

classify/predict: weights=['runs/train-cls/exp/weights/best.pt'], source=./output/test/fabric, data=data/coco128.yaml, imgsz=[224, 224], device=, view_img=False, save_txt=True, nosave=False, augment=False, visualize=False, update=False, project=runs/predict-cls, name=exp, exist_ok=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v6.2-237-g55e9516 Python-3.7.12 torch-1.11.0 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)

Fusing layers... 
Model summary: 117 layers, 1221274 parameters, 0 gradients, 2.9 GFLOPs
image 1/3 /kaggle/working/yolov5/output/test/fabric/IMG_1909.JPG: 224x224 paper 0.40, leather 0.26, fabric 0.16, metal 0.10, wood 0.03, 6.7ms
image 2/3 /kaggle/working/yolov5/output/test/fabric/Image1.jpeg: 224x224 paper 0.46, metal 0.16, leather 0.12, plastic 0.07, fabric 0.06, 3.5ms
image 3/3 /kaggle/working/yolov5/output/test/fabric/Image2.jpeg: 224x224 wood 0.36, stone 0.31, leather 0.21, fabric 0.05, metal 0.03, 3.5ms
Speed: 0.3ms pre-process, 4.6ms inference, 0.1ms NM

In [20]:
for x in materials:
    !python classify/predict.py --weights runs/train-cls/exp2/weights/best.pt --source ./output/test/{x}

classify/predict: weights=['runs/train-cls/exp2/weights/best.pt'], source=./output/test/fabric, data=data/coco128.yaml, imgsz=[224, 224], device=, view_img=False, save_txt=True, nosave=False, augment=False, visualize=False, update=False, project=runs/predict-cls, name=exp, exist_ok=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v6.2-237-g55e9516 Python-3.7.12 torch-1.11.0 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)

Fusing layers... 
Model summary: 117 layers, 4179498 parameters, 0 gradients, 10.4 GFLOPs
image 1/3 /kaggle/working/yolov5/output/test/fabric/IMG_1909.JPG: 224x224 stone 0.27, paper 0.17, leather 0.16, wood 0.12, fabric 0.07, 3.6ms
image 2/3 /kaggle/working/yolov5/output/test/fabric/Image1.jpeg: 224x224 stone 0.29, leather 0.24, fabric 0.12, paper 0.12, plastic 0.08, 3.6ms
image 3/3 /kaggle/working/yolov5/output/test/fabric/Image2.jpeg: 224x224 stone 0.44, leather 0.24, fabric 0.17, wood 0.09, paper 0.02, 3.6ms
Speed: 0.3ms pre-process, 3.6ms inference, 0.0ms 

In [21]:
for x in materials:
    !python classify/predict.py --weights runs/train-cls/exp3/weights/best.pt --source ./output/test/{x}

classify/predict: weights=['runs/train-cls/exp3/weights/best.pt'], source=./output/test/fabric, data=data/coco128.yaml, imgsz=[224, 224], device=, view_img=False, save_txt=True, nosave=False, augment=False, visualize=False, update=False, project=runs/predict-cls, name=exp, exist_ok=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v6.2-237-g55e9516 Python-3.7.12 torch-1.11.0 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)

Fusing layers... 
Model summary: 166 layers, 11679002 parameters, 0 gradients, 30.6 GFLOPs
image 1/3 /kaggle/working/yolov5/output/test/fabric/IMG_1909.JPG: 224x224 stone 0.21, paper 0.18, leather 0.16, metal 0.09, wood 0.08, 5.9ms
image 2/3 /kaggle/working/yolov5/output/test/fabric/Image1.jpeg: 224x224 paper 0.25, plastic 0.15, metal 0.12, leather 0.11, stone 0.10, 8.4ms
image 3/3 /kaggle/working/yolov5/output/test/fabric/Image2.jpeg: 224x224 stone 0.47, wood 0.21, leather 0.16, fabric 0.07, water 0.04, 5.8ms
Speed: 0.3ms pre-process, 6.7ms inference, 0.1ms N

In [22]:
for x in materials:
    !python classify/predict.py --weights runs/train-cls/exp4/weights/best.pt --source ./output/test/{x}

classify/predict: weights=['runs/train-cls/exp4/weights/best.pt'], source=./output/test/fabric, data=data/coco128.yaml, imgsz=[224, 224], device=, view_img=False, save_txt=True, nosave=False, augment=False, visualize=False, update=False, project=runs/predict-cls, name=exp, exist_ok=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v6.2-237-g55e9516 Python-3.7.12 torch-1.11.0 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)

Fusing layers... 
Model summary: 215 layers, 25267786 parameters, 0 gradients, 68.3 GFLOPs
image 1/3 /kaggle/working/yolov5/output/test/fabric/IMG_1909.JPG: 224x224 stone 0.31, water 0.15, wood 0.10, fabric 0.09, metal 0.09, 13.6ms
image 2/3 /kaggle/working/yolov5/output/test/fabric/Image1.jpeg: 224x224 water 0.28, stone 0.13, paper 0.12, fabric 0.11, glass 0.10, 6.5ms
image 3/3 /kaggle/working/yolov5/output/test/fabric/Image2.jpeg: 224x224 stone 0.45, wood 0.18, leather 0.10, water 0.08, fabric 0.06, 6.5ms
Speed: 0.4ms pre-process, 8.9ms inference, 0.1ms NMS 

In [23]:
for x in materials:
    !python classify/predict.py --weights runs/train-cls/exp5/weights/best.pt --source ./output/test/{x}

classify/predict: weights=['runs/train-cls/exp5/weights/best.pt'], source=./output/test/fabric, data=data/coco128.yaml, imgsz=[224, 224], device=, view_img=False, save_txt=True, nosave=False, augment=False, visualize=False, update=False, project=runs/predict-cls, name=exp, exist_ok=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v6.2-237-g55e9516 Python-3.7.12 torch-1.11.0 CUDA:0 (Tesla P100-PCIE-16GB, 16281MiB)

Fusing layers... 
Model summary: 264 layers, 46804410 parameters, 0 gradients, 128.9 GFLOPs
image 1/3 /kaggle/working/yolov5/output/test/fabric/IMG_1909.JPG: 224x224 leather 0.24, metal 0.15, water 0.12, stone 0.12, paper 0.11, 8.9ms
image 2/3 /kaggle/working/yolov5/output/test/fabric/Image1.jpeg: 224x224 metal 0.17, paper 0.17, water 0.12, glass 0.11, fabric 0.11, 8.8ms
image 3/3 /kaggle/working/yolov5/output/test/fabric/Image2.jpeg: 224x224 stone 0.28, wood 0.25, leather 0.19, fabric 0.09, metal 0.06, 9.3ms
Speed: 0.3ms pre-process, 9.0ms inference, 0.0ms NM
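
Counting correct predictions from these logs can be automated. The sketch below assumes each `image i/n <path>: 224x224 <label> <score>, ...` line lists the top-scoring class first and that the true class is the name of the image's parent folder; the function name `top1_hits` is hypothetical, and the sample lines are copied from the yolov5n-cls output for the `fabric` folder.

```python
import re
from pathlib import PurePosixPath

# Matches e.g. "image 1/3 /path/to/fabric/img.jpg: 224x224 paper 0.40, ..."
LINE_RE = re.compile(r"^image \d+/\d+ (?P<path>.+): \d+x\d+ (?P<top1>\w+) \d")


def top1_hits(log_text):
    """Return (n_correct, n_images) for the prediction lines found in log_text."""
    hits = total = 0
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a per-image prediction line
        true_label = PurePosixPath(m.group("path")).parent.name
        total += 1
        hits += int(m.group("top1") == true_label)
    return hits, total


SAMPLE_LOG = """\
image 1/3 /kaggle/working/yolov5/output/test/fabric/IMG_1909.JPG: 224x224 paper 0.40, leather 0.26, fabric 0.16, metal 0.10, wood 0.03, 6.7ms
image 2/3 /kaggle/working/yolov5/output/test/fabric/Image1.jpeg: 224x224 paper 0.46, metal 0.16, leather 0.12, plastic 0.07, fabric 0.06, 3.5ms
image 3/3 /kaggle/working/yolov5/output/test/fabric/Image2.jpeg: 224x224 wood 0.36, stone 0.31, leather 0.21, fabric 0.05, metal 0.03, 3.5ms
"""

hits, total = top1_hits(SAMPLE_LOG)
```

For the three fabric images above, none of the top-1 labels is `fabric`, so the sample yields 0 hits out of 3. Applying the function to the full captured output of one model would give that model's test-set count.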

# Calculate test accuracy score for all 5 models (n, s, m, l, x)

In [24]:
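# Top-1 test accuracy: correct predictions out of the 30 test images
test_accuracy_exp = 11/30   # yolov5n-cls
test_accuracy_exp2 = 7/30   # yolov5s-cls
test_accuracy_exp3 = 9/30   # yolov5m-cls
test_accuracy_exp4 = 8/30   # yolov5l-cls
test_accuracy_exp5 = 6/30   # yolov5x-cls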
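
The five scores can also be kept in one dictionary, which makes it easy to print them as percentages and pick the best model. This is a minimal sketch using the counts from the cell above; it assumes the runs `exp` to `exp5` correspond to the sizes n, s, m, l, x in training order.

```python
# Correct top-1 predictions per model on the 30-image test set (counts from the cell above)
N_TEST = 30
correct = {"n": 11, "s": 7, "m": 9, "l": 8, "x": 6}

# Accuracy per model size and the size with the highest test accuracy
accuracy = {size: hits / N_TEST for size, hits in correct.items()}
best = max(accuracy, key=accuracy.get)

for size, acc in accuracy.items():
    print(f"yolov5{size}-cls: {acc:.1%}")
print(f"best model: yolov5{best}-cls")
```

With only 30 test images, each image changes the accuracy by about 3.3 percentage points, so small differences between models should be read with caution.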
