# Create a PCB Defect Detection Model Using YOLOv5

This example shows you how to use the `dlpy.mzmodel` subpackage to create a YOLOv5 model that detects defects on printed circuit boards (PCBs). The `dlpy.mzmodel` subpackage leverages the [SAS Deep Learning Model Zoo](https://go.documentation.sas.com/doc/en/pgmsascdc/latest/casdlmzpg/titlepage.htm) utilities to manage deep learning models on the CAS server.

## Table of Contents
1. [Set Up Environment](#setup)
2. [Load and Visualize Printed Circuit Board Defect Images](#prepare)
3. [Build the Model](#build)
4. [Train the Model](#train)
5. [Score the Model and Visualize Scoring Results](#score)
6. [Register Model to Model Repository](#register)

## Step 1: Set Up Environment <a id="setup"></a>

First, import the various Python and SAS DLPy packages that will be used in this notebook session. Begin by importing the SAS Scripting Wrapper for Analytics Transfer (SWAT). SWAT is the Python interface to SAS CAS. Here is more information about [starting a SAS CAS session with the SWAT package](https://sassoftware.github.io/python-swat/getting-started.html).

SASCTL, SAS-DLPy, and SAS SWAT are all different SAS software packages used for data analysis, modeling and deep learning.

- SASCTL is a Python package that provides a high-level interface to SAS Viya REST APIs for managing and monitoring SAS Viya environments.

- SAS-DLPy is a Python package that provides a high-level interface to SAS Viya Deep Learning APIs for building and training deep learning models.

- SAS SWAT (SAS Scripting Wrapper for Analytics Transfer) is a Python package that provides a low-level interface to SAS Viya APIs for data preparation, exploration, and modeling.

All of these packages are designed to work with SAS Viya, a cloud-native and in-memory analytics platform that provides distributed computing, machine learning and deep learning capabilities. SAS Viya enables organizations to process large amounts of data, build and deploy models at scale, and integrate with other systems and languages.

In [None]:
import sys
import getpass
import os
import matplotlib.pyplot as plt
from matplotlib.pyplot import imread
from IPython.display import Image, display

import swat as sw
import sasctl
sys.path.append('python-dlpy')
import dlpy
from dlpy.utils import *
from dlpy.mzmodel import *
from dlpy.splitting import *

os.environ["CAS_CLIENT_SSL_CA_LIST"] = "https.crt"

In [None]:
print("I am using DLPY", dlpy.__version__)

In [None]:
username = "<your username>"
password = getpass.getpass("Enter Password: ")

# Create a CAS session instance and provide connection information to your running CAS server.
s = sw.CAS('https://gtptest.apdemo.sas.com/cas-shared-gputmp-http', username=username, password=password)

## Step 2: Load and Visualize Printed Circuit Board Defect Images <a id="prepare"></a>

Data is available at https://github.com/Ixiaohuihuihui/Tiny-Defect-Detection-for-PCB (Forked from: Feature Pyramid Networks for Object Detection).

### 2.1: Visualize the 6 different defect classes

In [None]:
# Load the sample images into an in-memory table
s.loadTable(path="samples.sashdat", caslib="public", casout={"name":"samples", "caslib":"casuser", "replace":True})


# Display and visualize the images in "samples" image table
display_object_detections(s, 
                          'samples', 
                          'yolo', 
                          max_objects=5, 
                          num_plot=6, 
                          n_col=2
                          )
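The `'yolo'` argument above names the coordinate format of the bounding boxes. As a rough illustration, the sketch below assumes the common YOLO convention (box center and box size, each expressed as a fraction of the image dimensions) and converts one box to pixel corner coordinates. The helper `yolo_to_corners` is hypothetical and is not part of DLPy.

```python
def yolo_to_corners(x_center, y_center, width, height, image_width, image_height):
    """Convert a normalized center/size box to pixel corners (x_min, y_min, x_max, y_max)."""
    box_w = width * image_width
    box_h = height * image_height
    x_min = x_center * image_width - box_w / 2
    y_min = y_center * image_height - box_h / 2
    return (x_min, y_min, x_min + box_w, y_min + box_h)

# A box centered in a 640 x 640 image, a quarter of the image wide and half as tall.
corners = yolo_to_corners(0.5, 0.5, 0.25, 0.5, 640, 640)
print(corners)  # (240.0, 160.0, 400.0, 480.0)
```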

### 2.2: Load images for Model Building 

In [None]:
# Load the image table (just over 200 images) into memory to start the model-building process.

s.loadTable(path="workshop_images.sashdat", caslib="public", casout={"name":"workshop_images", "caslib":"casuser", "replace":True})

workshop_images = s.CASTable(name="workshop_images")

### 2.3: Examine Summary Statistics of the Training Data

In [None]:
# Split the images into training and testing sets

column_list = workshop_images.columns.to_list()
column_list.remove("_image_")
column_list.remove("_id_")

train_set, test_set = two_way_split(workshop_images,
                                   test_rate=20,
                                   stratify_by='_Object0_',
                                   train_name='pcb_train',
                                   test_name='pcb_test',
                                   columns=column_list
                                   )
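To make the idea of a stratified split concrete, here is a minimal pure-Python sketch that holds out roughly 20% of each class. It only illustrates the concept; it is not how `two_way_split` is implemented, and the `stratified_split` helper and the class labels are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_rate=20, seed=0):
    """Return (train_indices, test_indices), holding out ~test_rate% of each class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_rate / 100)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)

# Hypothetical labels: 10 images for each of two defect classes.
labels = ["short"] * 10 + ["spur"] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 16 4
```

Because the hold-out is taken per class, each class keeps the same train/test proportion, which matters when some defect types are rare.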


In [None]:
# To view rows of an in-memory table, call its "fetch" method.

train_set.fetch(to=2)

In [None]:
# To inspect and verify the metadata summary for the images in train_set,
# create an ImageTable ("trainSetTbl") from it and read its "image_summary" attribute.
# It is important to make sure the image size is 640 x 640.

trainSetTbl = ImageTable.from_table(train_set)
trainSetTbl.image_summary

### 2.4: Examine Object Class Distributions

In [None]:
# After using an ImageTable to verify the contents of the training subset,
# we can examine the distribution of object classes in the train and test CAS tables.

# Object class distribution for "train_set".

s.simple.freq(table={'name':train_set.name,'vars':[{'name':'_Object0_'}]})

In [None]:
# Object class distribution for "test_set".

s.simple.freq(table={'name':test_set.name,'vars':[{'name':'_Object0_'}]})

## Step 3: Build the Model <a id="build"></a>

### 3.1: Model Description

YOLOv5 is a real-time object detection algorithm that is an evolution of the popular YOLO (You Only Look Once) series of algorithms. The algorithm was developed by Ultralytics and was released in June 2020. YOLOv5 is a deep learning model that uses convolutional neural networks (CNNs) to detect and classify objects in real-time.

Overall, YOLOv5 is an accurate and efficient object detection algorithm that is suitable for a wide range of applications. Its combination of accuracy, speed, and versatility make it an attractive option for developers and researchers working on object detection problems.

In [None]:
# We will build the YOLOv5 model. Here is what the architecture looks like.

display(Image(filename='YOLOv5-1 Network Architecture.png'))

Architecture Source: [article](https://iq.opengenus.org/yolov5/)

### 3.2: Get Anchor Boxes from training data

Find the most frequently used anchor box dimensions in the training data (`train_set`) and save the resulting anchor box values. Anchor box values are a list of scalar value pairs that represent the normalized width and height of the objects to be detected. The normalized box sizes are scalar quantities because they are calculated by dividing the box size (pixels) by the grid size (pixels).

In order, items in an anchor box list represent:

- AnchorBox1_width,
- AnchorBox1_height,
- AnchorBox2_width,
- AnchorBox2_height,
- ...
- AnchorBoxN_width,
- AnchorBoxN_height

With `n_anchors=9`, there should be nine anchor box value pairs.
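As a small illustration of this layout and of the normalization described above, the sketch below divides pixel box sizes by a grid-cell size and flattens them into the width/height ordering shown in the list. The helper name, the pixel sizes, and the 32-pixel grid cell are all made-up values for illustration; `get_anchors()` may compute its output differently.

```python
def flatten_anchors(boxes_px, grid_cell_px):
    """Normalize (width, height) pixel pairs by the grid-cell size and flatten them."""
    flat = []
    for width_px, height_px in boxes_px:
        flat.append(width_px / grid_cell_px)
        flat.append(height_px / grid_cell_px)
    return flat

# Three hypothetical boxes, so the flat list holds 3 * 2 = 6 values.
boxes_px = [(16, 32), (64, 48), (128, 96)]
flat = flatten_anchors(boxes_px, grid_cell_px=32)
print(flat)  # [0.5, 1.0, 2.0, 1.5, 4.0, 3.0]
```

By the same logic, nine anchor boxes produce a flat list of 18 values.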

Use the SAS DLPy `get_anchors()` function to retrieve an array that specifies the best `n_anchors` value pairs.

In [None]:
anchor_boxes = get_anchors(s, train_set, 'yolo', image_size=640, n_anchors=9)
anchors = ' '.join(map(str, anchor_boxes))

#anchors = '10 13 16 30 33 23 30 61 62 45 59 119 116 90 156 198 373 326' 

The output returns nine anchor box value pairs. It is easier to visualize the anchor box shapes by grouping the output into 9 `width x height` coordinate pairs. 

As follows:

In [None]:
# Pair up consecutive elements as (width, height) tuples
pairs = []
for i in range(0, len(anchor_boxes), 2):
    pairs.append((anchor_boxes[i], anchor_boxes[i+1]))

# Print the pairs
print("Coordinate pairs:",pairs)

# Find the pairs with the smallest and largest absolute difference between
# width and height (the most square and the most elongated anchor boxes)
smallest_tuple = min(pairs, key=lambda x: abs(x[0] - x[1]))
largest_tuple = max(pairs, key=lambda x: abs(x[0] - x[1]))

# Print the results
print("\n")
print("Smallest width-height difference:", smallest_tuple)
print("Largest width-height difference:", largest_tuple)


The results show the range of anchor box sizes and shapes, along with the pairs that have the smallest and largest difference between width and height.

These anchor box value pairs, saved as a string in the Python variable `anchors`, are used in the model definition that follows.
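If the aim is to compare boxes by overall size rather than by shape, ranking the pairs by area is one alternative. The sketch below uses the sample values from the commented-out `anchors` line above purely as input; the values that `get_anchors()` returns for this data set will differ.

```python
# Sample anchor values copied from the commented-out line in the earlier cell.
sample = "10 13 16 30 33 23 30 61 62 45 59 119 116 90 156 198 373 326"
values = [int(v) for v in sample.split()]

# Even positions are widths, odd positions are heights.
sample_pairs = list(zip(values[0::2], values[1::2]))

smallest_by_area = min(sample_pairs, key=lambda wh: wh[0] * wh[1])
largest_by_area = max(sample_pairs, key=lambda wh: wh[0] * wh[1])
print(smallest_by_area, largest_by_area)  # (10, 13) (373, 326)
```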

### 3.3: Define the Object Detection YOLOv5 Model Architecture

Now we will use SAS DLPy to define the YOLOv5 model architecture. The `anchors` string created above supplies the anchor shapes.

In [None]:
model = MZModel(conn=s, 
                model_name = "YoloV5",
                model_type = "TorchNative", 
                dataset_type= "OBJDETECT", 
                anchors = anchors,
                num_classes=6,
                caslib="casuser",
                model_path="/shared-data/pcb_defects/models/traced_yolov5s.pt",
                model_subtype = "SMALL"                
                )

## Step 4: Train the Model <a id="train"></a>

Use the `train()` method of the `MZModel` class to train the model. Use `inputs` to specify the column that contains the raw images and `targets` to specify the columns that contain the object annotations (object count, class labels, and bounding-box coordinates). Pass your optimizer and GPU settings.

### 4.1 Define your hyperparameters

In [None]:
optimizer=Optimizer(seed=12345, 
                    algorithm=SGDSolver(lr=0.001),
                    batch_size=10,
                    max_epochs=10                   
                    )
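As a back-of-the-envelope check on these settings, the sketch below estimates how many mini-batch updates such a configuration implies. The training-set size of 160 images is a made-up figure for illustration, and the arithmetic assumes that the batch size set in the optimizer is the one actually used during training.

```python
import math

n_train_images = 160   # hypothetical; use the real row count of the training table
batch_size = 10
max_epochs = 10

# Each epoch passes over the data once, one mini-batch at a time.
updates_per_epoch = math.ceil(n_train_images / batch_size)
total_updates = updates_per_epoch * max_epochs
print(updates_per_epoch, total_updates)  # 16 160
```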

### 4.2 Define your targets

In [None]:
obj_det_targets = ['_nObjects_',
 '_Object0_',
 '_Object0_x',
 '_Object0_y',
 '_Object0_width',
 '_Object0_height',
 '_Object1_',
 '_Object1_x',
 '_Object1_y',
 '_Object1_width',
 '_Object1_height',
 '_Object2_',
 '_Object2_x',
 '_Object2_y',
 '_Object2_width',
 '_Object2_height',
 '_Object3_',
 '_Object3_x',
 '_Object3_y',
 '_Object3_width',
 '_Object3_height',
 '_Object4_',
 '_Object4_x',
 '_Object4_y',
 '_Object4_width',
 '_Object4_height']
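The same list can also be generated programmatically, which is less error-prone if the maximum number of objects per image changes. This sketch assumes the column naming pattern shown above and uses a hypothetical helper, `build_detection_targets`.

```python
def build_detection_targets(max_objects=5):
    """Build the target column names for up to `max_objects` annotated objects."""
    targets = ['_nObjects_']
    for i in range(max_objects):
        prefix = f'_Object{i}_'
        targets.append(prefix)  # class label column
        targets.extend(prefix + suffix for suffix in ('x', 'y', 'width', 'height'))
    return targets

generated_targets = build_detection_targets(max_objects=5)
print(len(generated_targets))  # 26
```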

### 4.3 Add your image transformation (if needed)

In [None]:
model.add_image_transformation(image_size='640', image_resize_type="RETAIN_ASPECTRATIO")
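To illustrate what retaining the aspect ratio means when resizing to 640, the sketch below scales an image so that its longer side matches the target size and reports how much padding would be needed to fill a square. This is a generic letterbox-style calculation with a hypothetical helper; the exact behavior of `RETAIN_ASPECTRATIO` on the CAS server may differ.

```python
def fit_within_square(width, height, target=640):
    """Scale (width, height) to fit a target x target square without distortion."""
    scale = min(target / width, target / height)
    new_width = round(width * scale)
    new_height = round(height * scale)
    pad_x = target - new_width    # total horizontal padding needed
    pad_y = target - new_height   # total vertical padding needed
    return new_width, new_height, pad_x, pad_y

resized = fit_within_square(1280, 720)
print(resized)  # (640, 360, 0, 280)
```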

### 4.4 Start training process

During training, look for a steady decrease in the losses across epochs. Because the task is to both classify defects and locate them, pay particular attention to the box loss and the class loss.

In [None]:
model.train(table=train_set.name, 
            inputs='_image_', 
            targets=obj_det_targets,
            index_variable=['_Object0_', '_Object1_', '_Object2_', '_Object3_', '_Object4_'], 
            log_level=4, 
            gpu = [0],
            seed=1234, 
            optimizer=optimizer,
            batch_size=128)
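One way to turn "steady decrease" into a concrete check is to compare each epoch's loss with the previous one. The loss values below are invented for illustration, and the `is_mostly_decreasing` helper is hypothetical; how you collect the real values depends on the training log that the action prints.

```python
def is_mostly_decreasing(losses, tolerance=0.0, min_fraction=0.8):
    """Return True if at least `min_fraction` of epoch-to-epoch steps reduce the loss."""
    if len(losses) < 2:
        return False
    steps = list(zip(losses, losses[1:]))
    improving = sum(1 for prev, curr in steps if curr < prev - tolerance)
    return improving / len(steps) >= min_fraction

box_loss = [0.110, 0.095, 0.090, 0.091, 0.084, 0.080]    # made-up values
class_loss = [0.050, 0.052, 0.055, 0.051, 0.056, 0.058]  # made-up values

print(is_mostly_decreasing(box_loss))    # True: 4 of 5 steps improve
print(is_mostly_decreasing(class_loss))  # False: only 1 of 5 steps improves
```

A loss that fails this kind of check may point to a learning rate that is too high or to too few training epochs.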

## Step 5: Score the Model and Visualize Scoring Results <a id="score"></a>

### 5.1: Score the model on the test set

Use the `dlModelZoo.dlmzscore` action to score the test data. The score results are written to the table named in `tableOut` (`pcb_scored` here), and `copyVars` carries the `_image_` column over to that table.

In [None]:
# Score the held-out test set; results are written to the "pcb_scored" table.
parameters = DLPyDict(logLevel=log_level_map[4], table=test_set.name, inputs="_image_", targets=obj_det_targets,
                      model=model.model_table, batch_size=10,
                      indexvariables=model.index_variable, inputIndexmap=model.index_map,
                      options=dict(yaml=str(model.documents_score), label=model.label_name + "_score"),
                      tableOut=dict(name="pcb_scored", replace=True), copyVars=["_image_"])

rt = model.conn.retrieve('dlModelZoo.dlmzscore', _messagelevel='note', **parameters)


### 5.2 Visualize scored results

In [None]:
score_results = s.CASTable("pcb_scored", where="_nObjects_ > 0")
score_results.head()

### 5.3 Save our model for deployment

In [None]:
model.save_to_astore(path='.', 
                     file_name="pcb_yolov5s")

## Step 6: Register Model to Model Repository <a id="register"></a>

In [None]:
from sasctl import Session, register_model
from sasctl.services import model_repository

with Session("https://gtptest.apdemo.sas.com", username, password):
    register_model(model= s.CASTable("MODEL_1P4OXD_ASTORE"), 
                   name = 'Yolov5_Original', 
                   project = 'dc1b7958-c6df-4153-835f-fdba1afbcbcc',
                   repository='Public',
                   version='latest')
    

#### We have come to the end of the hands-on model development. As good practice, remember to terminate your session.

In [None]:
s.terminate()