# <u>Pastry Object Detection</u>

**Objective** : The main objective of this notebook is to exhibit the working of SSD-MobileNet trained on Pastry dataset. We'll show 2 methods for object detection here,first we'll do object detection on single Image and then we'll do webcam object detection

### Set required parameters

In [18]:
# Path to pre-trained model
SAVED_MODEL_PATH = r'C:/Users/dipesh/Desktop/fiverr-hasijayawardana/Pastry Detection/saved_model'

# Path to 'label_map.pbtxt' file
PATH_TO_LABELS = 'C:/Users/dipesh/Desktop/fiverr-hasijayawardana/Pastry Detection/label_map.pbtxt'

# Object detection threshold
THRESHOLD = 0.5

# 1. Detection on Single Image

### 1.1 Importing Dependencies

In [19]:
import tensorflow as tf
import cv2
import time
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils
import numpy as np

### 1.2 Create detection function
This function will be responsible to detection objects in the given Image. It'll take the below specified arguments and returns a numpy array of output Image along with list of top 10 detected object with scores.

In [20]:
def detect_objects(image_path, saved_model_path, labelMap_path, min_threshold=0.5):
    """
    Parameters
    ----------
    image_path : Path to input image for the model
    saved_model_path : Path to pre-trained model
    labelMap_path : Path to 'label_map.pbtxt' file
    min_threshold : Minimum decision threshold for classification

    Returns : Numpy array of output Image & List of Detected Objects with scores
    -------
    """

    # -------------------------------------------------------------------------------

    # Object Classes
    OBJECT_LABELS = {1: 'Cutlet', 2: 'Egg Patis', 3: 'Fish Bun', 4: 'Fish Roti', 5: 'Kimbula Bun',
                     6: 'Puff Pastry', 7: 'Roll', 8: 'Sausage Bun', 9: 'Uludu Wade', 10: 'Vegetable Roti'}

    print('Loading model...this will take a minute')
    start_time = time.time()

    # LOAD SAVED MODEL AND BUILD DETECTION FUNCTION
    detect_fn = tf.saved_model.load(saved_model_path)

    end_time = time.time()
    elapsed_time = end_time - start_time
    print('Done ! Loading model took {} seconds'.format(round(elapsed_time, 3)))

    # -------------------------------------------------------------------------------

    # LOAD LABEL MAP DATA FOR PLOTTING
    category_index = label_map_util.create_category_index_from_labelmap(labelMap_path,
                                                                        use_display_name=True)

    # -------------------------------------------------------------------------------

    print('Running inference for {}... '.format(image_path), end='')

    image = cv2.imread(image_path)

    # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
    input_tensor = tf.convert_to_tensor(image)

    # The model expects a batch of images, so add an axis with `tf.newaxis`.
    input_tensor = input_tensor[tf.newaxis, ...]

    # input_tensor = np.expand_dims(image_np, 0)
    detections = detect_fn(input_tensor)

    # All outputs are batches tensors.
    # Convert to numpy arrays, and take index [0] to remove the batch dimension.
    # We're only interested in the first num_detections.
    num_detections = int(detections.pop('num_detections'))
    detections = {key: value[0, :num_detections].numpy()
                  for key, value in detections.items()}
    detections['num_detections'] = num_detections

    # detection_classes should be ints.
    detections['detection_classes'] = detections['detection_classes'].astype(np.int64)

    image_with_detections = image.copy()

    # -------------------------------------------------------------------------------

    detected_objects = {}

    # print first 10 objects detected with scores
    print("\n----------TOP 10 DETECTED OBJECTS WITH SCORES----------")
    for i in range(11):
        obj_class = OBJECT_LABELS.get(detections['detection_classes'][i])
        obj_score = detections['detection_scores'][i]
        print(obj_class + " : " + str(round(obj_score, 3)))

        # add data to dictionary
        detected_objects[i] = {obj_class: str(round(obj_score, 3))}

    print("NOTE : Objects with score below threshold are not drawn.")
    print("-------------------------------------------------------")

    # -------------------------------------------------------------------------------

    # SET MIN_SCORE_THRESH BASED ON YOU MINIMUM THRESHOLD FOR DETECTIONS
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image_with_detections,
        detections['detection_boxes'],
        detections['detection_classes'],
        detections['detection_scores'],
        category_index,
        use_normalized_coordinates=True,
        max_boxes_to_draw=100,
        min_score_thresh=min_threshold,
        agnostic_mode=False)

    image_with_detections = np.array(image_with_detections)

    return image_with_detections, detected_objects

### 1.3 Run Inference on Image

#### NOTE : The output Image will be display in a window outside the notebook 

In [21]:
# Path to Input Image
IMAGE_PATH = 'C:/Users/dipesh/Desktop/fiverr-hasijayawardana/train/IMG_0008.JPG'

# Path to save the output Image with name and extension
SAVE_PATH = r"C:/Users/dipesh/Desktop/fiverr-hasijayawardana/outputs/output.jpeg"

# Numpy array of output Image and Dictionary of Top 10 detected objects with scores
Output_Image, Detected_Objects = detect_objects(IMAGE_PATH, SAVED_MODEL_PATH, PATH_TO_LABELS, THRESHOLD)

print(Detected_Objects)

Loading model...this will take a minute
Done ! Loading model took 53.592 seconds
Running inference for C:/Users/dipesh/Desktop/fiverr-hasijayawardana/train/IMG_0008.JPG... 
----------TOP 10 DETECTED OBJECTS WITH SCORES----------
Sausage Bun : 0.998
Sausage Bun : 0.994
Roll : 0.983
Egg Patis : 0.166
Cutlet : 0.159
Kimbula Bun : 0.149
Kimbula Bun : 0.117
Sausage Bun : 0.111
Roll : 0.107
Roll : 0.106
Uludu Wade : 0.089
NOTE : Objects with score below threshold are not drawn.
-------------------------------------------------------
{0: {'Sausage Bun': '0.998'}, 1: {'Sausage Bun': '0.994'}, 2: {'Roll': '0.983'}, 3: {'Egg Patis': '0.166'}, 4: {'Cutlet': '0.159'}, 5: {'Kimbula Bun': '0.149'}, 6: {'Kimbula Bun': '0.117'}, 7: {'Sausage Bun': '0.111'}, 8: {'Roll': '0.107'}, 9: {'Roll': '0.106'}, 10: {'Uludu Wade': '0.089'}}


In [22]:
# Display the output Image 

cv2.imshow("OUTPUT", Output_Image)
cv2.waitKey()

-1

# 2. Webcam Detection

### 2.1 Create function to process webcam frames

This is a pure function which take an 416x416 Image frame as Input and return an output Image with bounding boxes drawn for objects above set threshold value.

In [23]:
def process_frame(image, detect_function, labelMap_path, min_threshold=0.5):
    """
    Parameters
    ----------
    image : Numpy array Image frame
    detect_function : Loaded model detection function
    labelMap_path : Path to 'label_map.pbtxt' file
    min_threshold : Minimum decision threshold for classification

    Returns : Numpy array of output Image
    -------
    """

    # -------------------------------------------------------------------------------

    # LOAD LABEL MAP DATA FOR PLOTTING
    category_index = label_map_util.create_category_index_from_labelmap(labelMap_path,
                                                                        use_display_name=True)

    # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
    input_tensor = tf.convert_to_tensor(image)

    # The model expects a batch of images, so add an axis with `tf.newaxis`.
    input_tensor = input_tensor[tf.newaxis, ...]

    # input_tensor = np.expand_dims(image_np, 0)
    detections = detect_function(input_tensor)

    # All outputs are batches tensors.
    # Convert to numpy arrays, and take index [0] to remove the batch dimension.
    # We're only interested in the first num_detections.
    num_detections = int(detections.pop('num_detections'))
    detections = {key: value[0, :num_detections].numpy()
                  for key, value in detections.items()}
    detections['num_detections'] = num_detections

    # detection_classes should be ints.
    detections['detection_classes'] = detections['detection_classes'].astype(np.int64)

    # -------------------------------------------------------------------------------

    # SET MIN_SCORE_THRESH BASED ON YOU MINIMUM THRESHOLD FOR DETECTIONS
    viz_utils.visualize_boxes_and_labels_on_image_array(
        image,
        detections['detection_boxes'],
        detections['detection_classes'],
        detections['detection_scores'],
        category_index,
        use_normalized_coordinates=True,
        max_boxes_to_draw=100,
        min_score_thresh=min_threshold,
        agnostic_mode=False)

    return image


### 2.2 Create function for Webcam Detection
This function is for the webcam object detection. When you execute this function,It'll automatically activate the webcam and start detecting objects frame by frame. The current speed is 3FPS,but can be further increased by running on GPU/TPU and reducing preprocessing/postprocessing of the Images.

**NOTE** : If you are using a webcam try changing "cv2.VideoCapture(0)" with "cv2.VideoCapture(1)"

In [24]:
def detect_webcam(detection_function, labelMap_path, min_threshold=0.5):

    print("Starting webcam...")

    # define a video capture object
    vid = cv2.VideoCapture(0)

    while (True):

        # Capture the video frame by frame
        ret, frame = vid.read()

        # ------------------------------------------------------------

        image = cv2.resize(frame, (416, 416))

        # # Numpy array of output Image and Dictionary of Top 10 detected objects with scores
        Output_Image = process_frame(image, detection_function, labelMap_path, min_threshold)

        # ------------------------------------------------------------

        # resizing the output image
        frame = cv2.resize(Output_Image, (620, 620))

        # Display the resulting frame
        cv2.imshow('frame', frame)

        # the 'q' button is set as the
        # quitting button you may use any
        # desired button of your choice
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    # After the loop release the cap object
    vid.release()
    # Destroy all the windows
    cv2.destroyAllWindows()

### 2.3 Start Webcam !

In [25]:

# LOAD SAVED MODEL AND BUILD DETECTION FUNCTION
print("Loading model...this will take a minute")
detect_fn = tf.saved_model.load(SAVED_MODEL_PATH)

detect_webcam(detect_fn, PATH_TO_LABELS, THRESHOLD)

Loading model...this will take a minute
Starting webcam...
