# Chapter 7: Building Custom Object Detectors

This Jupyter Notebook allows you to interactively edit and run a subset of the code samples from the corresponding chapter in our book, *Learning OpenCV 5 Computer Vision with Python 3*.

Any Jupyter server should be capable of running the Notebook, even if the sample input image files are not available in the server's local filesystem. For example, you can run the Notebook in Google Colab by opening the following link in your web browser: https://colab.research.google.com/github/PacktPublishing/Learning-OpenCV-5-Computer-Vision-with-Python-Fourth-Edition/blob/main/chapter07/chapter07.ipynb. Specifically, this link opens the Notebook's latest version, hosted on GitHub.

For additional code samples and instructions, please refer to the book and to the GitHub repository at https://github.com/PacktPublishing/Learning-OpenCV-5-Computer-Vision-with-Python-Fourth-Edition.

## Upgrading OpenCV and running the compatibility script

**IMPORTANT:** Run the scripts in this section first and run them in order; otherwise, code in subsequent sections may fail or hang.

If you are running this Notebook in Google Colab or another environment where OpenCV might not be up to date, run the following command to upgrade the OpenCV pip package:

In [None]:
!pip install opencv-contrib-python --upgrade

If the preceding command's output includes a prompt to restart the kernel, do restart it.

Now, run the following script, which provides a compatibility layer between OpenCV and Jupyter:

In [None]:
# %load ../compat/jupyter_compat.py
import os

import cv2
import numpy
import PIL.Image

from IPython import display
from urllib.request import urlopen


def cv2_imshow(winname, mat):
    mat = mat.clip(0, 255).astype('uint8')
    if mat.ndim == 3:
        if mat.shape[2] == 4:
            mat = cv2.cvtColor(mat, cv2.COLOR_BGRA2RGBA)
        else:
            mat = cv2.cvtColor(mat, cv2.COLOR_BGR2RGB)
    display.display(PIL.Image.fromarray(mat))

cv2.imshow = cv2_imshow


def cv2_waitKey(delay=0):
    return -1

cv2.waitKey = cv2_waitKey


def cv2_imread(filename, flags=cv2.IMREAD_COLOR):
    if os.path.exists(filename):
        image = cv2._imread(filename, flags)
    else:
        url = f'https://github.com/PacktPublishing/Learning-OpenCV-5-Computer-Vision-with-Python-Fourth-Edition/raw/main/*/{filename}'
        resp = urlopen(url)
        image = numpy.asarray(bytearray(resp.read()), dtype='uint8')
        image = cv2.imdecode(image, flags)
    return image

# Cache the original implementation of `imread`, if we have not already
# done so on a previous run of this cell.
if '_imread' not in dir(cv2):
    cv2._imread = cv2.imread

cv2.imread = cv2_imread


What did we just do? We imported OpenCV and we replaced some of OpenCV's I/O functions with our own functions that do not rely on a windowed environment or on a local filesystem.
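The compatibility script caches the original `cv2.imread` only if it has not already done so, which matters because a Notebook cell can be run more than once. The following minimal sketch shows the same pattern without OpenCV. The names `fake_lib`, `load`, `patched_load`, and `install_patch` are hypothetical stand-ins chosen for illustration.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a library module. `fake_lib` and its `load`
# function are illustrative names and are not part of OpenCV.
fake_lib = SimpleNamespace(load=lambda name: f'original:{name}')


def patched_load(name):
    # Delegate to the cached original, then post-process the result.
    return fake_lib._load(name).upper()


def install_patch():
    # Cache the original only once. If we cached it on every run, a second
    # run would cache the wrapper itself and the wrapper would call itself.
    if not hasattr(fake_lib, '_load'):
        fake_lib._load = fake_lib.load
    fake_lib.load = patched_load


install_patch()
install_patch()  # Safe to repeat, just like re-running the cell.
print(fake_lib.load('image.png'))  # ORIGINAL:IMAGE.PNG
```

Without the `hasattr` guard, the second call to `install_patch` would overwrite `_load` with `patched_load`, and any later call to `load` would recurse until Python raised an error.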

## Detecting people with HOG descriptors

Let's start by detecting people using HOG descriptors.

Run the following script, which uses HOG with OpenCV's default person detector to find people in an image of a hayfield:

In [None]:
# %load detect_people_hog.py
import cv2

OPENCV_MAJOR_VERSION = int(cv2.__version__.split('.')[0])
OPENCV_MINOR_VERSION = int(cv2.__version__.split('.')[1])

def is_inside(i, o):
    ix, iy, iw, ih = i
    ox, oy, ow, oh = o
    return ix > ox and ix + iw < ox + ow and \
        iy > oy and iy + ih < oy + oh

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

img = cv2.imread('../images/haying.jpg')

if OPENCV_MAJOR_VERSION >= 5 or \
        (OPENCV_MAJOR_VERSION == 4 and OPENCV_MINOR_VERSION >= 6):
    # OpenCV 4.6 or a later version is being used.
    found_rects, found_weights = hog.detectMultiScale(
        img, winStride=(4, 4), scale=1.02, groupThreshold=1.9)
else:
    # OpenCV 4.5 or an earlier version is being used.
    # The groupThreshold parameter used to be named finalThreshold.
    found_rects, found_weights = hog.detectMultiScale(
        img, winStride=(4, 4), scale=1.02, finalThreshold=1.9)

found_rects_filtered = []
found_weights_filtered = []
for ri, r in enumerate(found_rects):
    for qi, q in enumerate(found_rects):
        if ri != qi and is_inside(r, q):
            break
    else:
        found_rects_filtered.append(r)
        found_weights_filtered.append(found_weights[ri])

for ri, r in enumerate(found_rects_filtered):
    x, y, w, h = r
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 255), 2)
    text = '%.2f' % found_weights_filtered[ri]
    cv2.putText(img, text, (x, y - 20),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)

cv2.imshow('Women in Hayfield Detected', img)
cv2.imwrite('./women_in_hayfield_detected.png', img)
cv2.waitKey(0)


You probably see that some but not all of the people were detected. You may see false positive detections too. Try fine-tuning the parameters of `detectMultiScale` to see how the detection results are affected.
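The filtering step in the preceding script relies on Python's `for`...`else` construct: the `else` branch runs only if the inner loop finishes without hitting `break`. Here is a small standalone sketch of that step, using made-up rectangles in `(x, y, w, h)` format. The helper name `drop_nested` is our own and does not appear in the script.

```python
def is_inside(inner, outer):
    ix, iy, iw, ih = inner
    ox, oy, ow, oh = outer
    # Strict inequalities: a rectangle that shares an edge with another
    # rectangle is not treated as being inside it.
    return (ix > ox and ix + iw < ox + ow and
            iy > oy and iy + ih < oy + oh)


def drop_nested(rects):
    kept = []
    for ri, r in enumerate(rects):
        for qi, q in enumerate(rects):
            if ri != qi and is_inside(r, q):
                break  # r is nested in q, so discard r.
        else:
            # No break occurred, so r is not nested in any other rectangle.
            kept.append(r)
    return kept


# Made-up detections: the second rectangle lies entirely inside the first.
rects = [(10, 10, 100, 200), (30, 40, 20, 50), (150, 20, 60, 120)]
print(drop_nested(rects))  # [(10, 10, 100, 200), (150, 20, 60, 120)]
```

Note that this check only removes rectangles that are fully nested. Rectangles that merely overlap are both kept, which is one reason to use NMS, as we do later in this chapter.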

You can also try the following variant of the script, which replaces OpenCV's default person detector with the Daimler person detector and adjusts the parameters of `detectMultiScale`:

In [None]:
# %load detect_people_hog_daimler.py
import cv2

OPENCV_MAJOR_VERSION = int(cv2.__version__.split('.')[0])
OPENCV_MINOR_VERSION = int(cv2.__version__.split('.')[1])

def is_inside(i, o):
    ix, iy, iw, ih = i
    ox, oy, ow, oh = o
    return ix > ox and ix + iw < ox + ow and \
        iy > oy and iy + ih < oy + oh

hog = cv2.HOGDescriptor((48, 96), (16, 16), (8, 8), (8, 8), 9)
hog.setSVMDetector(cv2.HOGDescriptor_getDaimlerPeopleDetector())

img = cv2.imread('../images/haying.jpg')

if OPENCV_MAJOR_VERSION >= 5 or \
        (OPENCV_MAJOR_VERSION == 4 and OPENCV_MINOR_VERSION >= 6):
    # OpenCV 4.6 or a later version is being used.
    found_rects, found_weights = hog.detectMultiScale(
        img, winStride=(8, 8), scale=1.04, groupThreshold=6.0)
else:
    # OpenCV 4.5 or an earlier version is being used.
    # The groupThreshold parameter used to be named finalThreshold.
    found_rects, found_weights = hog.detectMultiScale(
        img, winStride=(8, 8), scale=1.04, finalThreshold=6.0)

found_rects_filtered = []
found_weights_filtered = []
for ri, r in enumerate(found_rects):
    for qi, q in enumerate(found_rects):
        if ri != qi and is_inside(r, q):
            break
    else:
        found_rects_filtered.append(r)
        found_weights_filtered.append(found_weights[ri])

for ri, r in enumerate(found_rects_filtered):
    x, y, w, h = r
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 255), 2)
    text = '%.2f' % found_weights_filtered[ri]
    cv2.putText(img, text, (x, y - 20),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)

cv2.imshow('Women in Hayfield Detected', img)
cv2.imwrite('./women_in_hayfield_detected_daimler.png', img)
cv2.waitKey(0)


You probably see that the Daimler person detector did slightly better at detecting people in the distance, as its window size (48x96) is smaller than the default person detector's window size (64x128). Again, feel free to experiment with the parameters of `detectMultiScale`.
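The window size also determines the length of the HOG feature vector. As a rough illustration, the following sketch applies the standard HOG layout arithmetic (blocks per window, cells per block, bins per cell) to both window sizes. The function name `hog_descriptor_length` is our own, and its defaults simply mirror the block, stride, cell, and bin values used in the scripts above.

```python
def hog_descriptor_length(win_size, block_size=(16, 16), block_stride=(8, 8),
                          cell_size=(8, 8), nbins=9):
    win_w, win_h = win_size
    # Number of block positions along each axis of the window.
    blocks_x = (win_w - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_h - block_size[1]) // block_stride[1] + 1
    # Each block is split into cells, and each cell yields one histogram.
    cells_per_block = ((block_size[0] // cell_size[0]) *
                       (block_size[1] // cell_size[1]))
    return blocks_x * blocks_y * cells_per_block * nbins


print(hog_descriptor_length((64, 128)))  # 3780
print(hog_descriptor_length((48, 96)))   # 1980
```

A detector for a given window size needs one coefficient per feature plus one bias term, so a smaller window means a shorter detector as well as the ability to match smaller people in the image.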

## Classifying cars v. non-cars using SIFT, BoW, and SVM

Now, let's train and test a custom image classifier using SIFT descriptors, BoW descriptors, and an SVM classifier.

First, we need a set of training images. Run the following commands to download and extract a copy of the UIUC Image Database for Car Detection:

In [None]:
!wget -O CarData.tar.gz https://github.com/gcr/arc-evaluator/raw/master/CarData.tar.gz
!tar -xvzf CarData.tar.gz

Run the following script to train and test the classifier:

In [None]:
# %load detect_car_bow_svm.py
import cv2
import numpy as np
import os

if not os.path.isdir('CarData'):
    print('CarData folder not found. Please download and unzip '
          'https://github.com/gcr/arc-evaluator/raw/master/CarData.tar.gz '
          'into the same folder as this script.')
    exit(1)

BOW_NUM_TRAINING_SAMPLES_PER_CLASS = 10
SVM_NUM_TRAINING_SAMPLES_PER_CLASS = 110

BOW_NUM_CLUSTERS = 40

sift = cv2.SIFT_create()

FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)
flann = cv2.FlannBasedMatcher(index_params, search_params)

bow_kmeans_trainer = cv2.BOWKMeansTrainer(BOW_NUM_CLUSTERS)
bow_extractor = cv2.BOWImgDescriptorExtractor(sift, flann)

def get_pos_and_neg_paths(i):
    pos_path = 'CarData/TrainImages/pos-%d.pgm' % (i+1)
    neg_path = 'CarData/TrainImages/neg-%d.pgm' % (i+1)
    return pos_path, neg_path

def add_sample(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    keypoints, descriptors = sift.detectAndCompute(img, None)
    if descriptors is not None:
        bow_kmeans_trainer.add(descriptors)

for i in range(BOW_NUM_TRAINING_SAMPLES_PER_CLASS):
    pos_path, neg_path = get_pos_and_neg_paths(i)
    add_sample(pos_path)
    add_sample(neg_path)

voc = bow_kmeans_trainer.cluster()
bow_extractor.setVocabulary(voc)

def extract_bow_descriptors(img):
    features = sift.detect(img)
    return bow_extractor.compute(img, features)

training_data = []
training_labels = []
for i in range(SVM_NUM_TRAINING_SAMPLES_PER_CLASS):
    pos_path, neg_path = get_pos_and_neg_paths(i)
    pos_img = cv2.imread(pos_path, cv2.IMREAD_GRAYSCALE)
    pos_descriptors = extract_bow_descriptors(pos_img)
    if pos_descriptors is not None:
        training_data.extend(pos_descriptors)
        training_labels.append(1)
    neg_img = cv2.imread(neg_path, cv2.IMREAD_GRAYSCALE)
    neg_descriptors = extract_bow_descriptors(neg_img)
    if neg_descriptors is not None:
        training_data.extend(neg_descriptors)
        training_labels.append(-1)

svm = cv2.ml.SVM_create()

svm.train(np.array(training_data), cv2.ml.ROW_SAMPLE,
          np.array(training_labels))

for test_img_path in ['CarData/TestImages/test-0.pgm',
                      'CarData/TestImages/test-1.pgm',
                      '../images/car.jpg',
                      '../images/haying.jpg',
                      '../images/statue.jpg',
                      '../images/woodcutters.jpg']:
    img = cv2.imread(test_img_path)
    gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    descriptors = extract_bow_descriptors(gray_img)
    prediction = svm.predict(descriptors)
    if prediction[1][0][0] == 1.0:
        text = 'car'
        color = (0, 255, 0)
    else:
        text = 'not car'
        color = (0, 0, 255)
    cv2.putText(img, text, (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1,
                color, 2, cv2.LINE_AA)
    cv2.imshow(test_img_path, img)
cv2.waitKey(0)


You probably see that some classification results are correct and others are incorrect. Try fine-tuning the parameters, including the number of training samples and the number of BoW clusters, to see how the classification results are affected.
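To build some intuition for what a BoW descriptor is, consider the following conceptual sketch in plain NumPy. It assigns each feature descriptor to its nearest visual word and builds a normalized histogram of the assignments. This is an illustration under simplifying assumptions (toy 2-D descriptors instead of 128-D SIFT descriptors, and brute-force matching instead of FLANN), and the function name `bow_histogram` is our own rather than an OpenCV API.

```python
import numpy as np


def bow_histogram(descriptors, vocabulary):
    # Squared Euclidean distance from every descriptor to every visual word.
    diffs = descriptors[:, np.newaxis, :] - vocabulary[np.newaxis, :, :]
    distances = np.sum(diffs ** 2, axis=2)
    # Index of the nearest visual word for each descriptor.
    nearest = np.argmin(distances, axis=1)
    # Count the assignments per word and normalize by the descriptor count.
    counts = np.bincount(nearest, minlength=len(vocabulary))
    return counts / len(descriptors)


# Toy vocabulary of 3 visual words and 4 toy descriptors (made-up values).
vocabulary = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
descriptors = np.array([[1.0, 0.5], [9.0, 9.5], [0.2, 0.1], [8.5, 10.0]])

hist = bow_histogram(descriptors, vocabulary)
print(hist)  # Two descriptors fall on word 0, two on word 1, none on word 2.
```

The histogram has one entry per cluster, which is why changing `BOW_NUM_CLUSTERS` changes the length of the feature vector that the SVM sees.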

## Using a sliding window and NMS to detect cars' positions

Building on the previous section's sample, we are going to use our classifier in combination with a sliding window approach and non-maximum suppression (NMS) in order to detect the positions of cars in test images.

First, run the following script, which defines a function that filters a list of rectangles based on NMS:

In [None]:
# %load non_max_suppression.py
# import the necessary packages
import numpy as np

# Malisiewicz et al.
# Python port by Adrian Rosebrock
# https://www.pyimagesearch.com/2015/02/16/faster-non-maximum-suppression-python/
def non_max_suppression_fast(boxes, overlapThresh):
    # if there are no boxes, return an empty list
    if len(boxes) == 0:
        return []

    # initialize the list of picked indexes 
    pick = []

    # grab the coordinates of the bounding boxes
    x1 = boxes[:,0]
    y1 = boxes[:,1]
    x2 = boxes[:,2]
    y2 = boxes[:,3]
    scores = boxes[:,4]
    # compute the area of the bounding boxes and sort the bounding
    # boxes by score in ascending order, so that the last index
    # (which the loop below picks first) is the highest-scoring box
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    idxs = np.argsort(scores)

    # keep looping while some indexes still remain in the indexes
    # list
    while len(idxs) > 0:
        # grab the last index in the indexes list and add the
        # index value to the list of picked indexes
        last = len(idxs) - 1
        i = idxs[last]
        pick.append(i)

        # find the largest (x, y) coordinates for the start of
        # the bounding box and the smallest (x, y) coordinates
        # for the end of the bounding box
        xx1 = np.maximum(x1[i], x1[idxs[:last]])
        yy1 = np.maximum(y1[i], y1[idxs[:last]])
        xx2 = np.minimum(x2[i], x2[idxs[:last]])
        yy2 = np.minimum(y2[i], y2[idxs[:last]])

        # compute the width and height of the bounding box
        w = np.maximum(0, xx2 - xx1 + 1)
        h = np.maximum(0, yy2 - yy1 + 1)

        # compute the ratio of overlap
        overlap = (w * h) / area[idxs[:last]]

        # delete the picked index, along with all indexes whose overlap
        # with the picked box is greater than the provided threshold
        idxs = np.delete(idxs, np.concatenate(([last],
            np.where(overlap > overlapThresh)[0])))

    # return only the bounding boxes that were picked
    return boxes[pick]


The preceding script does not (on its own) produce any output but it does define an NMS function that we will import and use.
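To see the general idea of NMS on a small input, here is a compact, self-contained sketch. It is a simplified greedy variant written for illustration, not a copy of the function above, and the name `greedy_nms` and the box values are our own. Each row is `(x1, y1, x2, y2, score)`, and the overlap is measured as the intersection area divided by the area of the lower-scoring box.

```python
import numpy as np


def greedy_nms(boxes, overlap_thresh):
    if len(boxes) == 0:
        return boxes
    x1, y1, x2, y2, scores = boxes.T
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = np.argsort(scores)[::-1]  # Indexes sorted by score, best first.
    keep = []
    while len(order) > 0:
        best, rest = order[0], order[1:]
        keep.append(best)
        # Intersection of the best box with each remaining box.
        w = np.maximum(0, np.minimum(x2[best], x2[rest]) -
                       np.maximum(x1[best], x1[rest]) + 1)
        h = np.maximum(0, np.minimum(y2[best], y2[rest]) -
                       np.maximum(y1[best], y1[rest]) + 1)
        overlap = (w * h) / area[rest]
        # Keep only the boxes that do not overlap the best box too much.
        order = rest[overlap <= overlap_thresh]
    return boxes[keep]


# Made-up boxes: the first two overlap heavily; the third is far away.
boxes = np.array([[10, 10, 110, 50, 3.0],
                  [15, 12, 115, 52, 2.5],
                  [200, 100, 300, 140, 2.8]])
result = greedy_nms(boxes, 0.4)
print(result[:, 4])  # Scores of the surviving boxes: 3.0 and 2.8
```

Of the two heavily overlapping boxes, only the one with the higher score survives, while the distant box is unaffected.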

Run the following script, which trains another car classifier and then detects cars in test images using a sliding window and NMS:

In [None]:
# %load detect_car_bow_svm_sliding_window.py
import cv2
import numpy as np
import os

# When running in Jupyter, the `non_max_suppression_fast` function should
# already be in the global scope. Otherwise, import it now.
if 'non_max_suppression_fast' not in globals():
    from non_max_suppression import non_max_suppression_fast

if not os.path.isdir('CarData'):
    print('CarData folder not found. Please download and unzip '
          'https://github.com/gcr/arc-evaluator/raw/master/CarData.tar.gz '
          'into the same folder as this script.')
    exit(1)

BOW_NUM_TRAINING_SAMPLES_PER_CLASS = 10
SVM_NUM_TRAINING_SAMPLES_PER_CLASS = 110

BOW_NUM_CLUSTERS = 12
SVM_SCORE_THRESHOLD = 2.2
NMS_OVERLAP_THRESHOLD = 0.4

sift = cv2.SIFT_create()

FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict(checks=50)
flann = cv2.FlannBasedMatcher(index_params, search_params)

bow_kmeans_trainer = cv2.BOWKMeansTrainer(BOW_NUM_CLUSTERS)
bow_extractor = cv2.BOWImgDescriptorExtractor(sift, flann)

def get_pos_and_neg_paths(i):
    pos_path = 'CarData/TrainImages/pos-%d.pgm' % (i+1)
    neg_path = 'CarData/TrainImages/neg-%d.pgm' % (i+1)
    return pos_path, neg_path

def add_sample(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    keypoints, descriptors = sift.detectAndCompute(img, None)
    if descriptors is not None:
        bow_kmeans_trainer.add(descriptors)

for i in range(BOW_NUM_TRAINING_SAMPLES_PER_CLASS):
    pos_path, neg_path = get_pos_and_neg_paths(i)
    add_sample(pos_path)
    add_sample(neg_path)

voc = bow_kmeans_trainer.cluster()
bow_extractor.setVocabulary(voc)

def extract_bow_descriptors(img):
    features = sift.detect(img)
    return bow_extractor.compute(img, features)

training_data = []
training_labels = []
for i in range(SVM_NUM_TRAINING_SAMPLES_PER_CLASS):
    pos_path, neg_path = get_pos_and_neg_paths(i)
    pos_img = cv2.imread(pos_path, cv2.IMREAD_GRAYSCALE)
    pos_descriptors = extract_bow_descriptors(pos_img)
    if pos_descriptors is not None:
        training_data.extend(pos_descriptors)
        training_labels.append(1)
    neg_img = cv2.imread(neg_path, cv2.IMREAD_GRAYSCALE)
    neg_descriptors = extract_bow_descriptors(neg_img)
    if neg_descriptors is not None:
        training_data.extend(neg_descriptors)
        training_labels.append(-1)

svm = cv2.ml.SVM_create()
svm.setType(cv2.ml.SVM_C_SVC)
svm.setC(50)

svm.train(np.array(training_data), cv2.ml.ROW_SAMPLE,
          np.array(training_labels))

def pyramid(img, scale_factor=1.05, min_size=(100, 40),
            max_size=(600, 240)):
    h, w = img.shape
    min_w, min_h = min_size
    max_w, max_h = max_size
    while w >= min_w and h >= min_h:
        if w <= max_w and h <= max_h:
            yield img
        w /= scale_factor
        h /= scale_factor
        img = cv2.resize(img, (int(w), int(h)),
                         interpolation=cv2.INTER_AREA)

def sliding_window(img, step=20, window_size=(100, 40)):
    img_h, img_w = img.shape
    window_w, window_h = window_size
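    # Iterate over rows (y, bounded by the image height) and then over
    # columns (x, bounded by the image width).
    for y in range(0, img_h, step):
        for x in range(0, img_w, step):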
            roi = img[y:y+window_h, x:x+window_w]
            roi_h, roi_w = roi.shape
            if roi_w == window_w and roi_h == window_h:
                yield (x, y, roi)

for test_img_path in ['CarData/TestImages/test-0.pgm',
                      'CarData/TestImages/test-1.pgm',
                      '../images/car.jpg',
                      '../images/haying.jpg',
                      '../images/statue.jpg',
                      '../images/woodcutters.jpg']:
    img = cv2.imread(test_img_path)
    gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    pos_rects = []
    for resized in pyramid(gray_img):
        for x, y, roi in sliding_window(resized):
            descriptors = extract_bow_descriptors(roi)
            if descriptors is None:
                continue
            prediction = svm.predict(descriptors)
            if prediction[1][0][0] == 1.0:
                raw_prediction = svm.predict(
                    descriptors, flags=cv2.ml.STAT_MODEL_RAW_OUTPUT)
                score = -raw_prediction[1][0][0]
                if score > SVM_SCORE_THRESHOLD:
                    h, w = roi.shape
                    scale = gray_img.shape[0] / float(resized.shape[0])
                    pos_rects.append([int(x * scale),
                                      int(y * scale),
                                      int((x+w) * scale),
                                      int((y+h) * scale),
                                      score])
    pos_rects = non_max_suppression_fast(
        np.array(pos_rects), NMS_OVERLAP_THRESHOLD)
    for x0, y0, x1, y1, score in pos_rects:
        cv2.rectangle(img, (int(x0), int(y0)), (int(x1), int(y1)),
                      (0, 255, 255), 2)
        text = '%.2f' % score
        cv2.putText(img, text, (int(x0), int(y0) - 20),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)
    cv2.imshow(test_img_path, img)
cv2.waitKey(0)


You probably see that some detection results are correct and others are incorrect. Try fine-tuning the parameters, including the number of training samples, the number of BoW clusters, and the `pyramid` and `sliding_window` parameters, to see how the detection results are affected.
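Before changing the `pyramid` and `sliding_window` parameters, it can help to estimate how many windows they produce, since each window costs one SIFT-plus-SVM evaluation. The following sketch reproduces only the arithmetic, without touching any image data. The helper names (`pyramid_sizes`, `window_origins`, `to_original_rect`) and the example image size are hypothetical, and we use a scale factor of 2.0 to keep the numbers easy to follow.

```python
def pyramid_sizes(size, scale_factor=1.05, min_size=(100, 40),
                  max_size=(600, 240)):
    w, h = size
    sizes = []
    while w >= min_size[0] and h >= min_size[1]:
        if w <= max_size[0] and h <= max_size[1]:
            sizes.append((int(w), int(h)))
        w /= scale_factor
        h /= scale_factor
    return sizes


def window_origins(size, step=20, window_size=(100, 40)):
    img_w, img_h = size
    win_w, win_h = window_size
    # Only positions where the whole window fits inside the image.
    return [(x, y)
            for y in range(0, img_h - win_h + 1, step)
            for x in range(0, img_w - win_w + 1, step)]


def to_original_rect(x, y, window_size, scale):
    # Map a window in a downscaled level back to original image coordinates.
    win_w, win_h = window_size
    return (int(x * scale), int(y * scale),
            int((x + win_w) * scale), int((y + win_h) * scale))


levels = pyramid_sizes((400, 160), scale_factor=2.0)
print(levels)  # [(400, 160), (200, 80), (100, 40)]
print([len(window_origins(level)) for level in levels])  # [112, 18, 1]
# A window at (20, 40) in the (200, 80) level, where scale = 160 / 80:
print(to_original_rect(20, 40, (100, 40), 2.0))  # (40, 80, 240, 160)
```

With the default scale factor of 1.05, the pyramid has many more levels, so the number of windows (and the running time) grows accordingly.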

## Training a custom HOG model to detect cars' positions

The previous section's exercise may have left you wondering whether we can achieve better results by training our own HOG detector for cars. Indeed, we can and we will!

Using the same car database as our previous two samples, the following script extracts HOG features from the training images, uses the HOG features to train an SVM classifier, and builds a HOG detector around the SVM classifier:

In [None]:
# %load detect_car_hog_svm.py
import cv2
import numpy as np
import os

OPENCV_MAJOR_VERSION = int(cv2.__version__.split('.')[0])
OPENCV_MINOR_VERSION = int(cv2.__version__.split('.')[1])

if not os.path.isdir('CarData'):
    print('CarData folder not found. Please download and unzip '
          'https://github.com/gcr/arc-evaluator/raw/master/CarData.tar.gz '
          'into the same folder as this script.')
    exit(1)

HOG_WINDOW_SIZE = (96, 48)
HOG_WEIGHT_THRESHOLD = 0.45

SVM_NUM_TRAINING_SAMPLES_PER_CLASS = 300

hog = cv2.HOGDescriptor(HOG_WINDOW_SIZE, (16, 16), (8, 8), (8, 8), 9)

def get_pos_and_neg_paths(i):
    pos_path = 'CarData/TrainImages/pos-%d.pgm' % (i+1)
    neg_path = 'CarData/TrainImages/neg-%d.pgm' % (i+1)
    return pos_path, neg_path

def extract_hog_descriptors(img):
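    # Pass the interpolation mode by keyword. The third positional
    # argument of cv2.resize is dst, not interpolation.
    resized = cv2.resize(img, HOG_WINDOW_SIZE,
                         interpolation=cv2.INTER_CUBIC)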
    return hog.compute(resized, (16, 16), (0, 0))

training_data = []
training_labels = []
for i in range(SVM_NUM_TRAINING_SAMPLES_PER_CLASS):
    pos_path, neg_path = get_pos_and_neg_paths(i)
    pos_img = cv2.imread(pos_path, cv2.IMREAD_GRAYSCALE)
    pos_descriptors = extract_hog_descriptors(pos_img)
    if pos_descriptors is not None:
        training_data.append(pos_descriptors)
        training_labels.append(1)
    neg_img = cv2.imread(neg_path, cv2.IMREAD_GRAYSCALE)
    neg_descriptors = extract_hog_descriptors(neg_img)
    if neg_descriptors is not None:
        training_data.append(neg_descriptors)
        training_labels.append(-1)

svm = cv2.ml.SVM_create()
svm.setDegree(3)
criteria = (cv2.TERM_CRITERIA_MAX_ITER + cv2.TERM_CRITERIA_EPS, 1000, 1e-3)
svm.setTermCriteria(criteria)
svm.setKernel(cv2.ml.SVM_LINEAR)
svm.setNu(0.5)
svm.setP(0.1)
svm.setC(0.01)
svm.setType(cv2.ml.SVM_EPS_SVR)

svm.train(np.array(training_data), cv2.ml.ROW_SAMPLE,
          np.array(training_labels))

support_vectors = np.transpose(svm.getSupportVectors())
rho, _, _ = svm.getDecisionFunction(0)
svm_detector = np.append(support_vectors, [[-rho]], 0)
hog.setSVMDetector(svm_detector)

def is_inside(i, o):
    ix, iy, iw, ih = i
    ox, oy, ow, oh = o
    return ix > ox and ix + iw < ox + ow and \
        iy > oy and iy + ih < oy + oh

for test_img_path in ['CarData/TestImages/test-0.pgm',
                      'CarData/TestImages/test-1.pgm',
                      '../images/car.jpg',
                      '../images/haying.jpg',
                      '../images/statue.jpg',
                      '../images/woodcutters.jpg']:
    img = cv2.imread(test_img_path)

    if OPENCV_MAJOR_VERSION >= 5 or \
            (OPENCV_MAJOR_VERSION == 4 and OPENCV_MINOR_VERSION >= 6):
        # OpenCV 4.6 or a later version is being used.
        found_rects, found_weights = hog.detectMultiScale(
            img, winStride=(8, 8), scale=1.03, groupThreshold=2.0)
    else:
        # OpenCV 4.5 or an earlier version is being used.
        # The groupThreshold parameter used to be named finalThreshold.
        found_rects, found_weights = hog.detectMultiScale(
            img, winStride=(8, 8), scale=1.03, finalThreshold=2.0)

    found_rects_filtered = []
    found_weights_filtered = []
    for ri, r in enumerate(found_rects):
        if found_weights[ri] < HOG_WEIGHT_THRESHOLD:
            continue
        for qi, q in enumerate(found_rects):
            if found_weights[qi] < HOG_WEIGHT_THRESHOLD:
                continue
            if ri != qi and is_inside(r, q):
                break
        else:
            found_rects_filtered.append(r)
            found_weights_filtered.append(found_weights[ri])

    for ri, r in enumerate(found_rects_filtered):
        x, y, w, h = r
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 255), 2)
        text = '%.2f' % found_weights_filtered[ri]
        cv2.putText(img, text, (x, y - 20),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)

    cv2.imshow(test_img_path, img)
cv2.waitKey(0)


Note that a sliding window approach is built into the `detectMultiScale` method of `cv2.HOGDescriptor`, so we do not need a custom implementation of sliding windows or NMS in this case.

You should see that the detection results are much more reliable in this latest sample. Try fine-tuning the parameters, including the number of training samples, the window size, the HOG weight threshold, and the parameters of `detectMultiScale`, to see how the detection results are affected.
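The step that converts the trained SVM into a HOG detector is compact, so here is a NumPy-only sketch of the shapes involved. It assumes the usual linear form, in which a window's score is the dot product of a weight vector with the window's HOG descriptor plus a bias, and it assumes that the SVM exposes its weights as a single row. The values of `weights`, `rho`, and `descriptor` are made up, and `window_score` is a hypothetical helper rather than an OpenCV function.

```python
import numpy as np

# Made-up weights with shape (1, N), standing in for the single row that we
# assume a trained linear SVM returns. Here, N = 3 for readability.
weights = np.array([[0.5, -1.0, 2.0]], dtype=np.float32)
rho = 0.25  # Made-up offset of the decision function.

# Transpose to a column of N weights, then append the bias as an extra row.
detector = np.append(np.transpose(weights), [[-rho]], 0)
print(detector.shape)  # (4, 1)


def window_score(descriptor, detector):
    # Dot product with the first N coefficients, plus the final bias term.
    return float(descriptor @ detector[:-1, 0] + detector[-1, 0])


descriptor = np.array([1.0, 0.5, 0.25])  # Made-up 3-element descriptor.
print(window_score(descriptor, detector))  # 0.25
```

In the real script, N is the HOG descriptor length for the chosen window size, so the detector passed to `setSVMDetector` has N + 1 coefficients.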


## Summary

That is all for now! Please refer to the book for additional details on these samples.