# Refer to OpenCV Social Distancing Detector (https://www.pyimagesearch.com/2020/06/01/opencv-social-distancing-detector/)

## Packages needed to run the OpenCV social distancing detector
1. Install opencv-python (`cv2`) and imutils for image processing.
2. Install ffmpeg to convert the output video file so that it can be displayed inline.

In [2]:
!pip install imutils

Collecting imutils
  Downloading imutils-0.5.4.tar.gz (17 kB)
Building wheels for collected packages: imutils
  Building wheel for imutils (setup.py) ... done
  Created wheel for imutils: filename=imutils-0.5.4-py3-none-any.whl size=25860 sha256=7eeac748aca883993e288348096b41b262b82a4a0c857af2aa5b5564f7ed2c18
  Stored in directory: /home/sandy/.cache/pip/wheels/86/d7/0a/4923351ed1cec5d5e24c1eaf8905567b02a0343b24aa873df2
Successfully built imutils
Installing collected packages: imutils
Successfully installed imutils-0.5.4


In [4]:
!pip install opencv-python

Collecting opencv-python
  Downloading opencv_python-4.5.3.56-cp37-cp37m-manylinux2014_x86_64.whl (49.9 MB)

### Import Packages

In [3]:
# import the necessary packages
from scipy.spatial import distance as dist
import matplotlib.pyplot as plt
import numpy as np
import argparse
import imutils
import cv2
import os
import time

### Function to display images in Jupyter Notebooks and Google Colab

In [4]:
def plt_imshow(title, image):
    # convert the image from BGR to RGB color space and display it
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    plt.imshow(image)
    plt.title(title)
    plt.grid(False)
    plt.show()

### Our configuration file
Holds the base directory of the YOLO model files, the minimum confidence for a detection to be kept, the non-maxima suppression (NMS) threshold, and the minimum allowed distance between two people, in pixels.

In [5]:
class Config:
    # base path to YOLO directory
    MODEL_PATH = "yolo-coco"

    # initialize minimum probability to filter weak detections along with
    # the threshold when applying non-maxima suppression
    MIN_CONF = 0.3
    NMS_THRESH = 0.3

    # boolean indicating if NVIDIA CUDA GPU should be used
    USE_GPU = False

    # define the minimum safe distance (in pixels) that two people can be
    # from each other
    MIN_DISTANCE = 50

# instantiate our Config object
config = Config()
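
To make the role of `NMS_THRESH` more concrete, here is a minimal pure-Python sketch of greedy non-maxima suppression on `[x, y, w, h]` boxes. It is not OpenCV's implementation; `iou` and `greedy_nms` are hypothetical helper names, and the sample boxes and scores are made up for illustration.

```python
def iou(a, b):
    # boxes are [x, y, w, h] with (x, y) the top-left corner
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, score_thresh, nms_thresh):
    # drop weak boxes, then visit the rest from most to least confident
    order = sorted(
        (i for i, s in enumerate(scores) if s > score_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # keep a box only if it does not overlap too much with a kept box
        if all(iou(boxes[i], boxes[k]) <= nms_thresh for k in keep):
            keep.append(i)
    return keep


# two heavily overlapping boxes and one separate box (made-up values)
boxes = [[10, 10, 100, 200], [15, 12, 100, 200], [300, 50, 80, 160]]
scores = [0.9, 0.6, 0.8]
kept = greedy_nms(boxes, scores, score_thresh=0.3, nms_thresh=0.3)
print(kept)
```

A lower `nms_thresh` suppresses more overlapping boxes, while a higher value tolerates more overlap, which can leave duplicate boxes on the same person.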

### Detecting people in images and video streams with OpenCV

In [6]:
def detect_people(frame, net, ln, personIdx=0):
    # grab the dimensions of the frame and  initialize the list of
    # results
    (H, W) = frame.shape[:2]
    results = []

    # construct a blob from the input frame and then perform a forward
    # pass of the YOLO object detector, giving us our bounding boxes
    # and associated probabilities
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),swapRB=True, crop=False)
    net.setInput(blob)
    start = time.time()
    layerOutputs = net.forward(ln)
    end = time.time()
    
    # show timing information on YOLO for a frame 
    print("[INFO] YOLO took {:.6f} seconds".format(end - start))

    # initialize our lists of detected bounding boxes, centroids, and
    # confidences, respectively
    boxes = []
    centroids = []
    confidences = []

    # loop over each of the layer outputs
    for output in layerOutputs:
        # loop over each of the detections
        for detection in output:
            # extract the class ID and confidence (i.e., probability)
            # of the current object detection
            scores = detection[5:]
            classID = np.argmax(scores)
            confidence = scores[classID]

            # filter detections by (1) ensuring that the object
            # detected was a person and (2) that the minimum
            # confidence is met
            if classID == personIdx and confidence > config.MIN_CONF:
                # scale the bounding box coordinates back relative to
                # the size of the image, keeping in mind that YOLO
                # actually returns the center (x, y)-coordinates of
                # the bounding box followed by the boxes' width and height
                box = detection[0:4] * np.array([W, H, W, H])
                (centerX, centerY, width, height) = box.astype("int")

                # use the center (x, y)-coordinates to derive the top
                # and left corner of the bounding box
                x = int(centerX - (width / 2))
                y = int(centerY - (height / 2))

                # update our list of bounding box coordinates,
                # centroids, and confidences
                boxes.append([x, y, int(width), int(height)])
                centroids.append((centerX, centerY))
                confidences.append(float(confidence))

    # apply non-maxima suppression to suppress weak, overlapping
    # bounding boxes
    idxs = cv2.dnn.NMSBoxes(boxes, confidences, config.MIN_CONF, config.NMS_THRESH)

    # ensure at least one detection exists
    if len(idxs) > 0:
        # loop over the indexes we are keeping
        for i in idxs.flatten():
            # extract the bounding box coordinates
            (x, y) = (boxes[i][0], boxes[i][1])
            (w, h) = (boxes[i][2], boxes[i][3])

            # update our results list to consist of the person
            # prediction probability, bounding box coordinates,
            # and the centroid
            r = (confidences[i], (x, y, x + w, y + h), centroids[i])
            results.append(r)

    # return the list of results
    return results
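
The coordinate handling inside `detect_people` can be checked by hand on a single detection. The sketch below assumes a detection whose first four values are the normalized center x, center y, width and height, as the comments in the function describe. The helper name `to_corner_box`, the frame size, and the sample numbers are illustrative only.

```python
def to_corner_box(detection, frame_w, frame_h):
    # scale the normalized (center x, center y, width, height) to pixels
    center_x = int(detection[0] * frame_w)
    center_y = int(detection[1] * frame_h)
    width = int(detection[2] * frame_w)
    height = int(detection[3] * frame_h)

    # derive the top-left corner from the center and the box size
    x = int(center_x - width / 2)
    y = int(center_y - height / 2)

    # same layout as the tuples stored in `results`:
    # (startX, startY, endX, endY) and the centroid
    return (x, y, x + width, y + height), (center_x, center_y)


# a made-up detection centered in a made-up 800x400 frame
bbox, centroid = to_corner_box([0.5, 0.5, 0.25, 0.5], frame_w=800, frame_h=400)
print(bbox, centroid)
```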

### Implementing a social distancing detector with OpenCV and deep learning

In [7]:

# hard coded arguments and values
args = {
    "input": "sample.mp4",
    "imageinput": "test1.jpg",
    "output": "output.avi",
    "display": 0
}

In [8]:
# load the COCO class labels our YOLO model was trained on
labelsPath = os.path.sep.join([config.MODEL_PATH, "coco.names"])
LABELS = open(labelsPath).read().strip().split("\n")

print(" labels: " + str(LABELS))

# derive the paths to the YOLO weights and model configuration
weightsPath = os.path.sep.join([config.MODEL_PATH, "yolov3.weights"])
configPath = os.path.sep.join([config.MODEL_PATH, "yolov3.cfg"])

print(" weights path: " + weightsPath)
print("config path: " + configPath)

 labels: ['person', 'bicycle', 'car', 'motorbike', 'aeroplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'sofa', 'pottedplant', 'bed', 'diningtable', 'toilet', 'tvmonitor', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']
 weights path: yolo-coco/yolov3.weights
config path: yolo-coco/yolov3.cfg


In [9]:
# initialize a list of colors to represent each possible class label
np.random.seed(42)
COLORS = np.random.randint(0, 255, size=(len(LABELS), 3),dtype="uint8")

In [10]:
# load our YOLO object detector trained on COCO dataset (80 classes)
print("[INFO] loading YOLO from disk...")
net = cv2.dnn.readNetFromDarknet(configPath, weightsPath)

[INFO] loading YOLO from disk...


In [11]:
# check if we are going to use GPU
if config.USE_GPU:
    # set CUDA as the preferable backend and target
    print("[INFO] setting preferable backend and target to CUDA...")
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

In [12]:
# determine only the *output* layer names that we need from YOLO
ln = net.getLayerNames()
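# getUnconnectedOutLayers() returns an Nx1 array in older OpenCV releases and a
# flat array in newer ones, so flatten the result before indexing
ln = [ln[i - 1] for i in np.array(net.getUnconnectedOutLayers()).flatten()]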


## Image processing
1) Detect the people in the image
2) Compute the pairwise distance between the centroids of all detected people
3) If the distance between two people is below the threshold, add both of them to the `violate` set
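
The pairwise check in steps 2 and 3 can be sketched without OpenCV or SciPy. This is a simplified stand-in for the `cdist`-based loop used in this notebook, with made-up centroids; `MIN_DISTANCE` mirrors `Config.MIN_DISTANCE`.

```python
import math
from itertools import combinations

MIN_DISTANCE = 50  # pixels, same value as Config.MIN_DISTANCE

# made-up centroids (x, y) for three detected people
centroids = [(100, 100), (120, 130), (400, 120)]

violate = set()
# visit every unordered pair once, like the upper triangle of a distance matrix
for i, j in combinations(range(len(centroids)), 2):
    (x1, y1), (x2, y2) = centroids[i], centroids[j]
    if math.hypot(x1 - x2, y1 - y2) < MIN_DISTANCE:
        # both people in a too-close pair are flagged
        violate.update((i, j))

print(sorted(violate))
```

Because the distance is measured in image pixels, the same threshold corresponds to different real-world distances depending on how far people are from the camera.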

In [8]:
# load our input image and grab its spatial dimensions
image = cv2.imread(args["imageinput"])
(H, W) = image.shape[:2]

#load image and detect the person in it
image = imutils.resize(image, width=700)
results = detect_people(image, net, ln,personIdx=LABELS.index("person"))

# initialize the set of indexes that violate the minimum social distance
violate = set()

# ensure there are *at least* two people detections (required in order to compute our pairwise distance maps)
if len(results) >= 2:
    # extract all centroids from the results and compute the Euclidean distances between all pairs of the centroids
    centroids = np.array([r[2] for r in results])
    D = dist.cdist(centroids, centroids, metric="euclidean")

    # loop over the upper triangular of the distance matrix
    for i in range(0, D.shape[0]):
        for j in range(i + 1, D.shape[1]):
            # check to see if the distance between any two centroid pairs is less than the configured number of pixels
            if D[i, j] < config.MIN_DISTANCE:
                # update our violation set with the indexes of the centroid pairs
                violate.add(i)
                violate.add(j)

# loop over the results
for (i, (prob, bbox, centroid)) in enumerate(results):
    # extract the bounding box and centroid coordinates, then initialize the color of the annotation
    (startX, startY, endX, endY) = bbox
    (cX, cY) = centroid
    color = (0, 255, 0)

    # if the index pair exists within the violation set, then update the color
    if i in violate:
        color = (0, 0, 255)

    # draw (1) a bounding box around the person and (2) the centroid coordinates of the person,
    cv2.rectangle(image, (startX, startY), (endX, endY), color, 2)
    cv2.circle(image, (cX, cY), 5, color, 1)

# draw the total number of social distancing violations on the output frame
text = "Social Distancing Violations: {}".format(len(violate))
cv2.putText(image, text, (10, image.shape[0] - 25),cv2.FONT_HERSHEY_SIMPLEX, 0.85, (0, 0, 255), 3)


# show the output image
#cv2.imshow("Image", image)

[INFO] YOLO took 1.428015 seconds



In [9]:
# show the output image
cv2.imshow("Image", image)
cv2.waitKey(0)

113

## Video processing
1) Detect the people in each frame of the video
2) Compute the pairwise distance between the centroids of all detected people
3) If the distance between two people is below the threshold, add both of them to the `violate` set

In [19]:
# initialize the video stream and pointer to output video file
print("[INFO] accessing video stream...")
vs = cv2.VideoCapture(args["input"] if args["input"] else 0)
writer = None

[INFO] accessing video stream...


In [20]:
# initialize the counter of frames read from the video
frameNo = 0

videoDetectStart = time.time()

# loop over the frames from the video stream
while True:
    # read the next frame from the file
    (grabbed, frame) = vs.read()
    
    frameNo = frameNo +1 

    # if the frame was not grabbed, then we have reached the end of the stream
    if not grabbed:
        break

    # resize the frame and then detect people (and only people) in it
    frame = imutils.resize(frame, width=700) 
    results = detect_people(frame, net, ln,personIdx=LABELS.index("person"))

    # initialize the set of indexes that violate the minimum social distance
    violate = set()

    # ensure there are *at least* two people detections (required in order to compute our pairwise distance maps)
    if len(results) >= 2:
        # extract all centroids from the results and compute the Euclidean distances between all pairs of the centroids
        centroids = np.array([r[2] for r in results])
        D = dist.cdist(centroids, centroids, metric="euclidean")

        # loop over the upper triangular of the distance matrix
        for i in range(0, D.shape[0]):
            for j in range(i + 1, D.shape[1]):
                # check to see if the distance between any two centroid pairs is less than the configured number of pixels
                if D[i, j] < config.MIN_DISTANCE:
                    # update our violation set with the indexes of the centroid pairs
                    violate.add(i)
                    violate.add(j)

    # loop over the results
    for (i, (prob, bbox, centroid)) in enumerate(results):
        # extract the bounding box and centroid coordinates, then initialize the color of the annotation
        (startX, startY, endX, endY) = bbox
        (cX, cY) = centroid
        color = (0, 255, 0)

        # if the index pair exists within the violation set, then update the color
        if i in violate:
            color = (0, 0, 255)

        # draw (1) a bounding box around the person and (2) the centroid coordinates of the person,
        cv2.rectangle(frame, (startX, startY), (endX, endY), color, 2)
        cv2.circle(frame, (cX, cY), 5, color, 1)

    # draw the total number of social distancing violations on the output frame
    text = "Social Distancing Violations: {}".format(len(violate))
    cv2.putText(frame, text, (10, frame.shape[0] - 25),cv2.FONT_HERSHEY_SIMPLEX, 0.85, (0, 0, 255), 3)

    # check to see if the output frame should be displayed to our screen
    if args["display"] > 0:
        # show the output frame
        cv2.imshow("Frame", frame)
        key = cv2.waitKey(1) & 0xFF

        # if the `q` key was pressed, break from the loop
        if key == ord("q"):
            break

    # if an output video file path has been supplied and the video writer has not been initialized, do so now
    if args["output"] != "" and writer is None:
        # initialize our video writer
        fourcc = cv2.VideoWriter_fourcc(*"MJPG")
        writer = cv2.VideoWriter(args["output"], fourcc, 25, (frame.shape[1], frame.shape[0]), True)

    # if the video writer is not None, write the frame to the output video file
    if writer is not None:
        writer.write(frame)

videoDetectEnd = time.time()
# show the total processing time for the whole video
print("[INFO] video processing took {:.6f} seconds".format(videoDetectEnd - videoDetectStart))

# frameNo is an int, so format it instead of concatenating it to a string
print("Total frames processed in this video: {}".format(frameNo))
    
# do a bit of cleanup
vs.release()

# check to see if the video writer pointer needs to be released
if writer is not None:
    writer.release()

[INFO] YOLO took 0.900452 seconds
[INFO] YOLO took 0.920882 seconds
[INFO] YOLO took 0.894843 seconds
[INFO] YOLO took 0.874938 seconds
[INFO] YOLO took 0.876225 seconds
[INFO] YOLO took 0.895451 seconds
[INFO] YOLO took 0.878620 seconds
[INFO] YOLO took 0.881421 seconds
[INFO] YOLO took 0.900224 seconds
[INFO] YOLO took 0.906002 seconds
[INFO] YOLO took 0.938484 seconds
[INFO] YOLO took 1.149958 seconds
[INFO] YOLO took 1.135099 seconds
[INFO] YOLO took 0.979864 seconds
[INFO] YOLO took 1.277631 seconds
[INFO] YOLO took 0.966535 seconds
[INFO] YOLO took 0.944442 seconds
[INFO] YOLO took 1.039900 seconds
[INFO] YOLO took 0.955700 seconds
[INFO] YOLO took 1.393111 seconds
[INFO] YOLO took 1.090088 seconds
[INFO] YOLO took 0.939010 seconds
[INFO] YOLO took 0.938540 seconds
[INFO] YOLO took 1.609737 seconds
[INFO] YOLO took 1.044897 seconds
[INFO] YOLO took 1.003771 seconds
[INFO] YOLO took 0.999167 seconds
[INFO] YOLO took 1.978830 seconds
[INFO] YOLO took 0.971215 seconds
[INFO] YOLO to

[INFO] YOLO took 0.984962 seconds
[INFO] YOLO took 0.974856 seconds
[INFO] YOLO took 1.202391 seconds
[INFO] YOLO took 1.114230 seconds
[INFO] YOLO took 1.061789 seconds
[INFO] YOLO took 0.990754 seconds
[INFO] YOLO took 1.318717 seconds
[INFO] YOLO took 1.181762 seconds
[INFO] YOLO took 1.012625 seconds
[INFO] YOLO took 1.022085 seconds
[INFO] YOLO took 0.973999 seconds
[INFO] YOLO took 0.979049 seconds
[INFO] YOLO took 0.973958 seconds
[INFO] YOLO took 0.976421 seconds
[INFO] YOLO took 0.950987 seconds
[INFO] YOLO took 0.997562 seconds
[INFO] YOLO took 0.977876 seconds
[INFO] YOLO took 0.980693 seconds
[INFO] YOLO took 0.996753 seconds
[INFO] YOLO took 1.013901 seconds
[INFO] YOLO took 0.947817 seconds
[INFO] YOLO took 0.971985 seconds
[INFO] YOLO took 1.024125 seconds
[INFO] YOLO took 1.097629 seconds
[INFO] YOLO took 0.969833 seconds
[INFO] YOLO took 0.975636 seconds
[INFO] YOLO took 0.972971 seconds
[INFO] YOLO took 1.004187 seconds
[INFO] YOLO took 0.943191 seconds
[INFO] YOLO to

[INFO] YOLO took 0.954045 seconds
[INFO] YOLO took 0.951152 seconds
[INFO] YOLO took 0.965474 seconds
[INFO] YOLO took 0.955280 seconds
[INFO] YOLO took 0.946671 seconds
[INFO] YOLO took 0.976818 seconds
[INFO] YOLO took 1.051600 seconds
[INFO] YOLO took 0.966964 seconds
[INFO] YOLO took 0.972646 seconds
[INFO] YOLO took 0.955259 seconds
[INFO] YOLO took 0.970750 seconds
[INFO] YOLO took 0.954793 seconds
[INFO] YOLO took 0.951214 seconds
[INFO] YOLO took 0.951810 seconds
[INFO] YOLO took 0.980818 seconds
[INFO] YOLO took 0.972908 seconds
[INFO] YOLO took 0.952830 seconds
[INFO] YOLO took 0.973637 seconds
[INFO] YOLO took 0.957894 seconds
[INFO] YOLO took 0.965593 seconds
[INFO] YOLO took 0.958110 seconds
[INFO] YOLO took 0.950152 seconds
[INFO] YOLO took 0.938612 seconds
[INFO] YOLO took 0.979268 seconds
[INFO] YOLO took 0.987801 seconds
[INFO] YOLO took 0.940298 seconds
[INFO] YOLO took 0.948478 seconds
[INFO] YOLO took 0.940343 seconds
[INFO] YOLO took 0.945567 seconds
[INFO] YOLO to

[INFO] YOLO took 0.955323 seconds
[INFO] YOLO took 0.982442 seconds
[INFO] YOLO took 1.123532 seconds
[INFO] YOLO took 1.140537 seconds
[INFO] YOLO took 1.010129 seconds
[INFO] YOLO took 1.010044 seconds
[INFO] YOLO took 0.980783 seconds
[INFO] YOLO took 1.069958 seconds
[INFO] YOLO took 0.991270 seconds
[INFO] YOLO took 1.057566 seconds
[INFO] YOLO took 0.954875 seconds
[INFO] YOLO took 0.949095 seconds
[INFO] YOLO took 0.974266 seconds
[INFO] YOLO took 0.954992 seconds
[INFO] YOLO took 0.957869 seconds
[INFO] YOLO took 0.947376 seconds
[INFO] YOLO took 0.948147 seconds
[INFO] YOLO took 0.944998 seconds
[INFO] YOLO took 1.036540 seconds
[INFO] YOLO took 0.955186 seconds
[INFO] YOLO took 0.950860 seconds
[INFO] YOLO took 0.981833 seconds
[INFO] YOLO took 0.969373 seconds
[INFO] YOLO took 0.992992 seconds
[INFO] YOLO took 1.025693 seconds
[INFO] YOLO took 0.963231 seconds
[INFO] YOLO took 0.959649 seconds
[INFO] YOLO took 0.954592 seconds
[INFO] YOLO took 0.957277 seconds
[INFO] YOLO to

Note that the cell above can take a long time to run, since YOLO needs roughly one second per frame in the log above. To view the resulting video inside the notebook or Colab, run the following cells, which can also be time-consuming.
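
As a rough guide to how long the run takes, the per-frame timings can be scaled up to the whole video. The sketch below uses the first five latencies from the log above and the 928 frames reported for this video. It ignores resizing, drawing and writing the output, so treat the result as a lower bound rather than a measurement.

```python
# per-frame YOLO latencies in seconds, copied from the first lines of the log
sample_latencies = [0.900452, 0.920882, 0.894843, 0.874938, 0.876225]
total_frames = 928  # frame count reported for this video

mean_latency = sum(sample_latencies) / len(sample_latencies)
estimated_seconds = mean_latency * total_frames

print("about {:.0f} s ({:.1f} min)".format(estimated_seconds, estimated_seconds / 60))
```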

Our output video is produced in `.avi` format. First, we need to convert it to `.mp4` format.
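
The conversion can also be driven from Python instead of a shell escape. This is a minimal sketch that assumes the `ffmpeg` binary is installed and on `PATH`; `build_ffmpeg_cmd` is a hypothetical helper name, and the file names are the ones used in this notebook.

```python
import os
import shutil
import subprocess


def build_ffmpeg_cmd(src, dst):
    # -y overwrites dst if it already exists; -i names the input file
    return ["ffmpeg", "-y", "-i", src, dst]


cmd = build_ffmpeg_cmd("output.avi", "outputp.mp4")

# only run the conversion when ffmpeg is available and the input exists
if shutil.which("ffmpeg") is not None and os.path.exists("output.avi"):
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    print("ffmpeg exit code:", result.returncode)
else:
    print("skipping conversion: ffmpeg or output.avi not found")
```

Passing the command as a list avoids shell quoting problems if the file names contain spaces.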

In [23]:
print("Total frames processed in this Video {} ".format(frameNo))

Total frames processed in this Video 928 


In [12]:
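# ffmpeg is a system binary, not a Python package, so `pip install ffmpeg` does not provide it;
# on Debian/Ubuntu (including Colab) it can usually be installed with apt
!apt-get install -y ffmpeg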



In [16]:
!ffmpeg -i output.avi outputp.mp4

ffmpeg version 4.2.4-1ubuntu0.1 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 9 (Ubuntu 9.3.0-10ubuntu2)
  configuration: --prefix=/usr --extra-version=1ubuntu0.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-l

In [17]:
#@title Display video inline
from IPython.display import HTML
from base64 import b64encode

mp4 = open("outputp.mp4", "rb").read()
dataURL = "data:video/mp4;base64," + b64encode(mp4).decode()
HTML("""
<video width=400 controls>
      <source src="%s" type="video/mp4">
</video>
""" % dataURL)