<h1 style="font-size:30px;">Application: Social Distancing Monitor</h1>

In a world hit by a pandemic, it is important to take precautions for our own good as well as the greater good. A few ways to stay safe include getting vaccinated if you can, wearing a mask, and keeping a social distance of 6 ft (about two meters). In this notebook, we implement a Social Distancing Monitor using OpenCV's DNN module and a MobileNet-SSD object detection model.

Consider the following frame from a video feed. Our goal is to determine whether any people are unsafely close together and, if so, which ones. Run the Python script accompanying this notebook to try it out for yourself.

<br>
<center>
<img src="https://opencv.org/wp-content/uploads/2021/09/c0-m15-input-frame.png" alt="Waiting Room">
</center>
<br>

In [None]:
if 'google.colab' in str(get_ipython()):
    print("Downloading Code to Colab Environment")
    !wget https://www.dropbox.com/sh/uro596fmm67in3b/AABurDoQj5tS94EgUDQXkcBaa?dl=1 -O module-code.zip -q --show-progress
    !unzip -qq module-code.zip
    !pip install --upgrade opencv-contrib-python
    %cd Applications/
else:
    pass

In [None]:
import cv2
import numpy as np

# 1. Load MobileNet-SSD Model

The model is obtained from the following repo:
https://github.com/chuanqi305/MobileNet-SSD

The model was pre-trained on the COCO dataset and then trained on the Pascal VOC 2012 data using the Caffe framework. As a result, it offers good detection accuracy while running fast.

We load the MobileNet-SSD model with the `cv2.dnn.readNetFromCaffe` function from OpenCV's DNN module. This function takes 2 arguments:
1. The `MobileNetSSD_deploy.prototxt` protobuf text file
2. The `MobileNetSSD_deploy.caffemodel` binary file

This function returns the loaded model object.

In [None]:
configFile = 'MobileNetSSD_deploy.prototxt'
modelFile = 'MobileNetSSD_deploy.caffemodel'
net = cv2.dnn.readNetFromCaffe(configFile, modelFile)
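
Loading fails if either model file is missing from the working directory, so it can help to check for the files first. The sketch below is one way to do that with the standard library; the `missing_files` helper is hypothetical and not part of OpenCV.

```python
from pathlib import Path


def missing_files(*paths):
    """Hypothetical helper: return the paths that do not exist on disk."""
    return [str(p) for p in paths if not Path(p).is_file()]


# Same file names as used in the cell above.
missing = missing_files('MobileNetSSD_deploy.prototxt',
                        'MobileNetSSD_deploy.caffemodel')
if missing:
    print('Model files not found:', ', '.join(missing))
```

Running this before `cv2.dnn.readNetFromCaffe` gives a clearer message about which file needs to be downloaded.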

# 2. Get Detections
We now get the detections from the loaded MobileNet-SSD model. For each detection in the network output, we check whether it is a person detected with high confidence. If so, we record its confidence, bounding box, and centroid.

The `detect` function takes 2 inputs:
1. The frame to process
2. The pre-trained MobileNet-SSD network

After processing, it returns a list with one tuple per detected person, each containing:
1. Confidence
2. Bounding box values
3. Centroid values

As always, we need to call `blobFromImage` with settings that match the preprocessing used when the model was trained. In this case, we can find these settings in the `train.prototxt` file from the [github repository](https://github.com/chuanqi305/MobileNet-SSD/blob/master/train.prototxt).
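
To make the preprocessing concrete, the sketch below reproduces the per-pixel arithmetic that `blobFromImage` applies with the settings used in this notebook: subtract the mean, then multiply by the scale factor. The `normalize_pixel` helper is hypothetical and only illustrates the mapping.

```python
# Values passed to blobFromImage in this notebook.
MEAN = 127.5
SCALE = 0.007843  # approximately 1 / 127.5


def normalize_pixel(value, mean=MEAN, scale=SCALE):
    """Hypothetical helper: mean subtraction followed by scaling."""
    return (value - mean) * scale


# The darkest, middle, and brightest 8-bit intensities.
for value in (0, 127.5, 255):
    print(value, '->', round(normalize_pixel(value), 3))
```

With these settings, 8-bit intensities in the range 0 to 255 end up in a range of roughly -1 to 1.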


In [None]:
def detect(frame, network):
    """Detects any humans and returns their bounding box and centers."""
    results = []
    h, w = frame.shape[:2]

    # Pre-processing: mean subtraction and scaling to match model's training set.
    blob = cv2.dnn.blobFromImage(frame, 0.007843, (300, 300), [127.5, 127.5, 127.5])
    network.setInput(blob)

    # Run an inference of the model, passing blob through the network.
    network_output = network.forward()

    # Loop over all results.
    for i in np.arange(0, network_output.shape[2]):
        class_id = network_output[0, 0, i, 1]
        confidence = network_output[0, 0, i, 2]

        # Filter for only detected people (classID 15) and high confidence.
        # https://github.com/chuanqi305/MobileNet-SSD/blob/master/demo.py#L21
        if confidence > 0.7 and class_id == 15:
            # Remap 0-1 position outputs to size of image for bounding box.
            box = network_output[0, 0, i, 3:7] * np.array([w, h, w, h])
            box = box.astype('int')

            # Calculate the person center from the bounding box.
            center_x = int((box[0] + box[2]) / 2)
            center_y = int((box[1] + box[3]) / 2)

            results.append((confidence, box, (center_x, center_y)))
    return results
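
The indexing in `detect` may be easier to follow on a synthetic array. The sketch below assumes, as the code above does, that the network output has shape `(1, 1, N, 7)`, with the class id at index 1, the confidence at index 2, and the normalized box corners at indices 3 to 6. The array values and the `parse_people` helper are made up for illustration.

```python
import numpy as np

PERSON_CLASS_ID = 15  # 'person' in this model's label list


def parse_people(network_output, w, h, min_confidence=0.7):
    """Hypothetical stand-in for the filtering loop in detect()."""
    results = []
    for row in network_output[0, 0]:
        class_id, confidence = int(row[1]), float(row[2])
        if class_id != PERSON_CLASS_ID or confidence <= min_confidence:
            continue
        # Scale normalized corners to pixel coordinates.
        box = (row[3:7] * np.array([w, h, w, h])).astype('int')
        center = (int((box[0] + box[2]) / 2), int((box[1] + box[3]) / 2))
        results.append((confidence, box, center))
    return results


# Made-up output: a confident person, a weak person, and a non-person (class 7).
fake_output = np.array([[[
    [0, 15, 0.90, 0.10, 0.20, 0.30, 0.80],
    [0, 15, 0.40, 0.50, 0.20, 0.70, 0.80],
    [0, 7, 0.95, 0.60, 0.10, 0.90, 0.50],
]]], dtype=np.float32)

people = parse_people(fake_output, w=640, h=480)
print(len(people), 'person kept')
```

Only the first row survives the filter: the second falls below the confidence threshold and the third is not a person.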

# 3. Detect Violations from Set of Detections
To detect violations, we compute the matrix of pairwise distances between the detection centroids. For each pair, we set the distance threshold to 1.2 times the pixel width of the narrower of the two bounding boxes. If the centroids are closer than this threshold, both detections are marked as violations.

This function takes the list of results returned by `detect` for a frame and returns the set of indices of the detections that are in violation.

You can tune the multiplying factor to suit your application.
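
A worked example of this rule on a single pair may help when choosing the factor. The sketch below uses made-up boxes in `(start_x, start_y, end_x, end_y)` form; the `is_violation` helper is hypothetical and checks one pair at a time rather than building a distance matrix.

```python
import math

FAC = 1.2  # same multiplier as in detect_violations


def is_violation(box_a, box_b, fac=FAC):
    """Hypothetical pairwise check mirroring the rule described above."""
    width_a = abs(box_a[2] - box_a[0])
    width_b = abs(box_b[2] - box_b[0])
    # Threshold is based on the narrower of the two boxes.
    ref_distance = int(fac * min(width_a, width_b))

    center_a = ((box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2)
    center_b = ((box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2)
    return math.dist(center_a, center_b) < ref_distance


# Both boxes are 50 px wide, so the threshold is roughly 60 px.
near = is_violation((100, 100, 150, 300), (140, 100, 190, 300))  # centers 40 px apart
far = is_violation((100, 100, 150, 300), (300, 100, 350, 300))   # centers 200 px apart
print(near, far)
```

Because the threshold scales with box width, people far from the camera (small boxes) need to be fewer pixels apart to be flagged than people close to it.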

In [None]:
def detect_violations(results):
    """Detects if there are any people who are unsafely close together."""
    violations = set()
    # Multiplier on the pixel width of the smallest detection.
    fac = 1.2

    if len(results) >= 2:
        # Width is right edge minus left.
        boxes_width = np.array([abs(int(r[1][2] - r[1][0])) for r in results])
        centroids = np.array([r[2] for r in results])
        distance_matrix = euclidean_dist(centroids, centroids)
        
        # For each starting detection...
        for row in range(distance_matrix.shape[0]):
            # Compare distance with every other remaining detection.
            for col in range(row + 1, distance_matrix.shape[1]):
                # Presume unsafe if closer than 1.2x (fac) width of a person apart.
                ref_distance = int(fac * min(boxes_width[row], boxes_width[col]))

                if distance_matrix[row, col] < ref_distance:
                    violations.add(row)
                    violations.add(col)
    return violations

# 4. Calculate Distance Matrix
We use the Euclidean distance formula, i.e.,

$ \text{Distance} = \sqrt{(x_{A}-x_{B})^{2} + (y_{A}-y_{B})^{2}} $

This function takes 2 NumPy arrays of points as input and returns the distance matrix as output.

In [None]:
def euclidean_dist(A, B):
    """Calculates pair-wise distance between each centroid combination.

    Returns a matrix of len(A) by len(B)."""
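    # Expand |a - b|^2 = |a|^2 + |b|^2 - 2 * (a . b) to avoid an explicit loop.
    p1 = np.sum(A**2, axis=1)[:, np.newaxis]
    p2 = np.sum(B**2, axis=1)
    p3 = -2 * np.dot(A, B.T)
    # Clamp at zero so floating-point round-off cannot produce a negative radicand.
    return np.round(np.sqrt(np.maximum(p1 + p2 + p3, 0)), 2)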
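
As a sanity check on the formula, the same pairwise distances can be computed more directly with broadcasting. The sketch below is a hypothetical alternative to `euclidean_dist`, not a replacement for it, and uses made-up centroids whose distances are easy to verify by hand.

```python
import numpy as np


def pairwise_dist_broadcast(A, B):
    """Hypothetical alternative: subtract every pair of points, then take norms."""
    # diff has shape (len(A), len(B), 2): one (dx, dy) per pair.
    diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=2))


# Made-up centroids forming a 3-4-5 right triangle.
centroids = np.array([[0, 0], [3, 0], [3, 4]])
distances = pairwise_dist_broadcast(centroids, centroids)
print(distances)
```

The diagonal is zero (each point compared with itself) and the matrix is symmetric, which is why `detect_violations` only needs to visit the entries above the diagonal.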

# 5. Write Video to File Directory
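To save the annotated video, we use the `cv2.VideoWriter` class, which writes a video file to disk frame by frame. Its arguments are the output path, the codec (FourCC), the frame rate, the frame size as (width, height), and a flag indicating whether the frames are in color.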

```python
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
writer = cv2.VideoWriter(OUTPUT_PATH, fourcc, 25, (frame.shape[1], frame.shape[0]), True)
```
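
The snippet above hard-codes 25 frames per second. If the input video uses a different rate, the output will play faster or slower than the original. One option, sketched below, is to read the rate from the capture with `cap.get(cv2.CAP_PROP_FPS)` and fall back to a default when no usable value is reported. The `pick_fps` helper is hypothetical and is plain Python so that it can be tried without a video file.

```python
import math


def pick_fps(reported_fps, default=25.0):
    """Hypothetical helper: fall back to a default when no usable rate is reported."""
    if reported_fps is None or math.isnan(reported_fps) or reported_fps <= 0:
        return default
    return float(reported_fps)


# In the notebook, the argument could come from cap.get(cv2.CAP_PROP_FPS).
print(pick_fps(29.97))  # a usable reported rate is kept
print(pick_fps(0))      # an unusable rate falls back to the default
```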


# 6. Loading and Running the Model on the Video
We now load the input video and run the functions defined above on it, reusing the pre-trained network loaded in step 1. We read the video frame by frame in a `while` loop and process each frame: get the detections, detect violations, draw a bounding box around each detection labeled Safe or Not Safe, and finally display and save the output video.

In [None]:
SHOW_VIDEO = 0
INPUT_PATH = 'input.mp4'
OUTPUT_PATH = 'output.mp4'

cap = cv2.VideoCapture(INPUT_PATH)

writer = None

prev_frame_time = 0
new_frame_time = 0
counter = 0

print("Processing frames please wait ...")

while cap.isOpened():
    ret, frame = cap.read()

    if not ret:
        break

    # Detect Boxes.
    results = detect(frame, network=net)

    # Detect boxes too close (i.e. the violations).
    violations = detect_violations(results)

    t, _ = net.getPerfProfile()
    label = 'Inference time: %.2f ms' % (t * 1000.0 / cv2.getTickFrequency())

    # Draw every bounding box, colored by whether it is in violation.
    for index, (prob, bounding_box, centroid) in enumerate(results):
        start_x, start_y, end_x, end_y = bounding_box

        # Color red if violation, otherwise color green.
        color = (0, 0, 255) if index in violations else (0, 255, 0)
        cv2.rectangle(frame, (start_x, start_y), (end_x, end_y), color, 2)
        cv2.putText(
            frame, 'Not Safe' if index in violations else 'Safe',
            (start_x, start_y - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)

    # Draw the frame-level overlays once per frame, even when nothing was detected.
    cv2.putText(
        frame, label,
        (2, frame.shape[0] - 4),
        cv2.FONT_HERSHEY_TRIPLEX, 0.4, (255, 255, 255))
    cv2.putText(
        frame, f'Num Violations: {len(violations)}',
        (10, frame.shape[0] - 25),
        fontFace=cv2.FONT_HERSHEY_PLAIN,
        fontScale=1.0, color=(0, 0, 255), thickness=1)

    if SHOW_VIDEO:
        cv2.imshow('frame', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    if writer is None:
        fourcc = cv2.VideoWriter_fourcc(*'mp4v')
        writer = cv2.VideoWriter(
            OUTPUT_PATH, fourcc, 25, (frame.shape[1], frame.shape[0]), True)

    if writer:
        writer.write(frame)

cap.release()
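# The writer is only created once a frame has been read, so guard the release.
if writer is not None:
    writer.release()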
print(f'Finished Writing Video to {OUTPUT_PATH}')
cv2.destroyAllWindows()
print('Cleared all windows...')


# 7. View the results

See below the resulting output of the analysis.

In [None]:
from moviepy import VideoFileClip
clip = VideoFileClip("./output.mp4")
clip.display_in_notebook(width=800)