# Motion Detection in Videos

We are going to cover background subtraction and erosion, and explain how these techniques can be used to isolate foreground objects and thereby detect motion.

In [15]:
import cv2
import numpy as np

We are going to build a model of the background scene of the video based on a number of recent video frames and then compare that model to the current frame.

We can then create a foreground mask that quantifies the difference between the background model and the current frame. The highlighted (non-zero) portions of the foreground mask can be interpreted as the regions of the video that contain motion.
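To make the idea concrete, here is a minimal sketch that uses only NumPy. It is a simplified stand-in, not the KNN method used later in this notebook: the background model is just the per-pixel median of recent frames, and `foreground_mask` and its `threshold` parameter are hypothetical names chosen for this illustration.

```python
import numpy as np

def foreground_mask(history, current, threshold=30):
    """Return a 0/255 mask of pixels that differ from the background model.

    history:   array of shape (N, H, W) holding recent grayscale frames
    current:   array of shape (H, W), the frame to compare
    threshold: hypothetical cut-off on the absolute difference
    """
    # Background model: per-pixel median of the recent frames.
    background = np.median(history, axis=0)
    # Difference between the current frame and the model.
    diff = np.abs(current.astype(np.float32) - background)
    return np.where(diff > threshold, 255, 0).astype(np.uint8)

# Synthetic scene: a static gray background, then a bright 2x2 object appears.
history = np.full((5, 6, 6), 100, dtype=np.uint8)
current = history[0].copy()
current[2:4, 2:4] = 220

mask = foreground_mask(history, current)
print(mask)
```

Only the four pixels covered by the bright object are non-zero in the printed mask; everything that matches the background model stays at zero.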

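To remove noisy pixels from the foreground mask we use an operation called *erosion*. Applying erosion to the foreground mask turns small, isolated white regions black, so the eroded foreground mask mostly keeps the larger regions that correspond to actual motion. When nothing is moving, the eroded mask can be completely black.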
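The following sketch shows what erosion computes on a binary mask, written from scratch in NumPy purely for illustration. `erode_binary` is a hypothetical helper, not an OpenCV function; the notebook itself uses `cv2.erode` later on.

```python
import numpy as np

def erode_binary(mask, ksize=3):
    """Erode a 0/255 mask with a ksize x ksize square kernel (illustrative only)."""
    pad = ksize // 2
    # Pad with 255 so that the image border does not erode anything by itself.
    padded = np.pad(mask, pad, mode="constant", constant_values=255)
    out = np.full_like(mask, 255)
    # A pixel survives only if every pixel under the kernel is non-zero,
    # which is the minimum over all shifted copies of the mask.
    for dy in range(ksize):
        for dx in range(ksize):
            window = padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
            out = np.minimum(out, window)
    return out

mask = np.zeros((10, 10), dtype=np.uint8)
mask[2:8, 2:8] = 255   # a 6x6 blob standing in for genuine motion
mask[0, 9] = 255       # an isolated pixel standing in for noise

eroded = erode_binary(mask)
print(np.count_nonzero(mask), np.count_nonzero(eroded))
```

The isolated pixel disappears entirely, while the blob only shrinks by one pixel on each side. That shrinkage is the price of erosion: a larger kernel removes more noise but also eats further into real foreground regions.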

---

There are several functions in OpenCV that we'll be using to implement motion detection.

+ createBackgroundSubtractorKNN(): Creates a KNN background subtractor. It takes no required arguments, but it has three optional arguments. The one we use here is:

1. history: The number of previous frames in the video stream that are used to build the model of the background scene

One method of the returned background subtractor object that we use is:

1. apply(): Creates a foreground mask. It takes one required argument, which is an image (here, a video frame)

Once we have the foreground mask, the next step is to apply an erosion operation to it. We pass the foreground mask to the **erode** function along with its second required argument, the kernel, and it returns an eroded foreground mask.


Once we have an eroded foreground mask, we want to identify all the non-zero pixels in it so that we can locate the region where motion is occurring. For that purpose we use the **findNonZero** function. It takes one required argument, the mask that we produced above, and returns an array of coordinates of all the non-zero pixels in the eroded foreground mask.

Finally, we pass that array of coordinates to the **boundingRect** function. It returns a bounding box that encompasses all of the non-zero pixels in the eroded foreground mask, as a tuple (x, y, w, h): the top-left corner of the rectangle plus its width and height.

In [16]:
# Create the video capture object
input_video = "../motion_test.mp4"
video_cap = cv2.VideoCapture(input_video)
if not video_cap.isOpened():
    print(f"Unable to open: {input_video}")

In [17]:
frame_w = int(video_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_h = int(video_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(video_cap.get(cv2.CAP_PROP_FPS))

size = (frame_w, frame_h)
size_quad = (int(2 * frame_w), int(2 * frame_h))

# "mp4v" is used here because it is a FourCC commonly paired with the .mp4 container.
video_out_quad = cv2.VideoWriter("Video_out_quad.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, size_quad)

In [18]:
# Execution and Analysis
# Convenience function for annotating video frames
def drawBannerText(frame, text, banner_height_percent=0.08, font_scale=.8, text_color=(0, 255, 0), font_thickness=2):
    # Draw a black filled banner across the top of the image frame.
    # banner_height_percent: banner height as a fraction of the frame height.
    banner_height = int(banner_height_percent * frame.shape[0])
    cv2.rectangle(frame, (0, 0), (frame.shape[1], banner_height), (0, 0, 0), thickness=-1)
    
    # Draw text on banner.
    left_offset = 20
    location = (left_offset, int(10 + (banner_height_percent * frame.shape[0]) / 2))
    cv2.putText(frame, text, location, cv2.FONT_HERSHEY_SIMPLEX, font_scale, text_color, font_thickness, cv2.LINE_AA)

In [19]:
bg_sub = cv2.createBackgroundSubtractorKNN(history=200)

In [20]:
# Process video
ksize = (5, 5)
red = (0, 0, 255)
yellow = (0, 255, 255)

# Quad view that will be built.
# ------------------------------
# frame_fg_mask         :  frame
# frame_fg_mask_erode   :  frame_erode

while True:
    ret, frame = video_cap.read()
    
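    # Stop when no more frames can be read.
    if not ret or frame is None:
        break

    # Keep a clean copy of the frame for drawing the erosion-based bounding box.
    frame_erode = frame.copy()
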
    # Stage 1: Motion area based on foreground mask.
    fg_mask = bg_sub.apply(frame)
    motion_area = cv2.findNonZero(fg_mask)  # Return an array of pixel coordinates for all non-zero pixels
    x, y, w, h = cv2.boundingRect(motion_area)  # Give a bounding box that encompasses all of the non-zero pixels
    
    # Stage 2: Motion area based on foreground mask (with erosion)
    fg_mask_erode = cv2.erode(fg_mask, np.ones(ksize, np.uint8))
    motion_area_erode = cv2.findNonZero(fg_mask_erode)
    xe, ye, we, he = cv2.boundingRect(motion_area_erode)
    
    # Draw bounding box for motion area based on foreground mask
    if motion_area is not None:
        cv2.rectangle(frame, (x, y), (x + w, y + h), red, thickness=6)
        
    # Draw bounding box for motion area based on foreground mask (with erosion)
    if motion_area_erode is not None:
        cv2.rectangle(frame_erode, (xe, ye), (xe + we, ye + he), red, thickness=6)
        
    # Convert foreground masks to color so we can build a composite video with color annotations.
    frame_fg_mask = cv2.cvtColor(fg_mask, cv2.COLOR_GRAY2BGR)
    frame_fg_mask_erode = cv2.cvtColor(fg_mask_erode, cv2.COLOR_GRAY2BGR)
    
    # Annotate each video frame.
    drawBannerText(frame_fg_mask, "Foreground Mask")
    drawBannerText(frame_fg_mask_erode, "Foreground Mask Eroded")
    
    # Build quad view.
    frame_top = np.hstack([frame_fg_mask, frame])
    frame_bot = np.hstack([frame_fg_mask_erode, frame_erode])
    frame_composite = np.vstack([frame_top, frame_bot])
    
    # Draw a horizontal dividing line across the middle of the quad view.
    fc_h, fc_w, _ = frame_composite.shape
    cv2.line(frame_composite, (0, int(fc_h / 2)), (fc_w, int(fc_h / 2)), yellow, thickness=1, lineType=cv2.LINE_AA)
    
    # Write the composite frame to the output video.
    video_out_quad.write(frame_composite)
    
video_cap.release()
video_out_quad.release()