## Task 03. Basic Motion Detection

### Goals:
* Basic motion detection 

Now we just want to detect whether or not there is movement in a video frame. In motion detection, we tend to make the following assumption:
The background of our video stream is largely static and unchanging over consecutive frames. Therefore, if we can model the background, we can monitor it for substantial changes. A substantial change normally corresponds to motion in our video.

Obviously, in the real world this assumption can easily fail. Shadows, reflections, changing lighting conditions, and any other change in the environment can make the background look quite different from one frame to the next, and when the background appears to change, it can throw our algorithm off. That’s why the most successful background subtraction/foreground detection systems use fixed, mounted cameras in controlled lighting conditions.

In our case, we will take a single image with our camera that contains no motion, that is, just background. We can then model the background of our video stream using only this still image.

Let's start by importing the necessary packages and defining a convenience resize function.

In [2]:
# import the necessary packages
import numpy as np
import time
import cv2
import datetime

# Convenience resize function that preserves the aspect ratio
def resize(image, width=None, height=None, inter=cv2.INTER_AREA):
    # initialize the dimensions of the image to be resized and
    # grab the image size
    dim = None
    (h, w) = image.shape[:2]

    # if both the width and height are None, then return the
    # original image
    if width is None and height is None:
        return image

    # check to see if the width is None
    if width is None:
        # calculate the ratio of the height and construct the
        # dimensions
        r = height / float(h)
        dim = (int(w * r), height)

    # otherwise, scale by the width (width takes precedence if both are given)
    else:
        # calculate the ratio of the width and construct the
        # dimensions
        r = width / float(w)
        dim = (width, int(h * r))

    # resize the image
    resized = cv2.resize(image, dim, interpolation=inter)

    # return the resized image
    return resized

Again, we start looping over the frames taken by our camera. However, we are now making a big assumption: the frame we capture contains no motion, only background, so we can model the background of our video stream using this single frame.

We’ll also define a string named `text` and initialize it to indicate that the room we are monitoring is “Unoccupied”. If there is indeed activity in the room, we can update this string.

We’ll first resize the image down to a width of 500 pixels, since there is no need to process the large, raw images straight from the video stream. We’ll also convert the image to grayscale, since color has no bearing on our motion detection algorithm.

It’s important to understand that even consecutive frames of a video stream will not be identical! Due to tiny variations in the digital camera sensor, no two frames will be 100% the same; some pixels will almost certainly have different intensity values. To account for this, we apply Gaussian smoothing, which averages pixel intensities over a 21 x 21 neighborhood and smooths out high-frequency noise that could throw our motion detection algorithm off.

Once we press the `q` key, the loop ends and we store the last frame taken by the camera in `firstFrame`. This frame should contain only background.

In [11]:
VIDEODEV = 0
camera = cv2.VideoCapture(VIDEODEV)
assert camera.isOpened()
time.sleep(0.25)
firstFrame = None

# loop over the frames of the video
while True:
    # grab the current frame and initialize the occupied/unoccupied
    # text
    (grabbed, frame) = camera.read()
    text = "Unoccupied"

    # stop if no frame could be read from the camera
    if not grabbed:
        break
 
    # resize the frame, convert it to grayscale, and blur it
    frame = resize(frame, width=500)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (21, 21), 0)
    # draw the text and timestamp on the frame
    cv2.putText(frame, "Room Status: {}".format(text), (10, 20),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
    cv2.putText(frame, datetime.datetime.now().strftime("%A %d %B %Y %I:%M:%S%p"),
                (10, frame.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.35,
                (0, 0, 255), 1)
 
    # show the blurred grayscale preview and record if the user presses a key
    cv2.imshow("Security Feed", gray)
    key = cv2.waitKey(1) & 0xFF
    # if the `q` key is pressed, break from the loop
    if key == ord("q"):
        break
    
# keep the last blurred grayscale frame as the background model
firstFrame = gray
camera.release()
cv2.destroyAllWindows()

Now that we have our background modeled via the `firstFrame` variable, we can use it to compute the difference between this background frame and each new frame from the video stream.

Computing the difference between two frames is a simple subtraction, where we take the absolute value of their corresponding pixel intensity differences:

$$\Delta = |background\_model - current\_frame|$$

Notice that in the frame delta the static background comes out almost black (differences close to 0), while regions that contain motion are much lighter. Larger frame deltas therefore indicate that motion is taking place in that part of the image.
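As a small illustration of why the absolute difference matters, the sketch below uses plain NumPy on three made-up pixel values (the arrays are assumptions chosen for the example). Subtracting `uint8` arrays directly wraps around modulo 256 whenever the result would be negative, so the subtraction is done in a wider signed type before taking the absolute value.

```python
import numpy as np

# made-up pixel intensities for the background model and the current frame
background = np.array([[10, 200, 50]], dtype=np.uint8)
current = np.array([[12, 100, 90]], dtype=np.uint8)

# naive uint8 subtraction wraps around: 10 - 12 becomes 254, 50 - 90 becomes 216
wrapped = background - current

# widen to a signed type first, then take the absolute value
delta = np.abs(background.astype(np.int16) - current.astype(np.int16)).astype(np.uint8)

print(wrapped)  # [[254 100 216]]
print(delta)    # [[  2 100  40]]
```

With the naive subtraction, the nearly unchanged first pixel would look like a huge change. Taking the absolute difference avoids that.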

Next, we’ll threshold `frameDelta` to reveal only the regions of the image with significant changes in pixel intensity. If the delta is 25 (`diffDelta`) or less, we discard the pixel and set it to black (i.e. background). If the delta is greater than 25, we set it to white (i.e. foreground).

Given this thresholded image, it’s simple to apply contour detection to find the outlines of the white regions. We first dilate the thresholded image to fill in holes, then loop over the contours and ignore any whose area is smaller than `min_area`. For each remaining contour, we draw the bounding box surrounding the region of motion on the frame and update our `text` status string to indicate that the room is “Occupied”.

In [12]:
min_area = 500
diffDelta = 25
camera = cv2.VideoCapture(VIDEODEV)
assert camera.isOpened()
time.sleep(0.25)
# loop over the frames of the video
while True:
    # grab the current frame and initialize the occupied/unoccupied
    # text
    (grabbed, frame) = camera.read()
    text = "Unoccupied"

    # stop if no frame could be read from the camera
    if not grabbed:
        break
    
    frame = resize(frame, width=500)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (21, 21), 0)
    
    # compute the absolute difference between the current frame and
    # first frame
    frameDelta = cv2.absdiff(firstFrame, gray)
    thresh = cv2.threshold(frameDelta, diffDelta, 255, cv2.THRESH_BINARY)[1]
 
    # dilate the thresholded image to fill in holes, then find contours
    # on thresholded image
    thresh = cv2.dilate(thresh, None, iterations=2)
    # [-2] selects the contour list in both OpenCV 3 and OpenCV 4
    cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
 
    # loop over the contours
    for c in cnts:
        # if the contour is too small, ignore it
        if cv2.contourArea(c) < min_area:
            continue
        # compute the bounding box for the contour, draw it on the frame,
        # and update the text
        (x, y, w, h) = cv2.boundingRect(c)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        text = "Occupied"

    # draw the text and timestamp on the frame
    cv2.putText(frame, "Room Status: {}".format(text), (10, 20),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
    cv2.putText(frame, datetime.datetime.now().strftime("%A %d %B %Y %I:%M:%S%p"),
                (10, frame.shape[0] - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.35,
                (0, 0, 255), 1)
 
    # show the frame and record if the user presses a key
    cv2.imshow("Security Feed", frame)
    cv2.imshow("Thresh", thresh)
    cv2.imshow("Frame Delta", frameDelta)
    key = cv2.waitKey(1) & 0xFF
 
    # if the `q` key is pressed, break from the loop
    if key == ord("q"):
        break

camera.release()
cv2.destroyAllWindows()