# Motion Detection using OpenCV (A beginner's guide) 

 

This notebook is a step-by-step explanation of a simple OpenCV Motion Detection application. 

This was part of a computer vision project I did during my internship at Honda of Canada MFG. 

I hope you are familiar with some of OpenCV's functions. If not, don't worry: I will explain everything in as much detail as possible. 


## High Level Explanation:

The general idea behind the application is that it takes the first frame from the camera  as a reference. 

This reference frame is a static background against which we are trying to detect motion.  

The program will run & detect changes between the first reference frame and the frames that follow. 

Basically, 

We are going to compute the pixel difference "delta" between the first static reference frame & each frame that follows.

If the difference "delta" is more than the threshold (that we have defined), the program draws a box around the areas where the pixel difference is observed.


We will also record the times at which the object enters & exits the frame. We will store these values in a Pandas dataframe. And finally, we will visualize these times using Bokeh.
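
To make the enter/exit idea concrete, here is a minimal sketch that uses only the standard library. It assumes we already have one True/False "motion" flag per frame plus a timestamp for each frame; the function name `record_transitions` and the sample values are hypothetical, and a plain list stands in for the Pandas dataframe.

```python
from datetime import datetime, timedelta


def record_transitions(motion_flags, timestamps):
    """Return the timestamps at which the motion flag flips (object enters or exits)."""
    times = []
    previous = False  # assumption: the scene starts out empty
    for flag, stamp in zip(motion_flags, timestamps):
        if flag != previous:  # the status changed between two consecutive frames
            times.append(stamp)
        previous = flag
    return times


# Hypothetical data: one flag per frame, frames one second apart
start = datetime(2020, 1, 1, 12, 0, 0)
timestamps = [start + timedelta(seconds=i) for i in range(6)]
motion_flags = [False, True, True, False, False, True]

events = record_transitions(motion_flags, timestamps)
for stamp in events:
    print(stamp)
```

Each recorded time alternates between an "enter" and an "exit", so pairs of values could later be turned into start/end columns.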

# Section 1: Getting started with OpenCV

### Webcam basics: 

__1. Turn on your webcam using OpenCV and Python__

To do this we will use the __.VideoCapture()__ method from OpenCV. The input argument is the index of the camera connected to your computer, counting from 0. In our case there is only 1 camera so we will enter 0. If you had 2 or 3 cameras, the others would be at index 1 or 2.

__2. Release the camera__ 

If you run the first line of code below, __video_object = cv2.VideoCapture(0)__, you will notice that the light on your camera turns on. That is because we have initialized the camera. The light will not turn off on its own. To turn off the camera we use the 
__.release() method__. This turns off the camera, or _releases_ it.

__3. Holding the camera for a certain amount of time__

Now you might ask: what if I want to run the camera for a certain amount of time? For that we will import _time_ and use the __.sleep() method__ from the "time" module to tell Python how long we'd like to hold the camera before it is released.

So far so good...



In [1]:
import cv2 # import OpenCV
import time # import time to hold the camera for a certain amount of time

video_object = cv2.VideoCapture(0) # initialize the camera by creating a video_object

time.sleep(3) # hold the camera for 3 seconds 

video_object.release( ) # release the camera 

__4. Displaying the first frame__: One "frame" at a time

OpenCV handles video one frame at a time: the first frame, then the next, then the next, and so on. 
Let's just start by displaying the first frame. And I know you're already thinking in terms of writing a loop to show all the frames as a video feed...but hang on for now! 

To display anything on your computer screen you first need to grab a frame with yet another method, __.read()__, on your video_object.
This outputs 2 things: 
1. __A Boolean__: True indicates that a frame was read successfully (can be used later to check if the feed is still running etc.) 
2. __A numpy array__ which is basically a representation of the frame as an array of pixel values. This numpy array is very useful as we can perform operations on it directly. 

Finally to show what we captured, we will use the __.imshow()__ method. This takes 2 arguments: 
1. Name of the window that will pop up, & 
2. The frame that you captured using the __.read()__ method
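
To get a feel for what that numpy array looks like without needing a webcam, here is a small sketch with a synthetic frame. The 480x640 size is an assumption (real cameras vary); what matters is the layout: height x width x 3 colour channels, stored in Blue, Green, Red order.

```python
import numpy as np

# Stand-in for the array returned by video_object.read():
# height x width x 3 channels, one unsigned byte (0-255) per channel.
frame = np.zeros((480, 640, 3), dtype=np.uint8)

# OpenCV stores colour frames in Blue, Green, Red order
frame[100, 200] = (255, 0, 0)  # set the pixel at row 100, column 200 to pure blue

height, width, channels = frame.shape
blue, green, red = (int(value) for value in frame[100, 200])

print(height, width, channels)
print(blue, green, red)
```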


__5. Closing the display window:__

We can't just let the window hang there & freeze; we would like to let the user press any key to close the window & stop the script. This is very important because if you don't include the __.waitKey()__ method along with __.destroyAllWindows()__ then your Python kernel might crash.

In [1]:
import cv2 # import OpenCV
import time # import time to hold the camera for a certain amount of time

video_object = cv2.VideoCapture(0) # initialize the camera by creating a video_object

check, frame = video_object.read() # check is a boolean, frame is a numpy array 
                                   
time.sleep(3) # hold the camera for 3 seconds 

cv2.imshow("First Frame Captured", frame) # a window is created named: 1st argument, & displays: 2nd argument


cv2.waitKey(0) # 0 means wait indefinitely until the user presses any key

video_object.release() # the camera is released as soon as a key is pressed

cv2.destroyAllWindows() # All windows are closed


print(type(check)) # make sure check is boolean 
print(type(frame)) # frame is a numpy array

<class 'bool'>
<class 'numpy.ndarray'>


So, now we can turn on the webcam & we understand the basic methods in OpenCV. One important thing we learned here is that Python processes the webcam feed as single images. What does that mean? 
It means that we can apply other methods to each image, such as converting it to gray scale, performing computations etc... this is good. 
    
    
## __Video Feed!__

Ok so now, we want the camera to run a live feed. The simplest way to do it is to put the entire block of code above inside a __while loop__ with its condition set to __True__

However we need to make a few modifications first: 

We can start off by removing __time.sleep()__

Store the return value of __.waitKey()__ in a variable called key

If this variable matches a key we choose (here 'q'), we stop the video feed.

It will make sense when you see the code. Look at the code below first & then read the description in the next Markdown cell



In [2]:
import cv2 # import OpenCV

video_object = cv2.VideoCapture(0) # initialize the camera by creating a video_object

while True:
    check, frame = video_object.read() # save the frame in a variable called frame
    
    if not check: # stop if the camera did not return a frame
        break
    
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) # convert image to gray scale: because it is 1 frame at a time, we can apply color transformation 

    cv2.imshow("Video Feed inside while loop", gray_frame) # create a window but this time we pass the gray_frame as argument to show    
    
    key = cv2.waitKey(1000)  # wait 1 second or 1000 ms before jumping back to the start of the loop
    
    if key == ord('q'):   # if the user pressed 'q' on their keyboard break the loop 
        break
    

video_object.release() # as soon as 'q' is pressed webcam is released 

cv2.destroyAllWindows() # All windows are closed

__What did we just do?__

OK now, so this is how the script works: 

1. We initialize the camera feed in an object called video_object... Basically this turns the camera on

2. We say __while True:__  

   - Save the frame
   - Convert it into a gray-scale image using the __.cvtColor()__ method
   - Show the image
   - Wait for 1000 milliseconds == 1 second 
   - Go back to the start of the loop & start over again 

3. If the user presses 'q' on their keyboard, then break out of the loop, release the camera and close all the windows 

Simple! :) 

Now, you can run the script... but there is  __LAG!!__ 

I feel ya. Here is a small quiz for you: What value in the script can we change to make the feed smoother? 
.
.
.
.
.
You guessed it! 
We can pass a lower number to waitKey() so the while loop runs faster & we get more frames/second

Try using __cv2.waitKey(1)__ & see how it goes. 

Note: waitKey() only accepts integers (the delay in milliseconds)! 
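
One more detail on the value that waitKey() gives back: it returns -1 when no key was pressed during the delay, otherwise an integer key code. On some platforms that integer can carry extra high bits, so a common habit is to keep only the lowest byte with `& 0xFF` before comparing. Here is a small sketch of that check; the helper name `should_quit` is hypothetical and not part of OpenCV.

```python
def should_quit(key_code, quit_char="q"):
    """Return True if the integer returned by cv2.waitKey() matches quit_char."""
    if key_code == -1:  # no key was pressed before the delay ran out
        return False
    # Keep only the lowest byte so extra high bits do not break the comparison
    return (key_code & 0xFF) == ord(quit_char)


print(should_quit(-1))          # nothing pressed
print(should_quit(ord("q")))    # plain 'q'
print(should_quit(0x100071))    # 'q' with an extra high bit set
print(should_quit(ord("a")))    # some other key
```

Inside the loop this would read `if should_quit(cv2.waitKey(1)): break`.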


# Section 2: Motion Detection! 

## Basic architecture of the application: 

1. So we would like to achieve motion detection through pixel difference computation.

2. We would start off by capturing the first frame in our application as our __static background__ 

3. We will store this static background (first frame) as a numpy array, which is basically a big matrix

4. We will subtract this matrix from each of the frames that follow. 

5. If the difference is more than a certain value (threshold), we will say motion is detected & draw a box around it. 

LET'S PROCEED!

### Store the first_frame! 

We start by storing the first frame. It is a little tricky: we will have to use the _continue_ statement

So we initialize a variable named __first_frame__  to _None_

Inside our _while loop_ we write an __if statement__ 

On the first iteration: if the value of first_frame is None then take the value from gray_frame & store it in the first_frame variable. 

Go back to the start of the loop & then run the 2nd iteration.

We add a _continue_ because we don't want to execute the lines below until we have grabbed a second frame to compare against. 

So: on the 1st iteration the first_frame variable is assigned gray_frame, the gray-scale version of the first frame from the __.read()__ method

The _continue_ statement sends us back to the start of the while loop so we can grab the 2nd frame for the comparison computation.

2nd iteration: the if condition is False because the value of first_frame is __NOT None__, as we assigned it a value in the first iteration 

The if block is not executed & we go on to do our computations: 
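
The same pattern can be tried without a camera. In this sketch plain integers stand in for the gray frames (hypothetical values), so you can follow exactly which iterations are skipped by _continue_ and which ones reach the comparison.

```python
frames = [10, 12, 15, 11]  # stand-ins for successive gray frames (hypothetical values)

first_frame = None
deltas = []

for gray_frame in frames:
    if first_frame is None:
        first_frame = gray_frame  # 1st iteration: remember the reference
        continue                  # skip the comparison, go grab the next frame

    # Only reached from the 2nd iteration onwards
    deltas.append(abs(gray_frame - first_frame))

print(first_frame)
print(deltas)
```

The reference is stored once and never overwritten, so every later frame is compared against the very first one.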

## Calculate the delta_frame!: 

Before we apply any delta computations we would like to do some transformations. These are important to get accurate results & to remove noise. 

The first transformation we will apply to all the frames is a:

__.GaussianBlur()__ which takes in three arguments: 

1. The frame to apply the transformation on, 

2. The kernel size (as a tuple of width & height; both must be odd) 

3. Standard deviation of the blur in the x direction. You can read more on OpenCV blurs on: https://docs.opencv.org/3.1.0/d4/d13/tutorial_py_filtering.html

If you don't want to read then we will be using kernel size = (21,21), standard_deviation = 0 (which tells OpenCV to calculate it from the kernel size). These are acceptable numbers for our application. 


Now, we will finally compute the delta_frame using: 

__.absdiff() method__ This takes 2 arguments, which are the two matrices you'd like to calculate the absolute difference between

Finally, you'd like to show the delta_frame for your own understanding.

To do this we will create a new window using the __.imshow()__ method.

__PLEASE NOTE: MOVE OUT OF THE WEBCAM'S VIEW BEFORE RUNNING THE SCRIPT & THEN STEP BACK IN AS AN OBJECT__ to see the delta_frame properly

You will see that in the delta_frame window the background is eliminated & only you appear, a bit like a negative.... You get the idea.



In [21]:
import cv2 

first_frame = None # initialize the first frame to None. This will be our reference static background  

video_object = cv2.VideoCapture(0)


while True:
    
    check, frame = video_object.read() 
    
    if not check: # stop if the camera did not return a frame
        break
    
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) # convert frame to grayscale
    
    gray_frame = cv2.GaussianBlur(gray_frame,(21,21),0) # apply blur to smooth the frame & reduce noise
    
    if first_frame is None: 
        first_frame = gray_frame
        continue  # grab the first frame on 1st iteration & go back to the start of the loop
    
    delta_frame = cv2.absdiff(first_frame, gray_frame)  # calculate the difference between the first frame & the frames that follow 
     
        
    cv2.imshow("delta_frame", delta_frame)    # show the delta frame feed  
    cv2.imshow("Video Feed inside while loop", gray_frame)     # just the normal gray scale & blurred feed
    
    key = cv2.waitKey(1)  
    if key == ord('q'):    
        break
    

video_object.release()  
cv2.destroyAllWindows() 

## Calculate the Threshold!

Now that we have our delta_frame we would like to apply a threshold to it.

The idea is that delta_frame is a numpy matrix with integer values obtained by taking the difference between the first_frame & the current frame. 

The higher these values in the matrix, the greater the difference between the first_frame & the current frame...so something changed between the two

You can check out the delta_frame matrix by simply typing 
__print(delta_frame)__ 



To apply the threshold we use: 

__.threshold() method__ the method takes in 4 arguments: 

1. The matrix to threshold, in our case delta_frame
2. The threshold limit. I set it to be 30. You can adjust it to your liking
3. The value to assign to the pixels that are more than the threshold. I set it to be 255, which makes them white
4. The threshold method, in our case __binary threshold__ 

The __.threshold()__ method returns a tuple with 2 values: the threshold limit that was used & the thresholded frame. 

We need the 2nd value of this tuple, which is the actual frame. We access it by indexing with [1] 


In summary: 
So, if the intensity of the pixel is higher than threshold limit we defined, then the new pixel intensity is set to 255. Otherwise, the pixels are set to 0. 
You can read on thresholding at: https://docs.opencv.org/2.4/doc/tutorials/imgproc/threshold/threshold.html

We can go ahead & use the threshold_frame & draw boxes on it to indicate motion but before we do that we would want to make the white areas smoother. 


To _smooth_ out the threshold_frame even more we will finally use: 

__.dilate() method__, which grows the white areas so that small holes & gaps get filled in. It takes 3 arguments: 

1. The threshold_frame to perform smoothing on.
2. The kernel array for custom dilation. I choose None, which makes OpenCV use a default 3x3 kernel
3. The number of iterations to perform the smoothing. The higher the smoother. I choose 2 as our value.






In [3]:
import cv2 

first_frame = None 
video_object = cv2.VideoCapture(0)


while True:
    
    check, frame = video_object.read() 
    
    if not check: # stop if the camera did not return a frame
        break
    
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    
    gray_frame = cv2.GaussianBlur(gray_frame,(21,21),0) 
    
    if first_frame is None: 
        first_frame = gray_frame
        continue  
    
    delta_frame = cv2.absdiff(first_frame, gray_frame)
    
    threshold_frame = cv2.threshold(delta_frame,30,255,cv2.THRESH_BINARY)[1] # extracting the threshold_frame 
    
    threshold_frame = cv2.dilate(threshold_frame, None, iterations = 2) # applying threshold smoothing using dilate
        
    cv2.imshow("threshold frame", threshold_frame)
    
    key = cv2.waitKey(1)  
    if key == ord('q'):    
        break
    

video_object.release()  
cv2.destroyAllWindows() 

## Finding & Drawing Contours! 

To find ALL the contours we use OpenCV's:

__.findContours() method__ the method takes 3 arguments: 

 
1. The frame to find contours from. 
2. The contour retrieval mode. 
3. The approximation method.

How many values it outputs depends on your OpenCV version. In OpenCV 3.x it outputs 3 values: 

1. The modified image
2. The contours
3. The hierarchy

In OpenCV 4.x (and 2.x) it outputs only the last 2: the contours & the hierarchy.

You can read more on: https://docs.opencv.org/3.4.0/d4/d73/tutorial_py_contours_begin.html


We are interested in the contours, which are always the 2nd-to-last output from the findContours() method.

The contours output is all the contours in the current frame, but we are only interested in the contours whose area is, let's just say for example, bigger than X pixels. 

So we will have to iterate over the contours & only draw the ones whose area is bigger than X pixels. 

To do so we will use 2 more functions as follows: 


__.boundingRect():__ It takes in the contour as input & returns 4 values: 

the x coordinate & y coordinate of the top-left corner, plus the width & height of the box around the contour

Then we use the: 

__.rectangle()__ method to draw a rectangle using these coordinates. 

The inputs are as follows: 

1. The original frame. 
2. x, y as the upper left corner of the box 
3. x+w & y+h as the lower right corner of the box 
4. The color of the box in the form of a (Blue, Green, Red) tuple
5. The line thickness of the box in pixels 





In [1]:
import cv2 

first_frame = None 
video_object = cv2.VideoCapture(0)


while True:
    
    check, frame = video_object.read() 
    
    if not check: # stop if the camera did not return a frame
        break
    
    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    
    gray_frame = cv2.GaussianBlur(gray_frame,(21,21),0) 
    
    if first_frame is None: 
        first_frame = gray_frame
        continue  
    
    ## delta_frame calculation & smoothing
    
    delta_frame = cv2.absdiff(first_frame, gray_frame)
    threshold_frame = cv2.threshold(delta_frame,30,255,cv2.THRESH_BINARY)[1]  
    threshold_frame = cv2.dilate(threshold_frame, None, iterations = 2) 
    
    
    ## Finding & drawing contours 
    
    contours = cv2.findContours(threshold_frame.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2] ## finding contours: they are the 2nd-to-last output in OpenCV 2.x, 3.x & 4.x
    
    for contour in contours:
        
        if cv2.contourArea(contour) < 1000:
            continue
        
        (x,y,w,h) = cv2.boundingRect(contour)
        
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0,255,0), 3)
    
    
    cv2.imshow("Motion Detection", frame)
               
    key = cv2.waitKey(1)  
    if key == ord('q'):    
        break
    

video_object.release()  
cv2.destroyAllWindows() 

# Voila! 

###  NOTE: Step out of the webcam's view, then press run, step back in front of the webcam & you will see a green box drawn around you. 

### That's the motion detection.  