Image Processing with OpenCV
===

It's time to put together everything we have learned and actually detect an object.

## Problem Definition

Much of the work in computer vision comes down to determining the location of an object. To practice this, we will take a video of the video game Pong and circle the ball in every frame. When we play the video back, the circle should move with the ball.

To start, we need to look at the video and decide which processing steps are required. Let's play the video the same way that we did in notebook 2.

In [None]:
import cv2
import numpy as np

cap = cv2.VideoCapture("../assets/pong.mp4")

while True:
    ret, frame = cap.read()

    # if frame is read correctly ret is True
    if not ret:
        print("Can't receive frame (stream end?). Exiting ...")
        break
    
    cv2.imshow('Pong.mp4', frame)
    
    # If a key was pressed
    if cv2.waitKey(20) > 0:
        break
        
cap.release()
cv2.destroyAllWindows()

For this kind of object tracking, the goal is to turn the input into a binarized grayscale image where the object we are interested in is 255 (white) and the background is 0 (black). This may seem like a lot, but we will take it step by step.

First, we will grab a single frame from the video. It is much easier to work with a single frame than with a whole video.

In [None]:
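cap = cv2.VideoCapture("../assets/pong.mp4")
ret, frame = cap.read()
cap.release()

# Stop early if the first frame could not be read (e.g. wrong path)
if not ret:
    raise RuntimeError("Could not read a frame from ../assets/pong.mp4")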

cv2.imshow('frame', frame)
cv2.waitKey(0)
cv2.destroyAllWindows()

Since we don't care about color, let's convert the frame to grayscale. If we did care about color, we could grab a single color channel or filter by hue.

In [None]:
frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

cv2.imshow('frame_gray', frame_gray)
cv2.waitKey(0)
cv2.destroyAllWindows()
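
To make the conversion less of a black box, here is a minimal NumPy sketch of what a BGR-to-grayscale conversion computes. It assumes the standard luma weights (0.299 for red, 0.587 for green, 0.114 for blue) that OpenCV documents for `COLOR_BGR2GRAY`. The helper `bgr_to_gray` and the tiny `frame_demo` array are made up for illustration, and OpenCV's own rounding may differ by a gray level.

```python
import numpy as np

def bgr_to_gray(image_bgr):
    """Approximate cv2.COLOR_BGR2GRAY with a weighted sum of the channels."""
    # OpenCV stores channels in B, G, R order, so the weights are listed that way.
    weights = np.array([0.114, 0.587, 0.299])
    gray = image_bgr.astype(np.float64) @ weights
    return np.round(gray).astype(np.uint8)

# A hypothetical 1x3 image: one white, one pure red, one black pixel (BGR order).
frame_demo = np.array([[[255, 255, 255], [0, 0, 255], [0, 0, 0]]], dtype=np.uint8)

gray_demo = bgr_to_gray(frame_demo)

# Grabbing a single color channel instead: index 2 is red in BGR order.
red_channel = frame_demo[:, :, 2]
```

Note that the white pixel stays at 255 while pure red drops to a mid-dark gray, which is why brightness alone is enough to separate the white ball from darker elements.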

Now that our image is grayscale, let's binarize it. We are only interested in the ball, which is white, and don't care about the score, which is gray. Let's set our threshold somewhere between gray and white to eliminate the score.

In [None]:
_, frame_binarized = cv2.threshold(frame_gray, 220, 255, cv2.THRESH_BINARY)

cv2.imshow('frame_binarized', frame_binarized)
cv2.waitKey(0)
cv2.destroyAllWindows()
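
As a rough illustration of what the threshold does, the sketch below reproduces the `THRESH_BINARY` rule in plain NumPy: pixels strictly greater than the threshold become the maximum value, and everything else becomes 0. The helper name `binarize` and the sample gray levels (128 standing in for the gray score) are assumptions for this example, not values measured from the video.

```python
import numpy as np

def binarize(gray, thresh=220, maxval=255):
    """NumPy equivalent of cv2.threshold(gray, thresh, maxval, cv2.THRESH_BINARY)."""
    # Pixels strictly greater than thresh become maxval; everything else becomes 0.
    return np.where(gray > thresh, maxval, 0).astype(np.uint8)

# Hypothetical gray levels: background, gray score digits, borderline, white ball.
row = np.array([[0, 128, 220, 255]], dtype=np.uint8)
binarized_row = binarize(row)
```

Only the last pixel survives. A pixel exactly equal to the threshold (220 here) is set to 0, because the comparison is strict.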

Now we need to turn everything that is not the ball black. What remains is the border around the image and the center bar. We will select these regions with NumPy slicing and set them to 0.

In [None]:
frame_filtered = frame_binarized.copy()
frame_filtered[:15] = 0         # top border
frame_filtered[-15:] = 0        # bottom border
frame_filtered[:, :16] = 0      # left border
frame_filtered[:, -16:] = 0     # right border
frame_filtered[:, 311:325] = 0  # center bar

cv2.imshow('frame_filtered', frame_filtered)
cv2.waitKey(0)
cv2.destroyAllWindows()
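
If the slicing syntax is unfamiliar, the same pattern can be tried on a small array first. This is a sketch on a hypothetical 6x8 all-white mask, not on the real frame. The first index selects rows and the second selects columns, so `[:1]` is the top row and `[:, 3:5]` is a vertical band.

```python
import numpy as np

# Hypothetical 6x8 mask that is entirely white, standing in for frame_binarized.
mask = np.full((6, 8), 255, dtype=np.uint8)

cleaned = mask.copy()   # copy so the original mask is left untouched
cleaned[:1] = 0         # top row
cleaned[-1:] = 0        # bottom row
cleaned[:, :1] = 0      # left column
cleaned[:, -1:] = 0     # right column
cleaned[:, 3:5] = 0     # a vertical band in the middle (columns 3 and 4)

remaining = np.count_nonzero(cleaned)
```

Printing `cleaned` shows two white blocks separated by the blacked-out band, which mirrors what happens to the two halves of the Pong court.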

Now let's put it all together. We will play the video and process each frame the same way we processed the single frame.

In [None]:
cap = cv2.VideoCapture("../assets/pong.mp4")

while True:
    ret, frame = cap.read()

    # if frame is read correctly ret is True
    if not ret:
        print("Can't receive frame (stream end?). Exiting ...")
        break

    # Grayscale, then keep only near-white pixels
    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    _, frame_binarized = cv2.threshold(frame_gray, 220, 255, cv2.THRESH_BINARY)

    # Black out the borders and the center bar so only the ball remains
    frame_filtered = frame_binarized.copy()
    frame_filtered[:15] = 0
    frame_filtered[-15:] = 0
    frame_filtered[:, :16] = 0
    frame_filtered[:, -16:] = 0
    frame_filtered[:, 311:325] = 0
    
    cv2.imshow('Pong.mp4', frame_filtered)
    
    # If a key was pressed
    if cv2.waitKey(20) > 0:
        break
        
cap.release()
cv2.destroyAllWindows()
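
With the ball isolated, one possible way to get its location is to average the coordinates of the white pixels. The sketch below is an assumption about how that step could look, not necessarily the approach this notebook takes. `find_ball_center` is a hypothetical helper, and the demo mask is synthetic so the example runs without the video file.

```python
import numpy as np

def find_ball_center(mask):
    """Return the (x, y) centroid of the white pixels, or None if there are none."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None  # the ball is not visible in this frame
    # Round to integers because drawing functions expect pixel coordinates.
    return int(round(xs.mean())), int(round(ys.mean()))

# Hypothetical filtered frame with a 5x5 white "ball".
demo = np.zeros((100, 200), dtype=np.uint8)
demo[40:45, 120:125] = 255

center = find_ball_center(demo)
```

Inside the loop, a non-`None` `center` computed from `frame_filtered` could be passed to `cv2.circle(frame, center, 10, (0, 0, 255), 2)` before calling `cv2.imshow` on `frame`. The radius of 10 and the red color are arbitrary choices for illustration.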