# **Getting Prerequisites**
Before working on the Object Detection module, the following component needs to be installed:
> -  Python

# **Setting Up The Environment**
All the other libraries can be installed with pip (or conda). Note that the OpenCV package on PyPI is named `opencv-python`. The commands are provided below:<br>
> -  pip install opencv-python
> -  pip install numpy
> -  pip install wxPython
> -  pip install pynput
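
After installing, it can help to confirm that the libraries are visible to the Python interpreter in use. The snippet below is a minimal sketch, not part of the original module: the helper name `find_missing` is hypothetical, and it only checks that each module can be located, not that it works with a particular webcam or display.

```python
import importlib.util

# Import names differ from some pip package names:
# opencv-python is imported as cv2, wxPython as wx.
REQUIRED_MODULES = ["cv2", "numpy", "wx", "pynput"]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

If any name is reported as missing, install the matching package with pip before continuing.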

# **Import Required Libraries**

> - OpenCV handles image and video operations
> - NumPy handles array and mathematical operations
> - wxPython is a cross-platform toolkit for creating desktop GUI applications; here it is used to read the display size
> - pynput contains classes for controlling and monitoring the mouse

In [1]:
import cv2
import numpy as np
import wx
from pynput.mouse import Button, Controller

Let's create an object to control the mouse and also capture the display size.

In [2]:
mouse = Controller()
app = wx.App(False)
sx, sy = wx.GetDisplaySize()

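Now, define the range of HSV values for the color we want to track. In OpenCV's 8-bit HSV representation, hue runs from 0 to 179 and saturation and value run from 0 to 255, so the bounds below select the upper red band of the hue scale.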

In [3]:
lowerBound = np.array([170, 80, 110])
upperBound = np.array([179, 255, 255])
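
To pick bounds for a different marker color, we can convert a known RGB color to OpenCV's HSV scale and check whether it falls inside the range. The sketch below uses only the standard-library `colorsys` module, so its results should closely match, but are not guaranteed to be identical to, what `cv2.cvtColor` produces. The helper names and the sample colors are assumptions for illustration.

```python
import colorsys

LOWER = (170, 80, 110)   # same values as lowerBound
UPPER = (179, 255, 255)  # same values as upperBound


def rgb_to_opencv_hsv(r, g, b):
    """Convert 8-bit RGB to OpenCV's 8-bit HSV scale (H: 0-179, S and V: 0-255)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return (int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255)))


def in_range(hsv, lower=LOWER, upper=UPPER):
    """Inclusive per-channel test, like the check cv2.inRange applies to each pixel."""
    return all(lo <= c <= hi for c, lo, hi in zip(hsv, lower, upper))


crimson = rgb_to_opencv_hsv(220, 20, 60)   # a pinkish red
pure_red = rgb_to_opencv_hsv(255, 0, 0)    # hue 0, at the other end of the hue scale

print(crimson, in_range(crimson))
print(pure_red, in_range(pure_red))
```

Red sits at both ends of the hue scale, so a range of 170 to 179 accepts the crimson sample but rejects pure red, whose hue is 0.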

The code below opens the built-in webcam and requests a capture resolution of 320x240.

In [4]:
capx, capy = 320, 240
cap = cv2.VideoCapture(0)
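# Property IDs 3 and 4 are the frame width and height
cap.set(cv2.CAP_PROP_FRAME_WIDTH, capx)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, capy)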

True

Morphological transformations are simple operations based on the shape of an image. They are normally performed on binary images. They need two inputs: the original image and a structuring element, or kernel, which decides the nature of the operation. The two basic morphological operators are Erosion and Dilation. Variants such as Opening, Closing and Gradient are built from them.

Opening is erosion followed by dilation. It is useful for removing noise.
> <img src="images/opening.png" alt="Opening: erosion followed by dilation" title="Opening" />

Closing is the reverse of Opening: dilation followed by erosion. It is useful for closing small holes inside the foreground objects, or small black points on the object.
> <img src="images/closing.png" alt="Closing: dilation followed by erosion" title="Closing" />

The above operations will be used later in the module. To use them, we first need to define the kernels (arrays of ones) that these operations work with.

In [5]:
kernelOpen = np.ones((5, 5))
kernelClose = np.ones((20, 20))
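
To see what opening and closing do, here is a minimal pure-Python sketch on a tiny binary mask. It is not how OpenCV implements `cv2.morphologyEx`: it assumes a 3x3 square kernel (instead of the 5x5 and 20x20 kernels above) to keep the example readable, and it simply ignores pixels outside the image. All function names are hypothetical.

```python
def _apply(mask, reducer, k=3):
    """Slide a k x k square window over the mask and reduce each neighbourhood."""
    rows, cols = len(mask), len(mask[0])
    r = k // 2
    out = []
    for y in range(rows):
        row = []
        for x in range(cols):
            window = [
                mask[j][i]
                for j in range(max(0, y - r), min(rows, y + r + 1))
                for i in range(max(0, x - r), min(cols, x + r + 1))
            ]
            row.append(reducer(window))
        out.append(row)
    return out


def erode(mask, k=3):
    return _apply(mask, min, k)   # a pixel survives only if its whole window is 1


def dilate(mask, k=3):
    return _apply(mask, max, k)   # a pixel turns on if any pixel in its window is 1


def opening(mask, k=3):
    return dilate(erode(mask, k), k)


def closing(mask, k=3):
    return erode(dilate(mask, k), k)


# A clean 3x3 blob in the middle of a 7x7 mask
clean = [[1 if 2 <= y <= 4 and 2 <= x <= 4 else 0 for x in range(7)] for y in range(7)]

noisy = [row[:] for row in clean]
noisy[0][6] = 1                   # one isolated noise pixel

holed = [row[:] for row in clean]
holed[3][3] = 0                   # one small hole inside the blob

print(opening(noisy) == clean)    # opening removes the noise pixel
print(closing(holed) == clean)    # closing fills the hole
```

Both comparisons print `True`: the blob itself is unchanged, while the stray pixel and the hole are gone.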

We also need a damping factor. It smooths out the jitter of a shaking hand by moving the pointer only a fraction of the way toward each newly detected position. Let's define a few variables that are used in the damping formula.

In [6]:
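mLocOld = np.array([0, 0])   # previous smoothed pointer position (camera coordinates)
mouseLoc = np.array([0, 0])  # current smoothed pointer position
DampingFactor = 3            # 1 means no smoothing; larger values smooth more

pinchFlag = 0                # 1 while the left mouse button is held down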
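
The damping formula used later in the loop is `mouseLoc = mLocOld + (target - mLocOld) / DampingFactor`. The sketch below reproduces it in plain Python to show its effect over a few frames. The function name `damp` and the sample target are assumptions for illustration.

```python
def damp(old, target, damping_factor=3):
    """Move 1/damping_factor of the way from the old position toward the target."""
    return tuple(o + (t - o) / damping_factor for o, t in zip(old, target))


pos = (0.0, 0.0)
target = (90.0, 30.0)   # hypothetical detected position, held still for three frames
for frame in range(3):
    pos = damp(pos, target)
    print(frame, pos)
```

With a damping factor of 3, the remaining distance to the target shrinks to two thirds of its previous value on every frame, so a sudden jump in the detected position is spread over several frames instead of moving the pointer at once.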

In [7]:
# Bounding box around both markers from the last frame in which two contours were seen
openx, openy, openw, openh = 0, 0, 0, 0

Now, start capturing each frame and convert it from BGR to HSV color. Then threshold it with the HSV bounds and apply the morphological operations OPEN and CLOSE to the resulting mask.

We will use contours in our module. A contour can be explained simply as a curve joining all the continuous points along a boundary that have the same color or intensity. Contours are a useful tool for shape analysis and for object detection and recognition.
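
The module only needs to know how many separate blobs are in the mask and where their bounding boxes are. As a rough illustration of that idea, the sketch below counts 8-connected blobs with a flood fill. This is a swapped-in technique for explanation only, not the border-following method behind `cv2.findContours`, and unlike `cv2.RETR_EXTERNAL` it would also count a blob nested inside a hole of another blob. The function name `find_blobs` is hypothetical.

```python
def find_blobs(mask):
    """Return a bounding box (x, y, w, h) for each 8-connected blob of 1s."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for y0 in range(rows):
        for x0 in range(cols):
            if mask[y0][x0] == 0 or seen[y0][x0]:
                continue
            # Flood fill from this unvisited foreground pixel
            stack = [(x0, y0)]
            seen[y0][x0] = True
            min_x = max_x = x0
            min_y = max_y = y0
            while stack:
                x, y = stack.pop()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        inside = 0 <= nx < cols and 0 <= ny < rows
                        if inside and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((nx, ny))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes


two_markers = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

boxes = find_blobs(two_markers)
print(len(boxes), boxes)
```

Here two blobs are found, which corresponds to the two-contour case that moves the mouse.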

### Contours == 1:
If only one contour is detected, we use the mouse press operation, which means holding down the left mouse button.
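
In the loop, a single contour triggers a press only when its bounding-box area is within 30% of the area of the box that enclosed both markers in the last two-contour frame. This distinguishes two markers that have been pinched together from one marker that is simply alone in view. The check can be sketched as a small function. The name `is_pinch` and the sample sizes are assumptions for illustration.

```python
def is_pinch(w, h, open_w, open_h, tolerance_percent=30):
    """True when the single box's area is close to the last two-marker box area."""
    area = w * h
    if area == 0:
        return False  # guard against division by zero for an empty box
    return abs((area - open_w * open_h) * 100 / area) < tolerance_percent


# The two markers last spanned a 100x40 box
print(is_pinch(90, 40, 100, 40))   # merged blob of similar size
print(is_pinch(30, 30, 100, 40))   # a single small marker
print(is_pinch(90, 40, 0, 0))      # no two-marker frame seen yet
```

Only the first call returns `True`. The last call shows why the loop resets `openw` and `openh` to 0 after a press: with no stored box, the difference is always 100% and no new press is triggered.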

### Contours == 2:
If two contours are detected, we use the mouse position operation, which simply means moving the mouse pointer around the screen.
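
The pointer position is taken from the midpoint between the centers of the two bounding boxes, then scaled from camera coordinates to screen coordinates. The x axis is mirrored because the webcam faces the user. The sketch below shows that mapping without the damping step. The function names, the sample boxes and the 1920x1080 screen size are assumptions for illustration.

```python
def box_center(x, y, w, h):
    """Center of a bounding box given as (x, y, w, h)."""
    return (x + w // 2, y + h // 2)


def midpoint(p, q):
    """Integer midpoint between two points."""
    return ((p[0] + q[0]) // 2, (p[1] + q[1]) // 2)


def cam_to_screen(point, cam_size, screen_size):
    """Scale a camera-frame point to screen pixels, mirroring the x axis."""
    cx, cy = point
    cap_w, cap_h = cam_size
    screen_w, screen_h = screen_size
    return (screen_w - int(cx * screen_w / cap_w), int(cy * screen_h / cap_h))


c1 = box_center(100, 100, 20, 20)
c2 = box_center(180, 120, 20, 20)
mid = midpoint(c1, c2)
print(mid, cam_to_screen(mid, (320, 240), (1920, 1080)))
```

A point left of the frame center ends up right of the screen center, so the pointer follows the hand the way a mirror image would.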

In [8]:
while True:
    ret, img = cap.read()
    if not ret:
        # Stop if the webcam did not return a frame
        break
    # img = cv2.resize(img, (340, 220))

    imgHSV = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

    mask = cv2.inRange(imgHSV, lowerBound, upperBound)
    # Opening removes small noise; closing then fills holes in what remains
    maskOpen = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernelOpen)
    maskClose = cv2.morphologyEx(maskOpen, cv2.MORPH_CLOSE, kernelClose)

    maskFinal = maskClose

    # [-2] picks the contour list in both OpenCV 3 (three return values) and OpenCV 4 (two)
    conts = cv2.findContours(maskFinal.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]

    if len(conts) == 2:

        if pinchFlag == 1:
            pinchFlag = 0
            mouse.release(Button.left)

        x1, y1, w1, h1 = cv2.boundingRect(conts[0])
        x2, y2, w2, h2 = cv2.boundingRect(conts[1])
        cv2.rectangle(img, (x1, y1), (x1 + w1, y1 + h1), (255, 0, 0), 2)
        cv2.rectangle(img, (x2, y2), (x2 + w2, y2 + h2), (255, 0, 0), 2)
        cx1 = x1 + int(w1 / 2)
        cy1 = y1 + int(h1 / 2)
        cx2 = x2 + int(w2 / 2)
        cy2 = y2 + int(h2 / 2)
        cx = int((cx1 + cx2) / 2)
        cy = int((cy1 + cy2) / 2)
        cv2.line(img, (cx1, cy1), (cx2, cy2), (255, 0, 0), 2)
        cv2.circle(img, (cx, cy), 2, (0, 0, 255), 2)

        mouseLoc = mLocOld + ((cx, cy) - mLocOld) / DampingFactor
        mouse.position = (sx - int(mouseLoc[0] * sx / capx), int(mouseLoc[1] * sy / capy))
        while mouse.position != (sx - int(mouseLoc[0] * sx / capx), int(mouseLoc[1] * sy / capy)):
            pass
        mLocOld = mouseLoc
        openx, openy, openw, openh = cv2.boundingRect(np.array([[x1, y1], [x1 + w1, y1 + h1], [x2, y2],
                                                                [x2 + w2, y2 + h2]]))

    elif len(conts) == 1:

        x, y, w, h = cv2.boundingRect(conts[0])

        if pinchFlag == 0:
            if abs((w * h - openw * openh) * 100 / (w * h)) < 30:
                pinchFlag = 1
                mouse.press(Button.left)
                openx, openy, openw, openh = 0, 0, 0, 0

        else:
            cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
            cx = int(x + w / 2)
            cy = int(y + h / 2)
            cv2.circle(img, (cx, cy), int((w + h) / 4), (0, 0, 255), 2)

            mouseLoc = mLocOld + ((cx, cy) - mLocOld) / DampingFactor
            mouse.position = (sx - int(mouseLoc[0] * sx / capx), int(mouseLoc[1] * sy / capy))
            while mouse.position != (sx - int(mouseLoc[0] * sx / capx), int(mouseLoc[1] * sy / capy)):
                pass
            mLocOld = mouseLoc

    cv2.imshow("cap", img)

    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break

Finally, release the webcam and close the display window with the following code.

In [9]:
cap.release()
cv2.destroyAllWindows()