# OpenCV - Basics
OpenCV is a library of programming functions mainly aimed at real-time computer vision.

Computer vision? Deriving insights from media files, e.g. images and videos, often by applying deep learning.

## Imports

In [1]:
import cv2 as cv
import os
import numpy as np

## Reading Images and Video

### Reading Images

imread()
- Takes in a path to an image and returns a matrix of pixels.

imshow()
- Displays the image in a new window.
- Takes in the name of the created window and the matrix of pixels to display (generated using imread()).

waitKey()
- Keyboard binding function that waits for the given delay (in milliseconds) for a key to be pressed.
- Passing in 0 means it'll wait for an infinite amount of time for a key to be pressed.

### Reading Video

VideoCapture()
- Passing in an integer to this function would make use of our webcam (commonly 0 is used)
- Passing in a path to the function reads a readily available video.
- For reading videos, we would use a while loop that reads the video frame by frame.
- Use the same method of displaying images to show this frame and assign a key to break out of the while loop.
- In the end, release the video capture and destroy all windows. 

In [2]:
parent_dir = os.getcwd()
photo_dir = os.path.join(parent_dir,"sample_photos")
photos = os.listdir(photo_dir)

# Read Images

img = cv.imread(os.path.join(photo_dir,photos[0]))

cv.imshow("Demo Pic", img)
cv.waitKey(0)

-1

In [3]:
# Read Video (Uncomment to run, commented to prevent accidental opening of webcam)

# capture = cv.VideoCapture(0) # In this case, we use the webcam

# while True:
#     isTrue, frame = capture.read()

#     if not isTrue: # Stop when no frame could be read (end of video or camera error)
#         break

#     cv.imshow("Video", frame)

#     if cv.waitKey(20) & 0xFF == ord('d'): # Press d key to break
#         break

# capture.release()
# cv.destroyAllWindows()

## Resizing and Rescaling

It is good practice to resize and/or rescale images/videos to reduce processing required.

shape
- Attribute of the image array holding a tuple of the number of rows, columns and channels. (height is shape\[0] while width is shape\[1])

resize()
- Takes in the image and the dimensions (in the form of a width,height tuple) that we want to resize the image to.
- The dimensions can be a scale of the original or specific dimensions that do not need to match the original aspect ratio.

interpolation in the resize() Function:

Based on preference but usually,
- INTER_AREA -> Suitable for shrinking an image.
- INTER_LINEAR/INTER_CUBIC -> Suitable for enlarging an image.


In [4]:
def rescaleFrame(frame):
    scale_value = 0.75

    width = int(frame.shape[1] * scale_value)
    height = int(frame.shape[0] * scale_value)

    dimensions = (width,height)

    return cv.resize(frame,dimensions,interpolation=cv.INTER_AREA)

img = cv.imread(os.path.join(photo_dir,photos[0]))

rescaled_img = rescaleFrame(img)

cv.imshow("Rescaled Image", rescaled_img)
cv.waitKey(0)

# Note that this function can also work with videos by passing the frames one by one in the while loop.


-1

## Drawing Images and Text on Images
We can either draw on imported images or create a blank image using numpy (create a zero matrix)

### Blank Images

Creating a blank image
- Use the np.zeros() function to create a zero matrix.
- Display this matrix.

Changing the colour of the blank image
- Assign all the pixels a certain colour (eg. 0,255,0 for all green).
- This can also be done for specific areas of the image by passing in the indexes
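
A minimal sketch of the blank-image steps above, assuming a 500x500 canvas and arbitrarily chosen slice ranges. Only NumPy is needed to build the arrays; they can then be shown with imshow() like any other image.

```python
import numpy as np

# Blank (black) image: 500 rows x 500 columns x 3 channels, 8-bit unsigned
blank = np.zeros((500, 500, 3), dtype=np.uint8)

# Paint a region green by slicing rows 200-299 and columns 300-399 (BGR order)
blank[200:300, 300:400] = (0, 255, 0)

# Paint every pixel red on a separate copy
red = blank.copy()
red[:] = (0, 0, 255)

print(blank[250, 350], blank[0, 0], red[0, 0])
```

The dtype matters here: OpenCV drawing and display functions expect 8-bit images by default, so `np.uint8` is used instead of NumPy's default float type.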

### Drawing Rectangles

rectangle()
- Takes in the image and the points of where we want to draw the rectangle, the color and thickness.
- To get a filled rectangle, use thickness=cv.FILLED

### Drawing Circles
circle()
- Takes in the image, the center of the circle, radius (in pixels), colour and thickness.

### Drawing Lines
line()
- Takes in the image, the points of where we want to draw the line, color and thickness.

### Writing Text on an Image
putText()
- Takes in the image, the text, where we want the text to begin, the font face (there are built-in ones that can be used), scale, colour and thickness.


In [5]:
img = cv.imread(os.path.join(photo_dir,photos[0]))
center_x = img.shape[1] // 2  # shape[1] is the width (number of columns)
center_y = img.shape[0] // 2  # shape[0] is the height (number of rows)

cv.rectangle(img,(0,0),(200,200), (0,255,0), thickness=2)
cv.circle(img,(center_x,center_y),40,(0,255,0),thickness=5)
cv.line(img, (0,0),(center_x,center_y), (0,255,0), thickness=5)
cv.putText(img, "Elmo is Angry", (center_x, center_y), cv.FONT_HERSHEY_TRIPLEX, 1.0, (255,255,255))

cv.imshow("Demo Pic", img)
cv.waitKey(0)

-1

## Essential Functions in OpenCV

### 1. Converting an image to gray scale.
- The image we initially read is a BGR image (Blue, Green, Red)
- We would normally want to convert this to gray scale in order to deal with the intensity distribution rather than the colour itself.
- We convert by using **cvtColor()** that takes in the image and the colour code (COLOR_BGR2GRAY to convert BGR to gray scale.)

### 2. Blurring an image
- Commonly used to reduce the noise that exists in the image. (Noise can come from bad lighting, for example.)
- One of the blurs that can be used is the **GaussianBlur()** function.

### 3. Edge Cascade
- Used to find the edges that are present in the image.
- A popular choice is the Canny() function.
- Blurring the image first reduces the number of edges that are detected.

### 4. Dilating an Image
- Dilation is applied with a structuring element (kernel); in this case the input image is the set of edges found by Canny.
- Uses the dilate() function, which takes in the Canny version of the image and the kernel.
- Dilation adds pixels to the boundaries of objects in an image.

### 5. Eroding an Image
- Erosion removes pixels on object boundaries.
- Uses the erode() function.
- We can erode a dilated image to obtain a close to original edge cascade.

### 6. Cropping Images
- Since an image is an array of pixels, we can crop it with array slicing (rows first, then columns).







In [6]:
gray = cv.cvtColor(img,cv.COLOR_BGR2GRAY)

cv.imshow("Gray Scale", gray)

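# The third positional argument is sigmaX, so the border type is passed by keyword
blur = cv.GaussianBlur(img, (3,3), sigmaX=0, borderType=cv.BORDER_DEFAULT)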

cv.imshow("Blurred Image", blur)

edges = cv.Canny(img, 125, 175)
cv.imshow("Edges", edges)

kernel = np.ones((3,3), np.uint8)  # 3x3 structuring element

dilated = cv.dilate(edges, kernel, iterations=1)
cv.imshow("Dilated", dilated)

eroded = cv.erode(dilated, kernel, iterations=1)
cv.imshow("Eroded", eroded)

cropped = img[50:200,200:400]
cv.imshow("Cropped", cropped)

cv.waitKey(0)

-1

## Image Transformations

Common techniques include translation, rotation, resizing, flipping and cropping.

### 1. Translation
- Shift an image along the x and y axes. (up, down, left, right or any combination)
- Uses the warpAffine() function, which takes in the image, the transformation matrix and the dimensions of the output image (here the same as the original).
- https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_geometric_transformations/py_geometric_transformations.html

### 2. Rotation
- Rotate an image by some angle.
- OpenCV allows rotating around any specified point (usually center)
- Also uses the warpAffine() function which takes in the image, the rotation matrix generated by getRotationMatrix2D() and the original dimensions.
- A positive rotation angle rotates the image counter-clockwise (hence if we want to rotate clockwise, just pass in a negative value)

### 3. Flip
- Uses the flip() function which takes in the image and the flip code.
- Code 0 -> Flip vertically
- Code 1 -> Flip horizontally
- Code -1 -> Flip both horizontally and vertically

In [7]:
# +ve x = right shift
# -ve x = left shift
# +ve y = down shift (the origin is the top-left corner, so y grows downwards)
# -ve y = up shift

def translate(img, x, y):
    trans_matrix = np.float32([[1,0,x],[0,1,y]])
    dimensions = (img.shape[1],img.shape[0])
    return cv.warpAffine(img, trans_matrix, dimensions)

def rotate(img, angle, rotation_point=None):
    (height, width) = img.shape[:2]

    if rotation_point is None:
        rotation_point = (width//2, height//2)
    
    rot_matrix = cv.getRotationMatrix2D(rotation_point, angle, 1.0)
    dimensions = (width, height)

    return cv.warpAffine(img, rot_matrix, dimensions)

img = cv.imread(os.path.join(photo_dir,photos[0]))
translated = translate(img,100,200)
rotated = rotate(img, 45)
flipped = cv.flip(img, 0)

cv.imshow("Transformed Image", flipped)
cv.waitKey(0)

-1

## Contour Detection
Contours - Boundaries of objects (lines or curves that join the continuous points along the boundary of the object)

From a mathematical point of view, contours and edges are two different things. Contours are useful when we get into shape analysis, object detection and recognition. (We can however still get away with thinking about them as the same)

### Contour Detection with OpenCV

### Method 1
1. Convert the image to gray scale using cvtColor().
2. Grab the edges of the image using Canny()
3. Find contours using the findContours() method that takes in the edges (the canny image), the mode and the contours approximation method.
    - Looks at the edge image and returns the contours and hierarchies.
    - Contours - Essentially a python list of all the coordinates of the contours found in the image.
    - Hierarchies - Hierarchical representation of the contours. Eg. if there is a circle within a rectangle.
    - RETR_LIST - Essentially returns all the contours in a flat list (there are other modes)
    - CHAIN_APPROX_NONE - How do we want to approximate the contours? CHAIN_APPROX_NONE does no approximation and returns every point of each contour (once again there are other methods such as CHAIN_APPROX_SIMPLE, which compresses each contour and keeps only the points that make the most sense, eg. the endpoints of a line)

### Method 2
1. Convert the image to gray scale using cvtColor().
2. Pass the grayscale image into the threshold() function.
    - Thresholding - looks at an image and attempts to binarise the image.
    - The function takes in the threshold value, the maximum value and the thresholding type. (With THRESH_BINARY, if the intensity of a pixel is at or below the threshold value, it is set to black, 0. Else, it is set to the maximum value, here white, 255)
3. The same findContours() function can be used with the threshold image returned.

- A recommendation is to use the Canny method first.

In [20]:
img = cv.imread(os.path.join(photo_dir,photos[0]))
gray = cv.cvtColor(img,cv.COLOR_BGR2GRAY)

blur = cv.GaussianBlur(gray, (5,5), cv.BORDER_DEFAULT)

# With Canny Edge Detection
canny = cv.Canny(blur, 125, 175)
cv.imshow("Edges", canny)

# With Threshold Function
ret, thresh = cv.threshold(gray, 125, 255, cv.THRESH_BINARY)
cv.imshow("Threshold Image", thresh)

contours, hierarchies = cv.findContours(thresh, cv.RETR_LIST, cv.CHAIN_APPROX_NONE)

num_contours = len(contours)
print(f'{num_contours} contour(s) found')

cv.waitKey(0)



210 contour(s) found


-1