Imports needed for the notebook:

In [None]:
import cv2
import random
import numpy as np

# Read, write and show an image
To read an image, use the function *imread*. The first parameter is the path to the image; the second is optional and determines the mode in which the image is opened. If the file cannot be read, *imread* returns None instead of raising an error.

In [None]:
img = cv2.imread('../assets/img.png', cv2.IMREAD_COLOR)

For instance:
- IMREAD_COLOR converts the image to a 3-channel BGR color image (the default)
- IMREAD_GRAYSCALE converts the image to a single-channel grayscale image
- IMREAD_UNCHANGED returns the loaded image as is (including the alpha channel, if any)

The function *imshow* opens a window showing the image. To keep the window from closing immediately, OpenCV can wait for a keypress with *waitKey*. Its parameter is a delay in milliseconds (0 means wait indefinitely), and it returns the code of the key that was pressed. The window must then be closed explicitly.

In [None]:
cv2.imshow('My image', img)
cv2.waitKey(0) # Will close after any input
cv2.destroyAllWindows()

The function *imwrite* writes an image to a file; the format is chosen from the file extension:

In [None]:
cv2.imwrite('../assets/new_img.png', img)

# Transform the image

An image can be rotated:

In [None]:
img = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)

It can also be resized using *resize*. The first parameter is the image and the second is the desired size of the new image, given as (width, height). Alternatively, pass (0, 0) as the size and use fx and fy to scale each axis by a ratio.

In [None]:
img = cv2.resize(img, (0, 0), fx=0.5, fy=0.5) 

In [None]:
cv2.imshow('My image', img)
cv2.waitKey(0) # Will close after any input
cv2.destroyAllWindows()

# Modify the image

In [None]:
img = cv2.imread('../assets/img.png', cv2.IMREAD_COLOR)

print(img)
print("Type:", type(img))
print("Shape:", img.shape)

An image is a three-dimensional matrix: rows, columns, channels.

Note that OpenCV orders the channels as BGR, not RGB.

Here, the image is a NumPy array, which we can easily modify:

In [None]:
for i in range(100): # The first 100 lines
    for j in range(img.shape[1]): # All columns
        img[i][j] = [random.randint(0, 255), random.randint(0, 255), random.randint(0, 255)]

cv2.imshow('Image with random values', img)
cv2.waitKey()
cv2.destroyAllWindows()
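
The nested loop above touches one pixel at a time, which is slow on large images. As a possible alternative, the same effect can be sketched with a single NumPy assignment. The blank 240x320 array stands in for the loaded image, and the seed is there only to make the example repeatable.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded only to make the example repeatable

# Stand-in for the image loaded with cv2.imread (size chosen arbitrarily).
img = np.zeros((240, 320, 3), dtype=np.uint8)

rows = min(100, img.shape[0])  # guard against images shorter than 100 rows
# Fill the first rows with random values in [0, 255] for every channel at once.
img[:rows] = rng.integers(0, 256, size=(rows, img.shape[1], 3), dtype=np.uint8)

print(img[:rows].shape)  # (100, 320, 3)
```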

You can also copy a part of the image and paste it elsewhere in the image:

In [None]:
tag = img[150:200, 70:100]
img[0:50, 150:180] = tag

cv2.imshow('Image with random values', img)
cv2.waitKey()
cv2.destroyAllWindows()
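
One detail worth knowing when slicing: a NumPy slice such as `tag` is a view on the original array, not a separate copy. The sketch below, on a synthetic array, shows the difference. The names `view` and `snapshot` are chosen for the example.

```python
import numpy as np

img = np.zeros((300, 300, 3), dtype=np.uint8)

view = img[150:200, 70:100]              # a view: shares memory with img
snapshot = img[150:200, 70:100].copy()   # an independent copy

img[150:200, 70:100] = 255               # modify the source region afterwards

print(view.max())      # 255, the view follows the change
print(snapshot.max())  # 0, the copy does not
```

Calling `.copy()` is the safer choice whenever the source region may be overwritten before the patch is pasted.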

# Cameras and Video Capture

You can capture from your camera:

In [None]:
cap = cv2.VideoCapture(0) # 0 corresponds to the camera number if you have several

Or from a video file:

In [None]:
cap = cv2.VideoCapture('../assets/vid.mp4')

Then, using a while loop, we read the next frame and display it in a window. The loop stops when the key 'q' is pressed.
At the end, it is important to release the camera resource.

In [None]:
cap = cv2.VideoCapture(0)

while True:
    retval, frame = cap.read() # retval is False when no frame could be read
    if not retval:
        break

    cv2.imshow('frame', frame)

    if cv2.waitKey(30) == ord('q'):
        break

cap.release() # release the camera resource
cv2.destroyAllWindows()
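
Since *read* reports failure through its first return value, the reading logic can be wrapped in a small generator that stops when no frame is available. This is only a sketch: `iter_frames` and `FakeCapture` are names invented here, and the fake class exists solely so the example runs without a camera.

```python
def iter_frames(cap):
    """Yield frames from `cap` until read() reports a failure.

    `cap` is assumed to behave like cv2.VideoCapture: read() returns
    a (retval, frame) pair. iter_frames is a helper made up for this sketch.
    """
    while True:
        retval, frame = cap.read()
        if not retval:  # e.g. camera unavailable or end of the video file
            break
        yield frame


class FakeCapture:
    """Stand-in capture that delivers three dummy frames, then fails."""

    def __init__(self):
        self.remaining = ['frame 1', 'frame 2', 'frame 3']

    def read(self):
        if self.remaining:
            return True, self.remaining.pop(0)
        return False, None


frames = list(iter_frames(FakeCapture()))
print(frames)  # ['frame 1', 'frame 2', 'frame 3']
```

With a real device, the same helper would be used as `for frame in iter_frames(cap):`, with the *imshow* and *waitKey* calls inside the loop body.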

It is possible to modify the captured frame and then display the modified version.

In the next example we read the next frame, halve its dimensions and place four copies of it in the window. The copies at the bottom left and at the top right are rotated by 180 degrees.

In [None]:
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()

    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))   # property 3 (values by default are floating point)
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) # property 4

    image = np.zeros(frame.shape, np.uint8) 
    smaller_frame = cv2.resize(frame, (0, 0), fx=0.5, fy=0.5)

    image[:height//2, :width//2] = smaller_frame # Top left
    image[height//2:, :width//2] = cv2.rotate(smaller_frame, cv2.ROTATE_180) # Bottom left
    image[:height//2, width//2:] = cv2.rotate(smaller_frame, cv2.ROTATE_180) # Top right
    image[height//2:, width//2:] = smaller_frame # Bottom right

    cv2.imshow('frame', image)

    if cv2.waitKey(30) == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

# Drawing

## Line
Syntax: cv2.line(image, start_point, end_point, color, thickness)

In [None]:
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()

    width = int(cap.get(3))
    height = int(cap.get(4))

    # Drawing a line
    img = cv2.line(frame, (0,0), (width, height), (255, 0, 0), 10)
    img = cv2.line(img, (width,0), (0, height), (0, 255, 0), 5)

    cv2.imshow('frame', img)

    if cv2.waitKey(1) == ord('q'):
        break

cap.release() # release the camera resource
cv2.destroyAllWindows()

## Rectangle
Syntax: cv2.rectangle(image, start_point, end_point, color, thickness)

A thickness of -1 fills the rectangle with the specified color.

In [None]:
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()

    # Drawing a rectangle
    img = cv2.rectangle(frame, (100, 100), (200, 200), (128, 128, 128), 5)

    cv2.imshow('frame', img)

    if cv2.waitKey(1) == ord('q'):
        break

cap.release() # release the camera resource
cv2.destroyAllWindows()

## Circle
Syntax: cv2.circle(image, center_coordinates, radius, color, thickness)

In [None]:
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()

    # Drawing a circle
    img = cv2.circle(frame, (300, 300), 60, (0, 0, 255), -1)

    cv2.imshow('frame', img)

    if cv2.waitKey(1) == ord('q'):
        break

cap.release() # release the camera resource
cv2.destroyAllWindows()

## Text
Syntax: cv2.putText(image, text, org, font, fontScale, color[, thickness[, lineType]])

The origin org is the bottom-left corner of the text, not its top-left corner as one might expect from OpenCV's image coordinates.

According to the documentation, the lineType cv2.LINE_AA (anti-aliased) makes the text look better.

In [None]:
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()

    height = int(cap.get(4))

    # Drawing a text
    font = cv2.FONT_HERSHEY_SIMPLEX
    img = cv2.putText(frame, 'Hello world!', (10, height - 10), font, 2, (0, 0, 0), 5, cv2.LINE_AA)

    cv2.imshow('frame', img)

    if cv2.waitKey(1) == ord('q'):
        break

cap.release() # release the camera resource
cv2.destroyAllWindows()

# Colors and Color Detection

First we convert our BGR image into HSV (Hue, Saturation, Value/Brightness), because the method we will use to extract a color from an image works on an HSV image.

Display the HSV image:

In [None]:
import numpy as np
import cv2

cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    width = int(cap.get(3))
    height = int(cap.get(4))
    
    # convert an image from one color space to another
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    cv2.imshow('frame', hsv)

    if cv2.waitKey(1) == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

We need to pick a lower bound and upper bound for the color we want to extract.

You can use a color picker to find the HSV values.

We then build a mask that is white for each pixel within the HSV range and black elsewhere, and apply it to the frame so that only the pixels in the range keep their color.

In [None]:
import numpy as np
import cv2

cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    width = int(cap.get(3))
    height = int(cap.get(4))
    
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    lower_blue = np.array([90, 50, 50])
    upper_blue = np.array([130, 255, 255])

    mask = cv2.inRange(hsv, lower_blue, upper_blue)
    result = cv2.bitwise_and(frame, frame, mask=mask)
    
    cv2.imshow('frame', result)

    if cv2.waitKey(1) == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

It is also possible to display the mask with:

cv2.imshow('mask', mask)

# Corner detection

Most of the time, edge and corner detection algorithms work better on grayscale images.

*cv2.goodFeaturesToTrack(img, N, quality, distance)* finds corners in the grayscale image.
It finds the N strongest corners in the image using the Shi-Tomasi method. The quality level is a value between 0 and 1 that sets the minimum accepted quality of a corner, relative to the best corner found; corners below it are rejected. The last parameter is the minimum Euclidean distance between detected corners.

The corner positions are returned as floating-point values; we need to convert them to integers in order to use them as pixel coordinates: *corners = corners.astype(int)* (the alias *np.int0* found in older examples was removed in NumPy 2.0).

We can then draw a circle at each corner and draw lines between each pair of corners.


In [None]:
img = cv2.imread('assets/chessboard.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

corners = cv2.goodFeaturesToTrack(gray, 100, 0.01, 10)
corners = corners.astype(int) # np.int0 was removed in NumPy 2.0

for corner in corners:
    x, y = corner[0]
    cv2.circle(img, (int(x), int(y)), 5, (255, 0, 0), -1) # plain ints for the drawing call

# Drawing lines between the corners
for i in range(len(corners)):
    for j in range(i +1, len(corners)):
        corner1 = tuple(int(c) for c in corners[i][0])
        corner2 = tuple(int(c) for c in corners[j][0])
        color = tuple(int(c) for c in np.random.randint(0, 256, size=3))
        cv2.line(img, corner1, corner2, color, 1)

cv2.imshow('Frame', img)
cv2.waitKey()
cv2.destroyAllWindows()