# OpenCV
OpenCV is a library of programming functions aimed mainly at real-time computer vision. We will use it to process images and videos. OpenCV was originally developed by Intel and was later supported by Willow Garage and Itseez. opencv-python is the Python API of OpenCV. It combines the best qualities of the OpenCV C++ API and the Python language.

## Installation

In [2]:
%pip install opencv-python

Collecting opencv-python
  Obtaining dependency information for opencv-python from https://files.pythonhosted.org/packages/8a/6f/8aa049b66bcba8b5a4dc872ecfdbcd8603a96704b070fde22222e479c3d7/opencv_python-4.8.0.76-cp37-abi3-macosx_10_16_x86_64.whl.metadata
  Downloading opencv_python-4.8.0.76-cp37-abi3-macosx_10_16_x86_64.whl.metadata (19 kB)
Downloading opencv_python-4.8.0.76-cp37-abi3-macosx_10_16_x86_64.whl (54.7 MB)
Installing collected packages: opencv-python
Successfully installed opencv-python-4.8.0.76
Note: you may need to restart the kernel to use updated packages.


## Mat - the basic image container
### Goal
We have multiple ways to acquire digital images from the real world: digital cameras, scanners, computed tomography, etc. In every case what we see are images. However, when transferred to a computer, the images are represented as a matrix of numbers. In this section we will learn how to use the Mat class (a NumPy array in Python) to represent images in OpenCV.

### Basics
The Mat class represents a matrix. For an image it is a 2D array in which each element corresponds to a pixel, and each pixel can hold one or more channels. The Mat class is defined in the Core module of OpenCV. In C++ it is declared as follows (the templated variant, with a compile-time element type, is cv::Mat_<_Tp>):

class cv::Mat

In Python, opencv-python represents a Mat as a NumPy array (numpy.ndarray), so a photo can be stored as follows:

In [6]:
# import opencv
import cv2

# read an image; in Python the result is a NumPy array
img1 = cv2.imread('images/cat.webp')

# plain assignment does not copy the pixels: img2 is another name for the same array
img2 = img1

# modify img1
img1[0, 0] = [255, 255, 255]

# what is the difference between the two images?
print(img1[0, 0])
print(img2[0, 0])
# none: img1 and img2 are the same object


[255 255 255]
[255 255 255]


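We can see that img1 and img2 refer to the same image data: changing a pixel through one name changes it for the other. In C++ each Mat has its own header that points to the shared matrix. More interestingly, we can create headers that refer to only a subsection of the full data.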


In [7]:
# subsection of an image
img3 = img1[100:200, 200:300]
img3[0, 0] = [0, 0, 0]
print(img1[100, 200])
print(img3[0, 0])


[0 0 0]
[0 0 0]


Now we may wonder: if the matrix itself may belong to multiple Mat objects, how do we know when to deallocate the matrix? The answer is simple: the matrix is deallocated when the last Mat object referencing it is gone. In other words, the matrix is freed only when its last header is gone. For example, the following code removes every reference to the matrix that img1 points to, so it can be deallocated:

In [8]:
# deallocate the memory
del img1, img2, img3

Sometimes we want an independent copy of the matrix data. In Python we can use the copy() method of the array (the C++ counterparts are Mat::clone() and Mat::copyTo()). For example, the following code copies the data of img1 to img2:

In [12]:
# clone and copy
img1 = cv2.imread('images/cat.webp')
img2 = img1.copy()

# modify img1
img1[0, 0] = [255, 255, 255]

# what is the difference between the two images?
print(img1[0, 0])
print(img2[0, 0])
# img1 and img2 are different objects
del img1, img2

[255 255 255]
[244 243 241]
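
Whether two names share pixel data can also be checked without modifying the image. The following is a minimal sketch using NumPy's np.shares_memory(). It assumes a small synthetic array in place of images/cat.webp so that it runs without the file, and the names alias, roi and independent are only illustrative.

```python
import numpy as np

# synthetic 4x4 BGR image standing in for a loaded photo (assumption)
img = np.zeros((4, 4, 3), dtype=np.uint8)

alias = img                 # assignment: the same array object
roi = img[1:3, 1:3]         # slice: a view on part of the same data
independent = img.copy()    # copy(): separate pixel data

print(alias is img)                         # True
print(np.shares_memory(img, roi))           # True
print(np.shares_memory(img, independent))   # False

# writing through the view changes the original, but not the copy
roi[0, 0] = [255, 255, 255]
print(img[1, 1], independent[1, 1])
```

The slice behaves like the subsection header described above, while copy() gives an array that can be modified safely.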


### Storing methods
The storing method describes how pixel values are stored: we can select the color space and the data type of the pixel value. The color space refers to how we combine color components to create colors. The simplest color space is the grayscale color space, in which each pixel is represented by a single number: its intensity. The higher the number, the brighter the pixel. The grayscale color space is also called the intensity color space.

The RGB color space is the most common color space. In this color space, each pixel is represented by three numbers, which give the intensity of the red, green and blue components of the pixel. The higher the number, the brighter the component. To encode the transparency of a color, a fourth component (alpha) is added. This color space is called RGBA.

There are, however, many other color systems, each with its own advantages:
- RGB (red, green, blue): the most common system; note, however, that OpenCV stores the channels in BGR order by default.
- HSV (hue, saturation, value) and HLS (hue, lightness, saturation): a more natural way to think about colors. We may discard the value or lightness component to make an algorithm more robust to lighting changes.
- YUV (luminance, chrominance): used in video systems, where Y is the luminance component and U and V are the chrominance components.
- YCrCb (luminance, red chrominance, blue chrominance): used for JPEG and MPEG images.
- CIE L*a*b* (lightness, green-red, blue-yellow): designed to approximate human vision. It is a perceptually uniform color space, meaning that equal distances between colors correspond to roughly equal perceived differences, which comes in handy when we need to measure the distance between two colors.

The data type of the pixel value can be unsigned char, signed char, unsigned short, signed short, int, float, double, etc. In C++ the data type is part of the Mat type (or the template parameter of Mat_); in Python it is the dtype of the NumPy array. Note that a wider data type increases the memory required to store each pixel value.

Examples of converting an image to different color spaces and data types are shown below:


In [16]:
# images with different color spaces (BGR, GRAY, HSV) and different data types (uint8, float32, int8)
import numpy as np

img = cv2.imread('images/cat.webp')
print(img.shape)
print(img.dtype)

# convert the image to gray scale
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(img_gray.shape)
print(img_gray.dtype)

# convert the image to HSV
img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
print(img_hsv.shape)
print(img_hsv.dtype)

# convert the image to float32
img_float32 = img.astype(np.float32)
print(img_float32.shape)
print(img_float32.dtype)

# convert the image to uint8
img_uint8 = img.astype(np.uint8)
print(img_uint8.shape)
print(img_uint8.dtype)

# convert the image to signed char (int8); values above 127 wrap around to negative numbers
img_char = img.astype(np.int8)
print(img_char.shape)
print(img_char.dtype)


# deallocate the memory
del img, img_gray, img_hsv, img_float32, img_uint8, img_char

(900, 600, 3)
uint8
(900, 600)
uint8
(900, 600, 3)
uint8
(900, 600, 3)
float32
(900, 600, 3)
uint8
(900, 600, 3)
int8
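
The memory cost of wider data types can be made concrete with the nbytes attribute of a NumPy array. The sketch below assumes a synthetic array with the same shape as the cat image above (900 x 600, 3 channels), so it does not need the file. The sizes dictionary is only there to collect the results.

```python
import numpy as np

# synthetic image with the shape printed above (assumption: content does not matter here)
img = np.zeros((900, 600, 3), dtype=np.uint8)

sizes = {}
for dtype in (np.uint8, np.uint16, np.float32, np.float64):
    converted = img.astype(dtype)
    # itemsize is the number of bytes used by one channel value
    sizes[converted.dtype.name] = converted.nbytes
    print(converted.dtype, converted.itemsize, converted.nbytes)
```

The float32 version takes four times the memory of the uint8 version, and float64 takes eight times.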


### Creating Mat objects explicitly
We can use cv::imwrite() (cv2.imwrite() in Python) to write a matrix to an image file. The first parameter is the name of the file; the file extension selects the image format. The second parameter is the matrix to be written. The optional third parameter is a list of format-specific settings, such as the JPEG quality or the PNG compression level. The following code writes the image to the file images/cat.png:

In [17]:
# imwrite method
img = cv2.imread('images/cat.webp')
cv2.imwrite('images/cat.png', img)

# deallocate the memory
del img

For debugging purposes, it is convenient to see the actual values of a matrix. The following cell creates small matrices explicitly with NumPy and prints them:

In [30]:
# import numpy
import numpy as np

# 2 by 2 image with 0,0,255 color on all pixels
img = np.full((2, 2, 3), (0, 0, 255), dtype=np.uint8)
print("img shape with 0,0,255 color on all pixels")
print(img)
# 2x2 image with 2 channels
print("2x2 image with 2 channels")
img = np.full((2, 2, 2), 255, dtype=np.uint8)
print(img)
# zeros ones and eye
## 2x2 image with 3 channels and all pixels are 0
print("2x2 image with 3 channels and all pixels are 0")
img = np.zeros((2, 2, 3), dtype=np.uint8)
print(img)
## 2x2 image with 3 channels and all pixels are 1
print("2x2 image with 3 channels and all pixels are 1")
img = np.ones((2, 2, 3), dtype=np.uint8)
print(img)
## eye matrix with 2x2 and 1 channel and all pixels are 1 on the diagonal
print("eye matrix with 2x2 and 1 channel and all pixels are 1 on the diagonal")
img = np.eye(2, dtype=np.uint8)
print(img)

del img



img shape with 0,0,255 color on all pixels
[[[  0   0 255]
  [  0   0 255]]

 [[  0   0 255]
  [  0   0 255]]]
2x2 image with 2 channels
[[[255 255]
  [255 255]]

 [[255 255]
  [255 255]]]
2x2 image with 3 channels and all pixels are 0
[[[0 0 0]
  [0 0 0]]

 [[0 0 0]
  [0 0 0]]]
2x2 image with 3 channels and all pixels are 1
[[[1 1 1]
  [1 1 1]]

 [[1 1 1]
  [1 1 1]]]
eye matrix with 2x2 and 1 channel and all pixels are 1 on the diagonal
[[1 0]
 [0 1]]


In [29]:
# clone a row
img = cv2.imread('images/cat.webp')
img_clone = img[0].copy()
img_clone[0] = [255, 255, 255]
print('row')
print(img[0, 0])
print(img_clone[0])

# clone a column
img_clone = img[:, 0].copy()
img_clone[0] = [255, 255, 255]
print('column')
print(img[0, 0])
print(img_clone[0])

del img, img_clone



row
[244 243 241]
[255 255 255]
column
[244 243 241]
[255 255 255]


### Output format
In Python we can use the print() function to print the matrix. NumPy decides the layout, and large arrays are abbreviated, so the output is not always very readable. In C++, OpenCV can print a Mat in several styles through cv::format(). The arrays used in Python have no format() method; the closest equivalents are NumPy's printing controls, such as np.set_printoptions() and np.array2string().
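
As a rough Python counterpart to formatted Mat output, the sketch below uses NumPy's own printing controls. It is not an OpenCV API, and the option values are arbitrary choices for illustration.

```python
import numpy as np

img = np.full((2, 2, 3), (0, 0, 255), dtype=np.uint8)

# temporarily change how arrays are printed (line width, abbreviation threshold)
with np.printoptions(linewidth=120, threshold=50):
    print(img)

# or build a string explicitly, for example with comma-separated values
text = np.array2string(img[0], separator=', ')
print(text)
```

np.printoptions() restores the previous settings when the with block ends, which keeps the change local to one cell.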


## Getting Started with Image

### Goal
In this section we will learn how to use OpenCV to read an image from a file and display the image on the screen. We will also learn how to save an image to a file.

### Source code
This code reads an image from a file and displays it on the screen. The parameters of the imread() function are:
- the name of the file to be read
- a flag that specifies how the image is read. The default value is IMREAD_COLOR, which means that the image is loaded as a 3-channel BGR image. Other possible values are IMREAD_GRAYSCALE and IMREAD_UNCHANGED.

namedWindow() function is used to create a window in which the image is to be displayed. The parameters of the namedWindow() function are:
- the name of the window
- the flags of the window. The default value is WINDOW_AUTOSIZE, which means that the window is automatically sized to fit the image and cannot be resized by the user. Other possible values are WINDOW_NORMAL, which lets the user resize the window, and WINDOW_OPENGL, which means that the window is created with OpenGL support.

imshow() function is used to display the image on the screen. The parameters of the imshow() function are:
- the name of the window in which the image is to be displayed
- the image to be displayed


waitKey() function is used to wait for a key to be pressed. It has the following parameter:
- the number of milliseconds to wait for a key to be pressed. The default value is 0, which means that the program waits indefinitely. The function returns the code of the pressed key, or -1 if no key was pressed before the timeout.

imwrite() function is used to write the image to a file. The parameters of the imwrite() function are:
- the name of the file to be written
- the image to be written to the file

destroyAllWindows() function is used to destroy all the windows created by the program.

In [1]:
import cv2 as cv
import sys

# read the image
img = cv.imread('images/cat.webp')

# check if the image is loaded
if img is None:
    sys.exit("Could not read the image.")

# create a window
cv.namedWindow('image', cv.WINDOW_NORMAL)

# display the image
cv.imshow('image', img)

# wait for a key
k = cv.waitKey(0)

# if the key pressed is 's'
if k == ord('s'):
    # save the image
    cv.imwrite('images/cat.png', img)

# destroy all windows, whichever key was pressed
cv.destroyAllWindows()





## Getting Started with Video

### Goal
In this section we will learn how to use OpenCV to capture video from a camera, play a video from a file and display it on the screen. We will also learn how to save a video to a file.

### Capture video from camera
To capture a video from a camera, we first need to create a VideoCapture object. Its argument is either a device index (0 selects the default camera) or the name of a video file. The VideoCapture object is created as follows:

In [3]:
# capture video from camera
import cv2 as cv
import numpy as np

# create a VideoCapture object
cap = cv.VideoCapture(0)

# check if the camera is opened
if not cap.isOpened():
    print("Cannot open camera")
    exit()

# read the frame
while True:
    # capture frame by frame
    ret, frame = cap.read()

    # if frame is read correctly ret is True
    if not ret:
        print("Can't receive frame (stream end?). Exiting ...")
        break

    # our operations on the frame come here
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    mirror = cv.flip(gray, 1)
    
    cv.imshow('mirror', mirror)
    # wait for a key
    if cv.waitKey(1) == ord('q'):
        break

# release the capture
cap.release()

# destroy all windows
cv.destroyAllWindows()



cap.read() function is used to read the next frame from the video. It takes no required parameters and returns two values:
- a boolean that is True if the frame was read correctly
- the frame that was read

cap.isOpened() function is used to check if the video is opened successfully. It takes no parameters and returns True or False.

cap.release() function is used to release the video. It takes no parameters.

cap.get() function is used to get the properties of the video. The parameters of the cap.get() function are:
- the property to be retrieved, for example cv.CAP_PROP_FRAME_WIDTH

cap.set() function is used to set the properties of the video. The parameters of the cap.set() function are:
- the property to be set
- the value of the property to be set

### Play video from file
To play a video from a file, we first need to create a VideoCapture object. The VideoCapture object is created as follows:


In [5]:
# play video from file
import cv2 as cv
import numpy as np

# create a VideoCapture object
cap = cv.VideoCapture('videos/cute.mp4')

while cap.isOpened():
    # read the frame
    ret, frame = cap.read()

    # if frame is read correctly ret is True
    if not ret:
        print("Can't receive frame (stream end?). Exiting ...")
        break

    # our operations on the frame come here
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    mirror = cv.flip(gray, 1)
    
    cv.imshow('mirror', mirror)
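    # wait 1 ms for a key; a larger delay (about 25 ms) gives a more natural playback speed
    if cv.waitKey(1) == ord('q'):
        break

# release the capture
cap.release()

# destroy all windows
cv.destroyAllWindows()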



### Save video to file
To save a video to a file, we first need to create a VideoWriter object. The VideoWriter object is created as follows:


In [10]:
# Save video
import cv2 as cv
import numpy as np

# create a VideoCapture object
cap = cv.VideoCapture(0)

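# define the codec and create VideoWriter object
# note: MJPG is not a valid tag for the .mp4 container, so FFMPEG falls back to
# 'mp4v' (see the warning below); 'mp4v' with .mp4 or 'MJPG' with .avi avoids it
fourcc = cv.VideoWriter_fourcc(*'MJPG')
# the frame size given to VideoWriter must match the frames passed to out.write()
width = int(cap.get(cv.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv.CAP_PROP_FRAME_HEIGHT))
out = cv.VideoWriter('videos/output.mp4', fourcc, 20.0, (width, height))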

while cap.isOpened():
    # read the frame
    ret, frame = cap.read()

    # if frame is read correctly ret is True
    if not ret:
        print("Can't receive frame (stream end?). Exiting ...")
        break

    # our operations on the frame come here
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    mirror = cv.flip(gray, 1)
    # make it look cartoonish
    mirror = cv.medianBlur(mirror, 5)
    # add edge detection
    mirror = cv.Canny(mirror, 100, 200)
    
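    # the writer expects 3-channel BGR frames by default, so convert the
    # single-channel edge image before writing it
    out.write(cv.cvtColor(mirror, cv.COLOR_GRAY2BGR))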
    
    cv.imshow('mirror', mirror)
    # wait for a key
    if cv.waitKey(1) == ord('q'):
        break

# release the capture
cap.release()

# release the output
out.release()

# destroy all windows
cv.destroyAllWindows()


OpenCV: FFMPEG: tag 0x47504a4d/'MJPG' is not supported with codec id 7 and format 'mp4 / MP4 (MPEG-4 Part 14)'
OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'


out.write() function is used to write a frame to the video. The parameters of the out.write() function are:
- the frame to be written to the video; its size must match the frame size given to VideoWriter

VideoWriter_fourcc() function is used to create a four character code (FourCC) for the video codec. FourCC is a 4-byte code used to specify the video codec. The list of available codes can be found at fourcc.org. The supported codecs are platform dependent. The following codecs work fine for me:

- In Fedora: DIVX, XVID, MJPG, X264, WMV1, WMV2 (XVID is preferable; MJPG results in large video files; X264 gives very small video files)
- In Windows: DIVX (more to be tested and added)
- In OSX: MJPG (.mp4), DIVX (.avi), X264 (.mkv), although the run above shows FFMPEG falling back to mp4v for MJPG with .mp4

The FourCC code is passed as `cv.VideoWriter_fourcc('M','J','P','G')` or `cv.VideoWriter_fourcc(*'MJPG')` for MJPG.


## Getting Started with Drawing

### Goal
In this section we will learn how to use OpenCV to draw lines, rectangles, circles, ellipses, polygons, text and put images on an image.

### Drawing lines
line() function is used to draw a line on an image. The parameters of the line() function are:
- the image on which the line is to be drawn
- the starting point of the line
- the ending point of the line
- the color of the line
- the thickness of the line

In [13]:
# draw a line
import cv2 as cv
import numpy as np

# create a black image
img = np.zeros((512, 512, 3), np.uint8)

# draw a diagonal blue line with thickness of 5 px
img = cv.line(img, (0, 0), (511, 511), (255, 0, 0), 5)

# display the image
cv.imshow('image', img)

# wait for a key
cv.waitKey(0)

# destroy all windows
cv.destroyAllWindows()


### Drawing rectangles
rectangle() function is used to draw a rectangle on an image. The parameters of the rectangle() function are:
- the image on which the rectangle is to be drawn
- the top-left corner of the rectangle
- the bottom-right corner of the rectangle
- the color of the rectangle
- the thickness of the rectangle outline (a negative value such as -1 fills the rectangle)

In [14]:
# draw a rectangle with thickness of 3 px and green color
img = cv.rectangle(img, (384, 0), (510, 128), (0, 255, 0), 3)

# display the image
cv.imshow('image', img)

# wait for a key
cv.waitKey(0)

# destroy all windows
cv.destroyAllWindows()

### Drawing circles
circle() function is used to draw a circle on an image. The parameters of the circle() function are:
- the image on which the circle is to be drawn
- the center of the circle
- the radius of the circle
- the color of the circle
- the thickness of the circle outline (a negative value such as -1 fills the circle)


In [15]:
# draw a filled red circle (a thickness of -1 fills the shape)
img = cv.circle(img, (447, 63), 63, (0, 0, 255), -1)

# display the image
cv.imshow('image', img)

# wait for a key
cv.waitKey(0)

# destroy all windows
cv.destroyAllWindows()

### Drawing ellipses
ellipse() function is used to draw an ellipse on an image. The parameters of the ellipse() function are:
- the image on which the ellipse is to be drawn
- the center of the ellipse
- the axes of the ellipse (half of the length of each main axis)
- the angle of rotation of the ellipse in degrees
- the starting angle of the elliptic arc in degrees
- the ending angle of the elliptic arc in degrees
- the color of the ellipse
- the thickness of the ellipse

In [17]:
# draw a filled blue half ellipse (arc from 0 to 180 degrees; a thickness of -1 fills it)
img = cv.ellipse(img, (256, 256), (100, 50), 0, 0, 180, (255, 0, 0), -1)

# display the image
cv.imshow('image', img)

# wait for a key
cv.waitKey(0)

# destroy all windows
cv.destroyAllWindows()


### Drawing polygons
polylines() function is used to draw one or more polygons on an image. The parameters of the polylines() function are:
- the image on which the polygon is to be drawn
- the points of the polygon
- a flag that tells whether the polygon is closed
- the color of the polygon
- the thickness of the polygon

Note that the points of each polygon are passed as an int32 array of points, and the second parameter is a list of such arrays. In each point the first integer is the x-coordinate and the second integer is the y-coordinate. The points are connected by straight lines. If the closed flag is True, the last point is also connected to the first point. Because the function accepts a list of arrays, it can draw multiple lines in one call, which is a better and faster way to draw a group of lines than calling cv2.line() for each line.


In [19]:
# draw a closed yellow polygon with the default thickness of 1 px
pts = np.array([[10,5],[20,30],[70,20],[50,10]], np.int32)
pts = pts.reshape((-1,1,2))
cv.polylines(img,[pts],True,(0,255,255))

# display the image
cv.imshow('image', img)

# wait for a key
cv.waitKey(0)

# destroy all windows
cv.destroyAllWindows()

### Drawing text
putText() function is used to draw text on an image. The parameters of the putText() function are:
- the image on which the text is to be drawn
- the text to be drawn
- the position of the bottom-left corner of the text
- the font of the text
- the font scale (size) of the text
- the color of the text
- the thickness of the text
- the line type (cv.LINE_AA gives anti-aliased text)

In [20]:
# add text to the image
font = cv.FONT_HERSHEY_SIMPLEX

cv.putText(img,'OpenCV',(10,500), font, 4,(255,255,255),2,cv.LINE_AA)

# display the image
cv.imshow('image', img)

# wait for a key
cv.waitKey(0)

# destroy all windows
cv.destroyAllWindows()

#### OpenCV logo

In [88]:
# OpenCV logo
# create a black image
img = np.zeros((512, 512, 3), np.uint8)

# draw the red C on top center
img = cv.circle(img, (256, 128), 100, (0, 0, 255), -1)
img = cv.circle(img, (256, 128), 50, (0, 0, 0), -1)
# draw a filled triangle
pts = np.array([[256, 128], [190, 256], [320, 256]], np.int32)
pts = pts.reshape((-1, 1, 2))
img = cv.fillPoly(img, [pts], (0, 0, 0))

# draw the green C on bottom left
img = cv.circle(img, (128, 384), 100, (0, 255, 0), -1)
img = cv.circle(img, (128, 384), 50, (0, 0, 0), -1)
# draw a filled triangle for the green C
pts = np.array([[128, 384], [190, 256], [256, 356]], np.int32)
pts = pts.reshape((-1, 1, 2))
img = cv.fillPoly(img, [pts], (0, 0, 0))

# draw the blue C on bottom right
img = cv.circle(img, (384, 384), 100, (255, 0, 0), -1)
img = cv.circle(img, (384, 384), 50, (0, 0, 0), -1)
# draw a filled triangle to open the blue C from the top
pts = np.array([[384, 384], [448, 256], [320, 256]], np.int32)
pts = pts.reshape((-1, 1, 2))
img = cv.fillPoly(img, [pts], (0, 0, 0))

# display the image
cv.imshow('image', img)

# wait for a key
cv.waitKey(0)

# destroy all windows
cv.destroyAllWindows()

![opencv](images/opencv_logo.png)