# OpenCV

https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_gui/py_video_display/py_video_display.html

OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library. It was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products.

### How OpenCV Works

How does a computer recognize an image?

Human eyes take in a great deal of information from what they see. A machine instead converts what it sees into numbers and stores them in memory. It does this with pixel values: a pixel is the smallest unit of a digital image that can be displayed on a digital display device, and each pixel is stored as one or more numbers.

![image.png](attachment:e66c9b21-787c-48be-9610-c71298ced523.png)

There are two common ways to represent an image:

1. Grayscale

Grayscale images contain only shades of gray, ranging from black to white. Black is treated as the weakest intensity and white as the strongest. For a grayscale image, the computer assigns each pixel a single value based on its level of darkness (0 for black up to 255 for white in an 8-bit image).

2. RGB

An RGB pixel combines red, green and blue values, which together make a color. The computer reads these values from each pixel and puts the results in an array to be interpreted.

![image.png](attachment:b8917def-959c-4b56-8ddb-6f48026b8cea.png)

![image.png](attachment:3a84f408-04fa-4c7e-8ae0-940d39b79ad4.png)
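
To make the idea of "images as numbers" concrete, here is a minimal sketch that builds a tiny grayscale image and a tiny color image by hand with NumPy. The array contents are made up for illustration; real images are simply much larger arrays of the same kind.

```python
import numpy as np

# A 2x3 grayscale image: one intensity per pixel (0 = black, 255 = white).
gray = np.array([[0, 128, 255],
                 [64, 192, 32]], dtype=np.uint8)

# A 2x3 color image: three values per pixel.
# OpenCV stores them in Blue, Green, Red order rather than R, G, B.
color = np.zeros((2, 3, 3), dtype=np.uint8)
color[0, 0] = (255, 0, 0)   # pure blue in BGR order
color[1, 2] = (0, 0, 255)   # pure red in BGR order

print(gray.shape)   # (rows, columns)
print(color.shape)  # (rows, columns, channels)
```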

### 1. Reading, Writing and Displaying Images with OpenCV

In [None]:
# importing cv
import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

#### How to read image, display, save image


. Here, you will learn how to read an image, how to display it and how to save it back

. You will learn these functions : cv2.imread(), cv2.imshow() , cv2.imwrite()

#### Read an image

Use the function cv2.imread() to read an image. The image should be in the working directory, or you should give its full path.

The second argument is a flag that specifies how the image should be read.

cv2.IMREAD_COLOR : Loads a color image. Any transparency of image will be neglected. It is the default flag.

cv2.IMREAD_GRAYSCALE : Loads image in grayscale mode

cv2.IMREAD_UNCHANGED : Loads image as such including alpha channel

Instead of these three flags, you can simply pass integers 1, 0 or -1 respectively.

In [None]:
img = cv2.imread('ninja.jpg', -1 )# read the image

#### Display an image

Use the function cv2.imshow() to display an image in a window. The window automatically fits to the image size.

The first argument is a window name, which is a string. The second argument is our image. You can create as many windows as you wish, but with different window names.


In [None]:
cv2.imshow('hero', img)
cv2.waitKey(0)
cv2.destroyAllWindows()


#### Write an image

Use the function cv2.imwrite() to save an image.

First argument is the file name, second argument is the image you want to save.

In [None]:
cv2.imwrite('ninja1.png',img)

The program below loads an image in grayscale and displays it, then saves the image and exits if you press ‘s’, or exits without saving if you press the ESC key.

In [None]:
img2 = cv2.imread('ninja.jpg', 0)
cv2.imshow('hero1', img2)
m = cv2.waitKey()
if m == 27: # wait for ESC key to exit
    cv2.destroyAllWindows()
elif m == ord('s'): # wait for 's' key to save and exit
    cv2.imwrite('messigray.png',img2)
    cv2.destroyAllWindows()    

#### Using Matplotlib

Matplotlib is a plotting library for Python that offers a wide variety of plotting methods. You will see more of them in coming articles. Here, you will learn how to display an image with Matplotlib, which also lets you zoom into images, save them and so on.


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import cv2

In [None]:

img3 = cv2.imread('ninja.jpg',0)
plt.imshow(img3, cmap = 'gray', interpolation = 'bicubic')
plt.show()

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import cv2

img3 = cv2.imread('ninja.jpg',0)
plt.imshow(img3, cmap = 'gray', interpolation = 'bicubic')
plt.xticks([]), plt.yticks([])# to hide tick values on X and Y axis
plt.show()

### Dimensions of an image

. Height represents the number of pixel rows in the image or the number of pixels in each column of the image array.

. Width represents the number of pixel columns in the image or the number of pixels in each row of the image array.

. Number of Channels represents the number of components used to represent each pixel.

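. Number of Channels = 3 for a color image read with flag 1, as in the example below (Blue, Green and Red channels). An image with transparency read with cv2.IMREAD_UNCHANGED has 4 channels (Blue, Green, Red and Alpha).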


In [None]:
import numpy as np
import cv2
img = cv2.imread('ninja.jpg', 1)
dimensions = img.shape
height = img.shape[0]
width = img.shape[1]
channels = img.shape[2]
size = img.size  # total number of array elements (height * width * channels)
print('Image Dimension    : ',dimensions)
print('Image Height       : ',height)
print('Image Width        : ',width)
print('Number of Channels : ',channels)
print('size: ', size)

### Resize image

syntax = cv2.resize(src, dsize[, dst[, fx[, fy[, interpolation]]]])

src=[required] source/input image

dsize=[required] desired size for the output image

fx=[optional] scale factor along the horizontal axis

fy=[optional] scale factor along the vertical axis

interpolation = [optional] flag that takes one of the following methods:

INTER_NEAREST : a nearest-neighbor interpolation

INTER_LINEAR : a bilinear interpolation (used by default)

INTER_AREA : resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.

INTER_CUBIC : a bicubic interpolation over a 4×4 pixel neighborhood

INTER_LANCZOS4 : a Lanczos interpolation over an 8×8 pixel neighborhood


In [None]:
import numpy as np
import cv2
img = cv2.imread('ninja.jpg', 1)
width = 750
height = 850
dim = (width, height)
resized = cv2.resize(img, dim, interpolation = cv2.INTER_AREA)
print('Resized Dimensions : ',resized.shape)
 
cv2.imshow("Resized image", resized)
cv2.imwrite('ninja1.png', resized)  # save the resized image
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
import numpy as np
import cv2
img = cv2.imread('eyes.jpg', 1)
width = 750
height = 850
dim = (width, height)
resized = cv2.resize(img, dim, interpolation = cv2.INTER_AREA)
print('Resized Dimensions : ',resized.shape)
 
cv2.imshow("Resized image", resized)
cv2.imwrite('eyes1.png', resized)  # save the resized image
cv2.waitKey(0)
cv2.destroyAllWindows()

#### Read PNG images with Transparency (Alpha) Channel



In [None]:
import numpy as np
import cv2
img = cv2.imread('ninja.jpg', 1)
print(img[250][400])

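The output is the pixel value at row 250, column 400. It contains 3 channels of data (Blue, Green, Red) because the image was read with flag 1; a PNG with transparency read with cv2.IMREAD_UNCHANGED would give a fourth Alpha value.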

### 2. Getting Started with Videos (read video, display video and save video)

. Learn to read video, display video and save video.

. Learn to capture from Camera and display it.

. You will learn these functions : cv2.VideoCapture(), cv2.VideoWriter()

#### Capture Video from Camera

link: https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_gui/py_video_display/py_video_display.html

Often, we have to capture live stream with camera. OpenCV provides a very simple interface to this. Let’s capture a video from the camera (I am using the in-built webcam of my laptop), convert it into grayscale video and display it. Just a simple task to get started.

To capture a video, you need to create a VideoCapture object. Its argument can be either the device index or the name of a video file. Device index is just the number to specify which camera. Normally one camera will be connected (as in my case). So I simply pass 0 (or -1). You can select the second camera by passing 1 and so on. After that, you can capture frame-by-frame. But at the end, don’t forget to release the capture.

ret is a boolean that is True if a frame was read successfully.

frame is the image array for the captured frame; frames arrive at the camera's frame rate, whether that was set explicitly or left at its default.

In [None]:
import numpy as np
import pandas as pd
import cv2

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:  # stop if no frame could be read
        break
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press 'q' to quit
        break
cap.release()
cv2.destroyAllWindows()

#### Capture in grayscale

In [None]:
import numpy as np
import cv2

cap1 = cv2.VideoCapture(0)
while(True):
    ret, frame1 = cap1.read()
    gray1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)  # convert each frame to grayscale
    cv2.imshow('frame1', gray1)
    if cv2.waitKey(1) & 0xFF == ord('g'):
        break
cap1.release()
cv2.destroyAllWindows()

In [None]:
import numpy as np
import cv2

cap1 = cv2.VideoCapture(0)
while(True):
    ret, frame1 = cap1.read()
    frame1 = cv2.Canny(frame1,60,50)
    cv2.imshow('frame1', frame1)
    if cv2.waitKey(1) & 0xFF == ord('g'):
        break
cap1.release()
cv2.destroyAllWindows()

### Capturing video from mobile

In [1]:
import cv2
import numpy as np
import requests
import imutils
import time

video = cv2.VideoCapture(0)
url ='http://192.168.43.142:8080/shot.jpg'

while True :
    r = requests.get(url)
    img_arr = np.array(bytearray(r.content),dtype = np.uint8 )
    img = cv2.imdecode(img_arr, -1)
    cv2.imshow('web cam', img)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # compare against the key code, not the string 'q'
        break
cv2.destroyAllWindows()

ConnectionError: HTTPConnectionPool(host='192.168.43.142', port=8080): Max retries exceeded with url: /shot.jpg (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000027FBB0A5FA0>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))

In [None]:
import urllib
import cv2
import numpy as np
import ssl
from urllib.request import urlopen

ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = 'http://192.168.43.1:8080'

while True:
    imgResp = urlopen(url)
    imgNp = np.array(bytearray(imgResp.read()), dtype=np.uint8)
    img = cv2.imdecode(imgNp, -1)
    cv2.imshow('temp', img)
    q = cv2.waitKey(1)
    if q == ord("q"):
        break

cv2.destroyAllWindows()

### Drawing Functions in OpenCV

Learn to draw different geometric shapes with OpenCV

You will learn these functions : cv2.line(), cv2.circle() , cv2.rectangle(), cv2.ellipse(), cv2.putText()

img : The image where you want to draw the shapes

color : Color of the shape. For BGR, pass it as a tuple, e.g. (255,0,0) for blue. For grayscale, just pass the scalar value.

thickness : Thickness of the line or circle etc. If -1 is passed for closed figures like circles, it will fill the shape. default thickness = 1

lineType : Type of line, whether 8-connected, anti-aliased line etc. By default, it is 8-connected. cv2.LINE_AA gives anti-aliased line which looks great for curves.

#### Drawing Line

To draw a line, you need to pass starting and ending coordinates of line. We will create a black image and draw a blue line on it from top-left to bottom-right corners.

In [None]:
import numpy as np
import cv2

img = np.zeros((512, 512, 3), np.uint8)  # black image: 512x512 pixels, 3 channels, 8-bit
# Draw a blue line (B=255, G=0, R=0) from (x=0, y=0) to (x=511, y=511), 5 pixels thick.
img = cv2.line(img, (0, 0), (511, 511), (255, 0, 0), 5)
cv2.imshow('image', img)
cv2.waitKey()
cv2.destroyAllWindows()


In [None]:
import numpy as np
import cv2

imgs = np.ones((518,518,3))
imgs = cv2.line(imgs, (250,250),(511,511), (255,0,0),5)
imgs = cv2.line(imgs, (0,250), (250,0),(255,0,0), 5)
cv2.imshow('image', imgs)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
import numpy as np
import cv2

imga = cv2.imread('ninja.jpg', 1)
imga = cv2.line(imga, (0,0),(512,512), (0,52, 0), 50)
imga = cv2.line(imga, (512,0),(0,512),(0,0,0),10)
cv2.imshow('images', imga)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### Drawing Rectangle

To draw a rectangle, you need the top-left corner and bottom-right corner of the rectangle. This time we will draw a green rectangle near the top-left corner of the image.

In [None]:
import numpy as np
import cv2
imgs = np.ones((518,518,3))
imgs = cv2.rectangle(imgs,(20,20),(300,300),(0,255,0),3)
cv2.imshow('imaas', imgs)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
import numpy as np
import cv2
img = cv2.imread('ninja.jpg', 1)
img = cv2.rectangle(img, (120,120),(520,620),(0,0,255),-1)
cv2.imshow('thadrr', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### Drawing Circle

To draw a circle, you need its center coordinates and radius. Here we draw a filled green circle on the image.

In [None]:
import numpy as np
import cv2
img = cv2.imread('ninja.jpg', 1)
circles = cv2.circle(img,(750, 450), 150, (0,255,0), -1)
cv2.imshow('circles',circles)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### Drawing Ellipse

To draw an ellipse, we need to pass several arguments. One argument is the center location (x,y). The next argument is the axes lengths (half of the major axis length, half of the minor axis length). angle is the angle of rotation of the ellipse in the anti-clockwise direction. startAngle and endAngle denote the start and end of the ellipse arc, measured in the clockwise direction from the major axis, i.e. giving values 0 and 360 gives the full ellipse. For more details, check the documentation of cv2.ellipse(). The example below draws an ellipse arc from 10 to 360 degrees, rotated by 10 degrees, around the point (400, 350).

In [None]:
import numpy as np
import cv2
img = cv2.imread('ninja.jpg', 1)
img = cv2.ellipse(img,(400,350),(200,50),10,10,360,(0,255,0),3)
cv2.imshow('ellips', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### Drawing Polygon

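To draw a polygon, first you need the coordinates of its vertices. Make those points into an array of shape ROWSx1x2, where ROWS is the number of vertices, and it should be of type int32. Here we draw a small polygon with four vertices in yellow color.

In [None]:
import numpy as np
import cv2
img = cv2.imread('ninja.jpg', 1)
pts = np.array([[10,5],[20,30],[70,20],[50,10]], np.int32)  # polygon vertices as (x, y)
pts = pts.reshape((-1,1,2))  # shape ROWSx1x2
img = cv2.polylines(img,[pts],True,(0,255,255))  # True closes the polygon; (0,255,255) is yellow in BGR
cv2.imshow('polygon', img)
cv2.waitKey(0)
cv2.destroyAllWindows()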

#### Adding Text to Images
To put text in images, you need to specify the following things.

. Text data that you want to write

. Position coordinates of where you want to put it (i.e. the bottom-left corner where the text starts).

. Font type (check the cv2.putText() docs for supported fonts)

. Font scale (specifies the size of the font)

. Regular things like color, thickness, lineType etc. For a better look, lineType = cv2.LINE_AA is recommended.


In [None]:
import numpy as np
import cv2
img = cv2.imread('ninja.jpg', 1)
font = cv2.FONT_HERSHEY_SIMPLEX
img = cv2.putText(img,'Snakeeyes',(10,500), font, 8,(0,0,255),8,cv2.LINE_AA)
cv2.imshow('text', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

## Core operations

### Basic Operations on Images

#### Accessing and Modifying pixel values

You can access a pixel value by its row and column coordinates. For BGR image, it returns an array of Blue, Green, Red values. For grayscale image, just corresponding intensity is returned.

In [None]:
# accessing pixel values
import numpy as np
import cv2
cap = cv2.imread('ninja.jpg', 1)
pixel = cap[150,1090]  # indexed as [row, column], i.e. [y, x]
print(pixel)
# Image datatype is obtained by img.dtype:
dtype = cap.dtype
print(dtype)

In the output above, blue is 109, green is 114 and red is 135 at row 150, column 1090 of the image ninja.jpg.

In [None]:
# modifying pixel values
import numpy as np
import cv2
img = cv2.imread('ninja.jpg', 1)
img[250,340]=(255,255,255)  # assign new BGR values to the pixel at row 250, column 340
pixel=img[250,340]
print(pixel)


In [None]:
# accessing only the blue channel value
blue = img[150,1090,0]
print(blue)
# accessing only the green channel value
green = img[150,1090,1]
print(green)
# accessing only the red channel value
red = img[150,1090,2]
print(red)

### Image ROI(Region of Interest)

Sometimes we need to work with only certain regions of an image. In face and eye detection, for example, the face is first searched for over the entire picture; once a face is found, we select only the face region and search for eyes inside it instead of searching the whole image.

An ROI is again obtained using Numpy indexing. For example, an object such as a ball can be selected and copied to another region in the image.
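
A minimal sketch of selecting and copying an ROI with NumPy slicing. The image and the "ball" are synthetic (a white square on a black background) and the coordinates are made up, so the example runs without any file.

```python
import numpy as np

# Synthetic 100x100 black image with a white 20x20 "ball" patch.
img = np.zeros((100, 100, 3), dtype=np.uint8)
img[10:30, 10:30] = 255

# Select the ROI with slicing: img[row_start:row_end, col_start:col_end].
ball = img[10:30, 10:30].copy()  # copy so later edits do not alias the source

# Paste it into another region of the same size.
img[60:80, 50:70] = ball

print(ball.shape)
print(img[70, 60])  # a pixel inside the pasted patch
```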


#### Splitting and Merging Image Channels

The B, G, R channels of an image can be split into their individual planes when needed. Then, the individual channels can be merged back together to form a BGR image again. This can be performed by:

b,g,r = cv2.split(img)

img = cv2.merge((b,g,r))

Or, to take a single channel with Numpy indexing:

b = img[:,:,0]

Suppose you want to set all the red pixels to zero. You need not split the channels and set one to zero; you can simply use Numpy indexing, which is faster:

img[:,:,2] = 0

cv2.split() is a costly operation (in terms of time), so only use it if necessary. Numpy indexing is much more efficient and should be used if possible.

#### Making borders for images

If you want to create a border around the image, something like a photo frame, you can use the cv2.copyMakeBorder() function. It takes the following arguments:

. src - input image

. top, bottom, left, right - border width in number of pixels in corresponding directions

. borderType - Flag defining what kind of border to be added. It can be one of the following types:

   1. cv2.BORDER_CONSTANT - Adds a constant colored border. The value should be given as the next argument.

   2. cv2.BORDER_REFLECT - Border will be a mirror reflection of the border elements, like this : fedcba|abcdefgh|hgfedcb

   3. cv2.BORDER_REFLECT_101 or cv2.BORDER_DEFAULT - Same as above, but without repeating the edge element, like this : gfedcb|abcdefgh|gfedcba

   4. cv2.BORDER_REPLICATE - Last element is replicated throughout, like this: aaaaaa|abcdefgh|hhhhhhh

   5. cv2.BORDER_WRAP - The image wraps around, so each border is taken from the opposite edge, like this : cdefgh|abcdefgh|abcdefg

. value - Color of border if border type is cv2.BORDER_CONSTANT

ex:![image.png](attachment:68c47570-b740-4bb6-b988-fabaf9dcdf21.png)

In [None]:
import numpy as np
import cv2
img = cv2.imread('ninja.jpg', 1)
blue = (255,0,0)
border = cv2.copyMakeBorder(img,20,10,30,10,cv2.BORDER_CONSTANT, value = blue)
cv2.imshow('image',border)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
import numpy as np
import cv2
img = cv2.imread('ninja1.png',1)
red = (0,0,255)
border = cv2.copyMakeBorder(img,15,10,20,20,cv2.BORDER_WRAP, value = red)
cv2.imshow('imagesd', border)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### Image Rotation

The image can be rotated by any angle (for example 90, 180 or 270 degrees). OpenCV calculates an affine matrix for the rotation and applies it with an affine transformation. In general, an affine transformation does not preserve the angles between lines or the distances between points, although it preserves the ratio of distances between points lying on a line; a pure rotation with scale 1.0 preserves both.

syntax:

M = cv2.getRotationMatrix2D(center, angle, scale)

rotated = cv2.warpAffine(img, M, (w, h))

center: The center of rotation, given as (x, y); usually the center of the image.

angle: The angle in degrees by which the image is rotated in the anti-clockwise direction.

scale: Scale factor applied to the image. The value 1.0 preserves the original size.

M: The 2×3 affine matrix returned by cv2.getRotationMatrix2D().

rotated: ndarray that holds the rotated image data.

In [None]:
import numpy as np
import cv2

img = cv2.imread('eyes.jpg',1)
h, w = img.shape[:2]  # image height (rows) and width (columns)
center = (w / 2, h / 2)  # center of rotation, given as (x, y)
angle90 = 90
angle180 = 180
angle270 = 270
scale = 1.0
# Rotate counterclockwise about the center by 270 degrees
M = cv2.getRotationMatrix2D(center, angle270, scale)
rotated270 = cv2.warpAffine(img, M, (w, h))  # output size is given as (width, height)
cv2.imshow('Image rotated by 270 degrees', rotated270)  
cv2.waitKey(0)   
cv2.destroyAllWindows()  

#### Mouse as a Paint-Brush
 
Here, we create a simple application which draws a circle on an image wherever we double-click on it.

First we create a mouse callback function which is executed when a mouse event takes place. A mouse event can be anything related to the mouse, like left-button down, left-button up, left-button double-click etc. It gives us the coordinates (x,y) for every mouse event. With this event and location, we can do whatever we like. To list all available events, run the following code in a Python terminal.

Learn to handle mouse events in OpenCV

You will learn these functions : cv2.setMouseCallback()

In [None]:
import cv2
events = [i for i in dir(cv2) if 'EVENT' in i]
print (events)

In [None]:
import numpy as np
import cv2

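def draw_circle(event, x, y, flags, param):
    # draw a filled green circle where the left button is double-clicked
    if event == cv2.EVENT_LBUTTONDBLCLK:
        cv2.circle(img, (x, y), 150, (0,255,0), -1)
img = cv2.imread('hero.jpg')
cv2.namedWindow('image')  # the window must exist before a callback is attached
cv2.setMouseCallback('image', draw_circle)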
while(1):
    cv2.imshow('image',img)
    if cv2.waitKey(20) & 0xFF == 27:
        break
cv2.destroyAllWindows()


### Arithmetic Operations on Images
#### Image Addition

You can add two images with the OpenCV function cv2.add(), or simply with the Numpy operation res = img1 + img2. Both images should be of the same size, depth and type, or the second image can just be a scalar value.

There is a difference between OpenCV addition and Numpy addition. OpenCV addition is a saturated operation (results are clipped to the valid range, e.g. 255 for 8-bit images), while Numpy addition is a modulo operation (results wrap around).

The difference is more visible when you add two images. The OpenCV function usually gives the better-looking result, so it is generally better to stick to OpenCV functions.


In [None]:
import numpy as np
import cv2
img1 = cv2.imread('ninja1.jpg',1)
img2 = cv2.imread('eyes.jpg', 1)
img = cv2.add(img1, img2)
cv2.imshow('imagessefx', img)
cv2.waitKey(0)
cv2.destroyAllWindows()


#### Image Blending

This is also image addition, but different weights are given to images so that it gives a feeling of blending or transparency. Images are added as per the equation below:

$$g(x) = (1 - \alpha) f_{0}(x) + \alpha f_{1}(x)$$

In [None]:
import numpy as np
import cv2
img1 = cv2.imread('ninja1.jpg',1)
img2 = cv2.imread('eyes1.jpg', 1)
img = cv2.addWeighted(img1,0.7,img2,0.3, 1)
cv2.imshow('edgesds', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### Bitwise Operations

This includes bitwise AND, OR, NOT and XOR operations. They will be highly useful while extracting any part of the image (as we will see in coming chapters), defining and working with non-rectangular ROI etc. Below we will see an example of how to change a particular region of an image.

I want to put the OpenCV logo above an image. If I add two images, it will change color. If I blend it, I get a transparent effect. But I want it to be opaque. If it was a rectangular region, I could use ROI as we did in the last chapter. But the OpenCV logo is not a rectangular shape. So you can do it with bitwise operations as below.

Syntax: cv2.bitwise_and(source1, source2, destination, mask)

Parameters:

source1: First Input Image array(Single-channel, 8-bit or floating-point)

source2: Second Input Image array(Single-channel, 8-bit or floating-point)

dest: Output array (Similar to the dimensions and type of Input image array)

mask: Operation mask, Input / output 8-bit single-channel mask

In [None]:
# bitwise_AND

import numpy as np
import cv2
img1 = cv2.imread('ninja.jpg')
img2 = cv2.imread('eyes.jpg')
img1 = cv2.resize(img1, (750,750))
img2 = cv2.resize(img2, (750,750))
dest_and = cv2.bitwise_and(img2, img1, mask = None)  # cv2.bitwise_and is applied to the two resized images
cv2.imshow('Bitwise And', dest_and)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
# bitwise_OR
import numpy as np
import cv2
img1 = cv2.imread('ninja.jpg')
img2 = cv2.imread('spider.jpg')
img1 = cv2.resize(img1, (750,750))
img2 = cv2.resize(img2, (750,750))
img = cv2.bitwise_or(img1,img2, mask = None)
cv2.imshow('imgs',img)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
import numpy as np
import cv2
img1 = cv2.imread('ninja.jpg')
img2 = cv2.imread('spider.jpg')
img1 = cv2.resize(img1, (750,750))
img2 = cv2.resize(img2, (750,750))
mask = cv2.cvtColor(img1,cv2.COLOR_BGR2HSV)
mask = cv2.inRange(mask, (110,50,50),(130,255,255))
img = cv2.bitwise_or(img1,img2, mask =mask)
cv2.imshow('imgs',img)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### Performance Measurement and Improvement Techniques

. Learn to measure the performance of your code.

. See some tips to improve the performance of your code.

. You will see these functions : cv2.getTickCount, cv2.getTickFrequency etc.

. cv2.getTickCount function returns the number of clock-cycles after a reference event (like the moment the machine was switched ON) to the moment this function is called. So if you call it before and after the function execution, you get the number of clock-cycles used to execute a function.

. cv2.getTickFrequency function returns the frequency of clock-cycles, or the number of clock-cycles per second.

. You can do the same with the time module. Instead of cv2.getTickCount, use the time.time() function. Then take the difference of the two times.

So to find the time of execution in seconds, you can do the following:

e1 = cv2.getTickCount()

your code execution

e2 = cv2.getTickCount()

time = (e2 - e1)/ cv2.getTickFrequency()

In [None]:
import numpy as np
import cv2
img1 = cv2.imread('ninja.jpg')
e1 = cv2.getTickCount()
for i in range(5,49,2):
    img1 = cv2.medianBlur(img1,i)
e2 = cv2.getTickCount()
t = (e2 - e1)/cv2.getTickFrequency()
print(t)
cv2.imshow('tufjfj', img1)
cv2.waitKey(0)
cv2.destroyAllWindows()


#### Default Optimization in OpenCV

Many of the OpenCV functions are optimized using SSE2, AVX etc. It also contains unoptimized code. So if our system supports these features, we should exploit them (almost all modern-day processors support them). Optimization is enabled by default when compiling, so OpenCV runs the optimized code if it is enabled, else it runs the unoptimized code. You can use cv2.useOptimized() to check whether it is enabled and cv2.setUseOptimized() to enable or disable it.

#### Changing Colorspaces

In this tutorial, you will learn how to convert images from one color-space to another, like BGR $\leftrightarrow$ Gray, BGR $\leftrightarrow$ HSV etc.

In addition to that, we will create an application which extracts a colored object in a video.

You will learn the following functions : cv2.cvtColor(), cv2.inRange() etc.

There are more than 150 color-space conversion methods available in OpenCV. But we will look into only the two most widely used ones, BGR $\leftrightarrow$ Gray and BGR $\leftrightarrow$ HSV.

syntax = cv2.cvtColor(input_image, flag), where flag determines the type of conversion.

For BGR $\rightarrow$ Gray conversion we use the flag cv2.COLOR_BGR2GRAY. Similarly for BGR $\rightarrow$ HSV, we use the flag cv2.COLOR_BGR2HSV. The other flags are the names in the cv2 module that start with COLOR_.

For HSV, the Hue range is [0,179], the Saturation range is [0,255] and the Value range is [0,255]. Different software uses different scales. So if you are comparing OpenCV values with them, you need to normalize these ranges.

In [None]:
import numpy as np
import cv2

img = cv2.imread('spider.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) # BGR to gray
cv2.imshow('grayimage', gray)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
import numpy as np
import cv2
img = cv2.imread('spider.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV) # bgr to HSV(Hue Saturation Value)
cv2.imshow('hsvde',hsv)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### Object Tracking

Now that we know how to convert a BGR image to HSV, we can use this to extract a colored object. In HSV, it is easier to represent a color than in the RGB color-space. In our application, we will try to extract a blue colored object. So here is the method:

1. Take each frame of the video

2. Convert from BGR to HSV color-space

3. Threshold the HSV image for a range of blue color

4. Extract the blue object alone; we can then do whatever we want on that image



In [None]:
# object tracking on image
import cv2
import numpy as np
img = cv2.imread('spider.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
lower_blue = np.array([10,50,70])
upper_blue = np.array([255,255,250]) # HSV range to keep; note OpenCV hue only goes up to 179, and blue is roughly 110-130
mask = cv2.inRange(hsv, lower_blue, upper_blue) # Threshold the HSV image to get only blue colors
res = cv2.bitwise_and(img,img, mask= mask)# Bitwise-AND mask and original image
cv2.imshow('frame',img)
cv2.imshow('mask',mask)
cv2.imshow('res',res)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
# object tracking on live video
import numpy as np
import cv2
cap = cv2.VideoCapture(0)
while(True):
    ret, frame = cap.read()
    hsv=cv2.cvtColor(frame, cv2.COLOR_BGR2HSV) # Convert BGR to HSV
    # define range of blue color in HSV
    lower_blue = np.array([35,56,86])
    upper_blue = np.array([220,255,255])
    # Threshold the HSV image to get only blue colors
    mask = cv2.inRange(hsv, lower_blue, upper_blue)
    # Bitwise-AND mask and original image
    res = cv2.bitwise_and(frame,frame, mask= mask)
    cv2.imshow('frame',frame)
    cv2.imshow('mask',mask)
    cv2.imshow('res',res)
    if cv2.waitKey(1)& 0xFF == ord('w'):
        break
cap.release()
cv2.destroyAllWindows()


#### How to find HSV values to track

This is a common question on stackoverflow.com. It is very simple, and you can use the same function, cv2.cvtColor(). Instead of passing an image, you just pass the BGR values you want. For example, to find the HSV value of green, try the following commands in a Python terminal:

green = np.uint8([[[0,255,0 ]]])

hsv_green = cv2.cvtColor(green,cv2.COLOR_BGR2HSV)

print(hsv_green)

[[[ 60 255 255]]]

Now you take [H-10, 100,100] and [H+10, 255, 255] as the lower bound and upper bound respectively. Apart from this method, you can use any image editing tool like GIMP or any online converter to find these values, but don’t forget to adjust the HSV ranges.

In [None]:
import cv2
import numpy as np
img = cv2.imread('spider.jpg')
hsv_green = cv2.cvtColor(img,cv2.COLOR_BGR2HSV)
cv2.imshow('ertft',hsv_green)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### Image Thresholding

Thresholding:

In digital image processing, thresholding is the simplest method of segmenting images. From a grayscale image, thresholding can be used to create binary images

You will learn these functions : cv2.threshold, cv2.adaptiveThreshold etc.

##### Simple Thresholding

Here, the matter is straightforward. If a pixel value is greater than a threshold value, it is assigned one value (maybe white), else it is assigned another value (maybe black). The function used is cv2.threshold. The first argument is the source image, which should be a grayscale image. The second argument is the threshold value which is used to classify the pixel values. The third argument is the maxVal, which represents the value to be given if the pixel value is more than (sometimes less than) the threshold value. OpenCV provides different styles of thresholding, decided by the fourth parameter of the function. The different types are:

cv2.THRESH_BINARY

cv2.THRESH_BINARY_INV

cv2.THRESH_TRUNC

cv2.THRESH_TOZERO

cv2.THRESH_TOZERO_INV

![image.png](attachment:ae517594-74b0-4e79-a866-5e9e6f177712.png)



In [None]:
import numpy as np
import cv2

img = cv2.imread('hero.jpg',0)  # read as grayscale, as cv2.threshold expects
ret,img1 = cv2.threshold(img, 120,255, cv2.THRESH_BINARY)  # returns (threshold used, binary image)
cv2.imshow('image',img1)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
import numpy as np
import cv2
cap = cv2.VideoCapture(0)
while(True):
    ret,frame = cap.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    ret,gray = cv2.threshold(gray,100,255,cv2.THRESH_BINARY)
    cv2.imshow('gray', gray)
    if cv2.waitKey(1)& 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()

#### Adaptive Thresholding

In the previous section, we used a global value as the threshold value. But this may not be good in all conditions, for example where the image has different lighting conditions in different areas. In that case, we go for adaptive thresholding. Here, the algorithm calculates the threshold for small regions of the image. So we get different thresholds for different regions of the same image, which gives better results for images with varying illumination.

Adaptive Method - It decides how thresholding value is calculated.

cv2.ADAPTIVE_THRESH_MEAN_C : threshold value is the mean of neighbourhood area.

cv2.ADAPTIVE_THRESH_GAUSSIAN_C : threshold value is the weighted sum of neighbourhood values where weights are a gaussian window.

Block Size - It decides the size of neighbourhood area.

C - It is just a constant which is subtracted from the mean or weighted mean calculated.


In [None]:
# adaptive thresholding on image
import numpy as np
import cv2
img = cv2.imread('hero.jpg', 0)
img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, 
                                          cv2.THRESH_BINARY, 199, 5) 
cv2.imshow('images',img)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
# adaptive thresholding on video
import numpy as np
import cv2
cap = cv2.VideoCapture(0)
while(True):
    ret,frame = cap.read()
    gray = cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)
    gray = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, 
                                          cv2.THRESH_BINARY, 199, 5)
    cv2.imshow('frames', gray)
    if cv2.waitKey(1)& 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()

### Otsu’s Binarization

In global thresholding, we used an arbitrary value for threshold value, right? So, how can we know a value we selected is good or not? Answer is, trial and error method. But consider a bimodal image (In simple words, bimodal image is an image whose histogram has two peaks). For that image, we can approximately take a value in the middle of those peaks as threshold value, right ? That is what Otsu binarization does. So in simple words, it automatically calculates a threshold value from image histogram for a bimodal image. (For images which are not bimodal, binarization won’t be accurate.)

For this, our cv2.threshold() function is used, but with an extra flag, cv2.THRESH_OTSU. For the threshold value, simply pass zero. The algorithm then finds the optimal threshold value and returns it as the first output, retVal (the thresholded image is the second output). If Otsu thresholding is not used, retVal is the same as the threshold value you passed.

Check out below example. Input image is a noisy image. In first case, I applied global thresholding for a value of 127. In second case, I applied Otsu’s thresholding directly. In third case, I filtered image with a 5x5 gaussian kernel to remove the noise, then applied Otsu thresholding. See how noise filtering improves the result.

![image.png](attachment:5b90a12d-e9c1-4ad0-a4ec-a175d11fe0ae.png)

In [None]:
import numpy as np
import cv2
img = cv2.imread('hero.jpg',0)
ret, otsu = cv2.threshold(img,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
cv2.imshow('images', otsu)
cv2.waitKey(0)
cv2.destroyAllWindows()

## Geometric Transformations of Images
 
Learn to apply different geometric transformations to images, like translation, rotation, affine transformation etc.

You will see these functions: cv2.getPerspectiveTransform

![image.png](attachment:6f08789a-7c2e-4756-b911-bbc05629066c.png)

OpenCV provides two transformation functions, cv2.warpAffine and cv2.warpPerspective, with which you can have all kinds of transformations. cv2.warpAffine takes a 2x3 transformation matrix while cv2.warpPerspective takes a 3x3 transformation matrix as input.

#### Scaling

Scaling is just resizing of the image. OpenCV comes with a function cv2.resize() for this purpose. The size of the image can be specified manually, or you can specify the scaling factor. Different interpolation methods are available. Preferable interpolation methods are cv2.INTER_AREA for shrinking and cv2.INTER_CUBIC (slow) or cv2.INTER_LINEAR for zooming. By default, the interpolation method is cv2.INTER_LINEAR for all resizing purposes. The example below passes an explicit output size:

In [None]:
import numpy as np
import cv2
img = cv2.imread('hero.jpg')
img = cv2.resize(img,(750,850),interpolation = cv2.INTER_CUBIC)
cv2.imshow('images',img)
cv2.waitKey(0)
cv2.destroyAllWindows()

### Translation

Translation is the shifting of an object’s location. If you know the shift in the (x,y) direction, let it be $(t_x,t_y)$, you can create the transformation matrix $\textbf{M}$ as follows:

$$M = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix}$$

#### Affine Transformation

In an affine transformation, all parallel lines in the original image will still be parallel in the output image. To find the transformation matrix, we need three points from the input image and their corresponding locations in the output image. Then cv2.getAffineTransform will create a 2×3 matrix which is to be passed to cv2.warpAffine.

Third argument of the cv2.warpAffine() function is the size of the output image, which should be in the form of (width, height). Remember width = number of columns, and height = number of rows.

#### cv2.getAffineTransform method

syntax: cv2.getAffineTransform(src, dst)

src: Coordinates of three points (triangle vertices) in the source image.

dst: Coordinates of the corresponding three points in the destination image.

#### cv2.warpAffine method

Syntax: cv2.warpAffine(src, M, dsize, dst, flags, borderMode, borderValue)

src: input image.

dst: output image that has the size dsize and the same type as src.

M: transformation matrix.

dsize: size of the output image.

flags: combination of interpolation methods (see resize() ) and the optional flag WARP_INVERSE_MAP that means that M is the inverse transformation (dst->src).

borderMode: pixel extrapolation method; when borderMode=BORDER_TRANSPARENT, it means that the pixels in the destination image corresponding to the “outliers” in the source image are not modified by the function.

borderValue: value used in case of a constant border; by default, it is 0.

In [None]:
import numpy as np
import cv2
img = cv2.imread('hero.jpg',0)
rows,cols = img.shape

M = np.float32([[1,0,100],[0,1,50]])# m is the 2x3 transformation matrix
dst = cv2.warpAffine(img,M,(cols,rows))

cv2.imshow('img',dst)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
import cv2 
import numpy as np 
from matplotlib import pyplot as plt 
img = cv2.cvtColor(cv2.imread('hero.jpg'), cv2.COLOR_BGR2RGB)  # OpenCV loads BGR, matplotlib expects RGB
rows, cols, ch = img.shape 
pts1 = np.float32([[50, 50], [200, 50],[50, 200]]) 
pts2 = np.float32([[85, 100], [200, 50],[50, 200]])
M = cv2.getAffineTransform(pts1, pts2) 
dst = cv2.warpAffine(img, M, (cols, rows)) 
plt.subplot(121) 
plt.imshow(img) 
plt.title('Input') 
  
plt.subplot(122) 
plt.imshow(dst) 
plt.title('Output') 
  
plt.show() 
  

#### Perspective Transformation

For perspective transformation, you need a 3x3 transformation matrix. Straight lines will remain straight even after the transformation. To find this transformation matrix, you need 4 points on the input image and the corresponding points on the output image. Among these 4 points, no 3 should be collinear. The transformation matrix can then be found with the function cv2.getPerspectiveTransform. Then apply cv2.warpPerspective with this 3x3 transformation matrix.



In [None]:
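import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.cvtColor(cv2.imread('hero.jpg'), cv2.COLOR_BGR2RGB)  # OpenCV loads BGR, matplotlib expects RGB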
rows,cols,ch = img.shape

pts1 = np.float32([[56,65],[368,52],[28,387],[389,390]])
pts2 = np.float32([[0,0],[300,0],[0,300],[300,300]])

M = cv2.getPerspectiveTransform(pts1,pts2)

dst = cv2.warpPerspective(img,M,(300,300))

plt.subplot(121),plt.imshow(img),plt.title('Input')
plt.subplot(122),plt.imshow(dst),plt.title('Output')
plt.show()

### Smoothing Images

Blur images with various low pass filters

Apply custom-made filters to images (2D convolution)

syntax:https://docs.opencv.org/master/d4/d86/group__imgproc__filter.html

#### 2D Convolution ( Image Filtering )

As with one-dimensional signals, images can also be filtered with various low-pass filters (LPF), high-pass filters (HPF), etc. An LPF helps in removing noise or blurring the image. An HPF helps in finding edges in an image.

OpenCV provides a function, cv2.filter2D(), to convolve a kernel with an image. As an example, we will try an averaging filter on an image. A 5x5 averaging filter kernel can be defined as follows

![image.png](attachment:db9df50d-29b4-441b-a766-d78d25adb77f.png)

Filtering with the above kernel results in the following being performed: for each pixel, a 5x5 window is centered on this pixel, all pixels falling within this window are summed up, and the result is then divided by 25. This equates to computing the average of the pixel values inside that window. This operation is performed for all the pixels in the image to produce the output filtered image. Try this code and check the result:

In [None]:
import cv2
import numpy as np
from matplotlib import pyplot as plt

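img1 = cv2.imread('hero.jpg',0)

kernel = np.ones((5,5),np.float32)/25  # 5x5 averaging kernel
dst = cv2.filter2D(img1,-1,kernel)  # -1 keeps the output depth equal to the input depth

# img1 is grayscale, so display both images with a gray colormap
plt.subplot(121),plt.imshow(img1,cmap='gray'),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(dst,cmap='gray'),plt.title('Averaging')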
plt.xticks([]), plt.yticks([])
plt.show()

### Image Blurring (Image Smoothing)

Image blurring is achieved by convolving the image with a low-pass filter kernel. It is useful for removing noise. It actually removes high-frequency content (e.g. noise, edges) from the image, so edges are blurred when this filter is applied. (There are also blurring techniques which do not blur edges.) OpenCV provides mainly four types of blurring techniques.

#### 1. Averaging

This is done by convolving the image with a normalized box filter. It simply takes the average of all the pixels under kernel area and replaces the central element with this average. This is done by the function cv2.blur() or cv2.boxFilter(). Check the docs for more details about the kernel. We should specify the width and height of kernel. A 3x3 normalized box filter would look like this:
![image.png](attachment:75bf83a8-dafb-48a6-b0b1-cc2e4cd1b93c.png)

If you don’t want to use a normalized box filter, use cv2.boxFilter() and pass the argument normalize=False to the function.

syntax: cv2.blur(src, ksize[, dst[, anchor[, borderType]]])

Blurs an image using the normalized box filter.

In [None]:
import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('eyes.jpg')

blur = cv2.blur(img,(30,20))

plt.subplot(121),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(blur),plt.title('Blurred')
plt.xticks([]), plt.yticks([])
plt.show()

cv2.imshow('images', blur)
cv2.waitKey()
cv2.destroyAllWindows()

#### 2. Gaussian Filtering

In this approach, instead of a box filter consisting of equal filter coefficients, a Gaussian kernel is used. It is done with the function, cv2.GaussianBlur(). We should specify the width and height of the kernel which should be positive and odd. We also should specify the standard deviation in the X and Y directions, sigmaX and sigmaY respectively. If only sigmaX is specified, sigmaY is taken as equal to sigmaX. If both are given as zeros, they are calculated from the kernel size. Gaussian filtering is highly effective in removing Gaussian noise from the image.

Gaussian filters have the properties of having no overshoot to a step function input while minimizing the rise and fall time. In terms of image processing, any sharp edges in images are smoothed while minimizing too much blurring.

dst = cv2.GaussianBlur(src, ksize, sigmaX[, dst[, sigmaY[, borderType=BORDER_DEFAULT]]] )

src = input image

dst = output image

ksize = Gaussian kernel size, given as (width, height). Width and height should be odd and can have different values. If ksize is set to (0, 0), then ksize is computed from the sigma values.

sigmaX = Kernel standard deviation along X-axis (horizontal direction).

sigmaY = Kernel standard deviation along Y-axis (vertical direction). If sigmaY=0, then sigmaX value is taken for sigmaY

borderType = Specifies image boundaries while kernel is applied on image borders. Possible values are : cv.BORDER_CONSTANT cv.BORDER_REPLICATE cv.BORDER_REFLECT cv.BORDER_WRAP cv.BORDER_REFLECT_101 cv.BORDER_TRANSPARENT cv.BORDER_REFLECT101 cv.BORDER_DEFAULT cv.BORDER_ISOLATED

In [None]:
import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('eyes.jpg')

blur = cv2.GaussianBlur(img,(15,15),0)

plt.subplot(121),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(blur),plt.title('Blurred')
plt.xticks([]), plt.yticks([])
plt.show()

cv2.imshow('images', blur)
cv2.waitKey()
cv2.destroyAllWindows()

#### 3. Median Filtering

Here, the function cv2.medianBlur() computes the median of all the pixels under the kernel window and the central pixel is replaced with this median value. This is highly effective in removing salt-and-pepper noise. One interesting thing to note is that, in the Gaussian and box filters, the filtered value for the central element can be a value which may not exist in the original image. However this is not the case in median filtering, since the central element is always replaced by some pixel value in the image. This reduces the noise effectively. The kernel size must be a positive odd integer.



In [None]:
import cv2
import numpy as np
from matplotlib import pyplot as plt

img1 = cv2.imread('spider.jpg',1)

median = cv2.medianBlur(img1,5)

plt.subplot(121),plt.imshow(img1),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(median),plt.title('Blurred')
plt.xticks([]), plt.yticks([])
plt.show()

cv2.imshow('images', median)
cv2.waitKey()
cv2.destroyAllWindows()

#### 4. Bilateral Filtering

As we noted, the filters we presented earlier tend to blur edges. This is not the case for the bilateral filter, cv2.bilateralFilter(), which was designed for, and is highly effective at, noise removal while preserving edges. But the operation is slower compared to other filters. We already saw that a Gaussian filter takes a neighborhood around the pixel and finds its Gaussian weighted average. This Gaussian filter is a function of space alone, that is, nearby pixels are considered while filtering. It does not consider whether pixels have almost the same intensity value and does not consider whether the pixel lies on an edge or not. The resulting effect is that Gaussian filters tend to blur edges, which is undesirable.

The bilateral filter also uses a Gaussian filter in the space domain, but it uses one more (multiplicative) Gaussian filter component which is a function of pixel intensity differences. The Gaussian function of space makes sure that only pixels that are ‘spatial neighbors’ are considered for filtering, while the Gaussian component applied in the intensity domain (a Gaussian function of intensity differences) ensures that only those pixels with intensities similar to that of the central pixel (‘intensity neighbors’) are included to compute the blurred intensity value. As a result, this method preserves edges, since for pixels lying near edges, neighboring pixels placed on the other side of the edge, and therefore exhibiting large intensity variations when compared to the central pixel, will not be included for blurring.

In [None]:
import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('spider.jpg')

blur = cv2.bilateralFilter(img,25,75,75)

plt.subplot(121),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(blur),plt.title('Blurred')
plt.xticks([]), plt.yticks([])
plt.show()

cv2.imshow('images', blur)
cv2.waitKey()
cv2.destroyAllWindows()

##### Morphological Transformations

Erosion and dilation are morphological image processing operations. Morphological image processing is a procedure for modifying the geometric structure in the image. In morphology, we work with the shape, size and structure of objects. Both operations are defined for binary images, but we can also use them on grayscale images. They are widely used for:

. Removing Noise

. Identify intensity bumps or holes in the picture.

. Isolation of individual elements and joining disparate elements in image.

We will learn different morphological operations like Erosion, Dilation, Opening, Closing etc.

We will see different functions like : cv2.erode(), cv2.dilate(), cv2.morphologyEx() etc.

Morphological transformations are simple operations based on the image shape. They are normally performed on binary images. They need two inputs: one is the original image, and the second is called the structuring element or kernel, which decides the nature of the operation. The two basic morphological operators are Erosion and Dilation. Variant forms like Opening, Closing, Gradient etc. also come into play. We will see them one by one with the help of the following image.

#### 1. Erosion

The basic idea of erosion is just like soil erosion: it erodes away the boundaries of the foreground object (always try to keep the foreground in white). So what does it do? The kernel slides through the image (as in 2D convolution). A pixel in the original image (either 1 or 0) will be considered 1 only if all the pixels under the kernel are 1; otherwise it is eroded (made zero).

So what happens is that all the pixels near the boundary will be discarded, depending upon the size of the kernel. The thickness or size of the foreground object decreases, or simply the white region decreases in the image. It is useful for removing small white noise (as we have seen in the colorspace chapter), detaching two connected objects, etc.

![image.png](attachment:664c5902-a8fe-49d4-a404-053838bfaa28.png),![image.png](attachment:300dd890-6ea7-4105-8339-db476480b9c5.png)

In [None]:
# erosion on image
import cv2
import numpy as np
image = cv2.imread('hero.jpg',1)
kernel = np.ones((5,5), np.uint8)  # a larger kernel erodes more
erosion = cv2.erode(image,kernel,iterations = 1)  # iterations: how many times erosion is applied
cv2.imshow('images', erosion)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
# erosion on video

import cv2
import numpy as np
cap = cv2.VideoCapture(0)
while(True):
    ret, frame = cap.read()
    kernel = np.ones((10,10))
    frame = cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)
    erosion = cv2.erode(frame,kernel,iterations = 1)
    cv2.imshow('imagesd', erosion)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()

#### 2. Dilation

It is just the opposite of erosion. Here, a pixel element is ‘1’ if at least one pixel under the kernel is ‘1’. So it increases the white region in the image, or the size of the foreground object increases. Normally, in cases like noise removal, erosion is followed by dilation. Erosion removes white noise, but it also shrinks our object, so we dilate it. Since the noise is gone, it won’t come back, but our object area increases. Dilation is also useful in joining broken parts of an object.

Dilation: The value of the output pixel is the maximum value of all the pixels that fall within the structuring element's size and shape. For example in a binary image, if any of the pixels of the input image falling within the range of the kernel is set to the value 1, the corresponding pixel of the output image will be set to 1 as well. The latter applies to any type of image (e.g. grayscale, bgr, etc).

![image.png](attachment:3840746d-0145-493f-ab06-b8ddff6950d3.png)

![image.png](attachment:3ca73bb5-76e2-4b51-994f-f36869a76b38.png)

 ![image.png](attachment:b087b15e-dcc7-480c-a1bf-5849bc91d416.png),![image.png](attachment:9f57f545-f5d4-49c4-9143-ff6a9bebfa77.png)

In [None]:
# dilation on images
import cv2
import numpy as np
ima = cv2.imread('spider.jpg')
kernel = np.ones((7,7))
dilation = cv2.dilate(ima, kernel, iterations = 1)
cv2.imshow('images',dilation)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### 3. Opening
Opening is just another name of erosion followed by dilation. It is useful in removing noise, as we explained above. Here we use the function, cv2.morphologyEx()
![image.png](attachment:f101c273-31b2-4034-9970-01293e4988f2.png)

In [None]:
import cv2
import numpy as np
ima = cv2.imread('spider.jpg')
kernel = np.ones((7,7))
opening = cv2.morphologyEx(ima, cv2.MORPH_OPEN, kernel)
cv2.imshow('images', opening)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### 4. Closing
Closing is reverse of Opening, Dilation followed by Erosion. It is useful in closing small holes inside the foreground objects, or small black points on the object.
![image.png](attachment:55b679b4-3987-4889-a629-33c1ee0c37d7.png)

In [None]:
import cv2
import numpy as np
ima = cv2.imread('spider.jpg')
kernel = np.ones((7,7))
closing = cv2.morphologyEx(ima, cv2.MORPH_CLOSE, kernel)
cv2.imshow('imagegfb', closing)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### 5. Morphological Gradient

It is the difference between dilation and erosion of an image.

The result will look like the outline of the object.

In [None]:
import cv2
import numpy as np
ima = cv2.imread('hero.jpg', 0)
kernel = np.ones((7,7))
gradient = cv2.morphologyEx(ima, cv2.MORPH_GRADIENT, kernel)
cv2.imshow('imagegfb', gradient)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
import cv2
import numpy as np
cap = cv2.VideoCapture(0)
while(True):
    ret, frame = cap.read()
    kernel= np.ones((5,5))
    gradient = cv2.morphologyEx(frame, cv2.MORPH_GRADIENT, kernel)
    cv2.imshow('imagess', gradient)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()

#### Top Hat

It is the difference between input image and Opening of the image. Below example is done for a 9x9 kernel.

![image.png](attachment:75500bb0-030e-4475-b822-8f093bb02b4c.png)

In [None]:
import cv2
import numpy as np
ima = cv2.imread('hero.jpg',1)
kernel = np.ones((7,7))
tophat = cv2.morphologyEx(ima, cv2.MORPH_TOPHAT, kernel)
cv2.imshow('imagegfb', tophat)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### Black Hat

It is the difference between the closing of the input image and input image.

![image.png](attachment:96d127ea-ed97-4e33-ba7f-712a613ae431.png)

In [None]:
import cv2
import numpy as np
ima = cv2.imread('hero.jpg',1)
kernel = np.ones((7,7))
blackhat = cv2.morphologyEx(ima, cv2.MORPH_BLACKHAT, kernel)
cv2.imshow('imagegfb', blackhat)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### Structuring Element

We manually created the structuring elements in the previous examples with the help of Numpy. They are rectangular in shape. But in some cases, you may need elliptical or circular kernels. For this purpose, OpenCV has a function, cv2.getStructuringElement(). You just pass the shape and size of the kernel, and you get the desired kernel.

In [None]:
import cv2
import numpy as np
# Rectangular Kernel
img = cv2.getStructuringElement(cv2.MORPH_RECT,(450,560))
cv2.imshow('imagegfb', img * 255)  # kernel values are 0/1, so scale to 0/255 to make them visible
cv2.waitKey(0)
cv2.destroyAllWindows()


In [None]:
import cv2
import numpy as np
# Elliptical Kernel
img = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(655,755))
cv2.imshow('imagegfb', img * 255)  # kernel values are 0/1, so scale to 0/255 to make them visible
cv2.waitKey(0)
cv2.destroyAllWindows()


### Image Gradients

The gradient of an image measures how it is changing. It provides two pieces of information. The magnitude of the gradient tells us how quickly the image is changing, while the direction of the gradient tells us the direction in which the image is changing most rapidly.

Find Image gradients, edges etc

We will see following functions : cv2.Sobel(), cv2.Scharr(), cv2.Laplacian() etc

#### 1. Sobel and Scharr Derivatives

The Sobel operator is a joint Gaussian smoothing plus differentiation operation, so it is more resistant to noise. You can specify the direction of the derivative to be taken, vertical or horizontal (by the arguments dy and dx respectively, called yorder and xorder in older documentation). You can also specify the size of the kernel by the argument ksize. If ksize = -1, a 3x3 Scharr filter is used, which gives better results than a 3x3 Sobel filter. Please see the docs for the kernels used.

#### 2. Laplacian Derivatives

It calculates the Laplacian of the image given by the relation $\Delta src = \frac{\partial ^2{src}}{\partial x^2} + \frac{\partial ^2{src}}{\partial y^2}$, where each derivative is found using Sobel derivatives. If ksize = 1, then the following kernel is used for filtering:

![image.png](attachment:b5be1219-02ae-417d-a871-fea79bac251b.png)

In [None]:
import cv2
import numpy as np
from matplotlib import pyplot as plt
img = cv2.imread('hero.jpg',0)  # grayscale, to match the cmap='gray' plots below
laplacian = cv2.Laplacian(img,cv2.CV_64F)
sobelx = cv2.Sobel(img,cv2.CV_64F,1,0,ksize=5)
sobely = cv2.Sobel(img,cv2.CV_64F,0,1,ksize=5)

plt.subplot(2,2,1),plt.imshow(img,cmap = 'gray')
plt.title('Original'), plt.xticks([]), plt.yticks([])
plt.subplot(2,2,2),plt.imshow(laplacian,cmap = 'gray')
plt.title('Laplacian'), plt.xticks([]), plt.yticks([])
plt.subplot(2,2,3),plt.imshow(sobelx,cmap = 'gray')
plt.title('Sobel X'), plt.xticks([]), plt.yticks([])
plt.subplot(2,2,4),plt.imshow(sobely,cmap = 'gray')
plt.title('Sobel Y'), plt.xticks([]), plt.yticks([])

plt.show()

cv2.imshow('images1', laplacian)
cv2.imshow('images2', sobelx)
cv2.imshow('images3', sobely)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
# video
import cv2
import numpy as np
cap = cv2.VideoCapture(0)
while(True):
    ret, frame = cap.read()
    laplacian = cv2.Laplacian(frame,cv2.CV_64F)
    sobelx = cv2.Sobel(frame,cv2.CV_64F,1,0,ksize=5)
    sobely = cv2.Sobel(frame,cv2.CV_64F,0,1,ksize=5)
    cv2.imshow('images1', laplacian)
    cv2.imshow('images2', sobelx)
    cv2.imshow('images3', sobely)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()

#### Edge Detection

Concept of Canny edge detection

OpenCV functions for that : cv2.Canny()

Edge Detection is an image processing technique to find boundaries of objects in the image.

In this tutorial, we shall learn to find edges of focused objects in an image using Canny Edge Detection Technique.

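edges = cv2.Canny(image, minVal, maxVal, apertureSize, L2gradient)

. image  (Mandatory) = Input 8-bit image as an array (for example, the result of cv2.imread), not a file path

. minVal   (Mandatory) = Lower threshold on the intensity gradient for the hysteresis procedure

. maxVal   (Mandatory) = Upper threshold on the intensity gradient for the hysteresis procedure

. apertureSize (Optional) (Default Value : 3) = Aperture size of the Sobel operator used to compute the gradients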

. L2gradient (Optional) (Default Value : false)=If true, Canny() uses a much more computationally expensive equation to detect edges, which provides more accuracy at the cost of resources.

In [None]:
import numpy as np
import cv2
img = cv2.imread('hero.jpg')
edge = cv2.Canny(img, 30,150)
cv2.imshow('imagesa', edge)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
import numpy as np
import cv2
cap = cv2.VideoCapture(0)
while(True):
    ret,frame = cap.read()
    frame = cv2.Canny(frame, 35,250)
    cv2.imshow('imagesd',frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
    

#### Image Pyramids

We will learn about Image Pyramids

We will use Image pyramids to create a new fruit, “Orapple”

We will see these functions: cv2.pyrUp(), cv2.pyrDown()

Normally, we work with an image of constant size. But on some occasions, we need to work with the same image at different resolutions. For example, while searching for something in an image, like a face, we are not sure at what size the object will be present in the image. In that case, we need to create a set of images with different resolutions and search for the object in all of them. This set of images with different resolutions is called an Image Pyramid (because when the images are kept in a stack with the biggest image at the bottom and the smallest image at the top, it looks like a pyramid).

There are two kinds of Image Pyramids. 1) Gaussian Pyramid and 2) Laplacian Pyramids

A higher level (lower resolution) in a Gaussian Pyramid is formed by removing consecutive rows and columns in the lower level (higher resolution) image. Then each pixel in the higher level is formed by the contribution from 5 pixels in the underlying level with Gaussian weights. By doing so, an $M \times N$ image becomes an $M/2 \times N/2$ image. So the area reduces to one-fourth of the original area. It is called an Octave. The same pattern continues as we go up in the pyramid (i.e., resolution decreases). Similarly, while expanding, the area becomes 4 times larger in each level. We can find Gaussian pyramids using the cv2.pyrDown() and cv2.pyrUp() functions.

![image.png](attachment:c338bba9-5b86-4547-b85a-d50d61f0b5aa.png)

In [None]:
import numpy as np
import cv2
img = cv2.imread('hero.jpg')
lower_res = cv2.pyrDown(img)  # pyrDown halves each dimension; pyrUp would double it
cv2.imshow('images', lower_res)
cv2.waitKey(0)
cv2.destroyAllWindows()

### Contours in OpenCV

Understand what contours are.

Learn to find contours, draw contours etc

You will see these functions : cv2.findContours(), cv2.drawContours()

##### What are contours?

Contours can be explained simply as a curve joining all the continuous points (along the boundary), having same color or intensity. The contours are a useful tool for shape analysis and object detection and recognition.

For better accuracy, use binary images. So before finding contours, apply thresholding or Canny edge detection.

In older OpenCV versions, the findContours function modifies the source image. So if you want the source image even after finding contours, store a copy of it in another variable first.

In OpenCV, finding contours is like finding a white object on a black background. So remember, the object to be found should be white and the background should be black.

See, there are three arguments in the cv2.findContours() function: the first is the source image, the second is the contour retrieval mode, and the third is the contour approximation method. It outputs the contours and the hierarchy (OpenCV 3.x additionally returns the image as the first output). contours is a Python sequence of all the contours in the image. Each individual contour is a Numpy array of (x,y) coordinates of boundary points of the object.

##### Contour Approximation Method

Above, we said that contours are the boundaries of a shape with the same intensity. A contour stores the (x,y) coordinates of the boundary of a shape. But does it store all the coordinates? That is specified by the contour approximation method.

If you pass cv2.CHAIN_APPROX_NONE, all the boundary points are stored. But actually do we need all the points? For eg, you found the contour of a straight line. Do you need all the points on the line to represent that line? No, we need just two end points of that line. This is what cv2.CHAIN_APPROX_SIMPLE does. It removes all redundant points and compresses the contour, thereby saving memory.

The image of a rectangle below demonstrates this technique. A circle is drawn on every coordinate in the contour array (in blue). The first image shows the points I got with cv2.CHAIN_APPROX_NONE (734 points) and the second image shows the ones with cv2.CHAIN_APPROX_SIMPLE (only 4 points). See how much memory it saves!

![image.png](attachment:ed38d345-1321-46a9-9300-fd42b0ab34f8.png)


In [None]:
# Let’s see how to find contours of a binary image
import numpy as np
import cv2
img = cv2.imread('hero.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)# converting bgr to gray color
ret,thres = cv2.threshold(gray, 120,255, cv2.THRESH_BINARY)# converting gray color to binary images
contours, hierarchy = cv2.findContours(thres,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
print('number of contours =' + str(len(contours)))# gives total no of contours
print(contours[0])# gives coordinates of contour at index 0 position

Contours in an image can have relationships to each other, for example when one shape lies inside another. We can specify how one contour is connected to another, such as whether it is the child of some other contour or a parent. The representation of this relationship is called the hierarchy.

In [None]:
# once we got contours, by using draw contours we draw contours on output images
img = cv2.drawContours(img, contours, -1, (0,255,0), 3)
cv2.imshow('images', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
#using canny edge detection for finding contours
import numpy as np
import cv2
img = cv2.imread('hero.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)# converting bgr to gray color
edge = cv2.Canny(gray, 30,150)# using canny edge detection on the grayscale image
contours, hierarchy = cv2.findContours(edge,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
img = cv2.drawContours(img, contours, -1, (0,255,0), 3)
cv2.imshow('images', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
# drawing contours on live video
import numpy as np
import cv2
cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:  # stop if no frame could be read from the camera
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    ret, thres = cv2.threshold(gray, 120, 255, cv2.THRESH_BINARY)

    contours1, hierarchy1 = cv2.findContours(thres, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    imag = cv2.drawContours(frame, contours1, -1, (0,255,255), 2)

    cv2.imshow('images', imag)

    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()
       

### Contour Features
This section shows how to find different features of contours, such as area, perimeter, centroid and bounding box.
You will see plenty of functions related to contours.

##### 1. Moments

Image moments help you to calculate some features like center of mass of the object, area of the object etc. Check out the wikipedia page on Image Moments

The function cv2.moments() gives a dictionary of all moment values calculated.



In [None]:
import cv2
import numpy as np

img = cv2.imread('hero.jpg',0)
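ret,thresh = cv2.threshold(img,127,255,cv2.THRESH_BINARY)# cv2.THRESH_BINARY has the value 0
contours,hierarchy = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)# same as passing 1, 2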

cnt = contours[0]
M = cv2.moments(cnt)
print(M)

##### 2. Contour Area

Contour area is given by the function cv2.contourArea() or from moments, M['m00'].

In [None]:
import cv2
import numpy as np

img = cv2.imread('bmax.jpg',0)
ret,thresh = cv2.threshold(img,127,255,0)
contours,hierarchy = cv2.findContours(thresh, 1, 2)
cnt = contours[-1]
area = cv2.contourArea(cnt)
print(area)

##### 3. Contour Perimeter

It is also called arc length. It can be found using the cv2.arcLength() function. The second argument specifies whether the shape is a closed contour (if True) or just a curve.

In [None]:
import cv2
import numpy as np

img = cv2.imread('bmax.jpg',0)
ret,thresh = cv2.threshold(img,127,255,0)
contours,hierarchy = cv2.findContours(thresh, 1, 2)
cnt = contours[-1]
perimeter = cv2.arcLength(cnt,True)
print(perimeter)

##### 4. Contour Approximation

It approximates a contour shape to another shape with fewer vertices, depending on the precision we specify. It is an implementation of the Douglas-Peucker algorithm. Check the Wikipedia page for the algorithm and a demonstration.

To understand this, suppose you are trying to find a square in an image, but due to some problems in the image, you didn't get a perfect square, but a "bad shape" (as shown in the first image below). Now you can use cv2.approxPolyDP() to approximate the shape. Its second argument is called epsilon, which is the maximum distance from the contour to the approximated contour. It is an accuracy parameter. A wise selection of epsilon is needed to get the correct output.

Below, in the second image, the green line shows the approximated curve for epsilon = 10% of the arc length. The third image shows the same for epsilon = 1% of the arc length. The third argument specifies whether the curve is closed or not.

![image.png](attachment:abe44e93-be5f-4fbe-a60b-d82309bbc2bf.png)

In [None]:
import numpy as np
import cv2
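img = cv2.imread('curved.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret,thres = cv2.threshold(gray, 120,255, 0)
contours,hierarchy = cv2.findContours(thres, 1, 2)
cnt = contours[0]# approxPolyDP works on a single contour, not on the whole list
epsilon = 0.1*cv2.arcLength(cnt,True)
approx = cv2.approxPolyDP(cnt,epsilon,True)

# approx is an array of points, so draw it on the image before showing it
img = cv2.drawContours(img, [approx], -1, (0,255,0), 2)
cv2.imshow('images', img)
cv2.waitKey(0)
cv2.destroyAllWindows()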

#### 5. Convex Hull

Convex Hull looks similar to contour approximation, but it is not the same (both may give the same result in some cases). Here, the cv2.convexHull() function checks a curve for convexity defects and corrects them. Generally speaking, convex curves are curves which always bulge out, or are at least flat. Where the curve bulges inward, it is called a convexity defect. For example, check the image of a hand below. The red line shows the convex hull of the hand. The double-sided arrows mark the convexity defects, which are the local maximum deviations of the hull from the contour.

![image.png](attachment:fc01fa0d-f0d2-42bb-8880-00c43d10751f.png)

##### syntax: hull = cv2.convexHull(points[, hull[, clockwise[, returnPoints]]])

points : the contour we pass in.

hull : the output; normally we omit it.

clockwise : Orientation flag. If it is True, the output convex hull is oriented clockwise. Otherwise, it is oriented counter-clockwise.

returnPoints : By default, True. Then it returns the coordinates of the hull points. If False, it returns the indices of contour points corresponding to the hull points.

#### 6. Checking Convexity
The function cv2.isContourConvex() checks whether a curve is convex or not. It simply returns True or False.

#### 7. Bounding Rectangle
There are two types of bounding rectangles.

#### 7.a. Straight Bounding Rectangle
It is a straight rectangle that doesn't consider the rotation of the object. So the area of the bounding rectangle won't be minimum. It is found by the function cv2.boundingRect().

Let (x,y) be the top-left coordinate of the rectangle and (w,h) be its width and height.

x,y,w,h = cv2.boundingRect(cnt)

img = cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)

#### 7.b. Rotated Rectangle
Here, the bounding rectangle is drawn with minimum area, so it considers the rotation also. The function used is cv2.minAreaRect(). It returns a Box2D structure which contains the following details: ( center (x,y), (width, height), angle of rotation ). But to draw this rectangle, we need the 4 corners of the rectangle. They are obtained with the function cv2.boxPoints().

rect = cv2.minAreaRect(cnt)

box = cv2.boxPoints(rect)

box = np.intp(box)  # np.int0 was an alias of np.intp and was removed in NumPy 2.0

img = cv2.drawContours(img,[box],0,(0,0,255),2)

![image.png](attachment:18641dad-389a-4694-a708-c099aa57cd40.png)

In [None]:
import numpy as np
import cv2
img = cv2.imread('hawk')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)# findContours needs a single-channel image
ret,thresh = cv2.threshold(gray,127,255,0)
contours,hierarchy = cv2.findContours(thresh, 1, 2)
cnt = contours[2]
x,y,w,h = cv2.boundingRect(cnt)
img = cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)
cv2.imshow('imgres', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### 8. Minimum Enclosing Circle

Next we find the circumcircle of an object using the function cv2.minEnclosingCircle(). It is a circle which completely covers the object with minimum area.

(x,y),radius = cv2.minEnclosingCircle(cnt)

center = (int(x),int(y))

radius = int(radius)

img = cv2.circle(img,center,radius,(0,255,0),2)

![image.png](attachment:16b6188f-f67d-4cb7-ac39-b5e838744cd4.png)

#### 9. Match Shapes

OpenCV comes with a function cv2.matchShapes() which enables us to compare two shapes, or two contours, and returns a metric showing the similarity. The lower the result, the better the match. It is calculated based on the Hu moment values.

![image.png](attachment:d10eed63-54a8-42ea-bac2-9d9a3e27e8c3.png)

Matching Image A with itself = 0.0

Matching Image A with Image B = 0.001946

Matching Image A with Image C = 0.326911

Note that even rotating the image does not affect this comparison much.

In [None]:
import numpy as np
import cv2
img1 = cv2.imread('hero.jpg',0)
img2 = cv2.imread('bmax.jpg',0)
ret, thresh = cv2.threshold(img1, 127, 255,0)
ret, thresh2 = cv2.threshold(img2, 127, 255,0)
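contours,hierarchy = cv2.findContours(thresh,cv2.RETR_CCOMP,cv2.CHAIN_APPROX_NONE)# same as passing 2, 1
cnt1 = contours[0]

contours,hierarchy = cv2.findContours(thresh2,cv2.RETR_CCOMP,cv2.CHAIN_APPROX_NONE)
cnt2 = contours[0]
ret = cv2.matchShapes(cnt1,cnt2,cv2.CONTOURS_MATCH_I1,0.0)# method 1 is cv2.CONTOURS_MATCH_I1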
print(ret)

### Contours Hierarchy
#### What is Hierarchy?
Normally we use the cv2.findContours() function to detect objects in an image. Sometimes objects are in different locations. But in some cases, some shapes are inside other shapes, just like nested figures. In this case, we call the outer one the parent and the inner one the child. This way, contours in an image have some relationship to each other. We can specify how one contour is connected to another: whether it is a child of some other contour, or a parent, and so on. The representation of this relationship is called the Hierarchy.

## Histograms

So what is a histogram? You can consider a histogram as a graph or plot which gives you an overall idea about the intensity distribution of an image. It is a plot with pixel values (usually ranging from 0 to 255, but not always) on the X-axis and the corresponding number of pixels in the image on the Y-axis.

It is just another way of understanding the image. By looking at the histogram of an image, you get an intuition about the contrast, brightness, intensity distribution etc. of that image. Almost all image processing tools today provide histogram features.

![image.png](attachment:680840f9-8be7-44c7-869b-a8ec892c9494.png)

You can see the image and its histogram. (Remember, this histogram is drawn for a grayscale image, not a color image.) The left region of the histogram shows the amount of darker pixels in the image and the right region shows the amount of brighter pixels. From the histogram, you can see that the dark region is larger than the bright region, and the amount of midtones (pixel values in the mid-range, say around 127) is very small.

#### TONES
The region where most of the brightness values are present is called the "tonal range." Tonal range can vary drastically from image to image, so developing an intuition for how numbers map to actual brightness values is often critical—both before and after the photo has been taken. There is no one "ideal histogram" which all images should try to mimic; histograms should merely be representative of the tonal range in the scene and what the photographer wishes to convey.

![image.png](attachment:0ed2164c-016c-44a3-9775-651378d8f508.png)

DIMS : It is the number of parameters for which we collect the data. In this case, we collect data regarding only one thing, intensity value. So here it is 1.

RANGE : It is the range of intensity values you want to measure. Normally, it is [0,256], ie all intensity values.

#### 1. Histogram Calculation in OpenCV

So now we use the cv2.calcHist() function to find the histogram. Let's get familiar with the function and its parameters:

cv2.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate]])

images : the source image of type uint8 or float32. It should be given in square brackets, i.e. [img].

channels : also given in square brackets. It is the index of the channel for which we calculate the histogram. For example, if the input is a grayscale image, its value is [0]. For a color image, you can pass [0], [1] or [2] to calculate the histogram of the blue, green or red channel respectively.

mask : mask image. To find the histogram of the full image, it is given as None. But if you want to find the histogram of a particular region of the image, you have to create a mask image for that and give it as the mask. (I will show an example later.)

histSize : this represents our BIN count. It needs to be given in square brackets. For full scale, we pass [256].

ranges : this is our RANGE. Normally, it is [0,256].

NumPy also provides a function, np.histogram(). So instead of the calcHist() function, you can try the line below:

hist,bins = np.histogram(img.ravel(),256,[0,256])

In [None]:
import numpy as np
import cv2
img = cv2.imread('hero.jpg',0)
hist = cv2.calcHist([img],[0],None,[256],[0,256])
#print(hist)

#### Plotting Histograms

There are two ways to do this:

Short Way : use Matplotlib plotting functions

    import cv2
    import numpy as np
    from matplotlib import pyplot as plt

    img = cv2.imread('home.jpg',0)
    plt.hist(img.ravel(), bins=256, range=(0,256)); plt.show()

Long Way : use OpenCV drawing functions

In [None]:
#plotting histogram using matplotlib
import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('hero.jpg',0)
plt.hist(img.ravel(), bins=256, range=(0,256))# 256 bins covering intensities 0..255
plt.show()

In [None]:
# plotting histogram using opencv for gray image
import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('hero.jpg',0)
hist = cv2.calcHist([img],[0],None,[256],[0,256])
plt.plot(hist)
plt.show()

In [None]:
#for color images
import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('ironman.jpg')
color = ('b','g','r')
for i,col in enumerate(color):
    histr = cv2.calcHist([img],[i],None,[256],[0,256])
    plt.plot(histr,color = col)
    plt.xlim([0,256])
plt.show()

### Histograms - 2: Histogram Equalization

Consider an image whose pixel values are confined to some specific range of values only. For example, a brighter image will have all pixels confined to high values. But a good image will have pixel values spread over the whole intensity range. So you need to stretch this histogram to both ends (as shown in the image below, from Wikipedia) and that is what Histogram Equalization does (in simple words). This normally improves the contrast of the image.

![image.png](attachment:b3696edc-1a7c-4bf9-b979-15aca40f2010.png)

OpenCV has a function to do this, cv2.equalizeHist(). Its input is just grayscale image and output is our histogram equalized image.

In [None]:
import cv2
import matplotlib.pyplot as plt
img = cv2.imread('hero.jpg',0)
equ = cv2.equalizeHist(img)
cv2.imshow('images',equ)
cv2.waitKey(0)
cv2.destroyAllWindows()
plt.subplot(1,2,1),plt.imshow(img,cmap = 'gray')
plt.title('Original'), plt.xticks([]), plt.yticks([])
plt.subplot(1,2,2),plt.imshow(equ,cmap = 'gray')
plt.title('Equalized'), plt.xticks([]), plt.yticks([])
plt.show()

#### CLAHE (Contrast Limited Adaptive Histogram Equalization)

The histogram equalization we just saw considers the global contrast of the image. In many cases, that is not a good idea.

![image.png](attachment:7e8fb6ec-b322-4393-997a-6652cb5d73b4.png)

It is true that the background contrast has improved after histogram equalization. But compare the face of statue in both images. We lost most of the information there due to over-brightness. It is because its histogram is not confined to a particular region as we saw in previous cases (Try to plot histogram of input image, you will get more intuition).

So to solve this problem, adaptive histogram equalization is used. In this, the image is divided into small blocks called "tiles" (tileGridSize is 8x8 by default in OpenCV). Then each of these blocks is histogram equalized as usual. So in a small area, the histogram would be confined to a small region (unless there is noise). If noise is there, it will be amplified. To avoid this, contrast limiting is applied. If any histogram bin is above the specified contrast limit (by default 40 in OpenCV), those pixels are clipped and distributed uniformly to other bins before applying histogram equalization. After equalization, to remove artifacts at tile borders, bilinear interpolation is applied.

In [None]:
import numpy as np
import cv2
from matplotlib import pyplot as plt

img = cv2.imread('hero.jpg',0)

# create a CLAHE object (Arguments are optional).
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
cla = clahe.apply(img)
equ = cv2.equalizeHist(img)
cv2.imshow('images',cla)
cv2.imshow('image1',equ)
cv2.imshow('images2',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
plt.subplot(2,2,1),plt.imshow(img,cmap = 'gray')
plt.title('Original'), plt.xticks([]), plt.yticks([])
plt.subplot(2,2,2),plt.imshow(equ,cmap = 'gray')
plt.title('Equalized'), plt.xticks([]), plt.yticks([])
plt.subplot(2,2,3),plt.imshow(cla,cmap = 'gray')
plt.title('clahe'), plt.xticks([]), plt.yticks([])
plt.show()

### Histograms - 3 : 2D Histograms

In the first part, we calculated and plotted a one-dimensional histogram. It is called one-dimensional because we take only one feature into consideration, i.e. the grayscale intensity value of the pixel. But in two-dimensional histograms, you consider two features. Normally it is used for finding color histograms where the two features are the Hue and Saturation values of every pixel.

There is a python sample in the official samples already for finding color histograms. We will try to understand how to create such a color histogram, and it will be useful in understanding further topics like Histogram Back-Projection.

It is quite simple and calculated using the same function, cv2.calcHist(). For color histograms, we need to convert the image from BGR to HSV. (Remember, for 1D histogram, we converted from BGR to Grayscale). For 2D histograms, its parameters will be modified as follows:

channels = [0,1] because we need to process both H and S plane.

bins = [180,256] 180 for H plane and 256 for S plane.

range = [0,180,0,256] Hue value lies between 0 and 180 & Saturation lies between 0 and 256.



In [None]:
import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('hero.jpg')
hsv = cv2.cvtColor(img,cv2.COLOR_BGR2HSV)

hist = cv2.calcHist([hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])
plt.imshow(hist,interpolation = 'nearest')
plt.show()


### Histogram - 4 : Histogram Backprojection

OpenCV provides a built-in function, cv2.calcBackProject(). Its parameters are almost the same as those of cv2.calcHist(). One of its parameters is the histogram of the object we are looking for, which we have to compute first. The object histogram should also be normalized before it is passed to the backproject function. The function returns the probability image. We then convolve that image with a disc kernel and apply a threshold.

### Image Transforms in OpenCV
#### Fourier Transform

Fourier Transform is used to analyze the frequency characteristics of various filters. For images, 2D Discrete Fourier Transform (DFT) is used to find the frequency domain. A fast algorithm called Fast Fourier Transform (FFT) is used for calculation of DFT. Details about these can be found in any image processing or signal processing textbooks. 

OpenCV provides the functions cv2.dft() and cv2.idft() for this. cv2.dft() returns the same result as NumPy's np.fft.fft2(), but with two channels. The first channel holds the real part of the result and the second channel holds the imaginary part. The input image should be converted to np.float32 first. We will see how to do it.

In [None]:
import numpy as np
import cv2
from matplotlib import pyplot as plt

img = cv2.imread('hero.jpg',0)
dft = cv2.dft(np.float32(img),flags = cv2.DFT_COMPLEX_OUTPUT)
dft_shift = np.fft.fftshift(dft)
magnitude_spectrum = 20*np.log(cv2.magnitude(dft_shift[:,:,0],dft_shift[:,:,1]) + 1)# +1 avoids log(0)

plt.subplot(121),plt.imshow(img, cmap = 'gray')
plt.title('Input Image'), plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(magnitude_spectrum, cmap = 'gray')
plt.title('Magnitude Spectrum'), plt.xticks([]), plt.yticks([])
plt.show()

In [None]:
rows, cols = img.shape
crow,ccol = rows//2 , cols//2# integer division so the values can be used as slice indices

# create a mask first, center square is 1, remaining all zeros
mask = np.zeros((rows,cols,2),np.uint8)
mask[crow-30:crow+30, ccol-30:ccol+30] = 1

# apply mask and inverse DFT
fshift = dft_shift*mask
f_ishift = np.fft.ifftshift(fshift)
img_back = cv2.idft(f_ishift)
img_back = cv2.magnitude(img_back[:,:,0],img_back[:,:,1])

plt.subplot(121),plt.imshow(img, cmap = 'gray')
plt.title('Input Image'), plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(img_back, cmap = 'gray')
plt.title('Low-Pass Filtered Image'), plt.xticks([]), plt.yticks([])
plt.show()

### Template Matching

Template Matching is a method for searching and finding the location of a template image in a larger image. OpenCV comes with a function cv2.matchTemplate() for this purpose. It simply slides the template image over the input image (as in 2D convolution) and compares the template with the patch of the input image under it. Several comparison methods are implemented in OpenCV. (You can check the docs for more details.) It returns a grayscale image, where each pixel denotes how well the neighbourhood of that pixel matches the template.

If the input image is of size (WxH) and the template image is of size (wxh), the output image will have a size of (W-w+1, H-h+1). Once you have the result, you can use the cv2.minMaxLoc() function to find where the maximum/minimum value is. Take it as the top-left corner of the rectangle and take (w,h) as the width and height of the rectangle. That rectangle is your matched region.

If you are using cv2.TM_SQDIFF as the comparison method, the minimum value gives the best match.

In [None]:
import numpy as np
import cv2
import matplotlib.pyplot as plt
img = cv2.imread('hero.jpg', 0)
img2 = img.copy()
template = cv2.imread('hero1.png',0)
w, h = template.shape[::-1]
res = cv2.matchTemplate(img,template,cv2.TM_CCOEFF)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
top_left = max_loc# for TM_CCOEFF the best match is the maximum
bottom_right = (top_left[0] + w, top_left[1] + h)
cv2.rectangle(img,top_left, bottom_right, 255, 2)
plt.subplot(121),plt.imshow(res,cmap = 'gray')
plt.title('Matching Result'), plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(img,cmap = 'gray')
plt.title('Detected Point'), plt.xticks([]), plt.yticks([])
plt.suptitle('cv2.TM_CCOEFF')
plt.show()

In [None]:
import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('ironman.jpg',0)
img2 = img.copy()
template = cv2.imread('spider.jpg',0)
w, h = template.shape[::-1]

# All the 6 methods for comparison in a list
methods = ['TM_CCOEFF', 'TM_CCOEFF_NORMED', 'TM_CCORR',
            'TM_CCORR_NORMED', 'TM_SQDIFF', 'TM_SQDIFF_NORMED']

for meth in methods:
    img = img2.copy()
    method = getattr(cv2, meth)# look up the constant by name instead of using eval

    # Apply template Matching
    res = cv2.matchTemplate(img,template,method)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)

    # If the method is TM_SQDIFF or TM_SQDIFF_NORMED, take minimum
    if method in [cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED]:
        top_left = min_loc
    else:
        top_left = max_loc
    bottom_right = (top_left[0] + w, top_left[1] + h)

    cv2.rectangle(img,top_left, bottom_right, 255, 2)
    cv2.imshow('image',img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

    plt.subplot(121),plt.imshow(res,cmap = 'gray')
    plt.title('Matching Result'), plt.xticks([]), plt.yticks([])
    plt.subplot(122),plt.imshow(img,cmap = 'gray')
    plt.title('Detected Point'), plt.xticks([]), plt.yticks([])
    plt.suptitle(meth)

    plt.show()

#### Template Matching with Multiple Objects

In the previous section, we searched the image for a template which occurs only once in the image. Suppose you are searching for an object which has multiple occurrences; cv2.minMaxLoc() won't give you all the locations. In that case, we will use thresholding.

In [None]:
import cv2
import numpy as np
from matplotlib import pyplot as plt

img_rgb = cv2.imread('rupee1.png',1)
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
template = cv2.imread('rupee.jfif',0)
w, h = template.shape[::-1]

res = cv2.matchTemplate(img_gray,template,cv2.TM_CCOEFF_NORMED)
threshold = 0.8
loc = np.where( res >= threshold)
for pt in zip(*loc[::-1]):
    cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0,0,255), 2)# draw in red on the color image
cv2.imshow('images',img_rgb)
cv2.waitKey(0)
cv2.destroyAllWindows()

### Hough Line Transform

We will see how to use the Hough Transform to detect lines in an image.

We will see following functions: cv2.HoughLines(), cv2.HoughLinesP()

Hough Transform is a popular technique to detect any shape, if you can represent that shape in mathematical form. It can detect the shape even if it is broken or distorted a little bit. We will see how it works for a line.

A line can be represented as $y = mx+c$ or in parametric form, as $\rho = x \cos \theta + y \sin \theta$ where $\rho$ is the perpendicular distance from the origin to the line, and $\theta$ is the angle formed by this perpendicular line and the horizontal axis, measured counter-clockwise (that direction depends on how you represent the coordinate system; this representation is used in OpenCV).

![image.png](attachment:cb318458-f3c9-4d3f-9143-dc22fb47816f.png)

So if line is passing below the origin, it will have a positive rho and angle less than 180. If it is going above the origin, instead of taking angle greater than 180, angle is taken less than 180, and rho is taken negative. Any vertical line will have 0 degree and horizontal lines will have 90 degree.

Everything explained above is encapsulated in the OpenCV function cv2.HoughLines(). It simply returns an array of ($\rho$, $\theta$) values. $\rho$ is measured in pixels and $\theta$ is measured in radians. The first parameter, the input image, should be a binary image, so apply a threshold or use Canny edge detection before applying the Hough transform. The second and third parameters are the $\rho$ and $\theta$ accuracies respectively. The fourth argument is the threshold, which means the minimum vote a candidate should get to be considered a line. Remember, the number of votes depends on the number of points on the line. So it represents the minimum length of line that should be detected.

In [None]:
import cv2
import numpy as np
img = cv2.imread('line.jpeg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray,50,150,apertureSize = 3)
lines = cv2.HoughLines(edges,1,np.pi/180,200)
if lines is not None:  # HoughLines returns None when no line is found
    for line in lines:  # lines has shape (N, 1, 2)
        rho, theta = line[0]
        a = np.cos(theta)
        b = np.sin(theta)
        x0 = a*rho
        y0 = b*rho
        x1 = int(x0 + 1000*(-b))
        y1 = int(y0 + 1000*(a))
        x2 = int(x0 - 1000*(-b))
        y2 = int(y0 - 1000*(a))
        img=cv2.line(img,(x1,y1),(x2,y2),(50,29,255),2)
    
cv2.imshow('image',img)
cv2.waitKey(0)
cv2.destroyAllWindows()

#### Probabilistic Hough Transform
In the Hough transform, you can see that even for a line with two parameters, it takes a lot of computation. Probabilistic Hough Transform is an optimization of the Hough Transform we saw. It doesn't take all the points into consideration; instead it takes only a random subset of points, which is sufficient for line detection. We just have to decrease the threshold.

The function used is cv2.HoughLinesP(). It has two new arguments.

minLineLength - Minimum length of line. Line segments shorter than this are rejected.

maxLineGap - Maximum allowed gap between line segments to treat them as a single line.

The best thing is that it directly returns the two endpoints of lines. In the previous case, you got only the parameters of lines, and you had to find all the points. Here, everything is direct and simple.

In [None]:
import cv2
import numpy as np

img = cv2.imread('line.jpeg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray,50,150,apertureSize = 3)
minLineLength = 100
maxLineGap = 10
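# keywords are needed here: the fifth positional argument is the optional output array
lines = cv2.HoughLinesP(edges,1,np.pi/180,100,minLineLength=minLineLength,maxLineGap=maxLineGap)
if lines is not None:  # HoughLinesP returns None when no segment is found
    for line in lines:  # lines has shape (N, 1, 4)
        x1,y1,x2,y2 = line[0]
        cv2.line(img,(x1,y1),(x2,y2),(0,255,0),2)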
cv2.imshow('image',img)
cv2.waitKey(0)
cv2.destroyAllWindows()

### Hough Circle Transform

We will learn to use the Hough Transform to find circles in an image.

We will see this function: cv2.HoughCircles()

A circle is represented mathematically as $(x-x_{center})^2 + (y - y_{center})^2 = r^2$ where $(x_{center},y_{center})$ is the center of the circle, and $r$ is the radius of the circle. From the equation, we can see we have 3 parameters, so we need a 3D accumulator for the Hough transform, which would be highly inefficient. So OpenCV uses a trickier method, the Hough Gradient Method, which uses the gradient information of edges.

The function we use here is cv2.HoughCircles().

In [None]:
import cv2
import numpy as np

img = cv2.imread('circle.png',0)
img = cv2.medianBlur(img,5)
cimg = cv2.cvtColor(img,cv2.COLOR_GRAY2BGR)

circles = cv2.HoughCircles(img,cv2.HOUGH_GRADIENT,1,20,
                            param1=50,param2=30,minRadius=0,maxRadius=0)

if circles is not None:  # HoughCircles returns None when nothing is detected
    circles = np.uint16(np.around(circles))
    for i in circles[0,:]:
        # draw the outer circle
        cv2.circle(cimg,(i[0],i[1]),i[2],(0,255,0),2)
        # draw the center of the circle
        cv2.circle(cimg,(i[0],i[1]),2,(0,0,255),3)

cv2.imshow('detected circles',cimg)
cv2.waitKey(0)
cv2.destroyAllWindows()

### Image Segmentation with Watershed Algorithm

We will learn to use marker-based image segmentation using the watershed algorithm.

We will see: cv2.watershed()

Any grayscale image can be viewed as a topographic surface where high intensity denotes peaks and hills while low intensity denotes valleys. You start filling every isolated valley (local minimum) with differently colored water (labels). As the water rises, depending on the peaks (gradients) nearby, water from different valleys, with different colors, will start to merge. To avoid that, you build barriers in the locations where water merges. You continue the work of filling water and building barriers until all the peaks are under water. Then the barriers you created give you the segmentation result. This is the "philosophy" behind the watershed.

But this approach gives you an oversegmented result due to noise or any other irregularities in the image. So OpenCV implements a marker-based watershed algorithm where you specify which valley points are to be merged and which are not. It is an interactive image segmentation. What we do is give different labels to the objects we know. Label the region which we are sure is the foreground or object with one color (or intensity), label the region which we are sure is background or non-object with another color, and finally label the region which we are not sure about with 0. That is our marker. Then apply the watershed algorithm. Our marker will then be updated with the labels we gave, and the boundaries of objects will have a value of -1.

In [None]:
import numpy as np
import cv2
from matplotlib import pyplot as plt

img = cv2.imread('coins.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
cv2.imshow('threshold',thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()
plt.imshow(thresh, cmap='gray')
plt.show()

We now have a thresholded image. Next we need to remove any small white noise in the image. For that we can use morphological opening. To remove any small holes in the object, we can use morphological closing. So, now we know for sure that the region near the center of the objects is foreground and the region far away from the objects is background. The only region we are not sure about is the boundary region of the coins.

So we need to extract the area which we are sure is coins. Erosion removes the boundary pixels. So whatever remains, we can be sure is coin. That would work if the objects were not touching each other. But since they are touching each other, another good option is to find the distance transform and apply a proper threshold. Next we need to find the area which we are sure is not coins. For that, we dilate the result. Dilation grows the object boundary into the background. This way, we can make sure that whatever region is background in the result really is background, since the boundary region is removed.

In [None]:
# noise removal
kernel = np.ones((3,3),np.uint8)
opening = cv2.morphologyEx(thresh,cv2.MORPH_OPEN,kernel, iterations = 2)

# sure background area
sure_bg = cv2.dilate(opening,kernel,iterations=3)

# Finding sure foreground area
dist_transform = cv2.distanceTransform(opening,cv2.DIST_L2,5)
ret, sure_fg = cv2.threshold(dist_transform,0.7*dist_transform.max(),255,0)

# Finding unknown region
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg,sure_fg)
plt.imshow(sure_fg, cmap='gray')
plt.show()

Now we know for sure which regions are coins and which are background. So we create the marker (an array of the same size as the original image, but with int32 datatype) and label the regions inside it. The regions we know for sure (whether foreground or background) are labelled with positive integers, a different integer for each region, and the area we don't know for sure is left as zero. For this we use cv2.connectedComponents(). It labels the background of the image with 0, then other objects are labelled with integers starting from 1.

But we know that if the background is marked with 0, watershed will consider it as unknown area. So we want to mark it with a different integer. Instead, we will mark the unknown region, defined by unknown, with 0.

In [None]:
# Marker labelling
ret, markers = cv2.connectedComponents(sure_fg)

# Add one to all labels so that sure background is not 0, but 1
markers = markers+1

# Now, mark the region of unknown with zero
markers[unknown==255] = 0

markers = cv2.watershed(img,markers)
img[markers == -1] = [255,0,0]
plt.imshow(markers)
plt.show()

### Interactive Foreground Extraction using GrabCut Algorithm

## Feature Detection and Description

### Harris Corner Detection

We will understand the concepts behind Harris Corner Detection.

We will see the functions: cv2.cornerHarris(), cv2.cornerSubPix()

The arguments of cv2.cornerHarris() are:

img - Input image. It should be grayscale and of float32 type.

blockSize - The size of the neighbourhood considered for corner detection.

ksize - Aperture parameter of the Sobel derivative used.

k - Harris detector free parameter in the equation.

In [None]:
import cv2
import numpy as np
filename = 'squre.png'
img = cv2.imread(filename)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
gray = np.float32(gray)
dst = cv2.cornerHarris(gray,2,3,0.04)
dst = cv2.dilate(dst,None)
img[dst>0.01*dst.max()]=[0,0,255]
cv2.imshow('dst',img)
cv2.waitKey(0) 
cv2.destroyAllWindows()
# matplotlib expects RGB, while OpenCV stores images as BGR
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)),plt.show()

#### Shi-Tomasi Corner Detector & Good Features to Track

We will learn about another corner detector: the Shi-Tomasi Corner Detector

We will see the function: cv2.goodFeaturesToTrack()

OpenCV has a function, cv2.goodFeaturesToTrack(). It finds the N strongest corners in the image by the Shi-Tomasi method (or Harris Corner Detection, if you specify it). As usual, the image should be a grayscale image. Then you specify the number of corners you want to find. Then you specify the quality level, a value between 0 and 1 that denotes the minimum quality a corner must have; every corner below it is rejected. Then we provide the minimum Euclidean distance between detected corners.

With all this information, the function finds corners in the image. All corners below the quality level are rejected. Then it sorts the remaining corners by quality in descending order. Then the function takes the strongest corner, throws away all nearby corners within the minimum distance, repeats this for the next strongest remaining corner, and returns the N strongest corners.
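The selection steps described above can be sketched in a few lines of plain Python. This is a simplified illustration, not OpenCV's actual implementation: the helper `select_corners` and the candidate scores are hypothetical, and the quality level is applied relative to the best score, which is how the OpenCV documentation describes the parameter.

```python
import math

def select_corners(candidates, max_corners, quality_level, min_distance):
    """Pick corners from (x, y, score) candidates: filter, sort, then thin out."""
    if not candidates:
        return []
    best = max(score for _, _, score in candidates)
    # 1. Reject everything weaker than quality_level times the best score
    strong = [c for c in candidates if c[2] >= quality_level * best]
    # 2. Strongest first
    strong.sort(key=lambda c: c[2], reverse=True)
    # 3. Keep a corner only if it is far enough from every corner kept so far
    kept = []
    for x, y, score in strong:
        if all(math.hypot(x - kx, y - ky) >= min_distance for kx, ky, _ in kept):
            kept.append((x, y, score))
        if len(kept) == max_corners:
            break
    return kept

candidates = [(10, 10, 0.90), (12, 11, 0.80), (40, 40, 0.50),
              (70, 15, 0.004), (41, 60, 0.30)]
picked = select_corners(candidates, max_corners=3, quality_level=0.01, min_distance=10)
print(picked)
```

Here (70, 15) is dropped for low quality and (12, 11) is dropped because it sits right next to the stronger corner at (10, 10).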

In [None]:
import numpy as np
import cv2
from matplotlib import pyplot as plt

img = cv2.imread('squre.png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

corners = cv2.goodFeaturesToTrack(gray,25,0.01,10)
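corners = corners.astype(int)  # np.int0 was removed in NumPy 2.0

for i in corners:
    x,y = i.ravel()
    # drawing functions expect plain Python ints for the coordinates
    cv2.circle(img,(int(x),int(y)),3,255,-1)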

plt.imshow(img),plt.show()

#### Introduction to SIFT (Scale-Invariant Feature Transform)



## Video Analysis

### Meanshift and Camshift:

We will learn about Meanshift and Camshift algorithms to find and track objects in videos.

#### Meanshift
The intuition behind meanshift is simple. Consider a set of points (it can be a pixel distribution such as a histogram backprojection). You are given a small window (maybe a circle) and you have to move that window to the area of maximum pixel density (or maximum number of points).

![image.png](attachment:c6420fcb-be8d-4feb-8d87-07e02e2e1dd9.png)
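The window-shifting idea can be sketched in plain NumPy before bringing in OpenCV. The point set, the starting position and the window radius below are all assumptions made for illustration: a dense disc of points around (50, 50) and a circular window that starts off-target.

```python
import numpy as np

# Dense cluster: every integer point within radius 10 of (50, 50)
xs, ys = np.meshgrid(np.arange(100), np.arange(100))
inside = (xs - 50) ** 2 + (ys - 50) ** 2 <= 10 ** 2
points = np.stack([xs[inside], ys[inside]], axis=1).astype(float)

center = np.array([42.0, 45.0])   # initial window centre, deliberately off-target
radius = 12.0                     # radius of the circular window

for step in range(50):
    # Points currently covered by the window
    in_window = np.linalg.norm(points - center, axis=1) <= radius
    # Move the window to the centroid of those points
    new_center = points[in_window].mean(axis=0)
    shift = np.linalg.norm(new_center - center)
    center = new_center
    if shift < 0.01:              # converged: the window stopped moving
        break

print("final window centre:", center)
```

Each iteration pulls the window towards the denser side of what it currently covers, so it drifts onto the cluster and then stays there.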

To use meanshift in OpenCV, we first need to set up the target and find its histogram so that we can backproject the target on each frame for the meanshift calculation. We also need to provide the initial location of the window. For the histogram, only Hue is considered here. Also, to avoid false values due to low light, low-light values are discarded using the cv2.inRange() function.

In [None]:
import numpy as np
import cv2

cap = cv2.VideoCapture(0)
# take first frame of the video
ret,frame = cap.read()


# setup initial location of window
r,h,c,w = 80,150,150,200  # simply hardcoded the values
track_window = (c,r,w,h)

# set up the ROI for tracking
roi = frame[r:r+h, c:c+w]

# build the hue histogram from the ROI only, not from the whole frame
hsv_roi =  cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv_roi, np.array((0., 60.,32.)), np.array((180.,255.,255.)))
roi_hist = cv2.calcHist([hsv_roi],[0],mask,[180],[0,180])
cv2.normalize(roi_hist,roi_hist,0,255,cv2.NORM_MINMAX)
# Setup the termination criteria, either 10 iterations or move by at least 1 pt
term_crit = ( cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1 )
while(1):
    ret ,frame = cap.read()

    if ret == True:
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        dst = cv2.calcBackProject([hsv],[0],roi_hist,[0,180],1)

        # apply meanshift to get the new location
        ret, track_window = cv2.meanShift(dst, track_window, term_crit)

        # Draw it on image
        x,y,w,h = track_window
        img2 = cv2.rectangle(frame, (x,y), (x+w,y+h), 255,2)
        cv2.imshow('img2',img2)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break  # no frame could be read, so stop instead of looping forever
cap.release()
cv2.destroyAllWindows()

#### Camshift

Did you closely watch the last result? There is a problem. Our window always has the same size, whether the target is far away or very close to the camera. That is not good. We need to adapt the window size to the size and rotation of the target. Once again, the solution came from “OpenCV Labs” and it is called CAMshift (Continuously Adaptive Meanshift), published by Gary Bradski in his paper “Computer Vision Face Tracking for Use in a Perceptual User Interface” in 1998.

It applies meanshift first. Once meanshift converges, it updates the size of the window as $s = 2 \times \sqrt{\frac{M_{00}}{256}}$. It also calculates the orientation of the best-fitting ellipse. It then applies meanshift again with the newly scaled search window and the previous window location. The process continues until the required accuracy is met.
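As a quick worked example of the window-size formula, take a back-projection window that contains a 20x20 region at full confidence (value 255) and nothing else. The array below is made up for illustration; $M_{00}$ is just the sum of the pixel values inside the window.

```python
import math
import numpy as np

# Hypothetical back-projection window with a 20x20 region of value 255
window = np.zeros((100, 100), np.uint8)
window[40:60, 40:60] = 255

m00 = float(window.sum())          # zeroth moment: sum of all pixel values
s = 2 * math.sqrt(m00 / 256)       # window size from the formula above

print("M00 =", m00)
print("s =", round(s, 2))
```

The result is close to 40, about twice the side of the bright region, so the updated search window comfortably covers the target.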

In [None]:
import cv2
import numpy as np

cap = cv2.VideoCapture(0)

# take first frame of the video
ret,frame = cap.read()

# setup initial location of window
r,h,c,w = 250,90,400,125  # simply hardcoded the values
track_window = (c,r,w,h)

# set up the ROI for tracking
roi = frame[r:r+h, c:c+w]
hsv_roi =  cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)  # histogram comes from the ROI only
mask = cv2.inRange(hsv_roi, np.array((0., 60.,32.)), np.array((180.,255.,255.)))
roi_hist = cv2.calcHist([hsv_roi],[0],mask,[180],[0,180])
cv2.normalize(roi_hist,roi_hist,0,255,cv2.NORM_MINMAX)

# Setup the termination criteria, either 10 iterations or move by at least 1 pt
term_crit = ( cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1 )

while(1):
    ret ,frame = cap.read()

    if ret == True:
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        dst = cv2.calcBackProject([hsv],[0],roi_hist,[0,180],1)

        # apply camshift to get the new location (ret is a rotated rectangle)
        ret, track_window = cv2.CamShift(dst, track_window, term_crit)

        # Draw it on image
        pts = cv2.boxPoints(ret)
        pts = pts.astype(np.int32)  # np.int0 was removed in NumPy 2.0
        img2 = cv2.polylines(frame,[pts],True, 255,2)
        cv2.imshow('img2',img2)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break  # no frame could be read
cap.release()
cv2.destroyAllWindows()

### Optical Flow

Optical flow is the pattern of apparent motion of image objects between two consecutive frames caused by the movement of the object or the camera. It is a 2D vector field where each vector is a displacement vector showing the movement of points from the first frame to the second.
![image.png](attachment:7f10975a-b532-42fb-a48e-e616d0fe6845.png)

It shows a ball moving in 5 consecutive frames. The arrow shows its displacement vector. Optical flow has many applications in areas like:

- Structure from Motion
- Video Compression
- Video Stabilization ...

Optical flow works on several assumptions:

- The pixel intensities of an object do not change between consecutive frames.
- Neighbouring pixels have similar motion.

#### Lucas-Kanade Optical Flow in OpenCV

OpenCV provides the Lucas-Kanade method in a single function, cv2.calcOpticalFlowPyrLK(). Here, we create a simple application which tracks some points in a video. To decide the points, we use cv2.goodFeaturesToTrack(). We take the first frame, detect some Shi-Tomasi corner points in it, then iteratively track those points using Lucas-Kanade optical flow. To cv2.calcOpticalFlowPyrLK() we pass the previous frame, the previous points and the next frame. It returns the next points along with status numbers, which have a value of 1 if the next point is found and 0 otherwise. We iteratively pass these next points as previous points in the next step.

In [None]:
import numpy as np
import cv2

cap = cv2.VideoCapture(0)

# params for ShiTomasi corner detection
feature_params = dict( maxCorners = 100,
                       qualityLevel = 0.3,
                       minDistance = 7,
                       blockSize = 7 )

# Parameters for lucas kanade optical flow
lk_params = dict( winSize  = (15,15),
                  maxLevel = 2,
                  criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))

# Create some random colors
color = np.random.randint(0,255,(100,3))

# Take first frame and find corners in it
ret, old_frame = cap.read()
old_gray = cv2.cvtColor(old_frame, cv2.COLOR_BGR2GRAY)
p0 = cv2.goodFeaturesToTrack(old_gray, mask = None, **feature_params)

# Create a mask image for drawing purposes
mask = np.zeros_like(old_frame)

while(1):
    ret,frame = cap.read()
    if not ret:
        break
    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # calculate optical flow
    p1, st, err = cv2.calcOpticalFlowPyrLK(old_gray, frame_gray, p0, None, **lk_params)

    # Select good points
    good_new = p1[st==1]
    good_old = p0[st==1]

    # draw the tracks (drawing functions need integer pixel coordinates)
    for i,(new,old) in enumerate(zip(good_new,good_old)):
        a,b = new.ravel().astype(int).tolist()
        c,d = old.ravel().astype(int).tolist()
        mask = cv2.line(mask, (a,b),(c,d), color[i].tolist(), 2)
        frame = cv2.circle(frame,(a,b),5,color[i].tolist(),-1)
    img = cv2.add(frame,mask)

    cv2.imshow('frame',img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

    # Now update the previous frame and previous points
    old_gray = frame_gray.copy()
    p0 = good_new.reshape(-1,1,2)
cap.release()
cv2.destroyAllWindows()

#### Dense Optical Flow in OpenCV

The Lucas-Kanade method computes optical flow for a sparse feature set (in our example, corners detected using the Shi-Tomasi algorithm). OpenCV provides another algorithm to find the dense optical flow. It computes the optical flow for all the points in the frame. It is based on Gunnar Farnebäck’s algorithm, which is explained in “Two-Frame Motion Estimation Based on Polynomial Expansion” by Gunnar Farnebäck in 2003.

The sample below shows how to find the dense optical flow using the above algorithm. We get a 2-channel array with optical flow vectors, (u,v). We find their magnitude and direction and color code the result for better visualization. Direction corresponds to the Hue value of the image. Magnitude corresponds to the Value plane. See the code below:
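The colour coding is easier to read with a few known vectors. The sketch below uses NumPy's arctan2 and hypot in place of cv2.cartToPolar() (a substitution made so the numbers come out exact) on a made-up 2x2 flow field, and applies the same angle-to-hue conversion as the cell that follows.

```python
import numpy as np

# A 2x2 flow field; each pixel holds one (u, v) vector
flow = np.array([[[1.0, 0.0], [0.0, 1.0]],
                 [[-1.0, 0.0], [0.0, -1.0]]])

u, v = flow[..., 0], flow[..., 1]
mag = np.hypot(u, v)                      # length of each vector
ang = np.arctan2(v, u) % (2 * np.pi)      # direction in radians, 0 .. 2*pi

# OpenCV stores Hue for 8-bit images as 0..179, i.e. degrees divided by 2
hue = ang * 180 / np.pi / 2

print(hue)
```

A vector pointing along +x gets hue 0, +y gets 45, -x gets 90 and -y gets 135, so opposite directions end up with clearly different colours.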

In [None]:
import cv2
import numpy as np
cap = cv2.VideoCapture(0)

ret, frame1 = cap.read()
prvs = cv2.cvtColor(frame1,cv2.COLOR_BGR2GRAY)
hsv = np.zeros_like(frame1)
hsv[...,1] = 255

while(1):
    ret, frame2 = cap.read()
    if not ret:
        break
    # named nxt because `next` would shadow a Python builtin
    nxt = cv2.cvtColor(frame2,cv2.COLOR_BGR2GRAY)

    flow = cv2.calcOpticalFlowFarneback(prvs,nxt, None, 0.5, 3, 15, 3, 5, 1.2, 0)

    mag, ang = cv2.cartToPolar(flow[...,0], flow[...,1])
    hsv[...,0] = ang*180/np.pi/2
    hsv[...,2] = cv2.normalize(mag,None,0,255,cv2.NORM_MINMAX)
    rgb = cv2.cvtColor(hsv,cv2.COLOR_HSV2BGR)

    cv2.imshow('frame2',rgb)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    # the current frame becomes the previous frame for the next iteration
    prvs = nxt
cap.release()
cv2.destroyAllWindows()

### Background Subtraction

If you have an image of the background alone, like an image of the room without visitors or of the road without vehicles, it is an easy job. Just subtract the new image from the background and you get the foreground objects alone. But in most cases you may not have such an image, so we need to extract the background from whatever images we have. It becomes more complicated when there are shadows of the vehicles. Since a shadow also moves, simple subtraction will mark it as foreground too.

#### BackgroundSubtractorMOG

It is a Gaussian Mixture-based Background/Foreground Segmentation Algorithm. It was introduced in the paper “An improved adaptive background mixture model for real-time tracking with shadow detection” by P. KaewTraKulPong and R. Bowden in 2001. It models each background pixel by a mixture of K Gaussian distributions (K = 3 to 5). The weights of the mixture represent the proportions of time that those colours stay in the scene. The probable background colours are the ones which stay longer and are more static.

While coding, we need to create a background subtractor object using the function cv2.bgsegm.createBackgroundSubtractorMOG() (in OpenCV 3 and later it lives in the bgsegm module, which ships with the opencv-contrib-python package). It has some optional parameters such as the length of history, the number of Gaussian mixtures, the threshold etc., all set to default values. Then, inside the video loop, use the backgroundsubtractor.apply() method to get the foreground mask.

In [None]:
import numpy as np
import cv2

cap = cv2.VideoCapture(0)

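# MOG lives in the bgsegm module (requires the opencv-contrib-python package)
fgbg = cv2.bgsegm.createBackgroundSubtractorMOG()
while(1):
    ret, frame = cap.read()
    if not ret:
        break

    fgmask = fgbg.apply(frame)

    cv2.imshow('frame',fgmask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break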
cap.release()
cv2.destroyAllWindows()
    

#### BackgroundSubtractorMOG2
It is also a Gaussian Mixture-based Background/Foreground Segmentation Algorithm. It is based on two papers by Z. Zivkovic, “Improved adaptive Gaussian mixture model for background subtraction” in 2004 and “Efficient Adaptive Density Estimation per Image Pixel for the Task of Background Subtraction” in 2006. One important feature of this algorithm is that it selects the appropriate number of Gaussian distributions for each pixel. (Remember, in the last case, we used K Gaussian distributions throughout the algorithm.) It provides better adaptability to scenes that vary due to illumination changes etc.

In [None]:
import numpy as np
import cv2

cap = cv2.VideoCapture(0)

fgbg = cv2.createBackgroundSubtractorMOG2()

while(1):
    ret, frame = cap.read()
    if not ret:
        break

    fgmask = fgbg.apply(frame)

    cv2.imshow('frame',fgmask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()

#### BackgroundSubtractorGMG
It uses the first few (120 by default) frames for background modelling. It employs a probabilistic foreground segmentation algorithm that identifies possible foreground objects using Bayesian inference. The estimates are adaptive; newer observations are weighted more heavily than old observations to accommodate variable illumination. Several morphological filtering operations like closing and opening are done to remove unwanted noise. You will get a black window during the first few frames.

In [None]:
import numpy as np
import cv2

cap = cv2.VideoCapture(0)

kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(3,3))
# GMG lives in the bgsegm module (requires the opencv-contrib-python package)
fgbg = cv2.bgsegm.createBackgroundSubtractorGMG()

while(1):
    ret, frame = cap.read()
    if not ret:
        break

    fgmask = fgbg.apply(frame)
    fgmask = cv2.morphologyEx(fgmask, cv2.MORPH_OPEN, kernel)

    cv2.imshow('frame',fgmask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()

### Camera Calibration and 3D Reconstruction
#### Camera Calibration
Today’s cheap pinhole cameras introduce a lot of distortion to images. The two major distortions are radial distortion and tangential distortion.

Due to radial distortion, straight lines appear curved. The effect grows as we move away from the center of the image. For example, one image is shown below, where two edges of a chess board are marked with red lines. You can see that the border is not a straight line and does not match the red line. All the expected straight lines bulge out.

In [None]:
import numpy as np
import cv2
import glob

# termination criteria
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)

# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(6,5,0)
objp = np.zeros((6*7,3), np.float32)
objp[:,:2] = np.mgrid[0:7,0:6].T.reshape(-1,2)

# Arrays to store object points and image points from all the images.
objpoints = [] # 3d point in real world space
imgpoints = [] # 2d points in image plane.

images = glob.glob('chessboard.jpg')

for fname in images:
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

    # Find the chess board corners
    ret, corners = cv2.findChessboardCorners(gray, (7,6),None)

    # If found, add object points, image points (after refining them)
    if ret == True:
        objpoints.append(objp)
        corners2 = cv2.cornerSubPix(gray,corners,(11,11),(-1,-1),criteria)
        imgpoints.append(corners2)

        # Draw and display the corners
        img = cv2.drawChessboardCorners(img, (7,6), corners2,ret)
        cv2.imshow('img',img)
        cv2.waitKey(500)

cv2.destroyAllWindows()


#### Pose Estimation

We will learn to exploit the calib3d module to create some 3D effects in images.

This is going to be a small section. In the previous section on camera calibration, you found the camera matrix, distortion coefficients etc. Given a pattern image, we can use that information to calculate its pose, that is, how the object is situated in space: how it is rotated, how it is displaced and so on. For a planar object, we can assume Z=0, so the problem becomes how the camera is placed in space to see our pattern image. So, if we know how the object lies in space, we can draw some 2D diagrams on it to simulate a 3D effect. Let’s see how to do it.

Our problem is that we want to draw our 3D coordinate axes (X, Y, Z) on our chessboard’s first corner, with the X axis in blue, the Y axis in green and the Z axis in red. In effect, the Z axis should look like it is perpendicular to our chessboard plane.

### Machine Learning

### Human pose estimation using OpenCV and OpenPose

### Computational Photography

#### Image Denoising: 

We will see different functions like cv2.fastNlMeansDenoising() and cv2.fastNlMeansDenoisingColored().

In earlier chapters, we have seen many image smoothing techniques like Gaussian Blurring, Median Blurring etc., and they were good to some extent at removing small quantities of noise. In those techniques, we took a small neighbourhood around a pixel and did some operation, like a Gaussian weighted average or the median of the values, to replace the central element. In short, noise removal at a pixel was local to its neighbourhood.

OpenCV provides four non-local means denoising functions:

cv2.fastNlMeansDenoising() - works with a single grayscale image

cv2.fastNlMeansDenoisingColored() - works with a color image.

cv2.fastNlMeansDenoisingMulti() - works with image sequence captured in short period of time (grayscale images)

cv2.fastNlMeansDenoisingColoredMulti() - same as above, but for color images.

Common arguments are:

h : parameter deciding filter strength. Higher h value removes noise better, but removes details of image also. (10 is ok)

hForColorComponents : same as h, but for color images only. (normally same as h)

templateWindowSize : should be odd. (recommended 7)

searchWindowSize : should be odd. (recommended 21)

#### 1. cv2.fastNlMeansDenoisingColored()

In [None]:
import numpy as np
import cv2
from matplotlib import pyplot as plt

img = cv2.imread('hero.jpg')

dst = cv2.fastNlMeansDenoisingColored(img,None,10,10,7,21)

# imread returns BGR; convert to RGB so matplotlib shows the right colours
plt.subplot(121),plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.subplot(122),plt.imshow(cv2.cvtColor(dst, cv2.COLOR_BGR2RGB))
plt.show()

In [None]:
img.shape

#### 2. cv2.fastNlMeansDenoisingMulti()

In [None]:
import numpy as np
import cv2
from matplotlib import pyplot as plt

cap = cv2.VideoCapture(0)

# create a list of first 5 frames, then free the camera
img = [cap.read()[1] for i in range(5)]
cap.release()

# convert all to grayscale
gray = [cv2.cvtColor(i, cv2.COLOR_BGR2GRAY) for i in img]

# convert all to float64
gray = [np.float64(i) for i in gray]

# create Gaussian noise with standard deviation 10 (variance 100)
noise = np.random.randn(*gray[1].shape)*10

# Add this noise to images
noisy = [i+noise for i in gray]

# Convert back to uint8
noisy = [np.uint8(np.clip(i,0,255)) for i in noisy]

# Denoise 3rd frame considering all the 5 frames
dst = cv2.fastNlMeansDenoisingMulti(noisy, 2, 5, None, 4, 7, 35)
figure = plt.figure(figsize = (14,8))
plt.subplot(131),plt.imshow(gray[2],'gray')
plt.subplot(132),plt.imshow(noisy[2],'gray')
plt.subplot(133),plt.imshow(dst,'gray')
plt.show()

### Image Inpainting

We will learn how to remove small noise, strokes etc. in old photographs by a method called inpainting.

We will see the inpainting functionality in OpenCV.

Most of you will have some old degraded photos at home with black spots, strokes etc. on them. Have you ever thought of restoring them? We can’t simply erase the marks in a paint tool, because that will just replace black structures with white structures, which is of no use. In these cases, a technique called image inpainting is used. The basic idea is simple: replace those bad marks with their neighbouring pixels so that the patch looks like its neighbourhood.

Several algorithms were designed for this purpose and OpenCV provides two of them. Both can be accessed through the same function, cv2.inpaint().

In [None]:
import numpy as np
import cv2

img = cv2.imread('hero1.png')
mask = cv2.imread('hero 2.png',0)

 
dst = cv2.inpaint(img,mask,3,cv2.INPAINT_TELEA)

cv2.imshow('dst',dst)
cv2.waitKey(0)
cv2.destroyAllWindows()

### Face and Eye Detection using HAAR Cascade classifiers

Link: https://docs.opencv.org/3.4/db/d28/tutorial_cascade_classifier.html

In [None]:
import numpy as np
import cv2

In [None]:
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier('haarcascade_eye.xml')

img = cv2.imread('hero.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, 1.3, 5)
for (x,y,w,h) in faces:
    img = cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)
    roi_gray = gray[y:y+h, x:x+w]
    roi_color = img[y:y+h, x:x+w]
    eyes = eye_cascade.detectMultiScale(roi_gray)
    for (ex,ey,ew,eh) in eyes:
        cv2.rectangle(roi_color,(ex,ey),(ex+ew,ey+eh),(0,255,0),2)

cv2.imshow('img',img)
cv2.waitKey(0)
cv2.destroyAllWindows()

## Face and Eye Detection from videos

In [None]:
import cv2
import numpy as np

In [None]:
# load the cascade

face_cascade = cv2.CascadeClassifier('Haarcascades/haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier('Haarcascades/haarcascade_eye.xml')

In [None]:
def detect(gray, frame):
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    for(x, y, w, h) in faces:
        cv2.rectangle(frame, (x,y), (x+w, y+h),(255, 0, 0), 2)
        rol_gray = gray[y:y+h, x:x+w]
        rol_color = frame[y:y+h, x:x+w]
        eyes = eye_cascade.detectMultiScale(rol_gray, 1.1, 3)
        for (ex, ey, ew, eh) in eyes:
            cv2.rectangle(rol_color, (ex, ey),(ex+ew, ey+eh), (0,255,0), 2)
    return frame

# doing face detection with the webcam
video_capture = cv2.VideoCapture(0)
while True:
    ret, frame = video_capture.read()
    if not ret:
        break  # no frame could be read
    gray = cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)
    canvas = detect(gray, frame)
    cv2.imshow('video', canvas)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
        
video_capture.release()
cv2.destroyAllWindows()

### Capture and draw a rectangle from the webcam and sketch it on a live feed

In [None]:
import cv2
import numpy as np
from matplotlib import pyplot as plt

In [None]:
def sketch_transform(image):
    image_grayscale = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    image_grayscale_blurred = cv2.GaussianBlur(image_grayscale, (7,7), 0)
    image_canny = cv2.Canny(image_grayscale_blurred, 10,80)
    # invert the edge map: edges become black lines on a white background
    _, mask = cv2.threshold(image_canny, 30, 255, cv2.THRESH_BINARY_INV)
    return mask

In [None]:
cam_capture = cv2.VideoCapture(0)

# grab one frame and let the user draw the region of interest on it
_, im0 = cam_capture.read()
showCrosshair = False
fromCenter =  False
r = cv2.selectROI('image', im0, showCrosshair=showCrosshair, fromCenter=fromCenter)

while True:
    _, image_frame = cam_capture.read()
    
    rect_img = image_frame[int(r[1]):int(r[1]+r[3]), int(r[0]):int(r[0]+r[2])]
    
    sketcher_rect = rect_img
    sketcher_rect = sketch_transform(sketcher_rect)
    
    sketcher_rect_rgb = cv2.cvtColor(sketcher_rect, cv2.COLOR_GRAY2RGB)
    
    image_frame[int(r[1]):int(r[1]+r[3]), int(r[0]):int(r[0]+r[2])] = sketcher_rect_rgb
    
    cv2.imshow('sketcher ROI', image_frame)
    
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cam_capture.release()
cv2.destroyAllWindows()

#### Read text from images

In [None]:
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\USER\AppData\Local\Tesseract-OCR\tesseract.exe'

In [None]:
img = cv2.imread('words.jpg')
text = pytesseract.image_to_string(img)
print(text)

# OpenCV end-to-end project

## Emergency vehicle detection

In [None]:
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import cv2

In [None]:
catogories = ['']