# OpenCV


[OpenCV](https://opencv.org/) is a free and open-source image processing and computer vision library. It has over 2500 optimized algorithms written in C++ and provides Python wrappers, so it can be used from your Python programs. `opencv-python` is the Python package that contains pre-built OpenCV with dependencies and Python bindings.

### OpenCV-Python Installation

First install the [Anaconda](https://www.anaconda.com/products/distribution) or [Miniconda](https://docs.conda.io/en/main/miniconda.html) distribution of Python.


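It is recommended to install OpenCV in a separate virtual environment.

To create a new conda environment and activate it, issue the following commands at the Anaconda prompt (including `python` in the first command ensures the environment gets its own interpreter and `pip`):

`conda create --name myenv python`

`conda activate myenv`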

More on managing conda environments [here](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html).


Then install OpenCV with the command `pip install opencv-python`.


The OpenCV-Python tutorial is available [here](https://docs.opencv.org/4.x/d6/d00/tutorial_py_root.html).

#### Importing OpenCV

In [None]:
import cv2


OpenCV's Python module is called `cv2` even though we are using OpenCV 4.x and not OpenCV 2.x. Historically, OpenCV had two Python modules: `cv2` and `cv`. The latter wrapped a legacy version of OpenCV implemented in C. Nowadays, OpenCV has only the `cv2` Python module, which wraps the current version of OpenCV implemented in C++.

In [None]:
cv2.__version__

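Load an image using `cv2.imread()`. Many other Python libraries also provide image-reading functions, for example `matplotlib.pyplot.imread`, `skimage.io.imread` and Pillow's `Image.open`. Note that `cv2.imread()` returns `None` instead of raising an error if the file cannot be read.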

In [None]:
img = cv2.imread('data/logo.png')

In [None]:
type(img)

Get the dimensions of the image:

In [None]:
img.shape

In [None]:
img.size #total number of elements

In [None]:
img.dtype

In [None]:
img.flags

Display the image using `cv2.imshow`:

In [None]:
cv2.imshow("Logo image", img)

# cv2.waitKey() is a keyboard binding function.
# Its argument is the number of milliseconds to wait for keyboard input. The
# default, 0, is a special value meaning wait indefinitely. The return value is either -1
# (meaning that no key has been pressed) or a key code, such as 27 for Esc.
# waitKey only captures input when an OpenCV window has focus.
cv2.waitKey(10000)

cv2.destroyWindow('Logo image') 

Now display the image using `matplotlib`:

In [None]:
import matplotlib.pyplot as plt
plt.imshow(img);

What happened? <br>
For historical reasons, OpenCV stores color images in BGR channel order instead of the usual RGB. Matplotlib expects RGB, so the red and blue channels appear swapped.

OpenCV implements a large number of conversions between color models, all available through `cv2.cvtColor()`. We can convert the BGR image to RGB:

In [None]:
img_rgb=cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img_rgb);
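
The BGR-to-RGB conversion can also be pictured as reversing the channel axis of the array. Here is a minimal NumPy-only sketch on a hypothetical 1x2 image (the pixel values are made up for illustration); it is not a replacement for `cv2.cvtColor()`:

```python
import numpy as np

# Hypothetical 1x2 image stored in OpenCV's BGR order:
# the first pixel is pure blue, the second pixel is pure red.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last (channel) axis turns (B, G, R) into (R, G, B).
rgb = bgr[:, :, ::-1]

print(rgb[0, 0])  # the blue pixel, now in RGB order
print(rgb[0, 1])  # the red pixel, now in RGB order
```

The image shape is unchanged; only the order of the values along the last axis differs.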

In [None]:
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
plt.imshow(img_gray)
plt.colorbar();  # the default color map is 'viridis'

In [None]:
img_gray.shape

In [None]:
plt.imshow(img_gray, cmap='gray');
plt.colorbar();
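
For intuition, the grayscale conversion is a weighted sum of the three channels. The sketch below assumes the standard luma weights (0.299 R + 0.587 G + 0.114 B) that the OpenCV documentation lists for RGB-to-gray conversion. The helper name `bgr_to_gray` is hypothetical, and its rounding may differ from `cv2.cvtColor()` by a gray level:

```python
import numpy as np

def bgr_to_gray(bgr):
    """Approximate grayscale conversion of a uint8 BGR image (illustrative only)."""
    weights = np.array([0.114, 0.587, 0.299])  # weights in B, G, R order
    gray = bgr.astype(np.float64) @ weights     # weighted sum over the channel axis
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

# Hypothetical 1x3 image: one blue, one green and one red pixel (BGR order).
pixels = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
gray = bgr_to_gray(pixels)
print(gray.shape)  # the channel axis is gone: (1, 3)
print(gray)
```

Green contributes the most to the gray value and blue the least, which is why the three pure colors map to different gray levels.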

In [None]:
cv2.imshow("gray image", img_gray)
cv2.waitKey(0)
cv2.destroyAllWindows() 

Let us create a random image:

In [None]:
import numpy as np
# uint8 keeps the values in the 8-bit range that cv2.imwrite() expects
rim = np.random.randint(0, 256, (200, 300), dtype=np.uint8)
plt.imshow(rim);

We can save images using `cv2.imwrite()`:

In [None]:
cv2.imwrite('data/rand_im.jpg', rim)

OpenCV supports a number of image formats, such as JPEG, PNG, BMP and TIFF. The format is chosen from the file extension:

In [None]:
cv2.imwrite('data/rand_im.bmp', rim) 

### Image operations as NumPy array operations
Let us draw a black cross over the random image:

In [None]:
rim[75:125, :] = 0
rim[:, 100:200] = 0
plt.imshow(rim);

**Exercise**: Draw the letter H in red on a blue background in a 5x4 RGB image.

In [None]:
#[] 
h= np.zeros((5,4,3), dtype=np.uint8)
plt.imshow(h)
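
One possible solution to the exercise, as a sketch. It assumes the image will be shown with `plt.imshow`, which expects RGB order, so red is `(255, 0, 0)` and blue is `(0, 0, 255)`:

```python
import numpy as np

# 5 rows x 4 columns, 3 channels in RGB order (as expected by plt.imshow).
h = np.zeros((5, 4, 3), dtype=np.uint8)
h[:, :] = (0, 0, 255)      # blue background
h[:, 0] = (255, 0, 0)      # left vertical bar of the H
h[:, 3] = (255, 0, 0)      # right vertical bar of the H
h[2, 1:3] = (255, 0, 0)    # horizontal bar in the middle row
```

Passing `h` to `plt.imshow(h)` displays the result.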

What if `cv2.imshow()` is used to display `h`?

In [None]:
cv2.imshow("h", h) #image is too small
cv2.waitKey()
cv2.destroyAllWindows()

In [None]:
# Custom window
cv2.namedWindow('custom window', cv2.WINDOW_NORMAL ) # WINDOW_NORMAL enables you to resize the window
cv2.imshow('custom window', h)
cv2.waitKey()
cv2.destroyAllWindows()

Sometimes you need to work with a particular region of an image. This can be done with NumPy slicing. Here, the 50x50 region at the top-left of logo.png is copied to the bottom-right corner:

In [None]:
im=cv2.imread('data/logo.png')
im[-50:, -50:, :] = im[:50, :50, :]
plt.imshow(im)


In the above cell, we are using `cv2.imread()` and `plt.imshow()`. Hence B and R channels are reversed in the displayed image. How can you reverse the R and B channels using array operations on `im` so that `plt.imshow` shows the correct colors? 

In [None]:
#Try this
b = im[:, :, 0]
r = im[:, :, 2]

im[:, :, 0] = r
im[:, :, 2] = b
plt.imshow(im);

Didn't work as expected! What went wrong?
In NumPy, a slice of an array is a view into the same data, not a copy (unlike in MATLAB).
`b` is a view into `im[:, :, 0]`, so when `im[:, :, 0]` is modified, `b` changes too, and both channels end up holding the red values.
If we want a copy, we have to use the `copy` method:

In [None]:
im = cv2.imread('data/logo.png')  # reload: the previous cell overwrote the blue channel
b = im[:, :, 0].copy()
r = im[:, :, 2]
im[:, :, 0] = r
im[:, :, 2] = b
plt.imshow(im);
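
The difference between a view and a copy can be checked on a small array. This is a minimal sketch with made-up values; the last part shows one alternative way to swap channels in a single statement, using index lists instead of slices:

```python
import numpy as np

a = np.arange(6, dtype=np.uint8).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

view = a[:, 0]          # basic slicing returns a view of the same memory
copy = a[:, 0].copy()   # .copy() allocates independent storage

a[:, 0] = 99            # modify the original array

print(np.shares_memory(a, view))  # True
print(np.shares_memory(a, copy))  # False
print(view, copy)                 # the view changed, the copy did not

# Swapping two channels in one step: the list-indexed right-hand side is
# evaluated into a temporary copy before the assignment takes place.
pixel = np.array([[[1, 2, 3]]], dtype=np.uint8)   # hypothetical 1x1 BGR image
pixel[:, :, [0, 2]] = pixel[:, :, [2, 0]]
print(pixel)
```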

### Arithmetic with Images
OpenCV performs *saturation arithmetic* on images (results are clipped to the valid range), as opposed to the *modular arithmetic* (wrap-around) done by NumPy:

In [None]:
x = np.array([[250, 100], [250, 100]], dtype=np.uint8)
y = np.array([[10, 10],[10,10]], dtype=np.uint8)
x + y #Numpy addition

In [None]:
cv2.add(x, y) #OpenCV addition-which is what we normally need with images
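
The two behaviours can be compared without OpenCV. The following sketch emulates saturating addition in NumPy by widening the type, adding, and clipping. It only illustrates the idea and is not how `cv2.add()` is implemented:

```python
import numpy as np

x = np.array([[250, 100], [250, 100]], dtype=np.uint8)
y = np.array([[10, 10], [10, 10]], dtype=np.uint8)

# NumPy uint8 addition wraps around modulo 256: 250 + 10 -> 4.
wrapped = x + y

# Saturating addition emulated in NumPy: widen, add, clip to [0, 255].
saturated = np.clip(x.astype(np.int16) + y.astype(np.int16), 0, 255).astype(np.uint8)

print(wrapped)
print(saturated)
```

With wrap-around, a bright pixel suddenly becomes almost black, which is rarely what we want when brightening an image.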


In [None]:
img = cv2.imread('data/lena.jpg')

# Convert BGR image to RGB:
img_RGB = img[:, :, ::-1]
plt.imshow(img_RGB);

Add 60 to the image:

In [None]:
M = np.full(img.shape, 60, dtype=np.uint8)
img_add = cv2.add(img, M)
plt.imshow(img_add[:, :, ::-1]);

**Exercise**: Subtract 100 from all channels in all pixels in `img` using `cv2.subtract()` and display using `plt.imshow`:

In [None]:
M = np.full(img.shape, 100, dtype=np.uint8)
img_sub = cv2.subtract(img,M)
plt.imshow(img_sub[:, :, ::-1]);

### Image Blending
Image blending is also image addition, but different weights are given to the two images.

The `cv2.addWeighted()` function used below is also commonly used to combine the outputs of the Sobel operator. The Sobel operator is used for edge detection: it creates an image emphasizing edges. It uses two 3 × 3 kernels, which are convolved with the original image to approximate the derivatives, capturing both horizontal and vertical changes.

In [None]:
img1 = cv2.imread('data/pic1.jpg')
img2 = cv2.imread('data/pic2.jpg')

# alpha = 0.3, beta = 0.7 = 1 - 0.3; make sure the weights add up to 1 if you want to conserve brightness
blended = cv2.addWeighted(img1, 0.3, img2, 0.7, 0)

plt.figure(figsize=(10,30))
plt.subplot(1,3,1)
plt.imshow(img1[:, :, ::-1])
plt.subplot(1,3,2)
plt.imshow(img2[:, :, ::-1])
plt.subplot(1,3,3)
plt.imshow(blended[:, :, ::-1]);
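
According to the OpenCV documentation, `cv2.addWeighted(src1, alpha, src2, beta, gamma)` computes `src1*alpha + src2*beta + gamma` with saturation. Here is a NumPy sketch of that formula. The helper name `blend` and the test images are hypothetical, and the rounding may not match OpenCV exactly:

```python
import numpy as np

def blend(src1, alpha, src2, beta, gamma=0.0):
    """Weighted sum of two uint8 images, clipped to [0, 255] (illustrative only)."""
    out = src1.astype(np.float64) * alpha + src2.astype(np.float64) * beta + gamma
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

a = np.full((2, 2, 3), 200, dtype=np.uint8)  # hypothetical bright image
b = np.full((2, 2, 3), 100, dtype=np.uint8)  # hypothetical darker image

print(blend(a, 0.3, b, 0.7)[0, 0])  # 0.3*200 + 0.7*100 = 130 per channel
print(blend(a, 1.0, b, 1.0)[0, 0])  # 300 saturates to 255
```

When the weights add up to 1, the result stays between the two inputs. Larger weights push bright regions into saturation.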

### Image Filtering
The `cv2.GaussianBlur()` function blurs an image using a Gaussian kernel:

In [None]:
baboon = plt.imread('data/baboon.jpg')

# GaussianBlur(src, ksize, sigmaX, sigmaY, ...)
# when sigmaX=0, it is computed from the kernel size
babblur = cv2.GaussianBlur(baboon,(29,29),0)

plt.subplot(121)
plt.imshow(baboon)
plt.subplot(122)
plt.imshow(babblur);
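
To see what "computed from the kernel size" means, the sketch below builds a 1D Gaussian kernel in NumPy. It assumes the formula given in the documentation of `cv2.getGaussianKernel` for non-positive sigma, `sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8`. The function names are hypothetical and the result is not guaranteed to be bit-exact with OpenCV, which uses fixed coefficients for some small kernels:

```python
import numpy as np

def sigma_from_ksize(ksize):
    """Sigma assumed when sigma <= 0 (formula from the getGaussianKernel docs)."""
    return 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8

def gaussian_kernel_1d(ksize, sigma=0.0):
    """Normalized 1D Gaussian kernel (illustrative only)."""
    if sigma <= 0:
        sigma = sigma_from_ksize(ksize)
    x = np.arange(ksize) - (ksize - 1) / 2.0   # sample positions centred on 0
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()                          # weights sum to 1

k = gaussian_kernel_1d(29)
print(sigma_from_ksize(29))   # about 4.7 for the 29x29 kernel used above
print(k.sum())

# The Gaussian is separable, so the 2D kernel is the outer product of two 1D kernels.
k2d = np.outer(k, k)
print(k2d.shape)
```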

The `cv2.filter2D()` function can be used to apply an arbitrary kernel to an
image. Strictly speaking, it computes a correlation rather than a convolution; for symmetric kernels such as the one below, the two are identical:

In [None]:
# custom kernel; a simple box-car in this case
kernel = np.ones((15, 15))
kernel /= kernel.size  # normalize the kernel so as not to scale the image intensity

# the argument -1 is ddepth=-1: the output image has the same depth as the source (uint8)
# each channel is processed independently
babblur2 = cv2.filter2D(baboon, -1, kernel)

plt.subplot(121)
plt.imshow(baboon)
plt.subplot(122)
plt.imshow(babblur2);
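
What the filter computes for a single channel can be written out as a naive loop. This is a slow, illustrative sketch for odd-sized kernels. The function name `correlate2d_same` is hypothetical, and the border handling assumes OpenCV's default mode (`BORDER_REFLECT_101`), which corresponds to `mode="reflect"` in `np.pad`:

```python
import numpy as np

def correlate2d_same(image, kernel):
    """Naive single-channel correlation with reflected borders (illustrative only)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    # mode="reflect" mirrors the image without repeating the edge pixel.
    padded = np.pad(image.astype(np.float64), ((ph, ph), (pw, pw)), mode="reflect")
    out = np.zeros(image.shape, dtype=np.float64)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # Multiply the neighbourhood by the kernel and sum the products.
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

box = np.ones((3, 3)) / 9.0                 # small normalized box kernel
flat = np.full((5, 5), 80, dtype=np.uint8)  # hypothetical constant image
smoothed = correlate2d_same(flat, box)
print(smoothed)
```

Because the kernel is normalized, a constant image is left unchanged, which is the reason for dividing by `kernel.size`.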

Other smoothing filters, such as the median blur and the bilateral filter, are also available. See the [tutorial](https://docs.opencv.org/4.x/d4/d13/tutorial_py_filtering.html).

### Capturing camera frames
A `cv2.VideoCapture` object allows you to capture video from different sources, such as cameras, video files and image sequences. When capturing frames from a camera connected to your computer, you have to give the camera index as the argument:

In [None]:
capture = cv2.VideoCapture(0) #calling the constructor, 0 is the camera index
# Get some properties of VideoCapture using get() method
frame_width = capture.get(cv2.CAP_PROP_FRAME_WIDTH)
frame_height = capture.get(cv2.CAP_PROP_FRAME_HEIGHT)
fps = capture.get(cv2.CAP_PROP_FPS)

# Print these values:
print(f"CAP_PROP_FRAME_WIDTH: {frame_width}")
print(f"CAP_PROP_FRAME_HEIGHT: {frame_height}")
print(f"CAP_PROP_FPS: {fps}")

ret, frame = capture.read()
while ret:
    cv2.imshow('Input frame from the camera', frame)
    # Capture frame-by-frame from the camera
    ret, frame = capture.read()

    if cv2.waitKey(1) == ord('q'):
        break
 
 
# Release everything:
capture.release()
cv2.destroyAllWindows()

## Deep Learning
Deep Learning has changed Computer Vision forever!

[TensorFlow](https://www.tensorflow.org/) is an open-source deep learning platform developed by Google.

[TensorFlow Hub](https://tfhub.dev) is a repository of pretrained models curated by Google.

### Image Classification using a pre-trained model from TensorFlow Hub



In [None]:
import tensorflow as tf
import tensorflow_hub as hub


In [None]:
model_handle='https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet1k_s/classification/2'

In [None]:
classifier = hub.load(model_handle)
tf.saved_model.save(classifier, './model') #we can later use classifier=tf.saved_model.load('./model')

Let us input an image to the classifier:

In [None]:
image = cv2.imread('./data/tiger.jpg')[...,::-1]
image.dtype

In [None]:
# Use `convert_image_dtype` to convert to floats in the [0,1] range.
image = tf.image.convert_image_dtype(image, tf.float32)
image.dtype

In [None]:
image.shape

In [None]:
# reshape into shape [batch_size, height, width, num_channels]
image = tf.reshape(image, [1, image.shape[0], image.shape[1], image.shape[2]])
image.shape
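
The two preprocessing steps above (scaling to floats and adding a batch dimension) can be mirrored in plain NumPy. This is a sketch on a hypothetical random image; it assumes that converting `uint8` to `float32` amounts to dividing by 255:

```python
import numpy as np

# Hypothetical 4x6 RGB image with uint8 pixel values.
image = np.random.randint(0, 256, (4, 6, 3), dtype=np.uint8)

# Scale to float32 in the [0, 1] range.
image_f = image.astype(np.float32) / 255.0

# Add a leading batch axis: (height, width, channels) -> (1, height, width, channels).
batch = image_f[np.newaxis, ...]

print(batch.dtype, batch.shape)
```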

In [None]:
# Although the TF Hub documentation page says that the input shape should be 384x384, the model is seen to accept any shape.
# Also, the number of outputs is seen to be 1000, instead of 1001 as mentioned in the documentation.
# Some models (e.g. MobileNet) need the input to be resized. We can use tf.image.resize() for resizing.
# To resize without changing the aspect ratio, use tf.image.resize_with_pad() or tf.image.resize_with_crop_or_pad()
output = classifier(image)
output.shape

In [None]:
probs=tf.nn.softmax(output)
probs.shape


In [None]:
probs.numpy().round(3)
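
The softmax turns the raw model outputs (logits) into probabilities that sum to 1. Here is a minimal NumPy sketch of the same computation; the helper name `softmax` and the three example scores are hypothetical:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # avoid overflow in exp
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

logits = np.array([[2.0, 1.0, 0.1]])  # hypothetical scores for three classes
probs = softmax(logits)
print(probs.round(3))
print(probs.sum())
```

Subtracting the maximum does not change the result, but it keeps `np.exp` from overflowing when the logits are large.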

The labels.txt file contains the 1000 class labels of ImageNet. Some models include an additional 'background' class in the predictions.

In [None]:
# The with statement creates a context manager that ensures the file is closed as soon as it is no longer needed.
with open('./data/labels.txt') as f:
    labels = f.readlines()
classes = [l.strip() for l in labels]

In [None]:
top_5 = tf.argsort(probs, axis=-1, direction="DESCENDING")[0][:5].numpy()

for i in top_5:
    print(classes[i])
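
The same top-k selection can be done with NumPy alone. This is a small sketch with hypothetical probabilities and labels:

```python
import numpy as np

probs = np.array([[0.05, 0.60, 0.10, 0.25]])   # hypothetical probabilities
classes = ["cat", "tiger", "dog", "lion"]      # hypothetical labels

# np.argsort sorts in ascending order, so reverse it and keep the first k indices.
k = 2
top_k = np.argsort(probs[0])[::-1][:k]

for i in top_k:
    print(classes[i], probs[0, i])
```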

To learn how to apply transfer learning to models from TensorFlow Hub, see [this tutorial](https://www.tensorflow.org/hub/tutorials/tf2_image_retraining).

#### Object detection using a pre-trained model from TensorFlow Hub

In [None]:
detector = hub.load("https://tfhub.dev/google/openimages_v4/ssd/mobilenet_v2/1").signatures['default']


In [None]:
img = cv2.imread('./data/Naxos_Taverna.jpg')[...,::-1]
img.dtype

In [None]:
converted_img  = tf.image.convert_image_dtype(img, tf.float32)[tf.newaxis, ...]
converted_img.dtype, converted_img.shape

In [None]:
result = detector(converted_img)

In [None]:
result = {key:value.numpy() for key,value in result.items()}

print(f"Found {len(result['detection_scores'])} objects.")

In [None]:
result.keys()

In [None]:
im_h, im_w, _ = img.shape

boxes = (np.array([im_h, im_w, im_h, im_w])*result["detection_boxes"]).astype('int')
scores = (100*result["detection_scores"]).round(1)
class_names = result["detection_class_entities"].astype('str')

for score, box, class_name in zip(scores, boxes, class_names):
    print(score, box, class_name)
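
The scaling step above relies on the boxes being normalized to the [0, 1] range in `[ymin, xmin, ymax, xmax]` order. Here is a sketch with one made-up box and a hypothetical 640x480 image:

```python
import numpy as np

im_h, im_w = 480, 640  # hypothetical image size in pixels

# One hypothetical detection in normalized [ymin, xmin, ymax, xmax] form.
norm_boxes = np.array([[0.25, 0.5, 0.75, 1.0]])

# y values scale with the image height, x values with the image width.
scale = np.array([im_h, im_w, im_h, im_w])
pixel_boxes = (norm_boxes * scale).astype(int)

ymin, xmin, ymax, xmax = pixel_boxes[0]
print((xmin, ymin), (xmax, ymax))  # corner points in (x, y) order
```

Note the change of order: the boxes come as (y, x) pairs, while drawing functions such as `cv2.rectangle()` take points as (x, y).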



In [None]:

# Loop through the detections and draw a box around each one
img_boxes = img.copy()
font = cv2.FONT_HERSHEY_SIMPLEX
for score, (ymin, xmin, ymax, xmax), label in zip(scores, boxes, class_names):
    if score > 20:  # keep only detections with a score above 20%
        cv2.rectangle(img_boxes, (xmin, ymin), (xmax, ymax), (0, 255, 0), 4)
        txt = label + ':' + str(score)
        cv2.putText(img_boxes, txt, (xmin, ymax - 10),
                    font, 1, (255, 0, 0), 2, cv2.LINE_AA)




In [None]:

plt.imshow(img_boxes)

#### Detecting objects in video

In [None]:
cap = cv2.VideoCapture(0)
im_w = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
im_h = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)

while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    if not ret:  # stop if no frame could be read
        break

    # Convert the frame to RGB
    img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Convert to float and add the batch dimension
    converted_img = tf.image.convert_image_dtype(img, tf.float32)[tf.newaxis, ...]

    result = detector(converted_img)
    result = {key: value.numpy() for key, value in result.items()}

    boxes = (np.array([im_h, im_w, im_h, im_w]) * result["detection_boxes"]).astype('int')
    class_names = result["detection_class_entities"].astype('str')
    scores = (100 * result["detection_scores"]).round(1)

    # Draw on a copy of the original BGR frame, so the colors below are in BGR order
    img_boxes = frame.copy()
    for score, (ymin, xmin, ymax, xmax), label in zip(scores, boxes, class_names):
        if score > 20:
            cv2.rectangle(img_boxes, (xmin, ymin), (xmax, ymax), (0, 255, 0), 4)
            font = cv2.FONT_HERSHEY_SIMPLEX
            txt = label + ':' + str(score)
            cv2.putText(img_boxes, txt, (xmin, ymax - 10),
                        font, 1, (255, 0, 0), 2, cv2.LINE_AA)

    # Display the resulting frame
    cv2.imshow('detections', img_boxes)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()

### Saving camera frames
A `cv2.VideoWriter` object can be used to write frames to a video file. The video's file name, the codec, the frame rate and the frame size must be specified as arguments to the constructor:

In [None]:
capture = cv2.VideoCapture(0)
# Get some properties of VideoCapture (frame width, frame height and frames per second (fps)):
frame_width = capture.get(cv2.CAP_PROP_FRAME_WIDTH)
frame_height = capture.get(cv2.CAP_PROP_FRAME_HEIGHT)
fps = capture.get(cv2.CAP_PROP_FPS)

# FourCC is a 4-byte code used to specify the video codec; the file extension and the codec should match
writer = cv2.VideoWriter('MyOutputVid.mp4',
                         cv2.VideoWriter_fourcc(*'MP4V'),
                         int(fps), (int(frame_width), int(frame_height)))

# videoWriter = cv2.VideoWriter(
# 'MyOutputVid1.avi', cv2.VideoWriter_fourcc('I','4','2','0'),
# int(fps), (int(frame_width), int(frame_height)))

frame_number = 0
while frame_number < 150:
    # Capture frame-by-frame from the camera
    ret, frame = capture.read()
    if not ret:  # stop if no frame could be read
        break
    writer.write(frame)

    frame_number += 1
 
# Release everything:
capture.release()
writer.release()
cv2.destroyAllWindows()
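
To make the FourCC idea concrete, the sketch below packs four characters into one 32-bit integer, least significant byte first. This is assumed to be the packing that `cv2.VideoWriter_fourcc()` performs; the helper name `fourcc` is hypothetical:

```python
def fourcc(c1, c2, c3, c4):
    """Pack four characters into a 32-bit integer, least significant byte first."""
    return ord(c1) | (ord(c2) << 8) | (ord(c3) << 16) | (ord(c4) << 24)

code = fourcc(*"MP4V")
print(hex(code))

# Unpacking the integer byte by byte gives the four characters back.
print(code.to_bytes(4, "little").decode("ascii"))
```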

### Reading a video file
`cv2.VideoCapture` also allows us to read a video file. To read a video file, the
path to the video file should be passed instead of the camera's device index:

In [None]:
capture = cv2.VideoCapture('MyOutputVid.mp4')

ret, frame = capture.read()

while ret:
    cv2.imshow('Frame from video file', frame)
    ret, frame = capture.read()    
       
    if cv2.waitKey(33) == ord('q'): #30 frames per 1000 ms ~= 33 ms per frame
        break
 
# Release everything:
capture.release()
cv2.destroyAllWindows()
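
Instead of hard-coding 33 ms, the delay can be derived from the frame rate of the video. This is a small sketch; the helper name `frame_delay_ms` is hypothetical, and the fallback is there because the reported frame rate may be 0 for some sources:

```python
def frame_delay_ms(fps, default_ms=33):
    """Milliseconds to wait between frames; falls back when fps is unknown."""
    if fps is None or fps <= 0:
        return default_ms
    return max(1, int(round(1000.0 / fps)))

print(frame_delay_ms(30))  # 33
print(frame_delay_ms(25))  # 40
print(frame_delay_ms(0))   # 33 (fallback)
```

The value returned for `capture.get(cv2.CAP_PROP_FPS)` could then be passed to `cv2.waitKey()` in the loop above.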

### Canny Edge Detection
See the [Canny edge detection tutorial](https://docs.opencv.org/4.x/da/d22/tutorial_py_canny.html).

In [None]:
img = cv2.imread('data/lena.jpg')
# Canny in one line! 100 and 200 are the lower and upper hysteresis thresholds
canny_edge = cv2.Canny(img, 100, 200)
plt.imshow(canny_edge, cmap='gray');

### Sample Scripts
Many sample programs are included in OpenCV's source code archive. To download the source code, go to the [releases page](https://opencv.org/releases/) and download **Sources**. It is a zip file (about 90 MB). Unzip it and find the sample scripts in the `opencv/samples/python` folder.
Try running a sample program, for example, `hist.py`.

Note that many of the sample scripts require command line arguments.
