# Image Processing using Python

Python has many image processing libraries:

[scipy.ndimage](https://docs.scipy.org/doc/scipy/tutorial/ndimage.html)

[scikit-image](https://scikit-image.org/)

[Pillow](https://python-pillow.org/)

[OpenCV](https://opencv.org/)

OpenCV is a free and open-source computer vision library. It is written in optimized
C++ and provides Python bindings, so it can be used from your Python
programs. `opencv-python` is the Python package that contains pre-built OpenCV with its dependencies and the Python bindings.

### OpenCV-Python Installation
From the Anaconda prompt, run the command: `pip install opencv-python`

More details on installation [here](https://github.com/opencv/opencv-python)

Link to OpenCV-Python Tutorial [here](https://docs.opencv.org/4.x/d6/d00/tutorial_py_root.html)

#### Importing OpenCV

In [None]:
# import opencv
# OpenCV's Python module is called cv2 even though we are using
# OpenCV 4.x and not OpenCV 2.x. Historically, OpenCV had two Python
# modules: cv2 and cv. The latter wrapped a legacy version of OpenCV
# implemented in C. Nowadays, OpenCV has only the cv2 Python module,
# which wraps the current version of OpenCV implemented in C++.
import cv2


Load an image using `cv2.imread()`:

In [None]:
img = cv2.imread('data/logo.png')

In [None]:
type(img)

Get the dimensions of the image:

In [None]:
img.shape

In [None]:
img.size #total number of elements

In [None]:
img.dtype
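
To make the three attributes above concrete without needing an image file, here is a minimal sketch on a synthetic array. The array `fake_img` and its 480x640 size are assumptions chosen for illustration; a real `cv2.imread()` result for a color image has the same layout.

```python
import numpy as np

# Synthetic stand-in for a loaded color image: 480 rows, 640 columns, 3 channels.
fake_img = np.zeros((480, 640, 3), dtype=np.uint8)

height, width, channels = fake_img.shape  # shape is (rows, columns, channels)
total_elements = fake_img.size            # rows * columns * channels

print(height, width, channels)
print(total_elements)
print(fake_img.dtype)  # 8-bit unsigned integers, one per channel value
```

Note that `cv2.imread()` returns `None` instead of raising an error when the file cannot be read, so checking `type(img)` is a quick sanity test.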

Display the image using `cv2.imshow`:

In [None]:
cv2.imshow("Logo image", img)

# cv2.waitKey() is a keyboard binding function.
# Its argument is the number of milliseconds to wait for keyboard input. The
# default is 0, a special value meaning "wait indefinitely". The return value is either -1
# (meaning that no key has been pressed) or a key code, such as 27 for Esc.
# waitKey only captures input when an OpenCV window has focus.
cv2.waitKey(0)

cv2.destroyWindow('Logo image') 
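
The return-value convention described in the comments above can be wrapped in a small helper. This is only a sketch: `pressed` is a hypothetical name, and the `& 0xFF` mask is a common defensive habit for platforms where the returned code may carry extra high bits.

```python
def pressed(key_code, char):
    """Return True if a value returned by cv2.waitKey() matches the given character."""
    if key_code == -1:  # no key was pressed within the timeout
        return False
    # Keep only the low 8 bits before comparing with the character code.
    return (key_code & 0xFF) == ord(char)


print(pressed(-1, 'q'))        # no key pressed
print(pressed(ord('q'), 'q'))  # 'q' pressed
```

In a display loop this could be used as `if pressed(cv2.waitKey(1), 'q'): break`. Testing for -1 first matters because masking -1 with `0xFF` gives 255, not -1.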

Now display the image using `matplotlib`:

In [None]:
import matplotlib.pyplot as plt
plt.imshow(img);

What happened? <br>
For historical reasons, OpenCV stores color images in BGR channel order instead of the usual RGB. matplotlib expects RGB, so the red and blue channels appear swapped.

OpenCV implements hundreds of conversions between color models, all available through `cv2.cvtColor()`. We can convert the BGR image to RGB:

In [None]:
img_rgb=cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img_rgb);
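
For an 8-bit three-channel image, the BGR to RGB conversion amounts to reversing the channel axis. The sketch below shows this on a tiny synthetic array; the names `bgr` and `rgb` are illustrative only.

```python
import numpy as np

# A 1x2 image in OpenCV's BGR order: a pure-blue pixel and a pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps B and R; G stays in the middle.
rgb = bgr[:, :, ::-1]

print(rgb[0, 0])  # the blue pixel, now in RGB order
print(rgb[0, 1])  # the red pixel, now in RGB order
```

The slice is a view, so no pixel data is copied; `cv2.cvtColor()` returns a new array instead.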

In [None]:
img_gray=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
plt.imshow(img_gray); 
plt.colorbar();#default color map is 'viridis'

In [None]:
img_gray.shape

In [None]:
plt.imshow(img_gray, cmap='gray');
plt.colorbar();
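
The grayscale conversion is a weighted sum of the three channels. The sketch below uses the weights given in the OpenCV documentation (0.299 R + 0.587 G + 0.114 B); `bgr_to_gray` is a hypothetical helper, and its results may differ from `cv2.cvtColor()` by one gray level because OpenCV uses its own rounding.

```python
import numpy as np


def bgr_to_gray(bgr):
    """Approximate BGR -> gray conversion as a weighted channel sum."""
    b = bgr[..., 0].astype(np.float64)
    g = bgr[..., 1].astype(np.float64)
    r = bgr[..., 2].astype(np.float64)
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)


# One row with a pure blue, pure green, pure red and white pixel (BGR order).
pixels = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 255]]],
                  dtype=np.uint8)
gray = bgr_to_gray(pixels)
print(gray)
```

Green contributes most to the perceived brightness and blue least, which is why the result is a 2-D array whose values are not a plain average of the channels.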

In [None]:
cv2.imshow("gray image", img_gray)
cv2.waitKey(0)
cv2.destroyAllWindows() 

Let us create a random image:

In [None]:
import numpy as np
rim = np.random.randint(0, 256, (200, 300), dtype=np.uint8)  # uint8 so that cv2.imwrite() can save it
plt.imshow(rim);

We can save images using `cv2.imwrite()`:

In [None]:
cv2.imwrite('rand_im.jpg', rim)

OpenCV supports a number of formats, such as JPEG, PNG, BMP and TIFF. The format is chosen from the file name extension:

In [None]:
cv2.imwrite('rand_im.bmp', rim) 

### Image operations as NumPy array operations
Let us draw a black cross over the random image:

In [None]:
rim[75:125, :] = 0
rim[:, 100:200] = 0
plt.imshow(rim);

**Exercise**: Draw the letter H in red on a blue background in a 5x4 RGB image

In [None]:
#[] 
h= np.empty((5,4,3), dtype=np.uint8)
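
One possible solution to the exercise is sketched below. It uses a separate, hypothetical array `h_img` so that the cell above stays untouched, and it assumes RGB channel order as the exercise states.

```python
import numpy as np

RED = (255, 0, 0)   # RGB order
BLUE = (0, 0, 255)

h_img = np.empty((5, 4, 3), dtype=np.uint8)  # 5 rows, 4 columns, 3 channels
h_img[:, :] = BLUE   # fill the background
h_img[:, 0] = RED    # left vertical stroke
h_img[:, 3] = RED    # right vertical stroke
h_img[2, :] = RED    # crossbar in the middle row
```

Because the array is in RGB order, `plt.imshow(h_img)` shows the intended colors, whereas `cv2.imshow()` would interpret it as BGR and swap red and blue.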


What if `cv2.imshow()` is used to display `h`?

In [None]:
cv2.imshow("h", h) #image is too small
cv2.waitKey()
cv2.destroyAllWindows()

In [None]:
# Custom window
cv2.namedWindow('custom window', cv2.WINDOW_NORMAL ) # WINDOW_NORMAL enables you to resize the window
cv2.imshow('custom window', h)
cv2.waitKey()
cv2.destroyAllWindows()

Sometimes you will need to work with specific regions of an image. This can be done with NumPy slicing. Here, a 50x50 region at the top-left of logo.png is copied to the bottom-right corner:

In [None]:
im=cv2.imread('data/logo.png')
im[-50:, -50:, :] = im[:50, :50, :]
plt.imshow(im)


In the above cell, we are using `cv2.imread()` and `plt.imshow()`. Hence B and R channels are reversed in the displayed image. How can you reverse the R and B channels using array operations on `im` so that `plt.imshow` shows the correct colors? 

In [None]:
#Try this
b = im[:, :, 0]
r = im[:, :, 2]

im[:, :, 0] = r
im[:, :, 2] = b
plt.imshow(im);

That didn't work as expected! What went wrong?
In NumPy, a slice of an array is a view into the same data, not a copy (unlike MATLAB).
`b` is a view into `im[:, :, 0]`, so when `im[:, :, 0]` is modified, `b` changes too.
If we want a copy, we have to use the `copy` method:

In [None]:
b = im[:, :, 0].copy()
r = im[:, :, 2]
im[:, :, 0] = r
im[:, :, 2] = b
plt.imshow(im);
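
The view-versus-copy behavior can be checked directly with `np.shares_memory()`. The sketch below uses a small synthetic array (`im_demo` is an assumed name) and also shows an alternative one-line swap: indexing with a list produces a copy on the right-hand side, so no explicit `copy()` is needed.

```python
import numpy as np

im_demo = np.zeros((2, 2, 3), dtype=np.uint8)
im_demo[..., 0] = 10    # stand-in for the B channel
im_demo[..., 2] = 200   # stand-in for the R channel

b_view = im_demo[:, :, 0]         # basic slice: a view on the same memory
b_copy = im_demo[:, :, 0].copy()  # independent copy

print(np.shares_memory(b_view, im_demo))  # True
print(np.shares_memory(b_copy, im_demo))  # False

# Swap channels 0 and 2 in one statement. The list index on the right creates
# a temporary copy, so the assignment does not overwrite its own source.
im_demo[:, :, [0, 2]] = im_demo[:, :, [2, 0]]

print(b_view[0, 0], b_copy[0, 0])  # the view follows the change, the copy does not
```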

### Arithmetic with Images
OpenCV performs *saturation arithmetic* on images (results are clipped to the valid range), whereas NumPy performs *modular arithmetic* (results wrap around):

In [None]:
x = np.array([250], dtype=np.uint8)
y = np.array([10], dtype=np.uint8)
x + y #Numpy addition

In [None]:
cv2.add(x, y) #OpenCV addition-which is what we normally need with images
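
The difference between the two kinds of arithmetic can be reproduced with NumPy alone. In the sketch below, `saturating_add` is a hypothetical helper that mimics the clipping behavior for `uint8` inputs; it is not how OpenCV implements `cv2.add()` internally.

```python
import numpy as np


def saturating_add(a, b):
    """Add two uint8 arrays and clip the result to [0, 255] instead of wrapping."""
    total = a.astype(np.int16) + b.astype(np.int16)  # widen to avoid overflow
    return np.clip(total, 0, 255).astype(np.uint8)


x = np.array([250], dtype=np.uint8)
y = np.array([10], dtype=np.uint8)

wrapped = x + y                   # modular: 260 % 256 = 4
saturated = saturating_add(x, y)  # saturated: min(260, 255) = 255

print(wrapped, saturated)
```

With images, wrap-around turns very bright pixels dark, which is rarely what is wanted.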

In [None]:
img = cv2.imread('data/lena.jpg')

# Convert BGR image to RGB:
img_RGB = img[:, :, ::-1]
plt.imshow(img_RGB);

Add 60 to the image:

In [None]:
M = np.full(img.shape, 60, dtype=np.uint8)
img_add = cv2.add(img, M)
plt.imshow(img_add[:, :, ::-1]);

**Exercise**: Subtract 100 from all channels in all pixels in `img` using `cv2.subtract()` and display using `plt.imshow`:

### Image Blending
Image blending is also image addition, but different weights are given to the images. In OpenCV this is done with `cv2.addWeighted()`.

This function is also commonly used to combine the outputs of the Sobel operator. The Sobel operator is used for edge detection: it creates an image that emphasizes
edges. The Sobel operator uses two 3 × 3 kernels, which are convolved with the original
image in order to calculate approximations of the derivatives, capturing both horizontal
and vertical changes.
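
The two Sobel kernels and an equal-weight combination of their absolute responses can be sketched in plain NumPy. The helper `correlate_valid`, the 6x6 step image and the 0.5/0.5 weights are assumptions for illustration. The helper slides the kernel without flipping it, which for these kernels differs from a true convolution only in the sign of the response.

```python
import numpy as np


def correlate_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out


sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)  # responds to horizontal changes
sobel_y = sobel_x.T                                 # responds to vertical changes

# Test image with a vertical edge: left half dark, right half bright.
step = np.zeros((6, 6), dtype=np.float64)
step[:, 3:] = 1.0

gx = correlate_valid(step, sobel_x)
gy = correlate_valid(step, sobel_y)

# Weighted sum of the absolute responses, in the spirit of image blending.
edges = 0.5 * np.abs(gx) + 0.5 * np.abs(gy)
print(edges[0])
```

Only the x-kernel responds to this image, because the intensity changes along the horizontal direction only.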

In [None]:
img1 = cv2.imread('data/pic1.jpg')
img2 = cv2.imread('data/pic2.jpg')

#alpha = 0.3, beta = 0.7 = 1 - 0.3; make sure the weights add up to 1 if you want to conserve brightness
blended = cv2.addWeighted(img1, 0.3, img2, 0.7, 0)

plt.figure(figsize=(10,30))
plt.subplot(1,3,1)
plt.imshow(img1[:, :, ::-1])
plt.subplot(1,3,2)
plt.imshow(img2[:, :, ::-1])
plt.subplot(1,3,3)
plt.imshow(blended[:, :, ::-1]);
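
The blend computed above follows the formula `dst = src1*alpha + src2*beta + gamma`, with the result saturated to the valid range. A NumPy-only sketch of that formula is shown below; `add_weighted` is a hypothetical stand-in for 8-bit images, not OpenCV's implementation, and its rounding may differ slightly.

```python
import numpy as np


def add_weighted(src1, alpha, src2, beta, gamma=0.0):
    """Weighted sum of two uint8 images, rounded and clipped to [0, 255]."""
    mixed = src1.astype(np.float64) * alpha + src2.astype(np.float64) * beta + gamma
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)


dark = np.full((2, 2, 3), 100, dtype=np.uint8)
bright = np.full((2, 2, 3), 200, dtype=np.uint8)

blend = add_weighted(dark, 0.3, bright, 0.7)       # 0.3*100 + 0.7*200 = 170
overflow = add_weighted(bright, 1.0, bright, 1.0)  # 400 is clipped to 255
print(blend[0, 0], overflow[0, 0])
```

Both inputs must have the same shape, which is also a requirement of `cv2.addWeighted()`.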

### Image Filtering
The `cv2.GaussianBlur()`  blurs an image by using a Gaussian kernel:

In [None]:
baboon = plt.imread('data/baboon.jpg')

#GaussianBlur(	src, ksize, sigmaX, sigmaY,...	)
#when sigmaX=0, it is computed from kernel size
babblur = cv2.GaussianBlur(baboon,(29,29),0)

plt.subplot(121)
plt.imshow(baboon)
plt.subplot(122)
plt.imshow(babblur);
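
When `sigmaX` is 0, the OpenCV documentation for `getGaussianKernel` gives the rule `sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8`. The sketch below applies that rule and builds a normalized 1-D Gaussian kernel. The helper `gaussian_kernel_1d` is hypothetical, and OpenCV may use precomputed kernels for very small sizes, so treat this as an illustration of the idea.

```python
import numpy as np


def gaussian_kernel_1d(ksize, sigma=0.0):
    """Return (kernel, sigma) for a normalized 1-D Gaussian of odd length ksize."""
    if sigma <= 0:
        # Rule documented for cv2.getGaussianKernel when sigma is not given.
        sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8
    offsets = np.arange(ksize) - (ksize - 1) / 2  # distances from the center
    weights = np.exp(-(offsets ** 2) / (2 * sigma ** 2))
    return weights / weights.sum(), sigma


kernel_1d, sigma = gaussian_kernel_1d(29)
print(round(sigma, 3))  # about 4.7 for a 29-tap kernel
```

A 2-D Gaussian blur is separable, so the same 1-D kernel can be applied along rows and then along columns.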

The `cv2.filter2D()` function can be used to apply an arbitrary kernel to an
image. Strictly speaking, it computes a correlation rather than a convolution; the two coincide for symmetric kernels such as the one used here:

In [None]:
#custom kernel; simple box-car in this case
kernel = np.ones((15,15))
kernel /= kernel.size #normalize kernel so as not to scale image intensity

babblur2 = cv2.filter2D(baboon,-1,kernel) #the argument -1 is for ddepth=-1; the output image will have the same depth as the source-uint8
# each channel is processed independently

plt.subplot(121)
plt.imshow(baboon)
plt.subplot(122)
plt.imshow(babblur2);
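
The effect of normalizing the kernel can be checked on a constant patch. The sketch below is NumPy-only and uses an assumed patch value of 100; it shows why dividing by `kernel.size` keeps the intensity unchanged.

```python
import numpy as np

box = np.ones((15, 15))
box /= box.size  # weights now sum to 1

patch = np.full((15, 15), 100.0)  # a flat region of intensity 100

response = float(np.sum(patch * box))                     # filter output at the center
unnormalized = float(np.sum(patch * np.ones((15, 15))))   # same sum without normalizing

print(response, unnormalized)
```

Without normalization the response is 225 times larger, which would saturate at 255 in a `uint8` output.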

Other smoothing filters such as median blur and bilateral filter are also available. See the [tutorial](https://docs.opencv.org/4.x/d4/d13/tutorial_py_filtering.html)

### Capturing camera frames
The `cv2.VideoCapture()` object allows you to capture videos from different sources, such as cameras, video files and image sequences. When capturing frames from a camera connected to your computer, you have to give the camera index as the argument: 

In [None]:
capture = cv2.VideoCapture(0) #calling the constructor, 0 is the camera index
# Get some properties of VideoCapture using get() method
frame_width = capture.get(cv2.CAP_PROP_FRAME_WIDTH)
frame_height = capture.get(cv2.CAP_PROP_FRAME_HEIGHT)
fps = capture.get(cv2.CAP_PROP_FPS)

# Print these values:
print(f"CAP_PROP_FRAME_WIDTH: {frame_width}")
print(f"CAP_PROP_FRAME_HEIGHT: {frame_height}")
print(f"CAP_PROP_FPS: {fps}")

ret, frame = capture.read()
while ret:
    cv2.imshow('Input frame from the camera', frame)
    # Capture frame-by-frame from the camera
    ret, frame = capture.read()

    if cv2.waitKey(1) == ord('q'):
        break
 
 
# Release everything:
capture.release()
cv2.destroyAllWindows()

### Face Detection
OpenCV provides an implementation of the Haar cascade based face detector first proposed by Viola and Jones (2001). More details can be found [here](https://opencv24-python-tutorials.readthedocs.io/en/latest/py_tutorials/py_objdetect/py_face_detection/py_face_detection.html).
The `cv2.CascadeClassifier()` constructor is used to load a classifier from a file. The cascade classifier files should be available under the `cv2/data` folder in your installation.

In [None]:
# Face detection on still images

# cv2.data.haarcascades is the path to the cv2/data folder of the installed package
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
img = cv2.imread('data/woodcutters.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
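# scaleFactor: how much the image is shrunk at each scale; minNeighbors: detections needed to keep a candidate
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.08, minNeighbors=5)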
for (x, y, w, h) in faces:
    img = cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)


cv2.namedWindow('Woodcutters Detected!')
cv2.imshow('Woodcutters Detected!', img)
#cv2.imwrite('./woodcutters_detected.jpg', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
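
The second argument of `detectMultiScale` is the scale factor: the image is repeatedly shrunk by this factor and scanned with a fixed-size window. The sketch below only counts how many scales that implies. The helper `pyramid_scales`, the 480-pixel image side and the 24-pixel window are assumptions for illustration, not values read from OpenCV.

```python
def pyramid_scales(image_side, window_side, scale_factor):
    """List the shrink factors at which the window still fits into the image."""
    scales = []
    scale = 1.0
    while image_side / scale >= window_side:
        scales.append(scale)
        scale *= scale_factor
    return scales


fine = pyramid_scales(480, 24, 1.08)   # small steps: more scales, slower, more thorough
coarse = pyramid_scales(480, 24, 1.3)  # large steps: fewer scales, faster, may miss faces

print(len(fine), len(coarse))
```

This is the trade-off behind using 1.08 for a still image and 1.3 for a live video feed, where speed matters more.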

With `cv2.CascadeClassifier`, it makes little difference whether we perform face
detection on a still image or a video feed. The latter is just a sequential version of the
former. Face detection on a video is simply face detection applied to each frame. Naturally,
with more advanced techniques, it would be possible to track a detected face continuously
across multiple frames and determine that the face is the same one in each frame. 

However, it is good to know that a basic sequential approach also works.

In [None]:
camera = cv2.VideoCapture(0)
while (cv2.waitKey(1) == -1):
    success, frame = camera.read()
    if success:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = face_cascade.detectMultiScale(gray, 1.3, 5, minSize=(120, 120))
        for (x, y, w, h) in faces:
            cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)
        cv2.imshow('Face Detection', frame)

camera.release()
cv2.destroyAllWindows()

### Two Deep Learning Object Detection Examples: 

#### 1. Object detection using pre-trained Tensorflow model
Deep Learning has changed Computer Vision forever!

Adapted from this [article](https://towardsdatascience.com/object-detection-with-tensorflow-model-and-opencv-d839f3e42849)

We are using the [EfficientDet-Lite2](https://tfhub.dev/tensorflow/efficientdet/lite2/detection/1) model trained on the COCO-2017 dataset for object detection. Many other models are available on [TensorFlow Hub](https://tfhub.dev/).

In [None]:
import cv2
import numpy
import tensorflow as tf
import tensorflow_hub as hub
import pandas as pd

In [None]:
detector = hub.load("https://tfhub.dev/tensorflow/efficientdet/lite2/detection/1")
#detector = tf.saved_model.load('C:/Users/ece/src/webcam_tf')
labels = pd.read_csv('data/labels.csv',sep=';',index_col='ID')
labels = labels['OBJECT (2017 REL.)']

In [None]:
cap = cv2.VideoCapture(0)

while(True):
    #Capture frame-by-frame
    ret, frame = cap.read()
    if not ret:  # stop if the camera did not return a frame
        break
    
    #Resize to respect the input_shape
    inp = cv2.resize(frame, (640, 480 )) # model accepts variable size images

    #Convert img to RGB
    rgb = cv2.cvtColor(inp, cv2.COLOR_BGR2RGB)

  
    rgb_tensor = tf.convert_to_tensor(rgb, dtype=tf.uint8)# model needs uint8 tensor

    #Add dims to rgb_tensor
    rgb_tensor = tf.expand_dims(rgb_tensor , 0)
    
    boxes, scores, classes, num_detections = detector(rgb_tensor) #other models may output in different formats
    
    # boxes: a tf.float32 tensor of shape [N, 4] containing bounding box coordinates in the following order: [ymin, xmin, ymax, xmax].
    # scores: a tf.float32 tensor of shape [N] containing detection scores.
    # classes: a tf.int tensor of shape [N] containing detection class index from the label file.
    # num_detections: a tf.int tensor with only one value, the number of detections [N].
    
    pred_labels = classes.numpy().astype('int')[0]  # index by 0 to remove batch dimension
    
    pred_labels = [labels[i] for i in pred_labels]
    pred_boxes = boxes.numpy()[0].astype('int')
    pred_scores = scores.numpy()[0]
   
    # cv2.imshow() expects BGR, so draw on the resized BGR frame rather than on rgb
    img_boxes = inp
    # Loop over the detections and draw a box around each confident one
    for score, (ymin,xmin,ymax,xmax), label in zip(pred_scores, pred_boxes, pred_labels):
        if score < 0.5:
            continue
            
        score_txt = f'{100 * score:.0f}'  # score as a percentage
        cv2.rectangle(img_boxes,(xmin, ymax),(xmax, ymin),(0,255,0),1)
        font = cv2.FONT_HERSHEY_SIMPLEX
        cv2.putText(img_boxes,label,(xmin, ymax-10), font, 0.5, (255,0,0), 1, cv2.LINE_AA)
        cv2.putText(img_boxes,score_txt,(xmax, ymax-10), font, 0.5, (255,0,0), 1, cv2.LINE_AA)


    #Display the resulting frame
    cv2.imshow('detections', img_boxes)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything done, release the capture
cap.release()
cv2.destroyAllWindows()
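
Two steps of the loop above can be tried without a camera or a model: adding the batch dimension, and filtering detections by score. The sketch below uses made-up detector outputs (`boxes` and `scores` are hypothetical values laid out as described in the comments of the cell above).

```python
import numpy as np

# cv2.resize(frame, (640, 480)) takes (width, height), so the array is 480 x 640.
frame_rgb = np.zeros((480, 640, 3), dtype=np.uint8)
batch = np.expand_dims(frame_rgb, 0)  # the model expects a leading batch dimension
print(batch.shape)

# Hypothetical outputs for one image: boxes are [ymin, xmin, ymax, xmax].
boxes = np.array([[[50, 60, 200, 220],
                   [10, 10, 40, 40],
                   [100, 300, 400, 600]]], dtype=np.float32)
scores = np.array([[0.91, 0.32, 0.55]], dtype=np.float32)

keep = scores[0] >= 0.5                     # boolean mask of confident detections
kept_boxes = boxes[0][keep].astype(int)

# Convert each box into the (x, y) corner points that cv2.rectangle() takes.
corners = [((int(xmin), int(ymin)), (int(xmax), int(ymax)))
           for ymin, xmin, ymax, xmax in kept_boxes]
print(corners)
```

Note the order: the model reports y before x, while OpenCV drawing functions take points as (x, y).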

#### 2. Object Detection using YOLO

Adapted from [here](https://opencv-tutorial.readthedocs.io/en/latest/yolo/yolo.html)

YOLO [homepage](https://pjreddie.com/darknet/yolo/)

In [None]:
 
# Coco Names
classesFile = "data/coco.names"
classNames = []
with open(classesFile, 'r') as f:
    classNames = f.read().rstrip('\n').split('\n')
print(classNames)



In [None]:
## Model Files
modelConfiguration = "C:/Users/ece/src/yolo_obj_det/yolov3.cfg"
modelWeights = "C:/Users/ece/src/yolo_obj_det/yolov3.weights"
net = cv2.dnn.readNetFromDarknet(modelConfiguration, modelWeights)
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
 

In [None]:
cap = cv2.VideoCapture(0)
whT = 320
confThreshold =0.5
nmsThreshold= 0.2

while cv2.waitKey(1) != ord('q'):
    success, img = cap.read()
    if not success:  # stop if the camera did not return a frame
        break
    blob = cv2.dnn.blobFromImage(img, 1 / 255, (whT, whT), [0, 0, 0], 1, crop=False)#pre-processing
    net.setInput(blob)
    layersNames = net.getLayerNames()
    outputNames = [(layersNames[i-1]) for i in net.getUnconnectedOutLayers()]
    outputs = net.forward(outputNames)
    
    hT, wT, cT = img.shape
    bbox = []
    classIds = []
    confs = []
    for output in outputs: #outputs is a tuple with len(outputs)=3 since outputs from 3 layers are taken; 
        #output = output from a layer=numpy array; no. of bounding boxes x 85
        for det in output: #for each row of output=for each bounding box=85 element array of bounding box properties
            # elements 0 to 3 are bounding box coordinates, det[4] is conf. that an obj is present, and det[5:] are the scores for the 80 classes
            scores = det[5:]
            classId = np.argmax(scores)
            confidence = scores[classId]
            if confidence > confThreshold:
                w,h = int(det[2]*wT) , int(det[3]*hT)
                x,y = int((det[0]*wT)-w/2) , int((det[1]*hT)-h/2)
                bbox.append([x,y,w,h])
                classIds.append(classId)
                confs.append(float(confidence))
 
    indices = cv2.dnn.NMSBoxes(bbox, confs, confThreshold, nmsThreshold)  # indices of the boxes kept after non-maximum suppression (NMS)
 
    for i in indices:
        #i = i[0]
        box = bbox[i]
        x, y, w, h = box[0], box[1], box[2], box[3]
        # print(x,y,w,h)
        cv2.rectangle(img, (x, y), (x+w,y+h), (255, 0 , 255), 2)
        cv2.putText(img,f'{classNames[classIds[i]].upper()} {int(confs[i]*100)}%',
                  (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 0, 255), 2)
 
    cv2.imshow('Image', img)

cap.release()
cv2.destroyAllWindows()  
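
The box decoding and the non-maximum suppression step used above can be sketched in pure Python. The helpers `yolo_to_xywh`, `iou` and `greedy_nms` are hypothetical names, and `cv2.dnn.NMSBoxes` may differ in details, so treat this as an illustration of the idea rather than a replacement.

```python
def yolo_to_xywh(det, frame_w, frame_h):
    """Convert normalized (center x, center y, w, h) to pixel (left, top, w, h)."""
    w = int(det[2] * frame_w)
    h = int(det[3] * frame_h)
    x = int(det[0] * frame_w - w / 2)
    y = int(det[1] * frame_h - h / 2)
    return [x, y, w, h]


def iou(a, b):
    """Intersection over union of two (left, top, w, h) boxes."""
    ix = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, score_threshold, iou_threshold):
    """Keep the best-scoring boxes and drop any box that overlaps a kept one too much."""
    order = sorted((i for i, s in enumerate(scores) if s > score_threshold),
                   key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


demo_boxes = [[10, 10, 100, 100], [12, 12, 100, 100], [200, 200, 50, 50]]
demo_scores = [0.9, 0.8, 0.7]
print(greedy_nms(demo_boxes, demo_scores, 0.5, 0.2))       # the near-duplicate box is dropped
print(yolo_to_xywh([0.5, 0.5, 0.25, 0.5], 640, 480))
```

A low overlap threshold such as 0.2 suppresses aggressively, which suits scenes where objects rarely overlap.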

OpenCV can also load DNN models from many other frameworks, such as TensorFlow and PyTorch (typically via ONNX export).

In [None]:
help(cv2.dnn.readNetFromTensorflow)

### Saving camera frames:
A `cv2.VideoWriter` object can be used to write frames to a video file. The video's file name, codec, frame rate and frame size must be specified as arguments to the constructor:

In [None]:
capture = cv2.VideoCapture(0)
# Get some properties of VideoCapture (frame width, frame height and frames per second (fps)):
frame_width = capture.get(cv2.CAP_PROP_FRAME_WIDTH)
frame_height = capture.get(cv2.CAP_PROP_FRAME_HEIGHT)
fps = capture.get(cv2.CAP_PROP_FPS)

writer = cv2.VideoWriter('MyOutputVid.mp4',
    cv2.VideoWriter_fourcc(*'mp4v'),  # FourCC is a 4-byte code that specifies the video codec; the file extension and codec should match
    int(fps), (int(frame_width), int(frame_height)))

# videoWriter = cv2.VideoWriter(
# 'MyOutputVid1.avi', cv2.VideoWriter_fourcc('I','4','2','0'),
# int(fps), (int(frame_width), int(frame_height)))

frame_number=0
while frame_number < 150:
    # Capture frame-by-frame from the camera
    ret, frame = capture.read()
    if not ret:  # stop if no frame was returned
        break
    writer.write(frame)

    frame_number +=1
 
# Release everything:
capture.release()
writer.release()
cv2.destroyAllWindows()
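
A FourCC code is simply four characters packed into one 32-bit integer, lowest byte first. The sketch below reproduces that packing in pure Python; `fourcc_code` is a hypothetical helper shown for illustration, and in practice `cv2.VideoWriter_fourcc()` does this for you.

```python
def fourcc_code(tag):
    """Pack a four-character codec tag into a 32-bit integer (first char in the low byte)."""
    if len(tag) != 4:
        raise ValueError("a FourCC tag must have exactly four characters")
    c1, c2, c3, c4 = (ord(ch) for ch in tag)
    return c1 | (c2 << 8) | (c3 << 16) | (c4 << 24)


code = fourcc_code('mp4v')
print(hex(code))
```

The tag is case-sensitive: `'MP4V'` and `'mp4v'` pack to different integers, which is why some backends warn about one spelling and accept the other.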

### Reading a video file
`cv2.VideoCapture` also allows us to read a video file. To read a video file, the
path to the video file should be passed instead of the camera's device index:

In [None]:
capture = cv2.VideoCapture('MyOutputVid.mp4')

ret, frame = capture.read()

while ret:
    cv2.imshow('Frame from video file', frame)
    ret, frame = capture.read()    
       
    if cv2.waitKey(33) == ord('q'): #30 frames per 1000 ms ~= 33 ms per frame
        break
 
# Release everything:
capture.release()
cv2.destroyAllWindows()
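
The 33 ms delay above is hard-coded for 30 frames per second. A minimal sketch for deriving it from a frame rate is shown below. `frame_delay_ms` is a hypothetical helper, and the fallback covers sources that report a frame rate of 0.

```python
def frame_delay_ms(fps, fallback_ms=33):
    """Return the delay between frames in milliseconds for a given frame rate."""
    if fps is None or fps <= 0:
        return fallback_ms  # frame rate unknown: fall back to roughly 30 fps
    # Never return 0, because cv2.waitKey(0) would wait indefinitely.
    return max(1, int(round(1000 / fps)))


print(frame_delay_ms(30))  # 33
print(frame_delay_ms(25))  # 40
print(frame_delay_ms(0))   # 33 (fallback)
```

The frame rate itself could be read with `capture.get(cv2.CAP_PROP_FPS)` and the result passed to `cv2.waitKey()`.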

### Canny Edge Detection
See the [Canny edge detection tutorial](https://docs.opencv.org/4.x/da/d22/tutorial_py_canny.html).

In [None]:
img = cv2.imread('data/lena.jpg')
canny_edge= cv2.Canny(img, 100, 200)
plt.imshow(canny_edge, cmap='gray');
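
The two numbers passed to `cv2.Canny()` are the lower and upper hysteresis thresholds. The sketch below illustrates only that thresholding stage on made-up gradient magnitudes; `classify_gradients` is a hypothetical helper and the rest of the Canny pipeline (smoothing, gradient computation, non-maximum suppression, edge linking) is omitted.

```python
import numpy as np


def classify_gradients(magnitude, low, high):
    """Split gradient magnitudes into sure edges and candidates between the thresholds."""
    strong = magnitude > high           # sure edges
    weak = (magnitude >= low) & ~strong  # kept only if connected to a sure edge
    return strong, weak


magnitudes = np.array([50.0, 150.0, 250.0])
strong, weak = classify_gradients(magnitudes, 100, 200)
print(strong, weak)
```

Values below the lower threshold are discarded outright, so raising it removes faint edges, while lowering the upper threshold lets more edges start.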

### Sample Scripts
Many sample programs are included in OpenCV's source code archive. To download the source code, go to the [releases page](https://opencv.org/releases/) and download **Sources**. It is a zip file (90 MB). Unzip it and find the sample scripts in the `opencv/samples/python` folder.
Try running a sample program, for example, `hist.py`.

Note that many of the sample scripts require command line arguments.


Many interesting OpenCV projects can be found on [this](https://www.youtube.com/channel/UCYUjYU5FveRAscQ8V21w81A) YouTube channel.