# Session 2
- OpenCV Image Datastructure
- Split & Merge Image Channel
- Image Cropping & Resizing
- Change Colorspace (RGB - GRAY - HSV)

# Maximizing Jetson Nano Performance
```
sudo nvpmodel -m 0
sudo jetson_clocks
```

In [None]:
import cv2
import numpy as np

# 1. OpenCV Image Datastructure
## 1.1 OpenCV Python Bindings
- In OpenCV, all algorithms are implemented in C++.
- These algorithms can also be called from other languages such as Python and Java.
- The layer that exposes them to Python is called the **Python-OpenCV bindings**.
## 1.2 OpenCV Matrix (cv::Mat)
- In the Python-OpenCV bindings, a **NumPy array** is used as the image data structure.
- When an OpenCV method or function is called, the array is converted to a `cv::Mat` and the rest of the task runs on the C++ side.
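
As a minimal sketch of what this means in practice, the snippet below builds a small synthetic image as a NumPy array instead of reading a file (the array contents are made up for illustration). It shows the `(height, width, channels)` shape, the `uint8` dtype, and the BGR channel order that OpenCV uses by default.

```python
import numpy as np

# Synthetic 4x6 "image": 4 rows (height), 6 columns (width), 3 channels (B, G, R).
img = np.zeros((4, 6, 3), dtype=np.uint8)
img[:, :, 2] = 255  # fill the last channel, which is red in OpenCV's BGR order

height, width, channels = img.shape
print(type(img))                  # <class 'numpy.ndarray'>
print(height, width, channels)    # rows, columns, channels
print(img.dtype)                  # 8-bit unsigned values per channel
print(img[0, 0])                  # pixel at row 0, column 0 in (B, G, R) order
```

An array loaded with `cv2.imread` has the same layout, so any NumPy operation can be applied to it before it is handed back to OpenCV.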
## 1.3 OpenCV Unified Matrix (cv::UMat)
- `cv::UMat` is a C++ class that is very similar to `cv::Mat`.
- In Python it is exposed as `cv2.UMat`.
- The `UMat` class tells OpenCV functions to process images with **OpenCL**-specific code, which uses an **OpenCL-enabled GPU** if one exists in the system (automatically falling back to the **CPU** otherwise).
    - **OpenCL™** (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of *central processing units (CPUs)*, *graphics processing units (GPUs)*, *digital signal processors (DSPs)*, *field-programmable gate arrays (FPGAs)* and other *processors or hardware accelerators*. 
    - Check whether the OpenCV build has OpenCL enabled using `cv2.getBuildInformation()`.
- Performance comparison between `cv::Mat` and `cv::UMat` (running OpenCL)<br>
    <img src="res/OpenCL.jpg" style="width: 450px;"></img>
## 1.4 OpenCV CUDA GPU Matrix (cv::cuda::GpuMat)
- `cv::cuda::GpuMat` is a C++ class in the **OpenCV GPU module** (`cv::cuda`), which is written using CUDA.
- The GPU module is designed as a host API extension.
- This design gives the user explicit control over how data is **moved between CPU and GPU memory**.
- `cv::cuda::GpuMat` is the primary container for data kept in **GPU memory**.
- Its interface is very similar to that of `cv::Mat`, its CPU counterpart.
- All GPU functions receive `GpuMat` as **input** and **output** arguments.
- In Python it is exposed as `cv2.cuda_GpuMat`.
- Performance comparison between `cv::Mat` and `cv::cuda::GpuMat` (Tesla C2050 vs Core i5-760 2.8 GHz, SSE, TBB)<br>
    <img src="res/cuda_gpumat.png" style="width: 450px;"></img>
## 1.5 OpenCV Matrix Python vs C++
- OpenCV Matrix comparison between Python and C++<br>
    <img src="res/datastructure.png" style="width: 450px;"></img>

In [None]:
# OpenCV Image Matrix (Numpy Array)

img = cv2.imread("lena.jpg")

print(type(img))

In [None]:
# Converting Image Matrix to UMat

img_Umat = cv2.UMat(img)

print(type(img_Umat))

In [None]:
# Umat Object Property

print(dir(img_Umat))

In [None]:
# Converting Umat to Image Matrix (Numpy Array)

img = img_Umat.get()

print(type(img))

In [None]:
# Check OpenCL Enable in OpenCV Build Information

print(cv2.getBuildInformation())

In [None]:
# Performance comparison Image Matrix (Numpy Array) vs UMat (OpenCL)

# Resize to a very large image -> rotate 90 clockwise -> grayscale -> Canny detection -> rotate 90 counterclockwise -> resize back to the original size
# Image Matrix Implementation

times = []
big_h, big_w = 3440, 3540
h, w = 344, 354
gray = None
for _ in range(100):
    e1 = cv2.getTickCount()

    img = cv2.imread("lena.jpg", cv2.IMREAD_COLOR)
    img = cv2.resize(img, (big_w, big_h))
    img = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray = cv2.Canny(gray, 0, 20)
    gray = cv2.rotate(gray, cv2.ROTATE_90_COUNTERCLOCKWISE)
    gray = cv2.resize(gray, (w, h))

    e2 = cv2.getTickCount()
    times.append((e2 - e1)/ cv2.getTickFrequency())
    
avg_time_mat = np.array(times).mean()
print("Average processing time CPU : %.2fs" % avg_time_mat)

In [None]:
# Performance comparison Image Matrix (Numpy Array) vs UMat (OpenCL)

# Resize to a very large image -> rotate 90 clockwise -> grayscale -> Canny detection -> rotate 90 counterclockwise -> resize back to the original size
# UMat (OpenCL) Implementation

times = []
big_h, big_w = 3440, 3540  # same target size as the CPU benchmark, so the timings are comparable
h, w = 344, 354

for _ in range(100):
    e1 = cv2.getTickCount()

    img = cv2.imread("lena.jpg", cv2.IMREAD_COLOR)
    imgUMat = cv2.UMat(img)
    imgUMat = cv2.resize(imgUMat, (big_w, big_h))
    imgUMat = cv2.rotate(imgUMat, cv2.ROTATE_90_CLOCKWISE)
    gray = cv2.cvtColor(imgUMat, cv2.COLOR_BGR2GRAY)
    gray = cv2.Canny(gray, 0, 20)
    gray = cv2.rotate(gray, cv2.ROTATE_90_COUNTERCLOCKWISE)
    gray = cv2.resize(gray, (w, h))

    e2 = cv2.getTickCount()
    times.append((e2 - e1)/ cv2.getTickFrequency())
    
avg_time_umat = np.array(times).mean()
print("Average processing time UMat (OpenCL) : %.2fs" % avg_time_umat)
print("Speedup over Mat : %.2fx" % (avg_time_mat/avg_time_umat))  # ratio, not seconds

In [None]:
cv2.imshow("window", gray)
cv2.waitKey(0)
cv2.destroyAllWindows()

## Why is UMat (OpenCL) not using the Jetson Nano GPU?
- Because NVIDIA Tegra (the processor used in the Jetson Nano) does not support OpenCL (even if it is installed in the OS).
- Related note: [https://forums.developer.nvidia.com/t/opencl-support/74071](https://forums.developer.nvidia.com/t/opencl-support/74071)
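
A small sketch of how the relevant lines could be picked out of the long build summary. The helper name `backend_status` and the sample text are hypothetical; the sample only imitates the `name: value` layout printed by `cv2.getBuildInformation()`, and the exact labels may differ between OpenCV versions.

```python
def backend_status(build_info, names=("OpenCL", "NVIDIA CUDA")):
    """Return {name: value} for the 'name: value' lines found in a build summary."""
    status = {}
    for line in build_info.splitlines():
        key, sep, value = line.partition(":")   # split at the first colon only
        if sep and key.strip() in names:
            status[key.strip()] = value.strip()
    return status

# Hypothetical excerpt for illustration; real text comes from cv2.getBuildInformation().
sample = """
  OpenCL:                        YES (no extra features)
  NVIDIA CUDA:                   YES (ver 10.2, CUFFT CUBLAS FAST_MATH)
"""
print(backend_status(sample))
```

On a real system the call would be `backend_status(cv2.getBuildInformation())`. Note that the summary only reports what was compiled in. Whether an OpenCL device is actually usable at run time can be checked with `cv2.ocl.haveOpenCL()`, and the number of CUDA devices with `cv2.cuda.getCudaEnabledDeviceCount()`.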

## OpenCV cuda::GpuMat Datastructure in Python

In [None]:
# Converting Image Matrix to cuda::GpuMat

img = cv2.imread("lena.jpg")

img_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object

img_GpuMat.upload(img) # Upload Image Matrix (Host Memory) to GpuMat (GPU Memory)

type(img_GpuMat)

In [None]:
# GpuMat Property

print(dir(img_GpuMat))

In [None]:
# check GpuMat size

print(img_GpuMat.size())

In [None]:
# Convert GpuMat (GPU Memory) to Image Matrix (Host Memory)

img = img_GpuMat.download()

print(type(img))

In [None]:
# Performance comparison Image Matrix (Numpy Array) vs GpuMat (CUDA)

# cuda::Resizing to super big image -> rotate -90 -> cuda::grayscaling -> do cuda::Canny detection -> rotate 90 -> cuda::resizing back to original size
# GpuMat (CUDA) Implementation

times = []
big_h, big_w = 3440, 3540
h, w = 344, 354

img_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object
img_GpuMat.create((w, h), cv2.CV_8UC3) # Initialize GPU (memory allocation & etc.), cv2.CV_8UC3 -> 8bit image 3 channel

img_big_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object for Big Matrix
img_big_GpuMat.create((big_w, big_h), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8bit image 3 channel

gray_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object for Grayscale Matrix
gray_GpuMat.create((big_w, big_h), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8bit image 1 channel

res_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object for Result Matrix
res_GpuMat.create((w, h), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8bit image 1 channel

Canny = cv2.cuda.createCannyEdgeDetector(0, 20) # Initialize Canny Detector in CUDA


for _ in range(100):
    e1 = cv2.getTickCount()

    img = cv2.imread("lena.jpg")
    img_GpuMat.upload(img)
    cv2.cuda.resize(img_GpuMat, (big_w, big_h), img_big_GpuMat) # Resize in CUDA context
    cv2.cuda.rotate(img_big_GpuMat, (big_w*2, big_h*2), -90, img_big_GpuMat) # rotate -90
    cv2.cuda.cvtColor(img_big_GpuMat, cv2.COLOR_BGR2GRAY, gray_GpuMat) # Grayscaling in CUDA context
    Canny.detect(gray_GpuMat, gray_GpuMat) # Call Canny Detector
    cv2.cuda.rotate(gray_GpuMat, (big_w*2, big_h*2), 90, gray_GpuMat) # rotate 90
    cv2.cuda.resize(gray_GpuMat, (w, h), res_GpuMat) # Resize back to the original size

    e2 = cv2.getTickCount()
    times.append((e2 - e1)/ cv2.getTickFrequency())
    
    
avg_time_gpumat = np.array(times).mean()
print("Average processing time GpuMat (CUDA) : %.2fs" % avg_time_gpumat)
print("Speedup over Mat : %.2fx" % (avg_time_mat/avg_time_gpumat))  # ratio, not seconds

In [None]:
cv2.imshow("window", res_GpuMat.download())
cv2.waitKey(0)
cv2.destroyAllWindows()

## Special Note:
- The performance improvement of the GPU (CUDA) over the CPU depends on:
    - How complex the task is. For a **simple task**, the **GPU implementation** may be **slower** than the CPU, because uploading data to and downloading it from GPU memory takes up a large share of the overall processing time.
    - Where the data stays. Keep processing in GPU memory and download (if necessary) only at the end of the pipeline.
    - The compute capability of the GPU device:
        - Jetson Nano GPU : 
            - NVIDIA Maxwell architecture NVIDIA CUDA® cores
            - Shared Memory 
            - Compute Capability : 5.3
        - Jetson Nano GPU relative performance comparison :<br>
        ![](res/jetson_nano_gpu.png)
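
The transfer-overhead point can be made concrete with a back-of-the-envelope model. Every timing below is hypothetical and chosen only to illustrate the trade-off; none of them was measured on a Jetson Nano.

```python
def gpu_total(upload_s, compute_s, download_s):
    """Total wall time of a GPU path: host-to-device copy, kernels, device-to-host copy."""
    return upload_s + compute_s + download_s

# Hypothetical timings in seconds (assumptions, not measurements).
transfer = 0.004                                  # one upload or one download
simple_cpu, simple_gpu_compute = 0.002, 0.0005    # e.g. a single color conversion
heavy_cpu, heavy_gpu_compute = 0.200, 0.020       # e.g. a long chain of filters

simple_speedup = simple_cpu / gpu_total(transfer, simple_gpu_compute, transfer)
heavy_speedup = heavy_cpu / gpu_total(transfer, heavy_gpu_compute, transfer)

print("simple task speedup: %.2fx" % simple_speedup)  # below 1: the GPU path is slower
print("heavy task speedup : %.2fx" % heavy_speedup)   # above 1: the GPU path is faster
```

The fixed transfer cost dominates the simple task, while the heavy task amortizes it. This is why chaining several CUDA operations between one upload and one download pays off.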

___
# 2. Split & Merge Image Channel
- Using Numpy Matrix slicing
- Using OpenCV method
- CUDA Implementation

## 2.1 Using Numpy Matrix Slicing 

In [None]:
e1 = cv2.getTickCount()

img = cv2.imread("lena.jpg")

# split channel
img_b = img[:,:,0]
img_g = img[:,:,1]
img_r = img[:,:,2]

# merging back
# merging using np.dstack : stack arrays in sequence depth wise (along third axis).
img_res = np.dstack([img_b, img_g, img_r])

e2 = cv2.getTickCount()
numpy_time = (e2 - e1)/ cv2.getTickFrequency()
print("numpy implementation execution time : %.6fs" % numpy_time)

- Method `np.dstack(tup)`
    - `tup` : a sequence of NumPy arrays, e.g. `[arr1, arr2, arr3]`, stacked along the third axis
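
The sketch below repeats the split and merge on a tiny synthetic array (made up for illustration) so that the shapes and values are easy to inspect. It also shows that slicing returns views of the original data, while `np.dstack` produces a new array.

```python
import numpy as np

# Synthetic 2x2 BGR image with a distinct value in every channel of every pixel.
img = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)

# Slicing returns views: no pixel data is copied at this point.
img_b, img_g, img_r = img[:, :, 0], img[:, :, 1], img[:, :, 2]
print(img_b.shape, np.shares_memory(img_b, img))

# np.dstack stacks the 2-D channels along a third axis and copies the data.
merged = np.dstack([img_b, img_g, img_r])
print(merged.shape, np.array_equal(merged, img))

# Changing the order of the sequence swaps channels, e.g. BGR -> RGB.
rgb = np.dstack([img_r, img_g, img_b])
print(rgb[0, 0])
```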

In [None]:
cv2.imshow("window", img_res)
cv2.waitKey(0)
cv2.destroyAllWindows()

## 2.2 Using OpenCV Method

In [None]:
e1 = cv2.getTickCount()

img = cv2.imread("lena.jpg")

# split using cv2.split
img_b, img_g, img_r = cv2.split(img)

# merging back
# merging using cv2.merge
img_res = cv2.merge([img_b, img_g, img_r])

e2 = cv2.getTickCount()
opencv_time = (e2 - e1)/ cv2.getTickFrequency()
print("OpenCV implementation execution time : %.6fs" % opencv_time)
print("Speedup improvement over Numpy implementation : %.4f" % (numpy_time/opencv_time))

## 2.3 Using GPU Implementation

In [None]:
# Initialization
h, w = 344, 354
img_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img_GpuMat.create((w, h), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8bit image 3 channel
ch1_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
ch1_GpuMat.create((w, h), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8bit image 1 channel
ch2_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
ch2_GpuMat.create((w, h), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8bit image 1 channel
ch3_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
ch3_GpuMat.create((w, h), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8bit image 1 channel
img_res_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img_res_GpuMat.create((w, h), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8bit image 3 channel



e1 = cv2.getTickCount()

img = cv2.imread("lena.jpg")

# upload to GpuMat (GPU Memory)
img_GpuMat.upload(img)

# split using cv2.cuda.split
cv2.cuda.split(img_GpuMat, [ch1_GpuMat, ch2_GpuMat, ch3_GpuMat])

# merging back
# merging using cv2.cuda.merge
cv2.cuda.merge([ch1_GpuMat, ch2_GpuMat, ch3_GpuMat], img_res_GpuMat)

e2 = cv2.getTickCount()
cuda_time = (e2 - e1)/ cv2.getTickFrequency()
print("OpenCV CUDA implementation execution time : %.6fs" % cuda_time)
print("Speedup improvement over Numpy implementation : %.4f" % (numpy_time/cuda_time))

___
# 3. Image Cropping & Resizing
- Cropping using Numpy
- Resize using OpenCV
- CUDA Implementation

## 3.1 Crop using Numpy Slicing
- We can use *NumPy slicing* to crop an image matrix,
- using the format `image_array[y_min:y_max, x_min:x_max]`,
- where `y_min`, `y_max`, `x_min` and `x_max` are the pixel coordinates that bound the cropped region.<br>
    <img src="res/crop_img.png" style="width: 400px;"></img><br><br>
    <img src="res/crop_image_il.png" style="width: 450px;"></img>
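
A minimal sketch of the slicing format on a synthetic array. The 344 x 354 size matches the image dimensions used elsewhere in this notebook, and the crop bounds are arbitrary. It also shows that a crop is a view of the original image, so `.copy()` is needed when the crop should be independent.

```python
import numpy as np

# Synthetic grayscale image: 344 rows (height) x 354 columns (width), all zeros.
img = np.zeros((344, 354), dtype=np.uint8)

y_min, y_max, x_min, x_max = 50, 294, 50, 304
crop = img[y_min:y_max, x_min:x_max]   # rows (y) first, then columns (x)
print(crop.shape)                      # (y_max - y_min, x_max - x_min)

# The crop is a view: writing to it also changes the original image.
crop[:] = 255
print(img[50, 50], img[0, 0])

# Use .copy() when the crop must not affect the source.
safe_crop = img[y_min:y_max, x_min:x_max].copy()
safe_crop[:] = 0
print(img[50, 50])                     # the original is unchanged by the copy
```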

In [None]:
img = cv2.imread("lena.jpg")

# Cropping using Numpy slicing, img[y_min:y_max, x_min:x_max]
img_crop = img[50:-50, 50:-50]

cv2.imshow("original", img)
cv2.imshow("Cropped", img_crop)
cv2.waitKey(0)
cv2.destroyAllWindows()

## 3.2 Resize using OpenCV Method
- To resize an image, OpenCV provides the following method:
    - `cv2.resize(img, (w_new, h_new))` : resizes `img` to `w_new` x `h_new`
    <img src="res/resize.jpg" style="width: 600px;"></img>
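
A common pitfall is the argument order: a NumPy `shape` is `(height, width)`, while `cv2.resize` expects `(width, height)`. The helper below is a hypothetical convenience function (not part of OpenCV) that computes a target size for a given width while keeping the aspect ratio.

```python
def dsize_for_width(shape, new_w):
    """Return (width, height) for cv2.resize that keeps the aspect ratio."""
    h, w = shape[:2]                        # NumPy shape is (height, width[, channels])
    new_h = max(1, round(h * new_w / w))    # scale the height by the same factor
    return (new_w, new_h)                   # cv2.resize expects (width, height)

print(dsize_for_width((344, 354, 3), 177))  # half of the original width
```

The result can be passed directly as the second argument, for example `cv2.resize(img, dsize_for_width(img.shape, 177))`.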

In [None]:
img = cv2.imread('lena.jpg')

# resize image (new_width, new_height)
img_resize = cv2.resize(img, (320, 240))  

cv2.imshow("original", img)
cv2.imshow("Resized", img_resize)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
# Resize image and keep aspect ratio

ratio = 0.5 # resize ratio

img = cv2.imread('lena.jpg')
h, w, c = img.shape

width = int(w* ratio)
height = int(h * ratio)

# resize image (new_width, new_height)
img_resize = cv2.resize(img, (width, height))  

cv2.imshow("original", img)
cv2.imshow("Resized", img_resize)
cv2.waitKey(0)
cv2.destroyAllWindows()

- `cv2.resize` can also scale by the factors `fx` and `fy` when the target size is given as `(0,0)`:

In [None]:

img = cv2.imread('lena.jpg')

# resize image 
img_resize = cv2.resize(img, (0,0), fx=0.5, fy=0.5)  

cv2.imshow("original", img)
cv2.imshow("Resized", img_resize)
cv2.waitKey(0)
cv2.destroyAllWindows()

- Resize with **interpolation**
- Interpolation parameter :
    - `cv2.INTER_NEAREST` : **nearest-neighbor interpolation**; the fastest method, but it produces blocky results.
    - `cv2.INTER_LINEAR` : **bilinear interpolation** (the default); a reasonable choice when **enlarging** is required.
    - `cv2.INTER_AREA` : resampling based on pixel area; usually the best choice when we need to **shrink an image**.
    - `cv2.INTER_CUBIC` : **bicubic interpolation**; **slower** than linear when **enlarging an image**, but gives **higher quality**.<br><br>
- Interpolation Method : <br>
    <img src="res/interpolation.png" style="width: 400px;"></img><br><br>
- Nearest Neighbor Interpolation : <br>
    <img src="res/Nearest_Neighbor.png" style="width: 400px;"></img><br><br>
- Linear Interpolation : <br>
    <img src="res/Bilinear_interpolation.png" style="width: 400px;"></img><br><br>
- Cubic Interpolation : <br>
    <img src="res/Bicubic_interpolation.png" style="width: 400px;"></img><br><br>
- Area Interpolation :
    - resamples using the pixel area relation. It is preferred for **shrinking**, and behaves similarly to nearest-neighbor interpolation when the image is enlarged.
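
To make the difference between the methods more tangible, here is a simplified NumPy sketch of two of them. It is an illustration of the ideas only, not OpenCV's actual implementation: `resize_nearest` and `shrink_area` are hypothetical helper names, and the area version handles only integer shrink factors.

```python
import numpy as np

def resize_nearest(img, new_w, new_h):
    """Nearest-neighbor resize: every output pixel copies one source pixel."""
    h, w = img.shape[:2]
    rows = np.arange(new_h) * h // new_h    # source row for each output row
    cols = np.arange(new_w) * w // new_w    # source column for each output column
    return img[rows[:, None], cols]

def shrink_area(img, factor):
    """Area-style shrink by an integer factor: average each factor x factor block."""
    h, w = img.shape[:2]
    h2, w2 = h // factor, w // factor
    blocks = img[:h2 * factor, :w2 * factor].astype(np.float64)
    blocks = blocks.reshape(h2, factor, w2, factor, *img.shape[2:])
    return blocks.mean(axis=(1, 3))

small = np.array([[1, 2], [3, 4]])
print(resize_nearest(small, 4, 4))   # each pixel is repeated in a 2x2 block

big = np.arange(16).reshape(4, 4)
print(shrink_area(big, 2))           # each output value is the mean of a 2x2 block
```

Nearest-neighbor only copies existing values, which explains the blocky look when enlarging. Averaging blocks uses every source pixel, which is why area-based resampling tends to look better when shrinking.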

In [None]:
# ---------- shrinking -------
img = cv2.imread('lena.jpg')

# shrink to half size (fx, fy < 1) with two interpolation methods
img_resize_INTER_LINEAR = cv2.resize(img, (0,0), fx=0.5, fy=0.5)
img_resize_INTER_NEAREST = cv2.resize(img, (0,0), fx=0.5, fy=0.5, interpolation=cv2.INTER_NEAREST)

# show image 
cv2.imshow('Original Image', img)
cv2.imshow('INTER_LINEAR Resized Image', img_resize_INTER_LINEAR)
cv2.imshow('INTER_NEAREST Resized Image', img_resize_INTER_NEAREST)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
# ---------- enlarging -------
img = cv2.imread('lena.jpg')

# enlarge 3.5x (fx, fy > 1) with each interpolation method
img_resize = cv2.resize(img, (0,0), fx=3.5, fy=3.5) 
img_resize_INTER_CUBIC = cv2.resize(img, (0,0), fx=3.5, fy=3.5, interpolation=cv2.INTER_CUBIC) 
img_resize_INTER_NEAREST = cv2.resize(img, (0,0), fx=3.5, fy=3.5, interpolation=cv2.INTER_NEAREST) 
img_resize_INTER_AREA = cv2.resize(img, (0,0), fx=3.5, fy=3.5, interpolation=cv2.INTER_AREA) 

# show image 
cv2.imshow('Original Image', img)
cv2.imshow('INTER_LINEAR Resized Image', img_resize)
cv2.imshow('INTER_CUBIC Resized Image', img_resize_INTER_CUBIC)
cv2.imshow('INTER_NEAREST Resized Image', img_resize_INTER_NEAREST)
cv2.imshow('INTER_AREA Resized Image', img_resize_INTER_AREA)
cv2.waitKey(0)
cv2.destroyAllWindows()

## 3.3 Crop & Resize Image using OpenCV CUDA Module

In [None]:
# Initialization
h, w = 344-100, 354-100
ratio = 0.5
img_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img_GpuMat.create((w, h), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8bit image 3 channel

img_resize_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img_resize_GpuMat.create((int(w*ratio), int(h*ratio)), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8bit image 3 channel



e1 = cv2.getTickCount()
img = cv2.imread("lena.jpg")

# Cropping using Numpy slicing, img[y_min:y_max, x_min:x_max]
img_crop = img[50:-50, 50:-50]

# upload to GpuMat (GPU Memory)
img_GpuMat.upload(img_crop)

# Resize image using cv2.cuda.resize()
cv2.cuda.resize(img_GpuMat, (0,0), img_resize_GpuMat, fx=ratio, fy=ratio)

e2 = cv2.getTickCount()
cuda_time = (e2 - e1)/ cv2.getTickFrequency()
print("OpenCV CUDA implementation execution time : %.6fs" % cuda_time)

In [None]:
cv2.imshow("original", img)
cv2.imshow("Resized", img_resize_GpuMat.download())
cv2.waitKey(0)
cv2.destroyAllWindows()

___
# 4. Image Color Conversion

![](res/gray_image.png)
- OpenCV provides the `cv2.cvtColor()` method for color conversion.
- The following conversion codes can be used:
    - convert BGR <--> RGB \
    `cv2.COLOR_BGR2RGB` \
    `cv2.COLOR_RGB2BGR`
    - convert BGR <--> HSV \
    `cv2.COLOR_BGR2HSV` \
    `cv2.COLOR_HSV2BGR`
    - convert BGR <--> BGRA \
    `cv2.COLOR_BGR2BGRA` \
    `cv2.COLOR_BGRA2BGR`
    - convert RGB <--> RGBA \
    `cv2.COLOR_RGB2RGBA` \
    `cv2.COLOR_RGBA2RGB`
    - convert BGR <--> GRAY \
    `cv2.COLOR_BGR2GRAY` \
    `cv2.COLOR_GRAY2BGR` <br><br>
- Convert BGR to Gray illustration <br>
    - OpenCV uses the **Rec. 601 luma** formula to compute the grayscale image:
    $\text{RGB[A] to Gray:} \quad Y \leftarrow 0.299 \cdot R + 0.587 \cdot G + 0.114 \cdot B$ <br><br>
    <img src="res/gray_image_2.png" style="width: 400px;"></img><br><br>
- Source :
    - [OpenCV cvtColor Doc](https://docs.opencv.org/2.4/modules/imgproc/doc/miscellaneous_transformations.html#void%20cvtColor%28InputArray%20src,%20OutputArray%20dst,%20int%20code,%20int%20dstCn%29)
    - [Luma Formula (Grayscale Transformation)](https://en.wikipedia.org/wiki/Luma_%28video%29#Rec._601_luma_versus_Rec._709_luma_coefficients)
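
The Rec. 601 formula above can be sketched directly in NumPy. This is an illustration of the weighted sum only; OpenCV's internal arithmetic may round differently, and `bgr_to_gray` is a hypothetical helper name. The test pixels are synthetic.

```python
import numpy as np

# Rec. 601 weights arranged in OpenCV's default channel order (B, G, R).
BGR_WEIGHTS = np.array([0.114, 0.587, 0.299])

def bgr_to_gray(img_bgr):
    """Weighted sum over the channel axis, rounded back to 8-bit values."""
    gray = img_bgr.astype(np.float64) @ BGR_WEIGHTS
    return np.rint(gray).astype(np.uint8)

# One-row test image: pure blue, pure green, pure red, white (all in BGR order).
pixels = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 255]]],
                  dtype=np.uint8)
print(bgr_to_gray(pixels))

# BGR <-> RGB is a reversal of the channel axis.
print(pixels[..., ::-1][0, 0])
```

Green contributes the most to the gray value and blue the least, which follows from the weights in the formula.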

## 4.1 OpenCV Implementation

In [None]:
img = cv2.imread("lena.jpg")

# convert BGR to Gray
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

cv2.imshow('Original', img)
cv2.imshow('Gray', img_gray)
cv2.waitKey(0)
cv2.destroyAllWindows()

## 4.2 OpenCV CUDA Module Implementation

In [None]:
# Initialization
h, w = 344, 354
img_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img_GpuMat.create((w, h), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8bit image 3 channel

gray_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
gray_GpuMat.create((w, h), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8bit image 1 channel

img = cv2.imread("lena.jpg")

img_GpuMat.upload(img)

# convert BGR to Gray using CUDA
cv2.cuda.cvtColor(img_GpuMat, cv2.COLOR_BGR2GRAY, gray_GpuMat)

cv2.imshow('Original', img)
cv2.imshow('Gray', gray_GpuMat.download())
cv2.waitKey(0)
cv2.destroyAllWindows()

# Source
- https://docs.opencv.org/master/da/d49/tutorial_py_bindings_basics.html
- https://en.wikipedia.org/wiki/OpenCL
- https://jeanvitor.com/opencv-opencl-umat-performance/
- https://opencv.org/opencl/
- https://opencv.org/platforms/cuda/
- https://www.techpowerup.com/gpu-specs/jetson-nano-gpu.c3643
