# Session 12
- CUDA Implementation of Feature Detection and Description
- CUDA Feature Matching
- Feature Matching + Homography to Detect Object
- Object Tracker :
    - Tracker CSRT
    - Tracker KCF
___
### Maximizing Jetson Nano Performance


In [None]:
# sudo nvpmodel -m 0
# sudo jetson_clocks

In [None]:
import cv2
import numpy as np
import matplotlib.pyplot as plt

In [None]:
# check OpenCV Version

cv2.__version__

'4.5.3'

# 1. CUDA Implementation of Feature Detection, Description and Feature Matching
## 1.1 CUDA Harris Corner Detector
- The CUDA detector is created with `cv2.cuda.createHarrisCorner(srcType, blockSize, ksize, k)`,
- Where : 
    - `srcType` : Input source type. Only `CV_8UC1` and `CV_32FC1` are supported for now.
    - `blockSize` : Size of the neighbourhood considered for corner detection.
    - `ksize` : Aperture parameter of the Sobel derivative.
    - `k` : Harris detector free parameter in the response equation `R = det(M) - k * trace(M)^2`.
- Then use `.compute(src, dst)` to compute the corner response of an image stored in GPU memory,
- Where :
    - `src` : input image (GPU Mat).
    - `dst` : output corner response image (GPU Mat, type `CV_32FC1`).
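
To make `blockSize` and `k` more concrete, the following CPU-only NumPy sketch computes a Harris response on a synthetic image. It is an illustration under simplifying assumptions, not the CUDA implementation: `np.gradient` stands in for the Sobel derivative (so `ksize` is not modelled), and the helper names `box_sum` and `harris_response` are hypothetical.

```python
import numpy as np

def box_sum(a, block_size):
    """Sum `a` over a block_size x block_size neighbourhood (zero padded)."""
    pad = block_size // 2
    padded = np.pad(a, pad, mode="constant")
    out = np.zeros_like(a, dtype=np.float64)
    for dy in range(block_size):
        for dx in range(block_size):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def harris_response(gray, block_size=2, k=0.04):
    """Return R = det(M) - k * trace(M)^2 for every pixel."""
    gray = gray.astype(np.float64)
    iy, ix = np.gradient(gray)              # derivatives along rows (y) and columns (x)
    sxx = box_sum(ix * ix, block_size)      # entries of the structure tensor M
    syy = box_sum(iy * iy, block_size)
    sxy = box_sum(ix * iy, block_size)
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

# synthetic image: white square on a black background
img = np.zeros((20, 20), dtype=np.uint8)
img[5:15, 5:15] = 255

resp = harris_response(img, block_size=3)
corner_score = resp[5, 5]    # top-left corner of the square
edge_score = resp[10, 5]     # middle of the left edge
flat_score = resp[0, 0]      # uniform background
print(corner_score > 0, edge_score < 0, flat_score == 0)
```

The response is positive at a corner, negative along an edge and zero in a flat region, which is why thresholding `dst` against a fraction of its maximum isolates corners.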

## 1.2 CUDA Shi-Tomasi Corner Detector
- The CUDA detector is created with `cv2.cuda.createGoodFeaturesToTrackDetector(srcType, maxCorners, qualityLevel, minDistance, blockSize, useHarrisDetector, harrisK)`,
- Where : 
    - `srcType` : Input source type. Only `CV_8UC1` and `CV_32FC1` are supported for now.
    - `maxCorners` : Maximum number of corners to return. A value of zero or less means no limit.
    - `qualityLevel` : Minimal accepted quality of image corners, relative to the quality of the best corner.
    - `minDistance` : Minimum possible Euclidean distance between the returned corners.
    - `blockSize` : Size of the neighbourhood considered for corner detection.
    - `useHarrisDetector` : Parameter indicating whether to use a Harris detector.
    - `harrisK` : Harris detector free parameter in the equation.
- Then use `.detect(src, corners, mask)` to find corners in an image stored in GPU memory (in Python, `corners` is returned by the call),
- Where :
    - `src` : input image (GPU Mat).
    - `corners` : Output vector of detected corners (1-row matrix of type `CV_32FC2` holding the corner positions).
    - `mask` : Optional region of interest.
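
The interplay of `maxCorners`, `qualityLevel` and `minDistance` can be sketched in plain Python. The function below is a simplified illustration of the selection step only (reject weak corners, sort by score, greedily enforce the minimum distance); the name `select_corners` and the candidate scores are made up for the example, and the real detector computes the scores on the GPU.

```python
import math

def select_corners(candidates, max_corners, quality_level, min_distance):
    """candidates: list of (x, y, score). Returns the kept (x, y) points."""
    if not candidates:
        return []
    best = max(score for _, _, score in candidates)
    # 1. reject corners weaker than quality_level * best score
    strong = [c for c in candidates if c[2] >= quality_level * best]
    # 2. strongest corners first
    strong.sort(key=lambda c: c[2], reverse=True)
    kept = []
    for x, y, _ in strong:
        # 3. drop a corner if a stronger kept corner is closer than min_distance
        if all(math.hypot(x - kx, y - ky) >= min_distance for kx, ky in kept):
            kept.append((x, y))
        # 4. stop once max_corners is reached (zero or less means no limit)
        if max_corners > 0 and len(kept) >= max_corners:
            break
    return kept

# hypothetical candidates: (x, y, corner score)
candidates = [(10, 10, 0.90), (12, 11, 0.80), (50, 50, 0.50), (90, 20, 0.0005)]
picked = select_corners(candidates, max_corners=100, quality_level=0.001, min_distance=20)
print(picked)
```

Here `(12, 11)` is dropped because it sits too close to the stronger `(10, 10)`, and `(90, 20)` is dropped because its score falls below `qualityLevel` times the best score.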

## 1.3 CUDA SURF (Speeded-Up Robust Features)
- The CUDA detector is created with `cv2.cuda.SURF_CUDA_create(_hessianThreshold, _nOctaves, _nOctaveLayers, _extended, _keypointsRatio, _upright)`,
- Where : 
    - `_hessianThreshold` : Threshold for the Hessian keypoint detector used in SURF.
    - `_nOctaves` : Number of pyramid octaves the keypoint detector will use.
    - `_nOctaveLayers` : Number of octave layers within each octave.
    - `_extended` : Extended descriptor flag (true - use extended 128-element descriptors; false - use 64-element descriptors).
    - `_keypointsRatio` : Ratio that limits the maximum number of keypoints relative to the image area.
    - `_upright` : Up-right or rotated features flag (true - do not compute orientation of features; false - compute orientation).
- Then call `.detect(img, mask)` to detect **keypoints** in the image using SURF,
- Where : 
    - `img` : input image (GPU Mat).
    - `mask` : Optional region of interest.
- Optionally, we can call `.detectWithDescriptors(img, mask)` to detect **keypoints** and **compute descriptors** in one command.

## 1.4 CUDA FAST (Features from Accelerated Segment Test) 
- The CUDA detector is created with `cv2.cuda.FastFeatureDetector_create(threshold, nonmaxSuppression, type, max_npoints)`,
- Where : 
    - `threshold` : minimum intensity difference between the centre pixel and the pixels on the surrounding circle. Default 10.
    - `nonmaxSuppression` : whether to apply non-maximum suppression to the detected keypoints. Default `true`.
    - `type` : pixel neighbourhood used by the segment test, one of
        - `cv2.FAST_FEATURE_DETECTOR_TYPE_5_8`
        - `cv2.FAST_FEATURE_DETECTOR_TYPE_7_12`
        - `cv2.FAST_FEATURE_DETECTOR_TYPE_9_16` (default)
    - `max_npoints` : maximum number of detected keypoints. Default 5000.
- `THRESHOLD`, `NONMAX_SUPPRESSION` and `FAST_N` are parameter identifiers of the FAST class, not values for `type`.
- Then call `.detectAsync(img, mask)` to detect **keypoints** in the image using FAST. The result is a GPU Mat that `.convert()` turns into a list of `cv2.KeyPoint`,
- Where : 
    - `img` : input image (GPU Mat).
    - `mask` : Optional region of interest.
- FAST is a keypoint detector only and does not compute descriptors, so `.compute()` and `.detectAndCompute()` are not usable with it. To obtain descriptors, pass the image to a separate descriptor extractor (for example SURF or ORB).
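
The role of `threshold` is easier to see in the segment test itself. The pure-Python sketch below checks a single pixel with the 9-of-16 rule (the idea behind `TYPE_9_16`): a pixel is a corner when at least 9 contiguous pixels on a circle of 16 around it are all brighter, or all darker, than the centre by more than `threshold`. The function name `is_fast_corner` and the tiny test images are assumptions for illustration; the CUDA detector additionally applies non-maximum suppression and runs in parallel.

```python
# (dx, dy) offsets of the 16 pixels on a circle of radius 3, in clockwise order
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_corner(img, x, y, threshold=10, n=9):
    """Segment test for the pixel at column x, row y of a 2D grayscale image."""
    center = int(img[y][x])
    ring = [int(img[y + dy][x + dx]) for dx, dy in CIRCLE]
    for sign in (1, -1):                       # first "brighter", then "darker"
        flags = [sign * (v - center) > threshold for v in ring]
        run = 0
        for f in flags + flags:                # doubled list handles wrap-around
            run = run + 1 if f else 0
            if run >= n:
                return True
    return False

def make_image(is_bright):
    """Build a 7x7 test image with value 200 where is_bright(x, y) holds, else 0."""
    return [[200 if is_bright(x, y) else 0 for x in range(7)] for y in range(7)]

corner_img = make_image(lambda x, y: x >= 3 and y >= 3)   # bright quadrant -> corner at (3, 3)
edge_img = make_image(lambda x, y: x >= 3)                # bright half-plane -> straight edge

corner_found = is_fast_corner(corner_img, 3, 3)
edge_found = is_fast_corner(edge_img, 3, 3)
print(corner_found, edge_found)
```

Raising `threshold` makes the test stricter, so fewer keypoints are reported.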

In [None]:
# EXAMPLE CUDA Harris Corner Detection

# load image
img = cv2.imread('chessboard.png')
h, w, c = img.shape

# GPU memory initialization
img_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img_GpuMat.create((w, h), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8 bit image 3 channel
gray_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
gray_GpuMat.create((w, h), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8 bit image 1 channel
dst_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
dst_GpuMat.create((w, h), cv2.CV_32FC1) # cv2.CV_32FC1 -> 32 bit float image 1 channel

# create CUDA Harris Corner Detector object
HarrisCorner = cv2.cuda.createHarrisCorner(srcType=cv2.CV_8UC1, blockSize=2, ksize=3, k=0.04)

# upload to GPU memory
img_GpuMat.upload(img)

# convert to grayscale using CUDA
cv2.cuda.cvtColor(img_GpuMat, cv2.COLOR_BGR2GRAY, gray_GpuMat)

# apply CUDA Harris Corner Detector
HarrisCorner.compute(gray_GpuMat, dst_GpuMat)

# download to host memory
dst = dst_GpuMat.download() 

# -----------------------------------------------------------------------------------
# build mask image to store local maxima coordinate
mask = np.zeros((h,w), np.uint8)
mask[dst>0.05*dst.max()] = 255

# find contour from detected image corner 
contours, hierarchy = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

# draw a circle on each detected contour
# (bw, bh are used so the image size w, h is not overwritten)
for cnt in contours :
    x, y, bw, bh = cv2.boundingRect(cnt)
    cv2.circle(img, (x + bw//2, y + bh//2), int(0.02*img.shape[0]), (0, 255, 0), 2)

# show result
plt.figure(figsize=(14,7))
plt.subplot(1,2,1)
plt.imshow(img[:,:,::-1])
plt.title("Detected Corner")

plt.subplot(1,2,2)
plt.imshow(mask, cmap="gray")
plt.title("Thresholded Harris Response (corner mask)")

plt.show()

In [None]:
# EXAMPLE CUDA Shi-Tomasi Corner Detection

# load image
img = cv2.imread('chessboard.png')
h, w, c = img.shape

# GPU memory initialization
img_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img_GpuMat.create((w, h), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8 bit image 3 channel
gray_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
gray_GpuMat.create((w, h), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8 bit image 1 channel

# create CUDA Shi-Tomasi (Good Features to Track) detector object
GoodFeature = cv2.cuda.createGoodFeaturesToTrackDetector(srcType=cv2.CV_8UC1, maxCorners=100, qualityLevel=0.001, minDistance=20)

# upload to GPU memory
img_GpuMat.upload(img)

# convert to grayscale using CUDA
cv2.cuda.cvtColor(img_GpuMat, cv2.COLOR_BGR2GRAY, gray_GpuMat)

# apply CUDA Shi-Tomasi Corner Detector
corners_GpuMat = GoodFeature.detect(gray_GpuMat)

# download to host memory
corners = corners_GpuMat.download() 

# -----------------------------------------------------------------------------------
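# convert corner coordinates to integers (np.int0 is no longer available in recent NumPy)
corners = corners.astype(np.int32)

# draw a circle for every detected corner (corners has shape (1, n, 2))
for x, y in corners[0, :]:
    cv2.circle(img, (int(x), int(y)), int(0.02*img.shape[0]), (0, 0, 255), 2)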

# show result
plt.figure(figsize=(14,7))
plt.imshow(img[:,:,::-1])

In [None]:
# EXAMPLE CUDA SURF (Speeded-Up Robust Features)

# load image
img = cv2.imread('butterfly.jpg')
h, w, c = img.shape

# GPU memory initialization
img_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img_GpuMat.create((w, h), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8 bit image 3 channel
gray_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
gray_GpuMat.create((w, h), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8 bit image 1 channel

# create CUDA SURF (Speeded-Up Robust Features) object
SURF_Detector = cv2.cuda.SURF_CUDA_create(_hessianThreshold=40000, _upright=True)

# upload to GPU memory
img_GpuMat.upload(img)

# convert to grayscale using CUDA
cv2.cuda.cvtColor(img_GpuMat, cv2.COLOR_BGR2GRAY, gray_GpuMat)

# apply CUDA SURF (Speeded-Up Robust Features) to find keypoint and descriptor
kp_GpuMat, des_GpuMat = SURF_Detector.detectWithDescriptors(gray_GpuMat, None)

# download to host memory
# The keypoint GPU Mat needs `.downloadKeypoints()` from the SURF object, 
# because it has to be deserialized from a GPUMat into a std::vector<KeyPoint>
kp = SURF_Detector.downloadKeypoints(kp_GpuMat)
des = des_GpuMat.download()
# ----------------------------------------------------------------------------

# draw keypoints (detected by SURF)
img = cv2.drawKeypoints(img, kp, img, (255,0,0), 4)

print("descriptor shape ", des.shape)

# show result
plt.figure(figsize=(14,7))
plt.imshow(img[:,:,::-1])

plt.show()

In [None]:
# EXAMPLE CUDA FAST (Features from Accelerated Segment Test) 

# load image
img = cv2.imread('butterfly.jpg')
h, w, c = img.shape

# GPU memory initialization
img_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img_GpuMat.create((w, h), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8 bit image 3 channel
gray_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
gray_GpuMat.create((w, h), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8 bit image 1 channel

# create CUDA FAST (Features from Accelerated Segment Test) object
FAST_Detector = cv2.cuda.FastFeatureDetector_create(threshold=40)

# upload to GPU memory
img_GpuMat.upload(img)

# convert to grayscale using CUDA
cv2.cuda.cvtColor(img_GpuMat, cv2.COLOR_BGR2GRAY, gray_GpuMat)

# apply CUDA FAST (Features from Accelerated Segment Test) to find keypoints
# FAST is a detector only, so no descriptors are computed here
kp_GpuMat = FAST_Detector.detectAsync(gray_GpuMat)

# download to host memory
# The keypoint GPU Mat needs `.convert()` from the FAST object, 
# because it has to be deserialized from a GPUMat into a std::vector<KeyPoint>
kp = FAST_Detector.convert(kp_GpuMat)
# ----------------------------------------------------------------------------

# draw keypoints (detected by FAST)
img = cv2.drawKeypoints(img, kp, img, (255,0,0), 4)

print("number of keypoints ", len(kp))

# show result
plt.figure(figsize=(14,7))
plt.imshow(img[:,:,::-1])

plt.show()

____
# 2. CUDA Feature Matching
## 2.1 CUDA Brute Force Matcher (BFMatcher)
- Create the matcher with `cv2.cuda.DescriptorMatcher_createBFMatcher(normType)`,
- Where : 
    - `normType` : One of `cv2.NORM_L1`, `cv2.NORM_L2`, `cv2.NORM_HAMMING`. L1 and L2 norms are preferable choices for **SIFT** and **SURF** descriptors; `cv2.NORM_HAMMING` is meant for binary descriptors such as ORB.
- Then call `.knnMatch(queryDescriptors, trainDescriptors, k, mask)` on the matcher object,
- Where : 
    - `queryDescriptors` : Query set of descriptors.
    - `trainDescriptors` : Train set of descriptors. This set is not added to the train descriptors collection stored in the class object.
    - `k` : Number of best matches found per query descriptor, or fewer if a query descriptor has fewer than k possible matches in total.
    - `mask` : Mask specifying permissible matches between the input query and train matrices of descriptors.
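
What a brute-force k-NN match computes, and how the ratio test used in the examples of this section filters it, can be sketched on the CPU with NumPy. This is an illustration with made-up 2-element descriptors and hypothetical helper names (`knn_match_l2`, `ratio_test`); it returns plain index arrays rather than `cv2.DMatch` objects.

```python
import numpy as np

def knn_match_l2(query, train, k=2):
    """Brute-force k nearest neighbours under the L2 norm.

    Returns (indices, distances), each of shape (len(query), k).
    """
    diff = query[:, None, :] - train[None, :, :]     # all query/train pairs
    dist = np.sqrt((diff ** 2).sum(axis=2))          # pairwise L2 distances
    idx = np.argsort(dist, axis=1)[:, :k]            # k closest train descriptors
    return idx, np.take_along_axis(dist, idx, axis=1)

def ratio_test(idx, dist, ratio=0.75):
    """Keep (query_index, train_index) pairs whose best match clearly beats the second best."""
    return [(q, int(idx[q, 0])) for q in range(len(idx))
            if dist[q, 0] < ratio * dist[q, 1]]

# made-up descriptors, 2 elements each
train = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]], dtype=np.float32)
query = np.array([[0.5, 0.0],     # close to train[0] only -> unambiguous
                  [5.0, 0.0]],    # halfway between train[0] and train[1] -> ambiguous
                 dtype=np.float32)

idx, dist = knn_match_l2(query, train, k=2)
good = ratio_test(idx, dist)
print(good)
```

Only the first query descriptor survives the ratio test; the second has two equally good candidates and is rejected as ambiguous.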
## 2.2 No CUDA FLANN (Fast Library for Approximate Nearest Neighbors) Implementation Yet
- There is no CUDA FLANN matcher available in OpenCV so far (v4.5.3).
- If we still want to use the FLANN matcher, download the descriptors into host memory and call `.knnMatch(des1, des2, k)` on a `cv2.FlannBasedMatcher()` object.



In [None]:
# EXAMPLE CUDA SURF + CUDA Brute Force Matcher (BFMatcher)

# load image
img1 = cv2.imread('box.png')
img2 = cv2.imread('box_in_scene.png')
h1, w1, c1 = img1.shape
h2, w2, c2 = img2.shape

# GPU memory initialization
img1_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img1_GpuMat.create((w1, h1), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8 bit image 3 channel
img2_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img2_GpuMat.create((w2, h2), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8 bit image 3 channel
gray1_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
gray1_GpuMat.create((w1, h1), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8 bit image 1 channel
gray2_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
gray2_GpuMat.create((w2, h2), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8 bit image 1 channel

# create CUDA SURF (Speeded-Up Robust Features) object
SURF_Detector = cv2.cuda.SURF_CUDA_create(_hessianThreshold=700, _upright=True)

# create CUDA BF Matcher object
BFMatcher = cv2.cuda.DescriptorMatcher_createBFMatcher()

# upload to GPU memory
img1_GpuMat.upload(img1)
img2_GpuMat.upload(img2)

# convert to grayscale using CUDA
cv2.cuda.cvtColor(img1_GpuMat, cv2.COLOR_BGR2GRAY, gray1_GpuMat)
cv2.cuda.cvtColor(img2_GpuMat, cv2.COLOR_BGR2GRAY, gray2_GpuMat)

# apply CUDA SURF (Speeded-Up Robust Features) to find keypoint and descriptor
kp1_GpuMat, des1_GpuMat = SURF_Detector.detectWithDescriptors(gray1_GpuMat, None)
kp2_GpuMat, des2_GpuMat = SURF_Detector.detectWithDescriptors(gray2_GpuMat, None)

# apply BF Matcher via KNN (output is list data in host memory, doesn't need to do .download() from device memory)
matches = BFMatcher.knnMatch(des1_GpuMat, des2_GpuMat, k=2)

# download to host memory
# The keypoint GPU Mat needs `.downloadKeypoints()` from the SURF object, 
# because it has to be deserialized from a GPUMat into a std::vector<KeyPoint>
kp1 = SURF_Detector.downloadKeypoints(kp1_GpuMat)
kp2 = SURF_Detector.downloadKeypoints(kp2_GpuMat)
# ----------------------------------------------------------------------------

print("number of keypoint 1:", len(kp1))
print("number of keypoint 2:", len(kp2))

# Apply ratio test
good = []
for m, n in matches:
    if m.distance < 0.75*n.distance:
        good.append([m])

# cv.drawMatchesKnn expects list of lists as matches.
result = cv2.drawMatchesKnn(img1, kp1, img2, kp2, good, None, flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)

# show result (convert BGR to RGB for matplotlib)
plt.figure(figsize=(14,7))
plt.imshow(result[:,:,::-1])
plt.show()

In [None]:
# EXAMPLE CUDA SURF + FLANN Matcher (Fast Library for Approximate Nearest Neighbors)

# load image
img1 = cv2.imread('box.png')
img2 = cv2.imread('box_in_scene.png')
h1, w1, c1 = img1.shape
h2, w2, c2 = img2.shape

# GPU memory initialization
img1_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img1_GpuMat.create((w1, h1), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8 bit image 3 channel
img2_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img2_GpuMat.create((w2, h2), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8 bit image 3 channel
gray1_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
gray1_GpuMat.create((w1, h1), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8 bit image 1 channel
gray2_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
gray2_GpuMat.create((w2, h2), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8 bit image 1 channel

# create CUDA SURF (Speeded-Up Robust Features) object
SURF_Detector = cv2.cuda.SURF_CUDA_create(_hessianThreshold=700, _upright=True)

# upload to GPU memory
img1_GpuMat.upload(img1)
img2_GpuMat.upload(img2)

# convert to grayscale using CUDA
cv2.cuda.cvtColor(img1_GpuMat, cv2.COLOR_BGR2GRAY, gray1_GpuMat)
cv2.cuda.cvtColor(img2_GpuMat, cv2.COLOR_BGR2GRAY, gray2_GpuMat)

# apply CUDA SURF (Speeded-Up Robust Features) to find keypoint and descriptor
kp1_GpuMat, des1_GpuMat = SURF_Detector.detectWithDescriptors(gray1_GpuMat, None)
kp2_GpuMat, des2_GpuMat = SURF_Detector.detectWithDescriptors(gray2_GpuMat, None)

# download to host memory
# The keypoint GPU Mat needs `.downloadKeypoints()` from the SURF object, 
# because it has to be deserialized from a GPUMat into a std::vector<KeyPoint>
kp1 = SURF_Detector.downloadKeypoints(kp1_GpuMat)
kp2 = SURF_Detector.downloadKeypoints(kp2_GpuMat)
des1 =  des1_GpuMat.download()
des2 =  des2_GpuMat.download()

# ----------------------------------------------------------------------------

# create FLANN parameters & object
FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
search_params = dict(checks=50)   
FLANN = cv2.FlannBasedMatcher(index_params, search_params)

# apply FLANN via KNN on the descriptors already downloaded to host memory
matches = FLANN.knnMatch(des1, des2, k=2)

print("number of keypoint 1:", len(kp1))
print("number of keypoint 2:", len(kp2))

# Apply ratio test
good = []
for m, n in matches:
    if m.distance < 0.75*n.distance:
        good.append([m])

# cv.drawMatchesKnn expects list of lists as matches.
result = cv2.drawMatchesKnn(img1, kp1, img2, kp2, good, None, flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)

# show result (convert BGR to RGB for matplotlib)
plt.figure(figsize=(14,7))
plt.imshow(result[:,:,::-1])
plt.show()

____
# 3. Feature Matching + Homography to Detect Object
## 3.1 Understanding Homography 
- A **homography** (or **homography matrix**) is related to the geometric transformations covered in chapter 8.
- A homography matrix is similar to a **Perspective Transform**<br><br>
    ![](resources/homography_transformation.jpg)<br><br>
- In a Perspective Transform we use the method `cv2.getPerspectiveTransform()` to produce the perspective matrix from the original points and a *known good set of points*.
- The two sets of 4 points in a Perspective Transform must correspond to each other in the same order, and the points must be co-planar.
- With a homography transform, this part is much easier. 
- You can pass in 2 sets of >=4 points you suspect to be “good” and it will use algorithms like **RANSAC** to find the best perspective transform between them. <br><br><br><br>
- In OpenCV we can use the method `cv2.findHomography(srcPoints, dstPoints, method, ransacReprojThreshold, mask, maxIters, confidence)`,
- Where : 
    - `srcPoints` : Coordinates of the points in the original plane, a matrix of the type `CV_32FC2`.
    - `dstPoints` : Coordinates of the points in the target plane, a matrix of the type `CV_32FC2`.
    - `method` : Method used to compute a homography matrix. The following methods are possible:
        - `0` - a regular method using all the points, i.e., *the least squares method*.
        - `cv2.RANSAC` - RANSAC-based robust method.
        - `cv2.LMEDS` - Least-Median robust method.
        - `cv2.RHO` - PROSAC-based robust method.
    - `ransacReprojThreshold` : Maximum allowed reprojection error to treat a point pair as an inlier (used in the RANSAC and RHO methods only).
    - `mask` : Optional output mask set by a robust method (RANSAC or LMeDS). 
    - `maxIters` : The maximum number of RANSAC iterations.
    - `confidence` : Confidence level, between 0 and 1.
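
For intuition about what the plain least squares method (`method=0`) estimates, the following NumPy sketch solves for a homography with the direct linear transform (DLT) and applies it to points. It is a simplified illustration, not OpenCV's implementation (no point normalization, no robust estimation); `homography_dlt` and `apply_homography` are hypothetical helper names, and the point sets are the four book corners used in the homography example of this session.

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate the 3x3 matrix H with dst ~ H @ src from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # each pair contributes two linear equations in the 9 entries of H
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=np.float64))
    h = vt[-1].reshape(3, 3)          # singular vector of the smallest singular value
    return h / h[2, 2]                # normalize so that H[2, 2] == 1

def apply_homography(h, pts):
    """Map 2D points through H, including the division by the third coordinate."""
    pts = np.asarray(pts, dtype=np.float64)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return homog[:, :2] / homog[:, 2:3]

pts_src = [[141, 131], [480, 159], [493, 630], [64, 601]]
pts_dst = [[318, 256], [534, 372], [316, 670], [73, 473]]

H = homography_dlt(pts_src, pts_dst)
mapped = apply_homography(H, pts_src)
print(np.round(mapped, 2))
```

With exactly four pairs the mapped source points land on the destination points; with more pairs, some of them wrong, a robust method such as `cv2.RANSAC` is the safer choice.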

In [None]:
# EXAMPLE HOMOGRAPHY TRANSFORM

# Read source image.
im_src = cv2.imread('book2.jpg')
# Four corners of the book in source image
pts_src = np.array([[141, 131], [480, 159], [493, 630],[64, 601]])

# Read destination image.
im_dst = cv2.imread('book1.jpg')

# Four corners of the book in destination image.
pts_dst = np.array([[318, 256],[534, 372],[316, 670],[73, 473]])

# Calculate Homography
h, status = cv2.findHomography(pts_src, pts_dst)

# Warp source image to destination based on homography
im_out = cv2.warpPerspective(im_src, h, (im_dst.shape[1],im_dst.shape[0]))


# show result (convert BGR to RGB for matplotlib)
plt.figure(figsize=(14,7))
plt.subplot(1,3,1)
plt.imshow(im_src[:,:,::-1])
plt.title("Source Image")

plt.subplot(1,3,2)
plt.imshow(im_dst[:,:,::-1])
plt.title("Destination Image")

plt.subplot(1,3,3)
plt.imshow(im_out[:,:,::-1])
plt.title("Warped Source Image")

plt.show()


## 3.2 Feature Matcher and Homography to Find Object
- In the previous feature matching examples, we found the locations of some parts of an object in another, cluttered image. 
- This information is sufficient to locate the **object exactly** in the trainImage.
- For that, we can use the function `cv2.findHomography()`.
- If we pass the sets of matched points from both images, it finds the perspective transformation of that object. 
- Then we can use `cv2.perspectiveTransform()` to project the corners of the query image into the trainImage and outline the object. 
- It needs at least **four correct points** to find the transformation.
- Matching can produce wrong correspondences, which may affect the result. 
- To solve this problem, the algorithm uses **RANSAC** or **LMEDS** (least-median), selected by the `method` flag. 
- Good matches that agree with the estimated transformation are called inliers; the remaining ones are called outliers.
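
The inlier/outlier split can be illustrated with a small NumPy sketch: a point pair counts as an inlier when the reprojection error of the source point under the homography stays within a threshold (the role of `ransacReprojThreshold`). The homography, the point pairs and the helper names `project` and `inlier_mask` are invented for this example; RANSAC itself additionally searches for the model that maximizes the number of inliers.

```python
import numpy as np

def project(h, pts):
    """Map 2D points through the 3x3 homography h."""
    pts = np.asarray(pts, dtype=np.float64)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return homog[:, :2] / homog[:, 2:3]

def inlier_mask(h, src, dst, reproj_threshold=5.0):
    """True where the reprojection error of a pair is within the threshold."""
    err = np.linalg.norm(project(h, src) - np.asarray(dst, dtype=np.float64), axis=1)
    return err <= reproj_threshold

# hypothetical model: a pure translation by (+100, +50)
H = np.array([[1.0, 0.0, 100.0],
              [0.0, 1.0, 50.0],
              [0.0, 0.0, 1.0]])

src = [[0, 0], [10, 0], [10, 10], [0, 10], [5, 5]]
dst = [[100, 50], [110, 51], [110, 60], [100, 60], [300, 300]]  # last pair is a wrong match

mask = inlier_mask(H, src, dst)
print(mask.tolist())
```

The second pair is off by one pixel and still counts as an inlier, while the last pair is far from its predicted position and is flagged as an outlier, which is the information carried by the `mask` that `cv2.findHomography()` returns.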


In [None]:
# EXAMPLE DETECT OBJECT USING FLANN MATCHER + SIFT with HOMOGRAPHY

# define minimum of match found
MIN_MATCH_COUNT = 10

# load image
img1 = cv2.imread('box.png')          # queryImage
img2 = cv2.imread('box_in_scene.png') # trainImage

gray1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
gray2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

# Initiate SIFT detector
sift = cv2.SIFT_create()

# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(gray1,None)
kp2, des2 = sift.detectAndCompute(gray2,None)

print("number of keypoint 1:", len(kp1))
print("number of keypoint 2:", len(kp2))

# FLANN parameters
FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
search_params = dict(checks=50)   # or pass empty dictionary
flann = cv2.FlannBasedMatcher(index_params, search_params)
matches = flann.knnMatch(des1,des2,k=2)

# store all the good matches as per Lowe's ratio test.
good = []
for m,n in matches:
    if m.distance < 0.7*n.distance:
        good.append(m)

# do a HOMOGRAPHY transform for all good keypoint 
if len(good)>MIN_MATCH_COUNT:
    src_pts = np.float32([ kp1[m.queryIdx].pt for m in good ]).reshape(-1,1,2)
    dst_pts = np.float32([ kp2[m.trainIdx].pt for m in good ]).reshape(-1,1,2)

    # find Homography Matrix with method RANSAC
    M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
    matchesMask = mask.ravel().tolist()
    
    # apply perspective transform
    h,w,d = img1.shape
    pts = np.float32([[0,0], [0,h-1], [w-1, h-1], [w-1,0] ]).reshape(-1,1,2) #tl, bl, br, tr
    dst = cv2.perspectiveTransform(pts,M)
    
    img2 = cv2.polylines(img2, [np.int32(dst)], True, (0, 0, 255), 2)

else:
    print( "Not enough matches are found - %d/%d" % (len(good), MIN_MATCH_COUNT) )
    matchesMask = None


# draw matches keypoint and homography area
draw_params = dict(matchColor = (0,255,0), # draw matches in green color
                   singlePointColor = None,
                   matchesMask = matchesMask, # draw only inliers
                   flags = 2)
img3 = cv2.drawMatches(img1, kp1, img2, kp2, good, None, **draw_params)

plt.figure(figsize=(14,7))
plt.imshow(img3[:,:,::-1])
plt.show()

In [None]:
# EXAMPLE VIDEO STREAM - DETECT OBJECT USING FLANN MATCHER + SURF with HOMOGRAPHY

# define minimum of match found
MIN_MATCH_COUNT = 4

# Initiate SURF detector
surf = cv2.xfeatures2d.SURF_create(200)

# FLANN parameters & Object
FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
search_params = dict(checks=50)   # or pass empty dictionary
flann = cv2.FlannBasedMatcher(index_params, search_params)

# load image queryImage
img1 = cv2.imread('nemo_template.jpg')         
gray1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)

# find the keypoints and descriptors with SURF
kp1, des1 = surf.detectAndCompute(gray1, None)

cap = cv2.VideoCapture("nemo_video.mp4")

times = []
while cap.isOpened() :
    e1 = cv2.getTickCount()
    ret, img2 = cap.read() # trainImage
    if not ret : 
        break
    img2 = cv2.resize(img2, (0,0), fx=0.5, fy=0.5)
    gray2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

    # find the keypoints and descriptors with SURF
    kp2, des2 = surf.detectAndCompute(gray2, None)

    # apply FLANN Matcher
    matches = flann.knnMatch(des1, des2, k=2)

    # store all the good matches as per Lowe's ratio test.
    good = []
    for m,n in matches:
        if m.distance < 0.7*n.distance:
            good.append(m)

    # do a HOMOGRAPHY transform for all good keypoint 
    try :
        if len(good)>MIN_MATCH_COUNT:
            src_pts = np.float32([ kp1[m.queryIdx].pt for m in good ]).reshape(-1,1,2)
            dst_pts = np.float32([ kp2[m.trainIdx].pt for m in good ]).reshape(-1,1,2)

            # find Homography Matrix with method RANSAC
            M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
            matchesMask = mask.ravel().tolist()
            
            # apply perspective transform
            h,w,d = img1.shape
            pts = np.float32([[0,0], [0,h-1], [w-1, h-1], [w-1,0] ]).reshape(-1,1,2) #tl, bl, br, tr
            dst = cv2.perspectiveTransform(pts,M) # object box 
            
            # draw object box (red color)
            img2 = cv2.polylines(img2, [np.int32(dst)], True, (0, 0, 255), 2)
            #print( "Matches found - %d/%d" % (len(good), MIN_MATCH_COUNT) )

        else:
            #print( "Not enough matches are found - %d/%d" % (len(good), MIN_MATCH_COUNT) )
            matchesMask = None
    except Exception as e:
        print(e)

    # show frame
    cv2.imshow("detected object", img2)

    if (cv2.waitKey(1) == ord("q")):
        break
    e2 = cv2.getTickCount()
    times.append((e2 - e1)/ cv2.getTickFrequency())

avg_time = np.array(times).mean()
print("Average processing time : %.4fs" % avg_time)
print("Average FPS : %.2f" % (1/avg_time))
cap.release() # release the video source
cv2.destroyAllWindows()

____
## [NOTE] Create Template at Runtime
- In the implementation above, we need to prepare a template image as the query image of the object we want to detect in the video frames.
- In addition, OpenCV has a method that helps us select an ROI on a video frame and use it as the image template.
- We can use `cv2.selectROI(windowName, img, showCrosshair=False)`, which returns the selected box as `(x, y, w, h)`,
- Where : 
    - `windowName` : name of the window where the selection process will be shown.
    - `img` : image to select a ROI on.
    - `showCrosshair` : if true, the crosshair of the selection rectangle will be shown.
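
When cropping the template from the selected box, note that NumPy indexes rows (y) first and columns (x) second. The sketch below uses a blank array and a fixed tuple as stand-ins for a real frame and for the value returned by `cv2.selectROI`; `crop_roi` is a hypothetical helper name.

```python
import numpy as np

def crop_roi(img, bbox):
    """Crop `img` with an (x, y, w, h) box; rows are indexed by y, columns by x."""
    x, y, w, h = (int(v) for v in bbox)
    return img[y:y + h, x:x + w]

frame = np.zeros((120, 200, 3), dtype=np.uint8)   # stand-in for a 200x120 video frame
bbox = (30, 10, 50, 20)                           # stand-in for the (x, y, w, h) box from cv2.selectROI
template = crop_roi(frame, bbox)
print(template.shape)
```

The cropped template has `h` rows and `w` columns, so its shape is `(h, w, channels)`.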

In [None]:
# EXAMPLE (CUDA) VIDEO STREAM - DETECT OBJECT USING BF MATCHER + SURF with HOMOGRAPHY
# + GStreamer (NVIDIA Accelerated element)
# + OpenGL (Optimized Image Rendering)

# create window with OpenGL enable
window_name = "Detected Object"
cv2.namedWindow(window_name, flags=cv2.WINDOW_OPENGL)    # with OpenGL
#cv2.namedWindow(window_name)         # without OpenGL


# load GStreamer File Loader 
from gst_file import gst_file_loader

# load video file using GStreamer
cap = cv2.VideoCapture(gst_file_loader("nemo_video.mp4"), cv2.CAP_GSTREAMER)    
#cap = cv2.VideoCapture("nemo_video.mp4")

# define minimum of match found
MIN_MATCH_COUNT = 10

# load image queryImage
__, img1 = cap.read()

#img1 = cv2.imread("nemo_template.jpg")  
bbox = cv2.selectROI("Crop for Template", img1, False)
x, y, w, h = [int(v) for v in bbox]
img1 = img1[y:y+h, x:x+w] # NumPy indexing is [rows (y), columns (x)]

img1 = cv2.resize(img1, (0,0), fx=2, fy=2)
h1, w1, c1 = img1.shape 

__, img2 = cap.read()
h2, w2, c2 = img2.shape
h2, w2 = h2//2, w2//2

# GPU memory initialization
img1_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img1_GpuMat.create((w1, h1), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8 bit image 3 channel
img2_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
img2_GpuMat.create((w2, h2), cv2.CV_8UC3) # cv2.CV_8UC3 -> 8 bit image 3 channel
gray1_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
gray1_GpuMat.create((w1, h1), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8 bit image 1 channel
gray2_GpuMat = cv2.cuda_GpuMat() # Create GpuMat object 
gray2_GpuMat.create((w2, h2), cv2.CV_8UC1) # cv2.CV_8UC1 -> 8 bit image 1 channel


# create CUDA SURF (Speeded-Up Robust Features) object
SURF_Detector = cv2.cuda.SURF_CUDA_create(_hessianThreshold=200, _upright=True)

# create CUDA BF Matcher object
BFMatcher = cv2.cuda.DescriptorMatcher_createBFMatcher()


# upload to GPU memory
img1_GpuMat.upload(img1)

# convert to grayscale using CUDA
cv2.cuda.cvtColor(img1_GpuMat, cv2.COLOR_BGR2GRAY, gray1_GpuMat)

# apply CUDA SURF (Speeded-Up Robust Features) to find keypoint and descriptor
kp1_GpuMat, des1_GpuMat = SURF_Detector.detectWithDescriptors(gray1_GpuMat, None)

# download to host memory
kp1 = SURF_Detector.downloadKeypoints(kp1_GpuMat)

times =[]
while cap.isOpened() :
    e1 = cv2.getTickCount()
    ret, img2 = cap.read() # trainImage
    if not ret : 
        break
    img2 = cv2.resize(img2, (0,0), fx=0.5, fy=0.5)

    img2_GpuMat.upload(img2)
    
    cv2.cuda.cvtColor(img2_GpuMat, cv2.COLOR_BGR2GRAY, gray2_GpuMat)

    # apply CUDA SURF (Speeded-Up Robust Features) to find keypoint and descriptor
    kp2_GpuMat, des2_GpuMat = SURF_Detector.detectWithDescriptors(gray2_GpuMat, None)

    # apply BF Matcher via KNN (output is list data in host memory, doesn't need to do .download() from device memory)
    matches = BFMatcher.knnMatch(des1_GpuMat, des2_GpuMat, k=2)

    # download to host memory
    kp2 = SURF_Detector.downloadKeypoints(kp2_GpuMat)


    # store all the good matches as per Lowe's ratio test.
    good = []
    for m,n in matches:
        if m.distance < 0.7*n.distance:
            good.append(m)

    # do a HOMOGRAPHY transform for all good keypoint 
    try :
        if len(good)>MIN_MATCH_COUNT:
            src_pts = np.float32([ kp1[m.queryIdx].pt for m in good ]).reshape(-1,1,2)
            dst_pts = np.float32([ kp2[m.trainIdx].pt for m in good ]).reshape(-1,1,2)

            # find Homography Matrix with method RANSAC
            M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
            matchesMask = mask.ravel().tolist()
            
            # apply perspective transform
            h,w,d = img1.shape
            pts = np.float32([[0,0], [0,h-1], [w-1, h-1], [w-1,0] ]).reshape(-1,1,2) #tl, bl, br, tr
            dst = cv2.perspectiveTransform(pts,M) # object box 
            
            # draw object box (red color)
            img2 = cv2.polylines(img2, [np.int32(dst)], True, (0, 0, 255), 2)
            #print( "Matches found - %d/%d" % (len(good), MIN_MATCH_COUNT) )

        else:
            #print( "Not enough matches are found - %d/%d" % (len(good), MIN_MATCH_COUNT) )
            matchesMask = None
    except Exception as e:
        print(e)

    # show frame
    #cv2.imshow(window_name, img2)

    if (cv2.waitKey(1) == ord("q")):
        break

    e2 = cv2.getTickCount()
    times.append((e2 - e1)/ cv2.getTickFrequency())

avg_time = np.array(times).mean()
print("Average processing time : %.4fs" % avg_time)
print("Average FPS : %.2f" % (1/avg_time))
cap.release() # release the video source
cv2.destroyAllWindows()

___
# 4. Object Tracker


In [30]:
# load GStreamer File Loader 
from gst_file import gst_file_loader

# load video file using GStreamer 
cap = cv2.VideoCapture(gst_file_loader("nemo_video.mp4"), cv2.CAP_GSTREAMER)  

# Choose tracker
#tracker = cv2.TrackerCSRT_create()
tracker = cv2.TrackerKCF_create()

___, img = cap.read()
img = cv2.resize(img, (0,0), fx=0.5, fy=0.5)

# create initial bounding box
bbox = cv2.selectROI("Tracking",img,False)

tracker.init(img, bbox)

while cap.isOpened():
    e1 = cv2.getTickCount()
    ret, img = cap.read()
    
    if not ret : 
        break

    img = cv2.resize(img, (0,0), fx=0.5, fy=0.5)
    success, bbox = tracker.update(img)

    if success:
        # draw bounding box
        x, y, w, h = [int(v) for v in bbox]
        cv2.rectangle(img, (x, y), (x+w, y+h), (255,0,255), 3, 1)
        cv2.putText(img, "Tracking", (75, 75), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0,255,0), 1)
    else:
        cv2.putText(img,"Lost", (75, 75), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0,0,255), 1)
    
    e2 = cv2.getTickCount()
    fps = cv2.getTickFrequency()/(e2-e1)
    
    cv2.putText(img,"%d FPS " % fps, (75,50), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0,0,255), 1)
    cv2.imshow("Tracking",img)

    if cv2.waitKey(1) == ord('q'):
        break

cap.release() # release the video source
cv2.destroyAllWindows()

___
# Source
- [https://vivekseth.com/computer-vision-matrix-differences/](https://vivekseth.com/computer-vision-matrix-differences/)
- [https://docs.opencv.org/4.5.3/d1/de0/tutorial_py_feature_homography.html](https://docs.opencv.org/4.5.3/d1/de0/tutorial_py_feature_homography.html)
- [https://docs.opencv.org/master/d9/dab/tutorial_homography.html](https://docs.opencv.org/master/d9/dab/tutorial_homography.html)
- [https://learnopencv.com/homography-examples-using-opencv-python-c/](https://learnopencv.com/homography-examples-using-opencv-python-c/)
- [https://docs.opencv.org/4.5.3/d2/d0a/tutorial_introduction_to_tracker.html](https://docs.opencv.org/4.5.3/d2/d0a/tutorial_introduction_to_tracker.html)
- [https://docs.opencv.org/4.5.0/d9/df8/group__tracking.html](https://docs.opencv.org/4.5.0/d9/df8/group__tracking.html)