# Project - Realtime Restructure

## Introduction:

This project combines two input frames of the same scene into a single, wider landscape view.<br>
The program takes two video input devices as its sources, matches features in the parts of the scene that both frames share, and then uses the matched feature points to rotate one input frame so that the two frames can be recombined.

<br>

More details are available at:
- GitHub: https://github.com/D18130495/Video-Restructure
- Yuhong He Blog: https://imageprocessing.yuhong.me/
- Yushun Zeng Blog: https://imageprocessing.huaruoyumu.com/
- Xinyu Zhang Blog: https://imageprocessing633246182.wordpress.com/

## User guide:

Set up the equipment:
1. Connect two video input devices to the machine that runs the program.

<br>

Run the program:
1. Make sure that the views of the two video input devices overlap, so that both show part of the same scene.
2. Run the program.
3. Move the two video input devices further apart or closer together until the output shows the angle you want.
4. Click on the "result" window to recombine the two input videos.

## Project workflow and algorithms used:

1. Open two video capture devices to get the input videos:<br>
Use cv2.VideoCapture to open the two webcam inputs and a while loop to read and show the frames continuously.

<br>

2. Detect the keypoints of the two video inputs with the OpenCV SIFT algorithm and calculate the feature descriptors:<br>
Use cv2.xfeatures2d.SIFT_create() (cv2.SIFT_create() in OpenCV 4.4 and later) to create the SIFT detector, and call .detectAndCompute() to calculate the keypoints and their descriptors.
Then iterate over the detected keypoints and store the (x, y) coordinates of each one in an array.

<br>

3. Match the detected keypoints:<br>
Use cv2.BFMatcher() to create the brute-force matcher, then call its knnMatch() method to find the k best matches for each keypoint descriptor.
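
As a rough illustration of what a brute-force k-nearest-neighbour match does, the sketch below compares every descriptor of one frame with every descriptor of the other using Euclidean distance and keeps the k closest candidates. It is plain Python on made-up 2-D descriptors, not the OpenCV implementation; `knn_match` and the sample data are hypothetical names chosen for this example (real SIFT descriptors have 128 dimensions).

```python
import math

def knn_match(query_descriptors, train_descriptors, k=2):
    """For each query descriptor, return the k nearest train descriptors
    as a list of (distance, train_index) tuples, closest first."""
    matches = []
    for query in query_descriptors:
        # brute force: measure the distance to every train descriptor
        distances = [
            (math.dist(query, train), train_index)
            for train_index, train in enumerate(train_descriptors)
        ]
        distances.sort()
        matches.append(distances[:k])
    return matches

# made-up descriptors standing in for the two frames
frame_one_descriptors = [(0.0, 0.0), (10.0, 10.0)]
frame_two_descriptors = [(1.0, 0.0), (0.0, 3.0), (10.0, 11.0)]

matches = knn_match(frame_one_descriptors, frame_two_descriptors, k=2)
for query_index, candidates in enumerate(matches):
    print(query_index, candidates)
```

Each entry holds the best and the second-best candidate, which is the shape of data the filtering step needs.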

<br>

4. Filter out the best-matched keypoints:<br>
Use cv2.drawMatchesKnn() to draw the unfiltered matches between the two input frames. Then keep only the matches that pass a ratio test (the distance of the best match must be less than 0.75 times the distance of the second-best match), and draw the filtered matches again.
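
The filtering rule can be sketched in isolation. The example below applies the 0.75 ratio to plain tuples of distances; the numbers are made up and stand in for the `m.distance` and `n.distance` values returned by knnMatch with k = 2. `ratio_test` is a hypothetical helper name, not an OpenCV function.

```python
def ratio_test(distance_pairs, ratio=0.75):
    """Return the indices of the pairs whose best match is clearly
    better than the second-best match."""
    kept = []
    for index, pair in enumerate(distance_pairs):
        # a descriptor can have fewer than two candidates; skip it
        if len(pair) < 2:
            continue
        best, second_best = pair
        if best < ratio * second_best:
            kept.append(index)
    return kept

# (best distance, second-best distance) for four hypothetical keypoints
example_pairs = [(10.0, 40.0), (30.0, 35.0), (5.0,), (12.0, 20.0)]
kept_indices = ratio_test(example_pairs)
print(kept_indices)  # [0, 3]
```

The second pair is rejected because its two candidates are almost equally close, which usually indicates an ambiguous match.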

<br>

5. Use the filtered keypoints and a homography matrix to rotate an input frame:<br>
Use cv2.findHomography() with the filtered keypoints to estimate the homography matrix H, apply H to one of the input frames with cv2.warpPerspective(), and then place the other frame on top of the rotated frame so that the two frames are combined.
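
To make the role of H more concrete, the sketch below maps a single point through a 3x3 homography in homogeneous coordinates, which is the per-pixel relationship the warp is based on. It is plain Python; `apply_homography` and both example matrices are hypothetical and chosen only for illustration, they are not values produced by the project.

```python
def apply_homography(H, point):
    """Map an (x, y) point through the 3x3 matrix H using homogeneous coordinates."""
    x, y = point
    x_h = H[0][0] * x + H[0][1] * y + H[0][2]
    y_h = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # divide by w to return from homogeneous to image coordinates
    return (x_h / w, y_h / w)

# example 1: a pure shift of 100 pixels to the right
H_shift = [[1.0, 0.0, 100.0],
           [0.0, 1.0, 0.0],
           [0.0, 0.0, 1.0]]
print(apply_homography(H_shift, (50.0, 20.0)))  # (150.0, 20.0)

# example 2: a small perspective term in the last row changes the scale with x
H_perspective = [[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0],
                 [0.001, 0.0, 1.0]]
print(apply_homography(H_perspective, (100.0, 50.0)))
```

The last row of H is what distinguishes a homography from a plain rotation, scaling, or shift: it lets points be compressed or stretched depending on where they lie in the frame.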

<br>

6. Stabilize the output and add a mouse click event so that the program can recombine the two frames again:<br>
Use a variable to record whether the result has already been calculated. If it has, the program does not calculate it again. A mouse click callback function clears this variable so that the program recalculates the result and recombines the two frames.
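
The caching idea in step 6 can be sketched without any camera. In the example below, `CachedTransform` is a hypothetical helper (the project itself uses a single global variable), and the lambda is a stand-in for the expensive keypoint and homography calculation.

```python
class CachedTransform:
    """Compute a value once, reuse it on every later frame, and recompute after reset()."""

    def __init__(self, compute):
        self._compute = compute
        self._value = None
        self.compute_calls = 0

    def get(self, frame_one, frame_two):
        # only recalculate when nothing is cached
        if self._value is None:
            self.compute_calls += 1
            self._value = self._compute(frame_one, frame_two)
        return self._value

    def reset(self):
        # called from the mouse click callback to force a recalculation
        self._value = None

# stand-in for the real calculation: just records which frames were used
cache = CachedTransform(lambda a, b: "H from frames %d and %d" % (a, b))

results = []
for frame_index in range(5):
    if frame_index == 3:
        cache.reset()  # simulated mouse click on the "result" window
    results.append(cache.get(frame_index, frame_index))

print(cache.compute_calls)  # 2
print(results)
```

Over five simulated frames the calculation runs only twice: once at the start and once after the simulated click.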

## Citation of peer-reviewed research:

Peer-reviewed research on other designs, for comparison:

[1] Rosebrock, A. (2021) Image inpainting with opencv and python, PyImageSearch. Available at: https://pyimagesearch.com/2020/05/18/image-inpainting-with-opencv-and-python/ [Accessed: 7 October, 2022].


[2] Rosebrock, A. (2021) Multi-class object detection and bounding box regression with Keras, tensorflow, and Deep Learning, PyImageSearch. Available at: https://pyimagesearch.com/2020/10/12/multi-class-object-detection-and-bounding-box-regression-with-keras-tensorflow-and-deep-learning/ [Accessed: 7 October, 2022].


[3] Rosebrock, A. (2021) OpenCV text detection (East text detector), PyImageSearch. Available at: https://pyimagesearch.com/2018/08/20/opencv-text-detection-east-text-detector/ [Accessed: 7 October, 2022].


<br>

Peer-reviewed research used in this project:

[1] Introduction to SIFT (Scale-Invariant Feature Transform). Available at: https://docs.opencv.org/4.x/da/df5/tutorial_py_sift_intro.html [Accessed: 16 October, 2022].

[2] Feature Matching. Available at: https://opencv24-python-tutorials.readthedocs.io/en/latest/py_tutorials/py_feature2d/py_matcher/py_matcher.html [Accessed: 16 October, 2022].

[3] Evaluating OpenCV’s new RANSACs. Available at: https://opencv.org/evaluating-opencvs-new-ransacs/ [Accessed: 16 October, 2022].

[4] Basic concepts of the homography explained with code. Available at: https://docs.opencv.org/4.x/d9/dab/tutorial_homography.html [Accessed: 16 October, 2022].

[5] Feature Detection OpenCV. Available at: https://docs.opencv.org/4.x/d7/d66/tutorial_feature_detection.html [Accessed: 31 October, 2022].

[6] Feature Matching OpenCV. Available at: https://docs.opencv.org/4.x/dc/dc3/tutorial_py_matcher.html [Accessed: 31 October, 2022].

[7] Pless, R. and Let (no date) Lecture 3: Camera rotations and homographies - ppt download, SlidePlayer. Available at: https://slideplayer.com/slide/14746373/ [Accessed: 16 November, 2022].

[8] Feature Matching + Homography to find Objects. Available at: https://docs.opencv.org/3.4/d1/de0/tutorial_py_feature_homography.html [Accessed: 16 November, 2022].

In [1]:
import cv2 # OpenCV
import numpy as np # NumPy, used to convert the keypoint lists to float32 arrays for cv2.findHomography

In [6]:
# open the two video capture devices (the device indices depend on the machine)
view_one = cv2.VideoCapture(1)
view_two = cv2.VideoCapture(0)

# cached homography: None on the first run and after a mouse click, which triggers a recalculation
# (module-level variable, so the mouse click callback can reset it)
cachedH = None

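# continuously read and show the frames
while True:
    ret_one, frame_one = view_one.read()
    ret_two, frame_two = view_two.read()

    # stop if either device fails to deliver a frame
    if not (ret_one and ret_two):
        break

    # resize both input frames to the same size
    frame_one = cv2.resize(frame_one, (600, 400))
    frame_two = cv2.resize(frame_two, (600, 400))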
    
    # caching H stabilizes the result, because H is not recalculated on every loop iteration
    # cachedH is None on the first iteration
    # or after the user clicks to recombine the input frames
    if cachedH is None:
        # First step: detect the keypoints
        # convert the input frames to grayscale for keypoint detection
        frame_one_gray = cv2.cvtColor(frame_one, cv2.COLOR_BGR2GRAY)
        frame_two_gray = cv2.cvtColor(frame_two, cv2.COLOR_BGR2GRAY)

        # create the SIFT detector
        # (cv2.SIFT_create() in OpenCV 4.4+; older opencv-contrib builds use cv2.xfeatures2d.SIFT_create())
        calculator = cv2.SIFT_create()
        
        # detect key points and calculate feature description information
        frame_one_keypoints, frame_one_features = calculator.detectAndCompute(frame_one_gray, None)
        frame_two_keypoints, frame_two_features = calculator.detectAndCompute(frame_two_gray, None)

        # demo of detected keypoints drawn on the frame
#         keypoints_frame_one = cv2.drawKeypoints(frame_one, frame_one_keypoints, None, color=(0, 255, 0), 
#                                                 flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
#         keypoints_frame_two = cv2.drawKeypoints(frame_two, frame_two_keypoints, None, color=(0, 255, 0), 
#                                                 flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
        
        
#         cv2.imshow('Keypoints frame one', keypoints_frame_one)
#         cv2.imshow('Keypoints frame two', keypoints_frame_two)
    

        # store the (x, y) coordinates of the detected keypoints in two lists
        frame_one_keypoints_array = [keypoint.pt for keypoint in frame_one_keypoints]
        frame_two_keypoints_array = [keypoint.pt for keypoint in frame_two_keypoints]
    
    
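        # Second step: match the keypoints
        # detectAndCompute returns None descriptors when a frame has no keypoints; try again on the next frames
        if frame_one_features is None or frame_two_features is None:
            cv2.waitKey(1)  # keep the windows responsive
            continue

        # create the brute-force descriptor matcher
        descriptor_matcher = cv2.BFMatcher()

        # use BFMatcher.knnMatch() to get the two best matches for each descriptor
        matches = descriptor_matcher.knnMatch(frame_one_features, frame_two_features, k=2)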
        
        # demo of matched keypoints drawn on the frame
#         matched_line = cv2.drawMatchesKnn(frame_one, frame_one_keypoints, frame_two, frame_two_keypoints, matches, None, 
#                                   flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)

#         cv2.imshow("Matched line", matched_line)
        
        # list used to store the best-matched keypoints
        matched_points = []

        # ratio test: keep a match only if its distance is clearly smaller
        # than the distance of the second-best candidate
        for pair in matches:
            # knnMatch may return fewer than two candidates for a descriptor
            if len(pair) < 2:
                continue
            m, n = pair
            if m.distance < 0.75 * n.distance:
                matched_points.append([m])

        # demo of the best matched keypoints drawn on the frame     
#         filtered_matched_line = cv2.drawMatchesKnn(frame_one, frame_one_keypoints, frame_two, frame_two_keypoints,
#                                                    matched_points, None, flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)

#         cv2.imshow("Filtered matched line", filtered_matched_line)


        # create two arrays to store the best matched keypoints on the original frames
        frame_one_best_keypoints_array = []
        frame_two_best_keypoints_array = []

        # cv2.findHomography() needs at least four point pairs to estimate H
        # if there are at least four best matches, look up each keypoint on the original frames and store it
        if len(matched_points) >= 4:
            for best_keypoint_one in matched_points:
                point = best_keypoint_one[0].queryIdx
                frame_one_best_keypoints_array.append(frame_one_keypoints_array[point])

            for best_keypoint_two in matched_points:
                point = best_keypoint_two[0].trainIdx
                frame_two_best_keypoints_array.append(frame_two_keypoints_array[point])
        else:
            # not enough matches: show the raw inputs, keep the windows responsive, and try again
            cv2.imshow("view one", frame_one)
            cv2.imshow("view two", frame_two)
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
            continue
        
        # use the best-matched point pairs to estimate the homography H (RANSAC, reprojection threshold 4.0)
        H, status = cv2.findHomography(np.float32(frame_one_best_keypoints_array), 
                                       np.float32(frame_two_best_keypoints_array), cv2.RANSAC, 4.0)

        # findHomography returns None when no valid model is found; try again on the next frames
        if H is None:
            cv2.waitKey(1)  # keep the windows responsive
            continue

        # cache the calculated H, which keeps the result stable across frames
        cachedH = H

    # warp one frame with H, then overlay the other frame on the warped frame
    rotated = cv2.warpPerspective(frame_one, cachedH, (frame_one.shape[1] + frame_two.shape[1], frame_one.shape[0]))
    result = rotated.copy()
    result[0:frame_two.shape[0], 0:frame_two.shape[1]] = frame_two
    
    # mouse click callback function that clears cachedH,
    # so that H is recalculated and reapplied to the input frame on the next loop iteration
    def draw(event, x, y, flags, param):
        global cachedH
        if event == cv2.EVENT_LBUTTONDOWN:
            cachedH = None

    cv2.namedWindow("result") # window that receives the mouse clicks
    cv2.setMouseCallback("result", draw) # register the mouse click callback
    
    # show the two original video inputs
    cv2.imshow("view one", frame_one)
    cv2.imshow("view two", frame_two)
    
    # show the combined result
    cv2.imshow("result", result)
    
    # stop the videos by pressing q  
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# release two webcams and destroy windows
view_one.release()
view_two.release()
cv2.destroyAllWindows() 