# Lab 2 Homographies

## GUI programming in OpenCV

The functions for displaying images in OpenCV are declared in <code>highgui.hpp</code>. The most important ones are:
 - <code>imshow()</code>: show an image in a window. If a window with the same name already exists, the new image replaces the one currently shown in it.
 - <code>waitKey()</code>: wait for a key press until the given timeout expires or a key is pressed.
 - <code>destroyWindow()</code>: destroy the named window.
 - <code>destroyAllWindows()</code>: destroy all windows created by the program.
 
Let's start with a simple version of our solution.

### Tips before you start
**For C++ users with Visual Studio on Windows**: If you want to open files using only the filename (without a path), you must put the files, such as images or videos, in the same directory as the cpp file that contains the main function.

For example: I created a project named Samplelab2 in the default directory <code>C:\Users\alisa\source\repos</code>. The cpp file that contains the main function is in <code>C:\Users\alisa\source\repos\Samplelab2\Samplelab2</code>, so I put the image file named "lenna.png" in that directory, as shown below:

<img src="img/lab02-1.PNG" width="800"/>

After that, you can open the file without specifying the folder path.

<code>Mat srcImage = imread("lenna.png");</code>

However, if you run the executable outside Visual Studio, you must put the files in the same directory as the executable.

**In Python**, the same applies: put the files in the directory you run the script from.

**For Linux users**: The image file must be in the directory you run the program from, typically the build directory that contains the executable.

### Show an image

### C++

In [None]:
#include <opencv2/opencv.hpp> // convenience header that includes all OpenCV modules
#include <iostream>

using namespace cv;  // lets you call anything in the cv namespace without typing cv::
using namespace std; // lets you call anything in the std namespace without typing std::

int main( int argc, char** argv )
{
    cout << "Show image" << endl;
    
    int key = -1;
    string img = "lenna.png";   // image filename
    Mat srcImage = imread(img); // read the image from file
    if (srcImage.empty()) {     // imread() returns an empty Mat on failure
        cout << "No image to show" << endl;
        return 1;
    }
    imshow("srcImage", srcImage);  // show the image in a window named "srcImage"
    
    // Wait up to 5000 milliseconds for a key press.
    // If no key is pressed in time, waitKey() returns -1 and the program continues.
    // If a key is pressed, waitKey() returns its key code immediately.
    // waitKey(0) waits indefinitely until the user presses a key.
    key = waitKey(5000);
    cout << "Key output value: " << key << endl;  // print the numeric key code (-1 on timeout)
    
    return 0;
}

### Python

In [None]:
import cv2

def main():
    # image filename
    path = r'lenna.png'
  
    # read the image from file
    img = cv2.imread(path)
    if img is None:   # imread() returns None on failure
        print("No image to show")
        return -1
  
    # show the image in a window named "image"
    cv2.imshow('image', img)
    # Wait up to 5000 milliseconds for a key press.
    # If no key is pressed in time, waitKey() returns -1 and the program continues.
    # If a key is pressed, waitKey() returns its key code immediately.
    # waitKey(0) waits indefinitely until the user presses a key.
    key = cv2.waitKey(5000)
    print("Key output value:", key)

main()

### Show a video

### C++

In [None]:
#include <opencv2/opencv.hpp> // convenience header that includes all OpenCV modules
#include <iostream>

using namespace cv;  // lets you call anything in the cv namespace without typing cv::
using namespace std; // lets you call anything in the std namespace without typing std::

// In C++, you can define a compile-time constant with #define
#define VIDEO_FILE "robot.mp4"   // input video filename
#define ROTATE false             // set to true to rotate each frame by 180 degrees

int main(int argc, char** argv)
{
    Mat matFrameCapture;
    Mat matFrameDisplay;
    int key = -1;

    // Open input video file
    VideoCapture videoCapture(VIDEO_FILE);
    if (!videoCapture.isOpened()) {
        cerr << "ERROR! Unable to open input video file " << VIDEO_FILE << endl;
        return -1;
    }

    // Capture loop
    while (key != int(' '))        // play the video until the spacebar is pressed
    {
        // Get the next frame
        videoCapture.read(matFrameCapture);
        if (matFrameCapture.empty()) {   // no more frames in the video
            // End of video file
            break;
        }

        // Rotate if needed; some videos are recorded upside down
#if ROTATE
        rotate(matFrameCapture, matFrameDisplay, RotateFlags::ROTATE_180);   // rotate by 180 degrees into matFrameDisplay
#else
        matFrameDisplay = matFrameCapture;
#endif

        float ratio = 640.0 / matFrameDisplay.cols;
        resize(matFrameDisplay, matFrameDisplay, cv::Size(), ratio, ratio, INTER_LINEAR); // scale to 640 pixels wide, keeping the aspect ratio

        // Display
        imshow(VIDEO_FILE, matFrameDisplay); // show the frame in a window named "robot.mp4"
        key = waitKey(30);
    }

    return 0;
}

### Python

In [None]:
import cv2
import numpy as np

VIDEO_FILE = "robot.mp4"   # input video filename
ROTATE = False             # set to True to rotate each frame by 180 degrees

def main():
    key = -1

    # Open input video file
    videoCapture = cv2.VideoCapture(VIDEO_FILE)
    if not videoCapture.isOpened():
        print("ERROR! Unable to open input video file ", VIDEO_FILE)
        return -1

    width  = videoCapture.get(cv2.CAP_PROP_FRAME_WIDTH)   # float `width`
    height = videoCapture.get(cv2.CAP_PROP_FRAME_HEIGHT)  # float `height`

    # Capture loop 
    while key != ord(' '):        # play the video until the spacebar is pressed
        # Get the next frame
        _, matFrameCapture = videoCapture.read()
        if matFrameCapture is None:   # no more frames in the video
            # End of video file
            break

        # Rotate if needed; some videos are recorded upside down
        if ROTATE:
            matFrameDisplay = cv2.rotate(matFrameCapture, cv2.ROTATE_180)   # cv2.rotate() returns the rotated image
        else:
            matFrameDisplay = matFrameCapture

        ratio = 640.0 / width
        dim = (int(width * ratio), int(height * ratio))
        # scale to 640 pixels wide, keeping the aspect ratio
        matFrameDisplay = cv2.resize(matFrameDisplay, dim)

        # show the frame in a window named "robot.mp4"
        cv2.imshow(VIDEO_FILE, matFrameDisplay)
        key = cv2.waitKey(30)

main()

First, let's get <code>waitKey()</code> to wait for the user to press the spacebar to advance to the next frame or 'q' to quit. Check the <link>[documentation for waitKey()](https://docs.opencv.org/4.3.0/d7/dfc/group__highgui.html#ga5628525ad33f52eab17feebcfba38bd7)</link>, change the delay parameter to 0 for an infinite wait, and perform the necessary action on a spacebar or 'q' key press.
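
One possible way to organize the key handling is sketched below. The helper `action_for_key()` and its return values are hypothetical names chosen for this illustration, not part of OpenCV. The sketch assumes that masking the return value with `0xFF` is enough to recover the character code, which is a common precaution because some backends set higher bits in the value returned by `waitKey()`.

```python
def action_for_key(key):
    """Map a raw waitKey() return value to 'quit', 'next', or 'ignore'.

    Hypothetical helper for illustration; the action names are assumptions.
    """
    if key < 0:              # -1 means no key was pressed before the timeout
        return "ignore"
    code = key & 0xFF        # keep only the low byte of the key code
    if code == ord("q"):
        return "quit"
    if code == ord(" "):
        return "next"
    return "ignore"
```

In the capture loop you could then call `action_for_key(cv2.waitKey(0))`, break out of the loop on `"quit"`, read the next frame on `"next"`, and wait again on `"ignore"`.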

The next thing we'd want to do is make sure the image window size is within the user's desktop display size. Currently, the display window is probably too big for your desktop. Take a look at the <link>[documentation for namedWindow()](https://docs.opencv.org/4.3.0/d7/dfc/group__highgui.html#ga5afdf8410934fd099df85c75b2e0888b)</link> and figure out which flags you should use to set up your display window to be resizable but keep the aspect ratio and display the expanded GUI.

Next we probably want to give the user some useful information. Check out the <link>[documentation for displayOverlay()](https://docs.opencv.org/4.3.0/dc/d46/group__highgui__qt.html#ga704e0387318cd1e7928e6fe17e81d6aa)</link> and add some explanatory information for the user about frame number, total frames, and user control actions.

Last little detail: currently, if the user closes the window, the program doesn't exit. Modify your program to exit when the image display window is closed. Hint: try <code>cv::getWindowProperty(VIDEO_FILE, cv::WND_PROP_VISIBLE)</code>.

## Getting four points for a homography

Now, we'd like to allow the user to select four points comprising a square in the real world then compute a rectifying homography for that square, thus rectifying the entire ground plane from the robot's point of view.

Check out the <link>[documentation for setMouseCallback()](https://docs.opencv.org/4.3.0/d7/dfc/group__highgui.html#ga89e7806b0a616f6f1d502bd8c183ad3e)</link>. Experiment with it until you can get four mouse clicks without interfering with the other GUI functions such as pan/tilt/zoom. Check that you are getting the image coordinates rather than the window coordinates of the mouse clicks.

The steps for getting the four points for the homography are:
 1. Run the video capture loop.
 2. Press any key to pause the video and open a new window showing the paused frame.
 3. Click four points with the mouse.
 
Example code follows:

### Import library and define some global variables

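In this step, we import the libraries, define the global variables shared with the mouse handler, and (in C++) declare the <code>mouseHandler</code> function.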

### C++

In [None]:
#include <opencv2/opencv.hpp> // convenience header that includes all OpenCV modules
#include <iostream>

using namespace cv;  // lets you call anything in the cv namespace without typing cv::
using namespace std; // lets you call anything in the std namespace without typing std::

// In C++, you can define a compile-time constant with #define
#define VIDEO_FILE "robot.mp4"   // input video filename
#define ROTATE false             // set to true to rotate each frame by 180 degrees

// Global variables used to share data with the mouse event handler
Mat matPauseScreen, matResult, matFinal;
Point point;
vector<Point> pts;
int var = 0;    // number of points selected so far
int drag = 0;   // 1 while the left mouse button is held down

// Forward declaration of the mouse handler function
void mouseHandler(int, int, int, int, void*);

### Python

In [1]:
import cv2
import numpy as np

VIDEO_FILE = "robot.mp4"   # input video filename
ROTATE = False             # set to True to rotate each frame by 180 degrees

matResult = None
matFinal = None
matPauseScreen = None

point = (-1, -1)
pts = []
var = 0 
drag = 0

### Write the mouse handler function that is called on mouse events

### C++

In [None]:
// An OpenCV mouse callback always takes these 5 parameters: event, x, y, flags, userdata
void mouseHandler(int event, int x, int y, int, void*)
{
    if (var >= 4) // all four points have already been selected, so do nothing
        return;
    if (event == EVENT_LBUTTONDOWN) // left mouse button pressed
    {
        drag = 1; // the button is now held down
        matResult = matFinal.clone(); // start drawing on a copy of the final image
        point = Point(x, y); // remember the current mouse position
        if (var >= 1) // at least one point already exists, so draw a line from the previous point
        {
            line(matResult, pts[var - 1], point, Scalar(0, 255, 0, 255), 2); // green line with thickness 2
        }
        circle(matResult, point, 2, Scalar(0, 255, 0), -1, 8, 0); // mark the current point in green
        imshow("Source", matResult); // show the current drawing
    }
    if (event == EVENT_LBUTTONUP && drag) // left mouse button released
    {
        drag = 0; // dragging has ended
        pts.push_back(point);  // add the current point to pts
        var++; // increase the point count
        matFinal = matResult.clone(); // commit the current drawing to the final image
        if (var >= 4) // all four homography points have been selected
        {
            line(matFinal, pts[0], pts[3], Scalar(0, 255, 0, 255), 2); // draw the closing line
            fillPoly(matFinal, pts, Scalar(0, 120, 0, 20), 8, 0); // fill the polygon defined by the points
            
            setMouseCallback("Source", NULL, NULL); // remove the mouse event handler
        }
        imshow("Source", matFinal);
    }
    if (drag) // the button is still held down, so the point follows the mouse
    {
        matResult = matFinal.clone(); // start drawing on a copy of the final image
        point = Point(x, y); // remember the current mouse position
        if (var >= 1) // at least one point already exists, so draw a line from the previous point
        {
            line(matResult, pts[var - 1], point, Scalar(0, 255, 0, 255), 2); // green line with thickness 2
        }
        circle(matResult, point, 2, Scalar(0, 255, 0), -1, 8, 0); // mark the current point in green
        imshow("Source", matResult); // show the current drawing
    }
}

### Python

In [None]:
# An OpenCV mouse callback always takes these 5 parameters: event, x, y, flags, param
def mouseHandler(event, x, y, flags, param):
    global point, pts, var, drag, matFinal, matResult   # globals modified in this function

    if var >= 4:                             # all four points have already been selected, so do nothing
        return
    if event == cv2.EVENT_LBUTTONDOWN:       # left mouse button pressed
        drag = 1                             # the button is now held down
        matResult = matFinal.copy()          # start drawing on a copy of the final image
        point = (x, y)                       # remember the current mouse position
        if var >= 1:                         # at least one point already exists, so draw a line from the previous point
            cv2.line(matResult, pts[var - 1], point, (0, 255, 0, 255), 2)    # green line with thickness 2
        cv2.circle(matResult, point, 2, (0, 255, 0), -1, 8, 0)             # mark the current point in green
        cv2.imshow("Source", matResult)      # show the current drawing
    if event == cv2.EVENT_LBUTTONUP and drag:    # left mouse button released
        drag = 0                             # dragging has ended
        pts.append(point)                    # add the current point to pts
        var += 1                             # increase the point count
        matFinal = matResult.copy()          # commit the current drawing to the final image
        if var >= 4:                         # all four homography points have been selected
            cv2.line(matFinal, pts[0], pts[3], (0, 255, 0, 255), 2)   # draw the closing line
            cv2.fillConvexPoly(matFinal, np.array(pts, 'int32'), (0, 120, 0, 20))        # fill the polygon defined by the points
        cv2.imshow("Source", matFinal)
    if drag:                                      # the button is still held down, so the point follows the mouse
        matResult = matFinal.copy()               # start drawing on a copy of the final image
        point = (x, y)                            # remember the current mouse position
        if var >= 1:                              # at least one point already exists, so draw a line from the previous point
            cv2.line(matResult, pts[var - 1], point, (0, 255, 0, 255), 2)    # green line with thickness 2
        cv2.circle(matResult, point, 2, (0, 255, 0), -1, 8, 0)         # mark the current point in green
        cv2.imshow("Source", matResult)           # show the current drawing

### Write the main function to play the video and set up the mouse handler on the window

### C++

In [None]:
int main(int argc, char** argv)
{
    Mat matFrameCapture;
    Mat matFrameDisplay;
    int key = -1;

    // --------------------- [STEP 1: Make video capture from file] ---------------------
    // Open input video file
    VideoCapture videoCapture(VIDEO_FILE);
    if (!videoCapture.isOpened()) {
        cerr << "ERROR! Unable to open input video file " << VIDEO_FILE << endl;
        return -1;
    }

    // Capture loop
    while (key < 0)        // play the video until any key is pressed
    {
        // Get the next frame
        videoCapture.read(matFrameCapture);
        if (matFrameCapture.empty()) {   // no more frames in the video
            // End of video file
            break;
        }
        cvtColor(matFrameCapture, matFrameCapture, COLOR_BGR2BGRA);

        // Rotate if needed; some videos are recorded upside down
#if ROTATE
        rotate(matFrameCapture, matFrameCapture, RotateFlags::ROTATE_180);   // rotate by 180 degrees in place
#endif

        float ratio = 640.0 / matFrameCapture.cols;
        resize(matFrameCapture, matFrameDisplay, cv::Size(), ratio, ratio, INTER_LINEAR);

        // Display
        imshow(VIDEO_FILE, matFrameDisplay); // show the frame in a window named "robot.mp4"
        key = waitKey(30);
        
        // --------------------- [STEP 2: pause the video and keep the current frame] ---------------------
        if (key >= 0)
        {
            matPauseScreen = matFrameCapture;  // keep the current frame for processing
            matFinal = matPauseScreen.clone(); // working copy that accumulates the drawing
        }
    }

    // --------------------- [STEP 3: use the mouse handler to select 4 points] ---------------------
    if (!matFrameCapture.empty())
    {
        var = 0;   // reset the number of saved points
        pts.clear(); // reset all points
        namedWindow("Source", WINDOW_AUTOSIZE);  // create a window named "Source"
        setMouseCallback("Source", mouseHandler, NULL); // attach mouseHandler to the "Source" window
        imshow("Source", matPauseScreen); // show the paused frame
        waitKey(0); // wait until any key is pressed
        destroyWindow("Source"); // destroy the window
    }
    else
    {
        cout << "The video finished before you paused it; the program will stop" << endl;
        return 0;
    }
    
    return 0;
}

### Python

In [None]:
def main():
    global matFinal, matResult, matPauseScreen, var    # globals assigned in this function
    key = -1

    # --------------------- [STEP 1: Make video capture from file] ---------------------
    # Open input video file
    videoCapture = cv2.VideoCapture(VIDEO_FILE)
    if not videoCapture.isOpened():
        print("ERROR! Unable to open input video file ", VIDEO_FILE)
        return -1

    width  = videoCapture.get(cv2.CAP_PROP_FRAME_WIDTH)   # float `width`
    height = videoCapture.get(cv2.CAP_PROP_FRAME_HEIGHT)  # float `height`

    # Capture loop 
    while key < 0:        # play the video until any key is pressed
        # Get the next frame
        _, matFrameCapture = videoCapture.read()
        if matFrameCapture is None:   # no more frames in the video
            # End of video file
            break

        # Rotate if needed; some videos are recorded upside down
        if ROTATE:
            matFrameDisplay = cv2.rotate(matFrameCapture, cv2.ROTATE_180)   # cv2.rotate() returns the rotated image
        else:
            matFrameDisplay = matFrameCapture

        ratio = 640.0 / width
        dim = (int(width * ratio), int(height * ratio))
        # scale to 640 pixels wide, keeping the aspect ratio
        matFrameDisplay = cv2.resize(matFrameDisplay, dim)

        # show the frame in a window named "robot.mp4"
        cv2.imshow(VIDEO_FILE, matFrameDisplay)
        key = cv2.waitKey(30)

        # --------------------- [STEP 2: pause the video and keep the current frame] ---------------------
        if key >= 0:
            matPauseScreen = matFrameCapture     # keep the current frame for processing
            matFinal = matPauseScreen.copy()     # working copy that accumulates the drawing

    # --------------------- [STEP 3: use the mouse handler to select 4 points] ---------------------
    if matFrameCapture is not None:
        var = 0                                             # reset the number of saved points (global)
        pts.clear()                                         # reset all points
        cv2.namedWindow("Source", cv2.WINDOW_AUTOSIZE)      # create a window named "Source"
        cv2.setMouseCallback("Source", mouseHandler)        # attach mouseHandler to the "Source" window
        cv2.imshow("Source", matPauseScreen)                # show the paused frame
        cv2.waitKey(0)                                      # wait until any key is pressed
        cv2.destroyWindow("Source")                         # destroy the window
    else:
        print("The video finished before you paused it; the program will stop")
        return 0

main()

### The result should look like this

<img src="img/lab02-2.PNG" width="800"/>

### Calculate Homography

Now, given the four points that you've collected from the user, calculate a homography to a rectified square with a desired number of pixels per meter, e.g., 1000. The tiles in the video from last week are 60cm x 60cm.

Note that OpenCV doesn't have a <code>null()</code> function like Matlab and Octave. Instead, you'll have to use the SVD operation to get the right singular vector associated with the smallest singular value of the design matrix. OpenCV returns V transposed (<code>vt</code>), so this vector is the last row of <code>vt</code>. Test that you get the same result (up to sign and scale) from OpenCV's SVD operation and Octave's <code>null()</code> function.
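
To make the SVD step concrete, here is a minimal sketch of the direct linear transform using NumPy instead of OpenCV's SVD (an assumption made so the example runs without OpenCV; the idea is the same). The function name `homography_from_points()` is hypothetical. Each point pair contributes two rows to the design matrix, and the homography is the right singular vector for the smallest singular value, reshaped to 3x3. The sketch does not normalize the coordinates and assumes the bottom-right entry of the result is nonzero.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate a 3x3 H with dst ~ H @ src from four or more (x, y) pairs.

    Hypothetical helper for illustration; no coordinate normalization is done.
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear constraints on the 9 entries of H per correspondence
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=np.float64)   # design matrix, 2N x 9

    # Rows of vt are right singular vectors, ordered by decreasing singular value
    _, _, vt = np.linalg.svd(A)
    h = vt[-1]                               # null-space direction of A

    H = h.reshape(3, 3)
    return H / H[2, 2]                       # fix the arbitrary scale (assumes H[2, 2] != 0)
```

For the tiles in this lab, at 1000 pixels per meter a 60 cm tile spans 600 pixels, so the destination points could be `(0, 0)`, `(600, 0)`, `(600, 600)`, and `(0, 600)`, listed in the same order as the clicked points.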

Once you've got a homography that works for the selected quadrilateral, you'll want to adjust it by incorporating a translation that maps the bounding box of the transformed image to a valid range starting at 0 for the uppermost Y coordinate and leftmost X coordinate.
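
The translation adjustment can be sketched as follows, again with NumPy only. The helper `shift_to_origin()` is a hypothetical name. It warps the four image corners, finds their bounding box, and prepends a translation so the box starts at (0, 0). It assumes all four corners map to finite points, which will not hold if the horizon of the ground plane crosses the image; in that case you would need to crop the source region first.

```python
import numpy as np

def shift_to_origin(H, width, height):
    """Return (H_shifted, (out_w, out_h)) so the warped image starts at (0, 0).

    Hypothetical helper; assumes the warped corners are all finite.
    """
    # Image corners as homogeneous column vectors
    corners = np.array([[0, 0, 1],
                        [width, 0, 1],
                        [width, height, 1],
                        [0, height, 1]], dtype=np.float64).T
    warped = H @ corners
    warped = warped[:2] / warped[2]          # back to inhomogeneous coordinates

    min_x, min_y = warped.min(axis=1)
    max_x, max_y = warped.max(axis=1)

    # Translation that moves the bounding box's top-left corner to the origin
    T = np.array([[1.0, 0.0, -min_x],
                  [0.0, 1.0, -min_y],
                  [0.0, 0.0, 1.0]])
    out_size = (int(np.ceil(max_x - min_x)), int(np.ceil(max_y - min_y)))
    return T @ H, out_size
```

The returned size can be used as the output image size when rendering the rectified frame.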

### Display rectified image

Once you've got that working, you'll want to display a rectified version of the original image in a second HighGUI window as we step through the video. There are two ways to do this: directly (manually) and using <code>cv::warpPerspective()</code>. For this lab's learning outcomes, it would be better for you to do it directly/manually using bilinear interpolation. This will give you a better understanding of how to render image transforms. In your own work later, go ahead and use <code>warpPerspective()</code> or whatever suits you.
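
A minimal sketch of the manual approach is shown below, using NumPy only. The function name `warp_bilinear()` is hypothetical. It uses inverse mapping: every destination pixel is sent back through the inverse homography, and the source image is sampled there with bilinear interpolation. Pixels that fall outside the source are set to 0. The sketch does not check whether a point maps from behind the camera, so treat it as a starting point rather than a complete renderer.

```python
import numpy as np

def warp_bilinear(src, H, out_w, out_h):
    """Warp `src` (h x w or h x w x c) by homography H into an out_h x out_w image."""
    h, w = src.shape[:2]
    H_inv = np.linalg.inv(H)

    # Homogeneous coordinates of every destination pixel, one column per pixel
    xs, ys = np.meshgrid(np.arange(out_w), np.arange(out_h))
    dst_pts = np.stack([xs, ys, np.ones_like(xs)]).reshape(3, -1).astype(np.float64)

    # Map destination pixels back into the source image
    src_pts = H_inv @ dst_pts
    with np.errstate(divide="ignore", invalid="ignore"):
        sx = src_pts[0] / src_pts[2]
        sy = src_pts[1] / src_pts[2]

    # Only sample where the source location lies inside the image
    valid = (sx >= 0) & (sx <= w - 1) & (sy >= 0) & (sy <= h - 1)
    sx = np.where(valid, sx, 0.0)
    sy = np.where(valid, sy, 0.0)

    # The four neighbouring source pixels and the fractional offsets
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx = (sx - x0)[:, None]
    fy = (sy - y0)[:, None]

    img = src.astype(np.float64)
    if img.ndim == 2:
        img = img[:, :, None]                # treat grayscale as one channel

    # Interpolate along x on the two rows, then along y between them
    top = img[y0, x0] * (1 - fx) + img[y0, x1] * fx
    bottom = img[y1, x0] * (1 - fx) + img[y1, x1] * fx
    out = top * (1 - fy) + bottom * fy
    out[~valid] = 0

    if np.issubdtype(src.dtype, np.integer):
        out = np.rint(out)                   # round before casting back to integers
    out = out.reshape(out_h, out_w, -1)
    if src.ndim == 2:
        out = out[:, :, 0]
    return out.astype(src.dtype)
```

Comparing the output of this function against <code>cv::warpPerspective()</code> with linear interpolation on the same frame is a useful sanity check.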

### Display original and rectified optical flows
Once you have the display of the original and ground-plane-rectified images working, add the optical flows from Lab 01 and render them in both images. This will be really useful.

One example is to use <code>getPerspectiveTransform()</code> to get the homography:

### C++

In [None]:
// Put the code in the main function

    if (pts.size() == 4)
    {
        Point2f src[4];
        for (int i = 0; i < 4; i++)
        {
            src[i].x = pts[i].x * 1.0;
            src[i].y = pts[i].y * 1.0;
        }
        Point2f reals[4];
        reals[0] = Point2f(800.0, 800.0);
        reals[1] = Point2f(1000.0, 800.0);
        reals[2] = Point2f(1000.0, 1000.0);
        reals[3] = Point2f(800.0, 1000.0);

        Mat homography_matrix = getPerspectiveTransform(src, reals);
        std::cout << "Estimated Homography Matrix is:" << std::endl;
        std::cout << homography_matrix << std::endl;

        // perspective transform operation using transform matrix
        cv::warpPerspective(matPauseScreen, matResult, homography_matrix, matPauseScreen.size(), cv::INTER_LINEAR);
        imshow("Source", matPauseScreen);
        imshow("Result", matResult);

        waitKey(0);
    }

### Python

In [None]:
    if (len(pts) == 4):
        src = np.array(pts).astype(np.float32)

        reals = np.array([(800, 800),
                          (1000, 800),
                          (1000, 1000),
                          (800, 1000)], np.float32)

        homography_matrix = cv2.getPerspectiveTransform(src, reals)
        print("Estimated Homography Matrix is:")
        print(homography_matrix)

        # perspective transform operation using transform matrix

        h, w, ch = matPauseScreen.shape
        # pass flags by keyword: the fourth positional argument of warpPerspective() is dst, not flags
        matResult = cv2.warpPerspective(matPauseScreen, homography_matrix, (w, h), flags=cv2.INTER_LINEAR)
        matPauseScreen = cv2.resize(matPauseScreen, dim)
        cv2.imshow("Source", matPauseScreen)
        matResult = cv2.resize(matResult, dim)
        cv2.imshow("Result", matResult)

        cv2.waitKey(0)

### The result should look like this

<img src="img/lab02-3.PNG" width="800"/>

### Reuse the homography from your last run
Learn how OpenCV stores data files in YML format using the FileStorage class. When the user selects four points in a frame, output the resulting homography to a data file and re-read that file when the program starts again. That way, the user only has to do the "calibration" once.

### What to turn in
Write a brief report and turn it in to the instructor before the next lab. Make a video showing the frames of the original video with optical flows side-by-side with the rectified image and rectified optical flows, put the video online, and point to it on the Piazza discussion board.