# **Computer Vision 2023 Assignment 2: Image Matching & Retrieval**

In this prac, you will experiment with image feature detectors, descriptors and matching. **There are 3 main parts to the prac:**

- Matching an object in a pair of images
- Searching for an object in a collection of images
- Analysis and discussion of results

<br/>

## **General Instructions**

As before, you will use this notebook to run your code and display your results and analysis. Again we will mark a PDF conversion of your notebook, referring to your code if necessary, so you should ensure your code output is formatted neatly.

***When converting to PDF, include the outputs and analysis only, not your code.*** You can do this from the command line using the `nbconvert` command (installed as part of Jupyter) as follows:

    jupyter nbconvert Assignment2.ipynb --to pdf --no-input --TagRemovePreprocessor.remove_cell_tags 'remove-cell'

This will also remove the preamble text from each question. The command has been packaged into a small notebook, `notebooktopdf.ipynb`, which you can run in Colab.


We will use the `OpenCV` library to complete the prac. It has several built in functions that will be useful. You are expected to consult documentation and use them appropriately.

As with the last assignment, it is somewhat up to you how you answer each question. Ensure that the outputs and report are clear and easy to read so that the markers can rapidly assess what you have done, why you did it, and how deep your understanding is. This includes:

- Sizing, arranging and captioning image outputs appropriately
- Explaining what you have done clearly and concisely
- Clearly separating answers to each question

<br/>

## **Data**

We have provided some example images for this assignment, available through a link on the MyUni assignment page. The images are organised by subject matter, with one folder containing images of book covers, one of museum exhibits, and another of urban landmarks. You should copy these data into a directory `A2_smvs`, keeping the directory structure the same as in the zip file.

Within each category (within each folder), there is a “Reference” folder containing a clean image of each object and a “Query” folder containing images taken on a mobile device. Within each category, images with the same name contain the same object (so 001.jpg in the Reference folder contains the same book as 001.jpg in the Query folder).
The data is a subset of the Stanford Mobile Visual Search Dataset which is available at

<http://web.cs.wpi.edu/~claypool/mmsys-dataset/2011/stanford/index.html>.

The full data set contains more image categories and more query images of the objects we have provided, which may be useful for your testing!

**Do not submit your own copy of the data or rename any files or folders!** For marking, we will assume the datasets are available in subfolders of the working directory using the same folder names provided.
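
The folder layout described above can be walked as in the following sketch, which pairs reference and query images by file name. The helper name `list_image_pairs` is hypothetical (it is not part of the provided code), and the demo builds a throwaway directory tree of empty files so that it runs without the real data.

```python
from pathlib import Path
import tempfile

def list_image_pairs(root, category):
    """Return (reference_path, query_path) pairs that share a file name.

    Hypothetical helper, not part of the assignment code.
    """
    ref_dir = Path(root) / category / "Reference"
    query_dir = Path(root) / category / "Query"
    pairs = []
    for ref_path in sorted(ref_dir.glob("*.jpg")):
        query_path = query_dir / ref_path.name
        if query_path.exists():          # skip references with no matching query
            pairs.append((ref_path, query_path))
    return pairs

# Demo on a temporary tree that mimics A2_smvs/<category>/{Reference,Query}
with tempfile.TemporaryDirectory() as tmp:
    for sub in ("Reference", "Query"):
        folder = Path(tmp) / "book_covers" / sub
        folder.mkdir(parents=True)
        for name in ("001.jpg", "002.jpg"):
            (folder / name).touch()
    demo_pairs = [(r.name, q.name) for r, q in list_image_pairs(tmp, "book_covers")]

print(demo_pairs)
```

With the real data in place, a call such as `list_image_pairs("A2_smvs", "book_covers")` would be the equivalent.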

Here is some general setup code, which you can edit to suit your needs.

In [1]:
# Numpy is the Main Package for Scientific Computing with Python
import numpy as np
import cv2

# Matplotlib is a Useful Plotting Library for Python
import matplotlib.pyplot as plt

# This Code is to Make Matplotlib Figures Appear Inline in the Notebook Rather than in a New Window
%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0)    # Set Default Size of Plots, Can Be Changed
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# Some More Magic so that the Notebook Will Reload External Python Modules. see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2
%reload_ext autoreload

In [2]:
def draw_outline(ref, query, model):
    """
    Draws the outline of the reference image projected onto the query image using a given perspective transformation matrix.
    Parameters:
        ref (ndarray): The reference image (grayscale or color).
        query (ndarray): The query image in which to draw the outline.
        model (ndarray): The homography matrix mapping reference image coordinates to query image coordinates.
    """
    # Get Dimensions of the Reference Image
    h, w = ref.shape[:2]

    # Define the Four Corners of the Reference Image
    pts = np.float32([[0, 0], [0, h-1], [w-1, h-1], [w-1, 0]]).reshape(-1, 1, 2)

    # Transform the Corners Using the Homography Matrix
    dst = cv2.perspectiveTransform(pts, model)

    # Copy the Query Image and Draw the Projected Outline
    img = query.copy()
    img = cv2.polylines(img, [np.int32(dst)], isClosed=True, color=255, thickness=3, lineType=cv2.LINE_AA)

    # Display the Result
    plt.imshow(img, 'gray'), plt.show()


def draw_inliers(img1, img2, kp1, kp2, matches, matchesMask):
    """
    Draws the inlier keypoint matches between two images.
    Parameters:
        img1, img2 (ndarray): The reference and query images.
        kp1, kp2 (list): Keypoints from both images.
        matches (list): Filtered list of good matches.
        matchesMask (ndarray): Mask indicating which matches are inliers (from cv2.findHomography).
    """
    # Flatten the Mask Array
    matchesMask = matchesMask.ravel().tolist()

    # Set Drawing Parameters: Green Colour for Inliers
    draw_params = dict(
        matchColor=(0, 255, 0),      # Green for Inlier Matches
        singlePointColor=None,       # Unused Here, as Single Points Are Not Drawn (see flags)
        matchesMask=matchesMask,     # Only Show Inliers
        flags=2                      # 2 = NOT_DRAW_SINGLE_POINTS: Skip Unmatched Keypoints
    )

    # Draw Matches
    img3 = cv2.drawMatches(img1, kp1, img2, kp2, matches, None, **draw_params)

    # Show the Image with Matches
    plt.imshow(img3, 'gray'), plt.show()


def drawlines(img1, img2, lines, pts1, pts2):
    """
    Draws epipolar lines on the first image and the corresponding points on both images to visualise epipolar geometry.
    Parameters:
        img1 (ndarray): First (reference) image for drawing lines.
        img2 (ndarray): Second image with points.
        lines (ndarray): Epilines corresponding to points in img2.
        pts1 (ndarray): Points in img1.
        pts2 (ndarray): Corresponding points in img2.
    Returns:
        img1 (ndarray): Image with epipolar lines.
        img2 (ndarray): Image with marked points.
    """
    # Get Image Width (assuming grayscale input); Only the Column Count is Needed
    _, c = img1.shape

    # Convert to Colour for Drawing
    img1 = cv2.cvtColor(img1, cv2.COLOR_GRAY2BGR)
    img2 = cv2.cvtColor(img2, cv2.COLOR_GRAY2BGR)

    # Draw Each Epipolar Line and its Corresponding Points
    for line, pt1, pt2 in zip(lines, pts1, pts2):
        color = tuple(np.random.randint(0, 255, 3).tolist())  # Random Colour for Each Line

        # Compute Two Points on the Epipolar Line, at the Left and Right Image Edges
        x0, y0 = map(int, [0, -line[2]/line[1]])
        x1, y1 = map(int, [c, -(line[2] + line[0]*c)/line[1]])

        # Draw the Epipolar Line on img1
        img1 = cv2.line(img1, (x0, y0), (x1, y1), color, 1)

        # Draw the Corresponding Points on Both Images (Cast to int for OpenCV)
        img1 = cv2.circle(img1, tuple(int(v) for v in np.ravel(pt1)), 5, color, -1)
        img2 = cv2.circle(img2, tuple(int(v) for v in np.ravel(pt2)), 5, color, -1)

    return img1, img2

<br/><br/>

# **Question 1: Matching an Object in a Pair of Images *(60%)***


In this question, the aim is to accurately locate a reference object in a query image, for example:

![Books](book.png "Books")

**0. Download and read through the paper [ORB: an efficient alternative to SIFT or SURF](https://www.researchgate.net/publication/221111151_ORB_an_efficient_alternative_to_SIFT_or_SURF) by Rublee et al.** You don't need to understand all the details, but try to get an idea of how it works. ORB combines the FAST corner detector and the BRIEF descriptor. BRIEF is based on similar ideas to the SIFT descriptor we covered in week 3, but with some changes for efficiency.

**1. [Load images]** Load the first (reference, query) image pair from the "book_covers" category using OpenCV (e.g. `img = cv2.imread()`). Check the flag parameter of `cv2.imread()` to ensure that you read the image as grayscale, since this is necessary for computing ORB features.

**2. [Detect features]** Create an OpenCV ORB feature extractor with `orb = cv2.ORB_create()`. You can then detect keypoints with `kp = orb.detect(img, None)` and compute descriptors with `kp, des = orb.compute(img, kp)`. Do this for each image, then use `cv2.drawKeypoints()` for visualisation.

**3. [Match features]** As ORB is a binary feature, you need to use the Hamming distance for matching, e.g. `bf = cv2.BFMatcher(cv2.NORM_HAMMING)`. You are then required to do KNN matching (k=2) using `bf.knnMatch()`, followed by the ratio test. The ratio test was used with SIFT to find good matches and was described in the lecture. As a default, you can set `ratio=0.8`.

**4. [Plot and analyse]** Visualise the matches using the `cv2.drawMatches()` function. You can also change the ratio value, the parameters of `cv2.ORB_create()`, and the distance function in `cv2.BFMatcher()`. Please discuss how these changes influence the number of matches.


In [15]:
# Load Images as Grey Scale (flag 0 = cv2.IMREAD_GRAYSCALE)
# Book Cover Images
img1 = cv2.imread('A2_smvs/book_covers/Reference/001.jpg', 0)
if img1 is None:
  # cv2.imread Returns None if the File Cannot Be Read
  print("Could not load img1. Check the path, filename and current working directory\n")

img2 = cv2.imread("A2_smvs/book_covers/Query/001.jpg", 0)
if img2 is None:
  print("Could not load img2. Check the path, filename and current working directory\n")


# Landmark Images
img3 = cv2.imread('A2_smvs/landmarks/Reference/001.jpg', 0)
if img3 is None:
  print("Could not load img3. Check the path, filename and current working directory\n")

img4 = cv2.imread("A2_smvs/landmarks/Query/001.jpg", 0)
if img4 is None:
  print("Could not load img4. Check the path, filename and current working directory\n")


# Museum Paintings Images
img5 = cv2.imread('A2_smvs/museum_paintings/Reference/001.jpg', 0)
if img5 is None:
  print("Could not load img5. Check the path, filename and current working directory\n")

img6 = cv2.imread("A2_smvs/museum_paintings/Query/001.jpg", 0)
if img6 is None:
  print("Could not load img6. Check the path, filename and current working directory\n")

In [4]:
# Your code for descriptor matching tests here

# compute detector and descriptor, see (2) above


# find the keypoints and descriptors with ORB, see (2) above

# draw keypoints, see (2) above


# create BFMatcher object, see (3) above


# Match descriptors, see (3) above


# Apply ratio test, see (3) above
#good = []
#for m,n in matches:


# draw matches, see (4) above



***Your explanation of what you have done, and your results, here***

---

<br/>

**5. Estimate a homography transformation based on the matches, using `cv2.findHomography()`. Display the transformed outline of the first reference book cover image on the query image, to see how well they match.**
- We provide a function `draw_outline()` to help with the display, but you may need to edit it for your needs.
- Try the least-squares option (`method=0`) to compute the homography, and visualise the inliers using `cv2.drawMatches()`. Explain your results.
- Again, you don't need to compare results numerically at this stage. Comment on what you observe visually.

In [5]:
# Create src_pts and dst_pts as float arrays to be passed into cv2.findHomography


# using cv2 standard (least-squares) method, see (5) above

# draw frame

# draw inliers



***Your explanation of results here***

<br/>

**Try the RANSAC option to compute homography. Change the RANSAC parameters, and explain your results. Print and analyse the inlier numbers.**

In [6]:
# Your code to display book location after RANSAC here

# using RANSAC

# draw frame


# draw inliers


# inlier number


***Your explanation of what you have tried, and results here***

---

<br/>

**6. Finally, try matching several different image pairs from the data provided, including at least one success and one failure case. For the failure case, test and explain what step in the feature matching has failed, and try to improve it. Display and discuss your findings.**
1. **Hint 1:** In general, the book covers should be the easiest to match, while the landmarks are the hardest.
2. **Hint 2:** Explain why you chose each example shown, and what parameter settings were used.
3. **Hint 3:** Possible failure points include the feature detector, the feature descriptor, the matching strategy, or a combination of these.

In [7]:
# Your results for other image pairs here

***Your explanation of results here***

---

<br/><br/>

# **Question 2: What Am I Looking At? *(40%)***

One application of feature matching is image retrieval. The goal of image retrieval is, given a query image of an object, to find all images in a database containing the same object, and return the results in ranked order (like a Google search). This is a huge research area but we will implement a very basic version of the problem based on the small dataset provided.

In this question, the aim is to identify an "unknown" object depicted in a query image, by matching it to multiple reference images, and selecting the highest scoring match. Since we only have one reference image per object, there is at most one correct answer. This is useful for example if you want to automatically identify a book from a picture of its cover, or a painting or a geographic location from an unlabelled photograph of it.

**The steps are as follows:**
**1.** Select a set of reference images and their corresponding query images.
- **Hint 1:** Start with the book covers, or just a subset of them.
- **Hint 2:** This question can require a lot of computation to run from start to finish, so cache intermediate results *(e.g. feature descriptors)* where you can.
    
**2.** Choose one query image corresponding to one of your reference images. Use RANSAC to match your query image to each reference image, and count the number of inlier matches found in each case. This will be the matching score for that image.

**3.** Identify the query object. This is the identity of the reference image with the highest match score, or "not in dataset" if the maximum score is below a threshold.

**4.** Repeat steps 2-3 for every query image and report the overall accuracy of your method *(that is, the percentage of query images that were correctly matched in the dataset).* Discussion of results should include both overall accuracy and individual failure cases.
- **Hint 1:** In case of failure, what ranking did the actual match receive? If we used a *"top-k"* accuracy measure, where a match is considered correct if it appears in the top k match scores, would that change the result?
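
The identification and top-k ideas in steps 3 and 4 can be sketched with plain NumPy. The inlier counts below are made up for illustration, the function names are hypothetical, and `min_score = 10` is an assumed threshold that you would need to tune.

```python
import numpy as np

def identify(scores, min_score):
    """Return the index of the best reference, or None for 'not in dataset'."""
    best = int(np.argmax(scores))
    return best if scores[best] >= min_score else None

def rank_of(scores, true_index):
    """1-based rank of the true reference when scores are sorted high to low."""
    order = np.argsort(-np.asarray(scores), kind="stable")
    return int(np.where(order == true_index)[0][0]) + 1

def top_k_accuracy(score_matrix, true_indices, k):
    """Fraction of queries whose true reference is among the k best scores."""
    ranks = [rank_of(row, t) for row, t in zip(score_matrix, true_indices)]
    return float(np.mean([r <= k for r in ranks]))

# Made-up inlier counts: rows are queries, columns are reference images
score_matrix = np.array([[42,  5,  7],
                         [ 6,  9, 11],    # true match (index 1) is ranked second
                         [ 4,  3, 30]])
true_indices = [0, 1, 2]
min_score = 10   # assumed threshold; tune it on your own data

predictions = [identify(row, min_score) for row in score_matrix]
top1 = top_k_accuracy(score_matrix, true_indices, k=1)
top2 = top_k_accuracy(score_matrix, true_indices, k=2)
print(predictions, top1, top2)
```

In this toy example the second query is misidentified at top-1 but counted as correct at top-2, which is the kind of difference Hint 1 asks about.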

***Code to implement this algorithm should mostly be written in a supporting file such as a2code.py. Call your code and display outputs in the notebook below.***
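
For the caching suggested in Hint 2 of step 1, one possible approach is to store each descriptor array on disk the first time it is computed. The helper `cached_descriptors` is hypothetical and could live in a supporting file; the demo uses a fake compute function so that it runs without images.

```python
import tempfile
from pathlib import Path

import numpy as np

def cached_descriptors(image_path, cache_dir, compute_fn):
    """Load descriptors from cache_dir if present, else compute and store them.

    Hypothetical helper: compute_fn(image_path) must return a NumPy array,
    such as the `des` array returned by orb.compute(). The cache key is the
    file stem only, so use one cache_dir per folder to avoid name clashes.
    """
    cache_file = Path(cache_dir) / (Path(image_path).stem + ".npy")
    if cache_file.exists():
        return np.load(cache_file)
    des = compute_fn(image_path)
    np.save(cache_file, des)
    return des

# Demo with a fake compute function that records how often it is called
calls = []
def fake_compute(path):
    calls.append(path)
    return np.zeros((5, 32), dtype=np.uint8)   # ORB descriptors are 32 bytes each

with tempfile.TemporaryDirectory() as tmp:
    first = cached_descriptors("Reference/001.jpg", tmp, fake_compute)
    second = cached_descriptors("Reference/001.jpg", tmp, fake_compute)

print(len(calls), second.shape)
```

Keypoint coordinates can be cached the same way after converting them to an array, for example with `np.float32([k.pt for k in kp])`.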


In [8]:
# Your code to identify query objects and measure search accuracy for data set here



***Your explanation of what you have done, and your results, here***

---

<br/>

**5.** Choose some extra query images of objects that do not occur in the reference dataset. Repeat step 4 with these images added to your query set. Accuracy is now measured by the percentage of query images correctly identified in the dataset, or correctly identified as not occurring in the dataset. Report how accuracy is altered by including these queries, and any changes you have made to improve performance.
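
Once "not in dataset" queries are included, accuracy depends on the rejection threshold. The sketch below shows one way to compute it; the result tuples are made-up numbers and `accuracy_with_rejection` is a hypothetical helper.

```python
def accuracy_with_rejection(results, min_score):
    """Accuracy when low-scoring queries are labelled 'not in dataset' (None).

    results: list of (best_index, best_score, true_index_or_None) per query.
    """
    correct = 0
    for best_index, best_score, truth in results:
        predicted = best_index if best_score >= min_score else None
        correct += (predicted == truth)
    return correct / len(results)

# Made-up results: the last two queries show objects absent from the references
results = [(0, 42, 0), (2, 11, 1), (2, 30, 2), (1, 12, None), (0, 4, None)]

accuracy_by_threshold = {}
for min_score in (5, 10, 15):
    accuracy_by_threshold[min_score] = accuracy_with_rejection(results, min_score)
    print(min_score, accuracy_by_threshold[min_score])
```

Raising the threshold rejects more of the absent objects but can also reject genuine matches with few inliers, so it is worth reporting accuracy for several values.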

In [9]:
# Your code to run extra queries and display results here

***Your explanation of results and any changes made here***

---

<br/>

**6.** Repeat steps 4 and 5 for at least one other set of reference images from museum_paintings or landmarks, and compare the accuracy obtained. Analyse both your overall result and individual image matches to diagnose where problems are occurring, and what you could do to improve performance. Test at least one of your proposed improvements and report its effect on accuracy.


In [10]:
# Your code to search images and display results here

***Your description of what you have done, and explanation of results, here***

---

<br/><br/>

# **Question 3: Fundamental Matrix, Epilines and Retrieval *(Optional: Assessed for up to 25% Bonus Marks for A2)***

In this question, the aim is to accurately estimate the fundamental matrix given two views of a scene, visualise the corresponding epipolar lines, and use the inlier count of the fundamental matrix for retrieval.

**The steps are as follows:**

**1.** Select two images of the same scene *(query and reference)* from the landmark dataset and find the matches as you have done in Question 1 *(1.1-1.4).*

**2.** Compute the fundamental matrix from the good matches (after applying the ratio test) using the OpenCV function `cv2.findFundamentalMat()`. Use both the 8-point algorithm and the RANSAC-assisted 8-point algorithm to compute the fundamental matrix.
- **Hint:** You need a minimum of 8 matches to be able to use the function. Ignore pairs where 8 matches are not found.

**3.** Visualise the epipolar lines for the matched features and analyse the results. You can use the OpenCV function `cv2.computeCorrespondEpilines()` to estimate the epilines. We have provided code for drawing these epilines in the function `drawlines()`, which you can modify as required.

In [11]:
# Your code to find fundamental matrices with and without RANSAC goes here

In [12]:
# Your code to visualise the epipolar lines for both cases for an example goes here

***Your visualization for epilines goes here***

---

<br/>

**4.** Repeat the steps for some other examples from the landmarks dataset.

In [13]:
# Your code for additional images goes here

***Your visualization for additional epilines goes here***

---

<br/>

**5.** Find a query from the landmarks data for which the retrieval in Q2 failed. Attempt the retrieval again, replacing the Homography + RANSAC method of Q2 with the Fundamental Matrix + RANSAC method, using the code written above. Does the change of model make the retrieval successful? Analyse and comment.

In [14]:
# Your code for retrieval goes here

***Your analysis goes here***

---
