Write code to apply non-maximum suppression to a NumPy array called v that you should
treat as a circle, so that the last entry in v is “next to” the 0th.

In [4]:
import numpy as np

v = np.array([8,6,7,8,11,15,18,17,8,6,4,8,5,8,14])

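# non-maximum suppression (we are allowed to use a for-loop here)
def getPrev(v, index):
    # circular predecessor: index 0 wraps around to the last entry
    if index == 0:
        return v[len(v) - 1]
    return v[index - 1]

def getNext(v, index):
    # circular successor: the last index wraps around to entry 0
    if index == len(v) - 1:
        return v[0]
    return v[index + 1]

v2 = np.copy(v)
for i in range(len(v)):
    # suppress any entry that has a strictly larger neighbor
    if getPrev(v, i) > v[i] or getNext(v, i) > v[i]:
        v2[i] = 0
v = v2
print(v)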

[ 0  0  0  0  0  0 18  0  0  0  0  8  0  0 14]
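
For comparison, the same circular suppression can be sketched without an explicit loop by shifting the array with np.roll. This is only an alternative formulation under the same rule as above (an entry is zeroed when either neighbor is strictly larger); the names circular_nms and v_demo are illustrative and not part of the exercise.

```python
import numpy as np

def circular_nms(values):
    """Zero every entry that has a strictly larger circular neighbor."""
    values = np.asarray(values)
    prev_vals = np.roll(values, 1)    # prev_vals[i] == values[i - 1], wrapping around
    next_vals = np.roll(values, -1)   # next_vals[i] == values[(i + 1) % n]
    keep = (values >= prev_vals) & (values >= next_vals)
    return np.where(keep, values, 0)

v_demo = np.array([8, 6, 7, 8, 11, 15, 18, 17, 8, 6, 4, 8, 5, 8, 14])
nms_result = circular_nms(v_demo)
print(nms_result)
```

Because np.roll wraps around the ends, no special case is needed for the first and last entries.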


Then, as a second step, write additional code to eliminate any local maximum in v that is less than 70% of the maximum across the entire array (no for loops allowed).

In [8]:
max_val = np.max(v)
v[v<max_val*0.7] = 0
print(v)

[ 0  0  0  0  0  0 18  0  0  0  0  0  0  0 14]


Suppose you have an M×N image and you have detected a SIFT keypoint at location (x, y),
orientation θ, and scale σ. Now suppose you resize the image to 2M × 2N. Where will there
be a SIFT keypoint in the new image and what will be its orientation and scale?

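Answer: The keypoint will be at (2x, 2y) with scale 2σ and orientation θ. Uniform resizing doubles positions and scales but does not change orientation.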
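
The answer above can be written as a small helper. This is an idealized sketch: it assumes a uniform resize by a factor s with the origin fixed at (0, 0), and it ignores resampling effects and pixel-center conventions. The function name scale_keypoint is hypothetical.

```python
def scale_keypoint(x, y, sigma, theta, s):
    """Map a keypoint (x, y, sigma, theta) into an image resized uniformly by s."""
    # Position and scale grow with the image; orientation is unaffected.
    return s * x, s * y, s * sigma, theta

# Example: resizing M x N to 2M x 2N corresponds to s = 2
print(scale_keypoint(100.0, 40.0, 1.6, 0.5, 2.0))
```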

After SIFT keypoint detection, orientation estimation, and the calculation of the 4 × 4 grid of 8-component orientation histograms, you must convert the 16 histograms to a 128-component descriptor vector. Each of these vectors must be normalized to make it a unit vector. After normalization, any value greater than 0.3 must be truncated to 0.3. (This is an experimentally derived heuristic.) The final step is to normalize again so the vectors are once again unit vectors.

Implement these operations assuming you are starting from a 4-dimensional NumPy array called B, which has dimensions N × 4 × 4 × 8. The final descriptor vectors should be stored in an N × 128 array.

In [None]:
N = 1  # Replace with your desired value for N
B = np.random.rand(N, 4, 4, 8)  # Random values from uniform distribution [0, 1)

# Step 1: Flatten B into shape (N, 128)
descriptors = B.reshape(N, 128)

# Step 2: Normalize each descriptor to unit length (L2 normalization)
norms = np.linalg.norm(descriptors, axis=1, keepdims=True)
descriptors = descriptors / (norms + 1e-10)  # Adding small epsilon to prevent division by zero

# Step 3: Threshold values greater than 0.3
descriptors = np.clip(descriptors, None, 0.3)

# Step 4: Normalize again to make unit vectors
norms = np.linalg.norm(descriptors, axis=1, keepdims=True)
descriptors = descriptors / (norms + 1e-10)

print(descriptors.shape)  # Should print (N, 128)

(1, 128)
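
With uniformly random input, normalized entries rarely reach 0.3, so the truncation step is hard to see. The sketch below wraps the same four steps in a function and feeds it a descriptor with one dominant bin. The names finalize_descriptors and B_demo, and the choice of np.minimum for the truncation, are assumptions made for illustration.

```python
import numpy as np

def finalize_descriptors(B, clip=0.3, eps=1e-10):
    """Flatten (N, 4, 4, 8) histograms to (N, 128), normalize, truncate, renormalize."""
    d = B.reshape(B.shape[0], -1).astype(float)
    d = d / (np.linalg.norm(d, axis=1, keepdims=True) + eps)   # first unit normalization
    d = np.minimum(d, clip)                                    # truncate large entries
    d = d / (np.linalg.norm(d, axis=1, keepdims=True) + eps)   # renormalize
    return d

B_demo = np.ones((2, 4, 4, 8))
B_demo[0, 0, 0, 0] = 100.0   # one dominant bin in the first descriptor only
desc = finalize_descriptors(B_demo)

print(desc.shape)
print(np.linalg.norm(desc, axis=1))   # both rows should have norm close to 1
print(desc.max(axis=1))               # largest entry of each final descriptor
```

Note that the second normalization can push the truncated entry back above 0.3. The truncation limits how much a single bin dominates the descriptor; it does not cap the final values.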


True or False: If you are given two images of the same scene taken by two different cameras
(no lens distortions), there will exist a homography H that accurately maps one image onto
the other, and there will exist a fundamental matrix F that accurately maps points from one
image onto lines that contain their corresponding points in the other image. Justify your
answer.

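False. A single homography H maps one image onto the other only in special cases: when the two cameras share the same center (the motion between them is a pure rotation) or when the scene is planar. If the camera translates while viewing a general 3D scene, the image displacement of each point depends on its depth, so no single H is accurate, although a fundamental matrix F does exist. Conversely, under pure rotation H exists but there is no baseline, so F is not well defined. The two cannot both be guaranteed for an arbitrary pair of images.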
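
The depth dependence can be checked numerically. The sketch below projects two scene points that lie on the same viewing ray of camera 1 into a second camera, once with rotation only and once with rotation plus translation. All numeric values (K, the 5 degree rotation, the translation, the points) are made up for illustration, and the camera model K (R X + t) is an assumption of this example.

```python
import numpy as np

K = np.array([[1000.0, 0.0, 500.0],
              [0.0, 1000.0, 400.0],
              [0.0, 0.0, 1.0]])
a = np.deg2rad(5.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])

def project(K, R, t, X):
    """Project the 3D point X with the camera K [R | t]; returns pixel (x, y)."""
    u = K @ (R @ X + t)
    return u[:2] / u[2]

# Two points on one viewing ray of camera 1: same pixel there, different depths
near = np.array([0.2, 0.1, 2.0])
far = 5.0 * near
no_shift = np.zeros(3)
shift = np.array([0.5, 0.0, 0.0])

same_pixel_cam1 = np.allclose(project(K, np.eye(3), no_shift, near),
                              project(K, np.eye(3), no_shift, far))
rot_gap = np.linalg.norm(project(K, R, no_shift, near) - project(K, R, no_shift, far))
trans_gap = np.linalg.norm(project(K, R, shift, near) - project(K, R, shift, far))

print(same_pixel_cam1)   # the two points coincide in camera 1
print(rot_gap)           # pixel gap in a purely rotated camera 2
print(trans_gap)         # pixel gap in a rotated and translated camera 2
```

With rotation only, the two points still land on one pixel in camera 2, so a point-to-point mapping is possible. With translation they separate, so a single pixel in image 1 corresponds to different pixels in image 2 depending on depth, which no homography can express.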

Suppose camera i, for i = 1, 2, is described by the intrinsic parameter matrix K_i, rotation
matrix R_i and translation vector t_i = (0, 0, 0)^T. This gives the camera matrix
    M_i = (K_i R_i | 0)

Note that in this notation we have dropped the transpose on the rotation matrix. For this
exercise, we’ll assume all pixel locations are written in the form (x, y), where x is across the
image and y goes down.


Write Python code that computes the bounds of the mapping of an image from camera 1
onto camera 2 and of the mapping of an image from camera 2 onto camera 1. Given four
3x3 matrices, K1, R1, K2 and R2, the code should

(a) Compute the homography mapping image 1 onto image 2,
(b) Map the four corners of image 1 onto image 2 using this homography,
(c) Compute the corners of the rectangle (in image 2’s coordinate system) bounding these
points,
(d) Output the upper left and lower right corners for this bounding rectangle, accurate to
the nearest integer, and
(e) Repeat steps (a) through (d), reversing the roles of cameras 1 and 2.

You may assume the corners of the images are (0,0), (6000, 0), (0, 4000), and (6000, 4000).
For simplicity, there should be just four lines of output:
• the x and y coordinates of the upper left corner of the first mapping,
• the x and y coordinates of the lower right corner of the first mapping,
• the x and y coordinates of the upper left corner of the second mapping, and
• the x and y coordinates of the lower right corner of the second mapping.
All output values should be rounded to the nearest integer.

In [None]:
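# Placeholder inputs (replace with the given matrices). Random 3x3 matrices are
# not valid rotations, so this uses shared intrinsics, camera 1 at the identity,
# and camera 2 rotated 10 degrees about the y axis.
K1 = np.array([[5000., 0., 3000.], [0., 5000., 2000.], [0., 0., 1.]])
K2 = K1.copy()
a = np.deg2rad(10)
R1 = np.eye(3)
R2 = np.array([[np.cos(a), 0., np.sin(a)], [0., 1., 0.], [-np.sin(a), 0., np.cos(a)]])

# Image corners as homogeneous columns (the third coordinate must be 1, not 0)
corners = np.array([[0, 6000, 0, 6000],
                    [0, 0, 4000, 4000],
                    [1, 1, 1, 1]], dtype=float)

def bounding_box(Ka, Ra, Kb, Rb):
    # A pixel u_a back-projects to the ray Ra^T Ka^-1 u_a, which camera b
    # images at Kb Rb Ra^T Ka^-1 u_a (both cameras share the same center).
    H = Kb @ Rb @ Ra.T @ np.linalg.inv(Ka)
    mapped = H @ corners
    mapped = mapped[:2] / mapped[2]   # divide by w to get pixel coordinates
    upper_left = np.rint(mapped.min(axis=1)).astype(int)
    lower_right = np.rint(mapped.max(axis=1)).astype(int)
    return upper_left, lower_right

# First mapping: image 1 onto image 2; second mapping: image 2 onto image 1
for args in [(K1, R1, K2, R2), (K2, R2, K1, R1)]:
    upper_left, lower_right = bounding_box(*args)
    print(upper_left[0], upper_left[1])
    print(lower_right[0], lower_right[1])
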
