1.
Write a k-nearest neighbor classifier for the general case of k > 1. Given are
• M data vectors each of length N, stored in an M × N NumPy array called P.
• The class labels for the vectors stored in a length M array of integers called L. In
particular, L[i] == c means the i-th vector in P is from class c.
• A length N query vector called q.
Solve this in the following steps: (a) find the distance from q to each data vector, (b) find the
class labels of the k smallest distances, (c) form a histogram of these labels and find the most
common class label. To keep things simple, if there is a tie for the most commonly occurring
class, you may output any of these classes.

In [2]:
import numpy as np

def generate_knn_data(M=100, N=5, num_classes=3, seed=None):
    if seed is not None:
        np.random.seed(seed)
    
    # Generate M data vectors, each of length N
    P = np.random.randn(M, N)  # Random normal distribution
    
    # Generate M class labels from num_classes
    L = np.random.randint(0, num_classes, size=M)
    
    # Generate a query vector of length N
    q = np.random.randn(N)
    
    return P, L, q

# Generate the inputs to the problem
P, L, q = generate_knn_data(M=10,N=3)
print("P (data vectors):\n", P)
print("L (class labels):\n", L)
print("q (query vector):\n", q)

k=5

# Calculate the distance of every point in P to q
distances = np.sqrt(np.sum((P - q) ** 2, axis=1))

# Find the indices of the k nearest points (kth=k-1 also works when k == M)
k_smallest_indices = np.argpartition(distances, k - 1)[:k]

# Convert indices to class-labels
k_nearest_labels = L[k_smallest_indices]

# Count occurrences of each class label (histogram)
label_counts = np.bincount(k_nearest_labels)

# Find the most common class label
most_common_label = np.argmax(label_counts)

print("Most common class label:", most_common_label)



P (data vectors):
 [[ 0.32503281  0.56042878  0.56521208]
 [-1.51516333  0.46575668 -0.10277681]
 [ 0.50339889 -1.49906747  0.28052312]
 [-2.18605869  0.7299838  -0.76484322]
 [-2.1527459   0.0451252  -1.58575558]
 [-0.19327578 -1.22803545  0.03522589]
 [-1.00707187 -0.18196184  0.01974054]
 [ 0.60108362 -0.19632986  0.10206801]
 [ 0.82090131  0.69888559 -1.32087717]
 [-1.08578787 -0.49248876  1.22220264]]
L (class labels):
 [2 2 0 2 0 1 2 1 0 0]
q (query vector):
 [ 1.22258137 -1.29621628  1.23755377]
Most common class label: 0
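
The three steps can also be wrapped in a reusable function. The following is a minimal sketch, assuming Euclidean distance and non-negative integer labels (which `np.bincount` requires). The name `knn_classify` and the toy data are illustrative and not part of the original problem.

```python
import numpy as np

def knn_classify(P, L, q, k):
    """Return the majority label among the k rows of P nearest to q (hypothetical helper)."""
    # (a) Squared Euclidean distances; skipping the square root keeps the same ordering
    d2 = np.sum((P - q) ** 2, axis=1)
    # (b) Indices of the k smallest distances (their order within the k is arbitrary)
    nearest = np.argpartition(d2, k - 1)[:k]
    # (c) Histogram of the labels; argmax returns the smallest label on a tie
    return int(np.argmax(np.bincount(L[nearest])))

# Toy data: two points near the origin (class 0), three near (5, 5) (class 1)
P_toy = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [4.9, 5.0]])
L_toy = np.array([0, 0, 1, 1, 1])
print(knn_classify(P_toy, L_toy, np.array([5.0, 4.9]), k=3))  # 1
```

Because ties are allowed to resolve to any of the tied classes, relying on `np.argmax` picking the smallest label is enough here.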


2.
The symmetric matching test is an alternative to the ratio test, where two keypoints, i from
image I0 and j from I1, are said to match each other if
• keypoint i’s descriptor is closer to the descriptor for j than any other descriptor found
in image I0, AND
• keypoint j’s descriptor is closer to the descriptor for i than any other descriptor found
in image I1.
Suppose the M0 descriptors for image I0 are stored in an M0 × k NumPy array D0 and the
M1 descriptors for image I1 are stored in an M1 × k NumPy array D1. Write Python code to
find all symmetric matches, outputting the index pairs (i, j), one per line. Do this with as few
for loops as possible.

In [None]:
import cv2
def generate_random_descriptors(M0, M1, k):
    """
    Generate random keypoint descriptors for two images. This isn't a part of the problem, it just generates random input
    
    Parameters:
    M0 (int): Number of keypoints in image I0
    M1 (int): Number of keypoints in image I1
    k (int): Dimensionality of each descriptor
    
    Returns:
    D0 (numpy.ndarray): M0 x k array of descriptors for image I0
    D1 (numpy.ndarray): M1 x k array of descriptors for image I1
    """
    # Uniform random descriptors in [0, 1). float32 is the usual descriptor type for NORM_L2 (e.g. SIFT);
    # casting to uint8 would truncate every value to 0 and make all descriptors identical.
    D0 = np.random.rand(M0, k).astype(np.float32)  # Random descriptors for image I0
    D1 = np.random.rand(M1, k).astype(np.float32)  # Random descriptors for image I1
    
    return D0, D1

# Example usage
M0, M1, k = 100, 120, 128  # Example sizes
D0, D1 = generate_random_descriptors(M0, M1, k)

# crossCheck=True keeps a pair (i, j) only if j is the nearest neighbor of i and
# i is the nearest neighbor of j, which is exactly the symmetric matching test.
bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)  # Use L2 norm (for SIFT/SURF)
# The matcher does the search internally, so the only explicit for loop is the one that prints.

matches = bf.match(D0, D1)

for match in matches:
    print(f"D0 index: {match.queryIdx}, D1 index: {match.trainIdx}, Distance: {match.distance:.4f}")
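
For comparison, the symmetric test can also be written with NumPy alone. This is a minimal sketch, assuming Euclidean distance between descriptors and arrays small enough that the full M0 × M1 distance matrix fits in memory. The helper name `symmetric_matches` and the toy descriptors are illustrative.

```python
import numpy as np

def symmetric_matches(D0, D1):
    """Return an array of (i, j) pairs that are mutual nearest neighbors (hypothetical helper)."""
    D0 = np.asarray(D0, dtype=np.float64)
    D1 = np.asarray(D1, dtype=np.float64)
    # Pairwise squared distances, shape (M0, M1), via broadcasting
    d2 = np.sum((D0[:, None, :] - D1[None, :, :]) ** 2, axis=2)
    nn01 = np.argmin(d2, axis=1)   # for each i in I0, its nearest j in I1
    nn10 = np.argmin(d2, axis=0)   # for each j in I1, its nearest i in I0
    i = np.arange(D0.shape[0])
    mutual = nn10[nn01] == i       # keep i only if its nearest j points back to i
    return np.column_stack((i[mutual], nn01[mutual]))

# Toy descriptors: D0[1] prefers D1[2], but D1[2] prefers D0[0], so that pair is rejected
D0 = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
D1 = np.array([[0.1, 0.0], [5.0, 5.1], [0.3, 0.0]])
for i, j in symmetric_matches(D0, D1):
    print(i, j)
# 0 0
# 2 1
```

The only explicit loop is the one that prints the pairs; the matching itself is two `argmin` calls and one indexed comparison.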


3. Is the ratio test symmetric? In other words, if keypoint i from image I0 matches keypoint j
from image I1 by passing the ratio test, will keypoint j from image I1 match keypoint i from
image I0 according to the ratio test? Justify your answer.

No.
Counterexample: Suppose keypoint i from image I0 matches keypoint j from image I1 by the ratio test, meaning that j is much closer to i than the second-nearest descriptor in I1 is. The test in the other direction compares j against the descriptors in I0, and I0 may contain another keypoint i' that is closer to j than i is, or nearly as close. In that case j either matches i' or fails the ratio test, so it does not match i.
This often happens when I0 has many keypoints and I1 has few. Many keypoints from I0 will all map to the same keypoint in I1. However, that "popular" keypoint in I1 can match at most one keypoint in I0.
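
A small numeric counterexample makes the asymmetry concrete. This is a sketch under stated assumptions: one-dimensional descriptors for readability, and a ratio threshold of 0.8, which is a common choice but is not given in the problem. The function name `ratio_test_match` is hypothetical.

```python
import numpy as np

def ratio_test_match(d, others, threshold=0.8):
    """Return the index in `others` matched to descriptor d, or None if the ratio test fails."""
    dist = np.linalg.norm(others - d, axis=1)
    first, second = np.argsort(dist)[:2]   # nearest and second-nearest descriptors
    return int(first) if dist[first] < threshold * dist[second] else None

D0 = np.array([[0.0], [0.25]])    # descriptors in image I0
D1 = np.array([[0.125], [10.0]])  # descriptors in image I1

print(ratio_test_match(D0[0], D1))  # 0: nearest is 0.125 away, second nearest is 10.0 away
print(ratio_test_match(D1[0], D0))  # None: both descriptors in D0 are 0.125 away
```

Keypoint 0 in I0 matches keypoint 0 in I1, but in the reverse direction the two candidates in I0 are equally close, so the ratio is 1 and the test rejects the match.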

4. Suppose F is the fundamental matrix mapping points (x, y) from image 0 onto “epipolar”
lines in image 1. If (u, v) is any point in image 1, write the distance from (u, v) to the
epipolar line as a function of u, v, x, y and F.

Define $$u_1=(x,y,1)^T$$ and $$u_2=(u,v,1)^T$$
Epipolar line in image 1: $$au+bv+c=0\text{ where }(a,b,c)^T=Fu_1$$
Now use the point-to-line distance formula for a line in standard form, where the numerator is $$|au+bv+c|=|u_2^TFu_1|$$
$$d=\frac{|u_2^TFu_1|}{\sqrt{a^2+b^2}}$$
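
The formula translates directly into NumPy. The following is a minimal sketch; the function name `epipolar_distance` and the example matrix are illustrative assumptions. The matrix used is the skew-symmetric form for a pure horizontal translation, for which the epipolar line of (x, y) is the horizontal line at height y.

```python
import numpy as np

def epipolar_distance(F, x, y, u, v):
    """Distance from (u, v) in image 1 to the epipolar line F @ (x, y, 1)^T."""
    a, b, c = F @ np.array([x, y, 1.0])          # line coefficients (a, b, c)
    return abs(a * u + b * v + c) / np.hypot(a, b)

# Illustrative F: maps (x, y) to the line -v + y = 0
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
print(epipolar_distance(F, 2.0, 3.0, 7.0, 5.0))  # 2.0
```

Here the point (7, 5) lies two units above the line v = 3, which matches the printed distance.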



5. Here is a simple implementation of a fully-connected, feed-forward network. Given are two
Python lists, w and b, each of length L. For each i in the range 0 to L-1, w[i] holds the
two-dimensional NumPy weight matrix for level i of the network and b[i] holds the one-dimensional (NumPy) bias vector.

(a) Write code to check whether or not the dimensions of the weight arrays and bias vectors
are consistent.

(b) If x is an input vector, and sig is the activation function, write code to compute the
output of the network. At the last layer, the activation should be the soft-max, which
you must write yourself.

In [None]:
np.random.seed(42)
w = [np.random.randn(3, 2), np.random.randn(2, 3), np.random.randn(1, 2)]
b = [np.random.randn(3), np.random.randn(2), np.random.randn(1)]

def check_dimensions(w, b, input_dim):
    # Every layer needs exactly one weight matrix and one bias vector
    if len(w) != len(b):
        return False

    prev_dim = input_dim  # Start with the input dimension

    for i in range(len(w)):
        # Check weight input matches output from the previous layer
        if w[i].shape[1] != prev_dim:
            return False

        # Check that every output unit has a bias
        if b[i].shape[0] != w[i].shape[0]:
            return False

        prev_dim = w[i].shape[0]  # Update for the next layer

    return True  # No mismatch was found, so the dimensions are consistent

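def sigmoid(z):
    """Logistic activation, applied elementwise (NumPy has no built-in sigmoid)."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Soft-max of a 1D vector; subtracting the max avoids overflow in exp."""
    e = np.exp(z - np.max(z))
    return e / np.sum(e)

def forward_pass(x, w, b, activation=sigmoid):
    """
    Perform a forward pass through a fully connected neural network.

    Parameters:
    - x: Input numpy array (1D)
    - w: List of NumPy weight matrices
    - b: List of NumPy bias vectors
    - activation: Activation function to use (default: sigmoid)

    Returns:
    - Output of the final layer after the soft-max
    """
    for i in range(len(w)):
        x = np.dot(w[i], x) + b[i]  # Linear transformation
        if i < len(w) - 1:          # Apply activation except for the last layer
            x = activation(x)
    return softmax(x)               # Soft-max replaces the activation at the last layer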