## Instructions

- There are $17$ NAT questions with marks distributed as $1 \times 2 + 16 \times 3 = 50$.
- Answers to all questions are going to be integers.
- Solve each problem in the Colab notebook and enter the answer in the portal.
- Always run the data-cell before running the solution cell.
- In questions where a data-matrix $\mathbf{X}$ and label-vector $\mathbf{y}$ are mentioned, the data-cell will have the corresponding entries as `X` and `y`.
- All other vectors, matrices and scalars necessary for solving the problem will be given in the data-cell. They will have the same name as the ones mentioned in the problem statement.

## Notation

- The data-matrix in all problems will be of shape $d \times n$, where $d$ is the number of features and $n$ is the number of data-points.
- If $\mathbf{x} = (1, 2, 3)$ is a vector, $1, 2$ and $3$ are termed the components of $\mathbf{x}$. The sum of the components of $\mathbf{x}$ is $6$.
- The norm of a vector $\mathbf{x}$ is the Euclidean norm ($L_2$) by default. This is the only norm used in this exam.
- All vectors will be represented as one-dimensional NumPy arrays. All matrices will be represented as two-dimensional NumPy arrays.
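
As a small illustration of these conventions, the sketch below builds a hypothetical $2 \times 3$ data-matrix (the values are made up for illustration) and reads off a data-point, the mean of the data-points, and a norm.

```python
import numpy as np

# Hypothetical data-matrix with d = 2 features and n = 3 data-points
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
d, n = X.shape

x_0 = X[:, 0]              # first data-point: a 1-D array of length d
mu = np.mean(X, axis=1)    # mean of the data-points: average over columns
print(x_0.shape, mu)
print(np.sum(x_0), np.linalg.norm(x_0))   # sum of components, Euclidean norm
```

Each column of `X` is one data-point, so averaging along `axis=1` gives a vector with $d$ components.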

## Useful functions

- `np.linalg.norm` can be used to compute the norm of a vector.
- Use the question mark to get more info about a function. For example, `np.linalg.norm?` will give you more details about this method.
- `round` is a function that can be used to find the nearest integer. For example, `round(1.2) == 1` and `round(1.9) == 2`.
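
One caveat, shown in the sketch below: Python's built-in `round` resolves exact ties (values ending in `.5`) to the nearest even integer. The helper `round_half_up` is a hypothetical name introduced here only for comparison; it is not part of the exam setup.

```python
import math

def round_half_up(v):
    # Hypothetical helper: exact ties always go up
    return math.floor(v + 0.5)

ties = [0.5, 1.5, 2.5]
builtin = [round(v) for v in ties]          # ties go to the even integer
half_up = [round_half_up(v) for v in ties]  # ties go up
print(builtin, half_up)
```

This only matters when a value lies exactly halfway between two integers.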

In [117]:
import numpy as np
import matplotlib.pyplot as plt

## Question-1 [2 marks]

$\mathbf{X}$ is a data-matrix. If $\boldsymbol{\mu}$ is the mean of the data-points, find the norm of $\boldsymbol{\mu}$ and enter the nearest integer as your answer.

In [118]:
# DATA CELL
# DO NOT EDIT THIS
rng = np.random.default_rng(seed = 1001)
d = rng.integers(2, 10)
n = rng.integers(40, 50)
X = rng.integers(-5, 10, (d, n))

In [119]:
# Solution
# Run the data cell before running this
mu = np.mean(X, axis=1)
norm_mu = np.linalg.norm(mu)
print(round(norm_mu))

5


## Question-2 [3 marks]

$\mathbf{X}$ is a data-matrix. Find the number of data-points whose norm is at least $k$.

In [120]:
# DATA CELL
# DO NOT EDIT THIS
rng = np.random.default_rng(seed = 1002)
k = rng.integers(10, 20)
d = rng.integers(5, 10)
n = rng.integers(50, 100)
X = rng.integers(-10, 10, (d, n))

In [121]:
# Solution
# Run the data cell before running this
count = 0
for i in range(X.shape[1]):
  norm_x = np.linalg.norm(X[:, i])
  if norm_x >= k:
    count += 1
print(count)

17
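
The same count can be sketched without an explicit loop by taking column-wise norms. The matrix and threshold below are hypothetical values chosen so that the result is easy to check by hand.

```python
import numpy as np

# Hypothetical 2 x 4 data-matrix and threshold
X = np.array([[3, 0, 1, -6],
              [4, 1, 1, 8]])
k = 5

norms = np.linalg.norm(X, axis=0)   # one norm per column (data-point)
count = int(np.sum(norms >= k))     # data-points with norm at least k
print(norms, count)
```

Here the column norms are $5, 1, \sqrt{2}, 10$, so two data-points meet the threshold.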


## Question-3 [3 marks]

Consider a matrix $\mathbf{A}$ of shape $m \times n$. Find the trace of the matrix $\mathbf{A}^T \mathbf{A}$, where the trace is the sum of the diagonal elements. The diagonal here is the main diagonal (top-left to bottom-right).

In [122]:
# DATA CELL
# DO NOT EDIT THIS
rng = np.random.default_rng(seed = 1003)
m, n = rng.integers(50, 100, 2)
A = rng.integers(-1, 2, (m, n))

In [123]:
# Solution
# Run the data cell before running this
ATA = np.matmul(A.T, A)
trace_ATA = np.trace(ATA)
print(trace_ATA)

2517
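
As a cross-check, the trace of $\mathbf{A}^T \mathbf{A}$ equals the sum of the squares of all entries of $\mathbf{A}$, since the $j$-th diagonal entry is the squared norm of column $j$. The sketch below verifies this on a small hypothetical matrix.

```python
import numpy as np

# Hypothetical 2 x 3 matrix with entries in {-1, 0, 1}
A = np.array([[1, -1, 0],
              [0, 1, 1]])

trace_direct = np.trace(A.T @ A)    # sum of the diagonal of A^T A
trace_squares = np.sum(A ** 2)      # sum of squared entries of A
print(trace_direct, trace_squares)
```

When every entry is $-1$, $0$ or $1$, as in the data-cell above, this sum is simply the number of non-zero entries, which `np.count_nonzero(A)` returns.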


## Question-4 [3 marks]

Consider the curves:

$$
\begin{aligned}
y &= xe^{x}\\\\
y &= \sin(10 \pi x)
\end{aligned}
$$

Find the number of points at which these two curves intersect in the interval $0.15 \leqslant x \leqslant 0.5$.

In [124]:
# Solution
def f(x):
  return x * np.exp(x)

def g(x):
  return np.sin(10 * np.pi * x)

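x_values = np.linspace(0.15, 0.5, 1000)
diff = f(x_values) - g(x_values)

# Testing |f - g| < 1e-6 on a grid misses crossings that fall between
# grid points, so count the sign changes of f - g instead.
count = np.sum(diff[:-1] * diff[1:] < 0)

print(count)


4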
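
An intersection is a root of $h(x) = xe^{x} - \sin(10 \pi x)$, so one way to cross-check the count is to bracket each root by a sign change of $h$ on a grid and refine it by bisection. This is a sketch under that assumption; the name `h` and the grid size are illustrative choices.

```python
import numpy as np

def h(x):
    # Difference of the two curves; its roots are the intersection points
    return x * np.exp(x) - np.sin(10 * np.pi * x)

grid = np.linspace(0.15, 0.5, 1000)
vals = h(grid)

roots = []
for lo, hi, v_lo, v_hi in zip(grid[:-1], grid[1:], vals[:-1], vals[1:]):
    if v_lo * v_hi < 0:              # sign change: a root lies in (lo, hi)
        for _ in range(60):          # bisection refinement
            mid = 0.5 * (lo + hi)
            if h(lo) * h(mid) <= 0:
                hi = mid
            else:
                lo = mid
        roots.append(0.5 * (lo + hi))

print(len(roots), np.round(roots, 4))
```

On this interval the sketch finds four crossings: two between $x = 0.2$ and $x = 0.3$, and two between $x = 0.4$ and $x = 0.5$.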


## Question-5 [3 marks]

Consider the system of equations $\mathbf{Ax} = \mathbf{b}$, where:

$$
\mathbf{A} = \begin{bmatrix}
1 & 2 & -1 & 3\\
0 & 1 & 2 & -1\\
1 & -2 & 3 & 1\\
0 & -1 & -1 & 2
\end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix}
1\\
2\\
-1\\
0
\end{bmatrix}
$$

If $\mathbf{A}$ is invertible, find the solution to this system, and enter the sum of the components of the solution as the answer. Your answer should be an integer.

In [125]:
# Solution

A = np.array([[1, 2, -1, 3],
              [0, 1, 2, -1],
              [1, -2, 3, 1],
              [0, -1, -1, 2]])

b = np.array([1, 2, -1, 0])

x = np.linalg.solve(A, b)
# round guards against floating-point error; int() would truncate toward zero
print(round(np.sum(x)))

0
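
A quick sanity check, as a sketch: substitute the computed solution back into the system and compare it with the solution obtained by hand elimination, $\mathbf{x} = (-3, 1, 1, 1)$.

```python
import numpy as np

A = np.array([[1, 2, -1, 3],
              [0, 1, 2, -1],
              [1, -2, 3, 1],
              [0, -1, -1, 2]])
b = np.array([1, 2, -1, 0])

x = np.linalg.solve(A, b)
residual_ok = np.allclose(A @ x, b)            # does x satisfy Ax = b?
matches_hand = np.allclose(x, [-3, 1, 1, 1])   # hand-derived solution
print(x, residual_ok, matches_hand)
```

The components of $(-3, 1, 1, 1)$ sum to $0$.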


## Question-6 [3 marks]

Let $\mathbf{w}$ be the weight vector of a linear classifier trained on a dataset for a binary classification problem with data-matrix $\mathbf{X}$ and true label vector $\mathbf{y}$. The predicted label for a data-point $\mathbf{x}$ is defined as:

$$
\widehat{y} = \begin{cases}
1,& \mathbf{w}^{T} \mathbf{x} \geqslant 0,\\
0,& \text{otherwise}
\end{cases}
$$

Find the predicted label vector for the given dataset and enter the sum of the components of the predicted label vector as the answer.



In [126]:
# DATA CELL
# DO NOT EDIT THIS
rng = np.random.default_rng(seed = 1006)
d = rng.integers(3, 6)
n = rng.integers(40, 60)
if n % 2 != 0: n += 1
X = rng.uniform(-2, 2, (d, n))
y = np.concatenate(
    (np.ones(n // 2),
    np.zeros(n // 2))
)
w = rng.uniform(-5, 5, d)

In [127]:
# Solution
# Run the data cell before running this

y_pred = []
for i in range(X.shape[1]):
  if np.dot(w, X[:, i]) >= 0:
    y_pred.append(1)
  else:
    y_pred.append(0)

print(sum(y_pred))


19
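
The prediction rule can also be sketched in vectorized form: `w @ X` computes $\mathbf{w}^{T} \mathbf{x}$ for every column at once. The weight vector and data-matrix below are hypothetical integer values, used so that the result can be checked by hand.

```python
import numpy as np

# Hypothetical weight vector and 2 x 4 data-matrix
w = np.array([1, -1])
X = np.array([[1, -2, 0, 3],
              [2, 1, -1, -4]])

scores = w @ X                        # w^T x for each data-point
y_pred = (scores >= 0).astype(int)    # 1 if score >= 0, else 0
print(scores, y_pred, np.sum(y_pred))
```

The scores are $-1, -3, 1, 7$, so the predicted labels are $0, 0, 1, 1$.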


## Common data for questions (7) to (10)

Consider the following dataset for a binary classification problem. The data-matrix $\mathbf{X}$ and the label vector $\mathbf{y}$ are given below.

In [128]:
# DATA CELL
# DO NOT EDIT THIS
X = np.array([
    [2, 2, 2, 3, -2, -2, -5, -3],
    [0, 1, -1, 0, 0, -3, 3, -1]
])
y = np.array(
    [-1, -1, -1, -1, 1, 1, 1, 1]
)

## Question-7 [3 marks]

Train a perceptron algorithm on this dataset. Cycle through the data-points from left to right, that is, $i = 0$ to $i = n - 1$. Make sure to strictly follow this cycle. If you make an update at data-point $i$, proceed to data-point $i + 1$ in the next iteration. Once you reach $n - 1$, cycle back to $0$.


Find the sum of the components of the final weight vector, obtained once a full cycle completes without any update, and enter the nearest integer as your answer.

In [129]:
# Solution
# Run the data cell before running this

w = np.zeros(X.shape[0])
b = 0
epochs = 10

for _ in range(epochs):
  for i in range(X.shape[1]):
    x_i = X[:, i]
    y_i = y[i]

    if y_i * (np.dot(w, x_i) + b) <= 0:
      w = w + y_i * x_i
      b = b + y_i

print(round(np.sum(w)))

-2
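
The cycle described in the question can also be sketched without a bias term and with an explicit convergence check. This assumes a zero initial weight vector and the update $\mathbf{w} \leftarrow \mathbf{w} + y_i \mathbf{x}_i$ whenever $y_i \mathbf{w}^{T} \mathbf{x}_i \leqslant 0$; the function name `perceptron` and the cap `max_epochs` are illustrative choices.

```python
import numpy as np

def perceptron(X, y, max_epochs=100):
    # X has shape d x n; each column is a data-point
    d, n = X.shape
    w = np.zeros(d)
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(n):                        # strict left-to-right cycle
            if y[i] * np.dot(w, X[:, i]) <= 0:    # mistake (or on the boundary)
                w = w + y[i] * X[:, i]
                mistakes += 1
        if mistakes == 0:                         # a full clean cycle: converged
            break
    return w

X = np.array([[2, 2, 2, 3, -2, -2, -5, -3],
              [0, 1, -1, 0, 0, -3, 3, -1]])
y = np.array([-1, -1, -1, -1, 1, 1, 1, 1])

w = perceptron(X, y)
print(w, round(np.sum(w)))
```

On this dataset a single update at the first data-point gives $\mathbf{w} = (-2, 0)$, which classifies every point correctly, so the loop stops after the next clean cycle.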


## Question-8 [3 marks]

In the context of hard-margin SVM, find the optimal $\boldsymbol{\alpha}^{*}$ by solving the following dual optimization problem.

$$
\begin{equation*}
\underset{\boldsymbol{\alpha} \geqslant \mathbf{0}}{\max}\ \ \  \  \boldsymbol{\alpha}^{T}\mathbf{1} -\frac{1}{2}\boldsymbol{\alpha}^{T}\mathbf{Y}^{T}\mathbf{X}^{T}\mathbf{XY} \boldsymbol{\alpha}
\end{equation*}
$$

If the sum of the components of $\boldsymbol{\alpha}^{*}$ is $s$, find $\cfrac{1}{s}$ and enter the nearest integer to $\cfrac{1}{s}$ as the answer.

In [130]:
# Solution
# Run the data cell before running this

Y = np.diag(y)
gram_matrix = np.dot(np.dot(Y, X.T), np.dot(X, Y))

alpha = np.zeros(X.shape[1])

for _ in range(100):
  gradient = np.ones(X.shape[1]) - np.dot(gram_matrix, alpha)
  alpha += 0.01 * gradient
  alpha = np.maximum(0, alpha)

s = np.sum(alpha)

if s != 0:
  print(round(1 / s))
else:
  print(0)

4


## Question-9 [3 marks]

Find the optimal weight vector $\mathbf{w}^{*}$. If the sum of the components of $\mathbf{w}^{*}$ is $s$, find $\cfrac{1}{s}$ and enter the nearest integer to $\cfrac{1}{s}$ as the answer.

In [131]:
# Solution

Y = np.diag(y)
gram_matrix = np.dot(np.dot(Y, X.T), np.dot(X, Y))

alpha = np.zeros(X.shape[1])

for _ in range(100):
  gradient = np.ones(X.shape[1]) - np.dot(gram_matrix, alpha)
  alpha += 0.01 * gradient
  alpha = np.maximum(0, alpha)

w_star = np.dot(X, np.dot(Y, alpha))

s = np.sum(w_star)

if s != 0:
  print(round(1 / s))
else:
  print(0)


-2


## Question-10 [3 marks]

Find the number of support vectors.

In [132]:
# Solution

Y = np.diag(y)
gram_matrix = np.dot(np.dot(Y, X.T), np.dot(X, Y))

alpha = np.zeros(X.shape[1])

for _ in range(100):
  gradient = np.ones(X.shape[1]) - np.dot(gram_matrix, alpha)
  alpha += 0.01 * gradient
  alpha = np.maximum(0, alpha)

num_support_vectors = np.sum(alpha > 1e-5)
print(num_support_vectors)


5
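
The answers to questions (8) to (10) can be cross-checked from the primal side. The constraint $y_i \mathbf{w}^{T} \mathbf{x}_i \geqslant 1$ for the data-point $(2, 0)$ forces $w_1 \leqslant -\frac{1}{2}$, and $\mathbf{w} = (-\frac{1}{2}, 0)$ satisfies every constraint, so it is the minimum-norm solution. By strong duality, $s - \frac{1}{2}\|\mathbf{w}^{*}\|^2 = \frac{1}{2}\|\mathbf{w}^{*}\|^2$, which gives $s = \|\mathbf{w}^{*}\|^2$. The sketch below checks this hand-derived candidate; it is a verification aid, not a replacement for solving the dual.

```python
import numpy as np

X = np.array([[2, 2, 2, 3, -2, -2, -5, -3],
              [0, 1, -1, 0, 0, -3, 3, -1]])
y = np.array([-1, -1, -1, -1, 1, 1, 1, 1])

w = np.array([-0.5, 0.0])            # hand-derived candidate for w*
margins = y * (w @ X)                # y_i * w^T x_i for every data-point

feasible = bool(np.all(margins >= 1))                # all constraints hold
on_margin = int(np.sum(np.isclose(margins, 1.0)))    # points with margin exactly 1
s = float(w @ w)                                     # sum of alpha* via strong duality
print(feasible, on_margin, round(1 / s), round(1 / np.sum(w)))
```

Only the points on the margin can have $\alpha_i^{*} > 0$, so the number of support vectors is at most the value of `on_margin`.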


## Common data for questions (11) to (13)

Consider a data-matrix $\mathbf{X}$. Mean center it and perform linear PCA.

In [133]:
# DATA CELL
# DO NOT EDIT THIS
X = np.array([
    [7, 5, -6, -11, 14.4, -2.8],
    [10.5, 7.5, -9, -16.5, 21.6, -4.2],
    [21, 15, -18, -33, 43.2, -8.4]
]).astype(np.float64)

In [134]:
# Solution
# Run the data cell before running this

X_mean = np.mean(X, axis=1, keepdims=True)
X_centered = X - X_mean

# np.cov centers each row itself and divides by n - 1 by default
covariance_matrix = np.cov(X_centered)

eigenvalues, eigenvectors = np.linalg.eig(covariance_matrix)

sorted_indices = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[sorted_indices]
eigenvectors = eigenvectors[:, sorted_indices]

print("Eigenvalues:", eigenvalues)
print("Eigenvectors:", eigenvectors)

Eigenvalues: [ 1.07540300e+03  2.42859249e-14 -2.42859249e-14]
Eigenvectors: [[-0.28571429 -0.57384592  0.57384592]
 [-0.42857143 -0.63982781 -0.79285339]
 [-0.85714286  0.51119588  0.20514472]]


## Question-11 [3 marks]

If the first PC is $\mathbf{w}_1$, let the sum of the components of $\mathbf{w}_1$ be $a$. Find the absolute value of $7a$ and enter the nearest integer as your answer.

In [135]:
# Solution

X_mean = np.mean(X, axis=1, keepdims=True)
X_centered = X - X_mean

covariance_matrix = np.cov(X_centered)

eigenvalues, eigenvectors = np.linalg.eig(covariance_matrix)

sorted_indices = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[sorted_indices]
eigenvectors = eigenvectors[:, sorted_indices]

w1 = eigenvectors[:, 0]
a = np.sum(w1)
print(round(abs(7 * a)))

11


## Question-12 [3 marks]

Find the variance along the first PC. Enter the nearest integer as your answer.

In [136]:
# Solution

X_mean = np.mean(X, axis=1, keepdims=True)
X_centered = X - X_mean

covariance_matrix = np.cov(X_centered)

eigenvalues, eigenvectors = np.linalg.eig(covariance_matrix)

sorted_indices = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[sorted_indices]
eigenvectors = eigenvectors[:, sorted_indices]

variance_along_first_pc = eigenvalues[0]

print(round(variance_along_first_pc))

1075
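
The value above depends on how the covariance matrix is normalized: `np.cov` divides by $n - 1$ by default, while passing `bias=True` divides by $n$. The sketch below computes the largest eigenvalue under both conventions; which one the question intends is not stated here, so treat the comparison as a check rather than a correction.

```python
import numpy as np

X = np.array([
    [7, 5, -6, -11, 14.4, -2.8],
    [10.5, 7.5, -9, -16.5, 21.6, -4.2],
    [21, 15, -18, -33, 43.2, -8.4]
]).astype(np.float64)

# eigvalsh is meant for symmetric matrices and returns real eigenvalues
var_unbiased = np.linalg.eigvalsh(np.cov(X)).max()            # divides by n - 1
var_biased = np.linalg.eigvalsh(np.cov(X, bias=True)).max()   # divides by n
print(round(var_unbiased), round(var_biased))
```

With $n = 6$ data-points, the two values differ by a factor of $6/5$.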


## Question-13 [3 marks]

Find the sum of the variances along the second and third PC. Enter the nearest integer as your answer.

In [137]:
# Solution

X_mean = np.mean(X, axis=1, keepdims=True)
X_centered = X - X_mean

covariance_matrix = np.cov(X_centered)

eigenvalues, eigenvectors = np.linalg.eig(covariance_matrix)

sorted_indices = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[sorted_indices]
eigenvectors = eigenvectors[:, sorted_indices]

variance_along_second_pc = eigenvalues[1]
variance_along_third_pc = eigenvalues[2]

print(round(variance_along_second_pc + variance_along_third_pc))

0


## Common data for questions (14) to (16)

Consider a dataset $\mathbf{X}$ for a clustering problem. Start with the initial means as $\boldsymbol{\mu}_1 = (2, 3, 5)$ and $\boldsymbol{\mu}_2 = (-3, -5, -7)$ and run K-means with $k = 2$.

**Note**: Set `mu_1` and `mu_2` as `np.float64` arrays. You can set `dtype = np.float64` while creating the array. Use `np.array?` if you are still unsure about this. This is important for your final answer to match the ones we have configured.


In [138]:
# DATA CELL
# DO NOT EDIT THIS
X = np.array([
    [-8.2, 3.2, -5.1, 4., -7, 8.1, -2],
    [-10.1, 4.1, -3.3, 2., -5, 6.2, -5],
    [-3.4, 4.9, -3, 2., -5, 6.5, -5.3]
])


## Question-14 [3 marks]

Find the norm of the final mean $\boldsymbol{\mu}_1$. Enter the nearest integer as your answer.



In [140]:
# Solution

mu_1 = np.array([2, 3, 5], dtype=np.float64)
mu_2 = np.array([-3, -5, -7], dtype=np.float64)

max_iterations = 100
for _ in range(max_iterations):
    cluster_1_indices = []
    cluster_2_indices = []
    for i in range(X.shape[1]):
        x_i = X[:, i]
        dist_to_mu_1 = np.linalg.norm(x_i - mu_1)
        dist_to_mu_2 = np.linalg.norm(x_i - mu_2)
        if dist_to_mu_1 <= dist_to_mu_2:
            cluster_1_indices.append(i)
        else:
            cluster_2_indices.append(i)

    new_mu_1 = np.mean(X[:, cluster_1_indices], axis=1) if cluster_1_indices else mu_1
    new_mu_2 = np.mean(X[:, cluster_2_indices], axis=1) if cluster_2_indices else mu_2

    if np.array_equal(mu_1, new_mu_1) and np.array_equal(mu_2, new_mu_2):
        break

    mu_1 = new_mu_1
    mu_2 = new_mu_2


print(round(np.linalg.norm(mu_1)))

8


## Question-15 [3 marks]

Find the norm of the final mean $\boldsymbol{\mu}_2$. Enter the nearest integer as your answer.



In [141]:
# Solution

print(round(np.linalg.norm(mu_2)))


9


## Question-16 [3 marks]

Find the cluster to which the data-point $(0, 1, 2)$ belongs. Enter $1$ if it is closer to $\boldsymbol{\mu}_1$ than $\boldsymbol{\mu}_2$ and $2$ otherwise.

In [142]:
# Solution

x_i = np.array([0, 1, 2])
dist_to_mu_1 = np.linalg.norm(x_i - mu_1)
dist_to_mu_2 = np.linalg.norm(x_i - mu_2)

if dist_to_mu_1 <= dist_to_mu_2:
  print(1)
else:
  print(2)

1


## Question-17 [3 marks]

Fit a linear regression model on the dataset $(\mathbf{X}, \mathbf{y})$ and find the optimal weight vector $\mathbf{w}^{*}$ using the normal equations. Enter the nearest integer to the sum of the components of $\mathbf{w}^{*}$ as the answer.

In [143]:
# DATA CELL
# DO NOT EDIT THIS
X = np.array([
    [1., -1., 3., -2., 1.],
    [0., -2., 1., 0., 3.],
    [-2., 1., 0., 1., 2.],
    [1., -2., 3., -1., 4.]
])
y = np.array(
    [3.1, 4.9, -2.5, 10.3, -4.2]
)

In [144]:
# Solution
# Run the data cell before running this

# X is d x n, so the normal equations read (X X^T) w = X y.
# X is left unmodified so that this cell can be re-run safely.
w_star = np.linalg.solve(X @ X.T, X @ y)
print(round(np.sum(w_star)))

-9
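
As a cross-check on the normal equations, `np.linalg.lstsq` solves the same least-squares problem directly. It expects the data-points as rows, so the sketch below passes `X.T`; the two solutions should agree whenever $\mathbf{X}\mathbf{X}^{T}$ is invertible, which holds here because the rows of $\mathbf{X}$ are linearly independent.

```python
import numpy as np

X = np.array([
    [1., -1., 3., -2., 1.],
    [0., -2., 1., 0., 3.],
    [-2., 1., 0., 1., 2.],
    [1., -2., 3., -1., 4.]
])
y = np.array([3.1, 4.9, -2.5, 10.3, -4.2])

# Normal equations with X of shape d x n: (X X^T) w = X y
w_normal = np.linalg.solve(X @ X.T, X @ y)

# Least squares on the n x d matrix X^T; rcond=None uses the default cutoff
w_lstsq = np.linalg.lstsq(X.T, y, rcond=None)[0]

print(np.allclose(w_normal, w_lstsq))
```

Solving the linear system with `np.linalg.solve` avoids forming an explicit inverse.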
