Q1. How many multiplications and additions do you need to perform a matrix multiplication between a (n, k) and (k, m) matrix? Explain.
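
Each of the n*m entries of the product is a dot product of length k, which takes k multiplications and k - 1 additions. The totals are therefore n*m*k multiplications and n*m*(k - 1) additions. The sketch below checks these counts on a small case; the helper name `count_matmul_ops` and the all-ones test matrices are illustrative assumptions, not part of the original notebook.

```python
def count_matmul_ops(n, k, m):
    """Multiply an (n, k) by a (k, m) matrix of ones, counting scalar operations."""
    a = [[1] * k for _ in range(n)]
    b = [[1] * m for _ in range(k)]
    multiplications = 0
    additions = 0
    result = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # Start from the first product so that only k - 1 additions are needed.
            total = a[i][0] * b[0][j]
            multiplications += 1
            for p in range(1, k):
                total += a[i][p] * b[p][j]
                multiplications += 1
                additions += 1
            result[i][j] = total
    return result, multiplications, additions


n, k, m = 2, 3, 4
result, multiplications, additions = count_matmul_ops(n, k, m)
print(multiplications, n * m * k)       # measured vs. formula
print(additions, n * m * (k - 1))       # measured vs. formula
```

If the accumulator starts at 0 instead, as in the loop used for Q2, each entry costs k additions, so the addition count becomes n*m*k.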

Q2. Write Python code to multiply the above two matrices. Solve using list of lists and then use numpy. Compare the timing of both solutions. Which one is faster? Why?

In [21]:
import numpy as np
import time

# NumPy is much faster here: np.dot runs as compiled C loops over contiguous, typed arrays,
# while the list-of-lists version executes every multiply-add in the Python interpreter.
n, m, k = 100, 100, 1000

# Using List of list
matnk = [[j for j in range(k)] for _ in range(n)]
matkm = [[i for _ in range(m)] for i in range(k)]

start = time.time()

matnm = [[0 for _ in range(len(matkm[0]))] for _ in range(len(matnk))]
for i in range(len(matnk)):
    for j in range(len(matkm[0])):
        # Use p for the shared dimension so the size k defined above is not overwritten.
        for p in range(len(matkm)):
            matnm[i][j] += matnk[i][p] * matkm[p][j]
# print(matnm)

end = time.time()
print("Time taken by using list of list:", end - start)

# Using numpy library
matnk = np.array(matnk)
matkm = np.array(matkm)

start = time.time()

matnm = np.dot(matnk, matkm)
# print(matnm)

end = time.time()
print("\nTime taken by using numpy:", end - start)

Time taken by using list of list: 3.127255439758301

Time taken by using numpy: 0.012060403823852539


Q3. Finding the highest element in a list requires one pass of the array. Finding the second highest element requires 2 passes of the array. Using this method, what is the time complexity of finding the median of the array? Can you suggest a better method? Can you implement both these methods in Python and compare against numpy.median routine in terms of time?

In [25]:
import numpy as np
import time

arr = np.random.uniform(1, 10, size=(1000,))

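# Using the above median method: remove the current maximum n//2 + 1 times.
# Each removal is one O(n) pass, so the whole method is O(n^2).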
start = time.time()

n = len(arr)
temp = arr
for i in range(n//2 + 1):
    ind = 0
    for j in range(1, len(temp)):
        if temp[j] > temp[ind]:
            ind = j
    if i == n//2:
        median1 = temp[ind]
    if i == n//2-1:
        median2 = temp[ind]
    temp = np.delete(temp, ind)

# print(f"Median Method\nArray:\n{np.sort(arr)}\n")
if n%2:
    print("The median of the above array is", median1)
else:
    print("The median of the above array is", (median1+median2)/2)

end = time.time()
print("Time taken by above median method:", end - start, end="\n\n")

# Using numpy library
start = time.time()

# print(f"Numpy Method\nArray:\n{np.sort(arr)}\n")
print("The median of the above array is", np.median(arr))

end = time.time()
print("Time taken by using numpy:", end - start)

The median of the above array is 5.474805043857562
Time taken by above median method: 0.11643815040588379

The median of the above array is 5.474805043857562
Time taken by using numpy: 0.0
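
The repeated-pass method needs about n/2 passes of O(n) work each, so it is O(n^2). Sorting first and reading off the middle element(s) is O(n log n), and selection algorithms such as quickselect reach O(n) on average. The sketch below compares the two hand-written approaches with `np.median`, using `timeit` because a single `time.time()` difference can round to 0.0 for very fast calls. The function names and the fixed seed are assumptions made for this illustration.

```python
import timeit

import numpy as np


def median_by_passes(values):
    """Remove the current maximum n//2 + 1 times: O(n^2) comparisons overall."""
    remaining = list(values)
    n = len(remaining)
    removed = []
    for _ in range(n // 2 + 1):
        largest = max(range(len(remaining)), key=remaining.__getitem__)
        removed.append(remaining.pop(largest))
    # removed[-1] is the (n//2 + 1)-th largest, removed[-2] the (n//2)-th largest.
    return removed[-1] if n % 2 else (removed[-1] + removed[-2]) / 2


def median_by_sorting(values):
    """Sort once and read the middle element(s): O(n log n)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    return ordered[mid] if len(ordered) % 2 else (ordered[mid - 1] + ordered[mid]) / 2


rng = np.random.default_rng(0)
arr = rng.uniform(1, 10, size=1000)

methods = [("passes", median_by_passes), ("sorting", median_by_sorting), ("numpy", np.median)]
for name, fn in methods:
    # Average over a few runs to smooth out timer noise.
    seconds = timeit.timeit(lambda: fn(arr), number=5) / 5
    print(f"{name:8s} median = {fn(arr):.6f}, mean time = {seconds:.6f} s")
```

All three methods should report the same median; only the timings differ.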


Q4. What is the gradient of the following function with respect to x and y? (x^2)y + (y^3)sin(x)
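
Differentiating term by term, with y held constant for the x-derivative and x held constant for the y-derivative, gives the gradient below. The symbol f for the function is a naming assumption; the question does not name it.

```latex
f(x, y) = x^{2} y + y^{3} \sin(x)

\frac{\partial f}{\partial x} = 2xy + y^{3} \cos(x),
\qquad
\frac{\partial f}{\partial y} = x^{2} + 3y^{2} \sin(x)

\nabla f(x, y) = \left( 2xy + y^{3} \cos(x),\; x^{2} + 3y^{2} \sin(x) \right)
```

Note the plus sign in the x-derivative: the derivative of sin(x) is +cos(x).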

Q5. Use JAX to confirm the gradient evaluated by your method matches the analytical solution corresponding to a few random values of x and y

In [3]:
import numpy as np
import jax
import jax.numpy as jnp

n = 3
def func(x, y):
    return (x**2)*y + (y**3)*jnp.sin(x)

# Analytical functions
def df_dx(x, y):
    # d/dx of (y**3)*sin(x) is +(y**3)*cos(x)
    return 2*x*y + (y**3)*jnp.cos(x)

def df_dy(x, y):
    return x**2 + 3*(y**2)*jnp.sin(x)

# JAX function
df = jax.grad(func, argnums=(0,1))

x_vals = np.random.uniform(0, 5, n)
y_vals = np.random.uniform(0, 5, n)

for i in range(n):
    gradx, grady = df(x_vals[i], y_vals[i])

    print(f"\nRandom Values {i + 1}: x = {x_vals[i]}, y = {y_vals[i]}")
    print("Analytical Gradient:")
    print("df/dx =", df_dx(x_vals[i], y_vals[i]))
    print("df/dy =", df_dy(x_vals[i], y_vals[i]))

    print("\nJAX Gradient:")
    print("df/dx =", gradx)
    print("df/dy =", grady)


Random Values 1: x = 1.5556410401862202, y = 0.7330552645789751
Analytical Gradient:
df/dx = 2.2867117
df/dy = 4.0319443

JAX Gradient:
df/dx = 2.2867117
df/dy = 4.0319443

Random Values 2: x = 0.33285962077933884, y = 0.41202863649871246
Analytical Gradient:
df/dx = 0.34040517
df/dy = 0.2772087

JAX Gradient:
df/dx = 0.34040517
df/dy = 0.27720872

Random Values 3: x = 2.417274060528402, y = 4.389526839088069
Analytical Gradient:
df/dx = -42.12278
df/dy = 44.1455

JAX Gradient:
df/dx = -42.12278
df/dy = 44.1455


Q6. Use sympy to confirm that you obtain the same gradient analytically.

In [5]:
import sympy as sp

x, y = sp.symbols('x y')
f = (x**2)*y + (y**3)*sp.sin(x)

# Differentiate symbolically instead of typing the derivatives in by hand.
df_dx = sp.diff(f, x)  # mathematically 2*x*y + y**3*cos(x)
df_dy = sp.diff(f, y)  # mathematically x**2 + 3*y**2*sin(x)

for i in range(n):
    gradx = df_dx.subs({x: x_vals[i], y: y_vals[i]})
    grady = df_dy.subs({x: x_vals[i], y: y_vals[i]})

    print(f"\nRandom Values {i + 1}: x = {x_vals[i]}, y = {y_vals[i]}")
    print("Sympy Analytical Gradient:")
    print("df/dx =", gradx)
    print("df/dy =", grady)


Random Values 1: x = 1.5556410401862202, y = 0.7330552645789751
Sympy Analytical Gradient:
df/dx = 2.28671147971808
df/dy = 4.03194397533027

Random Values 2: x = 0.33285962077933884, y = 0.41202863649871246
Sympy Analytical Gradient:
df/dx = 0.340405124750685
df/dy = 0.277208697757602

Random Values 3: x = 2.417274060528402, y = 4.389526839088069
Sympy Analytical Gradient:
df/dx = -42.1227835511266
df/dy = 44.1454965829047


Q7. Create a Python nested dictionary to represent hierarchical information. We want to store records of students and their marks. Something like:

1. 2022
    1. Branch 1
        1. Roll Number: 1, Name: N, Marks:
            1. Maths: 100, English: 70 …
    2. Branch 2
2. 2023
    1. Branch 1
    2. Branch 2
3. 2024
    1. Branch 1
    2. Branch 2
4. 2025
    1. Branch 1
    2. Branch 2

In [None]:
myDict = {
    2022: {
        "Branch 1": {
            1: {"Name": "N", "Marks": {"Maths": 100, "English": 70}},
        },
        "Branch 2": {},
    },
    2023: {
        "Branch 1": {},
        "Branch 2": {},
    },
    2024: {
        "Branch 1": {},
        "Branch 2": {},
    },
    2025: {
        "Branch 1": {},
        "Branch 2": {},
    },
}
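
As a rough illustration of how such a nested dictionary is read and extended, the sketch below walks down the levels year, branch, roll number, field. The name `records`, the helper `add_student`, and the second student are hypothetical and only serve the example.

```python
records = {
    2022: {
        "Branch 1": {
            1: {"Name": "N", "Marks": {"Maths": 100, "English": 70}},
        },
        "Branch 2": {},
    },
}

# Walk down the hierarchy: year -> branch -> roll number -> field.
maths = records[2022]["Branch 1"][1]["Marks"]["Maths"]
print(maths)


def add_student(db, year, branch, roll_no, name, marks):
    """Hypothetical helper: insert a student, creating missing levels on the way."""
    db.setdefault(year, {}).setdefault(branch, {})[roll_no] = {"Name": name, "Marks": marks}


add_student(records, 2023, "Branch 2", 7, "M", {"Maths": 88})
print(records[2023]["Branch 2"][7]["Name"])
```

Keying students by roll number makes lookups direct, at the cost of having to know the full path of keys.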

Q8. Store the same information using Python classes. We have an overall database which is a list of year objects. Each year contains a list of branches. Each branch contains a list of students. Each student has some properties like name, roll number and has marks in some subjects.

In [None]:
class Student:
    def __init__(self, roll_no, name, marks):
        self.roll_no = roll_no
        self.name = name
        self.marks = marks

class Branch:
    def __init__(self, name):
        self.name = name
        self.students = []

    def add_student(self, student):
        self.students.append(student)
    
    def remove_student(self, student):
        self.students.remove(student)

class Year:
    def __init__(self, year):
        self.year = year
        self.branches = []

    def add_branch(self, branch):
        self.branches.append(branch)

# Creating instances and adding data
student1 = Student(1, "N", {"Maths": 100, "English": 70})
branch1 = Branch("Branch 1")
branch1.add_student(student1)
year2022 = Year(2022)
year2022.add_branch(branch1)

# The overall database is a list of Year objects.
database = [year2022]

print(database[0].branches[0].students[0].roll_no)
print(database[0].branches[0].students[0].name)
print(database[0].branches[0].students[0].marks)

Q9. Using matplotlib plot the following functions on the domain: x = 0.5 to 100.0 in steps of 0.5.

1. y = x
2. y = x^2
3. y = (x^3)/100
4. y = sin(x)
5. y = sin(x)/x
6. y = log(x)
7. y = e^x

In [None]:
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0.5, 100.5, 0.5)  # 0.5, 1.0, ..., 100.0

# Each entry pairs a plot title with the function values on the domain.
functions = [
    ("y = x", x),
    ("y = x^2", x**2),
    ("y = (x^3)/100", (x**3) / 100),
    ("y = sin(x)", np.sin(x)),
    ("y = sin(x)/x", np.sin(x) / x),
    ("y = log(x)", np.log(x)),
    ("y = e^x", np.exp(x)),
]

# Draw one figure per function.
for title, y in functions:
    plt.figure(figsize=(8, 5))
    plt.plot(x, y)
    plt.title(title)
    plt.xlabel("x")
    plt.ylabel("y")
    plt.show()

Q10. Using numpy generate a matrix of size 20X5 containing random numbers drawn uniformly from the range of 1 to 2. Using Pandas create a dataframe out of this matrix. Name the columns of the dataframe as “a”, “b”, “c”, “d”, “e”. Find the column with the highest standard deviation. Find the row with the lowest mean.

In [17]:
import numpy as np
import pandas as pd

matrix = np.random.uniform(1, 2, size=(20, 5))
dataframe = pd.DataFrame(matrix, columns=["a", "b", "c", "d", "e"])
print(dataframe)

idx_std = dataframe.std().idxmax()
idx_mean = dataframe.mean(axis=1).idxmin()

print("\nColumn with the highest standard deviation:", idx_std)
print("\nRow with the lowest mean:", idx_mean)

           a         b         c         d         e
0   1.019019  1.238241  1.191206  1.818753  1.931990
1   1.421441  1.317777  1.682253  1.466280  1.387101
2   1.956290  1.631439  1.792911  1.492623  1.435476
3   1.275532  1.100105  1.966064  1.283422  1.149253
4   1.973688  1.145280  1.050439  1.838929  1.966177
5   1.704366  1.177241  1.343906  1.117690  1.638012
6   1.289707  1.993410  1.909164  1.130729  1.062088
7   1.415646  1.464829  1.957614  1.012360  1.981265
8   1.140199  1.669072  1.524210  1.882195  1.691139
9   1.661399  1.486167  1.337824  1.142010  1.007460
10  1.035255  1.450068  1.473599  1.791800  1.217783
11  1.354709  1.366573  1.969293  1.222725  1.556957
12  1.320973  1.449741  1.306355  1.785644  1.718468
13  1.251078  1.407828  1.873709  1.754001  1.220091
14  1.469908  1.599921  1.238640  1.883326  1.354486
15  1.628218  1.544638  1.438643  1.051106  1.097842
16  1.215149  1.039563  1.281455  1.692531  1.117518
17  1.922620  1.885746  1.304662  1.634517  1.

Q11. Add a new column to the dataframe called “f” which is the sum of the columns “a”, “b”, “c”, “d”, “e”. Create another column called “g”. The value in the column “g” should be “LT8” if the value in the column “f” is less than 8 and “GT8” otherwise. Find the number of rows in the dataframe where the value in the column “g” is “LT8”. Find the standard deviation of the column “f” for the rows where the value in the column “g” is “LT8” and “GT8” respectively.

In [18]:
dataframe["f"] = dataframe[["a", "b", "c", "d", "e"]].sum(axis=1)
print(dataframe, end="\n\n")

dataframe["g"] = np.where(dataframe["f"] < 8, "LT8", "GT8")
print(dataframe, end="\n\n")

lt8_rows = (dataframe["g"] == "LT8").sum()
print("Number of rows where 'g' is 'LT8':", lt8_rows, end="\n\n")

std_lt8 = dataframe.loc[dataframe["g"] == "LT8", "f"].std()
std_gt8 = dataframe.loc[dataframe["g"] == "GT8", "f"].std()

print("Standard deviation of 'f' for 'LT8':", std_lt8, end="\n\n")
print("Standard deviation of 'f' for 'GT8':", std_gt8)

           a         b         c         d         e         f
0   1.019019  1.238241  1.191206  1.818753  1.931990  7.199209
1   1.421441  1.317777  1.682253  1.466280  1.387101  7.274852
2   1.956290  1.631439  1.792911  1.492623  1.435476  8.308738
3   1.275532  1.100105  1.966064  1.283422  1.149253  6.774376
4   1.973688  1.145280  1.050439  1.838929  1.966177  7.974513
5   1.704366  1.177241  1.343906  1.117690  1.638012  6.981215
6   1.289707  1.993410  1.909164  1.130729  1.062088  7.385100
7   1.415646  1.464829  1.957614  1.012360  1.981265  7.831714
8   1.140199  1.669072  1.524210  1.882195  1.691139  7.906814
9   1.661399  1.486167  1.337824  1.142010  1.007460  6.634860
10  1.035255  1.450068  1.473599  1.791800  1.217783  6.968505
11  1.354709  1.366573  1.969293  1.222725  1.556957  7.470258
12  1.320973  1.449741  1.306355  1.785644  1.718468  7.581180
13  1.251078  1.407828  1.873709  1.754001  1.220091  7.506706
14  1.469908  1.599921  1.238640  1.883326  1.354486  7

Q12. Write a small piece of code to explain broadcasting in numpy.

In [21]:
import numpy as np

matrix = np.random.uniform(0, 10, size=(3, 3))
print(f"Matrix before broadcasting:\n{matrix}\n")

matrix = matrix + 10
print(f"Matrix after broadcasting:\n{matrix}")

# In this example, the scalar 10 is broadcast to every element of the (3, 3) array `matrix`.
# NumPy does this without explicitly creating a (3, 3) array filled with 10.

Matrix before broadcasting:
[[5.61841807 0.380611   5.62350534]
 [9.4520257  8.74942013 2.14954005]
 [3.17236977 8.00546067 3.51965512]]

Matrix after broadcasting:
[[15.61841807 10.380611   15.62350534]
 [19.4520257  18.74942013 12.14954005]
 [13.17236977 18.00546067 13.51965512]]
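
Broadcasting also applies between arrays of different shapes, not only between an array and a scalar. The sketch below, with illustrative values chosen for this example, adds a (3, 1) column to a (3,) row: NumPy aligns the shapes from the right and virtually stretches every size-1 axis, giving a (3, 3) result.

```python
import numpy as np

column = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])              # shape (3,), treated as (1, 3)

# The size-1 axes are stretched virtually, giving a (3, 3) result.
table = column + row
print(table.shape)
print(table)

# The same result with the stretching written out explicitly.
explicit = np.tile(column, (1, 3)) + np.tile(row, (3, 1))
print(np.array_equal(table, explicit))
```

Two shapes are compatible when, compared from the last axis backwards, each pair of sizes is equal or one of them is 1.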


Q13. Write a function to compute the argmin of a numpy array. The function should take a numpy array as input and return the index of the minimum element. You can use the np.argmin function to verify your solution.

In [32]:
import numpy as np

matrix = np.random.uniform(0, 10, size=(3, 3))
print(f"Matrix:\n{matrix}\n")

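def compute_argmin(arr):
    # Flatten first so the result indexes the flattened array, like np.argmin.
    arr = arr.ravel()
    min_index = 0  # avoid shadowing the built-in min()
    for ind, value in enumerate(arr):
        if value < arr[min_index]:
            min_index = ind

    return min_index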

argmin = compute_argmin(matrix)
print(f"Index of the minimum element: {argmin}\n")
print(f"Index of the minimum element: {np.argmin(matrix)}")

Matrix:
[[6.84384743 8.50832342 5.85092281]
 [1.29343371 0.81270386 5.16392949]
 [5.50503014 4.18527718 9.5772453 ]]

Index of the minimum element: 4

Index of the minimum element: 4
