## Subtask 1

### Why NumPy is better than Python lists:

Python loops are slow for computing similarities between thousands of users and products in a recommendation system: the interpreter handles each element one at a time, whereas NumPy applies the operation to whole arrays in optimized compiled code.
Loops also invite indexing mistakes and make the code harder to read, whereas NumPy's broadcasting feature combines arrays of compatible shapes without explicit loops.
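
As a rough illustration of the point above, the sketch below scores one user vector against many product vectors, first with an explicit Python loop and then with a single matrix-vector product. The names `user` and `products`, the array sizes, and the random data are assumptions made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)        # fixed seed; illustrative data only
user = rng.random(8)                  # hypothetical user feature vector
products = rng.random((1000, 8))      # hypothetical product feature matrix

# Loop version: one Python-level dot product per product row
scores_loop = [sum(u * p for u, p in zip(user, row)) for row in products]

# Vectorized version: a single matrix-vector product
scores_vec = products @ user

# Both approaches should agree up to floating-point rounding
print(np.allclose(scores_loop, scores_vec))
```

The vectorized line also removes the index bookkeeping, which is where loop-based versions tend to go wrong.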


## Subtask 2
### Dot Product and Cosine Similarity




In [14]:
import numpy as np
print("NumPy imported successfully.")

NumPy imported successfully.


In [15]:
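def calculate_dot_product(vec1, vec2):
    """Return the dot product of two 1-D vectors."""
    return np.dot(vec1, vec2)

def calculate_cosine_similarity(vec1, vec2):
    """Return the cosine of the angle between vec1 and vec2."""
    dot_product = calculate_dot_product(vec1, vec2)
    norm_vec1 = np.linalg.norm(vec1)  # Euclidean (L2) length
    norm_vec2 = np.linalg.norm(vec2)

    # Cosine similarity is undefined for a zero vector;
    # return 0.0 by convention to avoid dividing by zero.
    if norm_vec1 == 0 or norm_vec2 == 0:
        return 0.0

    return dot_product / (norm_vec1 * norm_vec2)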

vec_a = np.array([1, 2, 3])
vec_b = np.array([4, 5, 6])
vec_c = np.array([0, 0, 0])

dot_prod_ab = calculate_dot_product(vec_a, vec_b)
cos_sim_ab = calculate_cosine_similarity(vec_a, vec_b)

dot_prod_ac = calculate_dot_product(vec_a, vec_c)
cos_sim_ac = calculate_cosine_similarity(vec_a, vec_c)

print(f"Vector A: {vec_a}")
print(f"Vector B: {vec_b}")
print(f"Dot product of A and B: {dot_prod_ab}")
print(f"Cosine similarity of A and B: {cos_sim_ab:.4f}\n")

print(f"Vector C (zero vector): {vec_c}")
print(f"Dot product of A and C: {dot_prod_ac}")
print(f"Cosine similarity of A and C: {cos_sim_ac:.4f}")

Vector A: [1 2 3]
Vector B: [4 5 6]
Dot product of A and B: 32
Cosine similarity of A and B: 0.9746

Vector C (zero vector): [0 0 0]
Dot product of A and C: 0
Cosine similarity of A and C: 0.0000
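
The same formula can be applied to every pair of rows at once. The following is a minimal sketch, assuming each row of the input matrices is one vector; `cosine_similarity_matrix` is a hypothetical helper introduced here for illustration and keeps the zero-vector convention (similarity 0.0) used above.

```python
import numpy as np

def cosine_similarity_matrix(matrix1, matrix2):
    """Cosine similarity between every row of matrix1 and every row of matrix2."""
    norms1 = np.linalg.norm(matrix1, axis=1, keepdims=True)  # shape (n, 1)
    norms2 = np.linalg.norm(matrix2, axis=1, keepdims=True)  # shape (m, 1)
    denom = norms1 * norms2.T                                # shape (n, m)
    dots = matrix1 @ matrix2.T                               # all pairwise dot products

    # Entries whose denominator is zero keep the initial value 0.0
    sims = np.zeros(dots.shape, dtype=float)
    np.divide(dots, denom, out=sims, where=denom != 0)
    return sims

# Rows reuse the vectors A, B, and C from the cell above
rows = np.array([[1, 2, 3], [4, 5, 6], [0, 0, 0]])
sims = cosine_similarity_matrix(rows, rows)
print(np.round(sims, 4))
```

The entry for rows 0 and 1 should match the value printed for A and B above, and the row and column for the zero vector should be all zeros.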


## Subtask 3
### Calculate Pairwise Euclidean Distances

In [16]:
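def calculate_pairwise_distance_matrix(matrix1, matrix2):
    """Euclidean distance between every row of matrix1 and every row of matrix2."""
    # Uses the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * (a . b)
    squared_norm1 = np.sum(matrix1**2, axis=1).reshape(-1, 1)  # column, shape (n, 1)
    squared_norm2 = np.sum(matrix2**2, axis=1).reshape(1, -1)  # row, shape (1, m)

    # All pairwise dot products, shape (n, m)
    dot_prod_matrix = np.dot(matrix1, matrix2.T)

    # Broadcasting combines the (n, 1) column and (1, m) row into an (n, m) grid
    squared_distances = squared_norm1 + squared_norm2 - 2 * dot_prod_matrix

    # Rounding error can produce tiny negative values; clip before the square root
    squared_distances = np.maximum(squared_distances, 0)

    dist_matrix = np.sqrt(squared_distances)

    return dist_matrix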

mat_a = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])

mat_b = np.array([
    [1, 1],
    [7, 8]
])

pairwise_dist_ab = calculate_pairwise_distance_matrix(mat_a, mat_b)
print("Pairwise Distance Matrix between mat_a and mat_b:")
print(pairwise_dist_ab)
print("\n")

pairwise_dist_aa = calculate_pairwise_distance_matrix(mat_a, mat_a)
print("Pairwise Distance Matrix for mat_a against itself:")
print(pairwise_dist_aa)

Pairwise Distance Matrix between mat_a and mat_b:
[[1.         8.48528137]
 [3.60555128 5.65685425]
 [6.40312424 2.82842712]]


Pairwise Distance Matrix for mat_a against itself:
[[0.         2.82842712 5.65685425]
 [2.82842712 0.         2.82842712]
 [5.65685425 2.82842712 0.        ]]
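
One way to sanity-check the matrix above is to compute the same distances directly from the row differences using broadcasting. This is only a sketch for comparison: it is simpler to read but builds an intermediate array of shape (n, m, d), so it uses more memory on large inputs. The arrays are redefined here so the snippet runs on its own.

```python
import numpy as np

mat_a = np.array([[1, 2], [3, 4], [5, 6]])
mat_b = np.array([[1, 1], [7, 8]])

# diffs[i, j] = mat_a[i] - mat_b[j], shape (3, 2, 2)
diffs = mat_a[:, np.newaxis, :] - mat_b[np.newaxis, :, :]

# Sum squared differences over the feature axis, then take the square root
direct = np.sqrt(np.sum(diffs**2, axis=2))
print(direct)
```

The printed values should agree with the first distance matrix shown above.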


## Subtask 4
### Vectorization Benefits



### Why Vectorization is Better

**Performance Boost:**  
Vectorized NumPy operations usually run much faster than equivalent Python loops, because NumPy does the work in optimized C code that processes large arrays without per-element interpreter overhead.  

**Cleaner Code:**  
Without explicit loops, the code is shorter, easier to read and follow, and less error-prone.

**Bonus: Reproducibility**  
The same inputs give the same results every time the code is run.
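
The performance claim can be checked with a small timing sketch like the one below. The array size and random data are assumptions chosen for illustration, and the measured times will vary by machine, so no specific speedup is claimed here.

```python
import time
import numpy as np

rng = np.random.default_rng(0)   # fixed seed; illustrative data only
x = rng.random(200_000)
y = rng.random(200_000)

# Dot product with an explicit Python loop
start = time.perf_counter()
loop_result = 0.0
for a, b in zip(x, y):
    loop_result += a * b
loop_seconds = time.perf_counter() - start

# Same dot product with a single vectorized call
start = time.perf_counter()
vec_result = np.dot(x, y)
vec_seconds = time.perf_counter() - start

print(f"loop: {loop_seconds:.4f} s, np.dot: {vec_seconds:.6f} s")
print(np.isclose(loop_result, vec_result))  # results agree up to rounding
```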
