# NumPy

This notebook gives you an idea of the intricacies of NumPy and should help you appreciate why the library is so useful.

In [20]:
import numpy as np

## Vectorization

NumPy vectorization involves performing mathematical operations on entire
arrays, eliminating the need to loop through individual elements.

Let's compare the execution times of a non-vectorized program and a vectorized one.

Your goal is to multiply each element of a 2D array by 3. Implement this using both non-vectorized and vectorized approaches.



In [None]:
import time

arr_nonvectorized = np.random.rand(1000, 1000)
arr_vectorized = np.array(arr_nonvectorized)  # np.array copies by default, so this is an independent (deep) copy: https://stackoverflow.com/questions/184710/what-is-the-difference-between-a-deep-copy-and-a-shallow-copy

start_nv = time.time()

# Non-vectorized approach
# <START>
for i in range(arr_nonvectorized.shape[0]):
    for j in range(arr_nonvectorized.shape[1]):
        arr_nonvectorized[i, j] *= 3
# <END>

end_nv = time.time()
print("Time taken in non-vectorized approach:", 1000*(end_nv-start_nv), "ms")

start_v = time.time()

# Vectorized approach
# <START>
arr_vectorized = arr_vectorized * 3

# <END>
end_v = time.time()
print("Time taken in vectorized approach:", 1000*(end_v-start_v), "ms")

# Verify both approaches yield the same result
print(np.allclose(arr_nonvectorized, arr_vectorized))

Time taken in non-vectorized approach: 284.8546504974365 ms
Time taken in vectorized approach: 1.7032623291015625 ms
True
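
A single `time.time()` measurement can be noisy, especially for operations that finish in a millisecond or two. The sketch below repeats the same comparison with the standard-library `timeit` module and keeps the best of several runs. The helper names `triple_with_loops` and `triple_vectorized` and the smaller 200x200 array are assumptions made for illustration; they are not part of the exercise above.

```python
import timeit

import numpy as np

arr = np.random.rand(200, 200)  # smaller than the exercise array so the loop finishes quickly


def triple_with_loops(a):
    """Hypothetical helper: multiply every element by 3 with explicit Python loops."""
    out = a.copy()
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] *= 3
    return out


def triple_vectorized(a):
    """Hypothetical helper: multiply every element by 3 in one vectorized step."""
    return a * 3


# Take the best of three repeats; the vectorized call is averaged over 100 runs per repeat.
loop_time = min(timeit.repeat(lambda: triple_with_loops(arr), number=1, repeat=3))
vec_time = min(timeit.repeat(lambda: triple_vectorized(arr), number=100, repeat=3)) / 100

print("Loop version:      ", 1000 * loop_time, "ms")
print("Vectorized version:", 1000 * vec_time, "ms")
```

The exact numbers depend on the machine, so treat them as a rough comparison rather than a benchmark.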


Perform matrix multiplication of A and B using both non-vectorized and vectorized approaches, and observe the time difference.

In [22]:
# Generate two random 500x500 matrices
A = np.random.rand(500, 500)
B = np.random.rand(500, 500)

# Non-vectorized matrix multiplication
C_nonvectorized = np.zeros((500, 500))  # Initialize result matrix
start_nv = time.time()

# <START: Non-vectorized approach>
for i in range(500):
    for j in range(500):
        for k in range(500):
            C_nonvectorized[i, j] += A[i, k] * B[k, j]
# <END>

end_nv = time.time()
print("Time taken in non-vectorized approach:", 1000 * (end_nv - start_nv), "ms")

# Vectorized matrix multiplication
start_v = time.time()

# <START: Vectorized approach>
C_vectorized = A @ B
# <END>

end_v = time.time()
print("Time taken in vectorized approach:", 1000 * (end_v - start_v), "ms")

# Uncomment and execute the below line to verify both approaches give the same result
# print(np.allclose(C_nonvectorized, C_vectorized))

Time taken in non-vectorized approach: 77172.84226417542 ms
Time taken in vectorized approach: 6.3533782958984375 ms
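
To make the `@` operator used above concrete, here is a small worked example on 2x2 matrices. The matrices `A_small` and `B_small` are made up for illustration. The sketch also contrasts `@` (matrix product) with `*` (element-wise product), which are easy to mix up.

```python
import numpy as np

A_small = np.array([[1, 2],
                    [3, 4]])
B_small = np.array([[5, 6],
                    [7, 8]])

# Matrix product: entry (i, j) is the dot product of row i of A_small and column j of B_small.
matrix_product = A_small @ B_small
print(matrix_product)
# [[19 22]
#  [43 50]]

# np.matmul is the function form of the @ operator.
same_product = np.matmul(A_small, B_small)

# Element-wise product: entry (i, j) is simply A_small[i, j] * B_small[i, j].
elementwise_product = A_small * B_small
print(elementwise_product)
# [[ 5 12]
#  [21 32]]
```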


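Vectorization uses NumPy's low-level operations to speed things up: the loops run in compiled C code over contiguous, typed memory instead of in the Python interpreter, and matrix products are typically handed off to an optimized BLAS library. Make sure you know why!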
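
One way to see part of the reason is to inspect how an ndarray stores its data. The following is a minimal sketch, assuming a small made-up list of floats; it only looks at array metadata and does not measure speed.

```python
import numpy as np

values = [0.5, 1.5, 2.5]   # Python list: each element is a separate Python float object
arr = np.array(values)     # ndarray: one buffer of raw 64-bit floats

print(arr.dtype)                   # float64: every element has the same fixed type
print(arr.itemsize)                # 8: bytes per element
print(arr.flags["C_CONTIGUOUS"])   # True: elements sit next to each other in memory

# A single vectorized call processes the whole buffer without a Python-level loop.
tripled = arr * 3
print(tripled)                     # [1.5 4.5 7.5]
```

Because the element type and memory layout are fixed, the compiled loop does not need to check types or unpack objects element by element.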

### :\\/: *no for loops allowed hereafter* :\\/:

## Broadcasting

You are given a set of five 2D points as an ndarray, and you want to compute the Euclidean distance between each pair of points and store the results in a 5x5 ndarray.

*Hint: use the* `np.linalg.norm()` *function for this*
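
As a quick illustration of the hint, the sketch below shows how the `axis` argument of `np.linalg.norm()` changes what is computed. The array `vectors` is a made-up example chosen so the results are easy to check by hand.

```python
import numpy as np

vectors = np.array([[3.0, 4.0],
                    [6.0, 8.0]])

# With axis=1, the norm is taken along each row, giving one length per vector.
row_norms = np.linalg.norm(vectors, axis=1)
print(row_norms)    # [ 5. 10.]

# Without axis, a 2D input is reduced to a single number (the Frobenius norm),
# here sqrt(9 + 16 + 36 + 64) = sqrt(125).
total_norm = np.linalg.norm(vectors)
print(total_norm)
```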

In [None]:
import numpy as np

# Generate a random 5x2 array of points (values between 0 and 10)
points = np.random.rand(5, 2) * 10
print("2D Points:\n", points)

# Task: Compute pairwise Euclidean distances using broadcasting
# <START: Pairwise distance computation>
diff = points[:, None, :] - points[None, :, :]
distance_matrix = np.linalg.norm(diff, axis=2) 
# <END>

# Print the distance matrix
print("Pairwise Euclidean Distance Matrix:\n", distance_matrix)


2D Points:
 [[9.35512595 1.18673139]
 [5.78135939 4.7596208 ]
 [2.17903632 9.96140667]
 [6.56409778 8.70361826]
 [6.42763212 1.09195227]]
Pairwise Euclidean Distance Matrix:
 [[ 0.          5.05344893 11.33539538  8.01831818  2.92902769]
 [ 5.05344893  0.          6.32734602  4.02091971  3.72417251]
 [11.33539538  6.32734602  0.          4.5618851   9.8345202 ]
 [ 8.01831818  4.02091971  4.5618851   0.          7.6128892 ]
 [ 2.92902769  3.72417251  9.8345202   7.6128892   0.        ]]
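
The broadcasting step in the cell above is easier to follow when the intermediate shapes are printed. Below is a minimal sketch on three hand-picked points (an assumption for illustration, so the distances come out as whole numbers); the variable names `as_rows` and `as_cols` are hypothetical.

```python
import numpy as np

pts = np.array([[0.0, 0.0],
                [3.0, 4.0],
                [6.0, 8.0]])

as_rows = pts[:, None, :]   # shape (3, 1, 2): one point per "row" slot
as_cols = pts[None, :, :]   # shape (1, 3, 2): one point per "column" slot

# Broadcasting stretches the size-1 axes, so every point is paired with every other point.
diff = as_rows - as_cols    # shape (3, 3, 2): diff[i, j] = pts[i] - pts[j]
print(as_rows.shape, as_cols.shape, diff.shape)

# Euclidean distance written out explicitly: square, sum over the coordinate axis, square root.
dist_manual = np.sqrt((diff ** 2).sum(axis=2))
print(dist_manual)
# [[ 0.  5. 10.]
#  [ 5.  0.  5.]
#  [10.  5.  0.]]

# np.linalg.norm with axis=2 performs the same reduction.
dist_norm = np.linalg.norm(diff, axis=2)
```

The result is symmetric with zeros on the diagonal, which is a useful sanity check for any pairwise distance matrix.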


In [24]:
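# Manual check of one squared distance, dx**2 + dy**2 (diff itself has shape 5 x 5 x 2).
# These coordinates do not appear in the points printed above, so they are presumably from a different random draw.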
(9.71911312 - 4.1080346)**2 + (6.53592428-5.63094852)**2

32.30318328379296
