# NumPy

This notebook gives you an idea of the intricacies of this library and shows why NumPy is so useful.

In [1]:
import numpy as np

## Vectorization

NumPy vectorization involves performing mathematical operations on entire
arrays, eliminating the need to loop through individual elements.

Let's compare the execution times of a non-vectorized program and a vectorized one.

Your goal is to multiply each element of a 2D array by 3. Implement this using both non-vectorized and vectorized approaches.



In [2]:
import time

arr_nonvectorized = np.random.rand(1000, 1000)
arr_vectorized = np.array(arr_nonvectorized)  # np.array copies its input by default, so this is an independent (deep) copy: https://stackoverflow.com/questions/184710/what-is-the-difference-between-a-deep-copy-and-a-shallow-copy

start_nv = time.time()

# Non-vectorized approach
# <START>
for i in range(arr_nonvectorized.shape[0]):
    for j in range(arr_nonvectorized.shape[1]):
        arr_nonvectorized[i, j] *= 3  # multiply each element by 3

# <END>

end_nv = time.time()
print("Time taken in non-vectorized approach:", 1000*(end_nv-start_nv), "ms")

start_v = time.time()

# Vectorized approach
# <START>
arr_vectorized *= 3 
# <END>

end_v = time.time()
print("Time taken in vectorized approach:", 1000*(end_v-start_v), "ms")

# uncomment and execute the below line to convince yourself that both approaches are doing the same thing
# print(np.allclose(arr_nonvectorized, arr_vectorized))

Time taken in non-vectorized approach: 574.0635395050049 ms
Time taken in vectorized approach: 0.7131099700927734 ms
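
A single `time.time()` measurement can be noisy. As a minimal sketch, `timeit.repeat` from the standard library can run each approach several times and report the best run. The helper names `scale_loop` and `scale_vectorized` and the smaller 200x200 array are assumptions chosen for illustration; they are not part of the exercise.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)  # seeded so the data is reproducible
base = rng.random((200, 200))   # smaller than the exercise array to keep the loop quick


def scale_loop(arr):
    """Multiply every element by 3 with explicit Python loops."""
    out = arr.copy()
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] *= 3
    return out


def scale_vectorized(arr):
    """Multiply every element by 3 with one array operation."""
    return arr * 3


# Best of 3 runs, each run calling the function once
t_loop = min(timeit.repeat(lambda: scale_loop(base), number=1, repeat=3))
t_vec = min(timeit.repeat(lambda: scale_vectorized(base), number=1, repeat=3))

print(f"loop:       {1000 * t_loop:.3f} ms")
print(f"vectorized: {1000 * t_vec:.3f} ms")
print("same result:", np.allclose(scale_loop(base), scale_vectorized(base)))
```

Taking the minimum over repeats reduces the influence of other processes on the measurement; the absolute numbers will differ from machine to machine.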


Perform matrix multiplication of A and B using both vectorized and non-vectorized approaches and observe the time difference.

In [6]:
# Generate two random 500x500 matrices
A = np.random.rand(500, 500)
B = np.random.rand(500, 500)

# Non-vectorized matrix multiplication
C_nonvectorized = np.zeros((500, 500))  # Initialize result matrix
start_nv = time.time()


# <START: Non-vectorized approach>
for i in range(A.shape[0]):  # each row of A
    for j in range(B.shape[1]):  # each column of B
        for k in range(A.shape[1]):  # shared inner dimension
            C_nonvectorized[i, j] += A[i, k] * B[k, j]  # accumulate the dot product of row i and column j
# <END>

end_nv = time.time()
print("Time taken in non-vectorized approach:", 1000 * (end_nv - start_nv), "ms")

# Vectorized matrix multiplication
start_v = time.time()

# <START: Vectorized approach>
C_vectorized = np.dot(A, B)  # built-in NumPy function
# <END>

end_v = time.time()
print("Time taken in vectorized approach:", 1000 * (end_v - start_v), "ms")

# Verify that both approaches give the same result
print(np.allclose(C_nonvectorized, C_vectorized))

Time taken in non-vectorized approach: 114874.7866153717 ms
Time taken in vectorized approach: 4.623889923095703 ms
True
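
For 2D arrays, `np.dot` is not the only vectorized spelling of a matrix product. The sketch below uses small hypothetical matrices `A_small` and `B_small` (not the 500x500 ones from the cell above) to show that `np.matmul` and the `@` operator give the same result for 2D inputs:

```python
import numpy as np

rng = np.random.default_rng(0)
A_small = rng.random((4, 3))  # hypothetical 4x3 matrix
B_small = rng.random((3, 5))  # hypothetical 3x5 matrix

C_dot = np.dot(A_small, B_small)        # function form, as in the cell above
C_matmul = np.matmul(A_small, B_small)  # equivalent for 2D inputs
C_at = A_small @ B_small                # operator form of np.matmul

print(C_at.shape)  # (4, 5)
print(np.allclose(C_dot, C_matmul) and np.allclose(C_dot, C_at))  # True
```

The three forms differ for arrays with more than two dimensions, so this equivalence should only be relied on for the 2D case shown here.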


Vectorization moves the loop out of Python and into NumPy's compiled low-level routines, which is what speeds things up. Make sure you know why!
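
One part of the reason, sketched here rather than fully explained: an ndarray stores elements of a single type in one block of memory, so compiled code can step through it without inspecting each element, whereas a Python list holds references to separate objects. The array `arr` below is a hypothetical example used only to inspect that layout.

```python
import numpy as np

arr = np.arange(6, dtype=np.float64).reshape(2, 3)

print(arr.dtype)                  # float64: every element has the same type
print(arr.itemsize)               # 8 bytes per element
print(arr.strides)                # (24, 8): bytes to step to the next row / next column
print(arr.flags["C_CONTIGUOUS"])  # True: one contiguous, row-major block

# The same values as a Python list are individual float objects
as_list = arr.ravel().tolist()
print(type(as_list[0]))           # <class 'float'>
```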

### :\\/: *no for loops allowed hereafter* :\\/:

## Broadcasting

You are given a set of five 2D points as an ndarray, and you want to compute the Euclidean distance between each pair of points and store the results in a 5x5 ndarray.

*Hint: use the* `np.linalg.norm()` *function for this*

In [4]:
import numpy as np

# Generate a random 5x2 array of points (values between 0 and 10)
points = np.random.rand(5, 2) * 10
print("2D Points:\n", points)

# Task: Compute pairwise Euclidean distances using broadcasting
# <START: Pairwise distance computation>
dif = points[:, np.newaxis, :] - points[np.newaxis, :, :]  # pairwise differences, shape (5, 5, 2)
distance_matrix = np.linalg.norm(dif, axis=2)  # norm over the coordinate axis, shape (5, 5)
# <END>

# Print the distance matrix
print("Pairwise Euclidean Distance Matrix:\n", distance_matrix)


2D Points:
 [[4.904882   7.02216578]
 [8.33089315 7.02407187]
 [0.36919941 7.99228285]
 [3.01342308 3.06557339]
 [5.19301346 1.924154  ]]
Pairwise Euclidean Distance Matrix:
 [[0.         3.42601168 4.63826947 4.38545781 5.10614765]
 [3.42601168 0.         8.0203491  6.62911745 5.98794214]
 [4.63826947 8.0203491  0.         5.59145643 7.75186235]
 [4.38545781 6.62911745 5.59145643 0.         2.46037648]
 [5.10614765 5.98794214 7.75186235 2.46037648 0.        ]]
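
To see how the broadcasting works, it helps to print the intermediate shapes and run a few sanity checks on the result. This is a minimal sketch using a seeded stand-in array `pts` (a hypothetical name, not the `points` array printed above):

```python
import numpy as np

rng = np.random.default_rng(0)
pts = rng.random((5, 2)) * 10  # hypothetical stand-in for `points`

a = pts[:, np.newaxis, :]  # shape (5, 1, 2)
b = pts[np.newaxis, :, :]  # shape (1, 5, 2)
diff = a - b               # size-1 axes are stretched, giving shape (5, 5, 2)
dist = np.linalg.norm(diff, axis=2)  # collapse the coordinate axis, shape (5, 5)

print(a.shape, b.shape, diff.shape, dist.shape)

# Distances are symmetric, and each point is at distance 0 from itself
print(np.allclose(dist, dist.T))
print(np.allclose(np.diag(dist), 0))

# The norm is the square root of the summed squared differences
print(np.allclose(dist, np.sqrt((diff ** 2).sum(axis=2))))
```

Entry `dist[i, j]` is the distance between point `i` and point `j`, which is why the matrix printed above is symmetric with zeros on the diagonal.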


In [5]:
# 2 x 5 x 5
(9.71911312 - 4.1080346)**2 + (6.53592428-5.63094852)**2

32.30318328379296
