# NumPy

This notebook gives you an idea of the intricacies of this library and should help you appreciate why NumPy is so useful.

In [3]:
import numpy as np

## Vectorization

NumPy vectorization involves performing mathematical operations on entire
arrays, eliminating the need to loop through individual elements.

Let's compare the execution times of a non-vectorized program and a vectorized one.

Your goal is to multiply each element of a 2D array by 3. Implement this using both non-vectorized and vectorized approaches.



In [87]:
import time

arr_nonvectorized = np.random.rand(1000, 1000)
arr_vectorized = np.array(arr_nonvectorized) # making a deep copy of the array https://stackoverflow.com/questions/184710/what-is-the-difference-between-a-deep-copy-and-a-shallow-copy

start_nv = time.time()

# Non-vectorized approach
# <START>
for i in range(1000):
    for j in range(1000):
        arr_nonvectorized[i, j] = arr_nonvectorized[i, j] * 3
# <END>

end_nv = time.time()
# time.time() returns seconds; multiplying by 1,000,000 gives microseconds
print("Time taken in non-vectorized approach:", 1000000*(end_nv-start_nv), "µs")

start_v = time.time()

# Vectorized approach
# <START>
arr_vectorized = arr_vectorized*3
# <END>

end_v = time.time()
print("Time taken in vectorized approach:", 1000000*(end_v-start_v), "µs")

# Check that both approaches produce the same result
print(np.allclose(arr_nonvectorized, arr_vectorized))

Time taken in non-vectorized approach: 1024173.4981536865 µs
Time taken in vectorized approach: 15623.807907104492 µs
True
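
A single `time.time()` reading can be noisy, so the gap measured above will vary from run to run. As a rough sketch (not part of the original exercise), the standard-library `timeit` module can repeat each measurement and keep the best one. The helper names `scale_loop` and `scale_vectorized` and the smaller 200x200 array are assumptions made for illustration.

```python
import timeit
import numpy as np

def scale_loop(arr):
    """Multiply every element by 3 with explicit Python loops (returns a new array)."""
    out = np.empty_like(arr)
    rows, cols = arr.shape
    for i in range(rows):
        for j in range(cols):
            out[i, j] = arr[i, j] * 3
    return out

def scale_vectorized(arr):
    """Multiply every element by 3 with a single vectorized expression."""
    return arr * 3

sample = np.random.rand(200, 200)

# repeat=3 runs each measurement three times; min() keeps the least-disturbed run.
t_loop = min(timeit.repeat(lambda: scale_loop(sample), number=1, repeat=3))
t_vec = min(timeit.repeat(lambda: scale_vectorized(sample), number=1, repeat=3))

# timeit reports seconds; multiply by 1e6 to show microseconds.
print(f"loop:       {t_loop * 1e6:.1f} µs")
print(f"vectorized: {t_vec * 1e6:.1f} µs")
print(np.allclose(scale_loop(sample), scale_vectorized(sample)))
```

The exact numbers depend on the machine, so only the ratio between the two timings is meaningful.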


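Multiply A and B using vectorized and non-vectorized means and observe the time difference. Note that the code below computes the element-wise product (`A * B`, one multiplication per entry), not the linear-algebra matrix product, which would be written `A @ B`.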

In [89]:
# Generate two random 500x500 matrices
A = np.random.rand(500, 500)
B = np.random.rand(500, 500)

A_v = np.array(A)
B_v = np.array(B)

# Non-vectorized element-wise multiplication
C_nonvectorized = np.zeros((500, 500))  # Initialize result matrix
start_nv = time.time()

# <START: Non-vectorized approach>
for i in range(500):
    for j in range(500):
        C_nonvectorized[i, j] = A[i, j] * B[i, j]
# <END>

end_nv = time.time()
# time.time() returns seconds; multiplying by 1,000,000 gives microseconds
print("Time taken in non-vectorized approach:", 1000000 * (end_nv - start_nv), "µs")

C_vectorized = np.zeros((500, 500))

# Vectorized element-wise multiplication
start_v = time.time()

# <START: Vectorized approach>
C_vectorized = A_v * B_v
# <END>

end_v = time.time()
print("Time taken in vectorized approach:", 1000000 * (end_v - start_v), "µs")

# Verify that both approaches give the same result
print(np.allclose(C_nonvectorized, C_vectorized))

Time taken in non-vectorized approach: 339752.197265625 µs
Time taken in vectorized approach: 4621.267318725586 µs
True
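
The expression `A * B` multiplies matching entries, which is what both approaches above compute. For comparison, here is a minimal sketch of the linear-algebra matrix product, where entry (i, j) is the dot product of row i of the first matrix with column j of the second. The function name `matmul_loops` and the tiny 2x2 inputs are illustrative assumptions; NumPy's `@` operator is the vectorized equivalent.

```python
import numpy as np

def matmul_loops(X, Y):
    """Matrix product via three nested loops: out[i, j] = sum over k of X[i, k] * Y[k, j]."""
    n, m = X.shape
    m2, p = Y.shape
    assert m == m2, "inner dimensions must match"
    out = np.zeros((n, p))
    for i in range(n):
        for j in range(p):
            for k in range(m):
                out[i, j] += X[i, k] * Y[k, j]
    return out

X = np.array([[1.0, 2.0], [3.0, 4.0]])
Y = np.array([[5.0, 6.0], [7.0, 8.0]])

print(matmul_loops(X, Y))  # matrix product, loop version
print(X @ Y)               # matrix product, vectorized
print(X * Y)               # element-wise product: a different result
```

The triple loop does n·m·p multiplications, so for 500x500 inputs it would be far slower in pure Python than the double loop timed above.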


Vectorization is faster because NumPy runs the loop in compiled, low-level code over a typed array, instead of executing every iteration in the Python interpreter. Make sure you know why!
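
One way to see part of the reason, sketched here under the assumption of default `float64` data: a NumPy array stores its values as one dense block of fixed-size machine numbers, so the compiled loop inside NumPy can walk through memory without handling a separate Python object for each element.

```python
import numpy as np

arr = np.random.rand(1000, 1000)

print(arr.dtype)                              # float64: every element has the same fixed-size type
print(arr.itemsize)                           # bytes per element (8 for float64)
print(arr.nbytes == arr.size * arr.itemsize)  # the data is one dense buffer
print(arr.flags["C_CONTIGUOUS"])              # rows are laid out back to back in memory

# The multiplication below is a single call into compiled code,
# instead of one million interpreted Python-level multiplications.
tripled = arr * 3
```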

### :\\/: *no for loops allowed hereafter* :\\/:

## Broadcasting

You are given a set of five 2D points as an ndarray, and you want to compute the Euclidean distance between each pair of points and store the results in a 5×5 ndarray.

*Hint: use the* `np.linalg.norm()` *function for this*
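
As a hedged sketch of the idea, using three hand-picked points (chosen here for illustration so the distances are easy to check by hand): inserting a length-1 axis gives shapes `(3, 1, 2)` and `(1, 3, 2)`, and subtracting them broadcasts to `(3, 3, 2)`, one difference vector per pair of points.

```python
import numpy as np

# Three illustrative points forming a 3-4-5 right triangle.
pts = np.array([[0.0, 0.0],
                [3.0, 0.0],
                [3.0, 4.0]])

# pts[:, None, :] has shape (3, 1, 2); pts[None, :, :] has shape (1, 3, 2).
# Broadcasting stretches the length-1 axes, so diff[i, j] = pts[i] - pts[j].
diff = pts[:, None, :] - pts[None, :, :]
print(diff.shape)

# Taking the norm over the last axis collapses each difference vector to a distance.
dist = np.linalg.norm(diff, axis=2)
print(dist)
```

Indexing with `None` is equivalent to the `np.reshape` calls used for the same purpose; either way, no explicit loop over pairs is needed.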

In [132]:
import numpy as np

# Generate a random 5x2 array of points (values between 0 and 10)
points = np.random.rand(5, 2) * 10
print("2D Points:\n", points)

# Task: Compute pairwise Euclidean distances using broadcasting
# <START: Pairwise distance computation>
A = np.reshape(points, (5, 1, 2))
B = np.reshape(points, (1, 5, 2))
print("A:\n",A)
print("B:\n",B)
M = A - B
print("M:\n",M)
distance_matrix = np.linalg.norm(M, axis=2)  # norm over the coordinate axis: (5, 5, 2) -> (5, 5)
# <END>

#  Print the distance matrix
print("Pairwise Euclidean Distance Matrix:\n", distance_matrix)


2D Points:
 [[1.29242763 0.65795273]
 [1.06185208 7.41115292]
 [6.62657219 8.23091168]
 [4.45854415 3.769711  ]
 [8.49707578 4.17196032]]
A:
 [[[1.29242763 0.65795273]]

 [[1.06185208 7.41115292]]

 [[6.62657219 8.23091168]]

 [[4.45854415 3.769711  ]]

 [[8.49707578 4.17196032]]]
B:
 [[[1.29242763 0.65795273]
  [1.06185208 7.41115292]
  [6.62657219 8.23091168]
  [4.45854415 3.769711  ]
  [8.49707578 4.17196032]]]
M:
 [[[ 0.          0.        ]
  [ 0.23057556 -6.75320019]
  [-5.33414456 -7.57295895]
  [-3.16611652 -3.11175828]
  [-7.20464815 -3.51400759]]

 [[-0.23057556  6.75320019]
  [ 0.          0.        ]
  [-5.56472012 -0.81975876]
  [-3.39669208  3.64144191]
  [-7.43522371  3.2391926 ]]

 [[ 5.33414456  7.57295895]
  [ 5.56472012  0.81975876]
  [ 0.          0.        ]
  [ 2.16802804  4.46120067]
  [-1.87050359  4.05895135]]

 [[ 3.16611652  3.11175828]
  [ 3.39669208 -3.64144191]
  [-2.16802804 -4.46120067]
  [ 0.          0.        ]
  [-4.03853163 -0.40224932]]

 [[ 7.2046

In [None]:
# 2 x 5 x 5
(9.71911312 - 4.1080346)**2 + (6.53592428-5.63094852)**2
